Features should be implemented

This topic is: resolved

 


Viewing 11 reply threads
    • #675


      Omini
      Participant
      Post count: 21

      Hello,
      I spent some time exploring Crawlomatic and I’m very happy to have this plugin.
      However, it is missing some features that I think are very important:

      1- Wait Until Element Exists

      Some sites show a countdown before revealing the content (for example, “please wait 5 seconds to show the URL”). As far as I can tell, we currently can’t crawl this type of content.

       

      2- Strip HTML Based On Regex & Strip HTML Based On XPATH

      We need these features because we sometimes have to strip content that is more complex than simple tags.

       

      3- Crawl links generated by a previously crawled rule

      Right now we can only scrape fixed content or a fixed URL defined in the Scraper Start (Seed) URL field. However, sometimes we need something more advanced, such as scraping a URL that was generated by another rule.
      This feature is really important because a lot of sites (download sites) go through two or more pages before giving the download link. For example:
      “Click here to download” (url-1) opens a new URL (url-2) that tells you to wait x seconds, after which the download should start automatically. So we need to scrape url-2, and that is only possible if we can generate a dynamic link to scrape.

       

      4- Upload files to UpToBox

      This feature is really important. Why?
      A lot of sites host download links on their own servers, and if we scrape these download links, the URLs of those websites will be shown on our WordPress site, which is not what we want.
      UpToBox has a very good feature that lets you upload content to it from a download URL, like here, and they offer a detailed API with usage instructions HERE.

       

      Regards

    • #676


      Omini
      Participant
      Post count: 21

      5- Regex find/replace

      Right now we can only replace regex matches with a single replacement. What if I need to replace facebook.com with twitter.com and x.com with y.com, each based on its own regex?
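
To illustrate the request in point 5, here is a minimal Python sketch of applying several regex find/replace pairs in sequence. It is not the plugin's code (the plugin is a WordPress plugin); the `RULES` list and the `apply_rules` function are hypothetical names used only for this illustration.

```python
import re

# Hypothetical rule list: each entry is (regex pattern, replacement text).
RULES = [
    (r"facebook\.com", "twitter.com"),
    (r"\bx\.com", "y.com"),  # \b avoids matching inside names such as "fox.com"
]


def apply_rules(text, rules):
    """Apply every (pattern, replacement) pair to the text, in list order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    print(apply_rules("Visit facebook.com and x.com", RULES))
```

Because the pairs run in order, a later pattern also sees the output of earlier replacements, so the order of the list matters when patterns can overlap.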

    • #677


      Omini
      Participant
      Post count: 21

      6- Proxies usage

      I tried to set up a proxy for my project but it did not work, and I don’t know what the problem is. I searched your YouTube channel for a tutorial that explains how to use proxies, but none is available.
      Also, I think it currently accepts only HTTP proxies. What if we need to use SOCKS5 proxies?

    • #678


      Omini
      Participant
      Post count: 21

      7- Scrape the comments too!

      It would be great if we could also scrape the comments on the original post instead of generating random comments with the uComment plugin.

    • #679


      Omini
      Participant
      Post count: 21

      8- Link shorteners supported through JSON or regex

      Right now the only supported link shortener is shorte.st. However, it would be really easy to add an option for entering the API request URL and reading the response as JSON or with a regex.
      For example:
      API URL Field:
      https://clk.sh/api?api=5cafed9b61dab477dcaf77a1a30b1f09e0a84961&url=yourdestinationlink.com&alias=CustomAlias

      This API call will return a response like this:
      {"status":"success","shortenedUrl":"https:\/\/clk.sh\/xxxxx"}

      Shortcut Query Type (Optional): Regex
      API Response Query String (Optional): "(https?.*?)"

      So now we can add any URL shortener for earnings in a dynamic way, not only shorte.st.
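
As a rough illustration of point 8, the following Python sketch extracts the shortened URL from the example response in the post above, either by JSON key or by a regex with one capture group. It only parses a response string and makes no network request. The function name `extract_short_url` and its parameters are hypothetical, and using `shortenedUrl` as the JSON key is an assumption taken from that single example response.

```python
import json
import re

# Example response body copied from the post above.
SAMPLE_RESPONSE = r'{"status":"success","shortenedUrl":"https:\/\/clk.sh\/xxxxx"}'


def extract_short_url(response_body, query_type="json", query="shortenedUrl"):
    """Return the short URL found in response_body, or None if it is absent.

    query_type "json":  query is a top-level key of the JSON object.
    query_type "regex": query is a pattern whose first capture group is the URL.
    """
    if query_type == "json":
        return json.loads(response_body).get(query)
    if query_type == "regex":
        match = re.search(query, response_body)
        if match is None:
            return None
        # The raw body still contains JSON-escaped slashes (\/), so unescape them.
        return match.group(1).replace("\\/", "/")
    raise ValueError(f"unknown query type: {query_type}")


if __name__ == "__main__":
    print(extract_short_url(SAMPLE_RESPONSE))
    print(extract_short_url(SAMPLE_RESPONSE, "regex", r'"(https?.*?)"'))
```

This covers only the response side. Authentication and request format still differ between shortener services, so a generic option like this would work only for services whose API follows a similar shape.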

    • #681


      Omini
      Participant
      Post count: 21

      9- Crawlomatic Auto-Generated Post Information

      I don’t see this option on my posts!
      http://prntscr.com/ptlpvd
      Does it work correctly? It’s really important to know where the articles were imported from.

    • #683


      Omini
      Participant
      Post count: 21

      Regarding point 9 (Crawlomatic Auto-Generated Post Information):
      It was solved by installing the Classic Editor plugin for WordPress.

    • #686


      Szabi – CodeRevolution
      Keymaster
      Post count: 4179

      Hello,

      1. The counter you pointed out is generated using JavaScript. The plugin can import JavaScript-generated content only if you use PhantomJS together with the plugin. Please check this tutorial video for details: https://www.youtube.com/watch?v=hnEPlQSeAZE

      2. Regex stripping is possible, using the ‘Run Regex On Content’ and ‘Replace Matches From Regex’ features in rule settings. XPath stripping is a good idea, I added this to my update ideas list.

      3. I will think about how to do this, but it sounds really complicated to implement in the plugin.

      4. I will check UpToBox.

      5. Yes, this is a current plugin limitation, I will think about expanding it.

      6. Socks5 proxies should also work in the plugin. You need to specify their protocol in plugin settings. Example: socks5://bob.marley.com:222

      7. This is not implemented yet, but it sounds like a good idea for the future.

      8. This is actually much more complicated than you explained: each URL shortener has a different way of authenticating with its API, a different way of sending back responses, and a different response structure. This would be hard to implement dynamically, and it certainly would not work with every API.

      9. Ok.
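
To make the proxy format from point 6 concrete, here is a small Python sketch that splits a proxy string such as socks5://bob.marley.com:222 into protocol, host, and port. It is only an illustration of the `protocol://host:port` notation, not the plugin's implementation. The `parse_proxy` function, the `SUPPORTED_SCHEMES` set, and the fallback to `http` when no protocol is given are all assumptions of this sketch.

```python
from urllib.parse import urlsplit

# Assumed set of protocols for this sketch, not a list confirmed by the plugin.
SUPPORTED_SCHEMES = {"http", "https", "socks4", "socks5"}


def parse_proxy(proxy, default_scheme="http"):
    """Split 'scheme://host:port' into a (scheme, host, port) tuple.

    A proxy given without a protocol is treated as default_scheme
    (an assumption of this sketch).
    """
    if "://" not in proxy:
        proxy = f"{default_scheme}://{proxy}"
    parts = urlsplit(proxy)
    scheme = parts.scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy protocol: {scheme}")
    if parts.hostname is None or parts.port is None:
        raise ValueError("proxy must include both a host and a port")
    return scheme, parts.hostname, parts.port


if __name__ == "__main__":
    print(parse_proxy("socks5://bob.marley.com:222"))
```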

      Regards,

      Szabi – CodeRevolution.

    • #687


      Omini
      Participant
      Post count: 21

      8. A lot of link shortener sites implement an API like this one, such as clk.sh, exe.io, fl.cl, and many more.
      So we can simply state that this will not work with every shortener site and that users should check first.

       

    • #688


      Szabi – CodeRevolution
      Keymaster
      Post count: 4179

      Ok, I will check this.

    • #701


      Omini
      Participant
      Post count: 21

      10- Add more data to the metabox

      Right now we can only enable the plugin’s “Show Extended Item Information Metabox in Post” option, but what if we want to add additional data to that box? For example, download links for that post from our scraping, or anything else.

    • #702


      Szabi – CodeRevolution
      Keymaster
      Post count: 4179

      Yes, I will consider this also.

      Regards.


The topic ‘Features should be implemented’ is closed to new replies.