Hello,
I spent some time exploring Crawlomatic, and I'm very happy to have this plugin.
However, it is missing some features that I think are very important:
1- Wait Until Element Exists
Sometimes sites use a countdown before showing the content (for example, "please wait 5 seconds to show the URL"). As far as I can tell, we currently can't crawl this type of content.
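To show what I mean, here is a rough Python sketch of the polling idea. It is not Crawlomatic code: `wait_until` and the `find` callback are names I made up, and `find` stands for whatever lookup the scraper would use to check whether the element is present yet.

```python
import time
from typing import Callable, Optional


def wait_until(find: Callable[[], Optional[str]],
               timeout: float = 10.0,
               interval: float = 0.5) -> Optional[str]:
    """Call find() repeatedly until it returns a value or the timeout expires.

    find is a hypothetical lookup: it returns the element's content
    (for example the revealed URL) or None while the element is missing.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = find()
        if result is not None:
            return result          # element appeared
        if time.monotonic() >= deadline:
            return None            # gave up waiting
        time.sleep(interval)       # wait before polling again
```

For the "please wait 5 seconds" case, a timeout a little above the countdown (say 8 seconds) should be enough.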
2- Strip HTML Based On Regex & Strip HTML Based On XPATH
We need these features because some of the content we have to strip is more complex than simple tags.
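As an illustration only, this is roughly the behavior I have in mind, written in Python with the standard library. The function names are my own, and the XPath part assumes well-formed markup and only the limited XPath subset that `xml.etree.ElementTree` supports; a real implementation would likely need a proper HTML parser.

```python
import re
import xml.etree.ElementTree as ET


def strip_by_regex(html: str, pattern: str) -> str:
    """Remove every part of the HTML that matches the regex."""
    return re.sub(pattern, "", html, flags=re.DOTALL | re.IGNORECASE)


def strip_by_xpath(xhtml: str, path: str) -> str:
    """Remove every element matched by the XPath expression.

    Assumes well-formed markup; the root element itself is never removed.
    """
    root = ET.fromstring(xhtml)
    targets = set(root.findall(path))
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child in targets:
                # Keep the text that followed the removed element.
                tail = child.tail or ""
                if previous is not None:
                    previous.tail = (previous.tail or "") + tail
                else:
                    parent.text = (parent.text or "") + tail
                parent.remove(child)
            else:
                previous = child
    return ET.tostring(root, encoding="unicode")
```

For example, a pattern like `<!-- ad start -->.*?<!-- ad end -->` would remove an ad block between two comments, and a path like `.//div[@class='ads']` would remove every matching `div`.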
3- Crawl links generated by a previously crawled rule
Right now we can only scrape fixed content or the fixed URL we defined in the Scraper Start (Seed) URL. We can't do anything more advanced, like scraping a URL that was generated by another rule.
This feature is really important because a lot of sites (download sites) have 2 or more pages before the download link. For example:
"Click here to download" (url-1): when you click it, a new URL (url-2) opens that tells you to wait x seconds, and then the download should start automatically. So we need to scrape url-2, and that is only possible if we can generate a dynamic link to scrape.
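A minimal Python sketch of the chaining I am asking for is below. Everything in it is hypothetical: `follow_chain`, the list of regex rules, and the `fetch` callback (which just needs to return the HTML for a URL) are placeholders, not anything Crawlomatic provides today.

```python
import re
from typing import Callable, List, Optional
from urllib.parse import urljoin


def extract_first(html: str, pattern: str) -> Optional[str]:
    """Return the first capture group of the pattern, or None if no match."""
    match = re.search(pattern, html, flags=re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else None


def follow_chain(seed_url: str,
                 rules: List[str],
                 fetch: Callable[[str], str]) -> Optional[str]:
    """Apply each rule to the page found by the previous rule.

    Each rule is a regex whose first group captures the next link.
    Returns the last extracted URL, or None if any rule fails to match.
    """
    url = seed_url
    for pattern in rules:
        html = fetch(url)
        link = extract_first(html, pattern)
        if link is None:
            return None
        url = urljoin(url, link)  # resolve relative links against the current page
    return url
```

With two rules, the first would pull url-2 out of the url-1 page and the second would pull the final download link out of the url-2 page.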
4- Upload files to UpToBox
This feature is really important. Why?
A lot of sites host download links on their own server, and if we scrape these download links, the URL of those websites will be shown on our WordPress site, which is not what we want.
UpToBox has a very good feature that lets you upload content to it based on the download URL, like here, and they offer a detailed API and instructions on how to use it HERE.
Regards