This topic has 11 replies, 2 voices, and was last updated 5 years, 1 month ago by Szabi – CodeRevolution.
November 7, 2019 at 12:41 am #675
Hello,
I spent some time exploring Crawlomatic and I'm very happy to have this plugin.
However, it is missing some features that are very important:
1- Wait Until Element Exists
Some sites use a countdown before showing the content (for example, "please wait 5 seconds to show the URL"). Right now, I think we can't crawl this type of content.
2- Strip HTML Based On Regex & Strip HTML Based On XPath
We need these features because we need to strip content that is more complex than just tags.
3- Crawl links generated by a previously crawled rule
Right now we can only scrape fixed content or the fixed URL defined in the Scraper Start (Seed) URL. We need something more advanced, such as scraping a URL that was generated by another rule.
This feature is really important because a lot of sites (download sites) put the download link behind two or more pages. For example:
You click "Click here to download" (url-1), which opens a new URL (url-2) that tells you to wait x seconds before the download starts automatically. So we need to scrape url-2, and that is only possible if we can generate a dynamic link to scrape.
4- Upload file on UpToBox
This feature is really important. Why?
A lot of sites host the download links on their own servers, and if we scrape these download links, the URLs of those websites will be shown on our WordPress site, which is not what we want.
UpToBox has a very good feature that lets you upload content to it from a download URL, like here, and they offer a detailed API and explain how to use it HERE.
Regards
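For point 3, the two-step flow described above (url-1 leads to url-2, which holds the real download link) could be sketched roughly as follows. This is only an illustration in Python, not Crawlomatic code: the `follow_chain` helper, the `fetch` callable, the regex patterns, and the example.com pages are all hypothetical.

```python
import re
from urllib.parse import urljoin


def follow_chain(start_url, patterns, fetch):
    """Follow a chain of pages, extracting the next URL at each step.

    patterns: one regex per step; group 1 must capture the next link.
    fetch: any callable that takes a URL and returns the page HTML.
    """
    url = start_url
    for pattern in patterns:
        html = fetch(url)
        match = re.search(pattern, html)
        if match is None:
            raise ValueError(f"no link matching {pattern!r} on {url}")
        # Resolve relative links against the page they were found on.
        url = urljoin(url, match.group(1))
    return url


# Hypothetical pages standing in for a real download site.
PAGES = {
    "https://example.com/post": '<a class="dl" href="/wait/42">Click here to download</a>',
    "https://example.com/wait/42": '<a id="direct" href="https://files.example.com/f.zip">start</a>',
}

final_url = follow_chain(
    "https://example.com/post",
    [r'class="dl" href="([^"]+)"', r'id="direct" href="([^"]+)"'],
    PAGES.__getitem__,
)
print(final_url)
```

This only covers links that are already present in the HTML of each page. A link that appears only after a JavaScript countdown would still need a headless browser to render the page first.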
-
November 7, 2019 at 1:03 am #676
5- Regex find/replace
Right now we can only replace regex matches with a single thing, but what if I need to replace facebook.com with twitter.com and x.com with y.com based on regex?
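What is being asked for here is essentially a list of (pattern, replacement) pairs applied one after another. A minimal Python sketch of that idea follows; the `replace_all` helper and the sample text are made up for illustration and say nothing about how the plugin handles this internally.

```python
import re


def replace_all(text, rules):
    """Apply each (pattern, replacement) pair to text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


rules = [
    (r"facebook\.com", "twitter.com"),
    (r"\bx\.com\b", "y.com"),
]
sample = "Follow us on facebook.com or visit x.com today."
result = replace_all(sample, rules)
print(result)
```

Because the rules run in sequence, a later rule sees the output of the earlier ones, so the order of the pairs can matter.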
-
November 7, 2019 at 6:02 am #677
6- Proxy usage
I tried to set up a proxy in my project but it did not work, and I don't know what the problem is. I searched your YouTube channel for a tutorial that explains how to use proxies, but none is available.
Also, right now I think it accepts only HTTP proxies, but what if we need to use SOCKS5 proxies?
-
November 7, 2019 at 6:14 am #678
7- Scraping the comments too!
It would be great if we could also scrape the comments on the source post, instead of generating random comments with the uComment plugin.
-
November 7, 2019 at 7:47 am #679
8- Shortened links (URL shorteners) supported via JSON or regex
Right now the only link shortener supported is shorte.st. However, it should be easy to add an option where we enter the API request URL and read the response as JSON or with a regex.
For example:
API URL Field:
https://clk.sh/api?api=5cafed9b61dab477dcaf77a1a30b1f09e0a84961&url=yourdestinationlink.com&alias=CustomAlias
This API call will return a response like this:
{"status":"success","shortenedUrl":"https:\/\/clk.sh\/xxxxx"}
Shortcut Query Type (Optional): Regex
API Response Query String (Optional): "(https?.*?)"
This way we could add any URL shortener for earnings dynamically, not only shorte.st.
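To make the proposal concrete, here is a rough Python sketch of reading such a response either by JSON key or by regex. The `extract_short_url` helper and its parameters are hypothetical; the response body is the sample above with plain quotes, and nothing here reflects how a particular shortener authenticates or formats its replies.

```python
import json
import re


def extract_short_url(body, query_type, query):
    """Pull the shortened URL out of an API response body.

    query_type "json": query is the key holding the URL.
    query_type "regex": query is a pattern whose group 1 is the URL.
    """
    if query_type == "json":
        return json.loads(body)[query]
    if query_type == "regex":
        match = re.search(query, body)
        if match is None:
            raise ValueError("pattern did not match the response")
        # JSON escapes "/" as "\/", so undo that in the raw match.
        return match.group(1).replace("\\/", "/")
    raise ValueError(f"unknown query type: {query_type}")


body = r'{"status":"success","shortenedUrl":"https:\/\/clk.sh\/xxxxx"}'

from_json = extract_short_url(body, "json", "shortenedUrl")
from_regex = extract_short_url(body, "regex", r'"(https?.*?)"')
print(from_json)
print(from_regex)
```

The JSON route unescapes the URL automatically, while the regex route works on the raw text and has to undo the `\/` escaping itself.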
-
November 7, 2019 at 8:12 am #681
9- Crawlomatic Auto-Generated Post Information
I don't see this option on my posts!
http://prntscr.com/ptlpvd
Is it working properly? It's really important to know where the articles were imported from.
-
November 7, 2019 at 8:44 am #683
Related to point (9), Crawlomatic Auto-Generated Post Information:
It was solved by installing the Classic Editor plugin for WordPress.
-
November 7, 2019 at 9:18 am #686
Hello,
1. The counter you pointed out is generated using JavaScript. The plugin can import JavaScript-generated content only if you use PhantomJS together with the plugin. Please check this tutorial video for details: https://www.youtube.com/watch?v=hnEPlQSeAZE
2. Regex stripping is possible, using the 'Run Regex On Content' and 'Replace Matches From Regex' features in rule settings. XPath stripping is a good idea, and I have added it to my update ideas list.
3. I will think about how to do this, but it sounds like something that is really complicated to implement in the plugin.
4. I will check UpToBox.
5. Yes, this is a current plugin limitation. I will think about expanding it.
6. Socks5 proxies should also work in the plugin. You need to specify their protocol in plugin settings. Example: socks5://bob.marley.com:222
7. This is not implemented yet, sounds like a good idea for the future.
8. This is actually much more complicated than you explained. Each URL shortener has a different way of authenticating to its API, a different way of sending back responses, and a different response structure to interpret. This would be hard to implement dynamically, and it would certainly not work with every API.
9. Ok.
Regards,
Szabi – CodeRevolution.
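The proxy format from point 6 (protocol prefix, host, port) can be illustrated with a short Python sketch. It is not the plugin's code: `parse_proxy`, the list of supported schemes, and the rule that an entry without a prefix is treated as HTTP are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumed set of schemes, for illustration only.
SUPPORTED_SCHEMES = {"http", "https", "socks4", "socks5"}


def parse_proxy(entry, default_scheme="http"):
    """Split a proxy entry into (scheme, host, port)."""
    if "://" not in entry:
        # Assumption: a bare host:port entry means an HTTP proxy.
        entry = f"{default_scheme}://{entry}"
    parts = urlsplit(entry)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"proxy needs host and port: {entry}")
    return parts.scheme, parts.hostname, parts.port


socks_proxy = parse_proxy("socks5://bob.marley.com:222")
plain_proxy = parse_proxy("10.0.0.1:8080")
print(socks_proxy)
print(plain_proxy)
```

The point of the sketch is that the protocol has to be spelled out in the entry itself; without the prefix there is nothing to distinguish a SOCKS5 proxy from an HTTP one.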
-
November 7, 2019 at 9:25 am #687
8. A lot of shortener sites implement an API like this one, such as clk.sh, exe.io, fl.cl, and many more.
So we could simply note that this will not work with every shortener site and that users need to check theirs, and so on.
-
November 7, 2019 at 9:26 am #688
Ok, I will check this.
-
November 8, 2019 at 10:03 am #701
10- Add more Data to Metabox
Right now we can only have the plugin show the metabox (Show Extended Item Information Metabox in Post), but what if we want to add additional data to that box? For example, download links scraped for that post, or anything else.
-
November 8, 2019 at 10:35 am #702
Yes, I will consider this also.
Regards.
-
The topic ‘Features should be implemented’ is closed to new replies.