Forum Replies Created
-
AuthorPosts
-
March 28, 2024 at 9:52 am in reply to: Can not use the suggested xpath/class from Crawling Helper tool #10313
Thanks so much!
-
March 28, 2024 at 8:45 am in reply to: Can not use the suggested xpath/class from Crawling Helper tool #10311
Could you please use this demo account?
I’ve added the rule to it, and the scrape returns nothing. Thanks!
-
March 28, 2024 at 2:03 am in reply to: Can not use the suggested xpath/class from Crawling Helper tool #10309
Hi, I tried this before and it didn’t work. Thanks.
-
March 14, 2024 at 6:29 pm in reply to: Is it possible to scrap two webpages into one product? #10139
I see, thanks for the fast reply!
-
February 15, 2024 at 5:02 pm in reply to: Web Crawl page timeout will cause all the checkbox is checked #9896
Hi,
I can’t reproduce this issue anymore, thanks so much for fixing it.
When will you release this version? I will check and install it once it is released.
Thanks again.
-
February 14, 2024 at 6:36 pm in reply to: Web Crawl page timeout will cause all the checkbox is checked #9890
Thanks for checking. I reproduced it again, and this time the screen recording clearly shows all the checkboxes being checked.
https://drive.google.com/file/d/1zXQU-qRWqUje04T3jCQFzE7xLzuICY-D/view?usp=drive_link
It seems to be a rare case, but it causes a lot of trouble when many rules need to be fixed.
Thank you!
-
February 14, 2024 at 5:37 pm in reply to: Web Crawl page timeout will cause all the checkbox is checked #9887
Hi, I successfully reproduced this issue on your demo site.
Could you please take a look and see if you can find anything?
The interesting thing is that this time all the checkboxes are unchecked.
What I did was use the browser’s network throttling to set the connection to offline, then click the “Save Changes” button.
Then I switched to “Slow 3G” and went back to the page.
Thanks.
-
February 6, 2024 at 6:26 pm in reply to: How to remove specific html attributes from content crawling #9813
Hi, I just tried it, but it also removes the element I need to crawl.
I need //div[@id="procuct-table"] but want to remove its class attribute "tab-pane fade py-sm-5". Is that possible?
Here is the HTML for reference, thanks.
<div id="procuct-table" class="tab-pane fade py-sm-5" role="tabpanel" aria-labelledby="nav-profile-tab">
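To make the desired result concrete, here is a minimal sketch in Python of keeping the element while dropping only its class attribute. It is not the plugin's own feature or setting; the function name `strip_class`, the wrapping `<section>` and the inner `<p>` are made up for illustration, and the standard-library XML parser used here only accepts well-formed markup, which real pages often are not.

```python
import xml.etree.ElementTree as ET

# Sample markup built around the tag quoted above.
# The <section> wrapper and the <p> child are placeholders (assumptions),
# not the real page content.
SAMPLE = (
    '<section>'
    '<div id="procuct-table" class="tab-pane fade py-sm-5" '
    'role="tabpanel" aria-labelledby="nav-profile-tab">'
    '<p>placeholder content</p>'
    '</div>'
    '</section>'
)


def strip_class(markup: str, element_id: str) -> str:
    """Return the div with the given id, serialized without its class attribute."""
    root = ET.fromstring(markup)
    # ElementTree supports the [@attribute='value'] XPath predicate.
    target = root.find(f".//div[@id='{element_id}']")
    if target is None:
        raise ValueError(f"no div with id {element_id!r} found")
    # Remove only the class attribute; the element and its children stay.
    target.attrib.pop("class", None)
    return ET.tostring(target, encoding="unicode")


cleaned = strip_class(SAMPLE, "procuct-table")
print(cleaned)
```

The element is still selected by its id, so the other attributes and the inner content are left untouched.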
-
February 6, 2024 at 6:07 pm in reply to: Web Crawl page timeout will cause all the checkbox is checked #9812
Hi,
1. Go to “Web Crawl to Posts” and make sure it already contains some rules.
2. Edit something in a rule, then click “OK” to leave the rule settings.
3. Leave the page idle for a long time (for example, let the laptop go to sleep) so that clicking “Save Settings” shows “this page is expired”.
4. Go back to the page, and all the checkboxes will be checked.
Sorry, I can’t give you admin access since it’s a live website, but I’d be grateful if you could try the steps above.
Thanks,
Luke
-
January 16, 2024 at 4:01 pm in reply to: How to check why some pages are not scraped? #9692
Hi, thanks for replying.
I found the reason: I had scraped some pages before and moved them to the trash (not permanently deleted).
So the rule automatically skipped the pages it had already scraped.
That’s the expected behavior for this plugin, right?
And for issues like this, is there a change log or event tracking that shows what happened during crawling?
Thanks,
Luke
-
December 25, 2023 at 7:42 am in reply to: Can’t crawl the pageurl via visual selector for kolin.com #9458
Thanks for replying!
I’ve run into another issue while crawling a content page.
Here is the log:
[25-Dec-2023 15:26:12 Etc/GMT-8] Failed to exec curl in crawlomatic_curl_exec_utf8! https://kolin.com.tw/assets/uploads/files/product/3fridge_cate/KR-258V05/KR-258V05_%E5%95%86%E8%AA%AA02.png – err: Connection timed out after 10001 milliseconds – 28 url: https://kolin.com.tw/assets/uploads/files/product/3fridge_cate/KR-258V05/KR-258V05_%E5%95%86%E8%AA%AA02.png
The page I crawled: https://kolin.com.tw/product/fridge/518 with xpath: //div[@class='row pdt_content']
It seems the content image is too large and the request timed out. How can I solve this kind of problem?
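For reference, a minimal sketch in Python of how one might check outside the plugin whether the image loads when given more time than the roughly 10 seconds shown in the log. This does not change the plugin's behavior; the function names `download` and `fetch_with_growing_timeout` and the timeout values are assumptions chosen for illustration.

```python
import urllib.request
from typing import Callable, Optional, Sequence


def download(url: str, timeout: float) -> bytes:
    """Fetch the URL with the given timeout in seconds and return the body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_growing_timeout(
    url: str,
    timeouts: Sequence[float] = (10.0, 30.0, 60.0),  # assumed values
    fetch: Callable[[str, float], bytes] = download,
) -> bytes:
    """Try the URL once per timeout value, moving to a longer one on failure."""
    last_error: Optional[OSError] = None
    for timeout in timeouts:
        try:
            return fetch(url, timeout)
        except OSError as error:
            # Timeouts and urllib.error.URLError are both OSError subclasses.
            last_error = error
    raise RuntimeError(f"all attempts failed for {url}") from last_error
```

If the download only succeeds with one of the longer timeouts, that would support the guess that the image size, rather than the xpath, is what makes the crawl fail.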
By the way, what is the “Crawled Pages Crawling Query” option for, and when would I need to use it?
Thanks and Merry Christmas,
Luke
-