Tagged: crawlomatic, proxy, woocommerce
This topic has 4 replies, 2 voices, and was last updated 1 year, 12 months ago by Szabi – CodeRevolution.
-
December 28, 2022 at 12:01 pm #6497
Hi Szabi, I tried to scrape products with their prices, descriptions, etc. from the site shown in the log below, but it failed because the site has reCAPTCHA enabled. I tried a single static proxy, multiple static proxies, and a rotating proxy, but none of them solved the problem.
The error that I get:
[28-Dec-2022 11:59:40 UTC] Now processing: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm
[28-Dec-2022 11:59:40 UTC] Puppeteer command: node "/var/www/html/wp-content/plugins/crawlomatic-multipage-scraper-post-generator/res/puppeteer/puppeteer.js" "https://www.ceneo.pl/Zegarki/Typ:Meskie.htm" "23.109.113.60:9001~~~9vRzeMMeNZYAOAln:wifi;af;;;" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" "default" "default" "30000" "default" "default" "default" 2>&1
[28-Dec-2022 12:00:11 UTC] puppeteer failed to download resource: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm – error: /var/www/html/wp-content/plugins/crawlomatic-multipage-scraper-post-generator/res/puppeteer/puppeteer.js:10
process.on('unhandledRejection', up => { throw up })
^
TimeoutError: Navigation timeout of 30000 ms exceeded
    at LifecycleWatcher._LifecycleWatcher_createTimeoutPromise (/var/www/html/node_modules/puppeteer/lib/cjs/puppeteer/common/LifecycleWatcher.js:167:12)
[28-Dec-2022 12:00:11 UTC] Delay between requests set(1), waiting 1000 ms
[28-Dec-2022 12:00:22 UTC] crawlomatic_str_get_html failed for page (first attempt), xpath is: product-full-description js_product-full-description overheight!
[28-Dec-2022 12:00:22 UTC] crawlomatic_str_get_html failed for page, xpath: product-full-description js_product-full-description overheight!
[28-Dec-2022 12:00:22 UTC] Already posted, skipping: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm – ID: 1117
[28-Dec-2022 12:00:22 UTC] Crawling seed page for links: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm using: visual = lazyloaded
[28-Dec-2022 12:00:22 UTC] 0 items scraped for URL: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm
[28-Dec-2022 12:00:22 UTC] All crawled posts are already posted or no content found for your query. Rule ID: 1: visual — lazyloaded -
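For anyone triaging a similar debug log, the key lines above (the navigation timeout, the skipped URL, and the zero-item result) can be picked out with a short script. This is only a minimal Python sketch and not part of Crawlomatic: the function name `triage` is hypothetical, and the patterns are assumptions based solely on the log lines quoted in this topic.

```python
import re

# Patterns are assumptions derived only from the log lines quoted in this topic;
# other plugin versions may word these messages differently.
TIMEOUT_RE = re.compile(r"Navigation timeout of (\d+) ms exceeded")
SKIPPED_RE = re.compile(r"Already posted, skipping: (\S+)")
ZERO_ITEMS_RE = re.compile(r"\b0 items scraped for URL: (\S+)")


def triage(log_text):
    """Return short, human-readable findings for each recognised log line.

    `triage` is a hypothetical helper name used for illustration only.
    """
    findings = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            findings.append(f"navigation timed out after {match.group(1)} ms")
        match = SKIPPED_RE.search(line)
        if match:
            findings.append(f"already posted, skipped {match.group(1)}")
        match = ZERO_ITEMS_RE.search(line)
        if match:
            findings.append(f"no items scraped from {match.group(1)}")
    return findings


if __name__ == "__main__":
    sample = "\n".join([
        "TimeoutError: Navigation timeout of 30000 ms exceeded",
        "[28-Dec-2022 12:00:22 UTC] 0 items scraped for URL: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm",
    ])
    for finding in triage(sample):
        print(finding)
```

A timeout finding followed by a zero-item finding for the same URL, as in this log, suggests the page never finished loading in the headless browser, which is consistent with a reCAPTCHA or bot-protection page being served instead of the product listing.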
December 28, 2022 at 1:30 pm #6499
Hello,
First of all, thank you for your purchase.
Could you please send me temporary admin login credentials for your WordPress install, so I can run some tests on the plugin setup? Please send them to my email address: kisded@yahoo.com
I will try to help solve this issue; however, I cannot promise that I can make it work, as some reCAPTCHAs are very resilient and almost impossible to bypass. But I will try.
Regards, Szabi – CodeRevolution.
-
December 28, 2022 at 2:01 pm #6500
This reply has been marked as private. -
December 28, 2022 at 2:04 pm #6501
This reply has been marked as private. -
December 28, 2022 at 4:27 pm #6502
OK, I will let you know by email when I have an update on this.
-
The topic ‘Proxy problems’ is closed to new replies.