Proxy problems

This topic is: resolved

 


Viewing 4 reply threads
  • Author
    Posts
    • #6497


      marcus
      Participant
      Post count: 8

      Hi Szabi, I tried to scrape products with prices, descriptions, etc. from the site below, but it failed because the site has reCAPTCHA enabled. I tried a single static proxy, multiple static proxies, and a rotating proxy, but none of them solves the problem.

      The error that I get:

      [28-Dec-2022 11:59:40 UTC] Now processing: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm
      [28-Dec-2022 11:59:40 UTC] Puppeteer command: node "/var/www/html/wp-content/plugins/crawlomatic-multipage-scraper-post-generator/res/puppeteer/puppeteer.js" "https://www.ceneo.pl/Zegarki/Typ:Meskie.htm" "23.109.113.60:9001~~~9vRzeMMeNZYAOAln:wifi;af;;;" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36" "default" "default" "30000" "default" "default" "default" 2>&1
      [28-Dec-2022 12:00:11 UTC] puppeteer failed to download resource: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm – error: /var/www/html/wp-content/plugins/crawlomatic-multipage-scraper-post-generator/res/puppeteer/puppeteer.js:10
      process.on('unhandledRejection', up => { throw up })
      ^
      TimeoutError: Navigation timeout of 30000 ms exceeded
          at LifecycleWatcher._LifecycleWatcher_createTimeoutPromise (/var/www/html/node_modules/puppeteer/lib/cjs/puppeteer/common/LifecycleWatcher.js:167:12)
      [28-Dec-2022 12:00:11 UTC] Delay between requests set(1), waiting 1000 ms
      [28-Dec-2022 12:00:22 UTC] crawlomatic_str_get_html failed for page (first attempt), xpath is: product-full-description js_product-full-description overheight!
      [28-Dec-2022 12:00:22 UTC] crawlomatic_str_get_html failed for page, xpath: product-full-description js_product-full-description overheight!
      [28-Dec-2022 12:00:22 UTC] Already posted, skipping: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm – ID: 1117
      [28-Dec-2022 12:00:22 UTC] Crawling seed page for links: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm using: visual = lazyloaded
      [28-Dec-2022 12:00:22 UTC] 0 items scraped for URL: https://www.ceneo.pl/Zegarki/Typ:Meskie.htm
      [28-Dec-2022 12:00:22 UTC] All crawled posts are already posted or no content found for your query. Rule ID: 1: visual — lazyloaded
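
For anyone who wants to test the proxy outside the plugin, the proxy argument in the Puppeteer command above appears to have the shape `host:port~~~user:password`. That layout is only inferred from the log line, not from plugin documentation. The helper name `parseProxyArg` and the sample values are hypothetical. A minimal sketch of splitting such a string might look like this:

```javascript
// Hypothetical helper: splits a proxy argument shaped like the one in the log
// ("host:port~~~user:password") into its parts. The "~~~" layout is an
// assumption read off the log line, not confirmed plugin behaviour.
function parseProxyArg(arg) {
  const [server, credentials = ""] = arg.split("~~~");
  const sep = credentials.indexOf(":");
  // Split on the first colon only, so the password may itself contain ":" or ";".
  const username = sep === -1 ? credentials : credentials.slice(0, sep);
  const password = sep === -1 ? "" : credentials.slice(sep + 1);
  return { server, username, password };
}

// Placeholder values, not real credentials.
const proxy = parseProxyArg("proxy.example.com:9001~~~someuser:some;pass");
console.log(proxy);
```

In a standalone Puppeteer test, the `server` part would typically be passed to Chromium through the `--proxy-server` launch argument and the credentials through `page.authenticate()`. The 30000 ms limit in the error corresponds to the navigation `timeout` option of `page.goto()`. Neither setting bypasses a reCAPTCHA challenge. They only help confirm whether the proxy reaches the site at all.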

    • #6499


      Szabi – CodeRevolution
      Keymaster
      Post count: 4195

      Hello,

      First of all, thank you for your purchase.

      Could you please send me temporary admin login credentials for your WordPress install, so I can run some tests on the plugin setup? Please send them to my email address: [email protected]

      I will try to help solve this issue. However, I cannot promise that I can make it work, as some reCAPTCHAs are very resilient and almost impossible to bypass. But I will try.

      Regards, Szabi – CodeRevolution.

    • #6500


      marcus
      Participant
      Post count: 8
      This reply has been marked as private.
    • #6501


      marcus
      Participant
      Post count: 8
      This reply has been marked as private.
    • #6502


      Szabi – CodeRevolution
      Keymaster
      Post count: 4195

      OK, I will let you know by email when I have an update on this.


The topic ‘Proxy problems’ is closed to new replies.