Getting a Blank Page when I use Visual Selector

This topic is: resolved

 


This topic has 1 reply, 2 voices, and was last updated 3 years ago by Szabi – CodeRevolution.

Viewing 1 reply thread
  • Author
    Posts
    • #4272


      ncienfuegos
      Participant

      On a number of the websites I want to crawl, I get a blank page when I open the Crawling Restrictions section and then use the Visual Selector for this setting:
      Seed Page Crawling Query Type:

       

      Is there a way to bypass or resolve this? It happens on about 50% of the websites I want to crawl and select from with the Visual Selector.

       

      Here is an example:

      https://www.bizjournals.com/orlando/news/residential-real-estate

      This particular page gave me back this message instead of a blank page:

       
      Why am I seeing this page?
      The website you are visiting is protected and accelerated by Incapsula. Your computer may have been infected by malware and therefore flagged by the Incapsula network. Incapsula displays this page for you to verify that an actual human is the source of the traffic to this site, and not malicious software.
      What should I do?
      Just click the "I’m not a robot" checkbox to pass the security check. Incapsula will remember you and will not show this page again. We recommend you run a virus and malware scan on your computer to remove any infection.

    • #4275


      Szabi – CodeRevolution
      Keymaster
      Post count: 4620

      Hello,

      First of all, thank you for your purchase.

      This is caused by the scraping protection mechanisms which are active on the sites you wish to scrape.

      To make this work, you can try one or more of the suggestions below:

      • Add a user agent to the requests, using the ‘Set Custom Curl User Agent’ settings field. For example, you can enter the user agent of a recent Chrome browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36
      • Install Puppeteer on your server and configure the plugin to use it. Puppeteer drives a headless browser, so pages are loaded the way a real browser would load them, which gives access to more sites. To enable it, select Puppeteer in the ‘Content Scraping Method To Use’ settings field in the importing rule settings of the plugin. Steps to install Puppeteer: https://www.youtube.com/watch?v=KNOIJA4pTQo
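
If you want to check from outside the plugin whether a custom user agent is enough for a given site, a rough sketch like the one below may help. It is not how the plugin itself sends requests; it only uses the Python standard library to send the user agent string quoted above and to test the response for the kind of challenge page shown in this thread. The function names and the marker phrases are assumptions made for illustration (the phrases are copied from the Incapsula message quoted in the question).

```python
import urllib.request

# User agent string quoted in the reply above.
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
)

# Assumed markers, copied from the challenge page text quoted in this thread.
# Other protection services will use different wording.
BLOCK_MARKERS = (
    "protected and accelerated by incapsula",
    "why am i seeing this page",
)


def build_request(url: str, user_agent: str = CHROME_UA) -> urllib.request.Request:
    """Build a GET request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def looks_blocked(html: str) -> bool:
    """Return True for an empty body or a body containing a known marker."""
    text = html.strip().lower()
    return not text or any(marker in text for marker in BLOCK_MARKERS)


def fetch(url: str, timeout: float = 15.0) -> str:
    """Download a page as text. Raises urllib.error.HTTPError on 4xx/5xx."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `looks_blocked(fetch(url))` on one of your seed pages gives a quick hint: if it still returns True with the browser user agent set, the site is probably checking more than the header, and the Puppeteer option is the one more likely to help.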

      I hope this info helped.

      Regards, Szabi – CodeRevolution.

