Hello,
I checked again and, indeed, the issue is caused by the scraped site limiting access to its images: the image requests were sent too quickly one after another, so a rate limiter on their side kicked in and denied access to some of the images.
I tried to get around this limitation by adding two import rule settings to rule ID 81: ‘Delay Between Multiple Requests (ms)’ -> 1000, and ‘Set Custom Curl User Agent’ -> Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36.
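For reference, here is a rough TypeScript sketch (not the plugin's actual code) of what those two settings amount to at the HTTP level: sending a browser-like User-Agent header and waiting a fixed delay between image requests. The imageUrls list is only a placeholder, not your real import data.

```typescript
// Sketch only: fixed delay between requests plus a browser-like User-Agent.
const USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36";
const DELAY_MS = 1000; // "Delay Between Multiple Requests (ms)"

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchImages(imageUrls: string[]): Promise<void> {
  for (const url of imageUrls) {
    const response = await fetch(url, {
      headers: { "User-Agent": USER_AGENT }, // "Set Custom Curl User Agent"
    });
    if (!response.ok) {
      console.warn(`Blocked or failed: ${url} (HTTP ${response.status})`);
    }
    await sleep(DELAY_MS); // wait before the next request to avoid the rate limiter
  }
}
```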
Unfortunately, none of the above was enough to scrape all the images correctly.
I am not yet sure which content scraping protection they are using, but I suspect that getting around it would only be possible by installing a headless browser (such as Puppeteer) on your server and combining the plugin with it. However, I am not 100% sure about this, nor that it will help; it depends on how aggressive their scraping protection is.
Please check details on the above, here: https://www.youtube.com/watch?v=g99IlDkt_SY
How to install Puppeteer on your server (VPS only): https://www.youtube.com/watch?v=KNOIJA4pTQo
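If you do go the Puppeteer route, the idea is roughly the sketch below: load the source page in headless Chrome so the site's anti-bot checks see a real browser, then collect the image URLs from the rendered page. This is only an illustration, not part of the plugin; the pageUrl value and the plain "img" selector are placeholder assumptions you would adapt to the actual site.

```typescript
import puppeteer from "puppeteer";

// Sketch only: open the page in headless Chrome and return all <img> src URLs.
async function scrapeImageUrls(pageUrl: string): Promise<string[]> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so lazily loaded images are present.
    await page.goto(pageUrl, { waitUntil: "networkidle2" });
    return await page.$$eval("img", (imgs) => imgs.map((img) => img.src));
  } finally {
    await browser.close();
  }
}
```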
Please check.
Regards, Szabi – CodeRevolution.