-
September 1, 2021 at 7:10 pm #3726
Hi, first of all, I would like to thank you for the amazing plugins that you have created. I’m the proud owner of two of them, Crawlomatic and Newsomatic, which I bought last week and have been using since.
I do have a small problem, though: I have tried every method I could find without success, so I am asking for your help.
I have been trying to crawl content from the site below.
https://www.mynewsdesk.com/se/stories/
I only get an error, and when I try to use the visual selector, only a blank page appears. Can you please help me with this issue?
Thank you so much in advance,
Wishing you a nice evening.
Kind regards.
Leif Hansen
-
September 2, 2021 at 7:54 am #3729
Hello,
First of all, thank you for your purchase.
I checked the URL you linked, and it seems that this site uses JavaScript to load its content after the page has loaded in the visitor's browser. Because of this, regular PHP scrapers cannot parse these links: the links are not present in the HTML that the scraper receives.
However, the Crawlomatic plugin can also be configured to scrape this content if it is combined with a headless browser (like Puppeteer or PhantomJS), which needs to be installed on your server, or with HeadlessBrowserAPI (an API I created, which provides JavaScript-generated content without the need to install anything on your server).
Please check details about this in the videos below:
Puppeteer support: https://www.youtube.com/watch?v=g99IlDkt_SY
How to install Puppeteer: https://www.youtube.com/watch?v=XkVfYWRZpko
HeadlessBrowserAPI (as an alternative): https://www.youtube.com/watch?v=205EinBQAoo&list=PLEiGTaa0iBIjDrfexapWc3M28iHwJI5tT&index=2
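If you want to check for yourself whether a page's links exist in the raw HTML (which is all a regular scraper sees), a minimal Python sketch like the one below may help. It is only an illustration and not part of Crawlomatic: the `static_links` helper and both sample HTML strings are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def static_links(html, must_contain=""):
    """Return the hrefs present in raw HTML, optionally filtered by a substring."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [link for link in parser.links if must_contain in link]


# Hypothetical sample: a page shell whose story list is filled in later by JavaScript.
js_rendered = (
    '<html><body><div id="app"></div>'
    '<script src="/bundle.js"></script></body></html>'
)

# Hypothetical sample: the same kind of list delivered directly in the HTML.
server_rendered = (
    '<html><body><a href="/se/stories/example-1">One</a>'
    '<a href="/about">About</a></body></html>'
)

print(static_links(js_rendered, "/stories/"))      # prints []
print(static_links(server_rendered, "/stories/"))  # prints ['/se/stories/example-1']
```

An empty result for a page that clearly shows links in the browser suggests the content is generated by JavaScript, so a headless browser is needed. To try it on a real page, you could download the raw HTML first (for example with `urllib.request.urlopen`) and pass the decoded text to `static_links`.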
I hope this info helps.
Regards, Szabi – CodeRevolution.
-
September 2, 2021 at 8:08 am #3731
Hi Szabi,
Thank you so much for your quick response. I will have a look at the links you have sent to me.
Once again, thank you so much for the amazing job you do creating WordPress plugin solutions.
Keep up the good work, and I wish you a wonderful day.
Kind regards.
Mario Leif
-
September 2, 2021 at 8:23 am #3732
Thank you as well, and have a great day too!
Cheers!
-
September 2, 2021 at 9:39 am #3737
Thank you so much, Szabi. I have created a subscription with you for the HeadlessBrowserAPI and added the API key, and it is working.
Wish you a wonderful day.
Best regards.
Mario Leif
-
September 2, 2021 at 9:40 am #3738
Thank you, I am glad to help!
Regards.
-
The topic ‘Triying to craw and get content from https://www.mynewsdesk.com/se’ is closed to new replies.