This topic has 5 replies, 2 voices, and was last updated 1 year, 11 months ago by Szabi – CodeRevolution.
December 23, 2022 at 3:16 pm #6478
Excellent plugin. I have said it before.
I am using it on my site http://www.neuquen.uno
I’m crawling almost 25 sites. I have a problem crawling the following site:
https://www.legislaturaneuquen.gob.ar/
or, alternatively,
https://www.legislaturaneuquen.gob.ar/prensaNuevo.aspx
It is developed with Active Server Pages Extended (ASPX).
Can Crawlomatic crawl sites developed with ASP.NET?
Regards
-
December 23, 2022 at 8:45 pm #6479
Hello,
First of all, thank you for your purchase.
I managed to scrape this site with Crawlomatic, using the settings below:
Scraper Start (Seed) URL / Keywords: https://www.legislaturaneuquen.gob.ar/prensaNuevo.aspx
Do Not Scrape Seed URL: checked
Seed Page Crawling Query Type: ID
Seed Page Crawling Query String: ctl00_ContentPlaceHolder1_DataList1
Content Query Type: ID
Content Query String: ctl00_ContentPlaceHolder1_lblTexto
Title Query Type: ID
Title Query String: ctl00_ContentPlaceHolder1_lblTitulo
For https://www.legislaturaneuquen.gob.ar/ the settings are the same, except for the differences below:
Scraper Start (Seed) URL / Keywords: https://www.legislaturaneuquen.gob.ar/
Seed Page Crawling Query Type: ID
Seed Page Crawling Query String: ctl00_ContentPlaceHolder1_div_NoticiasEnNoPrincipales
I hope this info helps.
Regards, Szabi – CodeRevolution.
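For readers wondering what an "ID" query type does in practice, here is a minimal Python sketch of selecting an element by its id attribute and collecting the text and links inside it. It only illustrates the general idea and is not Crawlomatic's actual implementation. The sample markup and the /noticia.aspx links are made up for the example; only the ID string comes from the settings above.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementByIdParser(HTMLParser):
    """Collects the text and link targets inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # > 0 while we are inside the target element
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 0:
            # Not inside the target yet: only look for the matching id.
            if attrs.get("id") == self.target_id and tag not in VOID_TAGS:
                self.depth = 1
            return
        if tag not in VOID_TAGS:
            self.depth += 1
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_by_id(html, element_id):
    """Return (text_parts, links) found inside the element with this id."""
    parser = ElementByIdParser(element_id)
    parser.feed(html)
    parser.close()
    return parser.text_parts, parser.links


# Hypothetical markup, NOT copied from the real site.
SAMPLE_HTML = """
<div id="ctl00_ContentPlaceHolder1_DataList1">
  <a href="/noticia.aspx?id=1">First item</a><br>
  <a href="/noticia.aspx?id=2">Second item</a>
</div>
<div id="other"><a href="/ignored.aspx">Ignored</a></div>
"""

texts, links = extract_by_id(SAMPLE_HTML, "ctl00_ContentPlaceHolder1_DataList1")
print(texts)
print(links)
```

In this sketch the seed page query plays the role of the outer div: only links found inside the element with the given ID are collected, and anything outside it is ignored.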
-
December 24, 2022 at 1:29 am #6480
Thank you very much! The only thing missing is the images.
Could you please help me with that?
Regards
-
December 24, 2022 at 12:55 pm #6481
For this specific site, scraping featured images is possible only with Puppeteer, a headless browser that also renders JavaScript on scraped pages. If possible, please install Puppeteer on your site, as shown here: https://www.youtube.com/watch?v=pRUDcSOe724
If this is not possible for you, you can also use HeadlessBrowserAPI: https://www.youtube.com/watch?v=rj-LOI-sc14
Afterwards, please configure Crawlomatic as follows:
Content Scraping Method To Use: Puppeteer
Featured Image Query Type: ID
Featured Image Query String: ctl00_ContentPlaceHolder1_Image3
Regards.
-
December 30, 2022 at 2:27 am #6506
I have access to the server (CentOS).
I have already installed Puppeteer following these instructions:
https://coderevolution.ro/knowledge-base/faq/how-to-install-puppeteer-globally-on-centos/
Crawlomatic throws an error when I choose Puppeteer:
shell_exec is not enabled on your server.
Besides this, what other features do I need to enable?
Regards
-
December 30, 2022 at 4:39 am #6507
Hello,
You also need to enable shell_exec on your server. Please check the details here: https://www.namecheap.com/support/knowledgebase/article.aspx/9396/2219/how-to-enable-exec/
Regards.
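For anyone who wants to check their own configuration: PHP functions such as shell_exec are usually blocked through the disable_functions directive in php.ini. The Python sketch below is only a rough illustration, assuming you have the php.ini contents as text; the sample excerpt is hypothetical and not taken from the poster's server. It lists which functions that directive disables.

```python
def disabled_functions(php_ini_text):
    """Return the set of names listed in the disable_functions directive."""
    disabled = set()
    for raw_line in php_ini_text.splitlines():
        # In php.ini, everything after a semicolon is a comment.
        line = raw_line.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "disable_functions":
            # Assumption: if the directive appears more than once,
            # the last occurrence is the one that counts.
            names = value.strip().strip('"').split(",")
            disabled = {name.strip() for name in names if name.strip()}
    return disabled


# Hypothetical excerpt, not taken from any real server.
SAMPLE_INI = """
; example php.ini fragment
memory_limit = 256M
disable_functions = exec, shell_exec, passthru
"""

blocked = disabled_functions(SAMPLE_INI)
print("shell_exec disabled:", "shell_exec" in blocked)
```

If shell_exec shows up in that list, removing it from the directive and restarting the web server or PHP service is the usual fix. On shared hosting, the change typically has to be made through the hosting control panel or by the hosting provider.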
The topic ‘Can Crawlomatic crawl sites developed with asp.net?’ is closed to new replies.