This topic has 7 replies, 2 voices, and was last updated 2 years, 6 months ago by Newbe.
-
AuthorPosts
-
-
May 28, 2022 at 12:15 pm #5211
Hello, is there a tutorial for scraping a website like brainly.co.id?
Thanks
-
May 28, 2022 at 2:50 pm #5213
Hello,
First of all, thank you for your purchase.
Please give me more details about which parts of the site you wish to scrape.
Is it the search result which is found in this location? https://brainly.co.id/app/ask?entry=hero&q=test
Let me know details and I will help.
Regards, Szabi – CodeRevolution.
-
May 28, 2022 at 9:07 pm #5217
This reply has been marked as private.
-
May 29, 2022 at 11:42 am #5221
Hello,
I checked, and this website uses JavaScript to render its content. Because of this, to be able to scrape it, you need to install Puppeteer on your server and configure the plugin to use it when scraping such sites. Please check this video for details: https://www.youtube.com/watch?v=g99IlDkt_SY
How to install Puppeteer on your server: https://www.youtube.com/watch?v=KNOIJA4pTQo
If installing Puppeteer is not possible on your server, you can also use HeadlessBrowserAPI, a cloud service that renders JavaScript on pages and allows them to be scraped: https://headlessbrowserapi.com/
Tutorial video on this: https://www.youtube.com/watch?v=rj-LOI-sc14
Regards,
Szabi – CodeRevolution.
-
May 29, 2022 at 11:44 am #5222
Also, please check the below settings I used in the plugin to scrape the page, using puppeteer:
Scraper Start (Seed) URL / Keywords:
https://brainly.co.id/mapel/matematika
Content Scraping Method To Use:
Puppeteer
Do Not Scrape Seed URL:
checked
Seed Page Crawling Query Type:
Class
Seed Page Crawling Query String:
brn-feed-item__content
Content Query Type:
Class
Content Query String:
question_box_text
Regards.
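For readers wondering what a "Class" query does, here is a minimal Python sketch of the idea: pick out every element that carries a given CSS class and keep its text. This is not the plugin's own code, and it does not render JavaScript (that part is still Puppeteer's job). The sample HTML, the /tugas/ paths, and the names ClassTextExtractor and extract_by_class are made up for illustration; only the class name brn-feed-item__content comes from the settings above.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying a given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # > 0 while inside a matching element
        self.buffer = []    # text pieces of the current match
        self.results = []   # one cleaned string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1             # nested tag inside a match
        elif self.class_name in classes:
            self.depth = 1              # a new match starts here
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:             # the matching element just closed
            text = " ".join("".join(self.buffer).split())
            self.results.append(text)

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_by_class(html, class_name):
    """Return the text of each element whose class list contains class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical, already-rendered HTML standing in for a seed page.
SAMPLE_HTML = """
<div class="brn-feed-item__content"><a href="/tugas/1">Question one</a></div>
<div class="sidebar">Ignore me</div>
<div class="brn-feed-item__content"><a href="/tugas/2">Question <b>two</b></a></div>
"""

print(extract_by_class(SAMPLE_HTML, "brn-feed-item__content"))
```

The same function with question_box_text as the class name would mirror the Content Query String setting, assuming the rendered question page contains an element with that class.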
-
May 29, 2022 at 12:16 pm #5223
Great, thanks!
Is there a way to scrape posts using a list of post URLs in a .txt file?
Something like importing a URL list and putting it in a queue.
I don’t want to take all the posts there, just the ones I need.
Regards
-
May 29, 2022 at 12:53 pm #5226
Yes, this is possible. To do this, create a file containing the list of URLs you wish to scrape and upload it to your server.
Afterwards, you can start scraping from that URL list file.
In this case, please be sure to uncheck the ‘Do Not Crawl External Links’ checkbox in the importing rule settings, and also set:
Do Not Scrape Seed URL:
checked
Seed Page Crawling Query Type:
Auto Detect
I will make a tutorial video on this soon and publish it to my YouTube channel: https://www.youtube.com/channel/UCVLIksvzyk-D_oEdHab2Lgg
Regards.
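If it helps, the URL list file can be prepared with a small script before uploading. The sketch below is only an illustration and assumes a one-URL-per-line text file, which the thread does not confirm as the format the plugin expects. The function names, the file name url-list.txt, and the /tugas/ URLs are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def clean_url_list(lines):
    """Return unique http(s) URLs in their original order.

    Blank lines, lines starting with '#', and anything that is not an
    absolute http/https URL are skipped.
    """
    seen = set()
    urls = []
    for line in lines:
        url = line.strip()
        if not url or url.startswith("#"):
            continue
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_url_list(urls, path):
    """Write one URL per line (assumed format, see the note above)."""
    Path(path).write_text("\n".join(urls) + "\n", encoding="utf-8")


# Hypothetical hand-collected input with a duplicate and some junk lines.
RAW_LINES = [
    "https://brainly.co.id/tugas/1\n",
    "",
    "  https://brainly.co.id/tugas/1  ",
    "not a url",
    "# posts to pick up later",
    "https://brainly.co.id/tugas/2",
]

if __name__ == "__main__":
    cleaned = clean_url_list(RAW_LINES)
    write_url_list(cleaned, "url-list.txt")  # then upload this file to the server
    print(cleaned)
```

Removing duplicates up front keeps the same post from being queued twice; whether the plugin does its own deduplication is not stated here.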
-
May 29, 2022 at 12:59 pm #5227
Thank you very much for your help and support.
Regards
-
The topic ‘scraping web like brainly.co.id’ is closed to new replies.