This topic has 4 replies, 2 voices, and was last updated 6 months, 2 weeks ago by Szabi – CodeRevolution.
-
AuthorPosts
-
-
June 8, 2024 at 2:46 am #10637
Hello,
I’m trying to scrape content from https://fapello.com/ (porn)
If I run the scraper on a page with content, for example https://fapello.com/ayelenn/, it copies the small gallery images on that page into the post. What I’d like instead is for the scraper to copy each large image that appears after clicking a gallery image, and to add all of those images to the WordPress gallery for that post.
Basically I’m trying to achieve the same “Gallery” result as the website I’m copying from.
Each link in the gallery has a number after it. For example: https://fapello.com/ayelenn/1/ https://fapello.com/ayelenn/2 https://fapello.com/ayelenn/3
Would I need to use regex to accomplish this?
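For illustration, links of the shape described above (a gallery slug followed by a number) can be picked out with a regular expression. The sketch below is only an assumption about how such filtering could look in plain Python; the `example.com` URLs, the `DETAIL_URL` pattern, and the `detail_links` helper are all hypothetical, and whether the plugin accepts a pattern like this is not stated in this thread.

```python
import re

# Hypothetical pattern: <base>/<slug>/<number> with an optional trailing slash.
DETAIL_URL = re.compile(
    r"^https://example\.com/(?P<slug>[^/]+)/(?P<index>\d+)/?$"
)

def detail_links(hrefs, slug):
    """Return the numbered sub-page URLs of one gallery, sorted by number."""
    found = []
    for href in hrefs:
        match = DETAIL_URL.match(href)
        if match and match.group("slug") == slug:
            found.append((int(match.group("index")), href))
    return [href for _, href in sorted(found)]

# Made-up links, as they might be collected from a gallery page.
links = [
    "https://example.com/some-gallery/",
    "https://example.com/some-gallery/2",
    "https://example.com/some-gallery/1/",
    "https://example.com/other/3/",
]
print(detail_links(links, "some-gallery"))
```

The optional trailing slash in the pattern matters because the example links in the post are written both with and without one.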
Also, how do I add the images to the WordPress gallery instead of embedding them into the post?
Can you please provide some guidance on how to achieve this?
I can provide more details if needed.
Thanks.
-
June 8, 2024 at 10:24 am #10640
Hello,
This site can be scraped using the Crawlomatic plugin, please check it here: https://1.envato.market/crawlomatic
Tutorial video on usage: https://www.youtube.com/watch?v=F6vhRJgCR_M&list=PLEiGTaa0iBIgcqNzVBaoTCS4ws47vNMuQ&index=2
Regards, Szabi – CodeRevolution.
-
June 9, 2024 at 6:29 am #10646
Hello,
Thanks for the reply. I have the plugin installed and running. I’ve followed the video and used the visual selector to select an image on the seed page. It found the class: //*[@class='w-full h-full absolute object-cover inset-0']
When I run the scraper, it doesn’t work and comes back with the error: All crawled posts are already posted or no content found for your query. Rule ID: 0: visual -- //*[@class='w-full h-full absolute object-cover inset-0']
The post doesn’t exist yet, and the class is present on each gallery image.
I changed the Seed Page Crawling Query Type to class and entered: w-full h-full absolute object-cover inset-0
It still doesn’t work.
I also tried the class: max-w-full lg:h-64 h-40 rounded-md relative overflow-hidden uk-transition-toggle
That didn’t work either.
Testing on this page: https://fapello.com/jadeyanh-20/
Not sure how to proceed.
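One general point about selectors of this kind: an XPath test such as `@class='...'` compares the whole attribute string, so it only matches when the class list is identical, including order and any extra classes. Whether that is the cause of the error here is not known. The sketch below only illustrates the difference on made-up markup, using Python's standard library; the snippet and the `has_classes` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up markup: the second image has the same classes in a different
# order plus one extra class.
SNIPPET = """
<div id="content">
  <a href="/g/1/"><img class="w-full h-full absolute object-cover inset-0" src="a.jpg"/></a>
  <a href="/g/2/"><img class="w-full h-full absolute inset-0 object-cover lazy" src="b.jpg"/></a>
</div>
"""
root = ET.fromstring(SNIPPET)

# Exact string comparison on the class attribute.
exact = root.findall(".//*[@class='w-full h-full absolute object-cover inset-0']")

def has_classes(element, wanted):
    """True if every class name in `wanted` appears on the element."""
    return set(wanted) <= set(element.get("class", "").split())

# Token-based comparison: order and extra classes do not matter.
by_token = [el for el in root.iter() if has_classes(el, ["object-cover", "inset-0"])]

print(len(exact), len(by_token))  # 1 2
```

Matching on one or two distinctive class names, or on an ID, is usually less brittle than matching the full class string.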
Thanks
-
June 9, 2024 at 6:40 am #10647
Hello,
I seem to have made it partially work.
I can scrape the first image in the gallery by using:
Seed Page Crawling Query Type: ID
Seed Page Crawling Query String: content
And the:
Content Query Type (Optional): Class
Content Query String (Optional): uk-align-center
But how do I make the scraper access every gallery link and scrape each image?
Right now it only goes through the first gallery link.
Thanks
-
June 9, 2024 at 9:03 am #10649
Hello,
Please send me temporary admin login credentials for your WordPress install and I will try to help set up the plugin on your site. Please send them to my email address: kisded@yahoo.com
Regards, Szabi – CodeRevolution.
-
The topic ‘Scrapping a gallery that contains multiple images on mutiple pages.’ is closed to new replies.