Scraping a gallery that contains multiple images on multiple pages.

This topic is: resolved

 


Viewing 4 reply threads
  • Author
    Posts
    • #10637


      Bekk1n
      Participant
      Post count: 2

      Hello,

      I’m trying to scrape content from https://fapello.com/ (porn)

      If I run the scraper on a page with content, for example https://fapello.com/ayelenn/, it copies the small gallery thumbnails from that page into the post. What I’d like instead is for the scraper to copy each large image, the one you see after clicking a gallery thumbnail, and add all of those images to the WordPress gallery for that post.

      Basically I’m trying to achieve the same “Gallery” result as the website I’m copying from.

      Each link in the gallery has a number after it. For example: https://fapello.com/ayelenn/1/ https://fapello.com/ayelenn/2 https://fapello.com/ayelenn/3

      Would I need to use regex to accomplish this?

      Also, how do I add the images to the WordPress gallery instead of embedding them in the post?

      Can you please provide some guidance on how to achieve this?

      I can provide more details if needed.

      Thanks.
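
On the regex question above, a minimal sketch of how the numbered item links could be picked out of a list of hrefs. It is plain Python, not part of any plugin; the `example.com` URLs and the `numbered_item_links` helper are hypothetical and only mirror the `<gallery>/<number>` pattern described in the post.

```python
import re

# Hypothetical gallery URL; only the "<gallery>/<number>" shape comes from the post.
GALLERY_URL = "https://example.com/some-gallery/"


def numbered_item_links(gallery_url, hrefs):
    """Return the hrefs shaped like <gallery_url><number>[/], sorted by number."""
    pattern = re.compile(re.escape(gallery_url.rstrip("/")) + r"/(\d+)/?$")
    found = []
    for href in hrefs:
        match = pattern.match(href)
        if match:
            found.append((int(match.group(1)), href))
    return [href for _, href in sorted(found)]


# Invented sample links: two numbered items, one unrelated link, the gallery itself.
hrefs = [
    "https://example.com/some-gallery/2",
    "https://example.com/some-gallery/1/",
    "https://example.com/other/3/",
    "https://example.com/some-gallery/",
]
print(numbered_item_links(GALLERY_URL, hrefs))
```

The trailing slash is optional in the pattern because the example links in the post appear both with and without it.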

       

    • #10640


      Szabi – CodeRevolution
      Keymaster
      Post count: 4573

      Hello,

      This site can be scraped using the Crawlomatic plugin, please check it here: https://1.envato.market/crawlomatic

      Tutorial video on usage: https://www.youtube.com/watch?v=F6vhRJgCR_M&list=PLEiGTaa0iBIgcqNzVBaoTCS4ws47vNMuQ&index=2

      Regards, Szabi – CodeRevolution.

    • #10646


      Bekk1n
      Participant
      Post count: 2

      Hello,

      Thanks for the reply. I have the plugin installed and running. I followed the video and used the visual selector to select an image on the seed page. It found the class: //*[@class='w-full h-full absolute object-cover inset-0']

      When I run the scraper, it doesn’t work and comes back with the error: All crawled posts are already posted or no content found for your query. Rule ID: 0: visual — //*[@class='w-full h-full absolute object-cover inset-0']

      The post doesn’t exist but the class for each gallery image does exist.

      I changed the Seed Page Crawling Query Type to class and entered: w-full h-full absolute object-cover inset-0

      and it still doesn’t work.

      I also tried the class: max-w-full lg:h-64 h-40 rounded-md relative overflow-hidden uk-transition-toggle

      but it still didn’t work.

      Testing on this page: https://fapello.com/jadeyanh-20/

      Not sure how to proceed.

      Thanks
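
One way to sanity-check a class selector outside the plugin is to run it against a saved copy of the page. The sketch below uses only Python's standard library and says nothing about how Crawlomatic evaluates its queries. The `SAMPLE_HTML` snippet is invented for illustration and the `ClassMatcher` helper is hypothetical. It shows that an exact `@class='...'` comparison matches only when the whole attribute string is identical, while a token-based check also matches elements that carry extra classes or list them in a different order.

```python
from html.parser import HTMLParser


class ClassMatcher(HTMLParser):
    """Record tags whose class attribute equals the target, or contains all its tokens."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.exact = []   # class attribute is exactly the target string
        self.subset = []  # every target token is present, in any order

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("class") or ""
        if value == self.target:
            self.exact.append(tag)
        if set(self.target.split()) <= set(value.split()):
            self.subset.append(tag)


# Invented markup; the class strings are the ones quoted in the post.
SAMPLE_HTML = """
<div class="max-w-full lg:h-64 h-40 rounded-md relative overflow-hidden uk-transition-toggle">
  <img class="w-full h-full absolute object-cover inset-0" src="thumb-1.jpg">
  <img class="inset-0 w-full h-full absolute object-cover lazy" src="thumb-2.jpg">
</div>
"""

matcher = ClassMatcher("w-full h-full absolute object-cover inset-0")
matcher.feed(SAMPLE_HTML)
print(len(matcher.exact), len(matcher.subset))
```

If both counts are zero on the HTML the server actually returns, the class is probably not present in the raw markup, though that alone does not explain the plugin error.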

    • #10647


      Bekk1n
      Participant
      Post count: 2

      Hello,

      I seem to have made it partially work.

      I can scrape the first image in the gallery by using:
      Seed Page Crawling Query Type: ID
      Seed Page Crawling Query String: content

      And the:
      Content Query Type (Optional): Class
      Content Query String (Optional): uk-align-center

      But how do I make the scraper visit every gallery link and scrape each image?

      Right now it only follows the first gallery link.

      Thanks
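
The two-level crawl being asked for (seed page, then every item link, then the images on each item page) can be sketched in plain Python as below. This illustrates the logic only and is not how the plugin is configured. The `PAGES` dictionary is a hypothetical in-memory stand-in for real HTTP responses, and `collect` and `crawl_gallery` are made-up helper names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AttrCollector(HTMLParser):
    """Collect one attribute (e.g. href or src) from every occurrence of a tag."""

    def __init__(self, wanted_tag, wanted_attr):
        super().__init__()
        self.wanted_tag = wanted_tag
        self.wanted_attr = wanted_attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.wanted_attr)
        if tag == self.wanted_tag and value:
            self.values.append(value)


def collect(html, tag, attr):
    parser = AttrCollector(tag, attr)
    parser.feed(html)
    return parser.values


def crawl_gallery(seed_url, fetch):
    """Follow every link on the seed page and gather the image URLs found on each."""
    images = []
    for href in collect(fetch(seed_url), "a", "href"):
        item_url = urljoin(seed_url, href)
        for src in collect(fetch(item_url), "img", "src"):
            images.append(urljoin(item_url, src))
    return images


# Hypothetical pages: a seed page linking to two item pages, one large image each.
PAGES = {
    "https://example.com/gallery/": (
        '<a href="1/"><img src="t1.jpg"></a><a href="2/"><img src="t2.jpg"></a>'
    ),
    "https://example.com/gallery/1/": '<img src="/full/1.jpg">',
    "https://example.com/gallery/2/": '<img src="/full/2.jpg">',
}

print(crawl_gallery("https://example.com/gallery/", PAGES.__getitem__))
```

Passing `fetch` in as a parameter keeps the sketch runnable without network access; a real fetcher would replace `PAGES.__getitem__`.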

    • #10649


      Szabi – CodeRevolution
      Keymaster
      Post count: 4573

      Hello,

      Please send me temporary admin login credentials to your WordPress install and I will try to help set up the plugin on your site. Send it, please, to my email address: kisded@yahoo.com

      Regards, Szabi – CodeRevolution.


The topic ‘Scraping a gallery that contains multiple images on multiple pages.’ is closed to new replies.