Blob

Forum Replies Created

Viewing 4 posts - 1 through 4 (of 4 total)
  • Author
    Posts
  • in reply to: Unable to scrape image gallery #11757


    Blob
    Participant
    Post count: 4

    Hello,

    You’ve taught me that attributes are considered an id, meaning that to search for them you have to start with the “#” character. That’s going to be a great help.

    I had eventually found a regex via ChatGPT that also works, although it still needed a bit of work:
    <a[^>]*data-fancybox=["'][^"']*product-gallery-9768587166039[^"']*["'][^>]*>\s*<picture[^>]*>\s*<img[^>]*src=["']([^"']+)["']

    Yours is much simpler and to the point 😉
    I’ve combined it with a regex that retrieves the large images instead of the thumbnails: \?[^?]*
    And boom, I’ve got everything I need 😉
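
    For anyone reading later, here is a minimal Python sketch of how the two patterns could fit together. The sample HTML and the example.com URLs are made up, and I'm assuming the second pattern is used as a find-and-replace that drops the query string carrying the thumbnail size. Check this against the real page before relying on it.

```python
import re

# Hypothetical markup, loosely modelled on the gallery discussed in this thread.
SAMPLE_HTML = """
<a href="#" data-fancybox="product-gallery-9768587166039">
  <picture>
    <img src="https://example.com/cdn/photo-1.jpg?width=200">
  </picture>
</a>
<a href="#" data-fancybox="product-gallery-9768587166039">
  <picture>
    <img src="https://example.com/cdn/photo-2.jpg?width=200">
  </picture>
</a>
"""

# The long pattern from the post: capture the src of each gallery image.
GALLERY_IMG = re.compile(
    r"""<a[^>]*data-fancybox=["'][^"']*product-gallery-9768587166039[^"']*["'][^>]*>"""
    r"""\s*<picture[^>]*>\s*<img[^>]*src=["']([^"']+)["']"""
)

# The short pattern from the post: matches "?" and everything after it.
QUERY_STRING = re.compile(r"\?[^?]*")


def gallery_image_urls(html):
    """Return gallery image URLs with the query string removed (assumed usage)."""
    thumbs = GALLERY_IMG.findall(html)
    return [QUERY_STRING.sub("", url) for url in thumbs]


print(gallery_image_urls(SAMPLE_HTML))
# ['https://example.com/cdn/photo-1.jpg', 'https://example.com/cdn/photo-2.jpg']
```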

    Thank you very much!

  • in reply to: Improving variant scrapping #11526


    Blob

    Hello Szabi,

    I would like to thank you and let you know that your modifications work perfectly for Shopify.
    I will test on WooCommerce soon and let you know.

    What’s really nice about your modifications is that the crawl time per page has dropped very significantly. The script is now much, much faster!
    It’s almost perfect for my needs, I’m impressed.

    My offer still stands for the attributes 😉

    Regards,
    Blob

  • in reply to: Improving variant scrapping #11516


    Blob

    Hello Szabi,

    -https://univers-skull.com/collections/bracelet-tete-de-mort/products/bracelet-tete-de-mort-femme

    With this URL, when I change the variant, I don’t see the image URL change in the HTML. The product URL changes to include the variant, but the product image doesn’t seem to change.
    It all depends on how you scrape Shopify, because I know it’s also possible to scrape the JSON.

    “And for my original request:
    Is it possible to obtain the variant type for each value?
    Example:
    Color:
    $variant

    Length:
    $variant”
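
    To make the request concrete, here is a rough Python sketch of grouping variant values under their type. It assumes the product data is available as JSON with an `options` list (each entry has a `name`) and `variants` carrying `option1`, `option2`, ... fields, which is how I understand Shopify's product JSON to be laid out. The sample product and the function name are made up for illustration, and this is not how the plugin works internally.

```python
# Hypothetical product data, shaped like the JSON described above.
SAMPLE_PRODUCT = {
    "options": [{"name": "Color"}, {"name": "Length"}],
    "variants": [
        {"option1": "Silver", "option2": "18 cm", "option3": None},
        {"option1": "Silver", "option2": "20 cm", "option3": None},
        {"option1": "Gold", "option2": "18 cm", "option3": None},
    ],
}


def values_by_option(product):
    """Map each option name (variant type) to its distinct values, in order."""
    names = [option["name"] for option in product["options"]]
    grouped = {name: [] for name in names}
    for variant in product["variants"]:
        # option1 belongs to the first option name, option2 to the second, etc.
        for position, name in enumerate(names, start=1):
            value = variant.get(f"option{position}")
            if value is not None and value not in grouped[name]:
                grouped[name].append(value)
    return grouped


for name, values in values_by_option(SAMPLE_PRODUCT).items():
    print(f"{name}: {', '.join(values)}")
# Color: Silver, Gold
# Length: 18 cm, 20 cm
```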

    If you look into it, let me know. I’m sure a lot of people would be interested, and I’m willing to fund part of the development, since it’s a substantial time-saver with an immediate return on investment. I’d prefer it to benefit the whole community, though, as I’ve already played an active part in developing open-source scripts.

    Thank you
    Regards,
    Blob

  • in reply to: Improving variant scrapping #11514


    Blob
    Participant
    Post count: 4

    Hello Szabi,

    I understand, and I had seen the video about it, but since other scrapers do it, I wanted to know if it was possible.

    I’m not creating an additional topic, but if you’d like I can, as it’s a topic in its own right:

    I’m scraping a Shopify store.
    1. I’ve noticed a lot of image files being uploaded. But the problem lies in the variants. There are many duplicates for each variant.

    1.1 There seems to be 1 image per variant. This seems logical, since a variant can be different, but in reality this is only true in certain cases.

    So if I have 15 variants with the same image, for example for a T-shirt that comes in 15 different sizes, I have 15 images. 1 useful, 14 useless.

    Isn’t it possible to detect whether the image used is identical (same URL)?
    If it is, the image shouldn’t be downloaded again. This seems possible, as other scrapers do it.
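
    What I have in mind is roughly the following Python sketch. It only compares URLs as plain strings, the variant names and example.com URLs are invented, and `unique_image_urls` is a hypothetical helper, not something that exists in the plugin.

```python
def unique_image_urls(variant_images):
    """Return each image URL once, in first-seen order.

    `variant_images` is a list of (variant_name, image_url) pairs.
    """
    seen = set()
    unique = []
    for _variant, url in variant_images:
        if url in seen:
            continue  # same URL already queued, so skip the duplicate download
        seen.add(url)
        unique.append(url)
    return unique


# Hypothetical T-shirt: three sizes share one image, one colour has its own.
VARIANT_IMAGES = [
    ("S", "https://example.com/tshirt.jpg"),
    ("M", "https://example.com/tshirt.jpg"),
    ("L", "https://example.com/tshirt.jpg"),
    ("Red / S", "https://example.com/tshirt-red.jpg"),
]

print(unique_image_urls(VARIANT_IMAGES))
# ['https://example.com/tshirt.jpg', 'https://example.com/tshirt-red.jpg']
```

    Each variant can still point at its image by URL, so nothing is lost by downloading the shared file only once.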

    I wish you all a Merry Christmas. Enjoy the time with your families.

    Thank you very much.
    Blob
