Improving variant scrapping

This topic is: resolved

 

This topic has 7 replies, 2 voices, and was last updated 1 month ago by Szabi – CodeRevolution.

Viewing 7 reply threads
    • #11511


      Blob
      Participant
      Post count: 3

      Hello,

      Thank you for this scraper, which works perfectly for my needs.
      However, I have a question about variants.
      The variant data is returned in this format, with only the value of each variant:
      $variant1 / $variant2 / $variant3
      This is obviously an example.

      Is it possible to obtain the variant type for each value?
      Example:
      Color:
      $variant

      Length :
      $variant

      I’ve attached an image showing that this information is available for Shopify, for example. Is it possible to obtain it so that everything is presented correctly without reworking the data? That would be a very significant time-saver.

      Thank you very much.
      Blob

    • #11513


      Szabi – CodeRevolution
      Keymaster
      Post count: 4716

      Hello,

      First of all, thank you for your purchase.

      Unfortunately, scraping separate variants is not possible in this case, because Shopify’s site structure represents the data as ‘$variant1 / $variant2 / $variant3’, not as ‘Color: $variant Length : $variant’. Because of this, the plugin imports the variants as a single entity.

      I have noted this suggestion and will think about it. However, I already ran into this issue when I created the plugin, and back then I concluded that scraping separate variant data would make the implementation of the plugin far more complicated. But I will check this again and give it a second thought.

      Regards,
      Szabi – CodeRevolution.

    • #11514


      Blob
      Participant
      Post count: 3

      Hello Szabi,

      I understand, and I had seen the video about it, but since other scrapers do this, I wanted to know if it was possible.

      I’m not creating an additional topic, but if you’d like I can, as it’s a topic in its own right:

      I’m scraping a Shopify store.
      1. I’ve noticed a lot of image files being uploaded. But the problem lies in the variants. There are many duplicates for each variant.

      1.1 There seems to be 1 image per variant. This seems logical, since a variant can be different, but in reality this is only true in certain cases.

      So if I have 15 variants with the same image, for example for a T-shirt that comes in 15 different sizes, I have 15 images. 1 useful, 14 useless.

      Isn’t it possible to detect if the image used is identical (same URL)?
      If so, the image should not be downloaded again. This seems possible, as other scrapers do it.
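
The idea above (skip the download when the URL has already been seen) can be sketched roughly as follows. This is only an illustration, not the plugin's code: `import_variant_images` and `fake_download` are hypothetical names, the example URL is invented, and the sketch assumes that two variants share an image exactly when their image URLs are identical.

```python
from typing import Callable, Dict, Iterable, Tuple


def import_variant_images(
    variant_images: Iterable[Tuple[str, str]],
    download: Callable[[str], int],
) -> Dict[str, int]:
    """Map each variant name to a media ID, downloading each distinct URL once.

    variant_images: (variant_name, image_url) pairs.
    download: hypothetical callable that fetches a URL and returns a media ID.
    """
    media_id_by_url: Dict[str, int] = {}
    media_id_by_variant: Dict[str, int] = {}
    for variant_name, image_url in variant_images:
        if image_url not in media_id_by_url:
            # First time this URL is seen: download it and remember the ID.
            media_id_by_url[image_url] = download(image_url)
        # Later variants with the same URL reuse the stored ID.
        media_id_by_variant[variant_name] = media_id_by_url[image_url]
    return media_id_by_variant


# Stand-in downloader (assumption): records each call, returns a fake ID.
downloaded = []


def fake_download(url: str) -> int:
    downloaded.append(url)
    return len(downloaded)


# 15 sizes of the same T-shirt, all pointing at one (invented) image URL.
pairs = [(f"Size {n}", "https://example.com/tshirt.jpg") for n in range(1, 16)]
media_ids = import_variant_images(pairs, fake_download)
print(len(media_ids), "variants,", len(downloaded), "download")
```

With the 15-size T-shirt case, all 15 variants end up pointing at a single stored image instead of 15 copies.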

      I wish you all a Merry Christmas. Enjoy the time with your families.

      Thank you very much.
      Blob

    • #11515


      Szabi – CodeRevolution
      Keymaster
      Post count: 4716

      Hello,

      Yes, this is a good idea; it can save some hosting space when scraping.

      Can you please send me a product URL that has variations whose images all link to the same image URL (rather than separate duplicate files), so I can test this?

      Regards.

    • #11516


      Blob
      Participant
      Post count: 3

      Hello Szabi,

      https://univers-skull.com/collections/bracelet-tete-de-mort/products/bracelet-tete-de-mort-femme

      With this URL, when I change the variant, I don’t see the image URL change in the HTML. The product URL changes to include the variant, but the product image doesn’t seem to change.
      It all depends on how you scrape Shopify, because I know it’s also possible to scrape the JSON.

      “And for my original request:
      Is it possible to obtain the variant type for each value?
      Example:
      Color:
      $variant

      Length :
      $variant”
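
As a rough illustration of how the JSON route could yield the variant type for each value: the sketch below assumes the product JSON lists option names under `options` (with a `position`) and stores each variant's values under `option1` to `option3`. The function name `label_variant_options` and the sample data are invented for the example, and this does not describe how the plugin works.

```python
from typing import Any, Dict, List


def label_variant_options(product: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return one {option name: value} dict per variant."""
    # Sort by position so the first name lines up with option1, and so on.
    options = sorted(product["options"], key=lambda o: o["position"])
    names = [o["name"] for o in options]
    labeled = []
    for variant in product["variants"]:
        values = [variant.get(f"option{i}") for i in range(1, len(names) + 1)]
        # Skip options that a variant leaves empty.
        labeled.append({n: v for n, v in zip(names, values) if v is not None})
    return labeled


# Invented sample data, shaped like the assumed product JSON.
sample_product = {
    "options": [
        {"name": "Color", "position": 1},
        {"name": "Length", "position": 2},
    ],
    "variants": [
        {"title": "Silver / 18 cm", "option1": "Silver", "option2": "18 cm", "option3": None},
        {"title": "Black / 20 cm", "option1": "Black", "option2": "20 cm", "option3": None},
    ],
}

for variant_options in label_variant_options(sample_product):
    for name, value in variant_options.items():
        print(f"{name}: {value}")
```

Pairing names with values this way avoids having to guess the type from the combined ‘$variant1 / $variant2 / $variant3’ string.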

      If you look into it, let me know. I’m sure a lot of people would be interested, and I’m willing to finance part of the development, as it’s a very substantial time-saver with an immediate return on investment. I’d prefer it to benefit the whole community, as I’ve already played an active part in developing open source scripts.

      Thank you
      Regards,
      Blob

    • #11517


      Szabi – CodeRevolution
      Keymaster
      Post count: 4716

      Hello,

      I updated the plugin to 2.6.6; it will now reuse existing media library IDs when the same image is found. Please check.

      Regards.

    • #11526


      Blob
      Participant
      Post count: 3

      Hello Szabi,

      I would like to thank you and let you know that your modifications work perfectly for Shopify.
      I will test on WooCommerce soon and let you know.

      What’s really nice about your modifications is that the crawl time for a page has been reduced very significantly. The script is now much, much faster!
      It’s almost perfect for my needs; I’m impressed.

      My offer still stands for the attributes 😉

      Regards,
      Blob

    • #11527


      Szabi – CodeRevolution
      Keymaster
      Post count: 4716

      I am glad to hear that it works.

      I will also think about and investigate importing the variants.

      Regards.

The topic ‘Improving variant scrapping’ is closed to new replies.