This topic has 18 replies, 2 voices, and was last updated 2 years, 1 month ago by Szabi – CodeRevolution.
-
October 1, 2022 at 6:48 pm #5992
Hello
I’m trying to scrape a Shopify website for products, and I want the scraped product content to be added to the product’s short description, not the long description. Currently, by default, it is added to the long description.
Can you please help with that?
I also tried to scrape the image gallery, but I don’t know which selector to use. I tried Class, CSS Selector, and XPath, and none of them worked for the product gallery. The plugin auto-detects the product’s featured image, but how do I get the gallery?
For example this is the product – https://labelfoot.fr/products/ajax-amesterdam-coupe-vent-rouge-20-21
-
October 1, 2022 at 8:07 pm #6002
Hello,
First of all, thank you for your purchase.
1. I checked the product you linked but I don’t see the short description you want to scrape in the page, can you highlight it, please, in a screenshot of this page: https://labelfoot.fr/products/ajax-amesterdam-coupe-vent-rouge-20-21#
2. Regarding the gallery, please use:
Product Gallery Query Type
Regex – All Matches
Product Gallery Query String:
data-ratio=”1.0″ data-rationav=”” data-sizes=”auto” data-src=”([^”]*?)”
Regards, Szabi – CodeRevolution.
-
October 2, 2022 at 8:35 am #6015
Hello Szabi,
Thank you very much for the quick reply!
I will try the product gallery images regex expression.
Here is the link to the product which has a description (see attached screenshot for description to scrape) – https://labelfoot.fr/products/ajax-amsterdam-maillot-domicile-2022-23-enfant
I’m trying to scrape the image of the size guide in the description. The plugin is able to scrape the description (the size guide images), but it currently adds the size guide image to the “Long Description” in WooCommerce instead of the short description. I want it to go directly to the short description.
-
October 2, 2022 at 8:51 am #6018
This reply has been marked as private. -
October 2, 2022 at 7:25 pm #6035
Please note that when you copied and pasted the Regex expression, the quotes were automatically modified, which made the expression incorrect. I think my support system changed the quote types automatically, sorry for this. I have fixed it on your site now. Please check this screenshot, where you will see the 2 types of quotes (the first one is correct): https://ibb.co/4fyYN97
<pre>
data-ratio="1.0" data-rationav="" data-sizes="auto" data-src="([^"]*?)"
</pre>
I have now fixed the settings in the plugin, so the gallery will work.
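For readers who want to check the quote problem outside the plugin, here is a minimal Python sketch. It assumes the expression is applied to the raw page HTML and that every match of the capture group is collected, which is how "Regex – All Matches" reads; the plugin's actual implementation is not shown in this thread. The sample markup, the example.com URLs, and the names `SAMPLE_HTML`, `STRAIGHT`, `CURLY`, and `gallery_urls` are made up for illustration.

```python
import re

# Hypothetical markup modeled on the attributes named in the expression;
# the real page's HTML may differ.
SAMPLE_HTML = (
    '<img data-ratio="1.0" data-rationav="" data-sizes="auto" '
    'data-src="https://example.com/gallery-1.jpg">'
    '<img data-ratio="1.0" data-rationav="" data-sizes="auto" '
    'data-src="https://example.com/gallery-2.jpg">'
)

# Expression with straight quotes, as they appear in HTML source.
STRAIGHT = r'data-ratio="1.0" data-rationav="" data-sizes="auto" data-src="([^"]*?)"'

# Same expression after typographic quotes were substituted for the straight ones.
CURLY = 'data-ratio=”1.0″ data-rationav=”” data-sizes=”auto” data-src=”([^”]*?)”'


def gallery_urls(pattern: str, html: str) -> list[str]:
    """Return every value captured by the pattern's single group."""
    return re.findall(pattern, html)


print(gallery_urls(STRAIGHT, SAMPLE_HTML))  # two URLs
print(gallery_urls(CURLY, SAMPLE_HTML))     # empty list: curly quotes never occur in the HTML
```

The typographic-quote version finds nothing because HTML attributes are delimited by straight quotes, so it is worth retyping the quotes by hand after pasting an expression from a web page.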
Regarding adding the image to the short description, I changed the following in the rule settings (the image will now also be added to the short description):
Excerpt Query Type:
Class
Excerpt Query String:
sp-tab-content
Let me know if this helped.
Regards.
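As a rough illustration of what a class-based query such as sp-tab-content selects, the Python sketch below collects the image URLs found inside any element carrying that class. It uses only the standard library; the markup, the example.com URLs, and the names `ClassImageCollector` and `SAMPLE_HTML` are hypothetical, and the plugin's own selector logic may work differently.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassImageCollector(HTMLParser):
    """Collect <img> src values nested inside elements with a given class."""

    def __init__(self, class_name: str) -> None:
        super().__init__()
        self.class_name = class_name
        self.depth = 0          # > 0 while inside a matching element
        self.images: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self.depth == 0:
            classes = (attr_map.get("class") or "").split()
            if self.class_name in classes and tag not in VOID_TAGS:
                self.depth = 1  # entered the matching element
            return
        if tag == "img" and attr_map.get("src"):
            self.images.append(attr_map["src"])
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


# Hypothetical product page fragment.
SAMPLE_HTML = """
<div class="product">
  <img src="https://example.com/featured.jpg">
  <div class="sp-tab-content tab">
    <p>Size guide</p>
    <img src="https://example.com/size-guide.jpg">
  </div>
  <img src="https://example.com/other.jpg">
</div>
"""

collector = ClassImageCollector("sp-tab-content")
collector.feed(SAMPLE_HTML)
print(collector.images)
```

Only the image inside the sp-tab-content element is collected; the images outside it are ignored, which is the behavior wanted when the size guide should be the only content sent to the short description.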
-
October 3, 2022 at 3:06 pm #6060
Thank you so much! Everything is working well now!
-
October 3, 2022 at 3:30 pm #6061
I am glad to help!
-
October 30, 2022 at 7:22 pm #6201
This reply has been marked as private. -
October 30, 2022 at 9:26 pm #6202
Hello,
I checked, and this site seems to have an active scraping protection mechanism, which prevents scraping it with the conventional method.
I managed to scrape this site only using Puppeteer. It is a headless browser which needs to be installed on your server, please check this tutorial video for installation: https://www.youtube.com/watch?v=KNOIJA4pTQo
After installation, you can set in rule settings:
Content Scraping Method To Use:
Puppeteer
If installation is not possible on your server, you can also use HeadlessBrowserAPI, a cloud-based service that renders the page using cloud Puppeteer instances. Please check the details here: https://www.youtube.com/watch?v=205EinBQAoo
If you use this method, you need to set in rule settings:
Content Scraping Method To Use:
Puppeteer (HeadlessBrowserAPI)
Regards.
-
October 31, 2022 at 6:07 pm #6203
This reply has been marked as private. -
October 31, 2022 at 6:15 pm #6206
This reply has been marked as private. -
November 1, 2022 at 6:53 pm #6209
This reply has been marked as private. -
November 1, 2022 at 8:46 pm #6211
This reply has been marked as private. -
November 2, 2022 at 4:19 am #6215
Thank you for fixing it.
There is an issue I’m facing. When I scrape more than one product at a time, the category doesn’t get scraped for the 2nd product, and when I scrape more than 3-4 products, the variations don’t get scraped. Is there a limitation?
I’m also interested in a custom update. How much would it cost?
-
November 2, 2022 at 7:59 am #6217
Hello,
I disabled the automatic proxy usage of HeadlessBrowserAPI by setting the following in the ‘Main Settings’ of the plugin:
Web Proxy Address List:
disabled
I did this because I saw in the plugin logs that some IP addresses from the proxy range were not able to scrape the website properly. Please check if this helped.
Regarding custom updates, please contact me at my email, kisded@yahoo.com, and describe your ideas. I am currently fully booked with custom work, but I might be able to help, depending on the complexity of your requirements.
Regards.
-
November 10, 2022 at 4:34 am #6248
This reply has been marked as private. -
November 10, 2022 at 10:43 am #6252
This reply has been marked as private. -
November 17, 2022 at 2:04 pm #6284
This reply has been marked as private. -
November 17, 2022 at 2:57 pm #6287
The issue is fixed again after some talks with the hosting provider’s support.
-
The topic ‘Product Short Description’ is closed to new replies.