Forum Replies Created
-
AuthorPosts
-
February 7, 2023 at 11:41 pm in reply to: Guest Author #6805
I know this is a stretch, but is there an option to add a “reverse restriction” under “Crawling restriction:” and “URL Patterns to Not Crawl and Import:”?
That way, only URLs matching a specific pattern would be imported. In my case I would then only need to enter the one URL pattern that contains the category name, instead of every other category URL pattern that I need to exclude.
Just for example:
A lot of sites have URL patterns like this: http://www.example.com/category1/title of the article
So with the reverse restriction rule I would only need to enter the http://www.example.com/category1/ URL pattern, and all other categories would be excluded.
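To illustrate the behavior I am asking for, here is a rough sketch in Python. It is not the plugin's actual code; the function name `should_import` and both parameter names are made up for this example, and I am assuming a pattern is simply matched as a URL prefix.

```python
def should_import(url, include_patterns, exclude_patterns=()):
    """Decide whether a crawled URL should be imported.

    Hypothetical helper: exclude patterns behave like the existing
    "URL Patterns to Not Crawl and Import" list, include patterns
    are the requested "reverse restriction".
    """
    # An exclude match always wins.
    if any(url.startswith(p) for p in exclude_patterns):
        return False
    # With no include patterns, nothing is restricted.
    if not include_patterns:
        return True
    # Otherwise import only URLs that start with an include pattern.
    return any(url.startswith(p) for p in include_patterns)


only_category1 = ["http://www.example.com/category1/"]
print(should_import("http://www.example.com/category1/some-article", only_category1))
print(should_import("http://www.example.com/category2/other-article", only_category1))
```

With a single include pattern, every other category is skipped without having to list it.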
-
February 7, 2023 at 2:56 pm in reply to: Guest Author #6803
Yes, it works great.
Can you also tell me how to sort the scraping of websites by category? I tried a few options in the settings but nothing is working as I would like.
My problem is this:
Let us take theverge.com as an example. If I choose TECH (www.theverge.com/tech) as the category, I get posts that belong to the TECH category, but also a few that belong to other categories like GOOGLE, TRANSPORTATION, REVIEWS, HOW-TO, etc. I would like to scrape only the posts that belong to the TECH category. I watched a few of your videos on this topic but still can’t make it work. I know it is probably stupidly simple, but with so many options I am not sure what should and should not be enabled under category customizations.
-
February 6, 2023 at 8:59 pm in reply to: Guest Author #6801
Wow. Amazing support. Things like this make you feel the money was well spent on a plugin. Just a quick question: could there also be an option for the domain without the “www” prefix? Like “theverge.com” instead of “www.theverge.com”. I am not trying to be picky after your quick implementation, but I had to ask.
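What I have in mind is roughly the following, sketched in Python. This is only an illustration of the idea, not how the plugin does it; `bare_domain` is a name I made up.

```python
from urllib.parse import urlsplit


def bare_domain(url):
    """Return the host of a URL with a leading "www." removed.

    Hypothetical helper for illustration only.
    """
    # urlsplit only recognizes the host when the input contains "//",
    # so add it for inputs like "www.theverge.com".
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    # Drop the "www." prefix if it is present.
    return host[4:] if host.startswith("www.") else host


print(bare_domain("https://www.theverge.com/tech"))
print(bare_domain("theverge.com"))
```

Both calls give the same domain, so the displayed name would not depend on how the source URL was entered.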
-