Strip HTML Elements by XPATH/CSS Selector does not work

This topic is: resolved



This topic has 1 reply, 2 voices, and was last updated 3 years, 1 month ago by Szabi – CodeRevolution.

    • #4453


      Hi, I’m new here and working my way into Crawlomatic. There are two problems I can’t solve.
      1) I want to crawl the content of a blog, but only ever the first five articles. When I crawl every 12 hours, the plugin takes the first five articles on the first run, the next five on the second run, and so on. How can I set it to always crawl only the first five posts?

      2) I crawl a press portal and want to remove some elements from the result. I tried both “Strip HTML Elements by Class:” and “Strip HTML Elements by XPATH/CSS Selector:”, but neither works. Below is the address of a sample page:

      Thanks a lot for your help. I am very pleased with the plugin.

      Greetings from meckerman

    • #4455

      Szabi – CodeRevolution
      Post count: 4756


      First of all, thank you for your purchase.

      1. To always scrape only the first 5 posts from a page, go to the importing rule settings in the plugin -> click ‘Settings’ for the rule you created -> enter the value 5 in the ‘Maximum Links To Crawl From Each URL’ settings field and save the settings -> run the import again.
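
For readers who want to see the idea behind such a cap, here is a minimal Python sketch. It is not Crawlomatic's code: the names `LinkCollector` and `first_links` and the sample markup are hypothetical, and it only assumes that the limit means "take the first N links found on the page, on every run".

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, stopping once `limit` is reached."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Ignore everything once the cap has been reached.
        if tag != "a" or len(self.links) >= self.limit:
            return
        href = dict(attrs).get("href")
        if href and href not in self.links:
            self.links.append(href)


def first_links(html, limit=5):
    """Return the first `limit` distinct links in document order (hypothetical helper)."""
    collector = LinkCollector(limit)
    collector.feed(html)
    return collector.links


# Hypothetical listing page with seven article links.
PAGE = "".join(f'<a href="/post-{i}">Post {i}</a>' for i in range(1, 8))

print(first_links(PAGE))
```

Because the helper keeps no state between calls, every run returns the same first five links instead of moving on to the next five.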

      2. To strip content from this page, I used the settings below:

      Content Query Type: Class

      Content Query String: card

      Strip HTML Elements by Class:
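
As a rough illustration of what "query by class, then strip by class" means, the following Python sketch selects elements with the class `card` and removes descendants with unwanted classes. It is only a sketch under assumptions, not the plugin's implementation: the function names, the sample markup, and the class `share-bar` are hypothetical (the class value actually used in the settings above is not shown in this thread), and the input must be well-formed markup because it relies on `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the space-separated tokens in the class attribute."""
    return name in element.get("class", "").split()


def select_and_strip(markup, keep_class, strip_classes):
    """Return elements carrying `keep_class`, with descendants in `strip_classes` removed."""
    root = ET.fromstring(markup)
    selected = []
    for container in root.iter():
        if not has_class(container, keep_class):
            continue
        # ElementTree has no parent pointers, so build a child -> parent map.
        parents = {child: parent for parent in container.iter() for child in parent}
        for element in list(container.iter()):
            if element is container:
                continue
            if any(has_class(element, cls) for cls in strip_classes):
                # Note: remove() also discards any text trailing the element.
                parents[element].remove(element)
        selected.append(container)
    return selected


# Hypothetical sample page; "share-bar" is an assumed class name.
SAMPLE = """<div>
  <div class="card"><h2>Title</h2><p>Body text</p><div class="share-bar">Share</div></div>
  <div class="sidebar">Ads</div>
</div>"""

cards = select_and_strip(SAMPLE, "card", ["share-bar"])
text = "".join(cards[0].itertext())
print(text)
```

The class check compares whole tokens, so `card` does not match an element whose class is `card-footer`. This is one common reason a strip rule appears to do nothing when the class name entered is only part of the real one.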


      I hope this info helps.

      Szabi – CodeRevolution.


The topic ‘Strip HTML Elements by XPATH/CSS Selector does not work’ is closed to new replies.