Strip HTML Elements by XPATH/CSS Selector does not work

This topic is: resolved

 


This topic has 1 reply, 2 voices, and was last updated 2 years, 10 months ago by Szabi – CodeRevolution.

Viewing 1 reply thread
    • #4453


      meckerman
      Participant

      Hi, I’m new here and working my way into Crawlomatic. There are two problems I can’t solve.
      1) I want to crawl the content of a blog, but only ever the first five articles. When I crawl every 12 hours, the plugin crawls the first five articles on the first run, the next five on the second run, and so on. How can I set it to always crawl only the first five posts?

      2) I crawl a press portal and want to remove some elements from the result. I tried both “Strip HTML Elements by Class:” and “Strip HTML Elements by XPATH/CSS Selector:”, but unfortunately neither works. Here is the address of a sample page: https://www.presseportal.de/blaulicht/pm/24843/5115998

      Thanks a lot for your help. I am very pleased with the plugin.

      Greetings from meckerman

    • #4455


      Szabi – CodeRevolution
      Keymaster
      Post count: 4577

      Hello,

      First of all, thank you for your purchase.

      1. To always scrape only the first 5 posts from a page, go to the importing rule settings in the plugin -> click ‘Settings’ for the rule you created -> enter 5 in the ‘Maximum Links To Crawl From Each URL’ settings field and save the settings -> run the import again.

      2. To strip content from this page, I used the settings below:

      Content Query Type: Class

      Content Query String: card

      Strip HTML Elements by Class:
      date,customer,story-sharing,contact-headline,contact-text,originator
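
For readers who want to see what these settings amount to, here is a minimal Python sketch of the same idea: pick the element with class `card`, then drop every descendant carrying one of the listed classes. This is not Crawlomatic's actual code. The helper names (`select_by_class`, `strip_by_class`) and the sample markup are made up for illustration, and the real page is ordinary HTML that would need a proper HTML parser rather than the XML parser used here.

```python
import xml.etree.ElementTree as ET

# Class names taken from the "Strip HTML Elements by Class" setting above.
STRIP_CLASSES = {
    "date", "customer", "story-sharing",
    "contact-headline", "contact-text", "originator",
}

# Invented, well-formed sample markup (an assumption, not the real page).
SAMPLE = (
    "<body>"
    '<nav class="menu">Menu</nav>'
    '<article class="card">'
    '<p class="date">01.01.2022</p>'
    "<h1>Headline</h1>"
    "<p>Body text of the press release.</p>"
    '<div class="story-sharing"><a href="#">Share</a></div>'
    '<p class="contact-headline">Contact:</p>'
    '<p class="contact-text">Press office</p>'
    "</article>"
    "</body>"
)


def select_by_class(root, class_name):
    """Return the first element whose class attribute contains class_name."""
    for element in root.iter():
        if class_name in element.get("class", "").split():
            return element
    return None


def strip_by_class(root, strip_classes):
    """Remove every descendant of root that carries one of strip_classes.

    Note: ElementTree drops an element's tail text along with the element.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            if set(child.get("class", "").split()) & strip_classes:
                parent.remove(child)
    return root


card = select_by_class(ET.fromstring(SAMPLE), "card")
cleaned = strip_by_class(card, STRIP_CLASSES)
text = " ".join(" ".join(cleaned.itertext()).split())
print(text)
```

Only the headline and the body paragraph remain in `text`; the date, sharing block, and contact lines are gone. Matching on individual class names, as the sketch does, means an element with several classes is removed as soon as one of them is on the list.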

       

      I hope this info helps.

      Regards,
      Szabi – CodeRevolution.


The topic ‘Strip HTML Elements by XPATH/CSS Selector does not work’ is closed to new replies.