This topic has 3 replies, 2 voices, and was last updated 1 year, 3 months ago by Szabi – CodeRevolution.
August 21, 2023 at 1:50 pm #8456
Hey Szabi,
I have two questions about using Crawlomatic.
I set up the import and it works fine, no problem there. But there is one thing I can't set up:
The import runs, let's say, every 12 hours. If there is a new post, it crawls the newest post and everything is fine. But if there is no new post, it starts grabbing older posts from the feed. Let me give an example:
It crawls a “news” or “post” item from 21 August 2023. On the next crawl there is a new post from 22 August 2023, so it grabs and imports it. Then no new items arrive for 10 days, and it starts grabbing the items from 20 August, 19 August, 18 August and so on, going backwards, and imports them as “new posts”.
Is there a way to set it up like this: on the first run, import the last X posts, then only newer ones and never the older ones?
My second question: when I import an image, it automatically imports the “ALT” text, which is fine. But is there a way to set a description text that is used as the picture caption? It would be great if the ALT text could also be used as the caption, with the option to add my own part, like a copyright notice, to the caption.
Maybe I am stupid and the options are there but I didn't see them… so… I ask 🙂
Happy to hear from you!
Thanks for your awesome work. -
August 21, 2023 at 9:32 pm #8458
Hello,
Thank you for contacting me.
Sorry, but currently it is not possible to control how the plugin scrapes content in the way you described (import the last X posts the first time, then only newer ones and not the old ones).
However, for this to work, the post dates would need to be scraped correctly. Did you also set up the plugin to scrape post dates for this source? If yes, let me know and I will think about this.
Regarding the picture caption, this is also not possible, but I added it to my list of plugin update ideas for upcoming versions.
Regards, Szabi – CodeRevolution.
-
August 22, 2023 at 8:13 am #8463
Hey Szabi, thanks for your response!
Yeah, in most cases it is possible to scrape the date.
I was thinking about another approach, which works in another scraping tool (Octolooks Scrape): there is a setting that is described like this: “The field to set how many on existing post occurrence is needed to stop the task until next run time in order to save resources (Required).”
That works great. Let me explain it like this:
The tool checks how many posts are already imported. Let's say 10. If the same 10 posts are still there on the next cron run, the cron stops. If there is a new post (going from top to bottom), it gets imported. That also works without dates, because it checks by post title or URL: it counts the already imported articles, like “if I have recognized X posts, I stop working” or “oh, here's a new one, this has to be imported”.
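To make the idea a bit more concrete, here is a rough Python sketch of that stop rule. It is only an illustration of the logic as I understand it, not the actual code of Octolooks Scrape or Crawlomatic. The function name, the `stop_after_existing` parameter, the item format and the example.com URLs are all made up for this example, and it assumes the source lists posts newest first.

```python
def select_new_items(listing, imported_urls, stop_after_existing=3):
    """Walk a newest-first listing and stop after enough known posts.

    listing: list of dicts with a "url" key (hypothetical item format).
    imported_urls: set of URLs that were imported on earlier runs.
    stop_after_existing: how many already imported posts must be seen
        before the run stops (hypothetical name for the setting).
    """
    new_items = []
    existing_seen = 0
    for item in listing:
        if item["url"] in imported_urls:
            existing_seen += 1
            if existing_seen >= stop_after_existing:
                break  # enough known posts recognized, stop until next run
            continue
        new_items.append(item)
    return new_items


# Example: posts from 20, 21 and 22 August are already imported.
already_imported = {
    "https://example.com/2023-08-22",
    "https://example.com/2023-08-21",
    "https://example.com/2023-08-20",
}
listing = [
    {"url": "https://example.com/2023-08-23"},  # new
    {"url": "https://example.com/2023-08-22"},
    {"url": "https://example.com/2023-08-21"},
    {"url": "https://example.com/2023-08-20"},
    {"url": "https://example.com/2023-08-19"},  # older, never reached
]
print([item["url"] for item in select_new_items(listing, already_imported)])
```

In this example only the post from 23 August is returned. The post from 19 August is never looked at, because the run stops as soon as three known posts have been recognized.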
I hope I explained it understandably. I put a picture in the post for better understanding 🙂
The second thing, the image description (caption), would be great, because under German law the copyright notice has to be placed near the picture -> Featured Image -> below the image. If there is a way to set this up, WordPress will place it automatically (just as if I had written the description / caption text manually).
Thanks a lot for your work!
Have a nice day!
-
August 22, 2023 at 9:36 pm #8465
Ok, I see. I think that the ‘Total Maximum Crawl Results to Get’ settings field from the importing rule settings will help.
Please set it to a value like 10. This will make the plugin process only the first 10 scraped results.
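As a rough illustration of why such a cap helps, here is a small Python sketch. This is not the plugin's actual code: the function name, the `max_results` parameter and the example.com URLs are invented for the example, and it assumes that the source lists results newest first and that already imported URLs are skipped.

```python
def process_run(scraped_results, imported_urls, max_results=10):
    """Import unseen items, looking only at the first max_results entries.

    scraped_results: newest-first list of dicts with a "url" key (assumed format).
    imported_urls: set of URLs imported so far; it is updated in place.
    """
    imported_now = []
    for item in scraped_results[:max_results]:  # the cap: ignore everything after it
        if item["url"] not in imported_urls:
            imported_urls.add(item["url"])
            imported_now.append(item["url"])
    return imported_now


# A source with 30 posts, newest (post-30) first.
feed = [{"url": f"https://example.com/post-{n}"} for n in range(30, 0, -1)]
imported = set()

first_run = process_run(feed, imported)   # imports the 10 newest posts
second_run = process_run(feed, imported)  # nothing new, older posts stay untouched
feed.insert(0, {"url": "https://example.com/post-31"})
third_run = process_run(feed, imported)   # only the newly published post
print(len(first_run), second_run, third_run)
```

Under these assumptions, a run with no new posts imports nothing, because the older posts below the cap are never processed.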
Let me know if this helped.
Regards.
The topic ‘2 Questions’ is closed to new replies.