One of the bots had the scraper UA newspaper/0.2.8
not2easy
3:34 am on Aug 25, 2021 (gmt 0)
I've been blocking that range for about 4 years. On the off chance that it's a human who should be allowed in, I have a minimal method on the 403 document so they can let me know and I can poke a hole in the block. It has never happened. Yet.
jmccormac
3:39 am on Aug 25, 2021 (gmt 0)
Seen it quite regularly. It isn't aggressive and seems to be a kind of article extractor for blogs. It might be more aggressive with blogs and text-rich sites. If it is a text-rich site, it might be a good thing to check whether the text is being recycled.
There are also ads.txt checkers that run out of the Google Cloud. That said, I've also seen scrapers from the Google Cloud ranges and they generally get an IP level block.
Regards...jmcc
lucy24
3:41 am on Aug 25, 2021 (gmt 0)
One of the bots had the scraper UA newspaper/0.2.8
Ooh, I've just recently started seeing that UA in a few (blocked) robots. One or two of them came in from the exact IP of a long-blocked robot, which is how I happened to notice them at all.
Jonesy
4:52 pm on Aug 28, 2021 (gmt 0)
I've also blocked that /12 on my VPS. Saw a lot of Bad Actors from there...
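For anyone who wants to do the same check at the application level rather than in the firewall, here is a rough Python sketch of the two tests discussed in this thread: is the address inside a blocked CIDR range, and does the UA start with newspaper/. The thread never names the actual range, so 172.16.0.0/12 (a private block) is only a stand-in, and the function name should_block is made up for illustration.

```python
import ipaddress

# Placeholder only: the real /12 is not given in this thread.
# 172.16.0.0/12 is a private range used purely as a stand-in.
BLOCKED_NETWORKS = [ipaddress.ip_network("172.16.0.0/12")]

# UA prefix taken from the thread (newspaper/0.2.8); matching on the
# prefix also catches other version numbers.
BLOCKED_UA_PREFIXES = ("newspaper/",)


def should_block(remote_addr: str, user_agent: str) -> bool:
    """Return True if the request should get a 403 (hypothetical helper)."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        addr = None  # unparseable address: fall through to the UA check

    # IP-level block: is the address inside any blocked range?
    if addr is not None and any(addr in net for net in BLOCKED_NETWORKS):
        return True

    # UA-level block: case-insensitive prefix match.
    return user_agent.lower().startswith(BLOCKED_UA_PREFIXES)


if __name__ == "__main__":
    print(should_block("172.20.1.5", "Mozilla/5.0"))
    print(should_block("8.8.8.8", "newspaper/0.2.8"))
    print(should_block("8.8.8.8", "Mozilla/5.0"))
```

A range block at the firewall or server config is cheaper than this, since the request never reaches the application, but the same logic can be handy for testing a rule against log entries before deploying it.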