but still caught by my script
I wonder if scrapers are using the July / August
Option B is that they've got a long list of sites that they work through, top to bottom, and today it is your turn.
When my script suspects a non-human visitor, it displays a CAPTCHA-like challenge
I take an even simpler approach: forward to a page that says in effect “Whoops! You’ve accidentally replicated the behavior of an undesirable robot” and then there’s a link to the originally requested page on the slim chance that it really was a human. At any given time, there will be a couple of botnets with a predictable pattern of requests ... and also bona fide humans from {country} requesting pages in {directory}, where the redirect is an alternative to saying “Past experience tells me you’re too stupid to glance at a search snippet or even at the title of the page [they can’t all be using the I’m Feeling Lucky option, can they?] and I really don’t see why I should put my server to the work of sending you up to a hundred image files for a page you’ll never even look at.” None of them ever follow the link.
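For anyone who wants to try the forward-with-a-link idea, here is a minimal sketch in Python. It is not the script described above. The page location `/whoops.html`, the `from` query parameter, and all three function names are assumptions made up for illustration. It only shows building the redirect target and the page body, and it refuses to link anywhere off-site so the page can't be abused as an open redirect.

```python
from html import escape
from urllib.parse import quote, urlsplit

# Hypothetical location of the "Whoops!" page; not taken from the thread.
INTERSTITIAL_PATH = "/whoops.html"


def interstitial_location(requested_path: str) -> str:
    """Return the redirect target, carrying the original path in ?from=."""
    return f"{INTERSTITIAL_PATH}?from={quote(requested_path, safe='/')}"


def safe_local_path(candidate: str) -> str:
    """Accept only same-site paths so the link cannot point off-site."""
    parts = urlsplit(candidate)
    if parts.scheme or parts.netloc or not candidate.startswith("/"):
        return "/"
    return candidate


def interstitial_html(original_path: str) -> str:
    """Build the page body: a short notice plus a link back for real humans."""
    target = escape(safe_local_path(original_path), quote=True)
    return (
        "<p>Whoops! You've accidentally replicated the behavior of an "
        "undesirable robot.</p>\n"
        f'<p><a href="{target}">Continue to the page you requested</a></p>'
    )
```

A request flagged as suspicious would get a redirect to `interstitial_location(path)`, and the page at that address would render `interstitial_html()` from the already-decoded `from` value. How a request gets flagged in the first place is left out, since that depends entirely on your own rules.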
Most malign robots still seem to use http by default
I have no web-hits from any of those networks in my logs during the past 4 years. Whoever was using RESNET wasn't doing much crawling from what I can see.
That sounds like a twist on the age-old issue of telling victims of {insert type of crime ad lib} that they should be flattered because obviously they’ve made themselves attractive to {insert type of offender ad lib}. Should I feel sad that my website hasn’t drawn the attention of some particular variety of malign robot?