Can a scraper/DoS attack be stopped if coming from random IP addresses?

Did not appear as a robot - but had a browser user-agent

         

maximillianos

6:01 pm on Nov 14, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello everyone.

I'm just pining for some ideas of what is possible.

This morning my website went down due to a scraper and/or DoS attack. I was able to identify the IP and block it.

Well, a few hours later they came back stronger than ever, and this time they were coming from hundreds of different IP addresses, but all with the same user agent:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

How can I combat such an attack? Is the above user-agent a valid one, or should I try to block that agent?

If it is valid, what the heck can one do to stop such an attack that appears to mimic valid web user requests?

Thanks for any ideas! I'm hoping to put something in place before they return.
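
Before blocking a user agent outright, it can help to measure how much of the traffic it accounts for and how many addresses it comes from. The following is a minimal Python sketch, assuming the server writes Apache's "combined" log format; the function name `summarize` and the sample lines (which use documentation-only IP addresses) are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed layout (Apache "combined" log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def summarize(lines):
    """Return {user_agent: (request_count, distinct_ip_count)}."""
    hits = Counter()
    ips = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not fit the assumed format
        hits[m["ua"]] += 1
        ips[m["ua"]].add(m["ip"])
    return {ua: (hits[ua], len(ips[ua])) for ua in hits}


# Hypothetical sample data, not taken from any real log.
IE6_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
OPERA_UA = "Opera/9.62 (Windows NT 5.1; U; en)"
SAMPLE_LOG = [
    '192.0.2.1 - - [14/Nov/2008:17:00:01 +0000] "GET /a.html HTTP/1.1" 200 512 "-" "' + IE6_UA + '"',
    '192.0.2.2 - - [14/Nov/2008:17:00:01 +0000] "GET /b.html HTTP/1.1" 200 734 "-" "' + IE6_UA + '"',
    '198.51.100.7 - - [14/Nov/2008:17:00:02 +0000] "GET /c.html HTTP/1.1" 200 120 "-" "' + IE6_UA + '"',
    '203.0.113.5 - - [14/Nov/2008:17:00:03 +0000] "GET / HTTP/1.1" 200 2048 "-" "' + OPERA_UA + '"',
    "not a log line",
]

# Print the user agents, busiest first.
for ua, (count, ip_count) in sorted(
    summarize(SAMPLE_LOG).items(), key=lambda item: -item[1][0]
):
    print(count, "requests from", ip_count, "IPs:", ua)
```

A user agent that suddenly dominates the log while spread thinly across many addresses fits the pattern described above, although an ordinary browser string will also show up in legitimate traffic.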

jdMorgan

7:05 pm on Nov 14, 2008 (gmt 0)

Are you sure this isn't Exploit Prevention Labs' or AVG's LinkScanner? If it is, you can safely rewrite any request from a UA containing "SV1" or other LinkScanner-identifying substrings to a very short but valid HTML page.

For example, in Apache, in example.com/.htaccess (assuming mod_rewrite is enabled and RewriteEngine is on):


RewriteCond %{HTTP_USER_AGENT} ^(User-Agent:\ )?Mozilla/4\.0\ \(compatible;\ MSIE\ 6\.0;\ Windows\ NT\ (5\.1|6\.0);(1813|\ SV1)\)$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^User-Agent:\ Mozilla/4\.0\ \(compatible;\ MSIE\ [67]\.0;\ Windows\ NT
RewriteCond $2 !^avg-scan\.html$
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP:Accept-Encoding} !gzip,\ deflate
RewriteRule ^([^/]+/)*(([^./]+\.)+html)?$ /avg-scan.html [L]
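
To see which requests the header tests in these rules would single out, here is a rough Python translation. It is a sketch only: the function name `looks_like_linkscanner` is made up, missing headers are assumed to arrive as empty strings, and the path pattern of the RewriteRule is left out. Note that the user-agent string quoted in the first post is also what a real IE6 on Windows XP SP2 sends, which is presumably why the rules also require an empty referer and the absence of "gzip, deflate" in Accept-Encoding.

```python
import re

# Python versions of the two User-Agent patterns (escaped spaces become
# plain spaces, and the alternations use "|").
UA_EXACT = re.compile(
    r"^(User-Agent: )?Mozilla/4\.0 \(compatible; MSIE 6\.0; "
    r"Windows NT (5\.1|6\.0);(1813| SV1)\)$"
)
UA_PREFIXED = re.compile(
    r"^User-Agent: Mozilla/4\.0 \(compatible; MSIE [67]\.0; Windows NT"
)


def looks_like_linkscanner(method, user_agent, referer="", accept_encoding=""):
    """Mirror the header conditions: (UA test 1 OR UA test 2) AND the rest."""
    ua_matches = bool(UA_EXACT.search(user_agent) or UA_PREFIXED.search(user_agent))
    return (
        ua_matches
        and method == "GET"                          # REQUEST_METHOD ^GET$
        and referer == ""                            # HTTP_REFERER ^$
        and "gzip, deflate" not in accept_encoding   # !gzip,\ deflate
    )


# The user agent reported in the first post, sent without a referer or
# Accept-Encoding header, and the same request as a browser might send it.
ATTACK_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
print(looks_like_linkscanner("GET", ATTACK_UA))
print(looks_like_linkscanner("GET", ATTACK_UA, accept_encoding="gzip, deflate"))
```

If the flood sends the same headers as a normal browser, these conditions will not separate it from real visitors, and something other than header matching is needed.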

Jim

maximillianos

7:58 pm on Nov 14, 2008 (gmt 0)

Not sure about that. Their first attack came a few hours earlier under a completely different user-agent:

Opera/9.62 (Windows NT 5.1; U; en)

The first time, they actually came from a single static IP: 72.73.xx.xx

They made about 5000 page requests within about 30 minutes...

Pretty much brought my server down since they were pulling some of my more complicated scripts that are not run that often... along with some of my more general scripts that I utilize memcached with.
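
For scale, 5000 requests in 30 minutes works out to roughly 2.8 requests per second from one client. A common defence against that kind of burst is a per-client rate limit placed in front of the expensive scripts. Below is a minimal sliding-window sketch in Python; the class name and the limits in the usage line are assumptions for illustration, not anything used on the site in question.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, key, now):
        """Return True if a request for `key` at time `now` is within the limit."""
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True


# Hypothetical policy: at most 60 requests per minute per client.
limiter = SlidingWindowLimiter(limit=60, window=60.0)
```

The key would normally be the client IP. When the requests are spread over hundreds of addresses, a per-IP key alone will not catch them, so the key might instead combine the user agent with the requested script, at the cost of also slowing real visitors who share that user agent.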

I have made some enemies lately by filing some DMCA complaints, etc. against some scraper sites that stole 10,000 pages of my content... They finally got booted from Google... but only after I exposed them for other issues (black hat SEO). At that point they threatened me and well... since then I've had my server hacked once, and now this is happening...

Can't ever say being a webmaster is boring work... =)

[edited by: eelixduppy at 12:53 am (utc) on Nov. 15, 2008]
[edit reason] obfuscated [/edit]

incrediBILL

8:02 am on Nov 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can get a free script at [anticrawl.com...] that should squelch this nonsense until they give up. I have read that the script may cause other issues, so I only recommend it on a temporary basis, but it's enough to hold them off until they go away frustrated.

maximillianos

1:39 pm on Nov 17, 2008 (gmt 0)

Thanks Bill, I will check that out. At first glance it appears spammy, but I trust you... ;-)

incrediBILL

3:04 pm on Nov 17, 2008 (gmt 0)

"At first glance it appears spammy"

It is kinda spammy, but the code seems OK. Use a throwaway email address ;)

maximillianos

3:29 pm on Nov 17, 2008 (gmt 0)

Thanks... Will do. =)