These google things in my bot trap

Spambots or real ones?

         

silverbytes

5:23 pm on Jan 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had some IPs that fell into my bot trap (they didn't obey robots.txt). Some of these didn't resolve when I performed a reverse DNS lookup, so I assume those are spambots.
(Could I be wrong about that?)

but I found two in particular:

ik-out-f136.google.com
hs-out-f136.google.com

What are these? At least one of them seems to be a proxy at Google (google.com).

Questions: they aren't obeying robots.txt, so is that spam coming from Google?
Should I ban those?

wilderness

6:08 pm on Jan 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I only checked the IP range for the first hostname you listed.

You need to decide on your own whether the crawling/probing is beneficial or detrimental to your website(s).

I have the entire Class C range of google denied.
It's some kind of tool or plug-in for browsers.

incrediBILL

2:34 am on Jan 31, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google has lots of ugly holes in their IP ranges that can be used by bots other than Googlebot. For starters, there's the Google translator; I've been scraped through that a few times. Then there's the web accelerator, which will definitely trip bot traps trying to pre-fetch pages, and some mobile proxies, all sorts of stuff.

BTW, the AdWords quality bot doesn't do the reverse DNS thing and won't show as googlebot.com either.
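For anyone wanting to script "the reverse DNS thing", here is a minimal Python sketch of a forward-confirmed reverse DNS check. The function name `is_verified_googlebot`, the injectable lookup parameters, and the choice to accept only hostnames ending in `.googlebot.com` are assumptions made for illustration; nothing here is an official Google method, and the accepted suffixes may need adjusting.

```python
import socket


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None,
                          allowed_suffixes=(".googlebot.com",)):
    """Return True if `ip` reverse-resolves to an allowed hostname AND that
    hostname resolves back to the same IP (forward confirmation).

    The lookup callables are hypothetical hooks so the logic can be
    exercised without touching the network.
    """
    # Default to real DNS lookups from the standard library.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:
        # No PTR record at all: cannot be verified.
        return False

    host = host.rstrip(".").lower()
    # Assumption: only these suffixes count as the real crawler.
    if not host.endswith(tuple(allowed_suffixes)):
        return False

    try:
        forward_ips = forward_lookup(host)
    except OSError:
        return False

    # A spoofed PTR record will not survive the forward check.
    return ip in forward_ips
```

With this check, a host such as ik-out-f136.google.com would come back as not verified, which matches the point above: it belongs to Google but is not Googlebot. A failed lookup only means "unverified", not necessarily "spambot".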

silverbytes

3:55 pm on Jan 31, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



AdWords quality bot doesn't do the reverse DNS thing and won't show as googlebot.com either

What does that bot do?
Do you know what IPs it uses?

incrediBILL

8:10 pm on Feb 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The user agent for that one is "AdsBot-Google (+http://www.google.com/adsbot.html)" and I've only seen it operate from the IP block 72.14.199.*
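A bot trap could use those two observations to whitelist the AdWords bot. The Python sketch below checks both the user agent token and the 72.14.199.* block reported above; the function name `looks_like_adsbot` is hypothetical, and the IP block is only what was observed in this thread, so treat it as an assumption that may go stale.

```python
import ipaddress

# Values taken from the observation above; not an official list.
ADSBOT_UA_TOKEN = "AdsBot-Google"
ADSBOT_NETWORK = ipaddress.ip_network("72.14.199.0/24")  # i.e. 72.14.199.*


def looks_like_adsbot(user_agent, remote_ip):
    """Return True only if the user agent contains the AdsBot token
    AND the request comes from the observed IP block."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed address in the log or request: treat as not matching.
        return False
    return ADSBOT_UA_TOKEN in user_agent and addr in ADSBOT_NETWORK
```

Requiring both conditions matters because the user agent string alone is trivial to fake, and the IP block alone is shared with other Google services.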

It should only crawl your site if you run AdWords ads that link to your site as it somehow determines your landing page quality, blah.

wilderness

9:29 pm on Feb 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It should only crawl your site if you run AdWords ads

Bill,
The same Class C range is also used by the Google-image bot.
When the image bot went haywire last year and began crawling my images (they had been excluded in robots.txt since the beginning of time), I added the Class C to my denials.

I did contact Google (search old threads) and the bot was manually stopped. However, it has occasionally resumed for short crawls since, so I have not removed the range from my denials.