Forum Moderated by: open

Crawler, Spider, and User Agent ID


Forum to identify search engine spiders and user agents

 
Thread Subject | Messages | Started by | Last Message
New Spider IP Address
This IP doesn't show up on any listings I've seen
2 ThirdRock 2:11 pm Mar 7, 2006
Rss_reader
2 bobothecat 11:23 pm Mar 6, 2006
ucbot
2 bobothecat 10:54 pm Mar 2, 2006
What are those guys at Slurp! China on?
SlurpConfirm404 URL oddities
3 AlexK 6:38 pm Mar 2, 2006
Y!j-bsc/1.0
New Yahoo spider?
3 bobothecat 2:01 am Mar 2, 2006
Google and Drupal
3 Sheen1 4:45 pm Feb 28, 2006
Robot - Meaningful Machines
ultra-aggressive spider
5 ziegast 9:37 am Feb 27, 2006
NextopiaBOT
Did not check robots.txt
4 bose 5:13 am Feb 26, 2006
Is Google dealing with Yahoo (Inktomi)?
Google Sitemaps file spidered by Yahoo
5 extranjero 4:57 am Feb 26, 2006
Slurpy Verifier/1.0
Has anyone seen this?
4 selomelo 4:41 am Feb 26, 2006
InfoPath.1. what is it?
Earlier thread died on speculation
3 arnarn 4:37 am Feb 26, 2006
YahooYSMcm/2.0.0
No robots.txt
4 GaryK 1:01 pm Feb 24, 2006
tailrank
2 keyplyr 5:37 am Feb 18, 2006
Slew of new Slurp IPs?
6 MrSpeed 4:48 pm Feb 16, 2006
Is the government spying on me?
If so at least they read robots.txt!
8 GaryK 4:24 am Feb 16, 2006
Was this a fake "msnbot"? (Non-MS IP; no robots.txt; triggered traps)
Does Microsoft sell/license their bot to others?
11 Pfui 4:54 pm Feb 15, 2006
Titanium 2005 (4.02.01)
Is this Panda Antivirus Titanium?
5 GaryK 4:33 am Feb 14, 2006
PHP files with "?" string
Are they indexed?
3 halbesma 11:14 pm Feb 13, 2006
MSIE Crawler
What is it?
4 RichTC 11:10 pm Feb 13, 2006
Jeeves
3 wilderness 9:19 pm Feb 13, 2006
Pic-grabber? *internetserviceteam.com showing up in log
completely ignored robots.txt and now I don't know how to stop it!
3 Dottie_Matrix 6:44 pm Feb 13, 2006
cloaked spider from 66.220.7.* and 66.220.20.*
relentless .html + .txt spider
4 Hetta 3:43 pm Feb 12, 2006
Google attempting crawl with invalid Mozilla User-agent
I hope this is just a G employee fooling around
10 jdMorgan 7:24 am Feb 9, 2006
Deleted old pages, but spiders report them as errors because they're no longer live
6 hulahoop 7:13 am Feb 9, 2006
New bot Java/1.5.0_06 grabs all pages
grabbed all pages from 2 different domains
5 privacyman 7:04 am Feb 9, 2006