Forum Moderated by: open
Forum to identify search engine spiders and user agents
| Thread Subject | Messages | Started by | Last Message |
|---|---|---|---|
| Szukacz claims to honour robots.txt but doesn't | 5 | Mokita | 4:28 pm Aug 3, 2006 |
| Adwords Bot | 2 | volatilegx | 4:13 pm Aug 3, 2006 |
| ebay indexing? | 6 | jake66 | 3:24 am Aug 3, 2006 |
| PigBlock No robots.txt | 4 | GaryK | 7:14 pm Aug 1, 2006 |
| Purpose of this Crawler/Bot? | 7 | DXL | 7:04 pm Aug 1, 2006 |
| The new Y! slurp | 3 | BlackTulip | 5:36 pm Aug 1, 2006 |
| PediaSearch.com Crawler | 9 | keyplyr | 7:19 am Jul 30, 2006 |
| SevenTwentyFour/LinkWalker - New Owner, Mission Watch out! Brand name surveillance via LinkWalker. | 4 | Wizcrafts | 5:00 am Jul 30, 2006 |
| lwp::simple/5.803 pretending to be yahoo? | 6 | jake66 | 7:53 pm Jul 27, 2006 |
| "NutchCVS" (again) but from penguin26.parc.xerox.com No robots.txt | 3 | Pfui | 9:02 am Jul 26, 2006 |
| "GT::WWW/1.026" from .reverse.layeredtech.com No robots.txt | 7 | Pfui | 4:16 am Jul 26, 2006 |
| MetagerBot/0.8-dev (MetagerBot; http://metager.de; ) Note space before close paren | 8 | Pfui | 7:19 pm Jul 25, 2006 |
| Downloads from a blank UA? | 23 | keyplyr | 6:26 am Jul 25, 2006 |
| HA! No User Agent for You! These just flat-out annoy me! | 10 | GaryK | 2:16 am Jul 25, 2006 |
| Crawler/1.0 http://elibron.com No robots.txt | 5 | GaryK | 7:24 pm Jul 24, 2006 |
| Mozilla/5.0 (compatible;MAINSEEK BOT) No robots.txt | 3 | GaryK | 3:06 pm Jul 24, 2006 |
| Mozilla/5.0 (compatible; robtexbot/1.0; http://www.robtex.com/ ) Note space before close paren. Also: no robots.txt; uses site URL in ref | 10 | Pfui | 1:50 am Jul 24, 2006 |
| "teoma agent1" from directhit.com -- no robots.txt | 2 | Pfui | 11:17 pm Jul 22, 2006 |
| "research-spider" from .cs.brown.edu | 2 | Pfui | 11:16 pm Jul 22, 2006 |
| "Entrieva/1.0" -- no robots.txt | 2 | Pfui | 10:26 pm Jul 22, 2006 |
| 000s of Truncated Page Requests from Many IPs | 82 | jomaxx | 10:59 pm Jul 20, 2006 |
| Yahoo! Crawlers - A response from Yahoo! Search Response from Yahoo! | 9 | Yahoo_Mike | 8:33 am Jul 18, 2006 |
| How to ban (compatible ; type requests Note space between compatible and semicolon | 40 | larryhatch | 3:06 am Jul 18, 2006 |
| Googlebot Google but not Googlebot | 4 | vortech | 4:55 pm Jul 16, 2006 |
| server2.attributor.com | 12 | Cromicon | 4:39 pm Jul 16, 2006 |