While the above user agent was crawling my sites, Python-urllib/2.4 was crawling them from the same IP address.
gw26.jscc.ru - - [20/Jul/2006:03:43:19 -0700] "GET /dir/file.html HTTP/1.1" 403 815 "-"
"Crawler/1.0+http://elibron.com"
gw25.jscc.ru - - [20/Jul/2006:22:27:45 -0700] "GET / HTTP/1.1" 403 815 "-"
"Crawler/1.0+http://elibron.com"
83.149.215.35 - - [21/Jul/2006:05:07:10 -0700] "GET /dir2/ HTTP/1.1" 403 - "-"
"Crawler/1.0+http://elibron.com"
Excellent cross-spotting re the Python connection, Gary!
gw26.jscc.ru - - [20/Jul/2006:03:43:18 -0700] "GET /robots.txt HTTP/1.1" 403 815 "-"
"Python-urllib/2.4"
gw25.jscc.ru - - [20/Jul/2006:22:27:44 -0700] "GET /robots.txt HTTP/1.1" 403 815 "-"
"Python-urllib/2.4"
83.149.215.35 - - [21/Jul/2006:05:07:10 -0700] "GET /robots.txt HTTP/1.1" 403 815 "-"
"Python-urllib/2.4"
What a wonky way to do things. Or dumb like a fox -- if Python gets 403'd, Crawler sails on in...
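For anyone who wants to hunt for this pattern in their own logs, here is a rough sketch of the cross-check: group requests by client host and flag any host that sent more than one user agent. It assumes Apache's combined log format with each entry on a single line (the entries above are wrapped, so they are rejoined here). The names `LOG_PATTERN`, `agents_by_host` and `multi_agent_hosts` are made up for illustration, and the `192.0.2.10` entry is a hypothetical single-agent visitor added for contrast, not something from my logs.

```python
import re
from collections import defaultdict

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)"\s+"(?P<agent>[^"]*)"\s*$'
)


def agents_by_host(lines):
    """Map each client host to the set of user agents it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:  # silently skip lines that are not in the expected format
            seen[match.group("host")].add(match.group("agent"))
    return dict(seen)


def multi_agent_hosts(lines):
    """Return only the hosts that used more than one user agent."""
    return {
        host: sorted(agents)
        for host, agents in agents_by_host(lines).items()
        if len(agents) > 1
    }


SAMPLE = [
    'gw26.jscc.ru - - [20/Jul/2006:03:43:18 -0700] '
    '"GET /robots.txt HTTP/1.1" 403 815 "-" "Python-urllib/2.4"',
    'gw26.jscc.ru - - [20/Jul/2006:03:43:19 -0700] '
    '"GET /dir/file.html HTTP/1.1" 403 815 "-" "Crawler/1.0+http://elibron.com"',
    # Hypothetical single-agent visitor, included only for contrast.
    '192.0.2.10 - - [20/Jul/2006:04:00:00 -0700] '
    '"GET / HTTP/1.1" 200 1024 "-" "ExampleBrowser/1.0"',
]

if __name__ == "__main__":
    for host, agents in multi_agent_hosts(SAMPLE).items():
        print(host, "->", ", ".join(agents))
```

Hosts behind a shared proxy or NAT will also show several user agents, so treat a hit as a prompt to look closer (for example, a robots.txt fetch followed a second later by a page fetch), not as proof of one operator running two bots.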
-----
Btw, "Elibron.com" didn't answer the door, but Domain-t-o-o-l-s gives its site description as:
"Offers books in different languages, as well as music scores, visual art works, and historic photographs."
And "Brad DeLong's Semi-Daily Journal" (a blog, so no link) says:
"Elibron.com publishes cheap reprints and e-books in very small press runs, though mostly public domain stuff."
So why is a Russian super-computing entity crawling sites using TWO bots, and one from a package-and-sell-as-PDFs site? Is it borrowing the bots, or crawling for Elibron? (Can't say as I like the sound of either of those situations, actually.)
"So why is a Russian super-computing entity crawling sites using TWO bots..."
American SuperComputer Center for Browser Studies. ;)
Sounds impressive until you discover the super-computer consists of the desktop, laptop, and four servers on my home LAN, and the browser studies means looking for new user agents. :)