Anybody know what's up with that? Or how that could have happened?
Thanks in advance
Audrey
[edited by: tedster at 5:57 pm (utc) on Oct. 8, 2006]
[edit reason] use example.com - not your real domain [/edit]
"Referral spam" is when you receive a request with a forged "referrer" header. The intent is to make it appear that you are getting traffic referred from sites that aren't actually sending you any.
The hope on the part of the spammer is that you publish some sort of list of "top sites that refer to us" or else have an open log page. Because some blogging software automatically generates the "top sites" sort of list, this has become more popular.
This is a bit different, though. Somebody has accessed your site with the host: header set to an irrelevant site. The host: header, which was not supported prior to HTTP 1.1, is used to distinguish between sites when accessing a virtual host. Let's say you have example.com and example2.com, and you want to run both using the same server and IP address. How can the server tell the difference? It can tell by looking at the host: header, which all modern browsers add to the request.
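To make the virtual-host idea concrete, here is a minimal sketch of how a server might pick a site from the host: header. The table, the document-root paths and the function name pick_site are made up for illustration; a real web server does this through its own configuration.

```python
# Hypothetical mapping from host name to document root (illustrative paths only).
VIRTUAL_HOSTS = {
    "example.com": "/var/www/example",
    "example2.com": "/var/www/example2",
}


def pick_site(host_header):
    """Return the document root for a Host: header value, or None if unknown."""
    if not host_header:
        return None
    # Host names are case-insensitive, and the value may carry a ":port" suffix.
    # (This simple split does not handle bracketed IPv6 literals.)
    name = host_header.strip().lower().split(":", 1)[0]
    return VIRTUAL_HOSTS.get(name)
```

Both sites share one IP address; only the header value decides which document root is used. A value that matches neither site returns None, which is the case being discussed here.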
There is nothing that will prevent client software from putting any arbitrary string in this header, though.
I suppose the hope is that you publish a list of "most accessed pages" or else have an open log published on your site. If somebody hit you enough times and you published such a list, another site could wind up at the top of it.
It could also be an attempt to find open proxy servers. Frankly, I'm not familiar with the details of how proxy servers work, or whether they would typically be on port 80 at all.
They are already getting a 404 error, but you could take some more aggressive steps. The ultimate way to reject this would be with an application-level firewall that would disconnect upon seeing a host: header that doesn't correspond to a site that is hosted on your server.
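As a rough sketch of the check such a firewall would perform (not any particular product's implementation), the function below reads the raw request and decides whether to drop the connection. ALLOWED_HOSTS and should_disconnect are assumed names, and the host list is a placeholder.

```python
# Assumed list of sites actually hosted on this server.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def should_disconnect(raw_request: bytes) -> bool:
    """True if the Host: header is missing or names a site we do not host."""
    # Only the header section (before the first blank line) matters here.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    for line in lines[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            # Drop any ":port" suffix and compare case-insensitively.
            host = value.strip().lower().split(":", 1)[0]
            return host not in ALLOWED_HOSTS
    return True  # no Host: header at all
```

Note that treating a missing host: header as grounds for disconnecting would also turn away old HTTP 1.0 clients that never send one, so that last line is a policy choice.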
So far, this is the only instance where I've noticed this particular url with that extension on it. I hope I do not see another.
Thanks again for your help,
Audrey
It's not the host header but the request line, the first line of the request. (Though I have seen invalid host headers, as well.)
Same idea, though. There's nothing to stop a client from stuffing any arbitrary text - whether meaningful or not - into any part of the request. In this case, it's the request line.
Many hardware firewalls do have the ability to screen this out. For example, on a Netscreen firewall, you can use the Screening, Mal-URL setup to block any HTTP request that starts with "http://".
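If you don't have a firewall with that feature, the same test is easy to express in code. This is only a sketch of the idea, not the Netscreen logic; looks_like_proxy_probe is a made-up name, and checking for "https://" as well is my own assumption.

```python
def looks_like_proxy_probe(request_line: str) -> bool:
    """True if the request target is a full URL rather than a local path."""
    parts = request_line.split()
    if len(parts) < 2:
        return False  # malformed or empty line; nothing to judge
    target = parts[1]
    # Browsers talking directly to a site send "/path"; a full URL in the
    # request line is what a client sends when it expects a proxy.
    return target.lower().startswith(("http://", "https://"))
```

For example, "GET http://www.example.com/ HTTP/1.1" is flagged while "GET /index.html HTTP/1.1" is not. A full URL in the request line is technically legal in HTTP 1.1, so a hard block could turn away the occasional legitimate but unusual client.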
I see quite a bit of this in my logs. I assume they are looking for open proxy servers. I don't run a proxy server, so it's benign, save for the unwanted traffic.