[edited by: phranque at 1:57 am (utc) on Oct 28, 2018]
[edit reason] unlinked url [/edit]
Humans will use language fields, bots will not
In the one I was looking at, the language header--which would have got the request denied if it had been absent--is a wholly plausible-for-the-source en-gb first, es-mx second. Admittedly, if it were a botnet running off infected machines, they could camouflage themselves by sending the infected browser's default headers.
Beyond the file being requested, the user-agent, and the referrer there are always, always a couple of blank fields at the end that I don't know what's supposed to be there
Different server types put their logs in different orders, but there will always be a couple of blank fields for things that could be present, but aren't. (For example, the typical Apache log entry contains the element - - representing two login-related fields that aren't needed in most requests.)
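To make the blank fields concrete, here is a minimal Python sketch that splits an Apache "combined" format line into named fields. It assumes the default combined layout, where the two fields after the IP are the identd logname and the authenticated user name; the regex, the function name `parse_combined`, and the choice to turn "-" into None are illustrative, not anything a server ships with. The sample line is the SemrushBot entry quoted in this thread.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields, with "-" placeholders turned into None."""
    m = COMBINED.match(line)
    if m is None:
        return None  # line does not follow the assumed layout
    return {k: (None if v == "-" else v) for k, v in m.groupdict().items()}

sample = (
    '46.229.168.143 - - [28/Oct/2018:00:05:01 -0700] '
    '"GET /ebooks/salmonia/ HTTP/1.1" 200 187103 "-" '
    '"Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)"'
)

entry = parse_combined(sample)
# ident and user are the two "- -" fields; referer is blank here as well
print(entry["ident"], entry["user"], entry["referer"])
```

A server with a customized log format, or IIS, would need a different pattern; the point is only that each "-" holds the place of a field that had nothing to report.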
If both IPs belong to the same ISP, maybe it's something in their setup. A proxy?
That would have been my first-choice interpretation too: the ISP is doing something--possibly involving a phrase like “load balancing”--that makes sense to them, even if it makes sense to nobody else and creates confusion for the four people on the planet who actually look at their raw access logs ;)
Whatever the "http request headers" thing is, I don't see it in my logs
request headers (not access log entries)
please follow directions
Please keep in mind that on these forums, almost all discussions of header logging have involved PHP on Apache servers. SumGuy is on IIS and will therefore need to follow entirely different steps.
46.229.168.143 - - [28/Oct/2018:00:05:01 -0700] "GET /ebooks/salmonia/ HTTP/1.1" 200 187103 "-" "Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)"
2018-10-28:00:01:14
URL: /ebooks/salmonia/
IP: 46.229.168.153
User-Agent: Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)
Connection: close
Accept-Encoding: gzip,deflate
Accept: text/html
Host: example.com
This is an authorized robot; humans will typically include a few more header fields, while malign robots typically have fewer header fields.
Do all browsers follow or obey the server directive to transmit that information? Are there add-ons or settings that disable it?
Headers are part of a request. If they aren't sent, no valid request is received. In the example I quoted, everything after the first three lines (timestamp, requested resource, IP) is a named header. Certain header fields, like User-Agent and Referer, are logged by the server. I think you could theoretically--if it's your own server--customize logs so all header fields, or at least all standard ones, are shown all the time, but what a mess that would be.
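For anyone who wants to work with header dumps like the ones quoted in this thread, the following is a rough Python sketch of reading one entry back into a structure. It assumes exactly the layout shown here (a timestamp line, then URL: and IP: lines, then one "Name: value" line per header); the function name `parse_header_entry` is made up for illustration, and this is not the script that produced the dumps.

```python
def parse_header_entry(text):
    """Split one header-log entry into timestamp, url, ip and named headers."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    timestamp = lines[0]  # first line is the bare timestamp
    fields = {}
    for ln in lines[1:]:
        # Split on the first ": " only, so values containing colons survive
        name, _, value = ln.partition(": ")
        fields[name] = value
    # URL and IP are added by the logging script; the rest are real headers
    url = fields.pop("URL", None)
    ip = fields.pop("IP", None)
    return {"timestamp": timestamp, "url": url, "ip": ip, "headers": fields}

sample = """
2018-10-28:00:01:14
URL: /ebooks/salmonia/
IP: 46.229.168.153
User-Agent: Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)
Connection: close
Accept-Encoding: gzip,deflate
Accept: text/html
Host: example.com
"""

parsed = parse_header_entry(sample)
print(parsed["ip"], sorted(parsed["headers"]))
```

Keeping the headers in their own dict, separate from the three bookkeeping lines, makes it easy to count them or check which ones a given request left out.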
2018-10-28:02:32:36
URL: /wordpress/wp-admin/
IP: 50.62.160.99
Connection: close
Host: example.com
I do not need to cross-check server access logs to know that this request was blocked.
[edited by: lucy24 at 5:55 pm (utc) on Oct 30, 2018]
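The kind of header-based decision discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the rule set anyone here actually runs: the allowlist `AUTHORIZED_BOT_TOKENS` and the function `should_block` are hypothetical, and the rules are simply "no User-Agent means block, a recognized robot passes, anything else must send Accept-Language".

```python
# Hypothetical allowlist of robots the site owner has chosen to admit
AUTHORIZED_BOT_TOKENS = ("SemrushBot",)

def should_block(headers):
    """Return True if the request headers look too sparse to be trusted."""
    # Header names are case-insensitive, so normalize before checking
    h = {name.lower(): value for name, value in headers.items()}
    agent = h.get("user-agent", "")
    if not agent:
        return True  # no User-Agent at all, as in the wp-admin request
    if any(token in agent for token in AUTHORIZED_BOT_TOKENS):
        return False  # recognized robot; fewer headers are expected
    # Everyone else is expected to send a language preference
    return "accept-language" not in h

semrush = {
    "User-Agent": "Mozilla/5.0 (compatible; SemrushBot/2~bl; +http://www.semrush.com/bot.html)",
    "Connection": "close",
    "Accept-Encoding": "gzip,deflate",
    "Accept": "text/html",
    "Host": "example.com",
}
wp_probe = {"Connection": "close", "Host": "example.com"}

print(should_block(semrush), should_block(wp_probe))
```

A User-Agent string is trivial to fake, so a real setup would pair a check like this with something harder to spoof, such as the robot's published IP ranges.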
I have never heard of packet switching on a human browser
“Packet switching” may not be the operative term here. Each file is a separate request, and it's up to the ISP how to handle successive requests that ultimately go to the same human computer. At a minimum, a page request is separate from supporting-file requests, since the browser doesn't know what other files to ask for until it has read the page. You may remember that AOL requests--especially AOL dialup--came from all over the map. Requests from mobiles are also likely to be widely distributed. (I would prefer to think this is not because the visitor is surfing the web while driving, hopping along from one tower to the next. If you happen to live equidistant from two or more cell towers, two successive requests may end up following entirely different routes.)