Forum Moderators: phranque

GET /index.html?author=n requests

What's with that?

         

SumGuy

11:56 pm on Dec 21, 2019 (gmt 0)

5+ Year Member Top Contributors Of The Month



What is this request trying to accomplish?

GET /index.html?author=n author=n

The user-agent is Apache-HttpClient/4.5.2 (Java/1.8.0_151)

A few times now I've seen a sequence of requests like this from the same IP, with n incrementing from 1 to 20. The requesting IP was in China. Each request gets a 200 response and my index.html file, but my server ignores the ?author part of the request.

phranque

1:42 am on Dec 22, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



HttpClient [hc.apache.org] is an Apache Software Foundation (open source) project so it's got a low entry barrier.

i've never seen that pattern in the url request.
it is certainly probing behavior.

you shouldn't be responding with a 200 OK status code here.
you should be 301 redirecting this request to the (canonical) root url - i.e. https://www.example.com/
you can do this using mod_rewrite (assuming an apache server)

or if you decide that this request pattern is intrusive/unwanted, you should return a 403 Forbidden status code.
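
Editor's sketch of the two options above, written as plain Python rather than mod_rewrite so the decision logic is easy to test. The function name `decide` and the `block_probes` switch are hypothetical, and `https://www.example.com/` is the placeholder host from the post, so treat this as an illustration of the policy and not a server configuration.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder canonical root, taken from the example URL in the post above.
CANONICAL_ROOT = "https://www.example.com/"


def decide(request_target, block_probes=False):
    """Return (status, location) for a target such as '/index.html?author=3'.

    Hypothetical helper: it mirrors the advice in the post, not any real
    server API.
    """
    parts = urlsplit(request_target)
    query = parse_qs(parts.query, keep_blank_values=True)

    is_index = parts.path in ("/", "/index.html")
    is_author_probe = is_index and "author" in query

    # Option 2: treat the pattern as unwanted and refuse it.
    if is_author_probe and block_probes:
        return 403, None

    # Option 1: send the probe (and a bare /index.html) to the canonical root.
    if is_author_probe or parts.path == "/index.html":
        return 301, CANONICAL_ROOT

    # Anything else is served normally.
    return 200, None
```

With `block_probes` left off, `/index.html?author=3` gets a 301 to the root; with it on, the same request gets a 403.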

lucy24

2:49 am on Dec 22, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"This results in code 200 and they get my index.html file"
Why doesn't it result in code 301 (Index redirect)?

I think at one point I was getting enough of the ?author=blahblah business that I had to block them by that criterion, though now it’s probably deficient headers that does it. At least, I couldn’t find any that were not blocked.

:: detour to raw logs ::

Yipes, I’d forgotten there were that many. And I’m not sure I ever noticed the pattern:

"GET /?author=1 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"
"GET /?author=2 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"
"GET /?author=3 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"

and so on all the way up to author=15. Sometimes they go up to 20 instead. And one of them had the hiccups, skipping two numbers en route from 1 to 15 so there were only 13 requests.

In the user-agent, the Java update numbers _151 and _161 seem to be the most popular, though they range from 61 to 201 overall. Always Java/1.8.0, and always--with a sole 15-hit exception--ending in 1. Huh.
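
Editor's sketch for spotting the same pattern in your own logs. It assumes lines shaped like the excerpts above (request, status, size, referer, user-agent). Because the excerpts carry no IP field, it groups by user-agent; grouping by IP would be the natural choice with full log lines. The names `author_runs` and `looks_like_enumeration` are made up for this example. Requests of this shape are commonly associated with WordPress author enumeration, though nothing in this thread confirms that is the intent here.

```python
import re
from collections import defaultdict

# Matches the trimmed log excerpts quoted above; a full access-log line
# would have more fields in front of the request.
LINE_RE = re.compile(
    r'"GET /(?:index\.html)?\?author=(?P<n>\d+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def author_runs(lines):
    """Collect author=N values per user-agent, in order of appearance."""
    runs = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            runs[match.group("ua")].append(int(match.group("n")))
    return dict(runs)


def looks_like_enumeration(values, min_hits=3):
    """True if the values start at 1 and strictly increase.

    Gaps are allowed, since one of the runs described above skipped
    two numbers on the way from 1 to 15.
    """
    return (
        len(values) >= min_hits
        and values[0] == 1
        and all(a < b for a, b in zip(values, values[1:]))
    )


sample = [
    '"GET /?author=1 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"',
    '"GET /?author=2 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"',
    '"GET /?author=3 HTTP/1.1" 403 2330 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_151)"',
]

for ua, values in author_runs(sample).items():
    print(ua, values, looks_like_enumeration(values))
```

The `min_hits` threshold of 3 is arbitrary; raise it if short coincidental runs show up.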