Possible Apache-HttpClient abuse

Server side scripts

         

akwexavante

7:38 am on Jul 1, 2019 (gmt 0)

5+ Year Member



Hoping for a little help.

Over the weekend I've had hundreds of failed visits from someone in Germany using "Apache-HttpClient/4.5.7 (Java/11.0.3)" as their browser to try to access many of my Perl CGI scripts directly. The same IP address has also been trying, and failing, to gain access to my router.

I say failed visits to Perl CGI scripts; that isn't strictly true, but direct access attempts are answered with a rather impolite "Go Away" message!

I tried googling to find out just what the Apache-HttpClient/4.5.7 (Java/11.0.3) browser is, and failed. I can find information, obviously, but nothing that makes sense of what Apache-HttpClient/4.5.7 (Java/11.0.3) is exactly and why it's used. Well, not in a way I understand.

I'm assuming that in this case it's being used to search for vulnerabilities and to exploit them wherever it finds them.

So what is this browser and its purpose!?

lucy24

3:45 pm on Jul 1, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is it a browser? I thought all those Apache-http client strings were just cut-rate robotic UAs.

:: detour to logs to pick out exact format ::

Yeah, full range (over the past 18 months) from
Apache-HttpClient/4.3.4 (java 1.5)
to
Apache-HttpClient/4.5.7 (Java/11.0.3)

Interestingly, the last-named specializes in HEAD requests for all kinds of non-page files (never noticed this before, because they're all blocked) while the others prefer to GET pages. There's even a
Apache-HttpClient/4.5.3 (Java/1.8.0_161)
(and earlier versions back to _111)
which spends half its time on GET /robots.txt
though it never seems to act on the information (for example by heading straight for roboted-out directories).

The fact that different exact UAs engage in different behaviors strongly suggests that people are picking up a robot script, using it for some specific purpose and then moving on.
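For anyone who wants to run the same kind of tally on their own logs, here is a minimal Python sketch. It assumes the server writes Apache's "combined" log format; the function name, the sample lines, the IP addresses and the paths are all made up for illustration and are not taken from anyone's real logs.

```python
import re
from collections import Counter, defaultdict

# Matches the request, status, size, referer and user-agent fields of a
# "combined" format log line (assumed format; adjust for a custom LogFormat).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_methods(lines, agent_prefix="Apache-HttpClient/"):
    """Count request methods per exact User-Agent string with the given prefix."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("agent").startswith(agent_prefix):
            counts[match.group("agent")][match.group("method")] += 1
    return counts


# Made-up sample lines: documentation-range IPs and hypothetical paths.
SAMPLE = [
    '203.0.113.5 - - [29/Jun/2019:07:38:00 +0000] "HEAD /images/logo.png HTTP/1.1" 403 - "-" "Apache-HttpClient/4.5.7 (Java/11.0.3)"',
    '203.0.113.5 - - [29/Jun/2019:07:38:02 +0000] "HEAD /styles/main.css HTTP/1.1" 403 - "-" "Apache-HttpClient/4.5.7 (Java/11.0.3)"',
    '198.51.100.7 - - [30/Jun/2019:11:02:10 +0000] "GET /robots.txt HTTP/1.1" 403 - "-" "Apache-HttpClient/4.5.3 (Java/1.8.0_161)"',
    '192.0.2.44 - - [30/Jun/2019:12:15:31 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for agent, methods in tally_methods(SAMPLE).items():
        print(agent, dict(methods))
```

Grouping by the exact User-Agent string rather than just the prefix is what makes the per-version differences (HEAD-heavy versus GET-heavy) visible.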

Why don't you just deny the UA comprehensively?

:: further cross-checking of logs and own htaccess ::

It must send defective headers, because I find I've never needed to deny it by name. It fails other tests. Typically a robot script comes with the full package of information: not just a User-Agent string, but headers and general behavior (for example, Nutch by default follows robots.txt).
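The two kinds of test mentioned here (deny by name, or fail on headers) can be sketched as one decision function. This is only an illustration in Python, not anyone's actual htaccess: the thread does not say which header test the client fails, so the "missing Accept / Accept-Language" check below is a guess, and the function and constant names are hypothetical.

```python
# Hypothetical deny list; this thread only names one family of user agents.
BLOCKED_AGENT_PREFIXES = ("Apache-HttpClient/",)

# Assumption for illustration: ordinary browsers send these headers, so a
# request without them is treated as robotic. The actual failing test is
# not stated in the thread.
EXPECTED_HEADERS = ("accept", "accept-language")


def deny_reason(headers):
    """Return a short reason if the request should get a 403, else None."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    agent = lowered.get("user-agent", "")
    if not agent:
        return "missing User-Agent"
    if agent.startswith(BLOCKED_AGENT_PREFIXES):
        return "blocked User-Agent"
    missing = [name for name in EXPECTED_HEADERS if name not in lowered]
    if missing:
        return "missing headers: " + ", ".join(missing)
    return None


if __name__ == "__main__":
    print(deny_reason({"User-Agent": "Apache-HttpClient/4.5.7 (Java/11.0.3)"}))
```

With the header test in place, a robot that changes its User-Agent string but keeps sending the same sparse headers would still be refused, which may be why a by-name rule was never needed.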

It is entertaining to track robot behaviors, but only after you've made sure they are not getting in.

akwexavante

5:28 pm on Jul 1, 2019 (gmt 0)

5+ Year Member



Thank you Lucy24 for your reply,

On this occasion the person or robot was directly targeting CGI scripts only, and no other files, which raised my suspicions and grabbed my attention.
When certain things happen, my scripts send me emails with information about what happened and when. In a small number of situations I get text messages too, but that didn't happen this time.
This Monday morning my inbox was, well, full of emails where a person or robot was specifically targeting CGI scripts directly.
I would normally get perhaps two emails a month and perhaps two text messages a year when unusual behaviour is identified, so to get an inbox filled in one weekend was very unusual indeed.

tangor

9:02 pm on Jul 1, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Seems like Summer heats things up. :)

I've had a number of these hits as well. All 403, of course, but still annoying.

Jonesy

10:25 pm on Jul 4, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



Putting the perl scripts below web root ought to help.
You can sleep better at night...

lucy24

12:54 am on Jul 5, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



... and call the directory something other than /scripts/. If I had it to do over again, I would not have called my /includes/ directory by that name. Instead I make do with a RewriteCond-plus-Rule that says in effect “If you ask for it, you can’t have it.” The same should work for cgi scripts and anything else that is used only internally by the server, not requested by the browser.
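The logic of "if you ask for it, you can't have it" can be sketched outside Apache as well. The Python below only illustrates the decision, it is not the RewriteCond-plus-Rule itself; the directory names and the function name are hypothetical, and it assumes the check sees only what the client requested, never the server's own internal includes.

```python
import posixpath
from urllib.parse import unquote

# Hypothetical internal-only directories; substitute your own names.
INTERNAL_PREFIXES = ("/includes/", "/cgi-bin/")


def status_for_request(target):
    """Return 403 if the client asked for an internal directory, else 200."""
    # Drop the query string, then decode %xx escapes such as %69 for "i".
    path = unquote(target.split("?", 1)[0])
    # Collapse "//", "/./" and "/../" so simple disguises do not slip through.
    normalized = posixpath.normpath("/" + path.lstrip("/")).lower()
    # The appended slash makes a request for the bare directory match too.
    if any((normalized + "/").startswith(prefix) for prefix in INTERNAL_PREFIXES):
        return 403
    return 200


if __name__ == "__main__":
    for target in ("/index.html", "/includes/header.html", "/pages/../includes//header.html?x=1"):
        print(target, status_for_request(target))
```

Normalizing the path before comparing is the design choice that matters: a plain string comparison against "/includes/" would miss requests that reach the same directory through doubled slashes, dot segments or percent-encoding.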

tangor

2:34 am on Jul 5, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Solid advice, though I've never had a problem with those specifics in 25 years ... php/wp*, on the other hand, is an absurd nightmare of unrelenting probes by bad actors, mostly from eastern Europe and Asia, with a liberal sprinkle from South America.

*Fortunately I don't use either and 403s are the routine response.