Block query strings and keywords in uri

         

colak

6:40 am on Apr 5, 2015 (gmt 0)

10+ Year Member



Following an attack from over 170,000 different IPs on one of my sites a few days ago, I am trying to block access to URLs containing specific keywords, but I am totally lost: everything I try either does not work or returns a 500 error.

Basically, the code below does not work:

<FilesMatch /?m=any&q=|/?m=any=|/index.php?s=|/?m=any&q=1|/wp-login|/fckeditor>
order allow,deny
deny from all
</FilesMatch>


I also tried

RewriteCond %{QUERY_STRING} ^m\=any$- [F]
RewriteCond %{QUERY_STRING} ^m\=any$- [F]
RewriteRule ^ - [F]


and

RewriteEngine On
RewriteCond %{THE_REQUEST} ^.*(wp-login).* [NC]
RewriteRule ^(.*)$ - [F,L]


RewriteCond %{REQUEST_URI} !^/(wp-login.php|wp-admin/|wp-content/plugins/|wp-includes/|fckeditor).* [NC]
RewriteRule .* - [F,NS,L]


all of which return 500 errors.

Could someone advise me on how I can block those requests?

lucy24

7:19 am on Apr 5, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<FilesMatch /?m=any&q=|/?m=any=|/index.php?s=|/?m=any&q=1|/wp-login|/fckeditor>

Can't check now, because it's too late in the night, but I kinda think Files or FilesMatch can't deal with query strings. So let's proceed to mod_rewrite. In addition, FilesMatch uses Regular Expressions, so each of those literal question marks would have to be escaped with a \ backslash (not a / slash, if that's what those were supposed to be).

RewriteCond %{QUERY_STRING} ^m\=any$- [F]

Yup, that's your 500 error. You've taken a RewriteRule flag and attached it to a RewriteCond. (I detoured to my test site and quickly confirmed that it's an instant 500.)
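
A minimal sketch of the corrected form, assuming the aim is to refuse requests whose entire query string is m=any. The only flags RewriteCond accepts are [NC], [OR] and [NV]; the [F] goes on the RewriteRule:

```apache
RewriteEngine On

# The condition only tests the query string; it carries no action flag.
RewriteCond %{QUERY_STRING} ^m=any$ [NC]
# The rule performs the action: "-" means no substitution, [F] returns 403.
RewriteRule ^ - [F]
```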

RewriteCond %{THE_REQUEST} ^.*(wp-login).* [NC]
RewriteRule ^(.*)$ - [F,L]

That one looks reasonable, except that what's the condition for? Does the /wp-login/ directory actually exist on your site, and is it used in some way? If yes, we'll hammer out some details. If no, all you need is
RewriteRule wp-login - [F]


However I don't see a 500 error in this rule by itself. The same goes for the following rule. Are you sure you tried each one separately?

The [NS] flag won't work for WP files. It applies only to server-internal activity such as directory indexes or SSIs, not to rewrites or php business. So in the last rule you do need a THE_REQUEST -- but, again, only if some of those requested filenames actually exist on your site, and are used by some http process.


Edit: It's important to remember that all these rules will not prevent malign robots from making requests. They'll only prevent them from getting in-- or, in the case of requests for nonexistent files, will save the server the work of having to look for the file each time instead of returning an immediate 403.

colak

8:03 am on Apr 5, 2015 (gmt 0)

10+ Year Member



Hi Lucy and thanks for the prompt reply. The site is not using wp so there are no pages or directories using those words. The cms I am using has its own php powered 404 error handling which might be messing up the htaccess rules.

RewriteCond %{THE_REQUEST} ^.*(wp-login).* [NC]
RewriteCond %{THE_REQUEST} ^.*(fckeditor).* [NC]
RewriteRule ^(.*)$ - [F,L]


for some strange reason now serve the 404 from the cms

colak

10:27 am on Apr 5, 2015 (gmt 0)

10+ Year Member



I just noticed some typos in my last sentence above. What I meant to write is:

for some strange reason (and in contrast to the 500 error yesterday), a cms produced 404 error is served for uris containing wp-login or fckeditor which is what I am trying to avoid.

not2easy

2:44 pm on Apr 5, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



But if they do not exist, they are supposed to return a 404 error. Do these file requests somehow serve a 200 for files that don't exist if you remove those rules? If not, the rules just make your server work harder to produce the same results in the end. There isn't a rule that can prevent requests.

colak

3:00 pm on Apr 5, 2015 (gmt 0)

10+ Year Member



With and without the rules the server logs show a 404 which I know that it is served by the cms from the page that appears. But maybe you are right. I am now wondering if I can serve a custom 404 page for those requests.

Also does anyone have any idea on how I could deal with those query strings?

not2easy

7:14 pm on Apr 5, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You can serve a 403 "Forbidden" response for those requests using the code that lucy24 posted:
RewriteRule wp-login - [F]

But since the requested pages don't exist, the bots may catch on sooner with a 404. If your regular 404 uses a lot of bandwidth it might make sense to serve them a different version with nothing but a 404 header. Then you would need to sort through requests as it is not a user friendly 404.

It is not uncommon to get botnets looking for easy access points. If it is an ongoing problem for your site, it may pay to look into alternatives to the defaults. If it was just a drive-by probe, it may not be worth your time, because it means setting up specific rules for specific requests, adding more files, and monitoring their effect. Normally at some point the botnet gets reprogrammed and quits looking for what isn't there, moving along to other prospective targets. A 403 might keep them looking for ways around a 403. You need to evaluate the long-term effect of dealing with a transient issue. (Of course this assumes it was a botnet problem, given that you said these requests are coming from over 170,000 IPs.)

lucy24

8:49 pm on Apr 5, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



for some strange reason (and in contrast to the 500 error yesterday), a cms produced 404 error is served for uris containing wp-login or fckeditor which is what I am trying to avoid.

That's what I would have expected. I tend to think that in your previous venture, you had some other unrelated line leading to the 500.

With and without the rules the server logs show a 404 which I know that it is served by the cms from the page that appears.

Actually, no. A 404 only shows up in server logs if the server itself couldn't find the file. If you have the generic CMS code that says, in part,
RewriteRule . /index.php [L]

then all requests will come through as a 200 in logs, except the ones that have been handled earlier. (Or that served a 403 on other grounds, such as "Deny from..." lines.)
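
To illustrate "handled earlier": a rough sketch of the ordering, assuming a typical front-controller block. The two RewriteCond lines are an assumption about what the CMS ships with; yours may differ.

```apache
RewriteEngine On

# Probe URLs are refused here, before the CMS ever sees them.
RewriteRule (wp-login|fckeditor) - [F,NC]

# Assumed generic CMS front controller: anything that is not a real
# file or directory is handed to index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

If the blocking rule sits below the front-controller rule, the request has already been rewritten to /index.php by the time it is evaluated, so the rule never gets a chance to match the original URL.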

Somewhere I've got a short list of Things That Took Me Years To Wrap My Brain Around. One item on that list is: The response the server sends out is not necessarily the response the user receives. They're only the same if you have a hard-coded HTML site.

I agree with not2easy that a 404 will sometimes make a robot go away faster. Some robots also respond well to a Contemplate Your Navel redirect such as to 127.0.0.1 or to their originating IP. But it's your choice how much time you want to spend teasing robots. And, in the case of a CMS, one object is to save your server work. So anything that can be put into your htaccess file, bypassing the CMS entirely, is an advantage. For example:
RewriteRule ^wp-admin/ - [R=404]

Not a typo! Little-known fact: you can put any numerical code [httpd.apache.org] after the R flag. If it's outside the 3xx range, there's an implied [L] and the target, if any, will be ignored:
Any valid HTTP response status code may be specified, using the syntax [R=305], with a 302 status code being used by default if none is specified. The status code specified need not necessarily be a redirect (3xx) status code. However, if a status code is outside the redirect range (300-399) then the substitution string is dropped entirely, and rewriting is stopped as if the L were used.

Normally you wouldn't return a 404 explicitly. But doing it this way will save your CMS a bunch of work. And even on a hard-coded site, it saves the server the work of physically looking for the file.
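
A sketch of how the complete lines might look in htaccess, assuming mod_rewrite is enabled and your Apache version accepts non-3xx codes on the R flag:

```apache
RewriteEngine On

# Each line needs all three parts: pattern, substitution ("-"), flags.
# A non-3xx R code stops rewriting as if [L] were present.
RewriteRule ^wp-admin/ - [R=404]
RewriteRule ^fckeditor - [R=404,NC]
```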

phranque

6:27 am on Apr 6, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld, colak!

which version of apache are you running?
that [R=404] flag might not work with older versions.

colak

7:18 am on Apr 6, 2015 (gmt 0)

10+ Year Member



You people are great! Thanks so much for all the input and advice.

^wp-admin/ - [R=404]
^fckeditor - [R=404]


returns a 500 for me so I am giving up on this as bots might eventually leave. The problem remains though with the searches as

RewriteCond %{QUERY_STRING} ^m\=any$- [NC]
RewriteCond %{QUERY_STRING} ^m\=any\&q=1$- [NC]
RewriteRule ^ - [F]


does not seem to be returning a 403.

lucy24

4:02 pm on Apr 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The rule as written has two separate conditions; since all conditions default to "AND", both must be met, and no query string can be exactly "m=any" and exactly "m=any&q=1" at the same time. In addition, the "m=" part is set to be the first element of the query. And, finally, there's the inexplicable - hyphen after the ending anchor, which means the pattern can never match at all. Frankly I'm surprised that didn't create a 500! For a more robust rule try
RewriteCond %{QUERY_STRING} \bm=any\b [NC]
RewriteRule ^ - [F]

Equals signs do not need to be escaped; neither do ampersands. The anchor \b is here a shortcut for the formal
(^|&) (before) and ($|&) (after)

meaning "this bit is either the very first (or last) thing in the full query string, or the beginning/end of this particular parameter". There are situations where it won't work as intended, but most of the time it's a useful shortcut.
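
For comparison, a sketch of the same rule written with the formal anchors instead of \b:

```apache
# Matches m=any only as a whole parameter: at the start of the query
# string or after "&", and followed by "&" or the end of the string.
RewriteCond %{QUERY_STRING} (^|&)m=any(&|$) [NC]
RewriteRule ^ - [F]
```

One case where the two differ: a query such as m=any-thing is caught by the \b version, because the hyphen counts as a word boundary, but not by the formal version.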

Finally, you should constrain the rule to requests for pages so the server doesn't have to waste time evaluating conditions on every request ever. This will depend on your exact URL structure, for example
RewriteRule (^|/|\.html)$ - [F]

on a conventional hand-rolled html site where URLs end in .html or / slash.
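
Put together with the condition, a sketch for that kind of site might look like the lines below. The .html-or-slash pattern is only an example; substitute whatever your page URLs look like.

```apache
# mod_rewrite tests the RewriteRule pattern first and looks at the
# conditions only if it matches, so requests for images, css and so on
# are skipped without ever evaluating the query string.
RewriteCond %{QUERY_STRING} \bm=any\b [NC]
RewriteRule (^|/|\.html)$ - [F]
```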

^fckeditor - [R=404]

I hope that wasn't the full line ;-) I didn't realize (thanks, phranque!) that the R=any-number-here formulation may not work in Apache < 2.2.