Forum Moderators: phranque

redirect non-existent file requests htaccess

don't want to 404 them, want them redirected

         

idiotgirl

12:06 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



Server wide, I get hundreds and hundreds of requests (exploit attempts) for non-existent Wordpress files every day: wp-admin, wp-login, wp-content, wp-signup, wlwmanifest, etc. The sites the requests are being made toward are NOT Wordpress sites, but those requests are often followed by requests for bitcoin, wallet, etc. - total exploit attempts. I don't want to serve a 404 (since the requested files and directories don't even exist), and I don't want them around anyway, I want to redirect them elsewhere via htaccess. What's the most elegant way to do that?

RewriteCond (REQUEST URI anything with wp-admin or trigger words in it)
RewriteRule (redirect to http://example.com)

Has anyone else done that?



[edited by: not2easy at 1:10 am (utc) on Dec 23, 2019]
[edit reason] readability [/edit]

not2easy

1:14 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



A 404 is the right response for a file that doesn't exist. Personally I've found it easier to see where they come from and block those scripted bots, serving a 403 instead.

idiotgirl

2:48 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have been 403'ing them but now I want to target certain Wordpress terms and redirect them instead. My logfiles are full of them, like every day. I know I can do it with htaccess but I'm not sure the way to do it.

not2easy

3:11 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Redirecting to another domain that you don't own could earn you some blowback from them. If they aren't the source of your woes, why should you inflict your unwanted traffic on their domain? If your logs are full of them it should be fairly routine to determine where they are hosted and block their traffic. You will still see a 403 response, but bandwidth isn't a concern unless you create an exceptionally elaborate 403 error page. Good neighbor websites don't send unwanted traffic to others; it is not a good way to handle unwanted visitors.

Have you thought of using a service like Cloudflare to filter out such traffic? It's free and speeds up your site.

idiotgirl

4:03 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



It's a domain I own, but I want them all redirected there, especially with those particular terms. What's the best way to do this with .htaccess?

lucy24

4:22 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What's the best way to do this with .htaccess?
Stock response: What have you tried so far?

The worst way to do it is with the -f test -- in fact that's one thing that makes any CMS so server-intensive. Instead, look at patterns that you know for a fact you don't have, like
^wp
or
^xmlrpc

Still and all, I honestly don’t understand why you would want to redirect to another of your own sites, when this simply means you’re turning one request into two--assuming they follow the redirect, which most robots do.

You may not realize that you can serve a 404 manually (in mod_rewrite it would be the flag [R=404]), thereby saving your server the work of looking for the file. The requester doesn’t know you’ve done this; they receive the identical 404 either way.

The wonderful thing about a 404 response is that it sends absolutely no information to the requester. When you return a 403, you’re saying “Mwa ha ha, I’m onto you” but a 404 just says “No such file”. And that was all she wrote.
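
For illustration, a minimal sketch of the manual 404 described above, assuming the rule sits in a per-directory .htaccess file and mod_rewrite is enabled. The two prefixes are the ones mentioned in this post; swap in whatever names you know for a fact you don't use.

```apache
RewriteEngine On

# Answer any request whose path begins with "wp" or "xmlrpc" with a 404.
# The "-" means no substitution; with a non-3xx status in the R flag,
# mod_rewrite stops rewriting and returns that status directly.
RewriteRule ^(wp|xmlrpc) - [R=404,L]
```

Note that ^wp is deliberately broad: it would also catch a legitimate path that happens to start with those two letters.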

phranque

4:22 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I don't want to serve a 404 (since the requested files and directories don't even exist),

a 404 is the proper response for "requested files and directories (that) don't even exist"

and I don't want them around anyway,

in that case a 403 is the proper response.

I want to redirect them elsewhere via htaccess. What's the most elegant way to do that?

RewriteCond (REQUEST URI anything with wp-admin or trigger words in it)
RewriteRule (redirect to http://example.com)

you don't need a RewriteCond for this - you can do this all in the RewriteRule directive.

idiotgirl

4:32 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



So if the file doesn't exist and I don't want the regular 404 page to display as it would to a real visitor, which is the best way to do this with nonexistent files matching that request pattern?

lucy24

7:02 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



which is the best way to do this
Again: what have you tried so far?

It doesn't matter whether the files exist or not, because the server is never going to look for them. All that matters is the request pattern. In your first post you listed a string of them: wp-login, wp-signup, wp-admin and so on. You can see the unifying feature there.

I don't want the regular 404 page to display as it would to a real visitor
OK, that's a legitimate concern if your 404 is designed to provide lots of good information and links for humans. Not that most robots will be distracted by links in the 404 page; typically they come in with a shopping list and will not be turned aside.

The only way to show robots entirely different content is to serve an entirely different response, such as 418 (“teapot error”) if it isn’t already in use on your server for something else. In fact, you could confuse the heck out of robots by serving a response code that they’re not familiar with; most of the 400 series isn’t in frequent use. The horse's mouth [w3.org] currently doesn't go above 417... and even some of the existing 400-class errors could be used. (“Request too long? But I only asked for /wp-admin/ !”)

idiotgirl

7:56 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



So far they've been 403'd. Like I said, I'd rather just send them all to a black hole on another domain I have. I don't even want to serve them a real 404. I thought if I could do a rewrite via .htaccess using specific terms I could accomplish that. I found a sort of similar example online, but I'd like to consolidate via regex - I wondered if anyone else has done something similar, and how they accomplished it.

RewriteEngine on
RewriteCond %{REQUEST_URI} !^/file\.txt [NC]
RewriteCond %{REQUEST_URI} file\.txt [NC]
RewriteRule .* [wwwexample.com...] [R=301,L]

I was pretty active on this group many years ago, and I remember some pretty sharp people on the group regarding htaccess. I thought perhaps they might suggest a workable approach for those specific Wordpress calls.

phranque

8:14 am on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I was pretty active on this group many years ago, and I remember some pretty sharp people on the group regarding htaccess. I thought perhaps they might suggest a workable approach for those specific Wordpress calls.

i'm pretty sure i suggested you could do it all in a single RewriteRule and lucy24 hinted at the approach for specifying the pattern.

from the Apache Web Server forum Charter [webmasterworld.com]:
  • It is not appropriate to expect other members to write your code for you or to debug your entire project; Please don't expect other members to solve a problem you don't want to begin solving yourself.

  • Don't get upset if someone has the answer but wants to provide you with resources and material to help you solve it on your own. After all, the most educational threads are those where members learn how to help themselves. Such threads also prove to be of most value the next time someone has a similar question.

even when you were still active this was the philosophy in this forum.

idiotgirl

8:27 am on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks for the suggestions. I have not done enough rewrite rules to know the best way to regex this to a single call instead of multiple calls, or if it required multiple calls (one for each). But I suppose based on the helpful suggestions, I can hack away at this and find out by trial and error without inconveniencing anyone or giving the impression I was assuming someone would say "here, been there, done that before, it's done like so..." I believed maybe someone else had done something similar, and would know the best way to go about it, or say - no - you have to write your rules one at a time, or whatever. Apparently no one else has done that, or it's not a sound idea. Thanks for the advice.

not2easy

12:02 pm on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Another option would be to take parts of the suggestions, search past discussions and see examples of "how to do it". That can often help you out. Understanding what you use on your site is much more helpful than pasting in what works for someone else. That approach may or may not work, which is why this is interactive.

No5needinput

2:46 pm on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



Not sure if it's any help at all but this is how I do it - on a NON Wordpress site:

<FilesMatch "^(wp-login.php|admin.php|wp-admin/|login.php|wp-content/plugins/|wp-includes/|wlwmanifest.xml|wp-config.php|varien/js.js|xmlrpc.php|license.txt)">
Order allow,deny
Deny from all
Satisfy All
</FilesMatch>

phranque

3:13 pm on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Deny from all

This solution will provide a 403 response rather than the 301 response preferred by idiotgirl.

idiotgirl

5:20 pm on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



I've done the <FilesMatch> method for a long time, but that as phranque said is 403 and I wanted to 301.

I wrote this last night and tested it but it redirected me whether I used a forbidden term or not, so I clearly have one or more glitches.
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/(wp-admin|wp-content|wp-login|wp-signup)/.* [NC,OR]
RewriteCond %{REQUEST_URI} !^/(install\.php|setup\.php|wlwmanifest\.xml).* [NC,OR]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]



[edited by: not2easy at 6:47 pm (utc) on Dec 23, 2019]
[edit reason] readability [/edit]

not2easy

7:17 pm on Dec 23, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Apologies for the edit, we use example.com for the domain, but variations will link and be unclear to others.

Your code uses !^ at the beginning but that means "does not begin with" and if you try the same thing without that and without the / you might have better success. You do not need to spell out each request listed on that line because they all have wp- as part of the filename. Those could be consolidated to a single wp- pattern that catches them all. The code after the ) as in )/.* isn't doing anything that you want. Try a plain ) at the end. I'm not sure why the second line uses file names with extensions and the first line uses filenames without extensions. If you create a list of the filenames that you want to trigger the rewrite, it becomes easier to sort out the simplest form for the rule.

Search for code that is similar and you will see the differences. A search term like RewriteCond %{REQUEST_URI} would show you examples.

lucy24

7:20 pm on Dec 23, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{REQUEST_URI} file\.txt [NC]
Never put something in a RewriteCond that can go in the body of a RewriteRule. mod_rewrite operates on a “two steps forward, one step back” system: FIRST it looks at the Rule itself and sees whether it applies to the request. If and only if the rule--including flags such as [NS]--seems to apply, then and only then does it look at Conditions.

Deny from all
This solution will provide a 403 response rather than the 301 response preferred.
In future years, when mod_access_compat is no longer a part of Apache installs, it will probably provide a 500 response.

RewriteCond %{REQUEST_URI} !^/(wp-admin|wp-content|wp-login|wp-signup)/.*
Is it possible you have misunderstood the significance of the leading ! (exclamation mark)? Also see above about Rule vs. Condition.

Why is it constrained to those four specific wp- terms? Do you have legitimate content beginning in wp- that you do want visitors to be able to see?

Edit: Although it is possible to use <Files> for files that do not physically exist--I just tried it on my test site to make sure--it is not the ideal way to do it. In 2.4 you should probably do something involving an <If> condition instead. Save the <Files> for real, physical files, like error documents or robots.txt, that require specialized access rules.
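
A hedged sketch of what the <If> approach could look like on Apache 2.4, assuming the goal is still a 403 for a handful of names that don't exist on the site. The file names are examples taken from this thread, not a complete list.

```apache
# Apache 2.4 expression syntax: deny requests whose URL path starts
# with one of these names, whether or not such a file exists.
<If "%{REQUEST_URI} =~ m#^/(wp-login\.php|wp-config\.php|xmlrpc\.php)#">
    Require all denied
</If>
```

Require all denied is the 2.4 form of the older Order/Deny pair, so this version does not depend on the compatibility module.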

idiotgirl

7:29 pm on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



The first condition is folder names, the second condition is file names. I was explicit about the full wp-names because other folders may begin/contain with wp- for one reason or another, but it could probably be rewritten as

RewriteCond %{REQUEST_URI} ^wp-(admin|content|login|signup) [NC,OR]

I thought it required a forward slash before that as

^/wp-(admin|content|login|signup)

but I'll test it. Thanks.

lucy24

10:14 pm on Dec 23, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, you need a slash in a %{REQUEST_URI} pattern if you're using the opening anchor, as you certainly should. What you do NOT need, because it will break the rule as you've already discovered, is a leading ! exclamation mark. See phranque's post, above.

The [NC] flag is not needed, because robots never do request WP-ADMIN so it just creates extra work for the server. (It has to flatten the case of both the rule/condition and the request before comparing them. This is not as intuitively obvious for a computer as it is for a human.)

other folders may begin/contain with wp-
“may contain” doesn’t matter, because you are using an opening anchor. Unless you de facto have filepaths that begin with wp- and then go on into other stuff (but why, for pity's sake? You said it isn't a WP site, so it is in your power to give appropriate directory names), there’s no need to spell out what-if-anything comes after the /wp- element.

To reiterate:
Never put something in a RewriteCond that can go in the body of a RewriteRule. Omit the leading / (but not the opening anchor!) unless the rule is lying loose in the config file, such as in a VirtualHost envelope. You did say htaccess, right?
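
To make the slash rule concrete, here is a sketch of the same pattern written for both locations. The target http://example.com/ is only a placeholder, and the second rule is commented out because only one of the two would be used in practice.

```apache
# In .htaccess (per-directory context): the directory prefix, including
# the leading slash, is stripped before the pattern is tested.
RewriteRule ^wp-(admin|content|login|signup) http://example.com/ [R=301,L]

# In the server config or a <VirtualHost> section: the pattern is tested
# against the full URL path, so the leading slash is required.
# RewriteRule ^/wp-(admin|content|login|signup) http://example.com/ [R=301,L]
```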

idiotgirl

10:30 pm on Dec 23, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



A straight rewrite rule like so:

RewriteRule ^/wp-(admin|content|login|signup) http://example.com

Makes no difference if the file being redirected *doesn't exist*, makes no difference if redirecting *to another domain* - correct? A client may conceivably have their own folder with wp-something (a person's name - an article - whatever), even without WP installed, so I don't want to accidentally redirect. I need to be somewhat explicit instead of greedy matching.

lucy24

12:23 am on Dec 24, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteRule ^/
Do not use a leading / slash in a RewriteRule in .htaccess. The pattern will never match, and the rule won't execute.

Yes, the existence or non-existence of the file makes no difference, because mod_rewrite is looking purely at the request. And the target (the stuff on the right) makes no difference to how the pattern (the stuff on the left) is evaluated.

A client may conceivably have their own folder with wp-something (a person's name - an article - whatever), even without WP installed, so I don't want to accidentally redirect.
And this will be happening at the root of your own domain? Now I'm confused: Where is this RewriteRule to be located? In htaccess (which lives in some specific directory), or loose in the server's overall config file? Thanks to its wonky inheritance rules, mod_rewrite isn't usually the best bet for server-wide access controls.

idiotgirl

12:30 am on Dec 24, 2019 (gmt 0)

10+ Year Member Top Contributors Of The Month



Correct, this is in the root public_html folder for each domain, not server-wide.
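
Pulling the thread's suggestions together, a minimal sketch for the .htaccess in each domain's public_html, assuming mod_rewrite is enabled and with http://example.com/ standing in for the "black hole" domain. The list of names comes from the posts above and is not meant to be exhaustive.

```apache
RewriteEngine On

# One rule: no RewriteCond, no leading slash, no negation, no [NC].
# Only the explicit wp- names are listed, so a client folder with some
# other wp- name is left alone.
RewriteRule ^(wp-(admin|content|login|signup)|(install|setup)\.php|wlwmanifest\.xml) http://example.com/ [R=301,L]
```

Because the pattern is anchored only at the start, wp-login.php and wp-admin/anything both match. Requests for the same names inside a subfolder do not; dropping the ^ would catch those too, at the cost of broader matching.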