Forum Moderators: phranque
website/filename.extension
website/folder/index.php?pge=filename
RewriteEngine on
RewriteCond %{HTTP_HOST} ^website&TDL [NC]
RewriteRule ^(.*\.?)website&TDL/(.*)\.php $1website&TDL/_YT/index.php?pge=$2 [L,R=301]
Of course, I would be very happy if someone could help me write a proper rewrite rule.
1. RewriteBase:
"Sets the base URL for per-directory rewrites". What the F does that mean?
Should I set RewriteBase to the website URL? The website URL plus the new folder where the content is located? The relative path to the old content? The path to the new content? The full physical path?
I am at a loss understanding how this works.
2. Separators:
Spaces or tabs or doesn't matter? I assume it doesn't matter, but I am not sure of anything anymore.
3. mod_rewrite version
Does it make a difference what version of mod_rewrite is running on the server? (PHP runs as CGI so I can't check directly, but I can ask the host).
4. In RewriteRule, is there any way to see the URI before the pattern is applied?
"Pattern is a perl compatible regular expression, which is applied to the current URL. ``Current'' means the value of the URL when this rule is applied. This may not be the originally requested URL, which may already have matched a previous rule, and have been altered."
What is the format of this URL (assuming no rules have been matched yet)? Is it the website address + path? Only the path? The relative path of the resource? The resource name only? The full physical path? Does it change depending on what RewriteCond statements are before the RewriteRule?
5. At this point, I am considering replacing the content of every existing old page with a php redirect.
Is there any downside to that solution? (Apart from having to manually change the content of 200+ files, and presumably a slight performance hit because the entrance page has to be loaded before the redirect).
going from:
website/filename.extension
to
website/folder/index.php?pge=filename
filename.extension
filename
folder/index.php?page=filename
RewriteCond %{THE_REQUEST} {more-stuff-here}
RewriteCond %{HTTP_HOST} ^website [NC,OR]
RewriteCond %{HTTP_HOST} ^www.website [NC]
%{HTTP_HOST}
example
^(www\.)?example
RewriteCond %{REQUEST_URI} !="^/.*?/.*"
RewriteRule ^([^/.]+)\.php /folder/index.php?pge=$1 [L]
RewriteCond %{THE_REQUEST} index
RewriteRule ^(([^/.]+/)*)index\.any-extensions-you-use http://www.example.com/$1 [R=301,L]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} index
RewriteCond %{QUERY_STRING} pge=([^&]+)
RewriteRule ^folder/index\.php http://www.example.com/%1 [R=301,L]
RewriteRule ^([a-z0-9-]+)\.extension$ /folder/index.php?pge=$1 [L]
Have you got other domains passing through the same htaccess?
...confirm that there's no reason for that = sign and the quotation marks
www.example.com/folder/index.php?pge=index
if ($_GET['pge'] == "index")
{
$page = "pg_home";
}
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Have you already started doing this? If you are redirecting, you want an R=301 permanent redirect; R=302 (or plain R) means "for now you'll find the page over here but it really lives at the old URL, so keep indexing that one".
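To make the flag difference concrete, here is a minimal sketch. The paths old-page.php and new-page are hypothetical placeholders, not URLs from this thread.

```apache
RewriteEngine On

# Permanent redirect: clients and search engines are told the URL has
# moved for good, so the new address is the one to index.
RewriteRule ^old-page\.php$ http://www.example.com/new-page [R=301,L]

# Plain R defaults to 302 (temporary): the old URL stays the indexed one.
# Shown commented out for comparison only.
# RewriteRule ^old-page\.php$ http://www.example.com/new-page [R,L]
```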
What you really want-- assuming I have successfully read your mind w/r/t depth of URLs involved-- is
RewriteRule ^([^/.]+)\.php /folder/index.php?pge=$1 [L]
:: detour to text editor to paste in this line and squint at it in a bigger font ::
I can't understand why you would want to redirect friendly URLs to parameter-based URLs. Usually it's the other way round.
What you need is an internal rewrite. As long as "folder" name is always the same and your friendly URLs are all lower-case and don't have folders, this will work...
RewriteRule ^([a-z0-9-]+)\.extension$ /folder/index.php?pge=$1 [L]
I read in the Apache wiki that it's a good practice: We should really encourage people to use the lexicographically equal operator instead of a RegEx if they want to check, if test string is lexicographically equal to cond pattern.
E.g. using
RewriteCond %{HTTP_HOST} !=""
I don't think it really makes a difference in my case, but I figured it couldn't hurt.
If the request is not exactly equal to the literal string
^/.*?/.*
That'll save you a trip or two to the editor.
I took care of that case inside index.php:
What happens to the base path inside the php files? Is this an easy fix in .htaccess or do I need to fix the resource paths in the php scripts?
It makes a hell of a difference, because "lexicographically equal" doesn't mean "it fits this Regular Expression".
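A small sketch of the two kinds of CondPattern side by side may help. The regex in the second condition is only a guess at what the original test was meant to express ("the request is not inside a subfolder"); treat it as an assumption.

```apache
RewriteEngine On

# "=" turns the CondPattern into a plain string comparison. This asks:
# is the Host header anything other than the empty string?
RewriteCond %{HTTP_HOST} !=""

# Without "=", the CondPattern is a regular expression. This asks:
# does the request path NOT contain a second slash?
RewriteCond %{REQUEST_URI} !^/[^/]*/

RewriteRule ^([^/.]+)\.php$ /folder/index.php?pge=$1 [L]
```

Written as !="^/.*?/.*" the second test would instead compare the path, character for character, against that literal text, which no real request will ever equal.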
Ahem. I am not on w###, and I should hope that all browsers on the planet allow you to resize text. But they don't let me put it into a variety of fonts so I can stare at it upside-down, backward and sideways ;)
Except that the request will never reach the php if it keeps getting rewritten or redirected forever. You need something in the htaccess that says "If the request is already for index.php, you do not need to redirect to index.php".
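One way to spell that exit in htaccess, as a sketch: a rule with "-" as its target leaves the URL untouched, and [L] stops processing for that pass. The guard rule is an illustration, not code from this thread.

```apache
RewriteEngine On

# Guard: a request that is already for /folder/index.php passes
# through unchanged ("-" means "no substitution").
RewriteRule ^folder/index\.php$ - [L]

# Everything else in the root that ends in .php is rewritten.
RewriteRule ^([^/.]+)\.php$ /folder/index.php?pge=$1 [L]
```

With a pattern like ^([^/.]+)\.php$ the rewritten path contains a slash and cannot match a second time, so the guard is only a safety net. It becomes essential with a looser pattern such as (.*).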
Oh, oops, we need to figure out where the mistake is happening. Luckily this will take approximately three seconds. Pull up your error logs and see what files are being requested. If the requested path-plus-filename is correct, the problem is in htaccess. If it is incorrect, the problem is in php.
<base href="http://www.example.com/folder/"> in the head of the template files.
The second time, folder/index.php does not match the pattern, so there is no looping.
Unless I am missing something here?
Inside the php files, the paths are relative:
$_SERVER['DOCUMENT_ROOT'] . "/blahblah"
Images, CSS and JS are accessed from the web using URLs.
Rather than using <base>, alter the links in each href to begin with a leading slash and mention the full path to the file. Don't use relative linking.
Let's see the final htaccess code as I'm sure you have several bits you don't need and maybe some stuff you need, but don't know about, is missing.
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} example.com [NC]
RewriteRule ^([^/]+)\.php /folder/index.php?pge=$1 [L]
Yes, and it's important. The body of the RewriteRule looks ONLY at the "path".
:: shuffling papers ::
Thought so. See my first post in this thread, on your question #4. Adding a query string has no effect on the path. So the URL will continue matching until the cows come home.
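To illustrate which part of a request each directive sees, here is a sketch for a root .htaccess. The request URL, about.php and the lang parameter are made-up examples.

```apache
# Request: http://www.example.com/about.php?lang=fr
#
# In the root .htaccess the RewriteRule pattern is tested against
# "about.php" only: no scheme, no hostname, no leading slash and
# no query string.

RewriteEngine On

# The hostname is available in HTTP_HOST, the query string in QUERY_STRING.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteCond %{QUERY_STRING} (^|&)lang=fr(&|$)
RewriteRule ^about\.php$ /folder/index.php?pge=about [L]
```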
Site-absolute links are safest. I'm not in a position to advise on this, seeing as how I only speak four words of php
and a recent detour prompted by a different thread has just revealed that I made a vast booboo when I created a scad of php pages the weekend before last
but so far
$_SERVER['DOCUMENT_ROOT'] . "/blahblah"
has always behaved as expected.
// Paths
$liveServer = false;
if ($liveServer)
{
define ('ROOT_PATH', "/folder/");
define ('IMAGE_PATH', "/folder/images");
}
else
{
define ('ROOT_PATH', "");
define ('IMAGE_PATH', "/images");
}
Oh yes and: The [L] flag doesn't mean "Stop here and proceed directly to the page". It only means "You're done with mod_rewrite for now, so go back to the first mod-- which may happen to be mod_rewrite, but it doesn't matter-- start from the beginning with the newly rewritten URL, and continue until everything rinses clean".
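A sketch of what that means in practice, using a deliberately greedy catch-all pattern that is not from this thread. The [END] variant assumes Apache 2.4 or later, so check with the host before relying on it.

```apache
RewriteEngine On

# With [L] the rewritten URL goes through the rules again, so a
# catch-all needs its own exit: skip requests already at the target.
RewriteCond %{REQUEST_URI} !^/folder/index\.php$
RewriteRule ^(.*)$ /folder/index.php?pge=$1 [L]

# Assumption: Apache 2.4+. [END] stops rewriting for this request,
# including the extra pass, so no guard condition is needed.
# RewriteRule ^(.*)$ /folder/index.php?pge=$1 [END]
```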
RewriteEngine on
RewriteCond %{HTTP_HOST} example.com [NC]
RewriteRule ^([^/]+)\.php /folder/index.php?pge=$1 [L]
^([a-z0-9-]+)$ or similar (or ^(([a-z0-9-]+/)*[a-z0-9-]+)$ if there are folders).
RewriteCond %{HTTP_HOST} !^(example\.com)?$
RewriteRule (.*) http://example.com/$1 [R=301,L]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
I'm not sure why you test HTTP_HOST. There's no need.
^([^/]+)\.php is way below optimum. It means "keep on parsing 'not a slash' until you come to a slash or the end of the path, then check that the next character is a literal period". That first attempt will always fail, because [^/]+ has already swallowed the period and everything after it, and that will result in many "back off and retry" trial match operations being performed to find out what you really meant.
You want ^([^.]+)\.php which means, "keep on parsing 'not a period' until you come to a period, then check it is followed by 'php'".
If the "friendly" URLs can be folder-based, you'll need ^(([^/]+/)*[^.]+)\.php instead.
You should also position a non-www to www or a www-to non-www redirect ahead of all this code.
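The ordering could look like the sketch below, which reuses the www redirect and the root-only rewrite pattern already posted in this thread. It assumes www.example.com is the hostname you want to keep.

```apache
RewriteEngine On

# 1. External redirect first: any other hostname is sent to www.
#    The trailing "?" lets requests with an empty Host header through.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# 2. Internal rewrite last: map the friendly URL to the script.
RewriteRule ^([^/.]+)\.php$ /folder/index.php?pge=$1 [L]
```

If the rewrite came first, a request on the wrong hostname would be rewritten internally and then redirected, exposing the /folder/index.php?pge= URL to the visitor.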
RewriteCond %{HTTP_HOST} example\.com
RewriteCond %{HTTP_HOST} !^(example\.com)?$
RewriteRule (.*) http://example.com/$1 [R=301,L]
RewriteCond %{HTTP_HOST} example\.com
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
^(([^/.]+\.)+)php$
As you have extra hostnames resolving to folders, the non-www/www code I supplied will need an extra Condition:
Noted, your "odd" requirement for root URLs with multiple periods:
^(([^/.]+\.)+)php$
may be what you are looking for, as long as you're aware that the passed parameter will end in a period, which your PHP script will need to strip.
Of course, life is even easier when you use extensionless URLs. :)
This one has an error in it: the period is part of the capture group so it would give "index.php?pge=name."
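One possible way around the trailing period, offered as a sketch rather than as the fix anyone in this thread settled on: keep the final period outside the outer capture group.

```apache
RewriteEngine On

# $1 is the outer group. It holds one or more dot-separated parts,
# while the last period before "php" sits outside it.
# Example: some.page.name.php  ->  pge=some.page.name
RewriteRule ^(([^/.]+\.)*[^/.]+)\.php$ /folder/index.php?pge=$1 [L]
```

Because neither character class allows a slash, the rewritten path folder/index.php cannot match the pattern again.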
I am not quite sure why you want to change
What are -f and -d tests?
RewriteCond %{REQUEST_URI} !-f
RewriteCond %{REQUEST_URI} !-d
RewriteCond {there's a third piece which I've forgotten}
RewriteRule (.*) /index.php?$1 [L]
Second time around, the path becomes:
folder/index.php
One problem I noticed is that apparently the query seems to get lost during the redirect (I can access the $_GET vars on the testing server but not on the live server).
?pge=$1
Lucy24:
Ignorance is bliss. Stay that way ;) The conventional CMS htaccess is built around a package that goes something like
RewriteCond %{REQUEST_URI} !-f
RewriteCond %{REQUEST_URI} !-d
RewriteCond {there's a third piece which I've forgotten}
RewriteRule (.*) /index.php?$1 [L]
This is supposed to mean "all requests for pages get quietly rewritten to a php page that deals with everything". But since the Rule isn't constrained to requests ending in / or .php this test has to run on absolutely all requests all the time. And there are very few situations where a request for, say, an image file is answered by php. (Few but not zero. For example it's how the <noscript> version of piwik works: <img src blahblah piwik.php et cetera>)
Lucy24:
Uh-oh. Do you mean that the request already has a query string before it meets the rule whose target includes
?pge=$1
If so, you need to add the flag [QSA] for "query string append". ...
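As a sketch of what the flag does, with a made-up lang parameter standing in for whatever the original request carries:

```apache
RewriteEngine On

# Without QSA the original query string is replaced by "pge=...".
# With QSA it is appended after the new one:
#   /about.php?lang=fr  ->  /folder/index.php?pge=about&lang=fr
RewriteRule ^([^/.]+)\.php$ /folder/index.php?pge=$1 [QSA,L]
```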
g1smd:
The -f and -d tests check whether the URL request resolves to a physical file or to a physical folder. These are very slow server filesystem read operations that should be avoided.
Lucy mentioned how they are used, the idea being that because a request for robots.txt resolves to a real file, the rewrite will not occur, but a request for any other file that does not test true for -f or for -d will be rewritten to be handled by the index.php file.
This is a horrible method that should be avoided. The fact that several popular CMS packages use it is not an endorsement. They have gone for "easy code" with a huge performance hit.
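For comparison, here is a sketch of both approaches. The first block is the form commonly seen in CMS packages; note that it usually tests REQUEST_FILENAME, since -f and -d need a filesystem path rather than a URL path. The second is an illustrative alternative that reuses the pattern suggested earlier in this thread.

```apache
RewriteEngine On

# Commonly seen catch-all: two filesystem lookups on every request
# (images, CSS and JS included) before the rule can fire.
# Shown commented out so the two approaches do not run together.
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteCond %{REQUEST_FILENAME} !-d
# RewriteRule . /index.php [L]

# Pattern-only alternative: the regex alone decides what is rewritten,
# so nothing is looked up on disk. Assumes lower-case names in the root.
RewriteRule ^([a-z0-9-]+)\.php$ /folder/index.php?pge=$1 [L]
```

The trade-off is that the pattern has to describe your friendly URLs precisely, whereas the catch-all needs no knowledge of them.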
phranque:
the CondPattern of the RewriteCond directive can perform various file attribute tests:
[httpd.apache.org...]
Thanks. I am in the process of reading the documentation but it's not an easy read. However, I think I am starting to figure it out thanks to the outstanding explanations I have received so far.