When people visit my site, for example www.example.com/abc.foo, and this URL does not exist, I want visitors to be sent automatically to www.example.com/archive/abc.foo instead of getting a 404 page-not-found error. If that does not exist either, then a 404 error should be returned.
Some features I'd like to see are:
1) The rewrite to /archive/abc.foo is transparent to visitors: they never see /archive/abc.foo; the URL stays www.example.com/abc.foo
2) abc.foo is a file here, but the rule is not limited to files; it can also be a directory. For example, www.example.com/directory, www.example.com/directory/subdirectory/file.foo, www.example.com/directory/subdirectory, etc.
3) The rewrite applies only to requests in the root directory. No rewrite inside the "archive" folder
4) No infinite rewrite (i.e., if /abc.html does not exist, go to /archive/abc.html; if /archive/abc.html does not exist either, do NOT go to /archive/archive/abc.html)
Here is my .htaccess
# Apache/PHP/Drupal settings:
#
# Protect files and directories from prying eyes.
<FilesMatch "\.(engine|inc|info|install|module|profile|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)$|^(code-style\.pl|Entries.*|Repository|Root|Tag|Template)$">
Order allow,deny
</FilesMatch>
# Don't show directory listings for URLs which map to a directory.
Options -Indexes
# Follow symbolic links in this directory.
Options All
# Customized error messages.
ErrorDocument 404 /index.php
# Set the default handler.
DirectoryIndex index.php index.html index.htm
# Override PHP settings. More in sites/default/settings.php
# but the following cannot be changed at runtime.
# PHP 5, Apache 1 and 2.
<IfModule mod_php5.c>
php_value magic_quotes_gpc 0
php_value register_globals 0
php_value session.auto_start 0
php_value mbstring.http_input pass
php_value mbstring.http_output pass
php_value mbstring.encoding_translation 0
</IfModule>
# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
# Enable expirations.
ExpiresActive On
# Cache all files for 2 weeks after access (A).
ExpiresDefault A1209600
# Do not cache dynamically generated pages.
ExpiresByType text/html A1
</IfModule>
# Various rewrite rules.
<IfModule mod_rewrite.c>
RewriteEngine on
# If your site can be accessed both with and without the 'www.' prefix, you
# can use one of the following settings to redirect users to your preferred
# URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
#
# To redirect all users to access the site WITH the 'www.' prefix,
# (http://example.com/... will be redirected to http://www.example.com/...)
# adapt and uncomment the following:
# RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
#
# To redirect all users to access the site WITHOUT the 'www.' prefix,
# (http://www.example.com/... will be redirected to http://example.com/...)
# uncomment and adapt the following:
# RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
# Modify the RewriteBase if you are using Drupal in a subdirectory or in a
# VirtualDocumentRoot and the rewrite rules are not working properly.
# For example if your site is at http://example.com/drupal uncomment and
# modify the following line:
# RewriteBase /drupal
#
# If your site is running in a VirtualDocumentRoot at http://example.com/,
# uncomment the following line:
# RewriteBase /
# Rewrite current-style URLs of the form 'index.php?q=x'.
RewriteCond %{DOCUMENT_ROOT}/$1 !-f
RewriteCond %{DOCUMENT_ROOT}/$1 !-d
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule ^(.*) /archive/$1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
# $Id: .htaccess,v 1.81.2.4 2008/01/22 09:01:39 drumm Exp $
The problem is this:
If I visit /abc.php, and abc.php is physically located in /archive (there is NO /abc.php; it is just a URL generated by Drupal's URL path feature), then it reports a 404 error.
The same happens if I visit /directory and there is a physical /archive/directory folder.
If I visit /abc or /abc.php, and there is no /archive/abc or /archive/abc.php, then it works.
Here is the demo site:
<snip>
For example:
Visiting /mission.php gives 404 (as /archive/mission.php exists)
But /links.php is okay
Visiting /members gives 404 (as /archive/members exists)
But /projects is okay
If I change the rewrite rule order so that it is
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1
RewriteCond %{DOCUMENT_ROOT}/$1 !-f
RewriteCond %{DOCUMENT_ROOT}/$1 !-d
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule ^(.*) /archive/$1 [L,QSA]
Then /members/ and URLs like it work; however, a visit to /aborig_new/ gives a 404 error, while it is supposed to be rewritten to /archive/aborig_new/
How can I fix this?
Many thanks,
Reversing the rules won't work, simply because the first rule will then match *all* of the URLs that the second rule might have matched -- The second rule is more specific, in that it requires the requested URL-path to exist in the /archive directory, whereas the first rule imposes no such requirement.
You may also wish to review the mod_rewrite documentation, specifically that for the RewriteCond directive used with the -f and -d tests. Doing so will allow you to better understand the code you're using -- a good thing, since we may be able to *help* you here, but it's ultimately going to be up to you to fix this problem.
Jim
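For reference, a minimal annotated sketch of the tests mentioned above, reusing the archive rule from the first post. The [L] flag is an addition for illustration and is not part of the original rule.

```apache
# -f is true when the test string is the path of an existing regular file.
# -d is true when the test string is the path of an existing directory.
# A leading "!" negates either test.
#
# $1 is the first capture group of the RewriteRule pattern below. The rule
# pattern is matched first, and only then are its RewriteCond lines evaluated,
# which is why the conditions can refer to $1.
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule ^(.*)$ /archive/$1 [L]
```

In .htaccess context the rewritten URL is passed through the rules again, so a request rewritten to /archive/abc.foo is tested a second time with $1 set to archive/abc.foo.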
The log message is pretty simple:
154.xx.53.14 - - [27/Apr/2008:18:01:10 +0000] "GET /mission.php HTTP/1.1" 404 3661
154.xx.53.14 - - [27/Apr/2008:18:00:30 +0000] "GET /events/workshops HTTP/1.1" 404 3964
... ...
I purposely visited a URL that reports a 404 error. The request is logged in the access log, not the error log.
Here are two examples:
154.xx.53.14 - - [27/Apr/2008:21:37:40 +0000] "GET /members/ HTTP/1.1" 404 3661
154.xx.53.14 - - [27/Apr/2008:21:41:11 +0000] "GET /themes/fisheries/img/menu_projects_ovr.gif HTTP/1.1" 200 1037
154.xx.53.14 - - [27/Apr/2008:21:41:12 +0000] "GET /projects/ HTTP/1.1" 200 12399
The log setting for this website is:
ErrorLog "|/usr/sbin/rotatelogs /srv/log/fisheries/error_log.%Y-%m-%d 86400"
CustomLog "|/usr/sbin/rotatelogs /srv/log/fisheries/access_log.%Y-%m-%d 86400" common
I have no idea why the 404 is logged in the access log. Isn't it supposed to go to the error log?
Thanks,
M.
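A likely explanation, offered as an assumption rather than a diagnosis of this server: the access log records every request together with its final status code, whatever that status is, while the error log only gets an entry when Apache itself reports a problem. The "common" format named in the CustomLog line is conventionally defined like this in the stock httpd.conf:

```apache
# %>s is the final HTTP status sent to the client (200, 404, ...)
# %b  is the size of the response body in bytes
LogFormat "%h %l %u %t \"%r\" %>s %b" common
```

If the 404 status is produced by a script rather than by Apache failing to find a file, a "File does not exist" line would not be expected in the error log.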
[Sat Apr 19 21:10:05 2008] [error] [client 72.xx.252.136] File does not exist: /u/web/fisheries/<some-rewritten-file-path-which-is-probably-incorrect>.
If you're not getting 404 errors logged, then contact your host and ask them to fix it; Chasing a problem with RewriteCond -f/-d is going to be difficult if we cannot see the filepath that the server is trying to access.
Jim
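Where the server configuration can be edited, mod_rewrite's own log is another way to see the file path each condition tests. A minimal sketch for the Apache 2.0/2.2 series; the log file name is hypothetical, modeled on the log directory quoted in this thread.

```apache
# Allowed in the main server config or inside <VirtualHost> only,
# not in .htaccess.
# The file name below is an assumption for illustration.
RewriteLog "/srv/log/fisheries/rewrite_log"
# 0 disables logging, 9 logs almost everything; 3 is usually enough to see
# each RewriteCond test string and its result.
RewriteLogLevel 3

# Apache 2.4 and later removed these two directives in favor of
# per-module log levels written to the error log:
# LogLevel warn rewrite:trace3
```

Rewrite logging slows the server noticeably at higher levels, so it is best switched off again once the problem is found.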
Thanks for the reply.
I have no idea why the error log did not log it. It is a dedicated server and I have root access to it.
I double-checked httpd.conf; it seems okay:
... ...
<VirtualHost *:80>
DocumentRoot "/srv/www/fisheries"
ServerName fisheries.example.com
ErrorLog "|/usr/sbin/rotatelogs /srv/log/fisheries/error_log.%Y-%m-%d 86400"
CustomLog "|/usr/sbin/rotatelogs /srv/log/fisheries/access_log.%Y-%m-%d 86400" common
<Directory "/srv/www/fisheries">
AllowOverride All
allow from all
Options +Indexes
</Directory>
</VirtualHost>
... ...
I do not have today's error log, but I do have yesterday's. It basically contains only errors caused by my own mistakes in .htaccess (spelling errors).
Thanks,
Ming
// Menu status constants are integers; page content is a string.
if (is_int($return)) {
  switch ($return) {
    case MENU_NOT_FOUND:
      drupal_not_found();
      //echo "You are going to visit an archived page:";
      //echo "http://fisheries.example.com/archive";
      //echo $_SERVER['REQUEST_URI'];
      //$url=" [fishers.example.com...]
      //header ("Location: $url");
      break;
    case MENU_ACCESS_DENIED:
      drupal_access_denied();
      break;
    case MENU_SITE_OFFLINE:
      drupal_site_offline();
      break;
  }
}
elseif (isset($return)) {
  // Print any value (including an empty string) except NULL or undefined:
  print theme('page', $return);
}
It seems that Drupal uses index.php as a handler even when there is a 404 error. In other words, Drupal treats a 404 visit as a valid visit instead of an error.
Here is how I would re-write your code, to save up to two unnecessary 'exists' checks:
# If requested resource exists as a file or directory, skip next two rules
RewriteCond %{DOCUMENT_ROOT}/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/$1 -d
RewriteRule (.*) - [S=2]
#
# Requested resource does not exist, do rewrite if it exists in /archive
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule (.*) /archive/$1 [L]
#
# Else rewrite requests for non-existent resources to /index.php
RewriteRule (.*) /index.php?q=$1 [L]
If that is not the correct and complete server path, then the code won't work as shown.
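One way to rule out a wrong %{DOCUMENT_ROOT} value is to spell the path out in the conditions. A sketch of the archive rule only, assuming the DocumentRoot "/srv/www/fisheries" quoted from httpd.conf earlier in this thread is the real filesystem path:

```apache
# Same archive rule as above, with the document root written out literally.
# /srv/www/fisheries is taken from the <VirtualHost> block posted earlier;
# adjust it if the files actually live somewhere else.
RewriteCond /srv/www/fisheries/archive/$1 -f [OR]
RewriteCond /srv/www/fisheries/archive/$1 -d
RewriteRule (.*) /archive/$1 [L]
```

If the hard-coded version works and the variable version does not, the value of %{DOCUMENT_ROOT} is the thing to investigate.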
I would strongly suggest that you never use a script to handle errors. As shown in this case, it can lead to problems which are impossible to debug. I recommend using only static HTML custom error pages to handle errors. For now, I suggest commenting-out the ErrorDocument 404 directive, so that the server will use its default error document instead.
Jim
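A static error page as suggested above could be configured along these lines. This is a sketch only; /404.html is a hypothetical file that would have to be created in the document root.

```apache
# Comment out the script-based handler while debugging:
# ErrorDocument 404 /index.php

# Serve a plain HTML file instead. The name /404.html is an assumption.
ErrorDocument 404 /404.html
```

Because the last rewrite rule sends every request for a non-existent resource to /index.php, Apache would only fall back to this page for 404s it raises itself, for example if the rewrite block is disabled.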
I still have some trouble with this. The problem is:
If I create a Drupal node, say: test, then visiting mysite/test goes to mysite/archive/test IF /archive/test exists, instead of rendering the Drupal node.
Could you please help?
Thank you very much,
But the problem still exists:
<code>
<IfModule mod_rewrite.c>
RewriteEngine on
# If requested resource exists as a file or directory, skip next two rules
RewriteCond %{REQUEST_URI} ^/archive [OR]
RewriteCond %{DOCUMENT_ROOT}/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/$1 -d
RewriteRule (.*) - [S=2]
#
# Requested resource does not exist, do rewrite if it exists in /archive
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule (.*) /archive/$1 [L]
#
# Else rewrite requests for non-existent resources to /index.php
RewriteRule (.*) /index.php?q=$1 [L]
</IfModule>
</code>
When I visit example.com/about, if there is a folder "about" in /archive, the request is rewritten to example.com/archive/about instead of rendering example.com/about (a Drupal node).
Ming
I would advise you to name nodes and archive subdirectories so that conflicts do not occur. If you cannot put into words a URL-based method to determine whether the node or the archive should be accessed by a particular URL, then it will also be impossible to code a mod_rewrite solution -- Mod_rewrite works on URL-patterns; If you can define a URL pattern to unambiguously 'map' URLs to server filepaths, then it will work.
Jim
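If renaming is not possible, one URL-pattern approach is to list the paths that must always go to Drupal and exclude them from the archive rule. A minimal sketch, assuming the node paths are "about" and "test" as mentioned in this thread; the list is hypothetical and would have to be maintained by hand.

```apache
# Requested resource does not exist, do rewrite if it exists in /archive,
# unless the path starts with one of the listed Drupal node names.
# In .htaccess context $1 has no leading slash, so the pattern starts
# directly with the name.
RewriteCond $1 !^(about|test)(/|$)
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -d
RewriteRule (.*) /archive/$1 [L]
```

The first condition is ANDed with the pair that follows, and the [OR] applies only between the -f and -d tests. A request for /about therefore falls through to the index.php rule even when /archive/about exists. This block replaces the middle rule only, so the [S=2] skip count in the first rule stays the same.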