.htaccess to replace characters in URLs

Swapping between two characters with .htaccess

         

WildBil2Me

4:01 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



A plugin I'd been using for some time recently ceased development. When I replaced it with a new plugin, I discovered that the new plugin uses a different character to separate multi-word tags.

The old format was 'myurl.com/tag/foo+bar' and the new format is 'myurl.com/tag/foo_bar'

I'd like to redirect any incoming links to the new format but I've been having trouble finding .htaccess documentation that really covers this. I assume that the best option would be to rewrite the URL by replacing the '+' signs with the '_' signs. I'm just unsure how to proceed ...

Any suggestions?

jdMorgan

4:27 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Better yet, replace the underscores with hyphens, so that search engines (Google in particular) can read the hyphenated words as separate words rather than interpreting "blue_widgets" as "match only the exact character sequence 'blue_widgets'". It's unlikely that anyone will search for "blue_widgets" or "red_widgets" with the underscore included, so the new plug-in's underscore format will destroy any advantage of having the keywords in the URL.

The following code will redirect requested URLs containing three, two, or one underscore to the same URL, but with underscores replaced with hyphens.


Options +FollowSymLinks
RewriteEngine on
# Three underscores
RewriteRule ^([^_]+)_([^_]+)_([^_]+)_([^.]+)\.html$ http://www.example.com/$1-$2-$3-$4.html [R=301,L]
# Two underscores
RewriteRule ^([^_]+)_([^_]+)_([^.]+)\.html$ http://www.example.com/$1-$2-$3.html [R=301,L]
# One underscore
RewriteRule ^([^_]+)_([^.]+)\.html$ http://www.example.com/$1-$2.html [R=301,L]

Add more rules as required, maintaining the overall pattern shown (note that the last regex pattern in each rule differs from the preceding ones), to cover the maximum number of underscores you expect in a URL, up to a maximum of eight: mod_rewrite provides only the nine back-references $1 through $9. Beyond that, more sophisticated code is needed to overcome both the back-reference limit and a rather nasty bug that occurs when trying to internally rewrite more than once.
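
As an illustration of extending the pattern, a rule for four underscores might look like the sketch below. It assumes the same ".html" URLs and the www.example.com host used in the rules above, and it would go ahead of the three-underscore rule so that such a URL is fixed in a single redirect rather than two.

```apache
# Four underscores: five captured groups, $1 through $5
RewriteRule ^([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^.]+)\.html$ http://www.example.com/$1-$2-$3-$4-$5.html [R=301,L]
```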

Here's the first rule modified to replace three "+" characters with hyphens as an example:


RewriteRule ^([^+]+)\+([^+]+)\+([^+]+)\+([^.]+)\.html$ http://www.example.com/$1-$2-$3-$4.html [R=301,L]

Note that "+" characters outside of [bracketed] character classes must be escaped by preceding them with "\". Again, I strongly advise against using underscores in your published URLs if you wish to derive any search-engine benefit from keyword-in-URL.

These rules will likely need to be further adapted, as I'm not sure what your real links look like. For example, I assumed a ".html" extension on your links, which may or may not be the case. If there is never an extension, then you could/should replace "([^.]+)" with "(.*)", and drop the "\.html" from the pattern and the ".html" from the substitution URL.
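
For instance, going by the 'myurl.com/tag/foo+bar' format in the first post, and assuming the links never carry an extension and contain at most two "+" characters (both guesses on my part), the rules might be sketched like this:

```apache
# Assumed URL shape: /tag/foo+bar+baz with no file extension
# Two "+" characters
RewriteRule ^tag/([^+]+)\+([^+]+)\+(.*)$ http://www.example.com/tag/$1-$2-$3 [R=301,L]
# One "+" character
RewriteRule ^tag/([^+]+)\+(.*)$ http://www.example.com/tag/$1-$2 [R=301,L]
```

If a URL has more "+" characters than the longest rule covers, the leftover ones end up in the last group and should be cleaned up by a second redirect.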

For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim

WildBil2Me

4:56 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



Hi jdMorgan, thanks for checking in.

I switched to '-' rather than '_' (this was an existing option in the plugin), and your code is working great for handling those redirects.

It isn't redirecting when I substitute the '+' sign though.

What I mean is that right now, if I type 'http://example.com/tag/Foo_bar', it redirects to 'http://example.com/tag/Foo-bar', but 'http://example.com/tag/Foo+bar' still goes to a 404 page.

I should include the edited code I'm using:

RewriteRule ^tag/([^_]+)_([^.]+)$ http://example.com/tag/$1-$2 [R=301,L]

Solution
Thanks jdMorgan, I figured it out. It took me a few minutes to really understand the situation. I'm using the following two rules and they're doing exactly what I need.

Options +FollowSymLinks
RewriteEngine on
RewriteRule ^tag/(.*)_(.*)$ http://example.com/tag/$1-$2 [R=301,L]
RewriteRule ^tag/(.*)\+(.*)$ http://example.com/tag/$1-$2 [R=301,L]

[edited by: WildBil2Me at 5:19 pm (utc) on Mar. 25, 2007]

jdMorgan

5:06 pm on Mar 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, a bit of tweaking is needed to handle both characters in a single rule rather than two separate rules. A character class matching either "+" or "_" does it:

RewriteRule ^tag/([^+_]+)[+_]([^.]+)$ http://example.com/tag/$1-$2 [R=301,L]
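
As a sketch, assuming tags may contain up to two separators in any mix of "+" and "_", the same character-class idea extends as follows, with the two-separator rule placed first:

```apache
# Two separators, each either "+" or "_"
RewriteRule ^tag/([^+_]+)[+_]([^+_]+)[+_]([^.]+)$ http://example.com/tag/$1-$2-$3 [R=301,L]
# One separator, either "+" or "_"
RewriteRule ^tag/([^+_]+)[+_]([^.]+)$ http://example.com/tag/$1-$2 [R=301,L]
```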

Again, the documents cited in our forum charter are very useful.

Jim

WildBil2Me

5:09 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



My apologies, I was using that URL purely as a generic. I didn't realize there was a set standard.

Again, I apologize and will head over to review the TOS right now.