Forum Moderators: phranque

Regex, Slashes and URL rewrite

         

rayw

9:25 pm on Apr 12, 2010 (gmt 0)

10+ Year Member



Hi,

I am attempting to pass a url path as a query string using regular expressions in a url rewrite rule. Something like:

http://example.com/url/friendly/path
to
http://example.com/handler.aspx?path=/url/friendly/path

My question is: are slashes allowed in the query string, or do I need to encode them?

Thanks for any help you are able to provide!
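
To make the intended mapping concrete, here is a rough Python sketch of what the rule is meant to do. It is not the rewrite engine's own syntax; `RULE` and `rewrite` are hypothetical names used only for illustration, and handler.aspx comes from the example above.

```python
import re

# Hypothetical stand-in for the rewrite rule: capture the whole path
# and pass it to handler.aspx as the "path" query parameter.
RULE = re.compile(r"^(/.*)$")

def rewrite(path: str) -> str:
    match = RULE.match(path)
    if match is None:
        # Leave anything that does not look like a path untouched.
        return path
    return "/handler.aspx?path=" + match.group(1)

print(rewrite("/url/friendly/path"))
```

The slashes in the captured group are passed through unencoded here, which is exactly the case the question is about.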

[edited by: bill at 4:17 am (utc) on Apr 13, 2010]
[edit reason] use example.com for examples [/edit]

jdMorgan

9:47 pm on Apr 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is one of the toughest questions there is pertaining to URI-encoding requirements. From RFC 3986, Uniform Resource Identifier (URI): Generic Syntax [tools.ietf.org], Section 3.4:

The characters slash ("/") and question mark ("?") may represent data
within the query component. Beware that some older, erroneous
implementations may not handle such data correctly when it is used as
the base URI for relative references (Section 5.1), apparently
because they fail to distinguish query data from path data when
looking for hierarchical separators. However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.


So encoding the slashes is a good idea for compatibility reasons, and a bad idea for usability reasons...

My first reaction would be to not encode them, since it's unlikely your URI will be resolved by an "older, erroneous implementation." But I just looked at some query strings from Google, and they *do* encode slashes in URIs passed in query strings.

No single correct answer, in other words....
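
As a rough illustration of the two options, here is a minimal Python sketch using the standard `urllib.parse` module. It only shows how one parser treats both forms; it assumes nothing about how handler.aspx or any particular rewrite engine decodes the value.

```python
from urllib.parse import quote, parse_qs, urlsplit

path = "/url/friendly/path"
base = "http://example.com/handler.aspx?path="

# Option 1: leave the slashes as literal characters in the query value.
raw_url = base + quote(path, safe="/")
# Option 2: percent-encode the slashes as %2F.
encoded_url = base + quote(path, safe="")

print(raw_url)
print(encoded_url)

# A conforming parser recovers the same value from either form.
for url in (raw_url, encoded_url):
    value = parse_qs(urlsplit(url).query)["path"][0]
    assert value == path
```

Either form round-trips to the same path in this parser, so the choice comes down to readability versus caution about older clients.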

Jim

phranque

6:45 am on Apr 13, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



the syntax description also states:
The query component is indicated by the first question mark ("?") character and terminated by a number sign ("#") character or by the end of the URI.

which means any slashes in the query component should theoretically not be confused with slashes in the path component.
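
a quick way to see this is to split the example URI with Python's standard `urllib.parse` module. this is only a sketch of how one parser behaves; the `#top` fragment and the `other.html` relative reference are made up for the illustration.

```python
from urllib.parse import urlsplit, urljoin

url = "http://example.com/handler.aspx?path=/url/friendly/path#top"
parts = urlsplit(url)

# The query starts at the first "?" and ends at "#",
# so its slashes never become part of the path component.
print(parts.path)
print(parts.query)
print(parts.fragment)

# Resolving a relative reference uses only the base path,
# not the slashes inside the query.
resolved = urljoin(url, "other.html")
print(resolved)
```

an "older, erroneous implementation" of the kind the RFC warns about would instead resolve the relative reference against the slashes in the query.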