HTML5 pattern, require / as first character

Forms handling URLs: the first character MUST be a forward slash.

         

JAB Creations

11:40 am on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm trying to get an HTML5 pattern attribute to require a forward slash to be the first character. I've tried many variations though this is the closest I feel visually represents what I'm trying to accomplish:

pattern="^[/].+{1,128}"


Unfortunately, with that pattern any character is allowed as the first character. I know ^ is what a pattern starts with. I have tried escaping the forward slash without any luck. What am I missing here?

John

JAB Creations

1:11 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



pattern="^\/.{1,127}"
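
A likely reason the first attempt let everything through: `.+{1,128}` stacks one quantifier on top of another, which is not a valid JavaScript regular expression, and the HTML spec says a pattern that fails to compile is simply ignored. Here is a minimal sketch of that behaviour. It assumes the browser wraps the attribute value as `^(?:...)$` and compiles it with the `u` flag, as HTML 5.2 describes. `matchesPattern` is a hypothetical helper, not a browser API.

```javascript
// Hypothetical helper that mimics how a browser evaluates a pattern attribute.
// Assumption: the attribute value is wrapped as ^(?:...)$ and compiled with
// the "u" flag; a pattern that fails to compile is ignored (input accepted).
function matchesPattern(pattern, value) {
  let re;
  try {
    re = new RegExp("^(?:" + pattern + ")$", "u");
  } catch (err) {
    return true; // invalid pattern: the constraint is not applied at all
  }
  return re.test(value);
}

const firstAttempt = "^[/].+{1,128}"; // quantifier on a quantifier: invalid
const fixed = "^\\/.{1,127}";         // same text as pattern="^\/.{1,127}"

console.log(matchesPattern(firstAttempt, "no-slash")); // true (pattern ignored)
console.log(matchesPattern(fixed, "no-slash"));        // false
console.log(matchesPattern(fixed, "/about"));          // true
```

Running the same strings through the real input element in a browser is still the definitive check.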

lucy24

4:46 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What is the intended purpose of .{1,127}? If it means “no longer than 127 characters” then you would have to say .{1,127}$; otherwise it would just ignore characters 128-end. And if you’re not capturing, what’s the upper limit for? Seems like all you’d need is ^/\w and then the rest falls where it may.

:: wandering off to horse’s mouth to make sure I’m not saying anything lethally incorrect
https://www.w3.org/TR/html52/sec-forms.html#the-pattern-attribute
followed by hasty edit ::

It isn’t fully clear whether the / needs to be escaped--in JavaScript you only need to when the RegEx is inside /blahblah/ delimiters--but it certainly does no harm.

If the intent is to constrain the input to site-internal links, do you need to worry about people spelling out “https://www.example.com/” for the site you’re already on?
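
On the escaping question, a quick sketch in plain JavaScript. It assumes the pattern is compiled from a string (as the attribute value is) rather than written as a `/.../` literal, so `/` does not end anything and the backslash is optional. The sample inputs are made up for illustration. The last one is the spelled-out URL case, which is rejected because its first character is not a slash.

```javascript
// In a RegExp built from a string, "/" is an ordinary character,
// so escaping it is optional. Both patterns accept the same inputs.
const unescaped = new RegExp("^(?:/.{1,127})$", "u");
const escaped = new RegExp("^(?:\\/.{1,127})$", "u");

const samples = ["/contact", "contact", "https://www.example.com/contact"];
const results = samples.map((s) => [s, unescaped.test(s), escaped.test(s)]);

for (const [s, a, b] of results) {
  console.log(s, a, b);
}
```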

not2easy

5:54 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



This confused me too; it was not clear that it was for a .js form.

JAB Creations

6:40 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Lucy, thanks. I am aware that regex ends with a dollar sign. I've made it my goal over the years to avoid relying on regular expressions due to their expense and to use them as the last if condition if at all possible. This is HTML5 though and this input element is being used to create the URL immediately after the domain name (/ alone is the front page). So now the correct answer thanks to you is:

pattern="^\/.{1,127}$"


John
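
A note on the anchors, as far as the HTML spec goes: the pattern has to match the whole value, not just part of it, so the explicit ^ and $ are harmless but not required. A shorter pattern such as ^/\w would for the same reason accept only two-character values. A minimal sketch, assuming the browser wraps the attribute value as `^(?:...)$` with the `u` flag. `asPatternAttribute` is a hypothetical helper name.

```javascript
// Assumption: the browser wraps the attribute value as ^(?:...)$ ("u" flag).
function asPatternAttribute(pattern) {
  return new RegExp("^(?:" + pattern + ")$", "u");
}

const withAnchors = asPatternAttribute("^\\/.{1,127}$");
const withoutAnchors = asPatternAttribute("\\/.{1,127}");
const shortForm = asPatternAttribute("^/\\w");

const tooLong = "/" + "a".repeat(128); // 129 characters in total

console.log(withAnchors.test(tooLong), withoutAnchors.test(tooLong)); // false false
console.log(withAnchors.test("/"), withoutAnchors.test("/"));         // false false
console.log(shortForm.test("/a"), shortForm.test("/about"));          // true false
```

Note that a bare / is rejected by `.{1,127}` because at least one character must follow the slash. If the front page should be enterable as / alone, `.{0,127}` would be the quantifier to try.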

lucy24

7:03 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



“This is HTML5 though”
Yup, that’s why I had to take a quick look at the docs. It’s part of the new expanded Forms* section. But it does say plainly that the pattern is a Regular Expression, even if they don’t deign to spell out exactly which syntax they use. So whether you like it or not, you’re using a RegEx ;)


* If it weren’t part of input--a string with clear beginning and end--the pattern would instead have been [^"]{1,127}" which is why it’s a good thing I checked The Docs before posting.

JAB Creations

7:44 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not that I don't have an interest in regular expressions, it's that it's been the amateur's first go-to instead of doing code correctly. I haven't shunned it, I just haven't blindly hugged it in the dark with the possibility that I was actually randomly embracing a tire hanging from a tree.

I would imagine that there may be different ... kinds(?) of regular expression syntaxes...could you please indulge me a bit?

John

not2easy

8:04 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



POSIX and Perl syntax are the most common, but not the only syntaxes in use. I like the Wikipedia page to explain it all: [en.wikipedia.org...]
(basically because I am not a RegEx wizard - my everyday usage is grep, a highly forgiving syntax) It does help to know the differences and which works best for its intended use.

lucy24

11:39 pm on Aug 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When it comes to things like ^ and $ and . all RegEx dialects (“flavors”) are the same. It’s only when you get into the more complicated and nuanced constructions that they differ. In your rule .{1,127} some dialects might let you say .{,127} while others won’t. But the most dramatic differences are in the extras: one text editor says \p{Punct} while another says [[:punct:]] and they’ll differ on exactly what classes can be defined. Escaping / is another one: if the language as a whole--such as JavaScript or certain Apache modules--uses / to mark the beginning and end of the RegEx, then obviously / has to be escaped; otherwise it doesn’t. Some dialects have the option of using something other than \ as your escape character. Some use \1 for captures; others use $1.

And so on.
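
To make a couple of these differences concrete, here is a small sketch of how the JavaScript dialect (the one the pattern attribute uses) treats `{,127}` and captures. The test strings are made up for illustration, and other dialects may behave differently on the same input.

```javascript
// {,127} is not a quantifier in JavaScript. Without the "u" flag it is read
// as literal text; with the "u" flag it is a syntax error.
const loose = new RegExp("^.{,127}$");
console.log(loose.test("x{,127}")); // true: matched as literal characters
console.log(loose.test("hello"));   // false

let strictError = null;
try {
  new RegExp("^.{,127}$", "u");
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof SyntaxError); // true

// Captures: \1 refers back inside the pattern, $1 in the replacement string.
const doubled = "aa bb cd".replace(/(\w)\1/g, "[$1]");
console.log(doubled); // "[a] [b] cd"
```

So in a pattern attribute the lower bound has to be written out, as in `.{0,127}` or `.{1,127}`.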

JAB Creations

11:07 am on Aug 14, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Interesting to read that Perl, JavaScript, Java, Python, XML and HTML5 all share the same syntax for regular expressions. I appreciate the insight from both of you.

John