Forum Moderators: open

Message Too Old, No Replies

Changes spaces to a hypen in a column

         

nestman

12:13 am on Jan 29, 2016 (gmt 0)

10+ Year Member



I have a keywords column and it contains stuff like this:

apples, oranges, pine apple

I'm trying to change the spaces to hyphens using this query"

UPDATE articles SET keywords = REPLACE(keywords," ","-") WHERE keywords REGEXP '[A-Z] [A-Z]' limit 1;

But this adds hyphens where I don't want them, like this:

apples,-oranges,-pine-apple

Can this be done with REGEXP? Or will I need to involve PHP?

Thank you.

lucy24

2:48 am on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's not about how, it's about where. You have to make the changes at the source; if you've already converted your array into a comma-delimited list, it's too late. Well, maybe replace the space in your existing pattern with (?<!,)\s (I don't really mean \s there just wasn't any practical way to type it in Forums), assuming your keywords will never contain literal commas. But it still seems like a bit of two steps forward, one step back.

Can this be done with REGEXP? Or will I need to involve PHP?

Well, they're not mutually exclusive.

:: trying and failing to avoid reference to "apples and oranges" ::

nestman

3:01 am on Jan 29, 2016 (gmt 0)

10+ Year Member



I'm kinda knew to regexp. Can you explain (?<!,)

Thank you lucy24.

lucy24

7:06 am on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can you explain

The form
(?<!blahblah)
means "not preceded by".

topr8

9:03 am on Jan 31, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



i think you are over complicating things.
you just need to run two updates...

UPDATE articles SET keywords = REPLACE(keywords," ","-"); -- replace every white space with a hyphen

UPDATE articles SET keywords = REPLACE(keywords,",-",", "); -- replace all hyphens immediately following a comma with a white space

... this is assuming all the keyword lists are set up correctly. eg no double white spaces etc.
although you could easily get around this by running an update either before the above two or after the above two

UPDATE articles SET keywords = REPLACE(keywords," "," "); -- replace double white space with single white space, use this before doing the updates above
or ...
UPDATE articles SET keywords = REPLACE(keywords,",--","-"); -- replace all double hyphens with a single hyphen, you could run this after running the updates above.

(this would even account for a 'keyphrase' that had a literal comma in it, assuming there was no white space after the comma, which would have been the only way to have entered it anyway, otherwise it would have been immpossible to differentiate from just another list item - however i'm certain you wouldn't have any keyphrases that had a literal comma in them)

lucy24

9:56 am on Jan 31, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



you just need to run two updates

Ha! Another victory for "two steps forward, one step back". I do this all the time in text editing but didn't think of it here. It's often simpler to change something globally, and then change part of it back again, than to work up a rule for the parts that weren't meant to change in the first place.

topr8

11:05 am on Jan 31, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



yeah lucy i agree - i'm a big fan of the strategy ...

"two steps forward, one step back"


i've spent hours and hours trying to work out how to do things elegantly and in one go ... and it is often so much quicker not over thinking the problem and just doing the obvious, even if it is inelegant or downright down and dirty.

... this is something that applies in all walks of life, not just coding!

csdude55

9:54 am on Mar 24, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I haven't tested this, but wouldn't [[:<:]] [[:>:]] (word boundaries) work? I understand \b doesn't work for REGEXP, so this is how I would have tried:

REGEXP '[[:<:]]\w \w[[:>:]]'

I believe that \b matches a whitespace and comma, so if [[:<:]] is the same then, in theory, this should work.

Note, I'm also using \w instead of [A_Z], which should be the same as [A-Za-z0-9_]. I'm not 100% sure if REGEXP recognizes \w, though.