Forum Moderators: open

Message Too Old, No Replies

MySQL SELECT query

SELECT Id from MyTable WHERE Col1 ONLY_CONTAINS('a', 'i', 'u', 'e', 'o')

         

runonce

7:04 pm on Apr 16, 2008 (gmt 0)

10+ Year Member



Hi everyone,

This is the situation:

I want to select rows where Col1 only contains any of the descripted characters.

For exmple:

Hypothetic MySQL table (MyTable):

Id¦ Col1
1 ¦ ai
2 ¦ ue
3 ¦ iee
4 ¦ ki
5 ¦ ga

Hypothetic MySQL query:


SELECT Id from MyTable WHERE Col1 ONLY_CONTAINS('a', 'i', 'u', 'e', 'o')

Expected result: rows 1, 2 and 3

Rows 4 and 5 should not be selected because contains characters not descripted in "ONLY_CONTAINS()".

Now, what could play the role of this hypothetic function "ONLY_CONTAINS()"?

I hope I do explain myself.

Thank you very much!

LifeinAsia

7:07 pm on Apr 16, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



This sounds like a perfect use for regular expressions. I'm not an expert, but there are several experts around that should be able to chime in with an answer.

runonce

10:16 pm on Apr 16, 2008 (gmt 0)

10+ Year Member



Yes, probably this can be done with regular expressions.

I can still solve this with PHP but I'll love to use MySQL instead.

Thanks

Demaestro

10:32 pm on Apr 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



SELECT Id from MyTable WHERE 'a' in Col1 and 'i' in Col1 and 'u' in Col1 and 'e' in Col1 and 'o' in Col1

This will work but it isn't very efficient.. regx would be the best way depending on the surrounding code.. But I am not the guy to help with that either.

[edited by: Demaestro at 10:34 pm (utc) on April 16, 2008]

LifeinAsia

10:48 pm on Apr 16, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Like I said, I'm not a regex expert, but I think something like this should work.

SELECT Id
FROM MyTable
WHERE ID NOT IN
(SELECT ID FROM MyTable WHERE Col1 LIKE '%[^aeiou]%')

runonce

12:52 am on Apr 17, 2008 (gmt 0)

10+ Year Member



SELECT Id from MyTable WHERE 'a' in Col1 and 'i' in Col1 and 'u' in Col1 and 'e' in Col1 and 'o' in Col1

This resulted in an sintax error message. Adding the brackets around the IN parameter (

WHERE 'a' IN(Col1)
) would fix that but still couldn't make what I'm looking for. Tried some variants with no progress.

SELECT Id
FROM MyTable
WHERE ID NOT IN
(SELECT ID FROM MyTable WHERE Col1 LIKE '%[^aeiou]%')

This didn't work either. Tried some variants, even using REGEXP but still couldn't make it.

Thanks anyway!

rocknbil

4:03 pm on Apr 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You don't use a regexp class [] with like, or % with 'regexp' (unless % is one of the characters you want to match.) For like, % acts as a wildcard, the equivalent of * in a regexp.

Off the top of my head,

select id from MyTable where Col1 regexp '[aeiou]' and Col1 not regexp '[^aeiou]';

[aeiou] defines a class, any characters in this set. [^aeiou] means any characters NOT in this set. So it does what you ask: any characters in this set, but NOT if there are characters outside this set.

Testing:

select 'ai' regexp '[aeiou]' and 'ai' not regexp '[^aeiou]';
--> 1 (1 == match, 0 == no match)

select 'ue' regexp '[aeiou]' and 'ue' not regexp '[^aeiou]';

--> 1

select 'ki' regexp '[aeiou]' and 'ki' not regexp '[^aeiou]';

--> 0

select 'ga' regexp '[aeiou]' and 'ga' not regexp '[^aeiou]';

--> 0

Also appears to be case-insensitive.

Aside note: Matches of this type have always, for me, reflected some sort of fault in logic in the way I've set up the database. You should very rarely ever have to search a field using a regular expression. An exception might be something like hunting down bad email address formats. Regexp searches are a slow version of a text search, and on a large database can make a significant difference. If you can re-think the database structure to eliminate regexp searches, you'll be much better off. Not sure of the context, but an example might be to store data in a relational table to be searched by integer numbers for a value.

runonce

12:52 am on Apr 18, 2008 (gmt 0)

10+ Year Member



The solution seems to be with
REGEXP '[ai]'
and
NOT REGEXP '[^ai]'
.

Thank you!

Anyway, the tests works fine but for some reason, when I take it to a table query, results are not the expected.

The query

SELECT * FROM words WHERE romajiWord REGEXP '[ai]' AND romajiWord NOT REGEXP '[^ai]'
will result in 0 rows even when there is a row with the value
ai
in
romajiWord
.

Shouldn't this query return every row which only contains any of the described characters (

ai
in this example) in the
romajiWord
column?

I've been trying variants the whole day but still couldn't figure what I'm doing wrong.

rocknbil

6:17 pm on Apr 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



create table test (id int(11) primary key auto_increment, text varchar(100));

insert into test (text) values ('ai');
insert into test (text) values ('ue');
insert into test (text) values ('iee');
insert into test (text) values ('ki');
insert into test (text) values ('ga');
insert into test (text) values ('AI');
insert into test (text) values ('UE');
insert into test (text) values ('IEE');
insert into test (text) values ('KI');
insert into test (text) values ('GA');

select id from test where text regexp '[aeiou]' and text not regexp '[^aeiou]';


+----+
¦ text ¦
+----+
¦ 1 ¦
¦ 2 ¦
¦ 3 ¦
¦ 6 ¦
¦ 7 ¦
¦ 8 ¦
+----+
mysql> 6 rows in set (0.02 seconds)

Note it's also working case-insensitive.

Not sure what you're doing, but you can test by logging in to mysql and doing a copy/paste of everything from "create table" to the "select" statement above.

Perhaps something with the encoding/collation of the tables?