Forum Moderators: open

Message Too Old, No Replies

Creating an index, do I just worry about the WHERE conditions?

         

csdude55

5:44 pm on Oct 2, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a table that I created many years ago. All of my queries are like:

SELECT colA FROM table
WHERE colB = '' OR colB = 'foo'


I had originally created an index on colA and colB (named ORIGINAL, with Cardinality of 83). But now I'm pretty sure that the index only needs to be on colB (the WHERE conditions).

So I created a second index that's just on colB (named NEW, with a Cardinality of 1), then ran:

EXPLAIN
SELECT colA FROM table
WHERE colB = '' OR colB = 'foo'


It returns:
type: index
possible_keys: NEW
key: ORIGINAL
Extra: Using where; Using index

So it's using the index with the WHERE columns AND the SELECT columns!

Then I deleted the ORIGINAL index, and now EXPLAIN shows that it doesn't use an index at all!

Does that mean that I should include the columns in the SELECT, too?

In this table of 83 rows, the speed difference is insignificant. But this is just a test run, and the result is going to affect whether I rebuild much larger tables.

csdude55

5:47 pm on Oct 2, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Wait, I just discovered that if I remove colB = '' OR, or change it to something other than '' (eg, WHERE colB = 'this' OR colB = 'that'), then it will use the NEW index.

So the issue is apparently when the column is empty.

Is there a "better" way to select columns that are blank OR contain a specific string?

phranque

8:18 pm on Oct 2, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



when creating an index, you should look at any column usage in WHERE clauses, GROUP BY, ORDER BY, MIN, MAX, and JOIN.

phranque

8:32 pm on Oct 2, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Is there a "better" way to select columns that are blank OR contain a specific string?

it might be overkill but you could probably do this with Boolean Full-Text Searches [dev.mysql.com]

csdude55

11:34 pm on Oct 2, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The code I posted was pretty much verbatim, except for the table name, column names, and values. I don't have any groups, orders, etc.

Any clue why it's not using the index when I use WHERE colB = '' or colB IS NULL (I tested both), but does use it when I use WHERE colB = 'foo'? Is there something magical (anti-magical?) when checking if a value is empty?

phranque

5:52 am on Oct 3, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



i literally answered the precise question you posed in the title of this thread.

csdude55

6:12 am on Oct 3, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Haha, I see :-) I wasn't complaining about your answer, just elaborating on the issue that I discovered in the second post

LifeinAsia

4:05 pm on Oct 4, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Unless there's a compelling reason to have an index on colA, I'd drop it. One, it'll make sure the DB doesn't use it. Two, it'll free up space. Three, it will stop wasting CPU resources keeping the colA index updated when you add/delete/modify data.

csdude55

5:13 pm on Oct 4, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That's what I did, @LifeinAsia, and now it doesn't use an index at all. I can't figure out why.

The weird thing is that I went through it and followed all of the rules to speed it up (I replaced the NULL values with '', changed a column of VARCHAR to ENUM, and then created the new index based on the WHERE statement), and now the query is slightly slower! I mean, it's only 0.002 slower so it hardly matters, but still... not what I expected :-/

And now I don't know whether to go through my larger tables (one that's about 3G) and make the same changes. It'll end up being a lot of work if I don't have any performance improvements.

LifeinAsia

5:15 pm on Oct 4, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Sounds strange. Maybe try dropping the colB index and re-creating it?

csdude55

5:26 pm on Oct 4, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I just tried it, but it had no impact.

My only theory is that, with such a small table, making it's faster to process with no index at all? The table has 72 rows now (it was 83, but I removed several that were no longer relevant), and 62 of them match colB = ''. So I'm going to move forward on a big table tonight, anyway, to see what happens.