Forum Moderators: open

Message Too Old, No Replies

Sorting results with = first, MATCH second

         

csdude55

6:41 pm on Mar 19, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is in MySQL. I'm working on a "search" option for a table that's currently 2.6G in size. I'm setting it up so that the user can search for a "username" independently of other content.

I don't have an index on the "username" field by itself yet. I know it's going to take a long time to process, so I want to make absolutely sure that I do it right the first time.

Ideally, I would like results where username = '%s' to show up first, followed by results that are less of a match. For example, let's say that there are 3 registered usernames:

csdude
csdude55
csdude 55

Someone searches for "csdude" then I would want results where username="csdude" to show up first, with the other two after. But if they search for "csdude 55" then I want username="csdude 55" to show up first, followed by results for "csdude55", followed by results for "csdude".

At least, that's how I think the search would make the most sense... I'm open to thoughts on that.

Testing speeds on a backup table with a BTREE index on the username and a separate FULLTEXT index:

// 0.02 seconds, uses the BTREE index
SELECT DISTINCT id FROM table WHERE username = "csdude 55"

// 3.9 seconds, doesn't use an index
SELECT DISTINCT id FROM table WHERE username LIKE "%csdude%55%"

// 1.3 seconds, uses the FULLTEXT index
SELECT DISTINCT id, MATCH(username) AGAINST('csdude 55' IN BOOLEAN MODE) AS score
FROM table
WHERE
MATCH(username) AGAINST ('csdude 55' IN BOOLEAN MODE)
ORDER BY score DESC

Based on this, I THINK that the best option is to create a FULLTEXT index on the main table and use MATCH..AGAINST. Unless you guys and gals can suggest something else?

It took about 30 minutes to process the FULLTEXT index to the backup table, so I'm guessing it'll take 1 - 1 1/2 hours to add it to the main table. It actually timed out using phpMyAdmin, but the index was there and working. I'm not even sure how to handle adding it to the bigger table, do I need to stop MySQL altogether while it processes? Do I need to stop it, copy the table to a backup folder, then turn MySQL back on and create the index? Just thinking about it makes me break out in a sweat... LOL

lammert

10:30 pm on Mar 19, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Working on a production database always raises the blood pressure somewhat :)

Normally creating an index will lock the database for other queries until the operation is completed. You may want to add ALGORITHM=INPLACE to the index creation command. With this setting, it doesn't block data queries. I would test it on a backup database first though to be sure it has the desired effect.

Reference: [dev.mysql.com...]

csdude55

5:59 pm on Mar 20, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I tried on a backup via PMA last night, but after 30 minutes it timed out. I tried it again via SSH before going to bed, and when I got up it was still running and my server load was off the charts!

So now my plan is to create a new table with the new indexes, and I'll use

INSERT INTO newTable SELECT * FROM table WHERE id > xxx LIMIT 1000

After everything's moved over, I'll rename the old table to "table_backup", then rename the new one "table"

I hope that as long as I do the renaming at around 2am, the damage should be minimal.