Forum Moderators: open
I have a 'url table' with 5 million+ rows, and some preliminary testing shows that partitions are helping me insert/update/select much quicker.
Are any of you actively using partitions in MySQL and have some success stories from using them in your larger tables? (drawbacks encountered from using them also welcome)
Just for some background and some of the advantages of partitioning:
In MySQL, the InnoDB storage engine has long supported the notion of a tablespace, and the MySQL Server, even prior to the introduction of partitioning, could be configured to employ different physical directories for storing different databases....Partitioning takes this notion a step further, by allowing you to distribute portions of individual tables across a filesystem
[dev.mysql.com...]
Basically, instead of having just one data and index file per table, partitioning allows you to split the table into smaller, more manageable chunks. MySQL currently supports just horizontal partitioning, i.e. by columns rather than rows. MySQL allows these partitions on separate disks.
Partitioning allows the DB engine to examine your query's WHERE clause and if applicable, only examine the relevant partitions of a table rather than the entire values of a column.
Take the logging of a website for example:
A busy website's logs quickly grow, and after several million rows of data speed becomes an issue. If you create partitions based on the date of each 'hit', per month for instance, you can dramatically speed up the queries for more frequent tasks like "hits for today" or "hits this month". MySQL can effectively ignore the data you don't regularly query but still want stored. You could have billions of hits stored but still have good query speed when looking up recent hits.
Of course, it all depends on how you want to query the data and your individual needs. I've read up a little on them and feel there's obvious advantages for some tables I have. Here's how I see them being used:
1. IP to Country Lookup Table
IP Ranges Table (5.4 million rows)
id (MED INT, AUTO INCREMENT)
countryid (SMALL INT)
loweripbound (INT)
upperipbound (INT)loweripbound and upperipbound are IP ranges that correspond to a country, the countryid.
There is only one practical use of the data supplied above; get the countr(y¦ies) for IP addresses supplied in the WHERE clause.
Rather than the DB engine having to query 5.4 million rows, you could partition the data based on the 1st octet of stored IP addresses. MySQL would then evaluate your WHERE clause and 'know' which partitions to query to retrieve results, therefore (potentially) only having to scan a fraction of the index/data.
I've a few questions regarding my own setup of tables that could use partitions, but thought I'd see what's going on with other people first.
One definite advantage of using them seems to be the archiving of old data, which I'm sure many would appreciate as a remedy for fast growing tables, but haven't had a means to deal with it.