At present, the Stat table in the system is increasing by 20W data pieces per day. Although more than 3 months of data has been dumped elsewhere, there are still nearly 2KW of data pieces in the table with a capacity of nearly 2GB.
Select * from Stat; The where… Limit is pretty fast, but when it comes to group by paging, it’s pretty slow.
It has been observed that group by takes 35 to 50 days for a seven-day period. The operational experience is extremely unfriendly. Search the MySQL partition scheme. It is found that the Internet is basically a systematic explanation of the concept and type of partition, as well as some experimental effects, not close to actual combat.
By referring to the MySQL manual as well as my own fumbling, I finally implemented partitioning in the current system, because for the record.
Partition type selection
The Stat table itself is a statistical report, so its data is stored by date, and the hot data is generally limited to the current day and 7 days. So I chose the Range type for partitioning.
Create partitions for the current table
Alter table alter table alter table alter table alter table
ALTER TABLE stat
PARTITION BY RANGE(TO_DAYS(dt)) (
PARTITION p0 VALUES LESS THAN(0),
PARTITION p190214 VALUES LESS THAN(TO_DAYS('2019-02-14')),
PARTITION pm VALUES LESS THAN(MAXVALUE)
);
Copy the code
Here are two things to note:
MySQL (I’m version 5.7) has a bug that no matter which partition you are looking for, it will scan the first partition. We have hundreds of thousands of columns in each partition, so it is very painful to scan.
The second is the PM partition, which is the largest partition. If don’t PM, then you save the data on 2019-02-15 will report an error. So PM is actually a reserved partition for future data.
Periodically extend partitions
MySQL partition does not dynamically expand itself, so we need to write code to dynamically expand MySQL partition.
Adding a partition requires the REORGANIZE command, which reassigns a partition. For example, if tomorrow is 15th, then we need to add a partition to 15th, which is actually splitting the PM partition into 2 partitions:
ALTER TABLE stat
REORGANIZE PARTITION pm INTO (
PARTITION p190215 VALUES LESS THAN(TO_DAYS('2019-02-15')),
PARTITION pm VALUES LESS THAN(MAXVALUE)
);
Copy the code
Here comes the question: how do I get all the partitions of the current table? There are many ways to do this on the web, but I thought it would be easier to show create table stat and then use the re to match all partitions.
Periodically Deleting partitions
As the database gets bigger and bigger, we definitely want to clean up old data, as well as old partitions. This is also easier:
ALTER TABLE stat DROP PARTITION p190214, p190215
Copy the code