MySQL partition table introduction

Partition is a table design mode, the correct partition can greatly improve the efficiency of database query, complete higher quality SQL programming. But partitioning can have devastating results if used incorrectly.

Partitioning is not implemented at the storage engine level, so not only InnoDB storage engine supports partitioning, but also common storage engines such as MyISAM and NDB. Not all storage engines support partitioning. For example, CSV, FEDORATED, and MERGE do not support partitioning. Before using this partitioning feature, you should be aware of the partitioning support of the storage engine of your choice.

MySQL database added support for partitioning in version 5.1. Partitioning is the process of breaking up a table or index into smaller, more manageable parts. For database access applications, logically there is only one table or index, but physically this table or index may consist of dozens of physical partitions. Each partition is a separate object that can be processed on its own or as part of a larger object.

The MySQL database supports horizontal partitioning (allocating records of different rows in the same table to different physical files), but does not support vertical partitioning (allocating records of different columns in the same table to different physical files). In addition, a partition of a MySQL database is a partitioned index. A partition contains both data and indexes. Global partitioning, on the other hand, means that data is stored in various partitions, but the index of all data is placed in one object. Currently, MySQL databases do not support global partitioning.

You can run the following command to check whether partitioning is enabled for the current database:

mysql> show global variables like '%partition%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| have_partitioning | YES   |
+-------------------+-------+
1 row in set(0.04 SEC) mysql > show plugins * * * * * * * * * * * * * * * * * * * * * * * * * * * 43. The row * * * * * * * * * * * * * * * * * * * * * * * * * * * Name: partition Status: ACTIVE Type: STORAGE ENGINE Library: NULL License: GPLCopy the code

There is sometimes a misconception that once partitioning is enabled, the database will run faster. There are a number of problems with this conclusion. Empirically, partitioning may improve the performance of some SQL statements, but partitioning is primarily used to manage high availability of databases. Partitions should be used with great care in OLTP applications. In short, if you simply use partitions without understanding how they work or how your application uses them, they can have a negative impact on performance.

MySQL > create partition

RANGE partitioning

The RANGE partition, the most common type of partition, allocates multiple rows to a partition based on column values belonging to a given contiguous interval. These intervals must be contiguous and non-overlapping, defined using the VALUES LESS THAN operator.

The LIST partition

The LIST partition is similar to the RANGE partition, except that the LIST partition is selected based on column values matching some value in a discrete set of values, rather than being continuous.

LIST PARTITION is implemented BY using “PARTITION BY LIST(expr)”, where “expr” is a column value or an expression based on a column value that returns an integer value, and then defines each PARTITION BY “VALUES IN (value_list)”. Value_list is a comma-separated list of integers.

HASH partitioning

The purpose of HASH partitioning is to evenly distribute data among predefined partitions to ensure that the amount of data in each partition is roughly the same. In RANGE and LIST partitions, you must explicitly specify in which partition a given column value or collection of column values should be stored; In HASH partitioning, MySQL does this automatically. All the user has to do is specify a column value or expression based on the value of the column to be hashed, and specify the number of partitions that the partitioned table will be split into.

To use HASH partitions to split a TABLE, add a “PARTITION BY HASH (expr)” clause to the CREATE TABLE statement, where “expr” is an expression that returns an integer. It can simply be the name of a column whose field type is MySQL integer. In addition, you will most likely need to add a “PARTITIONS Num” clause, where num is a non-negative integer indicating the number of PARTITIONS to be partitioned from the table. If you do not include a PARTITIONS clause, the number of PARTITIONS will default to 1.

LINER HASH

MySQL also supports linear hashing, which differs from regular hashing in that linear hashing uses a linear powers-of-two algorithm, while regular hashing uses modulus of hash function values. The only syntactic difference between LINEAR hash partitioning and regular hash partitioning is the addition of the “LINEAR” keyword to the “PARTITION BY” clause.

The KEY partition

The KEY partition is similar to the HASH partition except that the HASH partition is partitioned using user-defined functions that support string HASH partitioning. The KEY partition is partitioned using functions provided by the MySQL database based on the same algorithm as PASSWORD().

COLUMNS

RANGE, LIST, HASH, and KEY partitions are defined as follows: Data must be an integer (interger). If it is not an integer, it should be converted to an integer by functions such as YEAR(), TO_DAYS(), and MONTH(). COLUMNS began to be supported in MySQL5.5, which can be seen as an evolution of RANGE and LIST partitions. COLUMNS can be directly partitioned using non-shaping data. COLUMNS are obtained by direct comparison of types and do not need to be converted to an integer. In addition, the RANGE COLUMNS partition can partition the values of multiple COLUMNS.

The COLUMNS partition supports the following data types:

  • All integer types, such as INT, SMALLINT, TINYINT, and BIGINT. FLOAT and DECIMAL are not supported.
  • DATE type, such as DATE and DATETIME. The remaining date types are not supported.
  • String types, such as CHAR, VARCHAR, BINARY, and VARBINARY. BLOB and TEXT types are not supported.

NULL value in the partition

MySQL databases allow partitioning of NULL values, but the way they are handled may be completely different from other databases. MySQL database partitions always treat NULL values as less than any non-NULL value in the same way that MySQL database ORDER BY operations handle NULL values. Therefore, the MySQL database handles NULL values differently for different partition types.

  • For the RANGE partition, if a NULL value is inserted into the partition column, the MySQL database will place the value in the leftmost partition.
  • For the LIST partition, if a NULL value is inserted into the partition column, you must explicitly indicate which partition put the NULL value, otherwise an error is reported. For the LIST partition, if a NULL value is inserted into the partition column, you must explicitly indicate which partition put the NULL value, otherwise an error is reported.
  • For HASH and KEY partitions, NULL values are treated differently from RANGE and LIST partitions. Any partition function returns 0 for records that contain a NULL value.

Partitioning and performance

Does partitioning really speed up database queries? In fact, you may not even notice an increase in query speed, or even see a sharp drop in query speed, so it is important to understand the environment in which partitions are used before using them properly.

Database applications are divided into two categories: one is OLTP (online transaction processing), such as Blog, e-commerce, online games and so on; Another category is OLAP (online analytical processing), such as data warehouse, data mart. Partitioning can really improve query performance for OLAP applications, because most queries in OLAP applications require frequent scanning of a large table. Suppose you have a table of 100 million rows with a timestamp attribute column. The user’s query requires a year’s worth of data from this table. If you partition by timestamp, you only need to scan the corresponding partition. This is the zonal pruning technique introduced earlier.

For OLTP applications, partitioning should be very careful. In this application, it is usually not possible to retrieve 10% of the data of a large table, and most of it is just a few records returned through the index. According to the principle of B+ tree index, for a large table, the general B+ tree requires two to three disk I/OS. As a result, B+ trees can do just fine without the help of partitioning, and poorly designed partitioning can cause serious performance problems.

For example, many development teams think that a table with 1000W rows is a very large table, so they tend to use partitions, such as a 10-hash partition for the primary key, so that each partition has only 100W of data, so the query should be faster. For example, select * from table where pk=@ PK However, have you considered the case that 100W and 1000W rows of data themselves constitute the same level of B+ tree, maybe two or three levels. The above index traversing the primary key partition does not result in any performance improvement. Well, if the B+ tree height of 1000W is 3,000W and the B+ tree height of 100W is 2, then the above index partitioning by primary key can avoid 1 IO, thus improving the efficiency of the query. This is fine, but the table only has the primary key index, and there are no other columns to query. If an SQL query similar to the following is executed: SELECT * from table where key=@key, the query for key needs to scan all 10 partitions. Even though the query cost for each partition is 2 I/OS, a total of 20 I/OS are required. For the original single-table design, the KEY query only takes 2 to 3 IO.

As you can see from the above conclusions, you must be careful when using partitions in AN OLTP scenario.

MySQL 5.7 has improved partitioning

In MySQL 5.6, partition information is maintained in the MySQL Server layer (in.PAR files), InnoDB engine layer is not aware of the concept of partition, InnoDB engine layer treats each partition as a normal InnoDB table. When opening a partitioned table, many partitions are opened. Opening these partitioned tables is equivalent to opening the same number of InnoDB tables, which requires more memory to hold the metadata of InnoDB tables and various cache and handler information related to opening IBD files. In MySQL 5.7, InnoDB introduced Native Partitioning, which moves Partitioning information from the Server layer to the InnoDB layer. The memory cost of opening a partitioned table is basically the same as that of opening an InnoDB table.

This article is adapted from inside MySQL Technology :InnoDB Storage Engine, 2nd edition

Personal wechat official Account:

Individual making:

github.com/jiankunking

Personal Blog:

jiankunking.com