The first two articles described the concept of table partitioning, its features, and possible pitfalls. This article introduces the application of table partitioning.

To optimize the query

Partitioning introduces new ways to optimize queries (with some pitfalls, of course). The biggest feature is that the optimizer can use partitioning functions to filter out irrelevant partitions. This feature acts like a coarse-grained index, making it possible to access less data when querying.

Therefore, specifying partitioning conditions in the WHERE clause is important, even if it may be unnecessary. With this condition, the optimizer can filter out unwanted partitions. Otherwise, the query executor engine would have to access the entire partition of the data table, which would result in extremely slow queries.

You can use the EXPLAIN PARTITION directive to see if the optimizer has filtered out partitions, as in the following example:

EXPLAIN PARTITIONS SELECT * FROM sales;
Copy the code
id: 1
select_type: SIMPLE
partitions: p_2019,p_2020,p_2021
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: 
Copy the code

As seen above, this case accesses all partitions, and if you add a constraint to the WHERE condition:

EXPLAIN PARTITIONS SELECT * FROM sales_by_day WHERE day > '2020-01-01';
Copy the code
id: 1
select_type: SIMPLE
table: sales_by_day
partitions: p_2020,p_2021
Copy the code

The optimizer does a good job of tuning partitions, converting range values into discrete values and deciding which partitions to query against them. However, it’s not a panacea, and the following query looks optimizable but doesn’t:

EXPLAIN PARTITIONS SELECT * FROM sales_by_day WHERE YEAR(day) > 2020;
Copy the code
id: 1
select_type: SIMPLE
table: sales_by_day
partitions: p_2019, p_2020,p_2021
Copy the code

MySQL can only perform partitioning filtering optimizations on columns corresponding to partitioning functions, not on an expression, even if the result of the expression looks the same. For the same reason indexes cannot be used in arguments to expressions. The above query can be optimized in the following equivalent way:

EXPLAIN PARTITIONS SELECT * FROM sales_by_day 
WHERE day BETWEEN '2020-01-01' AND '2020-12-31';
Copy the code
id: 1
select_type: SIMPLE
table: sales_by_day
partitions: p_2020
Copy the code

Because the WHERE condition refers directly to the value of the partitioned column, rather than the expression, the optimizer can do partitioning optimization. The first rule of this feature is that we can partition using expressions, but we must perform search queries against partitioned columns.

The optimizer is also smart enough to partition optimizations during queries. For example, if the partitioned table is in the second table of the union query and the condition of the union query is the partitioning key, then MySQL will only look for rows in the associated partition (but the optimization results will not be shown in EXPLAIN because this is optimized at run time, not query optimization period).

Merge table

Earlier, simpler partitions that merge tables have different constraints and fewer optimizations. Since the data table partition is at a strict abstraction level, direct access to the partition’s physical hidden tables is not allowed. The merged table allows us to access the hidden physical table separately from the merged table. With further integration of partitions and improvements in the optimizer in the future, it is likely that the merged table will be discarded or even removed.

The merged table is essentially a container for the real physical table. You can use a special UNION syntax to create multiple tables to merge. Here is an example:

CREATE TABLE t1(a INT NOT NULL PRIMARY KEY) ENGINE=MYISAM;
CREATE TBLE t2(a INT NOT NULL PRIMARY) ENGINE=MYISAM;
INSERT INTO t1(a) VALUES (1), (2);
INSERT INTO t2(a) VALUES(1), (2);
CREATE TABLE mrg(a INT NOT NULL PRIMARY KEY)
ENGINE=MERGE UNION=(t1, t2) INSERT_METHOD=LAST;
SELECT a FROM mrg;
Copy the code

The following operation is equivalent to fetching data from both tables at the same time:

a
1
1
2
2

Note that the actual table has the same field names, number, and type, and that the indexes are present both on the merged table and in the respective sub-tables. This is the requirement to create the merged table. Note that each sub-table has the same primary key on the same column, but the merged table can have duplicate rows. This is the drawback of the merged table, where every piece of data looks normal, but the merged table does not add constraints of its own, so you need to use a program to constrain them.

The INSERT_METHOD=LAST directive tells MySQL to send inserts to the previous table at the same time as the merged table. We can only specify the FIRST or LAST operation, and we can also operate directly on the sub-table individually. This is different from partitioned tables, which have more control over how data is stored.

Inserting data into the merged table inserts data into both the merged and partitioned tables.

INSERT INTO mrg(a) VALUES(3);
SELECT a FROM t2;
Copy the code
a
1
2
3

Merge tables also have some interesting features and limitations, such as deleting a merge table or one of its sub-tables. Deleting a merged table does not affect its sub-tables, but deleting one of the sub-tables varies depending on the operating system. For example, on Linux systems, the file descriptor for the split table remains and the data table exists, but only through the merge table query.

DROP TABLE t1, t2;
SELECT a FROM mrg;
Copy the code
a
1
1
2
2
3

Here are some restrictions on merging tables that need to be kept in mind to avoid potholes:

  • Creating a merge table does not check whether the merge table is compatible. If the definitions of the merge table are different, MySQL may create an unusable merge table. Furthermore, modifying a separate table after creating a valid merged table will cause the merged table to become unusable.
  • REPLACE is not available for merge tables, and increment keys don’t work as expected either.
  • Querying the merge table accesses each sub-table. This can also cause single-row queries to be slower than single-table queries. Therefore, it is best to limit the number of sub-tables, especially if this table may be the second table in a federated query. The less data is accessed at a time, the greater the performance impact of merging tables.
  • By accessing the partitioned table in advance, range queries are less affected than queries directly on the merged table alone.
  • The speed of a table scan is the same for a merged table as for a normal table.
  • Unique fields and primary keys stop the query once the query is successful, in which case the server accesses one sub-table in turn until the desired data is retrieved.
  • The order of split table queries is determined by the order in which the tables were created. This feature can improve the speed of sorting operations on merged tables if you need to specify the order of access to data frequently.
  • Since the merged table does not hide MyISAM’s tables, it provides some features that table partitioning does not: a MyISAM table can be a subtable of multiple merged tables; The.frm,.myi and.frm tables can be replicated directly on different servers. MYD file. After the merge table is defined, more sub tables can be added. You can create a temporary merge table, such as data in a specified time range, which is not possible with table partitioning. You can remove a sub-table from a merged table for backup, restore, change, repair, or other operations. These operations can then be added to the same merge table.
  • You can use Myisampack to compress some sub-tables.

On the other hand, all physical tables are hidden by the MySQL server and can only be accessed and controlled through partitioned tables.