MySQL table sharding
In MySQL, it is commonly recommended that a single table hold no more than about 10 million rows; beyond that, query performance degrades noticeably. Once the usual measures have been applied (indexing, read/write splitting, well-written SQL), the remaining option is to split the table. Sharding is horizontal splitting: the rows are divided among several tables or nodes.
MySQL 5.1 introduced the PARTITION feature, which is one way to split table data. Table splitting divides one large table into N smaller tables; partitioning stores the data of one large table in several blocks, while it remains a single table.
There are several table-splitting strategies.
Range partitioning
Numeric IDs can be split by range. For example, as shown in the figure above, IDs 1 to 10,000,000 are placed on the first node, IDs 10,000,001 to 20,000,000 on the second node, and so on. The advantage of this method is convenient retrieval: a lookup by ID can quickly locate the node. Expansion is also easy: just add a node. The disadvantage is that the load is not uniform: the newest data is the hottest, so the last node is accessed far more often than the others.
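The range rule can be sketched in a few lines of Python. This is only an illustration: the function name range_node and the constant ROWS_PER_NODE are hypothetical, the width of 10 million IDs per node is taken from the example ranges, and node indexes are zero-based (index 0 is the first node).

```python
# Assumed shard width: 10 million IDs per node, as in the example ranges.
ROWS_PER_NODE = 10_000_000


def range_node(record_id: int, rows_per_node: int = ROWS_PER_NODE) -> int:
    """Return the zero-based node index for a positive integer ID."""
    if record_id < 1:
        raise ValueError("record_id must be a positive integer")
    # IDs 1..10,000,000 map to node 0, 10,000,001..20,000,000 to node 1, etc.
    return (record_id - 1) // rows_per_node
```

Adding a node needs no data movement here, because existing IDs keep their index; only IDs beyond the current maximum land on the new node.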
Modulo partitioning
To solve the hot-data problem, the data can be split by modulo. Typically 2^n nodes are set up, and the ID is taken modulo 2^n to pick the node. The advantage of this method is that it acts as load balancing: hot data is spread evenly across all nodes. The disadvantage is that expanding capacity is difficult: capacity can only be doubled, otherwise all the data has to be re-sharded.
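A minimal sketch of modulo routing, assuming integer IDs and a node count that is a power of two. The name mod_node is hypothetical. The loop at the end illustrates why doubling is the cheap way to grow: with 4 nodes becoming 8, every ID either stays where it is or moves to the node whose index is 4 higher.

```python
def mod_node(record_id: int, node_count: int) -> int:
    """Return the zero-based node index for record_id, with node_count = 2^n."""
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    if node_count < 1 or node_count & (node_count - 1):
        raise ValueError("node_count must be a power of two")
    return record_id % node_count


# Doubling from 4 to 8 nodes: each ID keeps its old index or moves to
# old index + 4, so only part of each node has to be copied out.
for record_id in (5, 9, 14):
    print(record_id, mod_node(record_id, 4), mod_node(record_id, 8))
```

With any other growth step (for example 4 to 5 nodes), most IDs would change node, which is the full re-shard the text warns about.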
Date range partitioning
Log data can be sharded by date.
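As one possible illustration, the sketch below routes a log record to a per-month table. The monthly granularity, the function name log_table_for, and the table naming scheme (a prefix followed by year and month) are all assumptions made for the example; the same idea works per day or per year.

```python
from datetime import date


def log_table_for(day: date, prefix: str = "log") -> str:
    """Return the name of the (hypothetical) monthly table for a log date."""
    # date supports strftime codes in format specs: %Y is the 4-digit year,
    # %m the zero-padded month.
    return f"{prefix}_{day:%Y%m}"
```

A nice property of date-based tables is that old data can be archived or dropped one table at a time.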
Hash partitioning
To split non-numeric IDs horizontally, hash the ID and take the hash value modulo 2^n to get the shard node number.
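A small sketch of hash routing for string IDs. The choice of CRC32 (via Python's standard zlib module) is an assumption made for the example, not something the text prescribes; any hash that gives the same value on every machine and every run would do. Python's built-in hash() is not suitable for strings because it is randomized per process. The name hash_node is hypothetical.

```python
import zlib


def hash_node(record_id: str, node_count: int) -> int:
    """Return the zero-based node index for a non-numeric ID."""
    if node_count < 1 or node_count & (node_count - 1):
        raise ValueError("node_count must be a power of two")
    # crc32 returns an unsigned 32-bit integer that is stable across runs.
    digest = zlib.crc32(record_id.encode("utf-8"))
    return digest % node_count
```

Because the last step is the same modulo 2^n as before, this scheme shares its trade-offs: even spread, but capacity grows by doubling.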
Conclusion
Table splitting basically comes down to these methods: range, modulo, and hash. The core idea is to distribute data evenly across several areas to avoid data hot spots. The corresponding problem in computation is data skew.