We have discussed various implementations of replication in previous articles, and we all know that replication means keeping copies of the same data on different nodes. When the dataset is very large, however, copies alone are not enough: we also need to split the data into different parts for storage. This process is called partitioning, sometimes also called sharding.

The main motivation for partitioning is scalability. A single disk or node becomes a bottleneck when the dataset grows too large or the number of queries becomes too high; partitioning lets the data be spread across many disks or nodes, dispersing the query load and improving performance.

Partition and Replication

Does partitioning mean that each disk simply stores less data? Not quite: partitioning is usually combined with replication, so the spare capacity on each disk holds replicas of partitions stored elsewhere. In most cases, the active (leader) partitions are spread across different disks or nodes to balance the load.

For example, leader-follower placement across disks and nodes might look like the following figure. Ideally, each partition has exactly one leader, with its other copies acting as followers (of course, when things go wrong, there could temporarily be multiple leaders).
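As a rough illustration of this idea, here is a minimal sketch in Python; the node and partition names are made up for the example, and real systems compute such placements automatically:

```python
# A toy sketch of how partitions and their replicas might be placed across nodes.
# The partition and node names here are invented purely for illustration.
placement = {
    "partition-1": {"leader": "node-A", "followers": ["node-B", "node-C"]},
    "partition-2": {"leader": "node-B", "followers": ["node-A", "node-C"]},
    "partition-3": {"leader": "node-C", "followers": ["node-A", "node-B"]},
}

def roles_of(node):
    """List the role a given node plays for each partition."""
    roles = {}
    for partition, nodes in placement.items():
        if nodes["leader"] == node:
            roles[partition] = "leader"
        elif node in nodes["followers"]:
            roles[partition] = "follower"
    return roles

print(roles_of("node-A"))
# {'partition-1': 'leader', 'partition-2': 'follower', 'partition-3': 'follower'}
```

Each node is the leader for some partitions and a follower for others, so both the data and the write load end up spread across the cluster.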

Key-Value Data Partition

We know that the purpose of partitioning is to spread the access load, so how should we split the data? You might say: just assign records to partitions at random. Randomization would indeed spread the load fairly well, but then when you want to read a particular record, how do you know which node holds it? You might have to query every node before finding the data you need, which is clearly unreasonable. The following sections look at two common ways of partitioning key-value data.

Partition by key range

The first method that comes to mind is to sort the data by key and assign a contiguous range of keys to each partition. That way, as long as we know the key, we know which partition to query. This is similar to how books are arranged on library shelves, as shown in the figure below:

With this approach, the key ranges do not need to be evenly spaced, for the simple reason that the data itself is generally not evenly distributed across keys; the range assigned to each partition is chosen according to the actual data distribution. If the keys within each partition are kept in sorted order, range queries become much friendlier: we can easily work out which partitions need to be accessed.
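A minimal sketch of key-range partitioning, assuming hypothetical split points chosen to match the data distribution:

```python
import bisect

# Upper-boundary keys for each partition, chosen to match the data distribution.
# Keys below "g" go to partition 0, "g" up to "n" to partition 1, and so on.
boundaries = ["g", "n", "t"]          # hypothetical split points
num_partitions = len(boundaries) + 1  # 4 partitions in this sketch

def partition_for(key: str) -> int:
    """Return the partition index responsible for the given key."""
    return bisect.bisect_right(boundaries, key)

def partitions_for_range(start: str, end: str) -> range:
    """A range query only needs to touch the partitions covering [start, end]."""
    return range(partition_for(start), partition_for(end) + 1)

print(partition_for("apple"))                # 0
print(partition_for("melon"))                # 1
print(list(partitions_for_range("a", "h")))  # [0, 1]
```

Because the keys are ordered, a range query maps to a small, contiguous set of partitions instead of all of them.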

One problem with this approach is key selection. If the key is chosen poorly, some partitions may receive many accesses while others receive very few. For example, if you partition by timestamp, you will notice that the partition holding the most recent data receives far more traffic than the partitions holding older data. The choice of key is therefore very important.

Partition by hash of key

To avoid the skew problems of partitioning by raw key, we usually partition by the hash of a key instead: a hash function is applied to the key, and the partition is chosen based on the resulting hash value. The choice of hash function matters here; MongoDB uses MD5, Cassandra uses Murmur3, and so on.
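As a rough sketch of hash partitioning, the snippet below hashes a key with MD5 and maps it onto a fixed number of partitions. The partition count and the modulo mapping are assumptions for illustration, not any particular database's scheme (many systems assign ranges of hash values to partitions instead, which makes rebalancing easier):

```python
import hashlib

NUM_PARTITIONS = 8  # assumed fixed partition count for this sketch

def partition_for(key: str) -> int:
    """Hash the key and map the hash value onto one of the partitions."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    return hash_value % NUM_PARTITIONS

print(partition_for("user:42"))
print(partition_for("user:43"))  # adjacent keys usually land on different partitions
```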

Once the hash value is calculated, the partition can be chosen according to that value, as shown in the following figure:

This method solves the key-skew problem well, but it introduces a new one: it is very unfriendly to range queries over keys, because keys that are adjacent in a range are now hashed onto different partitions, which makes range scans much harder.

Where there is a problem, there is a solution. Some smart people realized that we can hash only the first part of the key to choose the partition, while the remaining part is kept sorted within that partition. All records sharing the same first part are guaranteed to land on the same partition, which restores the range queries we lost above, at least over the rest of the key.
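A minimal sketch of that compromise, assuming a hypothetical two-part key such as (user_id, timestamp): the first part is hashed to choose the partition, and within the partition records are kept sorted by the second part, so a range scan over the second part touches only one partition.

```python
import bisect
import hashlib

NUM_PARTITIONS = 8
# Each partition keeps its records sorted by the second part of the key.
partitions = [[] for _ in range(NUM_PARTITIONS)]

def partition_for(first_part: str) -> int:
    """Only the first part of the key decides the partition."""
    digest = hashlib.md5(first_part.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def put(first_part: str, second_part: int, value: str):
    """All records sharing the same first part land in the same partition."""
    part = partitions[partition_for(first_part)]
    bisect.insort(part, (second_part, value))

def range_scan(first_part: str, lo: int, hi: int):
    """A range query on the second part only touches a single partition."""
    part = partitions[partition_for(first_part)]
    return [(ts, v) for ts, v in part if lo <= ts <= hi]

put("user:42", 1001, "login")
put("user:42", 1005, "click")
put("user:42", 1010, "logout")
print(range_scan("user:42", 1000, 1006))  # [(1001, 'login'), (1005, 'click')]
```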

Processing hotspot access

Although the hash partitioning described above mitigates the problem of one partition receiving too many accesses, it cannot eliminate it completely. Take an extreme example: if all accesses go to a single key, hashing does not help at all, because the same key always maps to the same partition.

For such hot keys, one workaround is to append a small random value to the key, which splits writes to the hot key across different partitions. The downside is that reads now have to fetch all the variants of the key and merge the results.
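A minimal sketch of that trick, assuming the application already knows which key is hot and how many suffixes to use (both are application-level decisions, and the application must remember which keys it split):

```python
import random

NUM_SPLITS = 10  # assumed number of suffixes for a known hot key

def write_key(hot_key: str) -> str:
    """Writes pick a random suffix, spreading the hot key over several partitions."""
    return f"{hot_key}#{random.randrange(NUM_SPLITS)}"

def read_keys(hot_key: str) -> list:
    """Reads must fetch every suffixed variant and merge the results."""
    return [f"{hot_key}#{i}" for i in range(NUM_SPLITS)]

print(write_key("celebrity:123"))  # e.g. 'celebrity:123#7'
print(read_keys("celebrity:123"))  # all 10 variants must be read back
```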

Conclusion

This article has introduced the basic concept of partitioning, along with several methods for partitioning key-value data and the problems each of them faces.