The past and present of distributed databases

When people first started using database systems, all data lived on a single machine, the so-called stand-alone database server. In enterprise applications, we also build an application server, typically running on a server or workstation with a Linux/Unix/Windows operating system. As the name suggests, its role is to handle complex business logic. Note, however, that in this architecture the application server does not store any business data: it only handles logical operations and user requests, while the actual data is stored on the database server mentioned earlier. The application server translates the user's request into a database language (usually SQL) that runs against the database to add, delete, modify, and query data. The database server is not directly exposed to the outside, and administrators are not supposed to manipulate data directly at the database level; all operations go through the application server. The application layer, the database layer, and the UI layer together are known as the traditional three-tier Web architecture.

Replication

As data volumes grew, technology advanced, and requirements expanded, factors such as security, reliability, fault tolerance, and recovery had to be considered in database design. Hence the emergence of distributed database systems. In the past, the single-server architecture was adopted and all data was stored in one database; once that database failed, every application request was affected. Database recovery was also a headache: a full database restore could take hours or even days. In today's world of Internet applications, business requirements pose serious challenges to architecture, and no Internet application can tolerate hours of downtime.

Distributed databases provide a technical solution: multi-point deployment. A set of application databases is distributed across multiple database servers, separated into primary and secondary roles. The primary server handles the daily service requests, while the secondary servers continuously replicate the primary. When the primary server breaks down or becomes unstable, a secondary server immediately takes its place and continues to provide service; the operations staff then repair the faulty server and put it back into production. Such architectures, also known as highly available architectures, support disaster recovery, provide reliable support for the business world, and are one of the mainstream architectures adopted by many enterprise applications.

Note that in such a master-slave design, the slave databases are usually read-only while the master supports both reads and writes, and there is typically one primary database connected to several secondaries. In Internet products, most requests sent to the application server are reads, so the application server can distribute read requests across the secondaries and avoid overloading the primary with concurrent requests. As for why most applications are read-heavy: think about whether you see more pictures posted by others or by yourself when you use WeChat or Weibo. When you scroll down and refresh your feed, those are read requests; only comments, likes, and shares are writes.

The world works this way: when technology solves real problems for people, new needs emerge endlessly. The rise of smartphones, "Internet Plus", and mass entrepreneurship has ignited the passion of a nation with thousands of years of civilization, and all kinds of new ideas and concepts keep appearing. Who doesn't have dozens of Internet applications installed on their phone? From food ordering and express delivery to housing, travel, education, and elder care, which link is not supported by the Internet, and which has no technological component? We live in a society that is ordinary and yet gives us plenty to be proud of. Ever more requirements and data are flooding our architectures and challenging our storage.

At this point, you might be wondering whether the distributed multi-point deployment described above has bottlenecks of its own. In a master-slave structure, for example, the content of a slave database is essentially a full copy of the master database, a technique called Replication. Replication typically synchronizes master and slave data using a Transaction Log. For example, when data is inserted into the master database, the master appends a record to the Transaction Log declaring that the write took place; a Replication Process is then triggered that synchronizes the contents of the Transaction Log to the slaves. The whole process — write to the master, append to the Transaction Log, replicate the log to the slaves — is illustrated in the figure below.
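
As an aside, MongoDB's own equivalent of this transaction log is the oplog, a capped collection in the local database of every replica-set member. A minimal sketch for peeking at it in the mongo shell, assuming a replica set is already running locally:

// Show the most recent replicated operation recorded in the oplog
db.getSiblingDB('local').getCollection('oplog.rs').find().sort({ $natural: -1 }).limit(1).pretty()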

For database scaling, there are usually two approaches: vertical scaling and horizontal scaling.

  • Vertical scaling: the traditional approach, which upgrades the hardware of a single server by adding a more powerful CPU, more memory, or more disk space. Its limitation is that it is confined to one machine, and there is only so much hardware you can add to a single server. Its advantage is a simple architecture: there is only one server to maintain.
  • Horizontal scaling: the dominant form of architecture today, which scales the system by adding more servers. In such an architecture the configuration of any single server does not need to be high; it may be a cheap, low-end PC. Each machine hosts a subset of the system, and the cluster formed by all the machines provides more capacity and throughput than any single server could. The downside is that the architecture is more complex than a single server, and building and maintaining it requires a stronger technical background. Sharding in MongoDB is designed precisely for horizontal scaling. Below, we will lift the veil on sharding and discuss the technical differences between shard designs and their impact on the database system.

Shard

The Replication structure described earlier ensures that all data in a database has multiple copies, which gives the database high availability. The new problem is that if you want to store a very large amount of data, both the master and the slave servers have to store all of it, which inevitably leads to performance problems. You could say that Replication is only the first stage of a distributed database: it mainly solves high availability and lets read traffic scale horizontally, partly relieving the heavy concurrent read load on the master, but it does not distribute write operations, and a query is still confined to a single server rather than being served by several database servers at once.

Now suppose you had an architecture that let you shard the database horizontally and store the shards on different servers. When a query reaches the database, the records matching the query condition can be retrieved from multiple databases in parallel. This not only uses the CPUs of several servers but also takes full advantage of the IO on different servers, which clearly improves query performance. But distribution brings its own problems. In a traditional database, if a Transaction fails it can be rolled back to its original state; in a distributed database, a transaction may need to span multiple database nodes to maintain data integrity, which causes a lot of trouble for developers. In addition, when a relational database has many table joins, distributed queries can involve a large amount of data movement, which obviously degrades performance. Non-relational databases therefore weaken or even drop transactions and multi-table joins. According to the CAP theorem, in a distributed environment we must choose between consistency and availability once partition tolerance is required; it helps to understand NoSQL databases as sacrificing C for A and P in order to keep the architecture extensible and partition tolerant. A popular reading in the industry follows from this: relational databases choose consistency and availability, while NoSQL databases differ among themselves — HBase chooses consistency and partition tolerance, and Cassandra chooses availability and partition tolerance.

This article focuses on the techniques and performance of partitioning in a non-relational database, using MongoDB as an example, as discussed in the following sections.

MongoDB Sharding Principle

MongoDB supports horizontal scaling through sharding and high availability (HA) through Replication. The two techniques can be used separately, but in large enterprise database applications they are often used together.

MongoDB Sharding

Let's start with a brief overview of how sharding works in MongoDB. As the word suggests, sharding means dividing the data in a database table into several groups according to certain boundaries and putting each group on a separate MongoDB server. Take user data as an example: suppose you have a table storing basic user information and, because your application is very popular, it has accumulated hundreds of millions of users in a short period of time, so that queries against this table usually take a long time and the users table becomes the performance bottleneck of your application.

The obvious remedy is to split the user table. Assuming the user table has an age field, we can do a simple split and place the data on different servers by user age, in units of 20 years: users under 20 go to server1, users between 20 and 40 go to server2, users between 40 and 60 go to server3, and users over 60 go to server4 (we will discuss later whether this split makes sense). In this case the user's age is the Shard Key we shard on (more on Shard Key selection later), and server1 through server4 are the four shard servers in the cluster.

Ok, the shard cluster is there and the data has been split. How do we send requests to the four shard servers when a user issues a query? For example, suppose the query condition is users between 18 and 35 years old. Such a query should be sent only to server1 and server2, because they store the data for users under 40; we do not want it sent to the other two servers, since they would not return any results. This is the job of mongos, which can be thought of as the router of the shard cluster, much like the router in a network environment: it forwards each request to the appropriate target servers. With mongos in place, the query above is correctly forwarded to server1 and server2 and never reaches server3 or server4. Mongos analyzes the query based on the user's age (the Shard Key) and sends it to the relevant shard servers.

Besides mongos and the shards, the other required member is the config server, which stores the configuration of all the other members of the shard cluster; mongos consults the config server for the addresses of the other servers in the cluster. This server does not need much horsepower, as it is not used for complex query computation. It is worth noting that as of MongoDB 3.4, the config server must be a replica set. With the above example in mind, a shard cluster can be deployed as follows:

Among them:

  • Shard: each shard server stores a subset of the data; in the user table above, each shard stores the user data for one age group.
  • Mongos: handles requests from the application servers and acts as the interface between the application servers and the shard cluster.
  • Config server: stores the configuration of the shard cluster; it is usually deployed as a replica set.

MongoDB Shard performance analysis

Preparing the environment

Is this server architecture reasonable, and can it keep up with ever-growing data? It would be hard to convince anyone with theory alone. I believe in combining theory with practice, so in this article, besides explaining the theory, there will be concrete examples to verify the theoretical conclusions. Let's build a local runtime environment based on the example above. Thanks to MongoDB's convenience, such a database cluster can be set up on any PC regardless of operating system: any mainstream version of Windows/Linux/Mac can run this environment. In this article I am using MongoDB 3.4.

Creating a MongoDB shard environment with three mongos routers, each connected to several shards, plus a cluster of three config servers, typically requires more than 20 MongoDB server processes. Building that by hand on the command line, even for a very experienced operator, would probably take more than half an hour. Fortunately, there are third-party tools that do this for us; have a look at mtools. It is a command-line tool for creating various MongoDB environments, written in Python, and it can be installed with pip install. For usage details, refer to https://github.com/rueckstiess/mtools/wiki/mlaunch. You can also use the scripts at https://github.com/zhaoyi0113/mongo-cluster-docker to set up the environment with Docker.
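
For reference, a minimal install sketch (assuming Python and pip are already available; depending on the mtools version, mlaunch may also need the pymongo and psutil packages):

pip install mtools            # the mtools toolbox, which includes mlaunch and mgenerate
pip install pymongo psutil    # helpers mlaunch relies on, if they are not pulled in automatically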

The following command creates a local MongoDB shard cluster with 1 mongos router, 3 shard replica sets of 3 nodes each, and 3 config servers, for a total of 13 processes.

mlaunch init --replicaset --sharded 3 --nodes 3 --config 3 --hostname localhost --port 38017 --mongos 1

After the servers are created, we can connect to mongos and check the shard status; the port is 38017, as specified above.

mongos> sh.status()
--- Sharding Status ---
  ...
  shards:
    { "_id" : "shard01", "host" : "shard01/localhost:38018,localhost:38019,localhost:38020", "state" : 1 }
    { "_id" : "shard02", "host" : "shard02/localhost:38021,localhost:38022,localhost:38023", "state" : 1 }
    { "_id" : "shard03", "host" : "shard03/localhost:38024,localhost:38025,localhost:38026", "state" : 1 }
  active mongoses:
    "3.4.0" : 1
  ...

You can see that the shard servers we just created have been added to mongos: there are three shard replica sets, each containing three servers. Beyond that, we don't see much shard-related information, because the cluster doesn't hold any data yet and no collection has been sharded.

Data preparation

The first step is data entry. In order to analyze the performance of our server cluster, we need to prepare a lot of user data. Fortunately, mtools provides a tool called mgenerate for this: it can insert arbitrary JSON documents into MongoDB based on a data template. The following JSON structure is the data template we will use in our example:

{
    "user": {
        "name": {
            "first": {"$choose": ["Liam", "Aubrey", "Zoey", "Aria", "Ellie", "Natalie", "Zoe", "Audrey", "Claire", "Nora", "Riley", "Leah"] },
            "last": {"$choose": ["Smith", "Patel", "Young", "Allen", "Mitchell", "James", "Anderson", "Phillips", "Lee", "Bell", "Parker", "Davis"] }
        }, 
        "gender": {"$choose": ["female", "male"]},
        "age": "$number", 
        "address": {
            "zip_code": {"$number": [10000, 99999]},
            "city": {"$choose": ["Beijing", "ShangHai", "GuangZhou", "ShenZhen"]}
        },
        "created_at": {"$date": ["2010-01-01", "2014-07-24"] }
    }
}

Save it in a file called user.json and use mgenerate to insert a batch of random documents (one million in the command below). The random data follows the format defined in the JSON template above, and you can insert a different number of documents by adjusting the --num argument (see the mgenerate wiki).

mgenerate user.json --num 1000000 --database test --collection users --port 38017

The above command inserts one million documents into the users collection in the test database. On some machines it may take a while to run, because generating a million documents is a time-consuming operation. The reason for generating so much data is that later, when we analyze performance, the differences will be clearly visible. Of course, you can generate only 100,000 documents for testing, as long as you can see a difference in the execution time of different find statements on your machine.

After inserting the data, we want to see how it is distributed across the cluster. Normally you can run the sh.status() shell command to view this, but for a brand-new cluster we won't see much useful information until a collection has been sharded. However, you can see how the data is distributed by explaining a query. I have to emphasize how much a good IDE can affect productivity when doing this kind of performance analysis; the main reason I chose dbKoda as my MongoDB IDE is that it currently offers the most complete support for the MongoDB shell, which matters especially for developers unfamiliar with shell commands. Conveniently, the IDE also runs on Windows/Mac/Linux, covering almost all operating systems. (For how to use explain, please refer to my article on improving query performance with MongoDB's built-in explain feature.) The figure below shows the explain results for a find on the one-million-document collection we just created.
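
For reference, the kind of explain call behind the screenshot looks roughly like this; it is a sketch against the test.users collection created above, not necessarily the exact query shown in the figure (run it in a mongo shell connected to mongos):

// executionStats reports, per shard, how many documents were examined and how long the query took
db.getSiblingDB('test').getCollection('users').find({ 'user.age': { $gte: 18 } }).explain('executionStats')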

(Figure: explain output for a find on the newly created users collection)

As you can see from the figure above, all one million documents we inserted were placed on the first shard, which is not what we want. Don't worry: we haven't sharded the collection yet, so MongoDB has no reason to distribute the data automatically. Let's walk step by step through how to use sharding for efficient data queries.

Configure the Shard database

With the environment set up and the data ready, the next step is to configure the database and shard the data. For convenience, we divide the users into three groups: junior (under 20), middle (between 20 and 40), and senior (over 40). To save space, I will not explain each MongoDB command in detail here. Our data will be split into chunks based on user age and distributed across the shard replica sets. If you are not familiar with the following commands, check the official MongoDB documentation for explanations of shard zones and chunks.

db.getSiblingDB('test').getCollection('users').createIndex({'user.age':1})
sh.setBalancerState(false)

sh.addShardTag('shard01', 'junior')
sh.addShardTag('shard02', 'middle')
sh.addShardTag('shard03', 'senior')

sh.addTagRange('test.users', {'user.age': MinKey}, {'user.age':20}, 'junior')
sh.addTagRange('test.users', {'user.age': 21}, {'user.age':40}, 'middle')
sh.addTagRange('test.users', {'user.age': 41}, {'user.age': MaxKey}, 'senior')

sh.enableSharding('test')
sh.shardCollection('test.users', {'user.age':1})
sh.setBalancerState(true)

As the commands above show, we first create an index on the Shard Key and then disable the Balancer, because we do not want the Balancer to run while the collection is being sharded. The data is then divided into three groups by age, tagged junior, middle, and senior, and each tag is assigned to one of the three shard replica sets. If the shardCollection command succeeds, it returns { "collectionSharded" : "test.users", "ok" : 1 }.

A few things to note about Shard

  • Once you shard a collection, the Shard Key you selected and its values become immutable, which means:
  • You cannot re-select the Shard Key for that collection
  • You cannot update the value of the Shard Key field in a document

Then don't forget that we also need to turn the Balancer back on: sh.setBalancerState(true). The Balancer will run for a while, because it has to redistribute the grouped data to the designated shard servers; you can check whether it is still running with sh.isBalancerRunning(), which returns true while chunks are being migrated among the shards. Now is the time to take a coffee break or look out the window.
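
While you wait, here are the relevant shell helpers for reference (run them on mongos):

sh.setBalancerState(true)   // re-enable the balancer after the collection has been sharded
sh.getBalancerState()       // true if the balancer is enabled at all
sh.isBalancerRunning()      // true while chunk migrations are in progress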

To understand how the data is distributed among the three shard replica sets, we need to look at how chunks and zones are divided. The following figure shows the shard cluster statistics in dbKoda: the data is divided into six chunks, and each shard stores two of them.

(Figure: shard cluster statistics in dbKoda, showing six chunks with two chunks per shard)

Some readers may wonder why our data ends up in 6 chunks, with 2 chunks per shard, and who makes sure the data is distributed evenly. Let me explain these concepts and how we can use them.

Chunk

We already know that MongoDB uses the shard key to split data, and the split data is divided into chunks. A chunk can be thought of as a subset of the data on a shard server; based on the shard key, each chunk has an upper and a lower boundary, which in our case is the user's age. A chunk also has a size, which changes as data is inserted through mongos. The default maximum chunk size is 64MB, and MongoDB lets you change it; you can also split a chunk into smaller chunks or merge several chunks. In general, I don't recommend manually changing the chunk size, or splitting and merging chunks through mongos, unless there is a good reason to do so.

One reason is that as data keeps flowing into the cluster, chunk sizes change constantly; when a chunk exceeds the maximum size, MongoDB splits it according to the shard key, dividing it into several smaller chunks if necessary. In most cases this automatic behavior meets day-to-day business needs without manual intervention. Another reason is that splitting chunks manually tends to make the data distribution uneven, at which point the Balancer is invoked to redistribute the data; in many cases this runs for a long time and causes hidden load inside the cluster, so manual splitting is not recommended. Of course, understanding how chunks are allocated is a prerequisite for analyzing database performance. I won't go into detail here on how to operate on chunks; interested readers can refer to the official MongoDB documentation for a more complete explanation. I will just highlight a few points to keep in mind when working with chunks, all of which matter for MongoDB performance (a sketch for inspecting chunk metadata follows the list):

  • A large number of small chunks keeps your data evenly distributed across the shards, but may cause frequent data migrations, which increases the load at the mongos level.
  • Large chunks mean less data migration, less network overhead, and less load on the mongos routing layer, but the downside is that data may end up unevenly distributed across the shards.
  • The Balancer runs automatically when data is unevenly distributed. How does it decide that a migration is needed? When the difference in chunk counts between shard replica sets exceeds a threshold, the Balancer kicks in automatically.
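
If you prefer to inspect the chunk metadata yourself instead of relying on an IDE, here is a minimal sketch against the config database (run on mongos; the metadata layout shown is the MongoDB 3.4 one used in this article):

// List the chunks of test.users with their shard-key ranges and owning shards.
db.getSiblingDB('config').getCollection('chunks').find(
    { ns: 'test.users' },
    { min: 1, max: 1, shard: 1, _id: 0 }
).sort({ min: 1 })

// The maximum chunk size (in MB) can be changed through the config database, should you really need to.
db.getSiblingDB('config').getCollection('settings').save({ _id: 'chunksize', value: 64 })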

Zones

Chunks are the smallest unit MongoDB uses to migrate data among shards. Sometimes, though, data does not end up distributed the way we expect: in the example above, even though we chose the user's age as the shard key, MongoDB would not by itself place the data exactly where we want it; that is what Zones are for. Zones describe the relationship between shards and ranges of the shard key. Each group of key ranges is called a Zone, and each Zone is assigned to a shard server; a shard can host one or more Zones, provided the Zones do not overlap. When the Balancer runs, it migrates chunks belonging to a Zone to the shards associated with that Zone.
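
The zone assignments created earlier with sh.addShardTag and sh.addTagRange are themselves stored in the config database, so they can be inspected directly (a small sketch, run on mongos):

db.getSiblingDB('config').getCollection('shards').find({}, { _id: 1, tags: 1 })   // which zones (tags) each shard owns
db.getSiblingDB('config').getCollection('tags').find()                            // the shard-key range of each zone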

Once we understand these concepts, we have a much clearer picture of how the data is allocated, and a full explanation of the question raised above. On the surface the data looks evenly distributed, so let's run a few queries and see how they perform. Again, we use the Explain view in dbKoda.

(Figure: dbKoda explain output for a query on users older than 18)

The figure above shows a search for users older than 18. According to our grouping, all three shards hold matching records, but the age range assigned to shard01 is under 20, so it should contribute only a small amount of data. And indeed, in the shard table in the figure, shard01 returns 9904 records, far fewer than the other two shards, which matches our data definition. As the performance breakdown shows, the statement also takes comparatively little time on shard01.

Now look at the next example: if we search for users over the age of 25, shard01 does not appear in the results at all, which is again consistent with our data allocation, because shard01 only stores users under the age of 20.

(Figure: dbKoda explain output for a query on users older than 25; shard01 is not targeted)

Did you choose the right Shard Key?

Now that we know how the data is distributed, let's go back and ask whether our choice of shard key makes sense. Careful readers will have noticed one problem in the explain results above: shard03 stores a disproportionately large amount of data. If we count the records in each age group, we find that shard01, shard02, and shard03 hold 198554, 187975, and 593673 documents respectively; clearly, the majority of users are over 40. This is not what we want, because shard03 becomes a bottleneck in the cluster: as the explain results show, statements run noticeably slower on shard03 than on the other two shards, with a query execution time more than twice that of the others. More importantly, as the number of users keeps growing, the distribution will keep shifting; after the system runs for a while, shard02 may overtake shard03, or shard01 may become the shard storing the most data. Such imbalance is undesirable. Why does it happen? Could it be that the user's age is simply not an ideal grouping key? Then what kind of key does guarantee an even distribution? Let's look at the types of shard keys.
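
Before we do, a quick aside: the per-group counts above can be reproduced with an aggregation along these lines. This is only a sketch; the bucket boundaries roughly follow the zone ranges defined earlier, and the exact numbers depend on the randomly generated data:

db.getSiblingDB('test').getCollection('users').aggregate([
    { $bucket: {
        groupBy: '$user.age',
        boundaries: [0, 20, 41, 200],          // roughly the junior / middle / senior ranges
        default: 'other',
        output: { count: { $sum: 1 } }
    } }
])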

Ranged Shard Key

This is the type of shard key we used above when grouping by age. Based on the value of the shard key, the data is split into contiguous ranges, and records with similar values are stored on the same shard. The advantage is efficient range queries on the key: when a query reaches mongos, mongos can quickly determine the target shards and does not need to broadcast the statement to every shard; usually only a few shards are involved in answering the query. The disadvantage is that data may not be evenly distributed, and serious performance bottlenecks can appear when data is inserted or modified.

Hashed Shard Key

The counterpart of the Ranged Shard Key is the Hashed Shard Key. It uses a hashed index on the field as the shard key, so that data is spread evenly across the shards: mongos and the shards agree on the hash function, and data is placed on (and migrated to) shards according to the hashed value. The trade-off is routing: an equality match on the shard key can still be sent to a single shard, but range queries usually have to be broadcast to all shards.

Given the above, how do we choose between a Ranged and a Hashed shard key? The following properties of a field are the key factors in choosing a shard key.

Shard Key Cardinality

Cardinality refers to the number of distinct values a shard key can take, which can also be seen as the maximum number of chunks the Balancer can ever create. Take our age field: if no user is older than 100, the field can take roughly 100 different values, and all documents with the same age must live in the same chunk. If you choose a shard key with very low cardinality, say only 4 distinct values, the data can be spread across at most 4 shards. Such a structure also rules out further horizontal scaling, because no data would ever be moved to a fifth shard.

Shard Key Frequency

Frequency refers to how often a given shard key value occurs in the data. If most documents share the same shard key value, the chunk that stores them becomes a bottleneck for the database; worse, such a chunk cannot be split any further, which seriously limits horizontal scaling. In this case you should consider building the shard key on a compound index. In general, prefer low-frequency fields as shard keys.
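
For example, a hypothetical compound shard key that appends _id to the age field would keep documents sharing the same age splittable across chunks. This is only a sketch of the idea, not something done in this article (and an already-sharded collection cannot simply be re-sharded this way):

db.getSiblingDB('test').getCollection('users').createIndex({ 'user.age': 1, _id: 1 })
sh.shardCollection('test.users', { 'user.age': 1, _id: 1 })   // compound ranged shard key: age first, _id as a tie-breaker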

Shard Key Monotonicity

Monotonic growth here means that, after the collection is sharded, newly inserted data is placed on shards according to its shard key value; if new keys keep increasing toward the maximum, all new data lands on the same shard. Take the user age grouping above: if all new users of the system are over 40, shard03 will receive every new user and become the performance bottleneck of the system. In such cases a Hashed Shard Key should be considered.

Redesign the Shard Key

From the above analysis we can conclude that the user age field in the previous example is a poor choice of shard key, for several reasons:

  • A user's age is not fixed. Since the shard key field becomes immutable once it is chosen, age is obviously inappropriate; after all, there is no user whose age never increases.
  • Different systems attract different age distributions: games and entertainment apps may attract more young people, while medical and health-care applications may draw more elderly users. From this point of view, splitting by age is also inappropriate.
  • Choosing age also ignores future user growth. The ages may be evenly distributed when the data is first split, but the distribution can become skewed after the system has run for a while, which makes data maintenance much harder.

So how should we choose? Looking at the attributes of the user table, there is a created_at field, the timestamp of when the record was created. Used as a ranged key it would suffer from monotonic growth, since it always increases in the direction of data growth; but the field has very few duplicate values, that is, high cardinality and low frequency, so a Hashed Shard Key on created_at becomes a good alternative.

Unfortunately, we can’t change the Shard key. The best way is to back up the data and recreate the shard cluster. I won’t repeat the creation and data preparation process, but you can do it yourself based on the previous examples.

In the setup below, I recreate the users collection and shard it with a Hashed Shard Key on created_at; note that created_at must have a hashed index to serve as a hashed shard key. A sketch of the commands follows, and after that the result of a query against the new user table.
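
A minimal sketch of what that setup looks like; this is my reconstruction under the assumption of a fresh, empty test.users collection, not the exact commands in the screenshot:

sh.enableSharding('test')                                                                      // if the database is not already enabled for sharding
db.getSiblingDB('test').getCollection('users').createIndex({ 'user.created_at': 'hashed' })    // hashed index on the new key
sh.shardCollection('test.users', { 'user.created_at': 'hashed' })                              // shard on the hashed created_at field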

(Figure: explain output for a query against the users collection sharded on the hashed created_at key)

As the figure shows, the explain results indicate that the three shards now hold roughly equal amounts of data, and the execution time is also roughly even across all three, between 500 and 700-odd milliseconds. Remember the earlier results, where the query on the data-heavy shard took more than twice as long as on the other shards? Overall performance has improved significantly.

Choose the perfect Shard Key

There are many factors to consider when choosing a Shard key, some technical, some business. Generally speaking, the following points should be noted:

  • All insert, delete, update, and query operations can be spread across the shard servers in the cluster
  • Any single operation needs to be sent only to the shards relevant to it; for example, a delete should not be sent to a shard that does not contain the data to be deleted

In fact, there is no perfect shard key, only factors to weigh when choosing one. No single shard key will handle all inserts, deletes, updates, and queries equally well. You need to extract the relevant factors from the application scenario in front of you, weigh them, and make a final choice: for example, does your application do more reads or more writes? What are the most common write scenarios?

Summary

There is no universal method for selecting a shard key; it should be designed around the specific requirements and the direction in which the data will grow. In day-to-day development, not every technical problem should be solved by engineers alone: we live in a business-driven era, technology mainly serves the business, and we have to keep up with the speed at which requirements change. As for choosing the shard key discussed in this article, I think it is more important to discuss with the business side what each data field means, how it is used, and where future business growth will come from, than to make a purely technical decision. If you choose the wrong shard key at the beginning, changing it later can be an extremely tedious process: you may need to back up the collection, recreate the sharded setup, and restore the data, which is likely to take a long time. In today's world of Internet applications, where downtime is measured in seconds, a wrong shard key choice can have disastrous consequences for your application. I hope this article gives you a bit of inspiration to consider all these factors fully in the initial design stage of a project.

References

  1. MongoDB Sharding: docs.mongodb.com/manual/shar…
  2. Shard Keys: docs.mongodb.com/manual/core…
  3. Next Generation Databases: NoSQL and Big Data, Guy Harrison
  4. dbKoda: www.dbkoda.com
  5. MongoDB Docker cluster scripts: github.com/zhaoyi0113/…
  6. CAP theorem: en.wikipedia.org/wiki/CAP_th…

About the author:

Zhao Yi graduated from Beijing Institute of Technology and now works at SouthbankSoftware on NoSQL and MongoDB development. He has served as a project developer and technical lead at GE, ThoughtWorks, and Yuanqi Hare, working on a wide variety of projects spanning Web, mobile, medical devices, social networking, and big data storage.

Thanks to CAI Fangfang for proofreading this article.