An overview of the
MongoDB is a general purpose, document-oriented distributed database [^1], which is the official description of MongoDB. Different from traditional relational databases such as MySQL, Oracle and SQL Server, one of the most important features of MongoDB is “document-oriented”. Due to the different data storage methods, the interface provided by MongoDB is no longer the well-known SQL, so it is divided into NoSQL. NoSQL is the opposite of SQL, many of the most familiar storage systems are divided into NoSQL, such as Redis, DynamoDB[^2], Elasticsearch, etc.
sql-and-nosq
NoSQL is often understood as having NoSQL or non-relational [^3], but others have interpreted it as something more than SQL [^4]. Digging into the meaning and origins of the term may Not make much sense. This secondary interpretation is often used for marketing purposes, but it is important to note that MongoDB stores data in a completely different way than traditional relational databases.
The architecture of MongoDB is very similar to that of MySQL. Both of them use pluggable storage engines at the bottom to meet different needs of users. Users can choose different storage engines according to data characteristics.
mongodb-architecture
As the default storage engine for MongoDB, WiredTiger uses B-trees as the underlying data structure for indexing, but in addition to b-trees, it also supports an optional LSM tree, log-structured merge-tree, as the underlying storage structure. You can create an LSM tree-based Collection in MongoDB using commands like this [^6]:
db.createCollection(
"posts",
{ storageEngine: { wiredTiger: {configString: "type=lsm"}}}
)
Copy the code
In this article, we will not only introduce why the default MongoDB storage engine WiredTiger chooses to use B tree rather than B+ tree, but also compare the performance and application scenarios between B tree and LSM tree to help readers understand today’s problem more comprehensively.
design
In order to compare the differences between two different data structures and B trees, here we will explain why B+ trees and LSM trees are not the default WiredTiger data structures:
- As a non-relational database, MongoDB’s requirement for traversing data is not as strong as that of relational database, and it pursues the performance of reading and writing single records.
- Most OLTP databases face the scenario of read more and write less. B tree and LSM tree have greater advantages in this scenario.
Both of the above scenarios are faced and solved by MongoDB, so we will compare different data structures in these two common scenarios.
non-relational
In fact, we have mentioned many times above that MongoDB is a non-relational document database. After it completely abandons the relational database system, it is very free in design and implementation. It no longer needs to follow THE SQL and relational database system, and can be more free to optimize specific scenarios. Traversing data in MongoDB’s hypothetical scenario is not a common requirement.
mysql-innodb-b-plus-tree
MySQL uses B+ trees because only leaf nodes in B+ trees store data, and sequential traversal can be achieved by connecting each leaf node in the tree with Pointers. Traversal data is very common in relational databases, so this choice is not a problem at all [^7].
The ultimate purpose of MongoDB and MySQL to choose between multiple different data structures is to reduce the random I/O times required by queries. MySQL believes that queries traversing data are common, so it chooses B+ tree as the underlying data structure and abandons the feature of storing data through non-leaf nodes. But MongoDB faces a different problem:
mongodb-wiredtiger-b-tree
Although the query of traversing data is relatively common, MongoDB believes that the query of single data record is far more common than that of traversing data. Since the non-leaf nodes of B tree can also store data, the average random I/O times required to query a data are less than that of B+ tree. MongoDB using B trees can perform queries faster than MySQL in similar scenarios. This is not to say that MongoDB cannot iterate over data, we can also use ranges in MongoDB to query a batch of records that meet the corresponding criteria, but it takes longer than MySQL.
SELECT * FROM comments WHERE created_at > '2019-01-01'
Copy the code
A lot of people might think of a range query like this when they see a query traversing data, but in a relational database it’s more common to see SQL like this — query for foreign keys or all records where a field is equal to a value:
SELECT * FROM comments WHERE post_id = 1
Copy the code
The above query is not a range query. It does not use expressions like > and <, but it does query a series of records in the Comments table. If the comments table has an index post_id, the query might iterate through the index to find the comment that meets the criteria. This query also benefits from the MySQL B+ tree’s interconnected leaf nodes, as it reduces the number of random I/OS to disk.
MongoDB as a non-relational database, it uses a completely different method from the set design. If we still use the traditional table design ideas of relational database to think about the set design in MongoDB, write similar queries as shown above will bring relatively poor performance:
db.comments.find( { post_id: 1 } )
Copy the code
Because all nodes of the B tree can store data, there is no good way to connect each successive node through Pointers, so the performance of the above query in the B tree will be much worse than that in the B+ tree. However, this is not a design method recommended in MongoDB, and the more appropriate way is to use embedded documents. Store post together with all comments that belong to it:
{ "_id": "..." , "title" : "why use mongo B tree", "author" : "draven", "comments" : [{" _id ":"... ", "content" : "you write it not"}, {" _id ": "...", "content": "That's right"}]}Copy the code
Db.comments. Find ({post_id: 1}), we just need to take out the POST to get all the relevant comments, which is different from the traditional relational database design method needs all the developers using MongoDB to rethink, This is one of the biggest reasons why many people use MongoDB and find it doesn’t perform as well as MySQL – using the wrong posture.
Some readers may wonder why MongoDB doesn’t use hashes as the underlying data structure, since it thinks querying individual data records is far more common than traversing data.
datastructures-and-query
If we used hashing, the complexity of the query for all single records would be O(1), but the complexity of traversing the data would be O(n); If you use a B+ tree, the complexity of a single record query is O(log n) and traversal data is O(log n) + X. The two different data structures, one providing the best performance of a single record query and the other providing the best performance of traversal data, However, these can not meet the scenarios faced by MongoDB — single record query is very common, but the traversal of data also needs relatively good performance support. Hashing, a data structure with relatively extreme performance, can only be used in simple and extreme scenarios.
Read more to write less
The LSM tree is a disk-based data structure designed primarily to provide a low-cost indexing mechanism for files that require frequent writes over time [^8]. Whether B tree or B+ tree, writing records to index files composed of these data structures requires random disk writes. The optimization logic of LSM tree is to sacrifice part of the read performance and convert random writes to sequential writes to optimize data writes.
We won’t go into the details of why LSM trees have better write performance in this article, but rather why WiredTiger uses B trees as the default data structure. WiredTiger benchmarks read and write throughput for LSM and B tree performance [^9]. The benchmark results are as follows:
LSM_btree_Throughput
- Without limiting writes;
- The write performance of LSM tree is 1.5 ~ 2 times that of B tree.
- The read performance of LSM tree is 1/6 ~ 1/3 of that of B tree.
- In the case of restricted writes;
- The write performance of LSM tree is almost the same as that of B tree.
- The read performance of LSM tree is 1/4 ~ 1/2 of that of B tree.
In the case of restricted writes, 30,000 pieces of data are written per second, and the analysis here shows that B trees perform much better than LSM trees in either case. For most OLTP systems, the system queries many times more than writes, so the LSM tree’s excellent write performance does not make it the default data format for MongoDB.
conclusion
Application scenarios are always the first problem to be considered in system design. As NoSQL MongoDB, its target scenarios are quite different from earlier databases. Let’s briefly summarize the two reasons why MongoDB finally chooses to use B tree:
- MySQL uses B+ trees because traversal of data is very common in relational databases. It often needs to deal with relationships between tables and query some data by scope. However, as a document-oriented database, MongoDB attaches more importance to the document-centered organization mode than the relationship between data, so it chooses b-tree with better performance to query a single document, which can also ensure acceptable delay for the query of traversing data.
- LSM tree is a data structure specially used to optimize write. It changes random write to sequential write, which significantly improves write performance but sacrifices read efficiency, which is not matched with the characteristics required by most scenarios. Therefore, MongoDB finally chooses B tree with better read performance as the default data structure.