• 原文 : Performance Best Practices: Transactions and Read/Write Concerns
  • By Mat Keep and Henrik Ingo
  • The Nuggets translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: Miigon
  • Proofread by: Kimberly

MongoDB Performance Best Practices: Transactions and Read/Write Concerns

This is the fifth post in a series covering MongoDB performance best practices.

In this series of articles, we take a multi-dimensional look at key considerations for achieving high performance in large data volume scenarios, including:

  • Data modeling and memory sizing (the working set)
  • Query patterns and profiling
  • Indexing
  • Sharding
  • Transactions and read/write concerns (the topic of this article)
  • Hardware and operating system configuration
  • Benchmarking

Single document atomicity

In a normalized relational design, related data has to be split across multiple independent parent-child tables. In MongoDB, by contrast, the document model lets such related data be stored together. MongoDB's single-document operations provide atomic semantics that meet the needs of most applications.

An operation can modify one or more fields, including updating multiple subdocuments or array elements. MongoDB guarantees complete isolation when updating a single document; any error rolls back the entire operation, so users always see consistent document data.

Arrival of multi-document ACID transactions

Support for multi-document ACID transactions was added in MongoDB 4.0, making it easier for developers to use MongoDB for an even wider variety of use cases. In 4.0, the scope of a transaction was limited to a single replica set; the subsequent 4.2 release extended multi-document transactions to sharded clusters.

MongoDB's transactions feel very similar to those of relational databases: multi-statement, familiar syntax, and easy to integrate into any application. With snapshot isolation, transactions ensure data consistency, provide an "all or nothing" execution model, and have no performance impact on operations that do not use them.

For more information on transaction performance, check out our benchmark results published in a paper at the VLDB conference.

Next we’ll discuss how to make better use of transactions in your projects.

Best practices for multi-document transactions

Creating long-running transactions, or attempting a very large number of operations in a single ACID transaction, can put pressure on the WiredTiger storage engine's cache. This is because the cache must hold state for all write operations since the transaction's snapshot was created. Because a transaction uses the same snapshot throughout its lifetime, writes to the collection during the transaction accumulate in the cache, and they cannot be flushed until the transaction commits or aborts and its locks are released.

In order to maintain stable and predictable database performance, developers need to pay attention to the following:

Transaction runtime

By default, MongoDB automatically aborts any multi-document transaction that runs for more than 60 seconds. This limit can be adjusted via the transactionLifetimeLimitSeconds server parameter, for example on servers with low write volumes.

To avoid transaction timeouts, large transactions should be broken up into smaller transactions that can complete within the configured time limit. Also, to keep query time down, make sure queries inside the transaction are covered by appropriate indexes.

Number of operations in a transaction

There is no hard limit on the number of documents a transaction can read. As a best practice, however, a single transaction should generally modify no more than 1,000 documents.

If an operation needs to modify more than 1,000 documents, developers should split it across multiple transactions that each process a subset of the documents.
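The splitting guidance above can be sketched as a simple batching helper. This is an illustrative sketch, not a driver API: the names (`chunked`, `apply_in_batches`, `BATCH_LIMIT`) and the `process_batch` callback are hypothetical; in real code the callback would open a session, run the updates, and commit one transaction per batch.

```python
# Sketch: split a large modification across multiple transactions of at
# most 1,000 documents each. Names and the process_batch callback are
# illustrative placeholders, not a real MongoDB driver API.

BATCH_LIMIT = 1000  # best-practice ceiling on documents modified per transaction

def chunked(doc_ids, batch_size=BATCH_LIMIT):
    """Yield successive batches of at most batch_size document ids."""
    for start in range(0, len(doc_ids), batch_size):
        yield doc_ids[start:start + batch_size]

def apply_in_batches(doc_ids, process_batch):
    """Run process_batch once per chunk, each inside its own transaction."""
    batches = 0
    for batch in chunked(doc_ids):
        process_batch(batch)  # e.g. start session, apply updates, commit
        batches += 1
    return batches
```

Each batch then stays comfortably inside both the 1,000-document guideline and the 60-second runtime limit.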

Distributed cross-shard transactions

Transactions that span multiple shards incur a higher performance cost, because the participating operations must coordinate across multiple nodes over the network.

The snapshot read concern level is the only isolation level that provides a consistent snapshot of data across shards. When low latency matters more than cross-shard read consistency, use the default local read concern level, which runs the transaction against a local snapshot on each shard.

Exception handling

When a transaction aborts, an exception is returned to the caller and the transaction is fully rolled back. Developers should implement exception handling and retry logic for transient errors such as MVCC write conflicts, temporary network failures, and primary elections.

With retryable writes enabled, MongoDB drivers automatically retry the transaction's commit operation.
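The retry pattern described above can be sketched generically. This is a self-contained illustration, not driver code: `TransientError` and the backoff parameters are assumptions standing in for the error labels (such as "TransientTransactionError") that real MongoDB drivers attach to retryable failures.

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure (write conflict, brief network blip).

    Real MongoDB drivers mark such failures with error labels like
    "TransientTransactionError"; this sketch uses an exception type to
    stay self-contained.
    """

def run_with_retry(txn_fn, max_attempts=3, backoff_s=0.01):
    """Run txn_fn, retrying on transient errors with simple linear backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(backoff_s * attempt)  # back off before retrying
```

A caller would wrap the whole transaction body in `txn_fn`, so a retry re-runs the transaction from the start against a fresh snapshot.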

Benefits for write latency

While it may not be obvious at first glance, using multi-document transactions can actually improve write performance by reducing commit latency.

With the w:majority write concern, suppose 10 update statements execute independently: each one waits for its own replication round trip between replica set members.

If, however, the same 10 updates run inside a single transaction, they are replicated together once, when the transaction commits, cutting that latency by a factor of 10!
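The arithmetic above can be written out as a back-of-the-envelope model. The 5 ms replication round trip is an assumed figure for illustration only; only the 10x ratio comes from the text.

```python
# Toy latency model for the example above. The 5 ms replication round
# trip is an assumed figure, chosen only to make the ratio concrete.

REPLICATION_RTT_MS = 5   # assumed primary -> secondary ack round trip
NUM_UPDATES = 10

# Each standalone w:majority update waits for its own replication ack.
standalone_latency_ms = NUM_UPDATES * REPLICATION_RTT_MS      # 50 ms

# Inside one transaction, replication happens once, at commit.
transactional_latency_ms = 1 * REPLICATION_RTT_MS             # 5 ms

assert standalone_latency_ms // transactional_latency_ms == NUM_UPDATES
```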

What else do I need to know?

You can find all of the best practices in the MongoDB multi-document transactions documentation; refer to its Production Considerations section for performance-related guidance.

Choose the appropriate write concern level

MongoDB lets you specify a durability guarantee, called the write concern, when issuing write operations to the database.

Note that write concern applies to any write operation, whether it is an ordinary operation on a single document or part of a multi-document transaction.

The following options can be configured per connection, per database, per collection, or even per individual operation:

  • Write acknowledged: This is the default write concern. mongod confirms the write operation, allowing the client to catch errors such as network errors, duplicate key errors, and schema validation errors.
  • Journal acknowledged: mongod confirms the write only after it has been committed to the primary's journal. This ensures the write is persisted to disk and can survive a mongod crash.
  • Replica acknowledged: With this option, a write is considered successful only after other members of the replica set have acknowledged it. MongoDB supports writing to a specified number of replicas. This option also ensures the write is committed to the secondaries' journals. Because replicas can be deployed across racks or even data centers, propagating writes to additional replicas provides extremely high durability.
  • Majority: This write concern waits for the write to be applied to a majority of the data-bearing, electable members of the replica set, so the write will not be rolled back if a primary election occurs. It also ensures the write is journaled on those replicas, including the primary.
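In commands and driver options, these levels are expressed as a writeConcern document with `w` and `j` fields. The snippet below shows the shapes; the collection name "orders" and the document contents are made up for illustration.

```python
# Illustrative writeConcern documents as they appear in MongoDB commands
# and driver options. The "orders" collection and its documents are
# invented for this example.

wc_default  = {"w": 1}                      # acknowledged by the primary
wc_journal  = {"w": 1, "j": True}           # also journaled on the primary
wc_replicas = {"w": 3}                      # acknowledged by 3 members
wc_majority = {"w": "majority", "j": True}  # majority of electable, data-bearing members

# A write command carries its write concern alongside the payload:
insert_cmd = {
    "insert": "orders",
    "documents": [{"sku": "abc", "qty": 2}],
    "writeConcern": wc_majority,
}
```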

Choose the appropriate read concern level

As with write concern, read concern can be applied to any request made to the database, whether it is a read of a single document or part of a multi-document transaction.

To ensure isolation and consistency, readConcern can be set to majority, which means data is returned to the application only once it has been replicated to a majority of the nodes in the replica set, and therefore cannot be rolled back when a new primary is elected.

MongoDB also supports a linearizable read concern level. It ensures that the node was still the primary of the replica set at the time of the read, and that the returned data will not be rolled back even if another node is subsequently elected primary. Using this read concern can have a significant impact on latency, so you should supply a maxTimeMS value to time out operations that run too long.
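On the wire, read concern appears as a readConcern document on read commands, with maxTimeMS bounding expensive linearizable reads. The collection name, filters, and the 10-second timeout below are illustrative values, not recommendations.

```python
# Illustrative readConcern usage on find commands. The "orders"
# collection, the filters, and the 10-second maxTimeMS are invented
# values for this example.

find_majority = {
    "find": "orders",
    "filter": {"status": "paid"},
    "readConcern": {"level": "majority"},
}

find_linearizable = {
    "find": "orders",
    "filter": {"_id": 42},                    # linearizable reads target a single document
    "readConcern": {"level": "linearizable"},
    "maxTimeMS": 10000,                       # bound the extra latency this level can incur
}
```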

Use causal consistency only when necessary

Causal consistency guarantees that every read in a client session sees the result of the most recent write, regardless of which replica serves the request. Use causal consistency only where such monotonic read guarantees are needed, to limit its latency impact.

The next article

That wraps up this week's performance best practices. Next in this series: hardware and operating system configuration.
