Distributed transactions are an inevitable problem in distributed architecture projects. This is one of the most common questions asked in job interviews. The following is a summary of my study. Not in great detail, but as a summary note.
Typical schemes
With respect to distributed transactions, the engineering field talks about strongly consistent and ultimately consistent solutions. Typical solutions include:
- Two-stage submission (2PC) scheme
- EBay event queue scheme
- TCC compensation mode
- Final consistency of cached data
Theoretical support
The purpose of distributed transaction is to ensure the consistency of sub-database data, but cross-database transaction will encounter various impossible problems. Therefore, a distributed transaction solution is needed to ensure data consistency. The famous CAP theory determines that when solving the consistency problem, in fact, the system availability and partition tolerance also need to be considered together.
Theory of CAP
In distributed systems:
- Consistency
- Availability
- Partition Tolerance
Three elements can only satisfy two at the same time, not both, among which partition tolerance is everyone will not give up.
Consistency model
There are three types of consistency models:
- Strong consistency (data in all partitions is consistent at any time)
- Weak consistency (after data is updated, there is no guarantee that all systems will be able to read the latest value or how long it will take to read it)
- Final consistency (After data is updated, the system does not promise to return the most recently written value immediately, but guarantees that all partitions can be updated to this value)
Distributed solutions
1. 2PC scheme – Strong consistency
Implementation principle: through the submission of phases and logging. Wait until all phases of transactions are ready to commit successfully before all commit. Logs are recorded during the preparation process. If there is no successful phase, log back processing can be used. Disadvantages: Synchronous blocking, suitable for database level 3PC has made some improvements, adding timeout mechanism. But few have done it
2. EBay event queue scheme – final consistency
The architects at eBay first suggested it. The core idea is to execute distributed tasks asynchronously in the form of messages or logs, cache the messages and logs to local files, databases or message queues, and then retry or cancel the tasks according to service rules. The interface is required to be idempotent. Implementation scheme: local message table, MQ
Try-confirm-cancel (TCC) compensation mode — final consistency
The TCC model has a transaction manager that records the TCC global transaction state. In each service, the business side of the service is required to record the invocation link and perform the rollback after failure.
4. Cache data – Final consistency
In a business system, the cache acts as a buffer for the database and synchronizes data from the database to the cache. Inconsistencies can occur in this process. 1. Set the expiration time for cached data. This expiration time is the final consistent tolerance time that the system can achieve. 2. Clear cached data after updating database data.
This article is published by OpenWrite!