First, what is a transaction?

A transaction groups all the operations of one unit of work into a single indivisible unit of execution. The operations that make up a transaction are committed only if every one of them executes successfully; if any operation fails, the whole transaction is rolled back. In short, either all of the operations are performed or none are. Once a transaction commits, its changes are permanently saved to the database.

Second, the four properties of transactions (ACID)

  • A: Atomicity. All the operations in a transaction either complete or do not complete; the transaction never ends somewhere in between. If a transaction fails during execution, it is rolled back to the state before the transaction began, as if it had never been executed.
  • C: Consistency. The database must be in a consistent state both before and after a transaction executes. If the transaction completes successfully, all of its changes are applied correctly and the system is in a valid state. If an error occurs during the transaction, all of its changes are automatically rolled back and the system returns to its original state.
  • I: Isolation. In a concurrent environment, when different transactions manipulate the same data at the same time, each transaction has its own complete data space. Changes made by one concurrent transaction must be isolated from those made by any other. When a transaction reads data that another transaction is updating, it sees the data either as it was before the other transaction modified it or as it is afterwards, never in an intermediate state.
  • D: Durability. Once a transaction ends successfully, its updates to the database are persisted permanently. Even if the system crashes, the database can be restarted and restored to the state it was in when the transaction ended successfully.
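
The all-or-nothing behaviour described above can be illustrated with a small in-memory sketch. It is not how a real database works: `ToyAccounts` and its methods are hypothetical names invented for this example, and "rollback" here simply means discarding a private working copy.

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory "database": a transaction works on a private copy and
// publishes it only on commit, so a failure leaves the original untouched.
class ToyAccounts {
    private Map<String, Integer> balances = new HashMap<>();

    void open(String account, int amount) {
        balances.put(account, amount);
    }

    int balance(String account) {
        return balances.get(account);
    }

    // Moves `amount` between two accounts as a single unit of work.
    // Returns true if the transaction committed, false if it rolled back.
    boolean transfer(String from, String to, int amount) {
        Map<String, Integer> working = new HashMap<>(balances); // begin
        try {
            working.put(from, working.get(from) - amount);      // operation 1
            if (working.get(from) < 0) {
                // Consistency rule of this example: no negative balances.
                throw new IllegalStateException("insufficient funds");
            }
            working.put(to, working.get(to) + amount);          // operation 2
            balances = working;                                 // commit
            return true;
        } catch (RuntimeException e) {
            return false;                                       // rollback: drop the copy
        }
    }
}

class ToyAccountsDemo {
    public static void main(String[] args) {
        ToyAccounts db = new ToyAccounts();
        db.open("alice", 100);
        db.open("bob", 20);
        System.out.println(db.transfer("alice", "bob", 30));  // committed
        System.out.println(db.transfer("alice", "bob", 500)); // rolled back
        System.out.println(db.balance("alice") + " " + db.balance("bob")); // prints "70 50"
    }
}
```

The second transfer fails after its first operation has already run, yet neither balance changes, which is the point of atomicity.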

Third, InnoDB transaction implementation

Since transactions are measured by these four properties, InnoDB implements transactions by implementing each of the four.

  • Atomicity

    • MySQL has many kinds of logs: the binary log, query log, error log, slow query log, and so on. In addition to these, InnoDB provides two transaction logs: the redo log, used for durability, and the undo log, used for atomicity and isolation.
    • InnoDB generates an undo log record for every data-modifying statement. For example, an INSERT produces an undo record that performs the matching DELETE. If the transaction fails or ROLLBACK is called, the data is restored from the undo log.
  • Isolation

    • Isolation means that operations within a transaction are isolated from other transactions, and that concurrent transactions cannot interfere with each other. Strict isolation corresponds to the Serializable isolation level, but Serializable is rarely used in practice for performance reasons.
    • InnoDB uses the REPEATABLE READ isolation level by default and implements isolation with MVCC together with row locks and gap locks.
  • Durability

    • As a MySQL storage engine, InnoDB stores data on disk. However, if every read and write required disk I/O, efficiency would be very low. For this reason, InnoDB provides a Buffer Pool, which holds copies of some of the on-disk data pages and acts as a cache in front of the database. When data is read, it is read from the Buffer Pool first; if it is not in the Buffer Pool, it is read from disk and put into the Buffer Pool. When data is written, it is first written to the Buffer Pool, and the modified data in the Buffer Pool is periodically written back to disk (this process is called flushing dirty pages).
    • The Buffer Pool greatly improves read and write efficiency, but it also brings a new problem: if MySQL crashes before the modified data in the Buffer Pool has been flushed to disk, that data is lost and the durability of transactions cannot be guaranteed.
    • The redo log was introduced to solve this problem. When data is modified, the change is recorded in the redo log in addition to being applied in the Buffer Pool. When a transaction commits, the fsync interface is called to flush the redo log to disk. If MySQL goes down, the redo log is read during restart to restore the database. The redo log uses write-ahead logging (WAL): all changes are written to the log first and then applied to the Buffer Pool. This ensures that data is not lost when MySQL goes down and satisfies the durability requirement.

    Since the redo log also has to be written to disk at commit time, why is that faster than writing the modified data in the Buffer Pool directly to disk (i.e., flushing dirty pages)? There are two reasons: (1) Flushing dirty pages is random I/O, because the position of each modified piece of data is random, whereas writing the redo log is an append operation and therefore sequential I/O. (2) Flushing works in units of data pages. The default MySQL page size is 16KB, so even a small change on a page means the whole page is written. The redo log contains only the parts that actually need to be written, so the amount of wasted I/O is greatly reduced.

  • Consistency

    • Consistency means that the database's integrity constraints are not violated by the transaction, and that the data is in a valid state both before and after the transaction executes.
    • Consistency is guaranteed not only by the database itself but also by the business system.
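
To make the roles of the two transaction logs more concrete, here is a minimal sketch of a toy engine with a volatile buffer pool, an undo log for rollback, and an append-only redo log for crash recovery. Everything in it (`ToyEngine`, `Change`, `crashAndRecover`) is a made-up name for illustration. It assumes a single transaction at a time and leaves out periodic flushing, locking and MVCC, so it should be read as an analogy rather than as InnoDB's actual design.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyEngine {
    static class Change {
        final String key;
        final Integer value; // null means "the key did not exist"
        Change(String key, Integer value) { this.key = key; this.value = value; }
    }

    private final Map<String, Integer> disk = new HashMap<>();   // data files, survive a crash
    private final List<Change> redoLog = new ArrayList<>();      // durable, append-only
    private final List<Change> redoBuffer = new ArrayList<>();   // redo records not yet synced
    private final Deque<Change> undoLog = new ArrayDeque<>();    // old values of the open transaction
    private Map<String, Integer> bufferPool = new HashMap<>();   // in memory, lost on a crash

    Integer read(String key) {
        if (!bufferPool.containsKey(key) && disk.containsKey(key)) {
            bufferPool.put(key, disk.get(key)); // cache miss: load from disk
        }
        return bufferPool.get(key);
    }

    void put(String key, int value) {
        undoLog.push(new Change(key, read(key))); // remember how to undo this change
        redoBuffer.add(new Change(key, value));   // log the change before applying it
        bufferPool.put(key, value);
    }

    void commit() {
        redoLog.addAll(redoBuffer); // stands in for fsync of the redo log
        redoBuffer.clear();
        undoLog.clear();
    }

    void rollback() {
        while (!undoLog.isEmpty()) { // undo in reverse order
            Change old = undoLog.pop();
            if (old.value == null) {
                bufferPool.remove(old.key); // the undo of an insert is a delete
            } else {
                bufferPool.put(old.key, old.value);
            }
        }
        redoBuffer.clear();
    }

    void crashAndRecover() {
        bufferPool = new HashMap<>(); // memory is gone
        redoBuffer.clear();
        undoLog.clear();
        for (Change c : redoLog) {    // replay committed changes onto disk
            disk.put(c.key, c.value);
        }
    }
}

class ToyEngineDemo {
    public static void main(String[] args) {
        ToyEngine engine = new ToyEngine();
        engine.put("a", 1);
        engine.commit();
        engine.put("a", 2);
        engine.rollback();            // "a" goes back to 1
        engine.put("b", 7);           // never committed
        engine.crashAndRecover();
        System.out.println(engine.read("a") + " " + engine.read("b")); // prints "1 null"
    }
}
```

Only changes that reached the redo log at commit time survive the simulated crash, while the rolled-back and uncommitted changes disappear.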

The origin of distributed transactions

As business domains are split into multiple microservices, modern software architectures turn into complex systems. At the database level, as data volume explodes, we have to split databases and tables (sharding) to relieve the pressure on any single database. As a result, multiple services depend on different databases, so how can a transaction be guaranteed when they are operated on at the same time? This is the problem of distributed transactions.

In short, a distributed transaction is a large transaction made up of smaller sub-transactions that are spread across different server nodes and belong to different services. A distributed transaction must ensure that the sub-transactions under it either all succeed or all fail, so that the data is eventually consistent.

Distributed transaction solutions

I don’t want to get too conceptual in this article, but before getting to RocketMQ’s distributed transaction implementation, here are several of the current solutions for distributed transactions:

  • Two-phase commit (2PC)

    Two-phase commit (2PC) is one of the ways to implement the XA distributed transaction protocol proposed by the Oracle Tuxedo system. For details, see Two-Phase Commit of Distributed Transaction (2PC).

  • Try-Confirm-Cancel (TCC)

    TCC implements distributed transactions with three operations: try, confirm, and cancel. For more information, see Compensation Transactions for Distributed Transactions (TCC).

  • Local message table

    The local message table solution was originally proposed by eBay. The core idea is to execute tasks that require distributed processing asynchronously by means of a message log. Message logs can be stored in local text files, databases, or message queues, and retries are then triggered automatically or manually according to business rules. Manual retry is more common in payment scenarios, where a reconciliation system deals with problems after the fact.
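
As a rough illustration of the local message table idea, the sketch below writes a business row and a message row together, then lets a relay job retry unsent messages until the downstream accepts them. `ToyOrderService`, `OutboxMessage` and `relayPending` are hypothetical names, and the in-memory lists merely stand in for tables that would share one local database transaction in a real system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ToyOrderService {
    static class OutboxMessage {
        final String payload;
        boolean sent = false;
        int attempts = 0;
        OutboxMessage(String payload) { this.payload = payload; }
    }

    final List<String> orders = new ArrayList<>();              // business table
    final List<OutboxMessage> messageTable = new ArrayList<>(); // local message table

    // Stands in for one local database transaction that writes the
    // business row and the message row together.
    synchronized void createOrder(String orderId) {
        orders.add(orderId);
        messageTable.add(new OutboxMessage("order-created:" + orderId));
    }

    // Relay job, meant to run on a timer: try to deliver every unsent
    // message and mark it sent only after the downstream accepts it.
    // Returns the number of messages delivered in this pass.
    synchronized int relayPending(Predicate<String> downstream) {
        int delivered = 0;
        for (OutboxMessage m : messageTable) {
            if (m.sent) {
                continue;
            }
            m.attempts++;
            if (downstream.test(m.payload)) {
                m.sent = true;
                delivered++;
            }
        }
        return delivered;
    }
}

class ToyOrderServiceDemo {
    public static void main(String[] args) {
        ToyOrderService service = new ToyOrderService();
        service.createOrder("o-1");
        System.out.println(service.relayPending(payload -> false)); // downstream is down
        System.out.println(service.relayPending(payload -> true));  // retry succeeds
    }
}
```

Because a message may be delivered more than once when an acknowledgement is lost, the downstream consumer in such a design is normally expected to be idempotent.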

In addition to the above, there are other solutions such as Alibaba's Seata, the Saga pattern, and best-effort notification. Interested readers can look into them on their own. And of course there are MQ transactions, which are the subject of this article.

MQ transactions

RocketMQ is an open-source distributed messaging middleware with high performance and high throughput. It supports distributed transactions based on asynchronous messages, achieving eventual consistency.

Here is the basic flow interaction diagram for a RocketMQ transaction message:

The figure is divided into two processes: the normal sending and committing of a transaction message, and the transaction message compensation process.

1. Transaction message sending and commit:

(1) Send the half message.
(2) The server responds with the result of writing the message.
(3) Execute the local transaction according to the send result (if the write failed, the half message is not visible to the business and the local logic is not executed).
(4) Perform Commit or Rollback according to the local transaction status (Commit generates the message index, making the message visible to consumers).

The flow chart is as follows:

2. Compensation process:

(1) For pending transaction messages that have received neither Commit nor Rollback, the server initiates a "check-back".
(2) The Producer receives the check-back message and checks the status of the local transaction that corresponds to it.
(3) According to the local transaction status, the Producer performs Commit or Rollback again.

The compensation phase is used to handle the cases where the Commit or Rollback of a message times out or fails.
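
The two processes can be sketched with a toy in-memory broker. This is only a model of the flow described above, not RocketMQ's implementation: `ToyBroker`, `ToyTxState` and all method names are invented for the example, and the check-back is triggered by hand rather than by a timer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

enum ToyTxState { COMMIT, ROLLBACK, UNKNOWN }

// Toy broker: half messages are stored but stay hidden from consumers
// until the producer commits them.
class ToyBroker {
    private final Map<String, String> halfMessages = new LinkedHashMap<>(); // txId -> body
    private final List<String> visible = new ArrayList<>();                 // consumer-visible queue

    // Steps (1) and (2): store the half message and acknowledge the write.
    boolean sendHalf(String txId, String body) {
        halfMessages.put(txId, body);
        return true;
    }

    // Step (4): the producer reports the outcome of its local transaction.
    void endTransaction(String txId, ToyTxState state) {
        if (state == ToyTxState.UNKNOWN) {
            return; // stay pending and wait for a later check-back
        }
        String body = halfMessages.remove(txId);
        if (body != null && state == ToyTxState.COMMIT) {
            visible.add(body); // only now can consumers see the message
        }
    }

    // Compensation: ask the producer about every message still pending.
    void checkPending(Function<String, ToyTxState> producerCheck) {
        for (String txId : new ArrayList<>(halfMessages.keySet())) {
            endTransaction(txId, producerCheck.apply(txId));
        }
    }

    List<String> visibleMessages() { return visible; }

    int pendingCount() { return halfMessages.size(); }
}

class ToyBrokerDemo {
    public static void main(String[] args) {
        ToyBroker broker = new ToyBroker();
        broker.sendHalf("tx-1", "order paid");
        broker.endTransaction("tx-1", ToyTxState.COMMIT);   // normal commit
        broker.sendHalf("tx-2", "order cancelled");
        broker.endTransaction("tx-2", ToyTxState.ROLLBACK); // half message dropped
        broker.sendHalf("tx-3", "order shipped");           // the Commit never arrives
        broker.checkPending(txId -> ToyTxState.COMMIT);     // check-back resolves it
        System.out.println(broker.visibleMessages());       // prints "[order paid, order shipped]"
    }
}
```

The third message shows why the compensation process exists: without the check-back it would remain pending indefinitely, neither delivered nor discarded.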

Use of RocketMQ transaction messages

With the above covered, you should now be familiar with how RocketMQ transaction messages work. Here’s how to use them in development.

The difference between sending a transaction message and a normal message is that you create a TransactionMQProducer and a corresponding TransactionListener implementation.

  • The configuration of TransactionMQProducer includes the producer group, the NameServer address, the thread pool used to execute local transactions, and the transaction listener implementation.
    // Producer group that the transaction messages are sent under
    this.producer = new TransactionMQProducer(config.getGroup());
    // NameServer address used to locate brokers
    this.producer.setNamesrvAddr(config.getNameServer());
    // Thread pool used to execute local transactions
    this.producer.setExecutorService(config.getExecutorService());
    // Callback that runs and checks local transactions
    this.producer.setTransactionListener(config.getTransactionListener());
  • The TransactionListener implementation provides the two methods of the TransactionListener interface:
    • executeLocalTransaction(Message message, Object o): the method used to execute the local transaction.
    • checkLocalTransaction(MessageExt messageExt): the method RocketMQ calls when it checks back on the state of the local transaction.
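
One common way to write such a listener is to record the outcome of each local transaction so that the check-back can answer from that record. The sketch below follows that idea but, to stay runnable without the RocketMQ client on the classpath, it declares local stand-ins: the `LocalTransactionState` enum and the `ToyTransactionListener` interface are defined here rather than imported, and the methods take a plain transaction ID string instead of `Message` and `MessageExt`. The `outcomes` map and the use of a `Runnable` as the argument are likewise assumptions of this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-ins so the sketch compiles on its own; in real code the
// corresponding types come from the RocketMQ client library.
enum LocalTransactionState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

interface ToyTransactionListener {
    LocalTransactionState executeLocalTransaction(String transactionId, Object arg);
    LocalTransactionState checkLocalTransaction(String transactionId);
}

class OrderTransactionListener implements ToyTransactionListener {
    // Outcome of each local transaction, keyed by transaction ID
    // (hypothetical in-memory store; a real service would usually
    // persist this together with the business data).
    private final Map<String, Boolean> outcomes = new ConcurrentHashMap<>();

    @Override
    public LocalTransactionState executeLocalTransaction(String transactionId, Object arg) {
        try {
            ((Runnable) arg).run(); // the local business logic
            outcomes.put(transactionId, true);
            return LocalTransactionState.COMMIT_MESSAGE;
        } catch (RuntimeException e) {
            outcomes.put(transactionId, false);
            return LocalTransactionState.ROLLBACK_MESSAGE;
        }
    }

    @Override
    public LocalTransactionState checkLocalTransaction(String transactionId) {
        Boolean committed = outcomes.get(transactionId);
        if (committed == null) {
            return LocalTransactionState.UNKNOW; // no outcome recorded yet
        }
        return committed
                ? LocalTransactionState.COMMIT_MESSAGE
                : LocalTransactionState.ROLLBACK_MESSAGE;
    }
}

class OrderTransactionListenerDemo {
    public static void main(String[] args) {
        OrderTransactionListener listener = new OrderTransactionListener();
        Runnable saveOrder = () -> System.out.println("order saved");
        System.out.println(listener.executeLocalTransaction("tx-1", saveOrder));
        System.out.println(listener.checkLocalTransaction("tx-1"));
        System.out.println(listener.checkLocalTransaction("tx-unknown"));
    }
}
```

Returning the unknown state when no outcome has been recorded leaves the message pending, so the broker can check back again later instead of committing or discarding it prematurely.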

The code is 👀 : github.com/wangning101…
