1.1 Microservices and distributed data management issues
An application that uses a single relational database can rely on ACID transactions, which provide several important guarantees:
- Atomicity – All the changes in a transaction are applied, or none of them are
- Consistency – The database always moves from one consistent state to another
- Isolation – Even when transactions execute concurrently, they appear to run serially
- Durability – Once a transaction commits, its changes are permanent and survive failures
Given these guarantees, the application can be kept simple: start a transaction, change (insert, update, delete) multiple rows, and commit the transaction.
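As an illustration, here is a minimal JDBC sketch of such a transaction, using an order-and-credit example that reappears later in this article. The table and column names are illustrative, not taken from any real schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PlaceOrderTx {
    // Minimal sketch: check available credit and create an order in one ACID
    // transaction. Table and column names (customer, orders, ...) are illustrative.
    public static void placeOrder(String url, long customerId, long amount) throws SQLException {
        try (Connection con = DriverManager.getConnection(url)) {
            con.setAutoCommit(false);               // start the transaction
            try (PreparedStatement debit = con.prepareStatement(
                     "UPDATE customer SET available_credit = available_credit - ? " +
                     "WHERE id = ? AND available_credit >= ?");
                 PreparedStatement insert = con.prepareStatement(
                     "INSERT INTO orders (customer_id, amount, status) VALUES (?, ?, 'NEW')")) {
                debit.setLong(1, amount);
                debit.setLong(2, customerId);
                debit.setLong(3, amount);
                if (debit.executeUpdate() == 0) {   // not enough credit
                    con.rollback();                 // atomicity: nothing is applied
                    throw new SQLException("insufficient credit");
                }
                insert.setLong(1, customerId);
                insert.setLong(2, amount);
                insert.executeUpdate();
                con.commit();                       // durability: the changes are permanent
            } catch (SQLException e) {
                con.rollback();
                throw e;
            }
        }
    }
}
```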
Another advantage of a relational database is its support for SQL, a powerful, declarative, table-oriented query language. Users can easily combine data from multiple tables in a single query; the RDBMS query planner determines the best way to execute it, so users don't have to worry about low-level details such as how to access the data. And because all of the application's data is in one database, it is easy to query.
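For example, a single declarative query can combine the customer and order tables. A hedged sketch, again with illustrative table and column names:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SingleDatabaseJoin {
    // With all data in one database, one declarative query joins customers and
    // orders; the RDBMS query planner picks the execution strategy.
    static void printCustomerOrders(Connection con, long customerId) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                 "SELECT c.name, o.id, o.amount FROM customer c " +
                 "JOIN orders o ON o.customer_id = c.id WHERE c.id = ?")) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s order %d: %d%n",
                        rs.getString("name"), rs.getLong("id"), rs.getLong("amount"));
                }
            }
        }
    }
}
```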
In a microservices architecture, however, data access becomes far more complex, because each microservice's data is private to it and can only be accessed through its API. Encapsulating data this way keeps microservices loosely coupled and able to evolve independently of one another. If multiple services accessed the same data, schema changes would require time-consuming, coordinated updates across all of those services.
What's more, different microservices often use different kinds of databases. Applications generate various kinds of data, and a relational database is not always the best choice; in some scenarios, a NoSQL database offers a more convenient data model with better performance and scalability. For example, a service that stores and searches text might use a text search engine such as Elasticsearch, while a service that manages social graph data could use a graph database such as Neo4j. Consequently, microservices-based applications typically use a mix of SQL and NoSQL databases, an approach known as Polyglot Persistence.
A partitioned, polyglot-persistent architecture for storing data has many advantages, including loosely coupled services and better performance and scalability. It also, however, introduces the challenge of distributed data management.
The first challenge is how to complete a business transaction while maintaining data consistency across multiple services. To see why this is a problem, consider an online B2B store. The customer service maintains customer information, including credit lines. The order service manages orders and must verify that a new order does not exceed the customer's credit limit. In a monolithic application, the order service can simply use an ACID transaction to check the available credit and create the order.
In a microservices architecture, by contrast, the order and customer tables are each private to their respective services, as shown below:
The order service cannot access the customer table directly; it can only use the API published by the customer service. The order service could, in principle, use distributed transactions, also known as two-phase commit (2PC). However, 2PC is usually not a viable option in modern applications. By the CAP theorem, a choice must be made between availability and ACID-style consistency, and availability is usually the better choice. Moreover, many modern technologies, including most NoSQL databases, do not support 2PC. Yet maintaining data consistency across services and databases is an essential requirement, so we need another solution.
The second challenge is how to retrieve data from multiple services. For example, imagine that the application needs to display a customer and their recent orders. If the order service provides an API for retrieving a customer's orders, then the application can fetch this data with an application-side join: it retrieves the customer from the customer service and that customer's orders from the order service. Suppose, however, that the order service only supports looking up orders by primary key (perhaps because it uses a NoSQL database that only supports primary-key-based retrieval); in that case, there is no straightforward way to retrieve the required data.
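A sketch of such an application-side join might look like the following. The service URLs and response handling are hypothetical, and real code would deserialize the JSON rather than concatenate strings:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CustomerOrdersJoin {
    // Sketch of an application-side join: fetch the customer from the customer
    // service and their orders from the order service, then combine the two.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    static String fetch(String url) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        String customerId = "42";
        // Two separate API calls replace what a SQL JOIN would do in a monolith.
        String customerJson = fetch("http://customer-service/customers/" + customerId);
        String ordersJson = fetch("http://order-service/orders?customerId=" + customerId);
        System.out.println("{\"customer\":" + customerJson + ",\"orders\":" + ordersJson + "}");
    }
}
```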
1.2 Event-driven Architecture
For many applications, the solution is to use an event-driven architecture. In this architecture, a microservice publishes an event when something notable happens, such as when it updates a business entity. Other microservices subscribe to these events; when a microservice receives an event, it can update its own business entities, which may in turn lead to more events being published.
Events can be used to implement business transactions that span multiple services. Such a transaction consists of a series of steps; each step consists of a microservice updating a business entity and publishing an event that triggers the next step. The following figure shows how an event-driven approach can be used to check available credit when an order is created, with the microservices exchanging events through a Message Broker.
- The Order service creates an order with status NEW and publishes an Order Created Event.
- The Customer service consumes the Order Created Event, reserves credit for the order, and publishes a Credit Reserved Event.
- The Order service consumes the Credit Reserved Event and changes the status of the order to OPEN.
More complex scenarios can involve additional steps, such as reserving inventory at the same time as checking the customer's credit; a minimal sketch of the basic event exchange follows.
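In the sketch below, the in-memory Broker class stands in for a real message broker such as Kafka or RabbitMQ, and the event names and payload fields are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class OrderSagaSketch {
    // In-memory stand-in for a message broker.
    static class Broker {
        private final Map<String, List<Consumer<Map<String, Object>>>> subs = new HashMap<>();
        void subscribe(String type, Consumer<Map<String, Object>> handler) {
            subs.computeIfAbsent(type, t -> new ArrayList<>()).add(handler);
        }
        void publish(String type, Map<String, Object> event) {
            subs.getOrDefault(type, List.of()).forEach(h -> h.accept(event));
        }
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        Map<Long, String> orderStatus = new HashMap<>();   // Order service state
        Map<Long, Long> availableCredit = new HashMap<>(); // Customer service state
        availableCredit.put(7L, 100L);

        // Customer service: reserve credit, then publish a Credit Reserved Event.
        broker.subscribe("OrderCreated", e -> {
            long customerId = (long) e.get("customerId"), amount = (long) e.get("amount");
            if (availableCredit.getOrDefault(customerId, 0L) >= amount) {
                availableCredit.merge(customerId, -amount, Long::sum);
                broker.publish("CreditReserved", e);
            } // a real service would otherwise publish a credit-check-failed event
        });

        // Order service: on Credit Reserved Event, move the order to OPEN.
        broker.subscribe("CreditReserved", e -> orderStatus.put((long) e.get("orderId"), "OPEN"));

        // Order service creates the order in NEW state and publishes the event.
        orderStatus.put(1L, "NEW");
        broker.publish("OrderCreated", Map.of("orderId", 1L, "customerId", 7L, "amount", 60L));
        System.out.println(orderStatus); // {1=OPEN}
    }
}
```

In a real system, each handler would also have to atomically update its own database and publish the next event, which is exactly the atomicity problem addressed in section 1.3.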
Provided that (a) each service atomically updates its database and publishes its events, and (b) the message broker guarantees that each event is delivered at least once, a business transaction spanning multiple services can be completed (although it is not an ACID transaction). This model offers weaker guarantees, such as eventual consistency, and is known as the BASE model.
Events can also be used to maintain materialized views that pre-join data owned by multiple microservices. The service that maintains such a view subscribes to the relevant events and updates the view. For example, a Customer Order View update service (which maintains the Customer Order view) subscribes to the events published by the customer service and the order service.
When the Customer Order View update service receives a customer or order event, it updates the Customer Order View dataset. The view could be implemented with a document database such as MongoDB, storing one document per customer. A Customer Order View query service then answers queries for a customer and their recent orders by reading the Customer Order View dataset.
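A sketch of the view-update and query side is shown below; a plain in-memory map stands in for the document database, and all class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CustomerOrderViewUpdater {
    // One "document" per customer, pre-joining customer data with recent orders.
    // A HashMap stands in here for a document database such as MongoDB.
    static class CustomerOrderDoc {
        String name;
        final List<String> orders = new ArrayList<>();
    }

    private final Map<Long, CustomerOrderDoc> view = new HashMap<>();

    // Invoked for events published by the customer service.
    void onCustomerEvent(long customerId, String name) {
        view.computeIfAbsent(customerId, id -> new CustomerOrderDoc()).name = name;
    }

    // Invoked for events published by the order service.
    void onOrderEvent(long customerId, String orderSummary) {
        view.computeIfAbsent(customerId, id -> new CustomerOrderDoc()).orders.add(orderSummary);
    }

    // The query service answers "customer and recent orders" from this one
    // denormalized dataset, with no cross-service join at read time.
    CustomerOrderDoc findCustomerOrders(long customerId) {
        return view.get(customerId);
    }

    public static void main(String[] args) {
        CustomerOrderViewUpdater updater = new CustomerOrderViewUpdater();
        updater.onCustomerEvent(42L, "Alice");
        updater.onOrderEvent(42L, "order 1: 60");
        System.out.println(updater.findCustomerOrders(42L).orders); // [order 1: 60]
    }
}
```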
Event-driven architecture has both advantages and disadvantages. It enables transactions that span multiple services with eventual consistency, and it enables the application to maintain materialized views. One disadvantage is that the programming model is more complex than with ACID transactions: to recover from application-level failures, compensating transactions must be implemented, for example, cancelling the order if the credit check fails. In addition, applications must cope with inconsistent data, both because changes made by in-flight transactions are visible, and because reads from a not-yet-updated materialized view can return stale data. Another disadvantage is that subscribers must detect and ignore duplicate events.
1.3 Achieving Atomicity
An event-driven architecture still faces the problem of atomically updating the database and publishing an event. For example, the order service must insert a row into the ORDER table and publish an Order Created Event, and the two must happen atomically. If the service crashes after updating the database but before publishing the event, the system ends up in an inconsistent state. The standard way to ensure atomicity would be a distributed transaction spanning the database and the message broker, but for the CAP-related reasons described above, this is exactly what we want to avoid.
1.3.1 Publishing events using local transactions
One way to achieve atomicity is for the application to publish events using a multi-step process that involves only local transactions. The trick is to have an EVENT table, which functions as a message queue, in the same database that stores the business entities. The application begins a (local) database transaction, updates the state of the business entities, inserts an event into the EVENT table, and commits the transaction. A separate application process or thread queries the EVENT table, publishes the events to the message broker, and then uses a local transaction to mark each event as published, as shown below:
The order service inserts a row into the ORDER table and inserts an Order Created Event into the EVENT table. An event-publishing thread or process queries the EVENT table for unpublished events, publishes them, and then updates the EVENT table to mark those events as published.
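A hedged sketch of the two steps follows, assuming illustrative ORDER and EVENT table schemas; publishToBroker is a placeholder for the real broker client call:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OutboxSketch {
    // Step 1: one LOCAL transaction inserts the order and the corresponding
    // event row, so both are persisted atomically (table names illustrative).
    public static void createOrder(Connection con, long customerId, long amount) throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement order = con.prepareStatement(
                 "INSERT INTO orders (customer_id, amount, status) VALUES (?, ?, 'NEW')");
             PreparedStatement event = con.prepareStatement(
                 "INSERT INTO event (type, payload, published) VALUES ('OrderCreated', ?, FALSE)")) {
            order.setLong(1, customerId);
            order.setLong(2, amount);
            order.executeUpdate();
            event.setString(1, "{\"customerId\":" + customerId + ",\"amount\":" + amount + "}");
            event.executeUpdate();
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        }
    }

    // Step 2: a separate relay thread polls the EVENT table, publishes each
    // unpublished event to the message broker, and marks it as published.
    public static void relayEvents(Connection con) throws SQLException {
        try (PreparedStatement find = con.prepareStatement(
                 "SELECT id, type, payload FROM event WHERE published = FALSE");
             ResultSet rs = find.executeQuery()) {
            while (rs.next()) {
                publishToBroker(rs.getString("type"), rs.getString("payload")); // e.g. Kafka
                try (PreparedStatement mark = con.prepareStatement(
                         "UPDATE event SET published = TRUE WHERE id = ?")) {
                    mark.setLong(1, rs.getLong("id"));
                    mark.executeUpdate();
                }
            }
        }
    }

    static void publishToBroker(String type, String payload) { /* broker client call */ }
}
```

Note that if the relay crashes between publishing an event and marking it, the event is published again on restart; this is why delivery is at-least-once and why subscribers must tolerate duplicates.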
This approach has both advantages and disadvantages. One advantage is that it guarantees an event is published for every update without relying on 2PC, and the application publishes business-level events directly, without needing to infer them from low-level changes. A disadvantage is that the approach is error-prone, since developers must remember to publish the events. It is also challenging for applications that use certain NoSQL databases, because NoSQL databases often have limited transaction and query capabilities.
This approach avoids 2PC because the application uses local transactions to update state and publish events. Now let's look at an approach that achieves atomicity while the application simply updates state.
1.3.2 Mining transaction logs in the database
Another way to achieve atomicity without 2PC is to have the events published by a thread or process that mines the database's transaction or commit log. The application updates the database, which records the changes in its transaction log; a transaction-log mining process or thread reads the log and publishes the changes as events to the message broker, as shown below:
An example of this approach is the LinkedIn Databus project. Databus mines the Oracle transaction log and publishes events corresponding to the changes; LinkedIn uses it to keep various derived data stores consistent with the system of record.
Another example is the streams mechanism in AWS DynamoDB, a managed NoSQL database. A DynamoDB stream contains the time-ordered sequence of changes (create, update, and delete operations) made to a table's items over the past 24 hours. An application can read those changes from the stream and publish them as events.
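In outline, a log-mining relay looks something like the sketch below; the ChangeLogReader and EventPublisher interfaces are hypothetical placeholders for a real change-data-capture client (such as Databus or a DynamoDB streams consumer) and a broker client:

```java
import java.util.List;

public class LogMinerSketch {
    // Hypothetical shape of a row-level change read from a database transaction
    // log (or from a DynamoDB stream); real CDC tools expose similar records.
    record ChangeRecord(String table, String operation, String rowJson) {}

    interface ChangeLogReader { List<ChangeRecord> poll(); }
    interface EventPublisher { void publish(String type, String payload); }

    // The miner runs outside the application: it reads committed changes from
    // the log and turns low-level row updates into events for the broker.
    static void runOnce(ChangeLogReader reader, EventPublisher publisher) {
        for (ChangeRecord change : reader.poll()) {
            // Mapping row-level changes back to business-level events is the
            // hard part this section mentions; here it is a naive name mapping.
            if (change.table().equals("orders") && change.operation().equals("INSERT")) {
                publisher.publish("OrderCreated", change.rowJson());
            }
        }
    }
}
```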
Transaction log mining also has advantages and disadvantages. One advantage is that it guarantees an event is published for every update without relying on 2PC. It can also simplify the application by separating event publishing from the business logic. The main disadvantage is that the format of the transaction log is specific to each database, and can even change between versions of the same database; it is also hard to reverse-engineer high-level business events from the low-level update records in the log.
With transaction log mining, the application simply updates the database, without any involvement of 2PC. Now let's look at a completely different approach: one that does away with updates and relies solely on events.
1.3.3 Using Event Sourcing
Event sourcing achieves atomicity without 2PC by using a radically different, event-centric approach to persisting business entities. Instead of storing the current state of an entity, the application stores the sequence of state-changing events for that entity. The application can reconstruct an entity's current state by replaying its events. Whenever a business entity changes, a new event is appended to the sequence, and since saving a single event is a single operation, it is inherently atomic.
To see how event sourcing works, consider an Order entity as an example. Traditionally, each order maps to a row in an ORDER table, plus rows in, say, an ORDER_LINE_ITEM table. With event sourcing, the order service instead stores each order as a sequence of state-changing events: created, approved, shipped, cancelled. Each event carries enough data to reconstruct the order's state.
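Here is a minimal sketch of replaying an order's events to reconstruct its state, using the event names from the example above (the record and enum definitions are illustrative):

```java
import java.util.List;

public class OrderEventSourcing {
    // Each state change is stored as an event; the current state is derived by
    // replaying the events in order.
    enum Status { NONE, CREATED, APPROVED, SHIPPED, CANCELED }

    record OrderEvent(String type) {}

    static Status replay(List<OrderEvent> events) {
        Status status = Status.NONE;
        for (OrderEvent e : events) {
            switch (e.type()) {
                case "OrderCreated"  -> status = Status.CREATED;
                case "OrderApproved" -> status = Status.APPROVED;
                case "OrderShipped"  -> status = Status.SHIPPED;
                case "OrderCanceled" -> status = Status.CANCELED;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Appending one event per change is a single, atomic write; the full
        // history can always be replayed to recover the order's state.
        List<OrderEvent> history = List.of(
            new OrderEvent("OrderCreated"), new OrderEvent("OrderApproved"));
        System.out.println(replay(history)); // APPROVED
    }
}
```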
Events are stored permanently in an event store, a database of events that provides an API for adding and retrieving an entity's events. The event store also behaves like the message broker described earlier, providing an API for subscribing to events. It delivers each event to all interested subscribers, and it is the backbone of an event-sourced, event-driven microservices architecture.
Event sourcing has several advantages. It solves one of the key problems of an event-driven architecture, making it possible to reliably publish an event whenever state changes, and thereby addresses the data consistency problem in a microservices architecture. Also, because it persists events rather than domain objects, it largely avoids the object-relational impedance mismatch problem.
Event sourcing also provides a 100% reliable audit log of the changes made to a business entity and makes it possible to determine an entity's state at any point in time. In addition, it lets the business logic be built from loosely coupled business entities that exchange events. These advantages make it considerably easier to migrate a monolithic application to a microservices architecture.
Event sourcing also has drawbacks. It is a different and unfamiliar style of programming, so there is a learning curve. The event store directly supports only primary-key lookups of business entities, so queries must be implemented using Command Query Responsibility Segregation (CQRS); as a result, applications must deal with eventually consistent data.
1.4 Summary
In a microservices architecture, each microservice has its own private datastore, and different microservices may use different SQL or NoSQL databases. While this database architecture has strong advantages, it also brings the challenge of distributed data management. The first challenge is how to maintain consistency for business transactions that span multiple services; the second is how to retrieve consistent data from multiple services.
For many applications, the best solution is to adopt an event-driven architecture. One of its key challenges is how to atomically update state and publish events. There are several ways to achieve this, including using the database as a message queue, mining the transaction log, and event sourcing.