How many distributed-system pitfalls do you know?
The CAP theorem
-
When designing a distributed system, consistency, availability, and partition tolerance cannot all be satisfied at the same time.
-
Consistency means that all nodes see the same, latest copy of the data. Availability means that the services the system provides are always usable. Partition tolerance means that the distributed system must keep providing consistency and availability even when arbitrary network partition failures occur. A distributed system cannot satisfy all three properties at once; it can satisfy at most two.
-
CA gives up partition tolerance; traditional relational databases are designed along CA lines.
-
AP gives up strong consistency in favor of eventual consistency; many non-relational databases are designed along AP lines.
-
CP gives up availability, for example an inter-bank transfer, which must block until both banking systems have completed the entire transaction.
-
BASE theory
-
BASE is an abbreviation of Basically Available, Soft state, and Eventually consistent.
-
BASE theory is an extension of AP in CAP: it achieves availability by sacrificing strong consistency. When a failure occurs, part of the data is allowed to be unavailable as long as the core functions stay available, and data is allowed to be inconsistent for a period of time before eventually reaching a consistent state.
-
Basically available: allow some non-core functions to be lost while keeping the core functions available.
-
Soft state: intermediate states are allowed and do not affect overall availability, such as “payment in progress” or “data synchronization in progress”; once the data becomes consistent, the status changes to “succeeded”.
-
Eventual consistency: the data on all nodes becomes consistent after some time. For example, the “payment in progress” status eventually changes to “payment succeeded” or “payment failed”.
-
Circuit breaking, degradation, and rate limiting
Reference: an analysis of degradation, circuit breaking, and rate limiting
Degradation
-
Degradation means service degradation: when the pressure on our servers rises, we selectively reduce the availability of some functions, or turn them off, in order to keep the core functions available.
-
For example, on a forum-style website, when the server is overwhelmed you can temporarily close the posting function and user-profile features to protect the core functions of logging in and browsing posts.
Circuit breaking
-
Degradation is usually triggered by problems in our own system, whereas circuit breaking usually refers to a failing external dependency: the breaker cuts off calls to that external interface.
-
For example, a function in service A depends on service B, and service B starts failing or responding slowly. That is when a circuit breaker is needed: while the breaker is open, A's calls to B return an error immediately instead of waiting on B (see the sketch below).
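A minimal in-process sketch of that idea in Java (not a real library such as Hystrix or Resilience4j; the threshold, cool-down period, and fallback value are illustrative assumptions): after several consecutive failures calling B, calls from A fail fast for a cool-down period.

```java
import java.util.function.Supplier;

// Toy circuit breaker: opens after `failureThreshold` consecutive failures
// and returns the fallback immediately while open.
public class SimpleCircuitBreaker {
    private final int failureThreshold = 5;
    private final long openMillis = 10_000;       // how long the breaker stays open
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    public synchronized <T> T call(Supplier<T> remoteCall, T fallback) {
        boolean open = consecutiveFailures >= failureThreshold
                && System.currentTimeMillis() - openedAt < openMillis;
        if (open) {
            return fallback;                      // fail fast instead of waiting on service B
        }
        try {
            T result = remoteCall.get();          // A calls B
            consecutiveFailures = 0;              // success closes the breaker again
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = System.currentTimeMillis();
            }
            return fallback;
        }
    }
}
```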
Rate limiting
-
Rate limiting restricts the number of requests within a time window to keep the system available and stable and to prevent slowdowns or outages caused by traffic surges.
-
A typical limit is the total number of concurrent requests, or the number of requests allowed over a period of time (see the sketch below).
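A minimal fixed-window rate limiter sketch in Java (the window length and limit are illustrative assumptions; production systems often prefer token-bucket or sliding-window variants): at most `limit` requests are admitted per window, the rest are rejected.

```java
// Fixed-window counter: e.g. new FixedWindowRateLimiter(100, 1000) admits
// roughly 100 requests per second and rejects the overflow.
public class FixedWindowRateLimiter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowRateLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {  // a new window begins, reset the counter
            windowStart = now;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;                          // request admitted
        }
        return false;                             // window quota exhausted, reject
    }
}
```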
What pitfalls do message queues have in a distributed system?
The concept of idempotence
- An operation is idempotent if executing it any number of times gives the same result as executing it once. Idempotence is used to solve the problem of repeated message consumption.
Resolve repeated consumption
-
Database insertion scenario:
- Before inserting a row, check whether a record with that primary key ID already exists in the database; if it does, perform an update instead of an insert.
-
Writing to Redis scenario:
- Redis’ set operation is naturally idempotent
-
Other scenarios:
- The producer attaches a globally unique ID to every message it sends. Each time the consumer processes a message, it checks whether that ID already exists in Redis. If not, it processes the message normally; if it does, the message was already consumed, so it is skipped to avoid repeated consumption (see the sketch below).
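A sketch of that dedup check in Java. The `Redis` interface here is a hypothetical thin wrapper, not a real client API; with an actual client the same idea is a single SET key value NX EX command.

```java
// Skip a message whose globally unique ID has already been recorded in Redis.
public class IdempotentConsumer {

    // Assumed thin Redis wrapper: SET key value NX EX ttl, true if the key was newly set.
    interface Redis {
        boolean setIfAbsent(String key, String value, long ttlSeconds);
    }

    private final Redis redis;

    public IdempotentConsumer(Redis redis) {
        this.redis = redis;
    }

    public void onMessage(String messageId, String payload) {
        // Only the first attempt to claim this message ID succeeds.
        boolean firstTime = redis.setIfAbsent("consumed:" + messageId, "1", 24 * 3600);
        if (!firstTime) {
            return;                               // already consumed once, skip the duplicate
        }
        process(payload);                         // normal business processing
    }

    private void process(String payload) {
        // ... business logic ...
    }
}
```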
Resolve message loss
Losing messages about order placement, payment result notification, or balance deduction can cause financial loss.
1. A message is lost while the producer is sending it to the message queue
-
Solution:
- Confirmation mechanism: each message sent by the producer carries a unique ID. If the message is written to the message queue, the broker sends back an ACK telling the producer it was received successfully; otherwise a callback lets the producer resend the message (see the sketch below).
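One way to express the confirmation mechanism, assuming Kafka's Java client (the same idea exists as publisher confirms in RabbitMQ and SendCallback in RocketMQ); the topic name and addresses are illustrative. acks=all makes the broker acknowledge only after the message is stored, and the send callback is where a failed message would be resent or persisted for retry.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ReliableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");        // wait for the broker's acknowledgement
        props.put("retries", 3);         // let the client retry transient failures
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-1", "order created");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    // No ack received: resend the message or persist it for a later retry.
                    System.err.println("send failed, will resend: " + exception.getMessage());
                }
            });
        }
    }
}
```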
2. The message queue loses messages
-
Solution:
-
The broker should respond to the producer only after the message has been flushed to disk. If it responds as soon as the message is written to the cache and the machine then suddenly loses power, the message is lost even though the producer believes it was sent successfully.
-
If the broker runs as a cluster with multiple replicas, the message must be written not only to the current broker but also to the secondary machines. Configure the cluster so that a message is written to at least two machines before the producer gets its response; this basically guarantees reliable storage (see the sketch below).
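A sketch of the "stored on at least two machines before the ack" setup, assuming Kafka: three replicas per partition plus min.insync.replicas=2, which together with acks=all on the producer means the broker only responds once the message is on at least two brokers. Topic name, counts, and address are illustrative.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("orders", 3, (short) 3)        // 3 partitions, 3 replicas each
                    .configs(Map.of("min.insync.replicas", "2"));        // a write needs 2 in-sync copies
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```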
-
3. Consumers lose messages
- Solution: the consumer acknowledges (acks) the message only after it has actually finished processing it (see the sketch below).
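A sketch of "ack only after processing", again assuming Kafka: auto commit is disabled and the offset is committed manually once the message has really been handled, so a crash before processing does not lose it. Topic, group, and address are illustrative.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualAckConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-group");
        props.put("enable.auto.commit", "false");   // no automatic ack
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value());        // business processing first
                }
                consumer.commitSync();              // ack (commit offsets) only afterwards
            }
        }
    }

    private static void process(String value) {
        // ... business logic ...
    }
}
```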
Resolve message disorder
-
The producer sends two messages to the message queue in order: message 1: add data A and message 2: delete data A.
-
Expected result: Data A is deleted.
-
But if there are two consumers, the consumption order may become message 2 followed by message 1, and the final result is that data A is added instead of deleted.
-
Solution:
-
Global ordering
- Only one producer sends messages to the topic, the topic contains only one queue, and the consumer consumes that queue with a single thread.
-
Partial ordering
-
Split the topic internally into multiple memory queues, so that message 1 and message 2 enter the same queue.
-
Create multiple consumers, one consumer per queue (see the sketch below).
-
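A minimal sketch of that partial-ordering idea in Java: messages that must stay ordered relative to each other (here keyed by a data ID) are hashed to the same in-memory queue, and a single worker thread drains each queue. The queue count and the handle() body are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class OrderedDispatcher {
    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    public OrderedDispatcher(int queueCount) {
        for (int i = 0; i < queueCount; i++) {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();
            queues.add(queue);
            Thread worker = new Thread(() -> {       // one single-threaded consumer per queue
                try {
                    while (true) {
                        handle(queue.take());        // messages within one queue keep their order
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    public void dispatch(String dataId, String message) {
        // Same data ID -> same queue, so "add A" and "delete A" are consumed in send order.
        int index = Math.floorMod(dataId.hashCode(), queues.size());
        queues.get(index).offer(message);
    }

    private void handle(String message) {
        // ... business processing ...
    }
}
```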
Resolve message backlog
-
Too many messages pile up in the message queue waiting to be consumed. Typical scenarios:
-
Consumers have crashed
-
Consumers are consuming too slowly
-
Solution:
-
Fix the consumer problem at the code level.
-
Stop the existing consumers.
-
Temporarily create five times as many queues as before.
-
Temporarily deploy five times as many consumers as before.
-
Route all the backlogged messages into the temporary queues and let the extra consumers drain them.
-
Resolve message expiration
-
Solution:
-
Prepare a batch re-import program in advance.
-
During off-peak hours, manually re-send the lost messages in batches.
-
Problems with distributed caching
Asynchronously replicating data causes data loss
- The primary node replicates data to the secondary node asynchronously and then crashes before some data has been synchronized. The secondary node is elected as the new primary, and that un-replicated data is lost.
Split-brain results in data loss
-
The machine hosting the master node gets cut off from the cluster network but keeps running by itself. The sentinel elects a standby node as the new master, so two master nodes are now running at the same time, like two brains both telling the cluster what to do. Which one should be obeyed? This is split brain.
-
If a client has not yet switched over to the new master node and keeps writing to the old one, that data exists only on the old master. When the old master recovers, it rejoins the cluster as a standby node, its data is wiped, and it replicates everything from the new master. The new master never had the data written by that client, so that data is lost.
-
Solution:
-
Configure min-slaves-to-write 1, meaning the master only accepts writes when at least one standby node is connected and replicating.
-
Configure min-slaves-max-lag 10, meaning the replication lag must not exceed 10 seconds; otherwise the master stops accepting writes, so at most roughly 10 seconds of data can be lost.
-
Problems of splitting databases and tables
Database splitting, table splitting, vertical splitting, horizontal splitting
-
Database splitting: because a single database supports only a limited number of concurrent accesses, you can split its data across multiple databases to raise the maximum concurrency.
-
Table splitting: when a single table holds too many rows, even indexed queries become slow, so you can split its data across multiple tables. A query then only needs to hit one of the smaller tables, which improves the performance of the SQL statement.
-
Advantages of splitting databases and tables: concurrency increases many times over, disk usage per instance drops sharply, and the row count of a single table shrinks, improving the efficiency of SQL execution.
-
Horizontal split: split the rows of a table across multiple databases, each holding a table with the same structure, and use the multiple databases to carry higher concurrency. For example, if the order table accumulates five million rows every month, it can be split horizontally by month, putting each month's data into a separate database (see the routing sketch after this list).
-
Vertical split: split a table with many columns into several tables, in the same database or in different ones. Put frequently accessed columns in one table and rarely accessed columns in another, so the database cache can hold more of the hot row data. For example, a wide order table is divided into several tables that each store a subset of the columns (some columns may be duplicated).
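A minimal sketch of horizontal-split routing in Java: the order ID determines which database and table a row lives in. The database/table counts and the order_db_x.t_order_y naming are illustrative assumptions; real projects usually let sharding middleware such as ShardingSphere or MyCat do this routing.

```java
// Route an order to a shard by ID: order_db_{...}.t_order_{...}.
public class ShardRouter {
    private static final int DB_COUNT = 4;      // number of databases
    private static final int TABLE_COUNT = 8;   // number of tables per database

    public static String route(long orderId) {
        int dbIndex = (int) (orderId % DB_COUNT);
        int tableIndex = (int) ((orderId / DB_COUNT) % TABLE_COUNT);
        return "order_db_" + dbIndex + ".t_order_" + tableIndex;
    }

    public static void main(String[] args) {
        System.out.println(route(1001L));       // prints order_db_1.t_order_2
    }
}
```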
Unique IDs after splitting databases and tables
-
There are several ways to generate unique IDs:
-
Database auto-increment ID (not suitable)
-
UUID (too long and not ordered)
-
Current system time as the ID (under high concurrency, several identical IDs may be generated within the same millisecond)
-
Snowflake algorithm (a sketch follows after this list)
-
Baidu’s UIDGenerator algorithm
-
Meituan’s Leaf-Snowflake algorithm
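A compact sketch of the Snowflake layout in Java: a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit per-millisecond sequence packed into one 64-bit long. The custom epoch and bit widths follow the common Snowflake convention; handling of clock drift is simplified.

```java
public class SnowflakeId {
    private static final long EPOCH = 1609459200000L;                 // custom epoch: 2021-01-01
    private static final long WORKER_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    private static final long MAX_WORKER = ~(-1L << WORKER_BITS);     // 1023
    private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BITS); // 4095

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeId(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
    }

    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {                   // sequence exhausted in this millisecond
                while (now <= lastTimestamp) {     // spin until the next millisecond
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        // 41-bit timestamp | 10-bit worker ID | 12-bit sequence
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        SnowflakeId generator = new SnowflakeId(1);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId()); // increasing, roughly time-ordered IDs
        }
    }
}
```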
-
The problem of distributed transactions
- In a distributed architecture, services call each other and the call chain can be very long. If an error occurs anywhere along it, the related operations of the other services must be rolled back.
Reference: I finished distributed transactions in two days
Reference: How do the message queues RocketMQ and Kafka implement transactional messages?
2PC scheme
-
Roles: participants and a coordinator. Phases: the prepare phase and the commit phase.
-
Prepare phase: the transaction coordinator sends a prepare command to each participant; each participant executes its transaction after receiving the command but does not commit it.
-
Commit phase: after receiving the responses from all participants, the coordinator enters the second phase. If any participant failed to prepare, the coordinator sends a rollback command to all participants; otherwise it sends a commit command to all participants.
-
If the coordinator receives no response from some participants in the first phase, it treats the transaction as failed after waiting a certain time and sends the rollback command, so the coordinator in 2PC has a timeout mechanism (see the sketch below).
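A minimal sketch of that coordinator flow in Java. The Participant interface is illustrative; real implementations sit on XA resources and also persist the commit/rollback decision so it survives a coordinator crash.

```java
import java.util.List;

// Hypothetical participant contract for the two phases described above.
interface Participant {
    boolean prepare();   // phase 1: execute the transaction but do not commit
    void commit();       // phase 2: make the work permanent
    void rollback();     // phase 2: undo the prepared work
}

class TwoPhaseCommitCoordinator {
    public boolean execute(List<Participant> participants) {
        // Phase 1: send the prepare command to every participant.
        for (Participant participant : participants) {
            boolean prepared;
            try {
                prepared = participant.prepare();   // a timeout would also count as failure
            } catch (Exception e) {
                prepared = false;
            }
            if (!prepared) {
                // Any prepare failure: tell everyone to roll back.
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // Phase 2: all participants prepared successfully, send the commit command.
        participants.forEach(Participant::commit);
        return true;
    }
}
```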
-
Advantages:
- Commit and roll back local transactions using the database’s own functions without invading business logic.
-
Disadvantages:
-
Synchronous blocking: after the prepare command is executed in the first phase, each participant's local resources stay locked, because everything except the final commit has already been done.
-
Single point of failure: the coordinator fails and the entire transaction fails.
-
Data inconsistency: because of possible network exceptions, some participants may receive the coordinator's request while others do not, for example the phase 2 commit request, which leaves the data inconsistent.
-
TCC scheme
-
TCC implements transaction commit and rollback in business code; it is a two-phase commit at the business or application level.
-
Try phase: Checks the resources of each service and locks or reserves the resources.
-
Confirm phase: Perform actual operations in each service.
-
Cancel phase: If the execution of any of the service’s business methods fails, the previous successful steps need to be rolled back.
-
Advantages: There is no resource blocking, and each method commits a transaction directly.
-
Disadvantages: highly intrusive to business code.
-
Note:
-
Idempotency problem: because a network call cannot guarantee that the request arrives, there is always a retry mechanism, so the Try, Confirm, and Cancel methods must all be implemented idempotently to avoid errors from repeated execution.
-
Empty rollback problem: the Try request times out because of network congestion, and the transaction manager then issues the Cancel command, so Cancel must be able to execute correctly even though no Try ever ran.
-
Suspension problem: the Try request times out because of network congestion, which triggers the transaction manager's Cancel command; but after Cancel has executed, the delayed Try request finally arrives and freezes resources that will never be released. So the empty rollback must be recorded, and a later Try for the same transaction must be rejected (see the sketch below).
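A minimal sketch of the TCC contract in Java, using a hypothetical account-freezing action. The interface, method names, and in-memory sets are illustrative (frameworks such as Seata provide the real plumbing); the sets stand in for the persistent transaction log real implementations use to get idempotency, empty rollback, and anti-suspension behaviour.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical TCC contract: Try reserves, Confirm applies, Cancel releases.
interface TccAction {
    boolean tryReserve(String txId, long amount);
    void confirm(String txId);
    void cancel(String txId);
}

class AccountTccAction implements TccAction {
    private final Set<String> tried = ConcurrentHashMap.newKeySet();      // txIds whose Try ran
    private final Set<String> cancelled = ConcurrentHashMap.newKeySet();  // txIds already cancelled

    @Override
    public boolean tryReserve(String txId, long amount) {
        if (cancelled.contains(txId)) {
            return false;            // anti-suspension: Cancel already ran, reject the late Try
        }
        if (!tried.add(txId)) {
            return true;             // idempotency: this Try was already executed
        }
        // ... freeze `amount` on the account here ...
        return true;
    }

    @Override
    public void confirm(String txId) {
        // Idempotent: deduct the frozen amount at most once per txId.
    }

    @Override
    public void cancel(String txId) {
        cancelled.add(txId);         // record the cancel so a late Try is rejected (empty rollback)
        if (tried.remove(txId)) {
            // ... unfreeze the amount ...
        }
    }
}
```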
-
Transaction message scheme
-
This mode is applicable to asynchronous update scenarios where real-time data is not required. The goal is to solve the problem of data consistency between message producers and consumers.
-
Basic idea: use RocketMQ to implement transactional messages, ensuring that placing the order and sending the message either both succeed or both fail.
-
Step 1: system A sends a half message to the broker, which marks the message as prepared; the message is not yet visible to consumers.
-
Step 2: the broker responds to system A, telling it that the message has been received.
-
Step 3: system A executes its local transaction.
-
Step 4: if system A executes the local transaction successfully, it changes the prepared message to Commit, and system B can then consume the message.
-
Step 5: the broker also periodically polls all prepared messages and calls back to system A to ask how the local transaction went, so it knows whether to keep waiting or to roll back.
-
Step 6: system A checks the execution result of its local transaction.
-
Step 7: if system A's local transaction failed, the broker receives a Rollback signal and discards the message; if the local transaction succeeded, the broker receives a Commit signal.
-
After receiving the message, system B executes its own local transaction. If that transaction fails, it retries automatically until it succeeds; alternatively, system B rolls back and notifies system A to roll back through some other channel.
-
System B must be idempotent (a producer sketch follows below).
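A sketch of steps 1 to 7 using the Apache RocketMQ Java client, assuming a 4.x-style API; the group, topic, and name-server address are illustrative, and the local-transaction bodies are left as comments.

```java
import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;

public class OrderTransactionProducer {
    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("order-tx-group");
        producer.setNamesrvAddr("127.0.0.1:9876");                 // assumed local name server
        producer.setTransactionListener(new TransactionListener() {
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                try {
                    // Steps 3-4: run system A's local transaction (e.g. save the order).
                    return LocalTransactionState.COMMIT_MESSAGE;    // half message becomes visible
                } catch (Exception e) {
                    return LocalTransactionState.ROLLBACK_MESSAGE;  // broker discards the half message
                }
            }
            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                // Steps 5-6: the broker's periodic check-back; report the local transaction's outcome.
                return LocalTransactionState.COMMIT_MESSAGE;
            }
        });
        producer.start();

        // Step 1: send the half (prepared) message; it stays invisible to consumers for now.
        Message message = new Message("order-topic", "order created".getBytes());
        producer.sendMessageInTransaction(message, null);

        producer.shutdown();
    }
}
```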
Best-effort notification scheme
-
Basic idea:
-
After executing its local transaction, system A sends a message to the broker.
-
The broker persists the message.
-
If system B fails to execute its local transaction, a best-effort notification service keeps re-invoking system B on a schedule, doing its best to get system B to succeed. If system B still fails after several attempts, give up, hand the case to a developer for troubleshooting, and compensate manually afterwards (a retry sketch follows below).
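A minimal sketch of that retry loop in Java; notifySystemB() is a placeholder for the real call and the intervals are illustrative. Real implementations usually persist the notification task and drive it from a scheduler instead of sleeping in a loop.

```java
public class BestEffortNotifier {
    // Escalating retry intervals, in seconds: 10s, 30s, 1min, 5min, 30min.
    private static final int[] DELAYS_SECONDS = {10, 30, 60, 300, 1800};

    public void notifyWithRetry() {
        for (int delaySeconds : DELAYS_SECONDS) {
            if (notifySystemB()) {
                return;                               // system B acknowledged, we are done
            }
            try {
                Thread.sleep(delaySeconds * 1000L);   // wait before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        // All attempts failed: give up and hand over to alerting / manual compensation.
        System.err.println("notification failed after all retries, raise an alert");
    }

    private boolean notifySystemB() {
        // Placeholder for the real call to system B; returns true on success.
        return false;
    }
}
```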
-
Scheme selection
-
For payment and trading scenarios, TCC is preferred.
-
For large systems with less stringent consistency requirements, consider the transactional message scheme.
-
For a monolithic application, XA two-phase commit is recommended (XA is a concrete implementation of 2PC).
-
The best-effort notification scheme is recommended as a supplement: after all, you cannot hand every problem straight to a developer to investigate; retry a few times first and see whether it succeeds.