Idempotent service design explained in detail

This is day 5 of my participation in the November writing challenge. Details: the last writing challenge of 2021

Introduction

In technical design reviews at work, people often point out that attention should be paid to the idempotence of service interfaces. Recently a student came to me and asked what idempotence really is.

In today's distributed, microservice-based systems, services offer a rich variety of capabilities. Web APIs over HTTP are the most popular way to expose them, which makes guaranteeing the idempotency of services particularly important.

I thought about it and decided the topic was worth explaining properly.

So today I have organized my notes to share with you a series of questions about service idempotence.

1. What is idempotency?

Idempotence is a concept borrowed from mathematics, where a function f is idempotent if f(f(x)) = f(x). In software, an idempotent function or method can be executed repeatedly with the same parameters and still achieve the same result: repeated execution does not change the state of the system any further, so the caller does not need to worry about it.

In other words, multiple calls have the same effect on the system and on its resources as a single call.

[Figure: idempotence]

Idempotence is about how the outside world affects the inside of the system through an interface: whether a resource is called once or many times, the side effects should be the same.

Note: only the side effects on the resource must be the same. The return value can differ between calls!
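
A minimal Python sketch of this distinction, using a hypothetical in-memory dictionary as the resource (the function names are made up for illustration). `set_balance` and `delete_account` are idempotent, `add_balance` is not, and `delete_account` shows that the return value may differ between calls while the side effect stays the same.

```python
# Hypothetical in-memory resource used only for illustration.
accounts = {"alice": 100}


def set_balance(store, name, value):
    """Idempotent: the final state is the same after one call or many."""
    store[name] = value


def add_balance(store, name, delta):
    """Not idempotent: every call changes the state again."""
    store[name] = store.get(name, 0) + delta


def delete_account(store, name):
    """Idempotent side effect, but the return value differs between calls."""
    return store.pop(name, None) is not None


set_balance(accounts, "alice", 30)
set_balance(accounts, "alice", 30)
print(accounts["alice"])                   # 30

add_balance(accounts, "alice", 30)
add_balance(accounts, "alice", 30)
print(accounts["alice"])                   # 90

print(delete_account(accounts, "alice"))   # True
print(delete_account(accounts, "alice"))   # False
```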

2. What are the main idempotent scenarios?

From the definition above, we can see that duplicated or inconsistent data is mostly caused by repeated requests.

A repeated request here means that, under certain circumstances, the same request is sent more than once.

What scenarios can lead to this?

In a microservice architecture, services communicate heavily over the network through HTTP, RPC, or MQ messages. Besides success and failure, a call has a third possible outcome, unknown, which is a timeout. When a timeout occurs, the microservice framework retries.
Multiple clicks during user interaction accidentally trigger the same transaction several times.
MQ middleware delivers a message more than once, so it is consumed repeatedly.
A third-party platform interface (for example, a payment success callback) sends several asynchronous callbacks because of an exception.
Other middleware or application services may also retry, depending on their characteristics.
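
The timeout scenario can be sketched in a few lines of Python. `PaymentService` and `call_with_retry` are hypothetical names: the service does its work but the first reply is lost, and a naive retry loop of the kind many frameworks apply then executes the charge a second time.

```python
class PaymentService:
    """Hypothetical service whose first response is lost on the way back."""

    def __init__(self):
        self.charges = []              # side effects actually applied
        self._drop_next_reply = True

    def charge(self, order_id, amount):
        self.charges.append((order_id, amount))   # the work is done...
        if self._drop_next_reply:
            self._drop_next_reply = False
            raise TimeoutError("reply lost")      # ...but the caller never learns it
        return "ok"


def call_with_retry(func, *args, attempts=3):
    """Naive retry loop: call again whenever the previous attempt timed out."""
    for attempt in range(attempts):
        try:
            return func(*args)
        except TimeoutError:
            if attempt == attempts - 1:
                raise


service = PaymentService()
result = call_with_retry(service.charge, "order-1", 50)
print(result, len(service.charges))   # ok 2
```

The caller sees a single successful call, yet the order was charged twice. This is exactly the situation an idempotent interface has to absorb.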

3. What is the function of idempotence?

Idempotence mainly ensures that multiple invocations have a consistent effect on resources.

Before explaining its role, let's use operations on resources as an illustration:

HTTP methods conventionally correspond to CRUD operations on a database:

POST: CREATE

GET: READ

PUT: UPDATE

DELETE: DELETE

(This applies not only to databases, but to any kind of data, such as charts and files.)

1) Query

SELECT * FROM users WHERE xxx;

A query does not change any data, so it is naturally idempotent.

2) Insert

INSERT INTO users (user_id, name) VALUES (1, 'zhangsan');

Case 1: the table has a unique index (for example, on user_id). A repeated insert fails and leaves the data unchanged, so the operation is idempotent.

Case 2: the table has no unique index. Repeated inserts create duplicate rows, so the operation is not idempotent.
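
The two insert cases can be reproduced with Python's built-in sqlite3 module. This is only a sketch: the in-memory database, the index name `ux_users_user_id`, and the helper `insert_twice` are assumptions made for the example.

```python
import sqlite3


def insert_twice(with_unique_index):
    """Run the same INSERT twice and return the resulting number of rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT)")
    if with_unique_index:
        conn.execute("CREATE UNIQUE INDEX ux_users_user_id ON users (user_id)")
    for _ in range(2):
        try:
            conn.execute(
                "INSERT INTO users (user_id, name) VALUES (?, ?)", (1, "zhangsan")
            )
        except sqlite3.IntegrityError:
            pass  # the duplicate is rejected; the stored data is unchanged
    count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    conn.close()
    return count


print(insert_twice(with_unique_index=True))    # 1
print(insert_twice(with_unique_index=False))   # 2
```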

3) Update

Case 1: direct assignment. No matter how many times the statement is executed, score ends up the same, so it is idempotent.

UPDATE users SET score = 30 WHERE user_id = 1;

Case 2: computed assignment. The score changes with every execution, so it is not idempotent.

UPDATE users SET score = score + 30 WHERE user_id = 1;
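
A quick way to see the difference between the two update cases is to run each statement twice against a throwaway SQLite database. The helper `run_twice` and the starting score of 0 are assumptions made for this sketch.

```python
import sqlite3


def run_twice(statement):
    """Apply one UPDATE statement twice and return the final score."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, score INTEGER)")
    conn.execute("INSERT INTO users (user_id, score) VALUES (1, 0)")
    for _ in range(2):
        conn.execute(statement)
    score = conn.execute("SELECT score FROM users WHERE user_id = 1").fetchone()[0]
    conn.close()
    return score


print(run_twice("UPDATE users SET score = 30 WHERE user_id = 1"))          # 30
print(run_twice("UPDATE users SET score = score + 30 WHERE user_id = 1"))  # 60
```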

4) Delete

Case 1: delete by absolute value. Repeating the statement gives the same result, so it is idempotent.

DELETE FROM users WHERE id = 1;

Case 2: delete by relative position. The result differs each time the statement is repeated, so it is not idempotent.

DELETE TOP (3) FROM users;
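
The same experiment works for the two delete cases. SQLite has no `DELETE TOP (3)`, so the sketch below substitutes a subquery with `LIMIT 3` as a stand-in; the five-row table and the helper name are assumptions for illustration.

```python
import sqlite3


def remaining_after_two_runs(statement):
    """Apply one DELETE statement twice to five rows; return the ids left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 6)])
    for _ in range(2):
        conn.execute(statement)
    ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
    conn.close()
    return ids


by_key = "DELETE FROM users WHERE id = 1"
# SQLite stand-in for DELETE TOP (3): remove the first three rows by id.
first_three = "DELETE FROM users WHERE id IN (SELECT id FROM users ORDER BY id LIMIT 3)"

print(remaining_after_two_runs(by_key))       # [2, 3, 4, 5]
print(remaining_after_two_runs(first_three))  # []
```

Deleting by key leaves the same four rows whether it runs once or twice. The relative delete would leave rows 4 and 5 after one run, but the repeat removes those as well.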

Summary: in general, idempotency only needs to be guaranteed for write requests (inserts and updates).

4. How to solve the idempotence problem?

Searching the Internet for solutions to idempotence problems turns up all kinds of approaches. To judge which one suits our own business scenario best, we need to focus on the essence of the problem.

The analysis above shows that solving the idempotence problem comes down to controlling write operations on resources.

We can analyze and solve the problem at each stage of the request flow:

[Figure: analysis of the idempotence problem]

4.1 Control repeated requests

Control the source that triggers the action, that is, implement idempotence control on the front end.

This is relatively unreliable and does not solve the problem at its root, so it only serves as a supplementary measure.

Main solutions:

Limit the number of operations. For example, the submit button can be clicked only once (it is grayed out after submission).
Redirect promptly. For example, redirect to a success page after an order or payment succeeds, which prevents repeated submission caused by the browser's back and forward buttons.

4.2 Filter repetitive actions

Filtering repeated actions means limiting the number of valid requests as they flow through the system.

1) Distributed locks

Redis records the business identifier that is currently being processed. If no such task is found, the request enters the process; otherwise it is treated as a repeated request and filtered out.

When an order initiates a payment request, the payment system checks whether a key for the order number exists in Redis. If it does not, the key is added to Redis. The check and the insert must be a single atomic operation (for example, SET with the NX option); otherwise two concurrent requests can both pass the check. The system then checks whether the order has already been paid; if not, it performs the payment, and deletes the key once the payment is complete. Redis thus acts as a distributed lock: the next request for the same order can only proceed after the current payment request has finished.

Compared with a de-duplication table, a distributed lock moves concurrency control into the cache, which is more efficient. The idea is the same: only one payment request per order can be processed at a time.
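
The payment flow above can be sketched without a running Redis server. `FakeRedis` below is a hypothetical in-memory stand-in, not a real client: its `set_if_absent` method only mimics the behaviour of Redis `SET key value NX`, and `paid_orders` stands in for the payment records.

```python
import threading


class FakeRedis:
    """In-memory stand-in for the two Redis operations the flow needs."""

    def __init__(self):
        self._keys = set()
        self._mutex = threading.Lock()

    def set_if_absent(self, key):
        """Mimics SET key value NX: True only for the caller that creates the key."""
        with self._mutex:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True

    def delete(self, key):
        with self._mutex:
            self._keys.discard(key)


paid_orders = set()   # hypothetical record of completed payments


def pay(redis, order_no):
    if not redis.set_if_absent("pay:" + order_no):
        return "duplicate request, rejected"      # another request holds the lock
    try:
        if order_no in paid_orders:
            return "already paid"
        paid_orders.add(order_no)                 # the real payment would happen here
        return "paid"
    finally:
        redis.delete("pay:" + order_no)           # release the lock when done


redis = FakeRedis()
print(pay(redis, "A1001"))   # paid
print(pay(redis, "A1001"))   # already paid
```

With a real Redis server, the lock key would normally also carry an expiry time, so that a process that crashes while holding the lock does not block the order forever.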

2) Token

The process is as follows:

1) The server provides an interface for issuing tokens. The client obtains a token before performing the business operation, and the server saves the token in Redis.

2) When the client initiates the business request, it sends the token along, usually in a request header.

3) The server checks whether the token exists in Redis. If it does, this is the first request and processing continues. After the business logic completes, the token is deleted from Redis.

4) If the token does not exist in Redis, the operation is a repeat. The server returns a duplicate marker to the client directly, which ensures the business code is not executed twice.
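
The token steps can be sketched as follows. `TokenStore` is a hypothetical in-memory stand-in for Redis, and `submit_order` is a made-up business call. The sketch differs from step 3 in one respect: it checks and deletes the token in a single atomic step before the business logic runs, which closes the window in which two concurrent requests could both see the token.

```python
import threading
import uuid


class TokenStore:
    """In-memory stand-in for the Redis token storage described above."""

    def __init__(self):
        self._tokens = set()
        self._mutex = threading.Lock()

    def issue(self):
        token = uuid.uuid4().hex
        with self._mutex:
            self._tokens.add(token)
        return token

    def consume(self, token):
        """Check and delete in one atomic step; True only for the first caller."""
        with self._mutex:
            if token in self._tokens:
                self._tokens.remove(token)
                return True
            return False


def submit_order(store, token, orders, payload):
    if not store.consume(token):
        return "repeated request"
    orders.append(payload)        # business logic runs at most once per token
    return "created"


store = TokenStore()
orders = []
token = store.issue()
print(submit_order(store, token, orders, {"item": "book"}))   # created
print(submit_order(store, token, orders, {"item": "book"}))   # repeated request
print(len(orders))                                            # 1
```

The trade-off of consuming the token up front is that a request whose business logic fails must obtain a new token before it can be retried.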

3) Buffer queues

Accept all requests quickly and put them into a buffer queue. Asynchronous tasks then process the queued data and filter out duplicate requests.

Advantages: synchronous processing becomes asynchronous, giving high throughput.

Disadvantages: the processing result cannot be returned immediately; the caller has to listen for the result that is returned asynchronously later.
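
A minimal sketch of the buffer queue idea, assuming every request carries a unique `id` field (an assumption of this example, not something the scheme prescribes). Producers only enqueue; the hypothetical `drain` function plays the role of the asynchronous task and skips ids it has already seen.

```python
import queue


def drain(buffer, seen_ids, handle):
    """Asynchronous-style consumer: process each request id at most once."""
    processed = []
    while True:
        try:
            request = buffer.get_nowait()
        except queue.Empty:
            break
        if request["id"] in seen_ids:
            continue                      # duplicate, filtered out
        seen_ids.add(request["id"])
        handle(request)
        processed.append(request["id"])
    return processed


buffer = queue.Queue()
for request_id in ["r1", "r2", "r1", "r3", "r2"]:
    buffer.put({"id": request_id})        # producers enqueue and return at once

handled = []
print(drain(buffer, set(), handled.append))   # ['r1', 'r2', 'r3']
```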

4.3 Solving Repeated Write Problems

Common ways to implement idempotency include pessimistic locking (SELECT ... FOR UPDATE), optimistic locking, and unique constraints.

1) Pessimistic Lock

Assume that every time a row or table is read it is going to be modified, so lock the row or table first.

When the database executes SELECT ... FOR UPDATE, it acquires a row lock on the selected rows. Any other concurrent SELECT ... FOR UPDATE on the same rows is blocked until the row lock is released, which gives the locking effect.

Row locks obtained by SELECT ... FOR UPDATE are released automatically when the current transaction ends, so the statement must be used inside a transaction. (Note that the FOR UPDATE query must use an index; otherwise the whole table is locked.)

START TRANSACTION;
SELECT * FROM users WHERE id = 1 FOR UPDATE;
UPDATE users SET name = 'xiaoming' WHERE id = 1;
COMMIT; # commit transaction

2) Optimistic Lock

Optimistic locking assumes that nobody else will modify the data between the time it is read and the time it is updated, so no lock is taken on read. Instead, the update checks a version number: if the version has changed in the meantime, the update does not succeed.

However, optimistic locks can be defeated by what is commonly called the ABA problem. This cannot happen if the version number only ever increases.

UPDATE users
SET name = 'xiaoxiao', version = version + 1
WHERE id = 1 AND version = :old_version; # :old_version is the version read before the update

Disadvantage: the current version must be queried before the business operation.
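
A runnable sketch of the version check, using Python's sqlite3 module and a hypothetical helper `rename_user`. The caller passes in the version it read earlier, and the number of affected rows tells it whether the update won or lost.

```python
import sqlite3


def rename_user(conn, user_id, new_name, expected_version):
    """Update only if the row still has the version the caller read earlier."""
    cursor = conn.execute(
        "UPDATE users SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, user_id, expected_version),
    )
    return cursor.rowcount == 1   # 0 rows means someone else updated first


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO users (id, name, version) VALUES (1, 'zhangsan', 1)")

version = conn.execute("SELECT version FROM users WHERE id = 1").fetchone()[0]
print(rename_user(conn, 1, "xiaoxiao", version))   # True
print(rename_user(conn, 1, "xiaoxiao", version))   # False, the version is now 2
```

A repeated request that carries the old version number simply matches no rows, so replaying it is harmless.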

Another approach is state machine control.

For example, the payment status flow: Payment pending -> Payment in progress -> Paid

Strictly speaking, this is also a kind of optimistic lock.
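
State machine control can be sketched the same way as a conditional update, with the current status playing the role of the version. The table name `payments`, the status codes, and the helper `advance` are assumptions made for this example.

```python
import sqlite3

# Allowed transitions, taken from the payment flow above.
TRANSITIONS = {"PENDING": "PAYING", "PAYING": "PAID"}


def advance(conn, payment_id, from_status):
    """Move a payment one step forward only if it is still in from_status."""
    to_status = TRANSITIONS[from_status]
    cursor = conn.execute(
        "UPDATE payments SET status = ? WHERE id = ? AND status = ?",
        (to_status, payment_id, from_status),
    )
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO payments (id, status) VALUES (1, 'PENDING')")

print(advance(conn, 1, "PENDING"))   # True
print(advance(conn, 1, "PAYING"))    # True
print(advance(conn, 1, "PAYING"))    # False, the payment is already PAID
```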

3) Unique constraints

A database unique index or a globally unique business identifier (such as source + sequence number) is commonly used.

This mechanism relies on the uniqueness constraint of the database primary key to solve the idempotence problem in insert scenarios. However, the primary key must not be auto-incremented, so the business has to generate a globally unique primary key itself.

Global ID generation schemes:

UUID: combines the machine's network card address, the local time, and a random number to generate an ID.
Database auto-increment ID: uses the database's auto-increment strategy, such as MySQL auto_increment.
Redis: atomic commands such as INCR and INCRBY guarantee that the generated IDs are unique and ordered.
Snowflake: Twitter's open-source distributed ID generation algorithm divides a 64-bit integer into several fields, each with a different meaning.
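
To make the Snowflake idea concrete, here is a simplified sketch rather than Twitter's actual implementation. It assumes the commonly described layout of a millisecond timestamp, a 10-bit worker id, and a 12-bit sequence; the class name, the custom epoch, and the injectable `clock` parameter are all choices made for this example.

```python
import threading
import time


class SnowflakeSketch:
    """Simplified Snowflake-style generator: timestamp | worker id | sequence."""

    EPOCH_MS = 1609459200000   # arbitrary custom epoch: 2021-01-01 00:00:00 UTC
    WORKER_BITS = 10
    SEQUENCE_BITS = 12

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0
        self._mutex = threading.Lock()

    def next_id(self):
        with self._mutex:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:          # sequence exhausted in this millisecond
                    while now <= self._last_ms:  # wait for the next millisecond
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS))
                | (self.worker_id << self.SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeSketch(worker_id=1)
ids = [generator.next_id() for _ in range(5)]
print(ids == sorted(set(ids)))   # True
```

Because the timestamp occupies the high bits, IDs from one worker increase over time, and the worker id keeps different machines from colliding without any coordination at generation time.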

Summary: in terms of overall benefit to the application, the recommended order is optimistic locking > unique constraint > pessimistic locking.

5. Summary

In general, non-idempotent behavior is mainly caused by repeated, non-deterministic writes.

1. Key points for handling repetition

Across the whole request flow, control both the triggering of repeated requests and the repeated processing of data:

The client limits the sending of repeated requests
The server filters out repeated, invalid requests
The underlying data layer avoids repeated writes

2. Key points for controlling uncertainty

Adjust the service design to avoid uncertainty as far as possible:

Replace running totals with individual data records
Replace range operations with operations on explicitly identified targets
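
As an illustration of replacing a running total with data records, the sketch below stores one row per request in a hypothetical `point_log` table keyed by a request id, and derives the total by summing. The table, the column names, and the SQLite-specific `INSERT OR IGNORE` are assumptions made for this example.

```python
import sqlite3


def add_points(conn, request_id, user_id, points):
    """Record one point change per request id; replays are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO point_log (request_id, user_id, points) VALUES (?, ?, ?)",
        (request_id, user_id, points),
    )


def total_points(conn, user_id):
    """Derive the total from the records instead of storing a counter."""
    row = conn.execute(
        "SELECT COALESCE(SUM(points), 0) FROM point_log WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE point_log (request_id TEXT PRIMARY KEY, user_id INTEGER, points INTEGER)"
)
add_points(conn, "req-1", 1, 30)
add_points(conn, "req-1", 1, 30)   # retry of the same request
add_points(conn, "req-2", 1, 30)
print(total_points(conn, 1))       # 60
```

Compared with `score = score + 30`, a retried request cannot inflate the total, because the second insert with the same request id is a no-op.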

Afterword.

After listening to all of this, he looked as if he had learned a lot and said: I think I get it...

But out of a sense of duty, I had to add a few more words:

1) Idempotent processing is essential for keeping system data accurate, even though it complicates the business logic and may reduce the efficiency of the interface;

2) When you run into a problem, dig down to its essence so that you can solve it efficiently and accurately;

3) Choose the solution that fits your business scenario rather than forcing in some off-the-shelf implementation. Whether you combine existing techniques or invent new ones, remember that the solution that fits is the best one.

I hope you build up the ability to analyze and solve problems. Do not rush to a solution at the start; analyze more deeply, understand the essence of the problem, and then consider how to solve it.

Thanks for reading!

– END –

Author: The Road to Architecture Improvement. Ten years in research and development, architect at a large tech company, CSDN blog expert. Focused on learning and sharing architecture knowledge, career growth, and cognitive upgrades, and committed to sharing practical articles. Looking forward to growing with you.