Follow our WeChat official account: Shishan100

In the previous article we covered how Redisson, an open source framework, implements Redis distributed locks. If you have not read it yet, see: "Please, Don't Ask Me in Interviews How Redis Distributed Locks Are Implemented".

Today we will talk about an interesting topic: in a scenario with thousands of orders per second, how do you optimize the concurrency of a distributed lock?

Background

First, let's look at where this question came from.

A while ago, a friend of mine was interviewing for jobs, and one day he told me about his interview at a well-regarded e-commerce company in China. The interviewer asked him a scenario question:

Suppose a distributed lock is used to prevent inventory from being oversold when orders are placed, but the system faces high concurrency with thousands of orders per second. How would you optimize the distributed lock for high concurrency to handle this scenario?

He said he could not answer because he had never done it. I found the question rather interesting when I heard it, because if I were the interviewer, I would have left the candidate more room.

For example, I would ask the candidate to talk about solutions to inventory overselling in high-concurrency flash sale (seckill) scenarios in e-commerce, the pros and cons of each solution and how it is used in practice, and only then move on to distributed locks.

There are many technical solutions to the overselling problem, such as pessimistic locking, distributed locking, optimistic locking, queue serialization, and Redis atomic operations.

However, since the interviewer restricted the question to solving overselling with a distributed lock, I guess he really wanted to ask about one point: how to optimize the concurrency performance of a distributed lock in a high-concurrency scenario.

In my opinion, the angle of the interviewer's question is reasonable, because in real production a distributed lock guarantees data correctness, but its concurrency is inherently weak.

It just so happens that in another scenario in my own projects I built exactly this kind of high-concurrency optimization for a distributed lock, so I am borrowing my friend's interview question to walk you through the idea.

How does inventory overselling happen?

Let's look at what inventory overselling in e-commerce means when no distributed lock is used. Take a look at the picture below:

The diagram is quite clear. Suppose the order system is deployed on two machines, and two different users each want to buy 10 iPhones at the same time, each sending a request to a different order system instance.

Each order system instance then queries the database and sees that the current inventory is 12 iPhones.

Both instances are happy: an inventory of 12 is greater than the 10 units requested!

So each order system instance sends SQL to the database to place the order and deduct 10 units of inventory. One takes the inventory from 12 to 2, and the other takes it from 2 to -8.

Now we are in trouble: the inventory is negative! There are not 20 iPhones to ship to the two users. What can you do?
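The interleaving described above can be replayed deterministically. The following is a minimal sketch, not the order system's real code: the class `OversellDemo`, the method `replayWithoutLock`, and the local variable standing in for the database row are all hypothetical names introduced for illustration.

```java
// Illustrative sketch only: a local variable stands in for the inventory row
// in the database, and the two "instances" are replayed in a fixed order.
class OversellDemo {

    // Replays the unlocked check-then-deduct flow: both order system
    // instances read the inventory before either one writes it back.
    static int replayWithoutLock(int initialStock, int quantity) {
        int stock = initialStock;

        int seenByA = stock; // instance A queries the inventory
        int seenByB = stock; // instance B queries the same, still unchanged, value

        if (seenByA >= quantity) {
            stock -= quantity; // instance A deducts
        }
        if (seenByB >= quantity) {
            stock -= quantity; // instance B deducts based on its stale read
        }
        return stock;
    }

    public static void main(String[] args) {
        // 12 in stock, two orders of 10 each
        System.out.println(replayWithoutLock(12, 10)); // prints -8
    }
}
```

Both checks pass because each one compares against the value read before any deduction happened, which is exactly why the check and the deduction need to be protected together.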

How does a distributed lock solve inventory overselling?

How can we solve inventory overselling with a distributed lock? It is actually very simple. Recall the principle of the distributed lock implementation we discussed last time:

For a given lock key, only one client can hold the lock at a time. Other clients keep waiting and retrying to acquire the lock. Only the client that holds the lock can execute the business logic that follows.

The locking code follows a simple pattern: acquire the lock, run the business logic, release the lock. Now let's analyze why this prevents inventory overselling.

Follow the step numbers in the figure above and it becomes clear right away.

As the figure shows, only one order system instance can successfully acquire the distributed lock. Only that instance can then query the inventory, check whether it is sufficient, place the order and deduct the inventory, and finally release the lock.

After the lock is released, the other order system instance can acquire it. It then queries the inventory, finds only 2 units left, which is not enough for the purchase, so the order fails. The inventory is never deducted down to -8.
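As a rough sketch of the lock, check, deduct, unlock flow described above, the example below uses an in-process `ReentrantLock` as a stand-in for the distributed lock so that it runs without a Redis server. The class `LockedOrderDemo` and its methods are hypothetical. Redisson's `RLock` implements the same `java.util.concurrent.locks.Lock` interface, so in a real deployment a lock obtained from Redisson would take its place; that wiring is an assumption and is not shown here.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: a ReentrantLock stands in for the distributed lock
// and an int field stands in for the inventory row in the database.
class LockedOrderDemo {
    private final Lock stockLock = new ReentrantLock(); // plays the role of the lock key
    private int stock;

    LockedOrderDemo(int initialStock) {
        this.stock = initialStock;
    }

    // Acquire the lock, check the inventory, deduct it, release the lock.
    boolean placeOrder(int quantity) {
        stockLock.lock();
        try {
            if (stock < quantity) {
                return false; // insufficient inventory: the order fails
            }
            stock -= quantity;
            return true;
        } finally {
            stockLock.unlock(); // always release, even if the business logic throws
        }
    }

    int currentStock() {
        stockLock.lock();
        try {
            return stock;
        } finally {
            stockLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockedOrderDemo shop = new LockedOrderDemo(12);
        Runnable buyTen = () -> System.out.println("order accepted: " + shop.placeOrder(10));
        Thread a = new Thread(buyTen);
        Thread b = new Thread(buyTen);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("remaining stock: " + shop.currentStock()); // prints "remaining stock: 2"
    }
}
```

Whichever thread acquires the lock first gets its order accepted; the other sees only 2 units left and is rejected, so the inventory never goes negative.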

Are there other solutions to inventory overselling?

Of course! Pessimistic locking, distributed locking, optimistic locking, queue serialization, asynchronous queue dispersion, Redis atomic operations, and many more. We have our own set of optimization mechanisms for inventory overselling.

But as mentioned earlier, this article is about optimizing the concurrency of a distributed lock, not about solving inventory overselling, so overselling is just the business scenario used for illustration.

In the future I will write a separate article on solutions to inventory overselling in e-commerce. This article focuses on concurrency optimization for a distributed lock. I hope you understand this intent and background, so that nobody skims the article, misses the point, and then complains.

Also, even if you disagree with something in the article, I suggest leaving a message in the official account backend to discuss it with me. Technology is all about communicating more, opening up ideas, and letting different thinking collide.

Distributed locks in high-concurrency scenarios

OK, now let's see: what is the problem with a distributed lock in a high-concurrency scenario?

The problem is a big one! I wonder if you have spotted it. Once the distributed lock is in place, every client that wants to order a given item must acquire the lock on that item's inventory lock key.

For example, every order for the iPhone must acquire the lock on the "iphone_stock" lock key. As a result, order requests for the same item are serialized and processed one after another.

Look back at the walkthrough above and this should be easy to see.

Let's assume that the work between acquiring and releasing the lock (query inventory -> create order -> deduct inventory) is very fast, and the whole process takes 20 milliseconds, which is already quite good.

One second is 1000 milliseconds, so only 1000 / 20 = 50 requests for this item can be processed per second, one after another.

For example, if 50 requests arrive in one second, all of them for the iPhone, each is processed in 20 milliseconds, one at a time, and after 1000 milliseconds exactly 50 requests have been processed.

Take a look at the picture below to get a sense of it.

So now you can at least see the drawback of simply using a distributed lock to deal with inventory overselling.

The drawback is that when many users order the same item at the same time, the distributed lock serializes the requests, so a large number of orders for the same item cannot be processed concurrently.

Such a solution may be acceptable for an ordinary small e-commerce system with low concurrency and no flash sale scenario.

If concurrency is very low, with fewer than 10 requests per second, and there are no flash sales that create an instantaneous burst on a single item, then it is rare to see 1000 orders for the same item within one second, because a small e-commerce system simply does not have that kind of traffic.

How do you optimize a distributed lock for high concurrency?

OK, we are finally getting to the point. What do we do now?

The interviewer insists: overselling must be solved with a distributed lock, and thousands of orders for the iPhone arrive every second. How do you optimize?

By the calculation above, you can only process 50 iPhone orders per second.

The answer is actually simple. Many of you have read the source code of Java's ConcurrentHashMap and studied its underlying principles, so you should know its core idea (in Java 7 and earlier): segment locking!

The data is divided into several segments, each with its own lock, so multiple threads can modify data in different segments concurrently. This avoids the situation where only one thread at a time can exclusively modify the data in the ConcurrentHashMap.
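To make the segment idea concrete, here is a small lock-striped map. It is a hypothetical illustration of the technique, not the JDK's ConcurrentHashMap implementation: the class `StripedMap` and its methods are invented for this sketch, and each segment is simply a `HashMap` guarded by its own `ReentrantLock`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: one lock per segment, so threads that touch
// different segments do not block each other.
class StripedMap {
    private final ReentrantLock[] locks;
    private final Map<String, Integer>[] segments;

    @SuppressWarnings("unchecked")
    StripedMap(int segmentCount) {
        locks = new ReentrantLock[segmentCount];
        segments = new Map[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    // Maps a key to its segment; floorMod keeps the index non-negative.
    int segmentFor(String key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    // Adds delta to the key's value while holding only that segment's lock.
    int addAndGet(String key, int delta) {
        int i = segmentFor(key);
        locks[i].lock();
        try {
            return segments[i].merge(key, delta, Integer::sum);
        } finally {
            locks[i].unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StripedMap map = new StripedMap(16);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            String key = "key-" + t;
            workers[t] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    map.addAndGet(key, 1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(map.addAndGet("key-0", 0)); // prints 1000
    }
}
```

Two keys that hash to the same segment still contend for one lock, so the achievable parallelism is bounded by the number of segments.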

In addition, Java 8 added the LongAdder class, which improves on AtomicLong. It addresses a problem with CAS operations under high concurrency: because CAS follows the optimistic locking idea, heavy contention can leave a large number of threads spinning in retry loops for a long time.

LongAdder uses a similar segmented approach for its CAS operations. If a CAS fails, the thread automatically moves to another segment and retries there.
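For reference, using LongAdder looks like the sketch below. `LongAdder`, `increment()` and `sum()` are the real JDK API; the class `LongAdderDemo`, the method `countConcurrently`, and the thread and iteration counts are arbitrary choices made for this illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch only: several threads increment one shared LongAdder.
class LongAdderDemo {

    // Increments a shared counter from several threads and returns the total.
    static long countConcurrently(int threads, int incrementsPerThread) {
        LongAdder counter = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment(); // contended updates are spread over internal cells
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.sum(); // adds up the base value and all cells
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(8, 100_000)); // prints 800000
    }
}
```

The caller never sees the segments: the spreading of updates happens inside LongAdder, and `sum()` combines them when the total is needed.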

The optimization idea for a distributed lock is similar. We put this scheme into production in another business scenario, not for inventory overselling.

But inventory overselling is a business scenario that is easy to understand, so let's use it as the example. Take a look at the picture below:

This is segment locking. Suppose there are currently 1000 iPhones in stock. You can split that inventory into 20 segments. If you like, you can add 20 inventory fields to a database table, such as stock_01, stock_02, and so on, or you can keep 20 inventory keys in something like Redis.

In short, you split the 1000 units of inventory so that each segment holds 50 units. For example, stock_01 holds 50 units and stock_02 holds 50 units.

Then, when 1000 requests arrive per second, you can write a simple random algorithm: each request randomly picks one of the 20 inventory segments and locks it.

Bingo! Now up to 20 order requests can run at the same time. Each request locks one inventory segment and then, in the business logic, operates only on that segment in the database or Redis: query the inventory -> check whether it is sufficient -> deduct the inventory.

What does this amount to? In every 20 milliseconds, 20 order requests are processed concurrently, so in 1 second the system can process 20 * 50 = 1000 order requests for the iPhone.
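The arithmetic behind these numbers can be written out as follows. It assumes each lock-protected section takes 20 ms and that requests spread perfectly evenly over the 20 segments, which is an idealization: with random segment selection some requests will collide on the same segment, so the real figure would be somewhat lower.

```latex
\[
\text{throughput}_{\text{single lock}} = \frac{1000\ \text{ms}}{20\ \text{ms}} = 50\ \text{orders/s},
\qquad
\text{throughput}_{20\ \text{segments}} = 20 \times \frac{1000\ \text{ms}}{20\ \text{ms}} = 1000\ \text{orders/s}
\]
```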

Once the data is segmented, there is one pitfall to watch out for: what if an order request acquires the lock on a segment and then finds that the segment does not have enough inventory?

In that case, the request must release the lock, immediately switch to the next inventory segment, acquire that segment's lock, and try again. This fallback must be implemented.
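Putting the pieces together, the random segment choice and the fallback to the next segment might look like the sketch below. It is an assumption-laden illustration rather than production code: in-process `ReentrantLock` objects stand in for per-segment distributed locks, an `int` array stands in for the stock_01, stock_02, ... fields or Redis keys, and the class `SegmentedInventory` and its methods are hypothetical names.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only: one lock and one stock counter per segment.
class SegmentedInventory {
    private final Lock[] locks;
    private final int[] stock;

    SegmentedInventory(int segments, int stockPerSegment) {
        locks = new Lock[segments];
        stock = new int[segments];
        for (int i = 0; i < segments; i++) {
            locks[i] = new ReentrantLock();
            stock[i] = stockPerSegment;
        }
    }

    // Tries to deduct `quantity` from one segment, starting at a random one.
    // Returns the index of the segment used, or -1 if no single segment has enough.
    int tryDeduct(int quantity) {
        int start = ThreadLocalRandom.current().nextInt(locks.length);
        for (int step = 0; step < locks.length; step++) {
            int i = (start + step) % locks.length;
            locks[i].lock();
            try {
                if (stock[i] >= quantity) {
                    stock[i] -= quantity;
                    return i;
                }
                // Not enough here: the lock is released below and the next segment is tried.
            } finally {
                locks[i].unlock();
            }
        }
        return -1;
    }

    // Sums the segments one lock at a time (not an atomic snapshot under load).
    int totalStock() {
        int total = 0;
        for (int i = 0; i < locks.length; i++) {
            locks[i].lock();
            try {
                total += stock[i];
            } finally {
                locks[i].unlock();
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        SegmentedInventory inventory = new SegmentedInventory(20, 50);
        Thread[] workers = new Thread[20];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < 50; n++) {
                    inventory.tryDeduct(1); // 20 threads x 50 orders = 1000 single-unit orders
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(inventory.totalStock()); // prints 0
        System.out.println(inventory.tryDeduct(1)); // prints -1
    }
}
```

One limitation of this sketch: an order larger than what any single segment has left is rejected even if the segments together hold enough. Combining inventory across segments is a further design decision that the sketch does not attempt.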

Are there any drawbacks to this distributed lock concurrency optimization?

There certainly are. The biggest one, as you may have noticed, is that it is very inconvenient! The implementation is too complex.

  • First, you have to store a single piece of data in segments. What used to be one inventory field is now split into 20 segment inventory fields.
  • Second, every time you process inventory, you have to write your own random algorithm to pick a segment to work on.
  • Finally, if one segment runs out of data, you have to automatically switch to the next segment.

All of this has to be implemented by hand in code, which takes real effort and is quite a hassle.

However, in some of our business scenarios we did use a distributed lock, had to optimize its concurrency, and went on to adopt segment locking. The result was very good: concurrency performance increased by dozens of times at once.

Follow-up improvements to this optimization scheme

Take the inventory overselling scenario discussed in this article: if you really handled it this way, you would make life miserable for yourself!

Again, overselling is just a demonstration scenario here. I will cover other solutions to inventory overselling in a high-concurrency architecture in a separate article.

A footnote to the previous article

While finishing this article, I received a message from some friends saying that someone in a technical chat group had read the previous article ("Please, Don't Ask Me in Interviews How Redis Distributed Locks Are Implemented") and criticized it, claiming that its explanation of the Redis distributed lock principle would lead people astray.

I must state clearly here that the previous article explicitly said it described how Redisson, an open source framework, implements Redis locks. It is not a set of principles I made up myself.

In fact, Redisson is an excellent open source framework. I think its overall distributed lock implementation is fine. It has some flaws, but it can be used in production environments.

In addition, some readers may feel that it differs from the distributed lock implementation idea given by the Redis author on the official website, so they criticize it and say one should follow the Redis author's idea.

I must point out that what the Redis official website gives is only an idea for implementing a distributed lock with Redis. Remember, it is an idea! There is a gap between an idea and a technical solution that actually lands in a production environment.

For example, the distributed lock implementation idea on the Redis official website does not cover an automatic lock renewal mechanism, a mechanism for mutually exclusive clients to wait for the lock, or reentrant lock acquisition and release. The Redisson framework, however, implements a whole set of mechanisms for distributed locking.

So again, that is only the idea. If you want, you can implement a production-grade distributed lock yourself based on the idea from the Redis website.

In addition, the RedLock algorithm given on the Redis official website is one that I personally have never recommended for production use.

The algorithm may have some logic problems, and it has caused controversy abroad. The Redis author himself has linked to a critique of his RedLock algorithm on the official website, although of course he does not agree with it.

On this matter each side has its own arguments. For details, see this passage from the official website:

Martin Kleppmann analyzed Redlock here. I disagree with the analysis and posted my reply to his analysis here.

So next time I have the chance, I will write about the RedLock distributed lock algorithm proposed by the Redis author, with plenty of hand-drawn diagrams, and about how to use that algorithm in production based on the Redisson framework. Then we can discuss it.

End

Architecture Notes for Hugesia (ID: Shishan100)

More than ten years of experience in BAT architecture

Recommended reading:

1. Please! Please don't ask me about the underlying principles of Spring Cloud

2. [Behind the Double 11 carnival] How does the microservice registry carry tens of millions of visits for large-scale systems?

3. [Performance optimization] Spring Cloud parameter optimization in practice at tens of thousands of concurrent requests per second

4. How does a microservice architecture guarantee 99.99% high availability during the Double 11 carnival?

5. Dude, let me tell you in plain English what the Hadoop architecture is all about

6. How can the Hadoop NameNode support thousands of concurrent accesses per second in large-scale clusters?

7. [Secret of performance optimization] How does Hadoop improve the upload performance of terabyte-scale files by 100 times?

8. Please, don't ask me in interviews how TCC distributed transactions are implemented!

9. [What a trap!] How do eventually consistent distributed transactions ensure 99.99% high availability in real production?

10. Please, don't ask me in interviews how Redis distributed locks are implemented!

11. [Eye-opening!] See how Hadoop's underlying algorithms elegantly improve large-scale cluster performance by more than 10 times