Preface

Many friends have told me that they have learned a lot about high-concurrency topics, yet when they actually work on a project they still do not know how to handle high-concurrency business scenarios. Many are even still stuck at the stage of simply writing CRUD interfaces; they do not know how to apply their concurrency knowledge to real projects, let alone how to build a high-concurrency system!

So what exactly is a high-concurrency system? Today, together with the other articles in this high-concurrency series, we will demystify the architecture of a typical seckill (flash-sale) system in a high-concurrency business scenario.

E-commerce system architecture

E-commerce is home to the typical seckill scenario. What is a seckill scenario? Simply put, the number of people trying to buy a product far exceeds its inventory, and the product is snapped up in a very short time. The annual June 18 and November 11 promotions and Xiaomi's new-product sales are typical examples of this kind of business scenario.

We can simplify the architecture of the e-commerce system as shown in the following figure.

As shown in the figure, we can roughly divide the core of the e-commerce system into a load balancing layer, an application layer and a persistence layer. Next, we estimate the concurrency each layer can handle.

  • If the load balancing layer uses high-performance Nginx, we can estimate the maximum concurrency of Nginx at roughly 100,000+, i.e., on the order of hundreds of thousands.

  • Assuming Tomcat is used in the application layer, the maximum concurrency of Tomcat can be estimated at about 800, i.e., on the order of hundreds.

  • Assuming the persistence layer uses Redis as the cache and MySQL as the database, the maximum concurrency of MySQL can be estimated at about 1,000, on the order of thousands, and the maximum concurrency of Redis at about 50,000, on the order of tens of thousands.

So the concurrency that the load balancing layer, the application layer and the persistence layer can each handle is very different. What can we usually do to improve the overall concurrency and caching capability of the system?

(1) Scale the system

Scaling includes vertical scaling (upgrading the configuration of existing machines) and horizontal scaling (adding machines), and it is effective in most scenarios.

(2) Cache

Use a local cache or a centralized cache to reduce network I/O and serve reads from memory. This works in most scenarios. A minimal cache-read sketch is given after this list.

(3) Read and write separation

Use read/write separation to divide and conquer and increase the system's parallel processing capability.
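To make the caching measure concrete, here is a minimal cache-aside read sketch in Java, assuming Redis is accessed through the Jedis client; the key name, the 5-minute TTL and the loadProductFromDb method are illustrative assumptions standing in for a real MySQL query.

```java
import redis.clients.jedis.Jedis;

// Cache-aside read: try Redis first, fall back to the database,
// then populate the cache with a short TTL.
public class ProductCache {

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);

    public String getProductDetail(long productId) {
        String key = "product:detail:" + productId;       // assumed key layout
        String cached = jedis.get(key);                    // memory read, no DB I/O
        if (cached != null) {
            return cached;
        }
        String detail = loadProductFromDb(productId);      // slow path: hits MySQL
        if (detail != null) {
            jedis.setex(key, 300, detail);                 // cache for 5 minutes
        }
        return detail;
    }

    private String loadProductFromDb(long productId) {
        return "product-" + productId;                     // placeholder for a real query
    }
}
```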

Seckill system features

We can describe the characteristics of a seckill system from two angles: business and technology.

Business characteristics of the seckill system

Here we can take 12306.cn as an example. During the annual Spring Festival travel rush, its page views are enormous, while for the rest of the year its traffic is relatively flat. In other words, every Spring Festival travel rush, the traffic on 12306.cn surges instantaneously.

The Xiaomi seckill system is similar: goods go on sale at 10 a.m., traffic is relatively flat before 10 o'clock, and at 10 o'clock concurrency spikes instantaneously.

Therefore, the traffic and concurrency of a seckill system can be represented in the following figure.

As the figure shows, the concurrency of a seckill system has an instantaneous sharp peak, which is also called the traffic-spike phenomenon.

We can summarize the features of a seckill system as follows.

(1) Limited time, limited quantity, limited price

The activity runs within a specified time window; the quantity of goods in the activity is limited; and the price is far below the original price, that is, in the seckill activity the item is sold for much less than it normally costs.

For example, the seckill activity may run only from 10:00 to 10:30 a.m. on a given day, with only 100,000 items available, at a very low price, such as the "1 yuan purchase" scenario.

Time limits, quantity limits and price limits can exist separately or in combination.

(2) Activity warm-up

The activity needs to be configured in advance; before it starts, users can view its details; and before the seckill starts, the activity is heavily promoted.

(3) Short duration

The number of buyers is huge; the merchandise sells out quickly.

In the system's traffic profile there is a spike: the concurrent traffic is extremely high for a moment, and in most seckill scenarios the goods sell out in a very short time.

Technical characteristics of the seckill system

We can summarize the technical characteristics of a seckill system as follows.

(1) The instantaneous concurrency is very high

A large number of users snap up goods at the same time, so the instantaneous concurrency peak is very high.

(2) Far more reads than writes

Page views of the goods are huge; the quantity of goods available for purchase is very small; and the number of inventory queries and visits is far greater than the number of goods actually purchased.

Traffic-limiting measures are often added to product pages. For example, early seckill systems added verification codes to product pages to smooth the front-end traffic reaching the system; more recent seckill product detail pages prompt users to log in before the page opens. These are all measures to limit access to the system.

(3) Simple process

The business process of a seckill system is generally simple; broadly speaking, it can be summarized as: place the order and deduct the inventory.

For a system that sees heavy traffic only for a short period, scaling out is not a good fit: even if we scaled the system, the extra capacity would be used for only a short time, while for most of the time the system can be accessed normally without it. So what can we do to improve the system's performance?

Seckill system solution

Based on the characteristics of a seckill system, we can take the following measures to improve its performance.

(1) Asynchronous decoupling

Break the whole process apart and drive the core flow through a queue.

(2) Rate limiting and anti-abuse

Control the overall site traffic and raise the bar for requests to avoid exhausting system resources.

(3) Resource control

Control resource scheduling across the whole flow, playing to each component's strengths and avoiding its weaknesses.

Because the application layer can handle far less concurrency than the cache, in a high-concurrency system we can use OpenResty to access the cache directly from the load balancing layer, avoiding the cost of calling into the application layer. You can learn more about OpenResty at openresty.org/cn/. At the same time, because the number of goods in a seckill is relatively small, we can also use dynamic rendering and CDN technology to speed up access to the site.

If the concurrency is too high at the start of the seckill activity, we can put users' requests into a queue for processing and show the user a queuing page.

(Queuing page image from Meizu)

Seckill system sequence diagram

Many of the seckill systems and seckill solutions found online are not real seckill systems: they only process requests synchronously, and once concurrency really rises, the performance of these so-called seckill systems drops dramatically. Let's first look at the sequence of a seckill system that places orders synchronously.

Synchronous ordering process

1. The user initiates a seckill request

In the synchronous ordering flow, the user first initiates a seckill request, and the mall service performs the following operations in sequence to handle it.

(1) Check whether the verification code is correct

The mall service checks whether the verification code submitted with the user's seckill request is correct.

(2) Check whether the activity has ended

Check whether the current seckill activity has already ended.

(3) Check whether the request is on the blacklist

There is a lot of malicious competition in e-commerce; rival merchants may hit the seckill system through improper means, consuming a large amount of its bandwidth and other resources, so a risk-control system is needed to implement a blacklist mechanism. For simplicity, a blacklist can also be implemented with an interceptor that counts access frequency, as sketched below.
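As a concrete illustration, here is a minimal sketch of such an interceptor, assuming Spring MVC (Spring Boot 2.x with javax.servlet); the one-minute window, the 120-requests-per-minute threshold and the userId request parameter are assumptions for illustration only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.HandlerInterceptor;

// Counts requests per caller within a one-minute window and rejects
// callers that exceed the threshold, i.e. a simple blacklist-by-frequency.
public class BlacklistInterceptor implements HandlerInterceptor {

    private static final int LIMIT_PER_MINUTE = 120;       // assumed threshold

    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();
    private volatile long windowStart = System.currentTimeMillis();

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        long now = System.currentTimeMillis();
        if (now - windowStart > 60_000) {                  // start a new counting window
            counters.clear();
            windowStart = now;
        }
        String userId = request.getParameter("userId");    // fall back to the client IP
        String caller = (userId != null) ? userId : request.getRemoteAddr();
        long count = counters.computeIfAbsent(caller, k -> new AtomicLong()).incrementAndGet();
        if (count > LIMIT_PER_MINUTE) {                    // treat as blacklisted for this window
            response.setStatus(429);
            return false;
        }
        return true;
    }
}
```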

(4) Verify whether the real inventory is sufficient

The system needs to check whether the real inventory of the goods is sufficient to cover the inventory allocated to this seckill activity.

(5) Deduct the inventory in the cache

In the seckill service, information such as commodity inventory is stored in the cache. At this step the system checks that the inventory allocated to the seckill activity is still sufficient and deducts the purchased quantity from it.

(6) Calculate the seckill price

In a seckill activity, the seckill price of a commodity differs from its regular price, so the seckill price needs to be calculated.

Note: in a real seckill scenario, if the system involves more complex business, more operations will be involved; I have only listed some common ones here.
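To make the serial nature of this flow concrete, here is a rough Java sketch of how checks (1) through (6) line up one after another; the collaborating methods (checkCaptcha, activityEnded, isBlacklisted, realStockEnough, calcSeckillPrice) and the Redis key layout are hypothetical placeholders, with Redis accessed through Jedis.

```java
import java.math.BigDecimal;
import redis.clients.jedis.Jedis;

// Every step runs serially on the request thread, which is exactly
// why the synchronous flow cannot sustain high concurrency.
public class SyncSeckillService {

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);

    public String seckill(long userId, long activityId, long goodsId, String captcha) {
        if (!checkCaptcha(userId, captcha))  return "invalid captcha";   // (1)
        if (activityEnded(activityId))       return "activity ended";    // (2)
        if (isBlacklisted(userId))           return "request rejected";  // (3)
        if (!realStockEnough(goodsId))       return "sold out";          // (4)
        String stockKey = "seckill:stock:" + activityId + ":" + goodsId;
        long left = jedis.decr(stockKey);                                // (5) deduct cached stock
        if (left < 0) {
            jedis.incr(stockKey);                                        // roll back the over-deduction
            return "sold out";
        }
        BigDecimal price = calcSeckillPrice(activityId, goodsId);        // (6) seckill price
        return "ok, seckill price = " + price;
    }

    // Placeholders for the real business checks described above.
    private boolean checkCaptcha(long userId, String captcha)  { return captcha != null; }
    private boolean activityEnded(long activityId)             { return false; }
    private boolean isBlacklisted(long userId)                 { return false; }
    private boolean realStockEnough(long goodsId)              { return true; }
    private BigDecimal calcSeckillPrice(long activityId, long goodsId) { return BigDecimal.ONE; }
}
```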

2. Submit the order

(1) Order entry

Save the order information submitted by the user to the database.

(2) Deduct the real inventory

After the order is written to the database, the quantity of goods in this successful order is deducted from the real inventory.

If we build a seckill system with the above flow, its overall performance will not be high, because every business step is executed serially when a user initiates a seckill request. When the concurrency gets too high, we pop up the following queuing page to ask the user to wait.

(Queuing page image from Meizu)

The wait could be 15 seconds, 30 seconds, or even longer. Here lies a problem: the connection between the client and the server is not released between the moment the user initiates the seckill request and the moment the server returns a result, which ties up a lot of server resources.

Many online articles on how to build a seckill system take this approach. So, can this approach produce a seckill system? Yes, but the concurrency it supports is not very high. Some readers may object: "Our company built its seckill system this way and it has run in production without problems!" What I want to say is: yes, you can build a seckill system with synchronous ordering, but its performance will not be high. The reason your company's synchronous seckill system has had no major problems is simply that its concurrency has never reached a high enough level.

Therefore, many so-called seckill systems have seckill business but cannot be called real seckill systems: they use a synchronous ordering flow, which limits the concurrent traffic the system can handle. The reason nothing goes wrong after launch is that the concurrency is never high enough to overwhelm the whole system.

If the seckill systems of 12306, Taobao, Tmall, JD.com, Xiaomi and other large malls were built this way, their systems would be crushed sooner or later, and their engineers would be lucky not to be fired! So in a seckill system, this kind of synchronous order-processing flow is not acceptable.

That is the whole synchronous ordering process; if the ordering flow were more complex, it would involve even more business operations.

Asynchronous ordering process

Since a synchronous ordering flow cannot support a real seckill system, we need to adopt an asynchronous ordering flow, which does not cap the amount of concurrent traffic the system can handle.

1. The user initiates a seckill request

After a user initiates a seckill request, the mall service goes through the following process.

(1) Check whether the verification code is correct

When a user initiates a seckill request, the verification code is sent together with the request. The system checks whether the verification code is valid and correct.

(2) Whether to apply rate limiting

The system decides whether to rate-limit the user's request. Here we can use the length of the message queue as the criterion: because users' requests are placed in the message queue and pile up there, we can decide whether to reject a request based on how many pending requests are already in the queue (a sketch is given at the end of this step).

For example, if a seckill activity sells 1,000 items and there are already 1,000 requests in the message queue, subsequent seckill requests no longer need to be processed; we can directly return a "sold out" message to those users.

Thus, with rate limiting, we can handle users' requests and release connection resources more quickly.

(3) Send to MQ

Once the user's seckill request passes validation, we send the request parameters and other information to MQ for asynchronous processing and immediately return a response to the user. In the mall service, a dedicated asynchronous task-processing module consumes the requests from the message queue and handles the subsequent steps.

When a user initiates a seckill request, the asynchronous ordering flow performs far fewer business operations than the synchronous flow: it hands the subsequent operations to the asynchronous processing module via MQ, quickly returns the response to the user, and releases the request connection.
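Here is a minimal sketch of this entry point: validate, rate-limit by queue depth, enqueue, and return at once. For simplicity a Redis List stands in for the message queue so the queue-length check stays a single LLEN call; a real system would more likely use RocketMQ or Kafka together with a separate pending-request counter. The key name and parameters are assumptions.

```java
import redis.clients.jedis.Jedis;

// Asynchronous seckill entry point: do only the cheap checks, push the
// request onto the queue, and release the connection immediately.
public class AsyncSeckillEntry {

    private static final String QUEUE_KEY = "seckill:request:queue";    // assumed key name

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);

    public String submit(long userId, long activityId, long goodsId, String captcha, long itemsOnSale) {
        if (captcha == null || captcha.isEmpty()) {                      // (1) captcha check (simplified)
            return "invalid captcha";
        }
        if (jedis.llen(QUEUE_KEY) >= itemsOnSale) {                      // (2) queue already holds as many
            return "sold out";                                           //     requests as items on sale
        }
        String message = userId + ":" + activityId + ":" + goodsId;      // (3) enqueue for the async worker
        jedis.lpush(QUEUE_KEY, message);
        return "request accepted, please wait for the result";           // respond and free the connection
    }
}
```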

2. Asynchronous processing

We can process the following operations of the ordering flow asynchronously.

(1) Check whether the activity has ended.

(2) Check whether the request is on the system blacklist. To guard against malicious competition from rivals in e-commerce, a blacklist mechanism can be added to the system to place malicious requests on a blacklist. As before, this can be done with an interceptor that counts access frequency.

(3) Deduct the inventory of the seckill item in the cache.

(4) Generate a seckill Token bound to the current user and the current seckill activity. Only requests for which a seckill Token has been generated are eligible to proceed with the seckill.

With this asynchronous processing mechanism, the system can control exactly how many resources and how many threads are used to handle these tasks. A sketch of such a worker is given below.
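A rough sketch of such an asynchronous worker: it re-checks the activity, deducts the cached stock, and issues a seckill Token bound to the user and the activity. The key names, the 30-minute Token TTL and the placeholder checks are assumptions for illustration.

```java
import java.util.UUID;
import redis.clients.jedis.Jedis;

// Consumes one queued seckill request and performs steps (1)-(4) above.
public class SeckillQueueWorker {

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);

    public void handle(long userId, long activityId, long goodsId) {
        if (activityEnded(activityId) || isBlacklisted(userId)) {        // (1) + (2)
            return;                                                      // drop the request
        }
        String stockKey = "seckill:stock:" + activityId + ":" + goodsId;
        long left = jedis.decr(stockKey);                                // (3) deduct cached stock
        if (left < 0) {
            jedis.incr(stockKey);                                        // oversold, roll back
            return;
        }
        // (4) generate the seckill Token, bound to this user and this activity
        String tokenKey = "seckill:token:" + activityId + ":" + userId;
        jedis.setex(tokenKey, 1800, UUID.randomUUID().toString());
    }

    private boolean activityEnded(long activityId) { return false; }     // placeholder check
    private boolean isBlacklisted(long userId)     { return false; }     // placeholder check
}
```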

3. Short polling for the result

At this point, the client can use short polling to check whether it has obtained seckill eligibility. For example, the client can poll the server every 3 seconds to ask whether a seckill Token has been generated for it. On the server side, the handling is simply to check whether a seckill Token exists for the current user: if the server has generated one, the user is eligible; otherwise the client keeps polling until it times out or the server returns a message that the item is sold out or the user is not eligible.
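A client-side sketch of this short polling loop might look like the following; the 60-second overall timeout and the queryTokenFromServer call (an HTTP request to the mall service) are assumptions.

```java
// Poll the server every 3 seconds for a seckill Token, giving up after a timeout.
public class SeckillResultPoller {

    public String pollResult(long userId, long activityId) throws InterruptedException {
        long deadline = System.currentTimeMillis() + 60_000;             // poll for at most 60 seconds
        while (System.currentTimeMillis() < deadline) {
            String token = queryTokenFromServer(userId, activityId);     // e.g. GET /seckill/token
            if (token != null) {
                return token;                                            // eligible: proceed to settlement
            }
            Thread.sleep(3_000);                                         // wait 3 seconds between polls
        }
        return null;                                                     // timed out: treat as not eligible
    }

    private String queryTokenFromServer(long userId, long activityId) {
        return null;                                                     // placeholder for the HTTP call
    }
}
```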

While the client polls for the seckill result, we can also show the user a queuing page, but unlike the synchronous ordering flow, the client only queries the server for its eligibility status every few seconds, so the request connection is not held for a long time.

Some readers may ask: with short polling, is it possible that a user never learns whether they are eligible before the timeout? The answer is: yes, possibly! Think about the real purpose of a seckill: merchants run these events not to make money on the items themselves but to boost sales and product visibility and attract more users to buy their products. Therefore, we do not have to guarantee that 100% of users can query their seckill eligibility status.

4. Seckill settlement

(1) Verify the seckill Token

When the client submits the seckill settlement, it also submits the seckill Token, and the mall service verifies whether the current seckill Token is valid.

(2) Add to the shopping cart

After verifying that the Token is valid, the mall service adds the product to the shopping cart.

5. Submit the order

(1) Order entry

Save the order information submitted by the user to the database.

(2) Delete the Token

After the order is successfully written to the database, the Token is deleted.
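Putting settlement and order submission together, a minimal sketch looks like this: verify the seckill Token, write the order, then delete the Token so it cannot be reused. The key layout matches the worker sketch above, and saveOrder is a hypothetical database write.

```java
import redis.clients.jedis.Jedis;

// Verify the Token, persist the order, then invalidate the Token.
public class SeckillOrderService {

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);

    public String submitOrder(long userId, long activityId, long goodsId, String token) {
        String tokenKey = "seckill:token:" + activityId + ":" + userId;
        String stored = jedis.get(tokenKey);
        if (stored == null || !stored.equals(token)) {                   // Token missing, expired or forged
            return "invalid seckill token";
        }
        long orderId = saveOrder(userId, activityId, goodsId);           // order entry
        jedis.del(tokenKey);                                             // delete the Token after success
        return "order created: " + orderId;
    }

    private long saveOrder(long userId, long activityId, long goodsId) {
        return System.nanoTime();                                        // placeholder for the MySQL insert
    }
}
```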

Here is a question for you to consider: why do we apply asynchronous processing only in the (pink-highlighted) part of the asynchronous ordering flow, rather than applying asynchronous peak clipping and valley filling to the rest of it as well?

This is because, in the design of the asynchronous ordering flow, both in the product design and in the interface design, we rate-limit the user's request at the stage where the seckill request is first initiated; in other words, the system's rate limiting happens very early. Once the request has been limited at the seckill-request stage, the traffic peak has already been smoothed out, and from then on the system's concurrency and traffic are no longer very high.

Therefore, when many online articles and posts about seckill systems claim to use asynchronous peak clipping to rate-limit at the order-placement step, that is nonsense! Placing the order is one of the later steps in the whole seckill flow; rate limiting must happen up front, and applying it in the later stages of the seckill business is useless.

High-concurrency "black tech" and winning tricks

Suppose we use Redis for caching in the seckill system, and suppose Redis can handle around 50,000 concurrent reads and writes, while our mall's seckill business needs to support about 1,000,000 concurrent requests. If all 1,000,000 requests hit Redis, Redis will very likely break down. How do we solve this problem? Let's explore it.

In a highly concurrent seckill system, if Redis is used to cache data, the concurrency capability of the Redis cache is key, because many of the upstream operations need to access Redis. Asynchronous peak clipping is only a basic measure; the crux is to guarantee Redis's concurrent processing capability.

The key idea for solving this problem is divide and conquer: split the commodity inventory.

Splitting the inventory

When we store the inventory of a seckill item in Redis, we can "split" that inventory to improve the read and write concurrency of Redis.

For example, suppose the id of the seckill item is 10001 and its inventory is 1,000 units, stored in Redis as (10001, 1000). We split this inventory into 5 shares of 200 units each, so what we store in Redis becomes (10001_0, 200), (10001_1, 200), (10001_2, 200), (10001_3, 200), (10001_4, 200).

After splitting the inventory, each share is stored under a key made of the commodity id plus a numeric suffix. Because these keys hash to different values, there is a high probability that the inventory keys do not fall into the same Redis slot, which improves the performance and concurrency of Redis requests.

After splitting the inventory, we also need to store in Redis a mapping from the commodity id to the keys of the split inventory. The key of this mapping is the commodity id, 10001, and its value is the list of keys holding the split inventory: 10001_0, 10001_1, 10001_2, 10001_3, 10001_4. In Redis we can use a List to store these values.

When actually handling inventory, we first query from Redis all the keys of the split inventory for the commodity, use an AtomicLong to count the incoming requests, and take the request count modulo the number of split-inventory keys, giving 0, 1, 2, 3 or 4. We then prepend the commodity id to obtain the actual inventory cache key, and read or deduct the corresponding inventory from Redis directly with that key.
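Here is a rough Java sketch of this splitting scheme, following the 10001_0 .. 10001_4 example above; the mapping key name (stock:keys:<goodsId>) and the roll-back-on-empty-share behaviour are assumptions, and Redis is accessed through Jedis.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import redis.clients.jedis.Jedis;

// Splits one item's stock across several Redis keys and spreads
// deductions over them round-robin with an AtomicLong counter.
public class SplitStockService {

    private final Jedis jedis = new Jedis("127.0.0.1", 6379);
    private final AtomicLong requestCounter = new AtomicLong();

    /** Split the total stock into equal shares, e.g. initStock("10001", 1000, 5). */
    public void initStock(String goodsId, int totalStock, int shares) {
        String mappingKey = "stock:keys:" + goodsId;                     // assumed mapping key name
        jedis.del(mappingKey);
        for (int i = 0; i < shares; i++) {
            String shareKey = goodsId + "_" + i;                         // e.g. 10001_0
            jedis.set(shareKey, String.valueOf(totalStock / shares));
            jedis.rpush(mappingKey, shareKey);                           // remember every share key
        }
    }

    /** Deduct one unit from one of the shares, chosen round-robin. */
    public boolean deductOne(String goodsId) {
        List<String> shareKeys = jedis.lrange("stock:keys:" + goodsId, 0, -1);
        if (shareKeys.isEmpty()) {
            return false;                                                // stock was never initialized
        }
        int index = (int) (requestCounter.getAndIncrement() % shareKeys.size());
        String shareKey = shareKeys.get(index);                          // e.g. 10001_3
        long left = jedis.decr(shareKey);
        if (left < 0) {
            jedis.incr(shareKey);                                        // this share is empty: roll back;
            return false;                                                // a real system could try another share
        }
        return true;
    }
}
```

Because consecutive requests land on different share keys, the write load is spread across Redis slots (and cluster nodes) instead of hammering a single key.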

Accessing the cache from the load balancing layer

In a high-concurrency business scenario, we can access the cache directly from the load balancing layer using Lua scripts with OpenResty.

Consider a scenario: in a seckill, the product is snapped up in an instant. After that, when users keep sending seckill requests, if the system still forwards every request from the load balancing layer to the application layer, and then from the application layer services to the cache and the database, the effort is essentially meaningless, because the goods are already sold out; checking layer by layer through the application layer adds nothing. And since the application layer can only handle concurrency on the order of hundreds, routing these requests through it would also reduce the overall concurrency of the system.

To solve this, we can extract the user id, commodity id and seckill activity id carried in the request at the load balancing layer, and use Lua scripts and similar techniques to read the inventory information in the cache directly. If the inventory of the seckill item is less than or equal to 0, we return a "sold out" message to the user right away, without the layer-by-layer checks of the application layer. For this architecture, refer to the e-commerce system architecture diagram at the beginning of this article.

Final words

If you found this article helpful, please search for and follow the WeChat official account "Glacier Technology" to learn high-concurrency programming techniques with Glacier.

Finally, a knowledge map of core concurrent programming skills is attached. I hope it helps you avoid detours when learning concurrent programming.