Preface
Most people have seen seckill (flash-sale) systems, such as JD.com's or Taobao's flash sales, or Xiaomi's phone flash sales. So how is the backend of a seckill system implemented? How should we design one? What does a seckill system need to account for? How do we design one with a bit of flair? We'll look at these questions in this article.
One: what problems a seckill system must consider
1.1: The oversell problem — In the seckill business scenario, the most important issue is overselling. If there are only 100 items in stock but 200 end up sold, the company loses money: seckill prices are generally very low, so overselling seriously harms the company's financial interests. Overselling of goods is the first problem to solve.
1.2: High concurrency — A seckill is short and highly concurrent: it lasts only a few minutes, and to create buzz, companies generally attract users with very low prices, so a huge number of users participate in the buying. A flood of requests arrives in a short window; preventing that concurrency from causing cache breakdown or cache failure and crushing the database is a key design concern.
1.3: Interface abuse — Most seckills nowadays attract purpose-built bot software that continuously fires requests at the backend server; hundreds of requests per second are common. How to detect and block this kind of repeated, invalid request traffic also needs targeted consideration.
1.4: The seckill URL — Ordinary users see only a simple seckill page: before the start time the seckill button is grayed out, and once the start time arrives it becomes clickable. That works for casual users, but anyone with a little technical background can open the browser's network panel with F12, find the seckill URL, and fire requests at it with a script — or, knowing the URL in advance, trigger the seckill with a single direct request. This too is a problem we need to think about.
1.5: Database design — A seckill can crash a server. If the seckill shares a database with our other businesses and is coupled to them, a failure is likely to drag those businesses down too. How do we prevent this? Even if the seckill server freezes or crashes, normal online services should be affected as little as possible.
1.6: The sheer volume of requests — Following 1.2, even a cache is not enough by itself to absorb a short, intense traffic spike. Carrying such a huge number of visits while still providing stable, low-latency service is a major challenge. Let's do the math: a single Redis server can handle roughly 40K QPS, but a seckill that attracts enough users can push QPS into the hundreds of thousands, which a single Redis cannot support. The cache gets penetrated, requests hit the DB directly, MySQL goes down, and the backend fills with errors.
Two: seckill system design and technical scheme
2.1: Seckill database design
Per the database problem raised in 1.5, the seckill should get its own dedicated database, so that the high concurrency of the seckill activity cannot take down the whole site. Only two tables are needed here: a seckill order table and a seckill goods table.
In practice there would be a few more tables — a goods table, joined on goods_id to fetch detailed product information (image, name, regular price, seckill price, and so on), and a user table, keyed by user_id for the user's nickname, phone number, shipping address, and other details — but we won't spell those out here.
2.2: Seckill URL design — To stop people with scripting experience from hitting the backend ordering interface directly, the seckill URL must be made dynamic: even the developers of the system should not be able to know the seckill URL before the seckill starts.
Concretely, MD5-hash a random string to produce the seckill URL. The frontend first asks the backend for the concrete URL, and only after the backend verifies the token can the seckill proceed.
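A minimal sketch of this idea, assuming a per-(user, goods) token cached server-side (in production the cache would be Redis with an expiry; all class and method names here are hypothetical):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class SeckillUrl {
    // In production this cache would live in Redis with an expire time.
    private static final ConcurrentHashMap<String, String> tokenCache = new ConcurrentHashMap<>();

    public static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 1: once the seckill has started, the frontend asks for the URL token.
    // The random UUID makes the token unknowable before this call.
    public static String createToken(long userId, long goodsId) {
        String token = md5Hex(userId + ":" + goodsId + ":" + UUID.randomUUID());
        tokenCache.put(userId + ":" + goodsId, token);
        return token;
    }

    // Step 2: the ordering endpoint (e.g. /seckill/{token}/order) verifies the token
    // before any stock logic runs.
    public static boolean verifyToken(long userId, long goodsId, String token) {
        return token != null && token.equals(tokenCache.get(userId + ":" + goodsId));
    }
}
```

A scripted request that guesses the path fails verification, since the token exists only after the seckill opens.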
2.3: Static pages — The goods' description, parameters, transaction records, images, and comments are all baked into a static page. User requests are served directly on the client side without touching the backend server or the database, relieving as much pressure on the server as possible.
This can be done with FreeMarker template technology: create a page template, fill it with data, and render the page.
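FreeMarker itself needs a configuration and template directory, so as a stand-in the sketch below shows just the idea: a template with placeholders is filled once with the goods data to produce a static page, and per-user requests never touch the database. The `${...}` placeholder style mimics FreeMarker, but this is not FreeMarker's API — it is an illustrative simplification.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for template rendering. In a real system the rendered HTML
// would be written to disk or pushed to a CDN once per goods item, ahead of time.
public class StaticPageRenderer {
    public static String render(String template, Map<String, String> model) {
        String page = template;
        for (Map.Entry<String, String> e : model.entrySet()) {
            page = page.replace("${" + e.getKey() + "}", e.getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        Map<String, String> model = new HashMap<>();
        model.put("name", "Phone X");
        model.put("seckillPrice", "99");
        System.out.println(render("<h1>${name}</h1><p>price: ${seckillPrice}</p>", model));
    }
}
```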
2.4: Upgrading standalone Redis to a Redis cluster — A seckill is a read-heavy, write-light scenario, for which Redis is the ideal cache. But given the cache breakdown problem, we should build a Redis cluster and use Sentinel mode, improving both Redis's performance and its availability.
2.5: Using Nginx — Nginx is a high-performance web server that can handle tens of thousands of concurrent connections, while a single Tomcat handles only a few hundred. Having Nginx take client requests and distribute them across a cluster of backend Tomcat servers greatly improves concurrency.
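A minimal sketch of such a setup (server addresses, ports, and the path are placeholders, not from the original design):

```nginx
# Pool of backend Tomcat instances; Nginx round-robins across them by default.
upstream seckill_tomcat {
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
    server 192.168.1.13:8080;
}

server {
    listen 80;

    # Seckill requests are proxied to the Tomcat pool.
    location /seckill/ {
        proxy_pass http://seckill_tomcat;
    }
}
```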
2.6: Streamlined SQL — A typical case is deducting stock: the traditional approach is to query the stock first and then update it, which takes two SQL statements, but it can in fact be done in one.
update miaosha_goods set stock = stock - 1 where goods_id = #{goods_id} and version = #{version} and stock > 0;
This guarantees the stock cannot be oversold while updating it in a single statement. Note also that a version-number optimistic lock is used, which performs better than a pessimistic lock.
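The optimistic-lock pattern behind that statement can be mimicked in memory with a compare-and-set loop — a sketch of the same read-then-conditionally-write idea, not the actual SQL/MyBatis code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogy of "update ... where version = #{version} and stock > 0":
// read the current value, attempt a conditional write, and retry only if another
// request raced us in between.
public class OptimisticStock {
    private final AtomicInteger stock;

    public OptimisticStock(int initial) {
        this.stock = new AtomicInteger(initial);
    }

    // Returns true if one unit was deducted, false once stock is exhausted.
    public boolean tryDeduct() {
        while (true) {
            int current = stock.get();                    // like "select stock, version"
            if (current <= 0) {
                return false;                             // the "stock > 0" guard failed
            }
            if (stock.compareAndSet(current, current - 1)) {
                return true;                              // "version = #{version}" matched
            }
            // CAS failed: someone else updated first; re-read and retry.
        }
    }

    public int remaining() {
        return stock.get();
    }
}
```

Unlike a pessimistic lock, no request ever blocks holding a lock; losers of a race simply retry.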
2.7: Redis stock pre-deduction — A flood of requests comes in, and each needs the backend to check stock: a very read-heavy scenario. We can pre-deduct stock in Redis: before the seckill starts, set the value in Redis, e.g. redis.set(goodsId, 100); then read it with Integer stock = (Integer) redis.get(goodsId); and, if the stock value is still positive, decrement it by 1.
Note, however, that when an order is cancelled the stock must be added back, and when adding it back the total must never exceed the configured number. (The stock check and decrement must be one atomic operation — a Lua script works here.) When the order is actually placed, the stock can be read straight from Redis.
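To make the logic concrete, here is a pure-Java model of the check-and-decrement and the capped restore (in production the check-and-decrement runs atomically inside Redis via EVAL of a small Lua script; key names and class names below are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// The Lua script Redis would run atomically via EVAL (sketch):
//   local stock = tonumber(redis.call('get', KEYS[1]))
//   if stock and stock > 0 then
//       redis.call('decr', KEYS[1])
//       return 1
//   end
//   return 0
// This class models the same decrement-if-positive / capped-restore semantics
// so the logic can be followed without a Redis server.
public class PreDeduct {
    private final AtomicInteger cachedStock;

    public PreDeduct(int total) {
        this.cachedStock = new AtomicInteger(total); // like redis.set(goodsId, total)
    }

    // Decrement only while stock is still positive; never goes below zero.
    public boolean preDeduct() {
        while (true) {
            int stock = cachedStock.get();
            if (stock <= 0) return false;
            if (cachedStock.compareAndSet(stock, stock - 1)) return true;
        }
    }

    // Cancelling an order gives the unit back, but never beyond the original total.
    public boolean restore(int total) {
        while (true) {
            int stock = cachedStock.get();
            if (stock >= total) return false;
            if (cachedStock.compareAndSet(stock, stock + 1)) return true;
        }
    }
}
```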
2.8: Interface rate limiting — A seckill ultimately boils down to database updates, but a large share of the requests are invalid. Our job is to filter out these invalid requests and keep them from reaching the database. Rate limiting can be applied at several levels:
- 2.8.1: Front-end rate limiting
The first step is rate limiting on the frontend: once the seckill button is clicked and the request fires, the button cannot be clicked again for the next 5 seconds (by setting it to disabled). This small measure costs little to build but is effective.
- 2.8.2: Repeated requests from the same user within xx seconds are rejected
How many seconds depends on the actual business and the seckill volume; 10 seconds is a common limit. Concretely: String value = redis.get(userId); if value is null or empty, this is a valid first request and is allowed through; if it is non-empty, it is a repeat and is discarded. For a valid request, call redis.setex(userId, 10, value) — value can be anything, though a business-meaningful attribute is better — keying on userId with a 10-second expiry (after 10 seconds the key disappears automatically and the user may request again).
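A pure-Java model of that SETEX-based dedupe — here the map stands in for Redis and expiry is checked against a supplied timestamp, whereas the real Redis server expires the key on its own:

```java
import java.util.concurrent.ConcurrentHashMap;

// Models setex(userId, 10, value): a request is valid only if the user has no
// unexpired entry; otherwise it is a duplicate and gets dropped.
public class DuplicateFilter {
    private final ConcurrentHashMap<Long, Long> expiresAt = new ConcurrentHashMap<>();
    private final long windowMillis;

    public DuplicateFilter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // compute() makes the check-and-set atomic per user, so two concurrent
    // requests from the same user cannot both be admitted.
    public boolean allow(long userId, long nowMillis) {
        final boolean[] admitted = {false};
        expiresAt.compute(userId, (id, deadline) -> {
            if (deadline != null && nowMillis < deadline) {
                return deadline;              // still inside the window: reject
            }
            admitted[0] = true;               // first request, or window expired
            return nowMillis + windowMillis;
        });
        return admitted[0];
    }
}
```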
2.9: Token bucket rate limiting — There are many rate-limiting policies for an interface; here we adopt the token bucket algorithm. Its basic idea: every request tries to obtain a token, and the backend processes only requests that hold one. We control the speed and volume of token production ourselves. Guava provides the RateLimiter API for this.
Here is a simple example. Note that the Guava dependency is required:
```java
import com.google.common.util.concurrent.RateLimiter;

public class TestRateLimiter {
    public static void main(String[] args) {
        // Produce 1 token per second
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // acquire() blocks the thread until a token can be fetched from the
            // bucket, and returns how long this call had to wait
            double waitTime = rateLimiter.acquire();
            System.out.println("task " + i + " executed, waitTime " + waitTime);
        }
        System.out.println("end of execution");
    }
}
```
The code above limits our token bucket to producing 1 token per second (deliberately slow) and loops 10 times, executing a task through the RateLimiter each iteration.
acquire() blocks the current thread until a token is obtained — if the task cannot get a token, it waits, so a request stalls for some time before it can proceed. The method returns how long the thread had to wait.
When this runs, the first task does not wait at all, since a token has already been produced by the time execution starts; each subsequent task must wait until the bucket produces its next token before it can proceed.
If no token is available, the call blocks (you can see the pauses in the output). But this approach is not great: if too many client requests arrive, they all stall in the backend waiting for token production (a poor user experience), and no task is ever abandoned. We need a better policy: if a token is not obtained within a certain time, simply reject the task. Here is another example:
```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class TestRateLimiter2 {
    public static void main(String[] args) {
        final RateLimiter rateLimiter = RateLimiter.create(1);
        for (int i = 0; i < 10; i++) {
            // Wait at most 0.5 s for a token. (Beware that `(long) 0.5` truncates
            // to 0 seconds, so the timeout is expressed in milliseconds here.)
            boolean isValid = rateLimiter.tryAcquire(500, TimeUnit.MILLISECONDS);
            System.out.println("task " + i + " got a token: " + isValid);
            if (!isValid) {
                continue;
            }
            System.out.println("task " + i + " executing");
        }
        System.out.println("end");
    }
}
```
This uses the tryAcquire method, whose main feature is a timeout: it returns true if a token is obtained within the timeout and false otherwise.
Each task tries for up to 0.5 seconds to obtain a token; on failure the task is skipped (in a seckill environment, the request is simply discarded). The program runs as follows:
Since the bucket produces only 1 token per second, any task whose 0.5-second wait elapses before a token appears gets false back and is skipped.
**How effective is this rate-limiting strategy?** If 4 million requests arrive in an instant, the token production rate is set to 20 per second, and each attempt waits up to 0.05 seconds for a token, then in testing only about 4 requests get through at a time and the vast majority are rejected. This is where the token bucket algorithm shines.
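To see why almost everything is rejected, the bucket semantics can be reproduced without Guava. The sketch below is a deterministic model (time is passed in explicitly, and all parameters are illustrative), not Guava's implementation:

```java
// Minimal token bucket: refills at ratePerSecond up to capacity; tryAcquire
// succeeds only if a whole token is available at the given timestamp.
public class SimpleTokenBucket {
    private final double ratePerSecond;
    private final double capacity;
    private double tokens;
    private double lastRefillSeconds;

    public SimpleTokenBucket(double ratePerSecond, double capacity, double startSeconds) {
        this.ratePerSecond = ratePerSecond;
        this.capacity = capacity;
        this.tokens = capacity;
        this.lastRefillSeconds = startSeconds;
    }

    public synchronized boolean tryAcquire(double nowSeconds) {
        // Refill proportionally to elapsed time, capped at capacity.
        tokens = Math.min(capacity, tokens + (nowSeconds - lastRefillSeconds) * ratePerSecond);
        lastRefillSeconds = nowSeconds;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

If 100 requests hit the bucket at the same instant, only as many as there are stored tokens succeed; everyone else is rejected immediately rather than queued — exactly the behavior a seckill wants.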
2.10: Asynchronous ordering — To improve ordering throughput and to keep the ordering service from failing, order placement should be handled asynchronously. The most common approach is a queue, which brings three clear benefits: asynchrony, peak shaving, and decoupling. RabbitMQ works well here: after rate limiting and the stock check, valid requests flow to this step and are published to the queue; a consumer receives the message and places the order asynchronously. Once the order is persisted, the user can be notified of the successful seckill by SMS; on failure, compensation or retry can be applied.
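RabbitMQ needs a running broker, so the flow can be modelled in-process with a queue: the producer side enqueues an order message after rate limiting and pre-deduction pass, and the consumer side persists it. Class and method names are hypothetical, and the synchronous `drain()` stands in for what would be an MQ listener thread:

```java
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncOrderDemo {
    static class OrderMessage {
        final long userId;
        final long goodsId;
        OrderMessage(long userId, long goodsId) {
            this.userId = userId;
            this.goodsId = goodsId;
        }
    }

    static final BlockingQueue<OrderMessage> queue = new LinkedBlockingQueue<>();
    static final Queue<String> savedOrders = new ConcurrentLinkedQueue<>();

    // Producer side: called on the request path once rate limiting and the
    // Redis pre-deduction have both passed. The request thread returns at once.
    static void submitOrder(long userId, long goodsId) {
        queue.offer(new OrderMessage(userId, goodsId));
    }

    // Consumer side: in production this is the MQ listener that inserts the
    // order into the database and triggers the SMS notification; here we drain
    // synchronously so the flow is deterministic.
    static int drain() {
        int processed = 0;
        OrderMessage msg;
        while ((msg = queue.poll()) != null) {
            savedOrders.add("order:" + msg.userId + ":" + msg.goodsId);
            processed++;
        }
        return processed;
    }
}
```

The queue absorbs the burst (peak shaving), and the ordering service only ever sees the consumer's steady pace.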
2.11: Service degradation — If a server goes down or a service becomes unavailable during the seckill, fallback measures should be in place. An earlier post covered using Hystrix for circuit breaking and degradation: develop a fallback service so that if the server really does fail, users get a friendly message back instead of a frozen page or a raw server error.
Three: seckill flow chart
This is the seckill flow I designed. Of course, different seckill volumes call for different technology choices: this process can support traffic in the hundreds of thousands, but for tens of millions or more it would have to be redesigned — for example, sharding databases and tables, swapping the queue for Kafka, and enlarging the Redis cluster.
The main point of this design is to show how to face high-concurrency problems and start solving them. Thinking more and building more at work raises our skill level — keep at it! If there are any mistakes in this post, please kindly point them out.
Source: rrd.me/fukGC