Author: draw your whole life
Link: https://juejin.cn/post/6844903949632274445
Every holiday season, people in first- and second-tier cities who return to their hometowns or travel almost all face the same problem: grabbing train tickets!
Grabbing tickets: the ultimate test of thinking about concurrency
Although in most cases we can book a ticket nowadays, everyone has surely experienced the scene where tickets sell out the instant they are released.
Especially during the Spring Festival, people not only use 12306 itself but also turn to Zhixing and other ticket-grabbing apps, as hundreds of millions of people across the country scramble for tickets at the same time.
The “12306” service endures a QPS unmatched by any seckill (flash-sale) system in the world; millions of concurrent requests are perfectly normal for it!
The author studied the server architecture of “12306” and picked up many highlights of its system design. Here I would like to share them and work through an example: how to provide normal, stable service when 1 million people grab 10,000 train tickets at the same time.
GitHub repository for the example: https://github.com/GuoZhaoran/spikeSystem
1. Large-scale high-concurrency system architecture
A high-concurrency system adopts a distributed, clustered deployment. A load-balancing layer sits above the services, and various disaster-recovery measures (dual data centers, node fault tolerance, server disaster recovery, and so on) ensure high availability, while traffic is balanced across servers according to their load capacity and the configured policies.
1.1 Overview of Load Balancing
In the architecture above, user requests to the server pass through three layers of load balancing. Each of the three layers is briefly introduced below.
① OSPF (Open Shortest Path First), an Interior Gateway Protocol (IGP)
OSPF advertises the state of network interfaces between routers in order to build a link-state database and generate a shortest-path tree. OSPF automatically calculates the cost of each routed interface, although the cost can also be set manually. The cost OSPF calculates is inversely proportional to the interface bandwidth: the higher the bandwidth, the lower the cost. Paths with equal cost to a destination can load-balance traffic, and up to six links can balance simultaneously.
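As a rough worked example of that cost formula (a sketch assuming the 100 Mbps reference bandwidth that Cisco routers default to; the reference value is vendor-dependent and configurable):

package main

import "fmt"

// ospfCost sketches the OSPF interface cost formula:
// cost = reference bandwidth / interface bandwidth, floored at 1.
func ospfCost(referenceMbps, interfaceMbps float64) int {
	cost := int(referenceMbps / interfaceMbps)
	if cost < 1 {
		cost = 1
	}
	return cost
}

func main() {
	fmt.Println(ospfCost(100, 10))   // 10 Mbps link  -> cost 10
	fmt.Println(ospfCost(100, 100))  // 100 Mbps link -> cost 1
	fmt.Println(ospfCost(100, 1000)) // 1 Gbps link   -> cost 1 (floored)
}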
②LVS (Linux Virtual Server)
LVS is a clustering technology that uses IP load balancing combined with content-based request distribution. The scheduler has excellent throughput, distributes requests evenly across the real servers, and automatically shields server failures, turning a group of servers into one high-performance, highly available virtual server.
③ Nginx
Nginx is a very high-performance HTTP proxy/reverse-proxy server, frequently used for load balancing in service development.
Nginx implements load balancing in three main ways:
- Polling (round robin)
- Weighted polling
- IP hash
Let’s configure and test weighted polling for Nginx.
1.2 Nginx weighted polling demo
The following is a weighted-polling load configuration. Locally, I listen on ports 3001-3004 with weights of 1, 2, 3 and 4 respectively:
upstream load_rule {
    server 127.0.0.1:3001 weight=1;
    server 127.0.0.1:3002 weight=2;
    server 127.0.0.1:3003 weight=3;
    server 127.0.0.1:3004 weight=4;
}
...
server {
    listen 80;
    server_name load_balance.com www.load_balance.com;
    location / {
        proxy_pass http://load_rule;
    }
}
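For intuition about what those weights do, here is a simplified Go model of the smooth weighted round-robin algorithm behind nginx's default policy (an illustrative sketch, not nginx's actual source):

package main

import "fmt"

// server models one upstream entry in smooth weighted round-robin.
type server struct {
	name          string
	weight        int // configured weight
	currentWeight int // running counter adjusted on every pick
}

// pick performs one round of smooth weighted round-robin: every server
// gains its weight, the largest counter wins and is then penalized by
// the total weight, producing a smoothly interleaved 1:2:3:4 sequence.
func pick(servers []*server) *server {
	total := 0
	var best *server
	for _, s := range servers {
		s.currentWeight += s.weight
		total += s.weight
		if best == nil || s.currentWeight > best.currentWeight {
			best = s
		}
	}
	best.currentWeight -= total
	return best
}

func main() {
	servers := []*server{
		{name: ":3001", weight: 1},
		{name: ":3002", weight: 2},
		{name: ":3003", weight: 3},
		{name: ":3004", weight: 4},
	}
	for i := 0; i < 10; i++ { // 10 picks reproduce the 1:2:3:4 ratio
		fmt.Println(pick(servers).name)
	}
}

Over any window of 10 picks, the four servers are chosen 1, 2, 3 and 4 times respectively, which is exactly the ratio the ab test below should reproduce in stat.log.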
Next we use Go to start HTTP services listening on the four ports. Below is the program for port 3001; the others differ only in the port number:
package main

import (
    "net/http"
    "os"
    "strings"
)

func main() {
    http.HandleFunc("/buy/ticket", handleReq)
    http.ListenAndServe(":3001", nil)
}

// handleReq records which port served the request.
func handleReq(w http.ResponseWriter, r *http.Request) {
    failedMsg := "handle in port:3001"
    writeLog(failedMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
    fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
    defer fd.Close()
    content := strings.Join([]string{msg, "\n"}, "")
    buf := []byte(content)
    fd.Write(buf)
}
The handler writes the serving port to the ./stat.log file, and we use the ab benchmarking tool to generate load (-n is the total number of requests, -c the concurrency):
ab -n 1000 -c 100 http://www.load_balance.com/buy/ticket
Reference on upstream load balancing in Nginx:
https://www.kancloud.cn/digest/understandingnginx/202607
2. Choosing a seckill scheme
To solve this problem, we first need to settle one thing: at what point, relative to placing the order, should inventory be deducted? There are three common options:
2.1 Place the order, then reduce inventory
- Under extreme concurrency, the details of any memory operation have a critical impact on performance. Logic such as creating an order, in particular, usually has to be persisted to a disk-backed database, which puts incredible strain on the database.
- A malicious user could keep placing orders without ever paying, reducing the inventory and "selling out" many tickets; the server can cap the number of orders per IP or per user, but that is not a good solution either.
2.2 Pay, then reduce inventory
2.3 Withhold inventory

Withholding inventory (deducting stock the moment an order is placed, and releasing it if payment does not arrive in time) balances the drawbacks of the first two options, and it is the approach taken below.
3. The art of withholding inventory
On a single machine with low concurrency, we usually implement inventory withholding like this:
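(The original shows a diagram here. As a stand-in, the following is a minimal Go sketch of single-machine withholding, assuming a mutex-guarded counter; the names are mine, not the repository's.)

package main

import (
	"fmt"
	"sync"
)

// stock is a hypothetical single-machine inventory guarded by a mutex.
type stock struct {
	mu        sync.Mutex
	remaining int64
}

// deduct withholds one ticket, returning false once stock is exhausted.
func (s *stock) deduct() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.remaining <= 0 {
		return false
	}
	s.remaining--
	return true
}

func main() {
	s := &stock{remaining: 100}
	var wg sync.WaitGroup
	var sold int64
	var soldMu sync.Mutex
	for i := 0; i < 1000; i++ { // 1000 concurrent buyers, 100 tickets
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.deduct() {
				soldMu.Lock()
				sold++
				soldMu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("sold:", sold) // always 100, never oversold
}

A single machine cannot withstand seckill-level traffic, though, so we deploy many such instances behind the load balancer.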
Each machine then holds a local stock of 100 train tickets, and the total stock across 100 servers is still 10,000, which guarantees that orders are not oversold. The cluster architecture we are describing is as follows:
Let’s make a specific analysis based on the following architecture diagram:
4. Code demonstration
4.1 Initialization
The Redis library used is Redigo; here is the code:
package localSpike

// LocalSpike holds the per-machine stock counters.
type LocalSpike struct {
    LocalInStock     int64 // local stock on this machine
    LocalSalesVolume int64 // tickets sold locally so far
}
...
package remoteSpike

// RemoteSpikeKeys holds the Redis keys of the unified inventory hash.
type RemoteSpikeKeys struct {
    SpikeOrderHashKey  string // Redis hash key for the seckill order
    TotalInventoryKey  string // hash field: total inventory
    QuantityOfOrderKey string // hash field: quantity sold
}

// NewPool creates a Redis connection pool.
func NewPool() *redis.Pool {
    return &redis.Pool{
        MaxIdle:   10000,
        MaxActive: 12000, // max number of connections
        Dial: func() (redis.Conn, error) {
            c, err := redis.Dial("tcp", ":6379")
            if err != nil {
                panic(err.Error())
            }
            return c, err
        },
    }
}
...
func init() {
    localSpike = localSpike2.LocalSpike{
        LocalInStock:     150,
        LocalSalesVolume: 0,
    }
    remoteSpike = remoteSpike2.RemoteSpikeKeys{
        SpikeOrderHashKey:  "ticket_hash_key",
        TotalInventoryKey:  "ticket_total_nums",
        QuantityOfOrderKey: "ticket_sold_nums",
    }
    redisPool = remoteSpike2.NewPool()
    done = make(chan int, 1)
    done <- 1
}
4.2 Local and unified inventory withholding
The local inventory logic is very simple: each user request increments the sales volume, compares it against the local stock, and returns a bool:
package localSpike

// LocalDeductionStock increments local sales and reports whether the sale
// still fits within local stock (<= lets the last local ticket sell, which
// matches the log in section 4.4). Callers must serialize access.
func (spike *LocalSpike) LocalDeductionStock() bool {
    spike.LocalSalesVolume = spike.LocalSalesVolume + 1
    return spike.LocalSalesVolume <= spike.LocalInStock
}
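Note that LocalDeductionStock is not goroutine-safe on its own; the sample serializes calls through a channel in handleReq (section 4.3). As a lock-free alternative sketch using sync/atomic (my own variation, not part of the original repo):

package localSpike

import "sync/atomic"

// LocalDeductionStockAtomic is a lock-free variant: the atomic increment
// makes the counter safe without external serialization.
func (spike *LocalSpike) LocalDeductionStockAtomic() bool {
	return atomic.AddInt64(&spike.LocalSalesVolume, 1) <= spike.LocalInStock
}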
The unified inventory is operated in Redis. Because Redis executes commands on a single thread, and we need to fetch data, write data, and compute in a series of steps, we bundle the commands into a Lua script to guarantee that the whole operation is atomic (unlike MULTI/EXEC, a Lua script can branch on the values it reads):
package remoteSpike
......

const LuaScript = `
    local ticket_key = KEYS[1]
    local ticket_total_key = ARGV[1]
    local ticket_sold_key = ARGV[2]
    local ticket_total_nums = tonumber(redis.call('HGET', ticket_key, ticket_total_key))
    local ticket_sold_nums = tonumber(redis.call('HGET', ticket_key, ticket_sold_key))
    -- sell only while sold < total; a strict > avoids overselling the last ticket
    if (ticket_total_nums > ticket_sold_nums) then
        return redis.call('HINCRBY', ticket_key, ticket_sold_key, 1)
    end
    return 0
`

// RemoteDeductionStock deducts from the unified inventory in Redis atomically.
func (RemoteSpikeKeys *RemoteSpikeKeys) RemoteDeductionStock(conn redis.Conn) bool {
    lua := redis.NewScript(1, LuaScript)
    result, err := redis.Int(lua.Do(conn, RemoteSpikeKeys.SpikeOrderHashKey, RemoteSpikeKeys.TotalInventoryKey, RemoteSpikeKeys.QuantityOfOrderKey))
    if err != nil {
        return false
    }
    return result != 0
}
Before the test, initialize the unified inventory hash in Redis:

hmset ticket_hash_key "ticket_total_nums" 10000 "ticket_sold_nums" 0
4.3 Responding to the user
We start an HTTP service that listens on a port:
package main
...
func main() {
    http.HandleFunc("/buy/ticket", handleReq)
    http.ListenAndServe(":3005", nil)
}
Copy the code
Now that all the initialization is done, the logic of handleReq is clear: take the token from the done channel to serialize access to the shared counters, deduct the local stock and then the unified stock, respond to the user, and log the result:
func handleReq(w http.ResponseWriter, r *http.Request) {
    redisConn := redisPool.Get()
    defer redisConn.Close() // return the connection to the pool
    LogMsg := ""
    <-done // acquire the global lock token
    // deduct the local stock first, then the unified stock in Redis
    if localSpike.LocalDeductionStock() && remoteSpike.RemoteDeductionStock(redisConn) {
        util.RespJson(w, 1, "ticket grabbed successfully", nil)
        LogMsg = LogMsg + "result:1,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
    } else {
        util.RespJson(w, -1, "sold out", nil)
        LogMsg = LogMsg + "result:0,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
    }
    done <- 1 // release the lock token
    // record the result in ./stat.log
    writeLog(LogMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
    fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
    defer fd.Close()
    content := strings.Join([]string{msg, "\n"}, "")
    buf := []byte(content)
    fd.Write(buf)
}
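The done channel above works as a binary semaphore that serializes the critical section. A stripped-down illustration of the pattern (a standalone sketch with hypothetical names, not the project's code):

package main

import (
	"fmt"
	"sync"
)

func main() {
	done := make(chan int, 1) // buffered channel used as a binary semaphore
	done <- 1                 // one token available

	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // acquire the token (blocks while another goroutine holds it)
			counter++
			done <- 1 // release the token
		}()
	}
	wg.Wait()
	fmt.Println(counter) // always 1000: the increments were serialized
}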
4.4 Single-machine service pressure test
Start the service, then test it with the ab benchmarking tool:
ab -n 10000 -c 100 http://127.0.0.1:3005/buy/ticket
Here is the benchmark output on my low-spec local Mac:
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:
Server Hostname:        127.0.0.1
Server Port:            3005

Document Path:          /buy/ticket
Document Length:        29 bytes

Concurrency Level:      100
Time taken for tests:   2.339 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1370000 bytes
HTML transferred:       290000 bytes
Requests per second:    4275.96 [#/sec] (mean)
Time per request:       23.39 [ms] (mean)
Time per request:       0.234 [ms] (mean, across all concurrent requests)
Transfer rate:          572.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8  14.7      6     223
Processing:     2   15  17.6     11     232
Waiting:        1   11  13.5      8     225
Total:          7   23  22.8     18     239

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     24
  75%     26
  80%     28
  90%     33
  95%     39
  98%     45
  99%     54
 100%    239 (longest request)
In addition, checking the logs shows that throughout the run requests were handled normally, traffic was even, and Redis behaved normally:
//stat.log
...
result:1,localSales:145
result:1,localSales:146
result:1,localSales:147
result:1,localSales:148
result:1,localSales:149
result:1,localSales:150
result:0,localSales:151
result:0,localSales:152
result:0,localSales:153
result:0,localSales:154
result:0,localSales:156
...
5. Review
① Load balancing, divide and conquer
② Use concurrency and asynchrony properly