Author: Lin Guanhong / The Ghost at my Fingertips

Juejin: juejin.cn/user/178526…

Blog: www.cnblogs.com/linguanh/

GitHub: github.com/af913337456…

Tencent Cloud column: cloud.tencent.com/developer/u…

Chongdong (Wormhole) blockchain column: www.chongdongshequ.com/article/153…


Contents

  • Preface
  • The general order flow
  • Thinking about bottlenecks
  • The order queue
    • The first order queue
    • The second order queue
    • Conclusion
  • Choosing a queue implementation
  • Answers
  • Example code for the Go version of the second queue

Preface

My development work at present is mainly traditional e-commerce combined with blockchain applications, and the blockchain platform is still Ethereum. In addition, the book I wrote, published by Tsinghua University Press, finally reached the stores after August. It is titled "Blockchain DApp Development in Action with Ethereum" and can now be bought online.

The idea I’m going to share in this article is the order queue commonly used in e-commerce applications.

The general order flow

In e-commerce applications, the steps a user goes through from placing an order to completing payment can be shown simply and intuitively in the following figure:

Here, persisting the order information means storing the data in the database. After the payment is completed, the third-party payment platform calls back NotifyUrl to update the order status.

The update process of order status is shown as follows:
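In code terms, the callback-driven status update might look like the following minimal sketch; the status constants, the request field name, and the updateOrderStatus helper are illustrative assumptions (using the standard net/http package), not part of the original:

// A minimal sketch of the NotifyUrl handler; constants, field names,
// and updateOrderStatus are illustrative assumptions.
const (
	StatusPendingPayment = "PENDING_PAYMENT" // order persisted, waiting for payment
	StatusPaid           = "PAID"            // payment confirmed via the callback
)

func notifyHandler(w http.ResponseWriter, r *http.Request) {
	orderNo := r.FormValue("order_no") // hypothetical field name
	// A real handler must first verify the payment platform's signature.
	if err := updateOrderStatus(orderNo, StatusPaid); err != nil { // hypothetical persistence helper
		w.WriteHeader(http.StatusInternalServerError)
		return
	}
	w.Write([]byte("success")) // the expected ack format depends on the platform
}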

Thinking about bottlenecks

The most direct bottleneck on the server side is TPS. Setting the traffic-splitting points aside, we mainly look at the bottleneck of order-information persistence.

In high-concurrency business scenarios, such as flash sales (seckill) and snap-up purchases, there will be a large number of order requests in a short period of time. If order persistence performs frequent reads and writes directly against the database layer without optimization, the database will be overwhelmed and easily become the first service to collapse. A routine order-writing flow is shown in the following figure:

As you can see, persisting an order typically involves a network connection operation (connecting to the database) and multiple I/O operations.

Thanks to connection pooling technology, we can connect to the database without going through a full connection handshake each time. Instead, we retrieve an open connection handle from the pool and use it directly, much like a thread pool.
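To make the pooling point concrete, here is a minimal sketch using Go's standard database/sql package, which maintains such a pool automatically; the MySQL driver and DSN are illustrative assumptions:

// A minimal sketch of pool configuration with the standard database/sql
// package; the driver choice and DSN are illustrative assumptions.
import (
	"database/sql"
	"time"

	_ "github.com/go-sql-driver/mysql"
)

func newDB() (*sql.DB, error) {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/shop") // placeholder DSN
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(100)                 // Cap concurrently open connections
	db.SetMaxIdleConns(10)                  // Keep idle handles ready for reuse
	db.SetConnMaxLifetime(30 * time.Minute) // Recycle long-lived connections
	return db, nil
}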

In addition, we can add further optimizations to the above process. For example, information that only needs to be read can be stored in a memory cache layer in advance and kept up to date, so that it can be read quickly when needed.
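A minimal read-through cache along those lines might look like this sketch; the type, fields, and loader callback are illustrative assumptions (using the standard sync package):

// A minimal read-through cache sketch; names are illustrative assumptions.
type ReadCache struct {
	mu    sync.RWMutex
	items map[string][]byte
}

func (c *ReadCache) Get(key string, loadFromDB func(string) ([]byte, error)) ([]byte, error) {
	c.mu.RLock()
	v, ok := c.items[key]
	c.mu.RUnlock()
	if ok {
		return v, nil // Served from the memory layer, no database I/O
	}
	v, err := loadFromDB(key) // Cache miss: fall back to the database
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.items[key] = v // Keep it in memory for subsequent fast reads
	c.mu.Unlock()
	return v, nil
}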

Even with the optimizations above, the I/O blocking time of write operations under highly concurrent requests can still overwhelm the database, easily causing problems such as connection-open exceptions and operation timeouts.

In addition to the operations mentioned above, there are the following ways to optimize at this layer:

  • Database clustering with read/write splitting, to reduce write pressure
  • Splitting databases, placing different business tables in different databases; this introduces distributed transaction problems
  • Using a queue model to shave traffic peaks

Each approach has its own characteristics. Since this article is about the architectural idea of order queues, let's look at how to introduce an order queue into an order system.

The order queue

There are many articles on the web about order-queue practice, most of which omit an explanation of the consistency between requests and responses.

Flow chart of the first order queue:

Above is the queue model mentioned in most articles, and it leaves two problems unresolved:

  1. If the order involves third-party payment, how do we ensure the consistency of ① and ②? For example, one of them fails to be processed;
  2. If the order is paid through a third party, the payment is completed and the third-party payment platform calls back notifyUrl while ② is still waiting to be processed, how do we handle this situation?

First, let me state that the order flow chart above is fine. It has the advantages and disadvantages listed below, and there are solutions to the two problems just raised.

Advantages:

  • Users do not need to wait for order persistence to finish; they get a response directly, enabling fast ordering
  • Persistence is handled first-come-first-served from the queue, so highly concurrent requests do not hit the database layer directly as described above
  • Highly adaptable; it combines well with middleware

Disadvantages:

  • When many orders enter the queue and the processing speed of step ② cannot keep up, we run into the second problem above
  • More complex to implement

I will give solutions to these problems later. First, let's look at another order-queue flow chart.

Flow chart of the second order queue:

Note that the second order-queue design model waits synchronously for the result of persistence processing. It solves the consistency problem between persistence and response, but it suffers from a serious waiting-time problem. Its advantages and disadvantages are as follows:

Advantages:

  1. Strong consistency between persistence and response.
  2. Persistence is handled first-come-first-served from the queue, so highly concurrent requests do not hit the database layer directly as described above.
  3. Simple to implement.

Disadvantages:

  1. When many orders are queued, the processing speed of the persistence unit falls behind, causing clients to wait synchronously for the response.

For this type of order queue, I’ll post a version of the Golang implementation below.

Conclusion

Comparing the two common order models above: if we prioritize from the perspective of user experience, the first one, which does not require users to wait for the result of persistence processing, is clearly superior to the second. If the technical team is strong and experienced, the first implementation should be the one to consider.

If you only need to reach the point where you would rather have users wait until a timeout than have the storage-layer service overwhelmed, then just consider the second option.

Choosing a queue implementation

Here, let's take a closer look at the options available for implementing the queue module's functionality.

I'm sure many readers experienced in back-end development have already thought of it: using existing middleware, such as Redis, RocketMQ, or Kafka, is one option.

Alternatively, we can write code to implement a message queue directly inside the current service system. Let me categorize the queue types with a diagram.

Different queue implementations directly lead to different capabilities, each with its own advantages and disadvantages:

Advantages of the level-1 (in-process memory) cache:

  1. As a level-1 cache it is the fastest: data is fetched directly from the memory layer, with no network link;
  2. If persistence and clustering are not considered, it is simple to implement.

Disadvantages of the level-1 cache:

  1. If persistence and clustering are considered, the implementation becomes more complex;
  2. Without persistence, the order information still in the queue will be lost if the server goes down or the service is interrupted for any other reason.

Advantages of middleware:

  1. The software is mature; well-known message middleware has been proven in practice and is well documented;
  2. Support for multiple persistence strategies, such as Redis's incremental (AOF) persistence, minimizing the loss of order information on an unexpected crash (see the redis.conf snippet after these lists);
  3. Support for clustering and master/slave synchronization, an essential requirement for distributed systems.

Disadvantages of middleware:

  1. In a distributed deployment, connections must be established for communication, so read and write operations incur network round-trips.
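To make the persistence point concrete: Redis's incremental append-only (AOF) persistence mentioned above, for example, is switched on with two directives in redis.conf:

appendonly yes        # enable AOF (incremental) persistence
appendfsync everysec  # fsync once per second: at most about one second of writes is lost on a crash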

Answers

Back to the first order model:

Question 1:

If the order involves third-party payment, how do we ensure the consistency of ① and ②?

First, let's look at the situations in which inconsistency occurs:

  1. ① fails: the user cannot obtain the result due to network problems, or has navigated to another page. If ② succeeds, the final status of the order is "to be paid", and the user can enter the personal order center to complete the payment;
  2. ① and ② both fail: the order simply fails;
  3. ① succeeds, ② fails: the user completes payment from the response page, but afterwards the order information is blank.

Of the situations above, clearly only case 3 requires order recovery, and the solutions are as follows (a sketch of scheduled task B follows this list):

  • When the server's payment callback interface is hit by the third-party payment platform and the corresponding order information cannot be found, first store the paid-but-orderless data, for example into a table A. Then start a scheduled task B dedicated to traversing table A, checking the order table for corresponding order information: update it if found, otherwise keep going, or follow whatever detection strategy has been worked out.
  • When ② fails for a non-crash reason on the server:
    • On failure, re-insert the original order data at the head of the queue to wait for the next persistence attempt.
  • When ② fails because the server crashes:
    • After scheduled task B has probed several times without result, restore the order from the order attachment information passed by the third-party payment platform during its callback.
  • While the order is being recovered, its information remains blank.
  • The service hosting scheduled task B had best live together with the callback endpoint notifyUrl. That way, when B fails, the callback service fails too, and the third-party payment platform's retry logic for failed callbacks kicks in; relying on that, order information recovery can be completed once the callback service restarts.
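As promised, here is a minimal sketch of scheduled task B. Only the traverse-and-match mechanism comes from the text above; the table name, column names, and helper functions are illustrative assumptions:

// A minimal sketch of scheduled task B; table/column names and the
// helper functions are assumptions, only the mechanism is from the text.
func runTaskB(db *sql.DB, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		rows, err := db.Query("SELECT payment_id, order_no FROM table_a WHERE matched = 0")
		if err != nil {
			continue // Transient error: try again on the next tick
		}
		for rows.Next() {
			var paymentId, orderNo string
			if err := rows.Scan(&paymentId, &orderNo); err != nil {
				continue
			}
			// Check the order table; if the order now exists, update its
			// status and mark the payment record in table A as matched.
			// Otherwise leave it for the next pass (or escalate per the
			// detection strategy).
			if orderExists(db, orderNo) { // hypothetical lookup helper
				markOrderPaid(db, orderNo)        // hypothetical update helper
				markPaymentMatched(db, paymentId) // hypothetical update helper
			}
		}
		rows.Close()
	}
}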

Question 2:

If the order involves third-party payment, ① completes the payment and the third-party payment platform calls back notifyUrl while ② is still queued for processing, how do we handle this situation?

For the solution, see the mechanism of scheduled task B described in question 1.

Example code for the Go version of the second queue

Define some constants

const (
	QueueOrderKey   = "order_queue"    // Queue key
	QueueBufferSize = 1024             // Request queue size
	QueueHandleTime = time.Second * 7  // Single task timeout
)

Define the queue access interface, to make multiple implementations convenient

// Define a queue interface to facilitate multiple implementations
type IQueue interface {
	Push(key string, data []byte) error
	Pop(key string) ([]byte, error)
}
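For example, a minimal in-memory implementation of this interface might look like the sketch below. (The test code further down constructs the queue with NewFastCacheQueue(), whose implementation the original does not show; this MemQueue is my assumption, not the author's code.)

// A minimal in-memory IQueue implementation; an assumption for
// illustration, standing in for the unshown NewFastCacheQueue.
type MemQueue struct {
	mu    sync.Mutex
	lists map[string][][]byte
}

func NewMemQueue() *MemQueue {
	return &MemQueue{lists: make(map[string][][]byte)}
}

func (m *MemQueue) Push(key string, data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.lists[key] = append(m.lists[key], data)
	return nil
}

func (m *MemQueue) Pop(key string) ([]byte, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	l := m.lists[key]
	if len(l) == 0 {
		return nil, nil // Empty: the consumer sleeps briefly and retries
	}
	data := l[0]
	m.lists[key] = l[1:]
	return data, nil
}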

Define request and response entities

// Define request and response entities
type QueueTimeoutResp struct {
	Timeout  bool             // Timeout flag
	Response chan interface{} // Channel carrying the processing result
}

type QueueRequest struct {
	ReqId        string              `json:"req_id"`      // Unique ID of a single request
	Order        *model.OrderCombine `json:"order"`       // Order information bean
	AccessTime   int64               `json:"access_time"` // Request time
	ResponseChan *QueueTimeoutResp   `json:"-"`
}

Defining queue entities

// Define the queue entity
type Queue struct {
	mapLock     sync.Mutex
	RequestChan chan *QueueRequest           // Buffered channel holding incoming requests
	RequestMap  map[string]*QueueTimeoutResp // Maps a request ID to its response channel
	Queue       IQueue
}

Instantiate the queue and receive interface parameters

// instantiate the queue to receive interface parameters
func NewQueue(queue IQueue) *Queue {
	return &Queue{
		mapLock:     sync.Mutex{},
		RequestChan: make(chan *QueueRequest, QueueBufferSize),
		RequestMap:  make(map[string]*QueueTimeoutResp, QueueBufferSize),
		Queue:       queue,
	}
}

Receiving a request

// Receive a request
func (q *Queue) AcceptRequest(req *QueueRequest) interface{} {
	if req.ResponseChan == nil {
		req.ResponseChan = &QueueTimeoutResp{
			Timeout:  false,
			Response: make(chan interface{}, 1),
		}
	}
	userKey := key(req) // unique key generation function
	req.ReqId = userKey
	q.mapLock.Lock()
	q.RequestMap[userKey] = req.ResponseChan // The memory layer stores the resp channel pointer for this req
	q.mapLock.Unlock()
	q.RequestChan <- req // Hand the request to the queue
	log("userKey : ", userKey)
	ticker := time.NewTicker(QueueHandleTime) // Start a timer with the timeout QueueHandleTime
	defer func() {
		ticker.Stop() // Release the timer
		q.mapLock.Lock()
		delete(q.RequestMap, userKey) // Once a req is processed, remove it from the map
		q.mapLock.Unlock()
	}()

	select {
	case <-ticker.C: // Timeout
		req.ResponseChan.Timeout = true
		Queue_TimeoutCounter++ // Auxiliary counter, int type
		log("timeout: ", userKey)
		return lghError.HandleTimeOut // Return a timeout error message
	case result := <-req.ResponseChan.Response: // The req has been processed
		return result
	}
}

This function takes requests off the request channel and pushes them into the queue container; it runs in a goroutine

// Take reqs from the request channel and push them into the queue container. This function runs in a goroutine
func (q *Queue) addToQueue() {
	for {
		req := <-q.RequestChan // Take out a req
		data, err := json.Marshal(req)
		if err != nil {
			log("redis queue parse req failed : ", err.Error())
			continue
		}
		if err = q.Queue.Push(QueueOrderKey, data); err != nil { // Push onto the queue; this costs time
			log("lpush req failed. Error : ", err.Error())
			continue
		}
		log("lpush req success. req time: ", req.AccessTime)
	}
}

Take reqs out of the queue for processing; this also runs in a goroutine

// Take out reqs for processing; this runs in a goroutine
func (q *Queue) readFromQueue() {
	for {
		data, err := q.Queue.Pop(QueueOrderKey) // Pop off the queue; this also costs time
		if err != nil {
			log("lpop failed. Error : ", err.Error())
			continue
		}
		if data == nil || len(data) == 0 {
			time.Sleep(time.Millisecond * 100) // Empty data: pause and fetch again
			continue
		}
		req := &QueueRequest{}
		if err = json.Unmarshal(data, req); err != nil {
			log("Lpop: json.Unmarshal failed. Error : ", err.Error())
			continue
		}
		userKey := req.ReqId
		q.mapLock.Lock()
		resultChan, ok := q.RequestMap[userKey] // Retrieve the corresponding resp channel pointer
		q.mapLock.Unlock()
		if !ok { // When the middleware restarts, e.g. Redis restarts and reads an old key, execution lands here
			Queue_KeyNotFound++ // Counter, int type
			log("key not found, rollback: ", userKey)
			continue
		}
		simulationTimeOutReq4(req) // Simulated task function; the input parameter is the req
		if resultChan.Timeout {
			// The waiter has already timed out; roll back
			Queue_MissionTimeout++
			log("handle mission timeout: ", userKey)
			continue
		}
		log("request result send to chan success, userKey : ", userKey)
		ret := util.GetCommonSuccess(req.AccessTime)
		resultChan.Response <- &ret // Send back the success result
	}
}

Start the queue

func (q *Queue) Start() {
	go q.addToQueue()
	go q.readFromQueue()
}

Run the example

func test() {
	// ...
	runtime.GOMAXPROCS(4)
	redisQueue := NewQueue(NewFastCacheQueue())
	redisQueue.Start()
	reqNumber := testReqNumber
	wg := sync.WaitGroup{}
	wg.Add(reqNumber)
	for i := 0; i < reqNumber; i++ {
		go func(index int) {
			combine := model.OrderCombine{}
			ret := redisQueue.AcceptRequest(&QueueRequest{
				UserId:       int64(index), // UserId is presumably a field of QueueRequest not shown in the struct definition above
				Order:        &combine,
				AccessTime:   time.Now().Unix(),
				ResponseChan: nil,
			})
			fmt.Println("ret: ------------- ", ret)
			wg.Done()
		}(i)
	}
	wg.Wait()
	time.Sleep(3 * time.Second)
	fmt.Println("TimeoutCounter: ", Queue_TimeoutCounter, "KeyNotFound: ", Queue_KeyNotFound, "MissionTimeout: ", Queue_MissionTimeout)
}
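The example refers to several identifiers whose definitions the original does not show (key, log, the Queue_* counters, simulationTimeOutReq4, lghError.HandleTimeOut, util.GetCommonSuccess, model.OrderCombine, testReqNumber). To bring the sketch closer to self-contained, minimal stand-ins for some of them could look like this; all of these are my assumptions, not the author's code:

// Stand-ins for helpers not shown in the original; assumptions for illustration only.
var (
	Queue_TimeoutCounter int // Requests that timed out while waiting (plain ints: not goroutine-safe, fine for a demo)
	Queue_KeyNotFound    int // Dequeued reqs whose waiting channel is gone
	Queue_MissionTimeout int // Tasks finished after the waiter had already timed out
)

// key builds a unique request ID; the real generation rule is not shown,
// so here it is simply the access time plus a random suffix.
func key(req *QueueRequest) string {
	return fmt.Sprintf("%d-%d", req.AccessTime, rand.Int63())
}

// log is a thin wrapper over the standard library logger (import stdlog "log").
func log(args ...interface{}) {
	stdlog.Println(args...)
}

// simulationTimeOutReq4 stands in for the real persistence work.
func simulationTimeOutReq4(req *QueueRequest) {
	time.Sleep(50 * time.Millisecond) // Pretend to write the order to storage
}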

Finally, here is a picture of the book.