At the midnight peak, order submissions exceeded 10,000 TPS, more than 5 million orders were placed over the day, and 2 million users were active. Suning's 88 Shopping Day campaign delivered fruitful results.

Behind the heavy traffic and high sales lie six months of effort: only by optimizing and upgrading the shopping system's capacity for instantaneous high concurrency could we guarantee consumers a smooth purchasing experience. What follows introduces the art (shu) and the way (dao) with which the Suning shopping system handles big promotions.

The art in practice: high-availability architecture design for the shopping system

Art is the methodology of architectural design. The shopping system's architecture has been updated and iterated many times, always with the goal of building an application system that is highly available, high-performance, easy to extend, and readily scalable.

Across the whole process, our work focused on three areas:

  • System architecture optimization

  • Database performance optimization

  • Application high-availability optimization

System architecture optimization

Conway's law says that a system's design mirrors the communication structure of the organization that builds it. To keep up with a rapid pace of development iterations, all functions were initially placed in a single cluster.

As the business grew and features became increasingly complex, that single cluster became the biggest bottleneck limiting system performance.

Figure 1: Architecture design of Suning shopping system

The first step of the optimization was therefore to refactor the shopping system's architecture. On one hand, we split horizontally, layering the system as follows:

  • Network layer: accelerates responses through the CDN. CDN caching speeds up access to static content and relieves pressure on the servers, while the CDN's internal network lines speed up back-to-origin requests.

  • Load layer: includes Layer 4 and Layer 7 load balancing. Its functions cover traffic scheduling, traffic control, security protection, scalper prevention, and so on. In addition, some lightweight services are aggregated in Lua at this layer to improve response performance.

  • Application layer: This layer mainly implements business function logic.

  • Service layer: provides atomic services to the application layer, such as membership, coupons, sourcing, delivery-time calculation, order generation, and payment.

  • Data layer: provides data storage and access services such as databases and caches, plus data extraction and analysis services such as HBase and Hive.

On the other hand, following the business characteristics of the shopping service, we split vertically, separating the originally coupled functional logic into three modules: PGS-WEB, PGS-TASK-WEB, and PGS-ADMIN_WEB.

Each module is deployed in its own cluster; the clusters collaborate through distributed remote calls and the atomic services provided by the service layer. Specifically:

  • PGS-WEB: the front-end business processing module. It comprises three sub-modules: display, transaction, and marketing, each of which can be divided into smaller sub-modules.

    For example, the marketing module is subdivided into four lightweight gameplay modules: invite-new groups, bargaining groups, inflating red packets, and help groups. Each module can be unplugged, split out, and scaled independently as business needs dictate.

  • PGS-TASK-WEB: the mid-tier scheduled-task processing module, mainly used to process scheduled tasks; the payment logic also sits in this layer.

  • PGS-ADMIN_WEB: the back-office management module, mainly used by operations staff to maintain activities, goods, gameplay, and so on.

Database performance optimization

In high-concurrency scenarios, operations such as submitting orders, generating group records, and querying orders put great pressure on the database, and because these operations demand strong consistency, the database cannot simply be replaced with a distributed cache.

To take the heat off the database and raise its concurrent processing capacity, the database must be able to scale horizontally. We therefore implemented a database and table sharding strategy on top of the Mycat database middleware.

Figure 2: MySQL database load capacity trends in high concurrency scenarios

Mycat is a sharding middleware for MySQL. Much as Nginx fronts web servers, it sits as a proxy layer between the application and the database.

Since Mycat is open-source middleware, its internals are not covered here; instead we describe how it is applied in the shopping system.

As shown in the figure below, Mycat splits the business logic's data operations across three databases, DataNode 1~3. The application itself is unaware of this process.

Figure 3: Sub-library architecture of Suning shopping system based on Mycat

Operations on groups and group details are sharded by group ID (GROUP_ID), while operations on orders are sharded by order ID (ORDER_ID).
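The article does not spell out the concrete sharding rule, which lives in Mycat's configuration; a common choice is simple modulo routing on the sharding key. The sketch below illustrates that idea under the assumption of three data nodes, as in Figure 3 (class and constant names are mine, for illustration only):

```java
// Illustrative only: the real routing is configured in Mycat, not hand-written.
public final class ShardRouter {
    private static final int DATA_NODE_COUNT = 3; // DataNode 1~3 as in Figure 3

    // Map a sharding key (GROUP_ID or ORDER_ID) to a 0-based data node index.
    public static int route(long shardingKey) {
        return (int) (shardingKey % DATA_NODE_COUNT);
    }

    public static void main(String[] args) {
        System.out.println("group 1001 -> DataNode " + (route(1001L) + 1));
        System.out.println("order 8842 -> DataNode " + (route(8842L) + 1));
    }
}
```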

In addition, a separate BackupDB is kept for big-data extraction and backup; Canal guarantees that it holds the full data set.

If Mycat runs into problems, we can switch data sources at the application layer and downgrade to a single database to keep the business running.
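A minimal sketch of such an application-layer switch, with illustrative names (the shopping system's actual switching code is not shown in the article):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.sql.DataSource;

// Route connections through Mycat normally; downgrade to a single full-data DB on demand.
public class FailoverDataSourceRouter {
    private final DataSource mycatProxy; // normal path: the Mycat proxy
    private final DataSource singleDb;   // degraded path: one MySQL instance with full data
    private final AtomicBoolean degraded = new AtomicBoolean(false);

    public FailoverDataSourceRouter(DataSource mycatProxy, DataSource singleDb) {
        this.mycatProxy = mycatProxy;
        this.singleDb = singleDb;
    }

    // Operators flip this switch when Mycat misbehaves.
    public void setDegraded(boolean on) {
        degraded.set(on);
    }

    public Connection getConnection() throws SQLException {
        return (degraded.get() ? singleDb : mycatProxy).getConnection();
    }
}
```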

Application high-availability optimization

Optimization at the application layer mainly involves distributed caching and asynchronization.

Using Redis distributed locks to solve consistency problems in concurrent scenarios

For example, to prevent an order from being processed twice, we use a Jedis Transaction together with the SETNX command to implement a Redis distributed lock:

```java
Transaction transaction = jedis.multi();   // start a Redis transaction
transaction.setnx(tmpLockKey, "lock");     // SETNX: set if not exists
transaction.expire(tmpLockKey, locktime);  // give the lock a TTL so it cannot leak
List<Object> rets = transaction.exec();    // execute the queued commands atomically
```
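The snippet above queues the commands; the caller still has to inspect the transaction result to learn whether the lock was actually won. A minimal sketch of that check, with helper and variable names of my own (the article does not show this part):

```java
import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Response;
import redis.clients.jedis.Transaction;

public class RedisLockExample {
    // Returns true only if this caller created the key, i.e. acquired the lock.
    static boolean tryLock(Jedis jedis, String tmpLockKey, int locktime) {
        Transaction transaction = jedis.multi();
        Response<Long> setnxRet = transaction.setnx(tmpLockKey, "lock");
        transaction.expire(tmpLockKey, locktime);
        List<Object> rets = transaction.exec();
        // SETNX answers 1 for the first caller; later callers get 0 and must back off.
        return rets != null && Long.valueOf(1L).equals(setnxRet.get());
    }
}
```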

Using Redis to hold activity inventory and relieve database resource contention

For each group-buying activity we maintain the activity inventory, that is, the qualification or remaining count, rather than the actual goods inventory.

In activities such as the one-yuan flash sale, inventory changes rapidly, and the flood of database updates causes row-lock contention that drags down system throughput.

The optimization is to deduct activity inventory in Redis and synchronize it back to the database periodically:

```java
/** Deduct or restore activity inventory in Redis. */
private Long updateStoreByRedis(String actId, String field, int count) {
    String key = redis.key(PGS_STORE_INFO, actId);
    // If the activity inventory is not cached yet, initialize it from the database.
    if (!redis.exists(key)) {
        ActivityStoreEntity entity = queryStoreInfoFromDb(actId);
        if (entity == null) {
            return -1L;
        }
        Map<String, String> values = new HashMap<String, String>();
        values.put(PGS_STORE_ALL, ...);
        values.put(PGS_STORE_REMAIN, ...);
        values.put(PGS_STORE_LOCK, ...);
        redis.hmset(key, values);
        redis.expire(key, ONE_HOUR);
    }
    // Atomically adjust the requested field and return the new value.
    return redis.hincrby(key, field, count);
}
```
```java
/** Periodically synchronize the activity inventory from Redis back to the database. */
public int syncActivityStoreToDB(String actId) {
    ...
    try {
        String key = redis.key(PGS_STORE_SYNC, actId);
        // Check the synchronization lock so only one worker syncs a given activity.
        if (!redis.exists(key)) {
            // Update the activity's lockable stock count in the database.
            ...
            redis.setex(key, actId, STORE_SYNC_TIME);
        }
    } catch (Exception e) {
        // log the exception
    }
    ...
}
```
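A detail worth noting with this scheme (my observation, not from the article): hincrby returns the field's value after the increment, so a negative result signals an oversell, and the caller can compensate. A hypothetical caller-side guard, reusing the names from the snippet above:

```java
// Hypothetical guard around a one-unit deduction of the remaining activity stock.
Long remain = updateStoreByRedis(actId, PGS_STORE_REMAIN, -1);
if (remain == null || remain < 0L) {
    if (remain != null && remain < 0L) {
        // Roll the deduction back so the cached count stays consistent.
        updateStoreByRedis(actId, PGS_STORE_REMAIN, 1);
    }
    throw new IllegalStateException("activity stock exhausted: " + actId);
}
```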

Asynchronizing operations to shave concurrent access peaks

For example, payment completion triggers a series of follow-up steps, including activity inventory deduction and group status changes. Some of these have strict real-time requirements and must be handled synchronously.

Others, such as notifying logistics to ship, can be handled asynchronously: we use Kafka queues for asynchronous communication and let downstream systems consume the messages, as sketched below.
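A minimal producer-side sketch of this pattern; the topic name, broker address, and payload are assumptions for illustration, not the shopping system's actual configuration:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PaymentEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-broker:9092"); // assumed address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Key by order ID so all events of one order land in the same partition, in order.
            producer.send(new ProducerRecord<>("pgs-payment-complete", "ORDER_123",
                    "{\"orderId\":\"ORDER_123\",\"status\":\"PAID\"}"));
        }
    }
}
```

Downstream systems (logistics notification, statistics, and so on) each consume the topic at their own pace, which is what flattens the peak.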

The dao as foundation: safeguarding the shopping system under high concurrency

“Guided by the dao, the art is accomplished; art that strays from the dao fails.” The ultimate goal of all our architecture optimization and upgrading is to keep the system stable at the sales peak.

In high-concurrency scenarios, the shopping system's safeguards rest on sound capacity planning, are supported by full coverage from the monitoring system, and are completed by a mature strategy of rate limiting, degradation, and risk prevention.

Full-link stress testing and capacity planning

Business-volume estimates alone are not enough; reasonable capacity planning can only be based on full-link stress tests of the Suning shopping scenario run in the production environment.

Our stress-testing platform currently supports traffic-replay testing: real online traffic is captured, turned into scripts, and replayed under load. This keeps the test as close as possible to real conditions and makes the capacity plan correspondingly more accurate.

End-to-end coverage of monitoring systems

The Suning shopping monitoring system currently achieves end-to-end coverage, spanning client -> network -> server.

Client-side monitoring relies on terminal logs covering PC, WAP, and App. Network monitoring mainly uses CDN logs and dial-test data.

The server side has the most varied monitoring, including:

  • Server system status monitoring: CPU and memory usage, network adapter traffic, and disk I/O.

  • Web server monitoring: displays in real time indicators such as HTTP connection counts, response times, HTTP exceptions, and status codes.

  • Application server exception monitoring: collects application exception stack traces in real time.

  • JVM status monitoring: shows JVM memory, thread, GC, and class-loading status in real time.

  • NoSQL monitoring: tracks Redis commands per minute, large objects, connectivity, and so on.

  • Database monitoring: monitors indicators at the database level.

  • Call-chain monitoring: displays the call relationships between applications in real time, reflecting the health of the whole link.

These monitoring systems are tied together with the basic operations platform through a traceId, and their data is finally aggregated by the decision-analysis platform to drive intelligent alerting (a sketch of traceId propagation follows Figure 4).

Figure 4: End-to-end monitoring system and alarm decision platform
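How a traceId stitches log lines together is easy to picture with a logging context. A minimal sketch assuming SLF4J's MDC; the article does not name the actual logging stack or filter, so all names here are illustrative:

```java
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class TraceIdFilter {
    private static final Logger LOG = LoggerFactory.getLogger(TraceIdFilter.class);

    // Wrap each request so every log line it emits carries the same traceId.
    public void handle(String inboundTraceId, Runnable request) {
        String traceId = inboundTraceId != null ? inboundTraceId : UUID.randomUUID().toString();
        MDC.put("traceId", traceId); // the log pattern includes %X{traceId}
        try {
            LOG.info("request start");
            request.run();
        } finally {
            MDC.remove("traceId");
        }
    }
}
```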

Flow control and risk control

Flow control exists to throttle traffic on the 88 Shopping Day when the midnight peak exceeds expectations, protecting the application; without it, a chain reaction could cascade into an avalanche.

The current flow-control system supports multi-dimensional policies, including basic JVM active-thread flow control; rate limiting by user IP, UA, and member ID; rate-limiting policies for core interfaces; rate-limiting policies for popular goods; and more. A sketch of the first dimension follows Figure 5.

Figure 5: Architecture of PFC system
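A minimal sketch of the simplest of these dimensions, JVM active-thread flow control, built on a counting semaphore. The threshold and names are my assumptions, not the PFC system's actual implementation:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class ThreadFlowControl {
    // Assumed threshold: at most 200 requests may run concurrently in this JVM.
    private static final Semaphore PERMITS = new Semaphore(200);

    public static String handle(Supplier<String> business) {
        // Fail fast instead of queuing, so threads are never exhausted under a spike.
        if (!PERMITS.tryAcquire()) {
            return "degraded: system busy, please retry later";
        }
        try {
            return business.get();
        } finally {
            PERMITS.release();
        }
    }
}
```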

Risk control is a prevention strategy targeting scalpers who snap up the 88 Shopping Day's popular goods. Besides the traditional scalper blacklist, the shopping system's risk-control strategy also evaluates users, addresses, event behaviors, device fingerprints, and more.

Unlike black-and-white-list control, the shopping system scores users to build risk profiles and challenges potentially risky users with measures such as SMS verification, slider verification, and face recognition.

Big-promotion preparation and contingency plans

Big-promotion preparation refers to a series of preparations aligned with the promotion's business rhythm, including downgrading non-core scheduled tasks in advance and revoking production operation permissions.

The contingency plan catalogs the unexpected events that may occur; its core means is degradation.

For example, some functions can be downgraded and switched off at critical moments, sacrificing the less important to keep the core shopping flow healthy. Another example is cooling measures for server performance bottlenecks. Only by preparing for every contingency can each big promotion be brought to a successful close.

Conclusion

The road is long, and this year's 88 Suning Shopping Day has come to an end; the future holds more challenges and opportunities.

The Suning shopping technical team still has plenty of work to do: further breaking through system performance bottlenecks, giving users personalized recommendations and services, and turning Pinggou into an open, social e-commerce platform.

We will keep moving forward and keep bringing you technology sharing and updates.

Authors: Zhu Yiquan, Ren Zhangxiong, Zhang Tao, Gong Zhaozhong

Profile: Zhu Yiquan holds a master's degree from Nanjing University of Aeronautics and Astronautics and is a senior technical manager at the Suning.com consumer R&D center. He is mainly responsible for architecture optimization and big-promotion readiness of Suning shopping systems, and has participated in the HTTPS transformation of Suning.com, the shopping system architecture transformation, and the construction of the Wevin business monitoring platform. He focuses on technologies for building highly reliable, high-performance, highly concurrent service systems.

Editors: Tao Jialong, Sun Shujuan