Abstract: When building an e-commerce inventory system, what are the biggest concerns? Close your eyes and think: of course, high concurrency and preventing oversell! This article presents an overall design for handling high concurrency while keeping deduction data accurate and never overselling. Readers can use this design directly, or adapt it into something better suited to their own scenario.
The following inventory-deduction example illustrates how to reduce stock under high concurrency. The same principles apply to other scenarios that require concurrent writes with data consistency.
Sample inventory quantity model
For ease of description, we use a simplified inventory model. A real-world inventory table has many more columns than this example, but it is enough to illustrate the principle. The stockNum table below contains two fields: the commodity identifier and the stock quantity, which represents how many units can still be sold.
Field name | Column name | Type |
---|---|---|
Commodity ID | skuId | Long |
Stock quantity | num | Integer |
How a traditional database prevents oversell
Traditional inventory-management schemes rely on database transactions to prevent oversell: a SQL condition checks that the remaining stock is sufficient, so of several concurrently executed UPDATE statements, only those that still satisfy the condition succeed. To ensure a deduction is not applied twice, an anti-repetition table prevents repeated submissions, making the operation idempotent. The antiRe table is designed as follows:
Field name | Column name | Type |
---|---|---|
Identifier | id | Long |
Anti-duplicate code | code | String (unique index) |
For example, the deduction step of an order flow looks like this:
```sql
BEGIN;
INSERT INTO antiRe(code) VALUES ('{orderNo}_{skuId}');
UPDATE stockNum SET num = num - {quantity}
  WHERE skuId = {skuId} AND num - {quantity} >= 0;
COMMIT;
```
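The transactional deduction above can be sketched as follows, with an in-memory SQLite database standing in for the real MySQL tables. Table and column names (stockNum, antiRe) follow the article; the anti-duplicate code format `orderNo_skuId` is an illustrative assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stockNum (skuId INTEGER PRIMARY KEY, num INTEGER)")
conn.execute("CREATE TABLE antiRe (id INTEGER PRIMARY KEY AUTOINCREMENT, code TEXT UNIQUE)")
conn.execute("INSERT INTO stockNum (skuId, num) VALUES (1001, 5)")
conn.commit()

def deduct(order_no, sku_id, qty):
    """Deduct stock inside one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls everything back on any exception
            # The unique index on antiRe.code rejects a repeated submission.
            conn.execute("INSERT INTO antiRe (code) VALUES (?)",
                         (f"{order_no}_{sku_id}",))
            cur = conn.execute(
                "UPDATE stockNum SET num = num - ? "
                "WHERE skuId = ? AND num - ? >= 0",
                (qty, sku_id, qty))
            if cur.rowcount == 0:       # not enough stock: abort the transaction
                raise ValueError("insufficient stock")
        return True
    except (sqlite3.IntegrityError, ValueError):
        return False

print(deduct("A100", 1001, 2))   # True: stock goes 5 -> 3
print(deduct("A100", 1001, 2))   # False: duplicate order, idempotent
print(deduct("A101", 1001, 9))   # False: insufficient stock
```

Note that the anti-duplicate insert and the conditional update share one transaction, so a failed deduction also rolls back its anti-duplicate code.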
As system traffic grows, the performance bottleneck of the database is exposed, and even sharding does not help: a big promotion concentrates high concurrency on a small number of commodities, so the traffic ultimately lands on a few tables, and all sharding buys is a slightly more resilient single shard. We therefore design a scheme that uses a Redis cache for the inventory deduction.
Combining the database and Redis for high-concurrency deduction
An inventory deduction actually comprises two steps: first the oversell check, then the persistence of the deducted data. A traditional database does both in one transaction. The principle of this design is a clever separation of the two concerns: Redis performs the oversell check, so the database only needs to absorb the writes. The business database is sharded by commodity, while the task engine's database is sharded by order number, so the state machine disperses the writes for a hotspot commodity and eliminates the hotspot.
The overall structure is as follows:
The first tier solves the oversell check: we put the stock quantity into Redis and deduct it with HINCRBY on every order. If the returned quantity is greater than or equal to 0, the stock is sufficient; because Redis is single-threaded, we can trust the returned result. Redis comes first because it can absorb high concurrency with good performance. Once the oversell check passes, we enter the second tier.
The second tier performs the actual deduction: after the first tier, there is no need to judge whether the quantity is sufficient; the database simply executes a blind deduction. Repeated submissions still need idempotent handling, but the SQL no longer needs a num > 0 condition; the deduction can be written as follows:
```sql
BEGIN;
INSERT INTO antiRe(code) VALUES ('{orderNo}_{skuId}');
UPDATE stockNum SET num = num - {quantity} WHERE skuId = {skuId};
COMMIT;
```
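The two-tier flow can be sketched as below: a lock-protected counter stands in for Redis (whose single-threaded execution makes decrement-and-check atomic), and a plain list stands in for the blind tier-2 persistence. All names are illustrative.

```python
import threading

class RedisStandIn:
    def __init__(self, stock):
        self._stock = stock
        self._lock = threading.Lock()  # models Redis's single-threaded atomicity
    def hincrby(self, delta):
        with self._lock:
            self._stock += delta
            return self._stock

redis = RedisStandIn(stock=3)
db_rows = []  # tier 2: the sharded task/inventory database, as a plain list

def place_order(order_no, qty):
    remaining = redis.hincrby(-qty)     # tier 1: oversell check
    if remaining < 0:
        redis.hincrby(qty)              # negative result: roll the deduction back
        return False
    db_rows.append((order_no, qty))     # tier 2: blind persistence
    return True

results = [place_order(f"A{i}", 1) for i in range(5)]
print(results)   # [True, True, True, False, False]: only 3 units existed
```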
An important point: the data must ultimately land in a database, so how is the hotspot solved? The task library is sharded by order number, so different orders for the same commodity are hashed into different task-library shards. The database still absorbs the writes, but the hotspot is gone.
The overall interaction sequence diagram is as follows:
Hotspot traffic protection
Redis has its own bottleneck, however. An overheated SKU hammers a single Redis shard and makes that shard's performance jitter. The premise of this protection is that it must not block legitimate orders. We can implement custom millisecond-window rate limiting inside the JVM to shield Redis from as much excess traffic as possible. The worst case of such limiting is that an item that should sell out in one second takes two; under normal load, delayed selling does not occur. The JVM is chosen because a remote, centralized rate limiter would itself be overwhelmed before it could collect the counts.
The implementation can use a framework such as Guava: divide time into 10 ms windows, count requests per window, and limit traffic once a single server exceeds the count. For example, allowing 2 requests per 10 ms window permits 200 requests per second per server, so 50 servers can sell 10,000 units in one second. Adjust the threshold to your actual situation.
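A minimal fixed-window limiter in the spirit described above might look like this; a real implementation could use Guava's RateLimiter. The clock is injected (milliseconds) so the behavior is deterministic; the quota of 2 per 10 ms window follows the example in the text.

```python
class WindowLimiter:
    def __init__(self, limit_per_window, window_ms=10):
        self.limit = limit_per_window
        self.window_ms = window_ms
        self.current_window = -1
        self.count = 0

    def try_acquire(self, now_ms):
        window = now_ms // self.window_ms
        if window != self.current_window:   # a new window: reset the counter
            self.current_window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False                        # over quota: shed this request

limiter = WindowLimiter(limit_per_window=2)
# 3 requests land in the same 10 ms window: the third is rejected
print([limiter.try_acquire(t) for t in (0, 3, 7)])   # [True, True, False]
# the next window starts at 10 ms and admits requests again
print(limiter.try_acquire(12))                       # True
```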
The Redis deduction principle
Redis's INCRBY command can be used for inventory deduction, and one order may deduct several items, so we use the HINCRBY command on a Hash structure. Let us first walk through the whole process with native Redis commands. To simplify the model, we demonstrate a single data item; the principle for multiple items is exactly the same.
```
127.0.0.1:6379> HSET iphone inStock 1     # set the iPhone's available stock to 1
(integer) 1
127.0.0.1:6379> HGET iphone inStock       # check: available stock is 1
"1"
127.0.0.1:6379> HINCRBY iphone inStock -1 # an order is placed successfully
(integer) 0
127.0.0.1:6379> HGET iphone inStock       # verify: 0 remaining
"0"
127.0.0.1:6379> HINCRBY iphone inStock -1 # returns -1: this deduction oversold
(integer) -1
127.0.0.1:6379> HINCRBY iphone inStock 1  # detected -1, roll the deduction back; 0 left
(integer) 0
127.0.0.1:6379> HGET iphone inStock
"0"
```
Idempotent guarantee of deduction
If the application does not know whether a deduction succeeded after calling Redis, it can attach an anti-duplicate code to the batch of deduction commands and execute SETNX on that code. When an exception occurs, the presence or absence of the code tells us whether the deduction went through. For the batch of commands, a pipeline improves the chance that they complete together.
```
# initialize stock
127.0.0.1:6379> HSET iphone inStock 1         # set the iPhone's available stock to 1
(integer) 1
127.0.0.1:6379> HGET iphone inStock           # check: available stock is 1
"1"
# application thread 1 deducts stock for order A100, inside one pipeline:
127.0.0.1:6379> SET A100_iphone "1" NX EX 10  # write the anti-duplicate code
127.0.0.1:6379> HINCRBY iphone inStock -1
# when the pipeline completes, OK and (integer) 0 are returned together
```
Preventing concurrent over-deduction: check whether the Redis HINCRBY command returned a negative value, which indicates a concurrent oversell. If the result after deduction is negative, execute the reverse HINCRBY to add the quantity back.
If network jitter occurs during the call and Redis times out, the application does not know the result. Run GET on the anti-duplicate code: its presence tells you whether the deduction succeeded.
```
127.0.0.1:6379> GET A100_iphone   # the code exists: the deduction succeeded
"1"
127.0.0.1:6379> GET A100_iphone   # nil: the deduction failed, safe to retry
(nil)
```
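The recovery check can be sketched as below, with a plain dict standing in for Redis: the membership test models `SET ... NX`, and the presence of the anti-duplicate code tells a timed-out caller whether its deduction actually ran. Key names follow the article's `orderNo_skuId` convention; expiry is omitted for brevity.

```python
store = {}                 # anti-duplicate codes
stock = {"iphone": 1}

def deduct_with_code(code, sku):
    if code in store:      # SET code NX failed: this order already deducted
        return False
    store[code] = "1"
    stock[sku] -= 1        # HINCRBY sku -1, pipelined with the SET in Redis
    return True

def was_deducted(code):
    """After a timeout, GET the code: present means the deduction succeeded."""
    return code in store

deduct_with_code("A100_iphone", "iphone")
print(was_deducted("A100_iphone"))   # True: the pipeline did run
print(was_deducted("A999_iphone"))   # False: nil, safe to retry
```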
One-way guarantee
In many scenarios, because no transaction is used, you cannot fully guarantee both no oversell and no undersell. In the extreme case, my choice is never to oversell, even at the risk of underselling. Of course, we should try to keep the data accurate, neither overselling nor underselling; but when both cannot be fully guaranteed, choose the one-way guarantee of no oversell, and use other means to reduce the probability of underselling as much as possible.
For example, during a Redis deduction, suppose the commands were arranged to set the anti-duplicate code first and execute the deduction second. If network jitter lets the code land while the deduction fails, a retry will see the code and consider the deduction already done, resulting in oversell. That command order is therefore wrong. The correct orders are as follows:
For a deduction, the sequence is: 1. deduct the inventory; 2. write the anti-duplicate code.
For a rollback, the sequence is: 1. write the anti-duplicate code; 2. add the inventory back.
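A toy illustration of why the ordering matters, under the assumption that a failure can strike between the two steps: deducting first means a retry can at worst deduct twice (undersell), but never skip a deduction that did not happen (oversell).

```python
stock = {"iphone": 5}
codes = set()

def deduct(code, fail_after_deduct=False):
    """Deduct first, then mark: a retry can only deduct too much, never too little."""
    stock["iphone"] -= 1
    if fail_after_deduct:
        return              # simulated jitter: crash before the code is written
    codes.add(code)

deduct("A1", fail_after_deduct=True)   # the deduction landed, the code did not
deduct("A1")                           # the retry, seeing no code, deducts again
print(stock["iphone"])                 # 3: one unit undersold, none oversold
```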
Why Pipeline
In the commands above we used a Redis pipeline; let us look at how a pipeline works.
Non-pipelined: request -> execute -> response, request -> execute -> response, ... (each command waits for its own round trip).
Pipelined: request, request, request -> the server executes the commands and queues the results -> response, response, response (all returned together).
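The benefit can be illustrated with a toy model in which the round-trip "cost" is just a counter, not real networking: without a pipeline, every command pays one round trip; with a pipeline, N commands share one.

```python
class Connection:
    def __init__(self):
        self.round_trips = 0
    def execute(self, commands):
        self.round_trips += 1          # one request/response cycle per call
        return [f"OK:{c}" for c in commands]

conn = Connection()
# non-pipelined: each of the 3 commands is its own round trip
for cmd in ("SET a 1", "HINCRBY h f -1", "GET a"):
    conn.execute([cmd])
print(conn.round_trips)                # 3

pipelined = Connection()
replies = pipelined.execute(["SET a 1", "HINCRBY h f -1", "GET a"])
print(pipelined.round_trips, len(replies))   # 1 3
```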
Using a pipeline ensures, as far as possible, that the results of several commands come back as a whole. Readers may consider replacing the pipeline with a Redis transaction; in my actual projects the pipeline approach has proven successful in production, so Redis transactions were not adopted.
Redis transactions: 1) MULTI: starts the transaction; all subsequent commands on the connection are added to the transaction's command queue. 2) EXEC: commits the transaction. 3) DISCARD: abandons the queue without executing it. 4) WATCH: if a watched key is modified before EXEC, the transaction is aborted.
Achieving final database consistency through the task engine
The task engine mentioned earlier guarantees that the data eventually persists to the database. The "task engine" is designed as follows: we abstract task scheduling into a business-independent framework that supports simple process orchestration and guarantees at-least-once success. The task engine can also schedule state machines, so it may equally be called a state-machine engine; in this article the two terms mean the same thing.
**Core principle of the task engine:** first persist the task to the database, and use a database transaction to keep the splitting into subtasks and the completion of the parent task transactionally consistent.
**Task library sharding:** the task library is sharded and can scale horizontally; because its sharding key differs from the business library's, there is no data hotspot.
The core processing flow of the task engine:
**Step 1:** the caller synchronously submits the task, which is first persisted to the database with the status "locked for processing", guaranteeing that it will be handled.
Note: in the original version, tasks landed in the database as "pending" and were then picked up by a scanning Worker; to prevent concurrent duplicate processing, a task was locked after scanning and processed only once the lock succeeded. This was later optimized for performance: a newly persisted task is marked "locked for processing" directly, which skips the rescan-and-contend step, and the task is then processed asynchronously by a thread inside the same process.
Reference SQL for locking:
```sql
UPDATE task_table_{shard_id} SET status = 100, modifyTime = now()
WHERE id = #{id} AND status = 0
```
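The optimistic-lock semantics of that UPDATE can be sketched with in-memory SQLite: because the statement only matches `status = 0`, of several workers racing for the same task exactly one sees a row count of 1. Status values follow the article (0 pending, 100 locked); the table shape is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status INTEGER)")
conn.execute("INSERT INTO task (id, status) VALUES (1, 0)")
conn.commit()

def try_lock(task_id):
    cur = conn.execute(
        "UPDATE task SET status = 100 WHERE id = ? AND status = 0", (task_id,))
    conn.commit()
    return cur.rowcount == 1   # 1: we won the lock; 0: someone else did

print(try_lock(1))   # True: this worker won the task
print(try_lock(1))   # False: already locked, the second worker backs off
```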
**Step 2:** an asynchronous thread invokes the external handler. When the handler finishes, it returns a list of subtasks. Within one database transaction, the parent task is marked completed and the subtasks are persisted; the subtasks are then added to the thread pool.
Key point: the generation of the subtasks and the completion of the parent task must be transactional.
**Step 3:** the subtasks are scheduled and executed, and any new subtasks they return are persisted in turn. When a task returns no subtasks, the whole flow ends.
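The three steps can be sketched as a minimal in-memory loop. In the real system the "parent done + children persisted" step is one database transaction; here it is simply sequential, and the handler and task shapes are illustrative, not the real framework's API.

```python
from collections import deque

tasks = {}        # task id -> status: "locked" or "done"
queue = deque()   # stands in for the thread pool

def submit(task_id, handler):
    tasks[task_id] = "locked"     # step 1: persist as locked before processing
    queue.append((task_id, handler))

def run():
    while queue:
        task_id, handler = queue.popleft()
        children = handler(task_id)       # step 2: external processing
        tasks[task_id] = "done"           # parent done + children persisted
        for child_id, child_handler in children:
            submit(child_id, child_handler)   # step 3: schedule the children

leaf = lambda tid: []                                   # no subtasks: flow ends
split = lambda tid: [(f"{tid}.1", leaf), (f"{tid}.2", leaf)]
submit("deduct-order", split)
run()
print(sorted(tasks.items()))
```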
Exception handling Worker
The abnormal-unlock Worker unlocks tasks that have stayed locked too long, so that tasks locked but never executed (because of a server restart or a full thread pool) do not get stuck.
The leak-remediation Worker handles tasks lost when a server restart wipes out the in-process thread pool before they complete: such tasks are locked again and re-triggered for execution.
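The unlock Worker's core query can be sketched as below: any task still locked (status 100) past a timeout is reset to pending (status 0) so it can be picked up again. Status values follow the article; the 60-second timeout is an illustrative choice.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (id INTEGER PRIMARY KEY, status INTEGER, modifyTime REAL)")
now = time.time()
conn.execute("INSERT INTO task VALUES (1, 100, ?)", (now - 300,))  # stuck 5 min
conn.execute("INSERT INTO task VALUES (2, 100, ?)", (now,))        # freshly locked
conn.commit()

def unlock_stale(timeout_s=60):
    cur = conn.execute(
        "UPDATE task SET status = 0, modifyTime = ? "
        "WHERE status = 100 AND modifyTime < ?",
        (time.time(), time.time() - timeout_s))
    conn.commit()
    return cur.rowcount

print(unlock_stale())   # 1: only the long-stuck task is released
```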
Task state transition process
Task engine database design
Example task-table structure (for illustration only; a real design needs refinement):
Field | Type | Description |
---|---|---|
id | Long | primary key, the task ID |
status | Int | 0 pending, 100 locked, 1 completed |
data | String | business data in JSON format |
executeTime | Date | the execution time |
Task-engine database disaster recovery:
The task library is sharded; when one library goes down, the traffic routed to the crashed library can be hashed onto the surviving libraries, either by manual configuration or automatically via system monitoring. In the figure below, when task library 2 is down, the configuration is changed so that its traffic is routed to task libraries 1 and 3. The leak-remediation Worker keeps scanning task library 2, so that once it recovers through primary/standby failover, the tasks already stored there are picked up and completed.
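The routing failover can be sketched as below: the order number normally hashes across all task libraries; when one is marked down, its traffic is re-hashed onto the survivors. The shard names and the CRC32 hash are illustrative assumptions, not the real routing component.

```python
import zlib

shards = ["task_db_1", "task_db_2", "task_db_3"]
down = {"task_db_2"}   # marked down by config or by monitoring

def route(order_no):
    alive = [s for s in shards if s not in down]   # hash only over survivors
    idx = zlib.crc32(order_no.encode()) % len(alive)
    return alive[idx]

targets = {route(f"order-{i}") for i in range(100)}
print(sorted(targets))   # only the surviving libraries receive traffic
```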
Example of task engine scheduling
For example, a user buys two mobile phones and one computer, and the phone and computer inventories live in two different databases. The task engine first persists the parent task, then drives its split into two subtasks, and finally guarantees that both subtasks succeed, achieving final data consistency. The task orchestration of the whole flow is as follows:
Task engine interaction flow:
Difference comparison: the last line of defense for heterogeneous data
Wherever data is heterogeneous, there will be differences; to keep their impact under control, the final safety net is difference comparison. Space does not permit a full treatment here, so it will be written up separately. The general flow of comparing DB and Redis: consume inventory-change messages and keep comparing whether the Redis and DB values agree; if they disagree persistently and stably, repair the data by overwriting the Redis value with the DB value.
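That repair loop can be sketched as below: on each inventory-change message, compare the two copies; only after the same mismatch is observed several times in a row (here 3, an illustrative threshold) is Redis repaired from the DB value, which is treated as the source of truth. The dicts stand in for the real DB and Redis.

```python
db = {"iphone": 7}
redis = {"iphone": 9}          # drifted copy
mismatch_streak = {}

def on_change_message(sku, streak_threshold=3):
    """Return True when a repair was performed."""
    if redis.get(sku) == db.get(sku):
        mismatch_streak[sku] = 0       # transient difference healed itself
        return False
    mismatch_streak[sku] = mismatch_streak.get(sku, 0) + 1
    if mismatch_streak[sku] >= streak_threshold:
        redis[sku] = db[sku]           # stable difference: repair from the DB
        return True
    return False

results = [on_change_message("iphone") for _ in range(3)]
print(results, redis["iphone"])   # [False, False, True] 7
```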