1. Requirements

  1. Several products are on sale, each with a stock of 100 units, and each person may buy at most one item.
  2. Purchasing opens at X:XX:00 on month X, day X. Before the opening time, only the product page is visible and the buy button is greyed out.
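
The two requirements above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class name FlashSaleStock, the method try_buy, and the returned status strings are hypothetical, and the one-item limit is assumed to apply per product. A real system would keep this state in a shared store rather than in process memory.

```python
from datetime import datetime


class FlashSaleStock:
    """In-memory model of one product in the sale (illustrative only)."""

    def __init__(self, opens_at, stock=100):
        self.opens_at = opens_at      # moment the buy button becomes active
        self.remaining = stock        # units still available
        self.buyers = set()           # users who already bought this product

    def try_buy(self, user_id, now):
        if now < self.opens_at:
            return "not_open"         # only the product page is visible
        if user_id in self.buyers:
            return "limit_reached"    # at most one item per person
        if self.remaining == 0:
            return "sold_out"
        self.remaining -= 1
        self.buyers.add(user_id)
        return "ok"
```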

2. Activity estimation

  1. Tens of thousands of people are expected to compete for each product.
  2. In the first half minute after the event starts, about 100,000 transaction requests are expected for each product, with an estimated total of 200,000 TPS.
  3. After the first half minute, the vast majority of products are expected to be sold out. The remaining products still support seckill purchases, at an estimated total of 2,000 TPS.

3. System status

  1. Maximum TPS at which the system runs stably over the long term: 1,000/s
  2. Maximum TPS the system can sustain for up to 1 minute without denying service: 1,500/s

4. Design pre-study

Plan 1: Rework the existing system to raise TPS substantially across the whole microservice system

Advantages:

  1. With TPS substantially increased, the business system can absorb the huge volume of transaction requests directly and handle seckill demand comfortably.

Disadvantages:

  1. Seckill events are infrequent, while the cost and risk of the rework are high, so the return on investment is poor.
  2. After the rework, the system architecture is more complex and harder to maintain.

Plan 2: Develop a new application to absorb the demand, and adapt the existing applications to support it

Advantages:

  1. Only small changes are made to the existing business system, so the cost is low.
  2. The new application carries the seckill transaction requests, which gives high flexibility and little need to worry about backward compatibility.

Disadvantages:

  1. The user experience needs to be sacrificed to some degree.

Conclusion:

Considering cost and feasibility, Plan 2 is adopted.

5. Architecture design

5.1 Existing architecture design

  1. The foreground application accepts user visits and transaction requests directly, with no CDN service in front of it.
  2. After receiving a transaction request, the foreground application forwards it to the mid-platform application through a synchronous RPC call.
  3. After receiving a transaction request, the mid-platform application synchronously completes the business steps: order creation, payment, stock deduction, and transaction completion.
  4. The foreground application receives the result from the mid-platform application and shows the user the logistics information for the purchased goods.

5.2 New architecture design

5.2.1 Architectural Design Overview Diagram

5.2.2 Service module design

1. The front end and client randomly discard transaction requests
  1. Product popularity is estimated from the number of views of the product detail page, and a different discard rate is set for each product. A field named discard, with a value in [0, 100], is stored as a product attribute.
  2. When the user submits a transaction request, the front end or client generates a random number in [0, 100]. Requests whose random number is less than discard are dropped without being sent. The user is still shown the queuing page and, after queuing for 10 seconds, a transaction-failure message.
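
A minimal sketch of the client-side discard rule described above, written in Python for brevity even though the real logic would run in the front end or client. The function names should_discard and submit and the returned strings are hypothetical. It assumes the comparison is "random number < discard", so that a higher discard value drops more requests.

```python
import random


def should_discard(discard, rng):
    """Return True if this request should be dropped on the client side.

    `discard` is the product attribute in [0, 100]; higher drops more.
    """
    draw = rng.randint(0, 100)   # inclusive on both ends, matching [0, 100]
    return draw < discard


def submit(discard, rng):
    if should_discard(discard, rng):
        # Nothing is sent to the backend. The user sees the queuing page
        # and, after about 10 seconds, a transaction-failure message.
        return "show_queue_then_fail"
    return "send_to_seckill_app"
```

With an inclusive range of 101 possible values, discard = 0 never drops a request, while discard = 100 still lets through the rare draw that equals 100.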
2. Provide CDN service
  1. Static resources are served from the CDN.
  2. The front end shows a dedicated seckill page on which no user action other than “initiate transaction” sends a request to the company’s microservice business system.
3. Develop a dedicated “seckill application” that acts as a circuit breaker
  1. The seckill application receives transaction requests directly from the front end and client. Admitted requests are forwarded to the foreground application for processing. Rejected requests are intercepted and reported as failed transactions.
  2. The seckill application reads the latest configuration from the configuration center in real time. The transaction discard rate is configurable, so developers can adjust it during the seckill period to keep the TPS of requests forwarded to the business system within 10,000/s.
  3. The seckill application has no database and is stateless, so it can be scaled horizontally to increase throughput proportionally.
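
The admission step of the seckill application can be sketched as follows. This is an assumption-laden illustration, not the actual service: SeckillGate, required_discard_percent, and the config key discard_percent are hypothetical names, and a plain dict stands in for the configuration center. The helper shows the arithmetic behind choosing a rate: if the full estimated 200,000 TPS reached the gate and 10,000 TPS may pass, about 95% of requests would have to be rejected.

```python
import random


def required_discard_percent(incoming_tps, target_tps):
    """Percentage of requests to reject so that pass-through stays at target_tps."""
    if incoming_tps <= target_tps:
        return 0.0
    return 100.0 * (1.0 - target_tps / incoming_tps)


class SeckillGate:
    """Stateless admission check; `config` stands in for the configuration center."""

    def __init__(self, config, rng):
        self.config = config
        self.rng = rng

    def admit(self):
        # Read the rate on every call so a configuration change applies at once.
        rate = self.config.get("discard_percent", 0.0)
        # rng.random() is in [0, 1), so a rate of 100 rejects every request.
        return self.rng.random() * 100.0 >= rate
```

Because the gate keeps no per-request state, any number of instances can run side by side, which matches the horizontal-scaling point above.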
4. Optimize the throughput of the existing business systems
  1. The foreground and mid-platform applications can be tuned for throughput in the following ways:
    • Split the database horizontally (sharded databases and tables) so that it scales out. The number of application pods can then be increased substantially as well.
    • Cache static data in Redis and read it from Redis so that the database is not hit. A background scheduled task refreshes the cache, which must be updated regularly and protected against cache breakdown.
  2. Database and table sharding strategy:
    • Split the database and application by product
    • Split the database and application by user ID
  3. After these optimizations, the maximum TPS that can be sustained for a short time (within 1 minute) without denying service rises at least from 2,000/s to 4,000/s.
  4. Because the foreground application logic is simple, its TPS can reach 10,000/s without further optimization.
  5. Because of the barrel effect (the slowest component sets the limit), the maximum short-term (within 1 minute) TPS of the whole business system without denying service is 4,000/s.
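
As an illustration of the "split by user ID" strategy above, the routing step might look like the sketch below. The function names, the orders_N table naming, and the shard count of 8 are all hypothetical. The hash choice (CRC32 modulo the shard count) is one simple option, not something the design prescribes.

```python
import zlib


def shard_for_user(user_id, shard_count):
    """Map a user ID to a shard index in [0, shard_count)."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(user_id.encode("utf-8")) % shard_count


def table_for_order(user_id, shard_count=8):
    """Name of the order table that holds this user's rows (hypothetical naming)."""
    return f"orders_{shard_for_user(user_id, shard_count)}"
```

All requests from one user land on the same shard, which keeps a per-user purchase-limit check local to a single database.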
5. Asynchronous queue
  1. The foreground application is already fast enough, so only the mid-platform application needs attention.
  2. The foreground application sends all seckill transaction requests to Kafka, and the mid-platform application consumes them from Kafka for the actual business processing.
  3. Because the peak lasts only half a minute to one minute at most, it is acceptable for the mid-platform application to take about two minutes to process all transactions.
  4. Kafka can handle hundreds of thousands of messages per second, so the foreground’s 5,000/s to 10,000/s puts no pressure on it.
  5. With this change, the maximum short-term (within 1 minute) TPS of the whole business system without denying service rises from 4,000/s to 10,000/s.
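
The "about two minutes" estimate can be checked with simple arithmetic. The sketch below assumes constant rates, no traffic after the peak, and a consumer that runs from the first second. The function names are hypothetical, and the rates are the figures quoted in this section (10,000/s into Kafka, 4,000/s consumed by the mid-platform application).

```python
def drain_seconds(ingest_tps, peak_seconds, consume_tps):
    """Seconds from the start of the peak until the queue is empty."""
    total_messages = ingest_tps * peak_seconds
    # The queue cannot be empty before the peak itself has ended.
    return max(peak_seconds, total_messages / consume_tps)


def max_backlog(ingest_tps, peak_seconds, consume_tps):
    """Largest number of queued messages, reached at the end of the peak."""
    return max(0.0, (ingest_tps - consume_tps) * peak_seconds)


print(drain_seconds(10000, 30, 4000))   # 75.0
print(drain_seconds(10000, 60, 4000))   # 150.0
print(max_backlog(10000, 60, 4000))     # 360000
```

Under these assumptions a 30-second peak drains in 75 seconds and a one-minute peak in 150 seconds, which is in line with the roughly two minutes treated as acceptable above.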

Disadvantage: In extreme cases, transaction processing can be slow.

6. Service degradation
  1. Some services are degraded from half an hour before the event starts until half an hour after it ends.
  2. During this period, logistics information is not displayed and invoices are neither generated nor displayed, which protects the stability of the core business system.
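
The degradation window can be expressed as a simple time check. This is a sketch only: the names DEGRADE_MARGIN, DEGRADED_FEATURES, is_degraded, and feature_enabled are hypothetical, and the feature keys are placeholders for the logistics and invoice services named above.

```python
from datetime import datetime, timedelta

DEGRADE_MARGIN = timedelta(minutes=30)             # half an hour on each side
DEGRADED_FEATURES = {"logistics_info", "invoice"}  # non-core features to switch off


def is_degraded(now, sale_start, sale_end):
    """True from 30 minutes before the start until 30 minutes after the end."""
    return sale_start - DEGRADE_MARGIN <= now <= sale_end + DEGRADE_MARGIN


def feature_enabled(feature, now, sale_start, sale_end):
    """Core features stay on; listed non-core features are off inside the window."""
    if feature in DEGRADED_FEATURES and is_degraded(now, sale_start, sale_end):
        return False
    return True
```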