This article was originally published by the author on the Taobao Technology WeChat public account.

1. Introduction

Product data is the foundation of marketing. Most marketing tools ultimately involve processing product data, such as tagging items, modifying item features, or calling various downstream systems. A single item can be processed synchronously, but in real business a large number of items are selected according to business rules and then processed in bulk; this is why the card coupon commodity setting engine came into being. The function of the card coupon commodity setting engine (commonly known as "circle product") is to obtain items from data sources according to business rules, filter out the items that meet the rules, and set item discounts according to operations customized by the business. Setting item discounts mainly involves the commodity center, the marketing center and other domains, so an important capability of circle product is to guarantee data consistency across these domains after the discounts are set. Item data changes frequently, which may make an item no longer meet the selection rules, so another important capability of circle product is to monitor the full stream of commodity center changes. The global view of the card coupon commodity setting engine is shown below.

Circle product has three key elements: data sources, rules and business processing, all of which support horizontal extension. Data sources supply the items to be processed, and different data sources have different access and query methods. Rules are used to filter the data: only data that meets the rules is processed. Data that meets the rules then goes through business processing, which can be customized by each business.

Since its development started in 2017, circle product has been through four Double 11 events and countless daily promotion activities. Today it can process tens of millions of items in real time, guarantee data consistency, monitor changes of the full set of items, and serve as a platform capability.

This article divides the development of circle product into two phases. The first phase laid down the framework; the second phase improved the stability and performance of the system and added the consistency guarantee capability.

2. Phase One

2.1 Overview

2.1.1 Life Cycle

Circle product manages the life cycle through the concept of activities. A circle product pool is associated with rules and a business; the pool detail is the collection of items, and an item is saved into the pool detail after it has been processed. An activity can be associated with multiple circle product pools, while a pool belongs to exactly one activity. After the pool rules are set, items are processed according to the actions customized by the business. While the activity is running, item changes are handled dynamically by listening to item change messages, and when the activity ends a final action is triggered. The life cycle is shown in Figure 2.2. When the rules of a pool are set for the first time, circle product pulls the full set of items from the data source for processing; this is called full circle product processing. After the full processing is complete, the data source or the item information may change, and the changed items need to be re-processed; this is called incremental circle product processing.

Figure 2.1 Engine model diagram of card coupon commodity setting

Figure 2.2 Card commodity setup engine life cycle

2.1.2 System Architecture

As shown in Figure 2.3, the circle product framework can be divided into four modules: the data source module, the action module, the rule module and the business processing module. The setting side is divided into three parts: activity setting, circle product pool setting and rule setting. The four core modules of circle product are explained below.

Figure 2.3 Circle product framework diagram

2.2 Data source Module

The data source module supplies the items for circle product. Data sources fall into four main categories: commodity list, seller list, synchronized database table and commodity change message.

These data sources give rise to a variety of circle product methods, such as selection by commodity list, by seller list, by big promotion spot goods, by marketing platform site, by seller-promoted items, by Fliggy sellers, by new retail stalls, and so on. Take the big promotion spot method as an example: each big promotion recruits spot goods through the marketing platform's investment promotion, so this method can select all spot goods of a big promotion, and can also combine category, item tag and other rules to further filter items.

2.2.1 List of Commodities

The commodity list is the simplest data source: the item range is specified directly by item IDs. Because item IDs are filled in directly, network transmission limits the list to at most 100,000 (10W) items. During full processing, the full set of items is read from the pool rule and processed; incremental processing is driven by listening to commodity change messages.

2.2.2 Seller List

The seller list data source is a collection of seller IDs, and the item range is determined by the specified sellers. During full processing, each seller's items are obtained from the shop search interface by seller ID. Incremental processing is driven by listening to item change messages, which are triggered when a seller publishes new items or edits existing ones.

2.2.3 Synchronizing library tables

A synchronized database table means the data of the original database is synchronized into a new table for circle product to use; this approach is more flexible and does not affect the original data source. Full processing obtains the full data by scanning the table. Incremental processing has two channels: the first listens to database changes through Jingwei, and the second listens to commodity change messages.

A variety of circle product methods can be derived from the characteristics of the data sources. The marketing platform investment promotion data source supports selection by big promotion spot goods, by marketing platform site and by seller-promoted items; the new retail data source supports selection by stall and by business identity.

2.2.4 Product Change message

Changes to item information may make an item no longer meet the rules, so changed items need to be added or removed. For example, if operations staff select items of a target category and a seller then edits an item's category, items that used to match the category rule but no longer do must be removed, and items that did not match before but now do must be added. Every change to item information triggers a commodity change message, so every incremental circle product method includes processing of commodity change messages.

The daily average QPS of commodity change messages is about 10,000 (1W), and the peak can exceed 40,000 (4W). At this stage, because every pool rule was independent and the relationship between an item and the pools could not be determined, every pool had to process the full stream of commodity change messages. Assuming the message QPS is 10,000 and there are 5,000 valid pools, the QPS actually processed by the system becomes 50 million. As a result, only rules that could be evaluated locally were able to support commodity change message processing. Even so, the circle product system consumed a lot of machine resources, at one point running on more than 600 machines with CPU utilization above 60%.

2.3 Rule Module

2.3.1 Framework design

ItemPoolRule is the pool rule class, where relationRuleList holds the inclusion rules and exclusionRuleList holds the exclusion rules. An item is considered to meet the pool rule only if it meets the inclusion rules and does not hit any exclusion rule. RelationRule holds the concrete rule content. RuleHandler is the interface for evaluating rules, and every rule must implement RuleHandler, for example ItemTagRuleHandler and SellerRuleHandler.

Figure 2.4 Rule model class diagram
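To make the model concrete, here is a minimal sketch of the classes described above, assuming simplified fields and a boolean match method (the real classes carry more metadata and richer signatures):

import java.util.List;

// Concrete rule content, e.g. an item tag rule or a seller rule.
public class RelationRule {
    private String ruleType;   // which RuleHandler evaluates this rule
    private String ruleValue;  // serialized rule content
    // getters and setters omitted
}

// Pool rule: an item must hit the relation rules and must not hit any exclusion rule.
public class ItemPoolRule {
    private List<RelationRule> relationRuleList;
    private List<RelationRule> exclusionRuleList;
    // getters and setters omitted
}

// Every rule type implements this interface, e.g. ItemTagRuleHandler, SellerRuleHandler.
public interface RuleHandler {
    boolean match(Long itemId, RelationRule rule);
}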

2.3.2 rule tree

The rule tree design is shown in Figure 2.5. Each node represents a rule. Top-level rules must be rules that can serve as data sources, such as the commodity list rule and the seller list rule. Whether an item meets the rules can be defined as follows: if the item meets every rule node on at least one path from the top-level rule to a leaf node, the item is considered to meet the rules.

Figure 2.5 Rule tree design

For a better understanding, consider the example in Figure 2.6 below, which operates on a commodity list. The left path selects items in the list that meet a second-level category rule; the right path selects items in the list that meet a first-level category rule and a specified item topic.

Figure 2.6 Rule tree example
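Under this definition, the matching check is a simple recursive walk over the tree. The sketch below reuses the RelationRule and RuleHandler sketches from the previous section and assumes a hypothetical RuleNode class; it is illustrative, not the engine's actual code:

import java.util.ArrayList;
import java.util.List;

public class RuleNode {
    RelationRule rule;
    RuleHandler handler;
    List<RuleNode> children = new ArrayList<>();
}

public class RuleTreeMatcher {
    // An item meets the rules if some path from the root to a leaf matches at every node.
    public boolean matches(Long itemId, RuleNode node) {
        if (!node.handler.match(itemId, node.rule)) {
            return false;                        // this path is broken at the current node
        }
        if (node.children.isEmpty()) {
            return true;                         // reached a leaf, the whole path matched
        }
        for (RuleNode child : node.children) {
            if (matches(itemId, child)) {
                return true;                     // one matching child path is enough
            }
        }
        return false;
    }
}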

2.3.3 Top-level Rules

Because this section is tightly coupled with the "Batch Processing Module" section (2.4), you may want to read that section first.

A top-level rule is both a rule and a data source: circle product obtains all items of the data source from the top-level rule. When a commodity list is the top-level rule, the rule content contains the item IDs, which are themselves the items of the data source. When a seller list is the top-level rule, the rule content contains the seller IDs; to obtain item IDs from this rule, the shop search interface is called for each seller ID. When a big promotion spot rule is the top-level rule, the rule content contains the promotion activity ID; to obtain item IDs, the items are pulled from the synchronized investment promotion spot table by activity ID.

Limitations

As the "Batch Processing Module" section shows, at this stage the engine first counts the maximum number of items the rule could contain, and then processes them page by page. This method has limitations: when the top-level rule becomes complex, it can no longer be handled this way.

Take a slightly more complicated example: what happens when the seller list contains multiple sellers? At this stage the handling follows the vertical approach in Figure 2.9: the largest item count among all sellers is taken as the count, and then pages are processed; within each page, every seller must be looped over, so the more sellers there are, the more items each page contains. For this reason, the seller list method limits the number of sellers to at most 300.

Take an even more complex example: suppose a brand group contains multiple sellers and each seller has many items, and now all the items of all the sellers in the brand group need to be selected. How is that handled? At this stage such complex rules could not be handled at all; see the second phase for the solution.

2.4 Batch processing module

2.4.1 Distributed processing

Circle product splits the full set of items into many parts through paging and then processes them in a distributed way through Metaq. The flow chart is shown in Figure 2.7. When full processing is triggered, a Metaq message of the rule change type is produced and handled by the rule change action module. The rule change action first calculates the maximum possible number of items in the data source, splits it into many parts through paging, and produces one item addition message per part; these messages are handled by the item addition action module. The item addition action first pulls the item ID set of its part from the data source, then filters it with the pool rules, and finally invokes the corresponding business for processing.

Figure 2.7 Distributed processing flow chart

2.4.2 Paging

Paging starts by calculating the maximum possible total number of items and then paging at fixed intervals. The key information carried by item addition and item deletion messages is start and end. Take the simplest commodity list method as an example: suppose operations fill in 50,000 (5W) item IDs and each page holds 500 IDs, then the first page is start=0, end=500, the last page is start=49500, end=50000, and each page identifies exactly which item IDs it needs to process.
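The page splitting itself is straightforward; a minimal sketch follows (pageSize of 500 as in the example, class and method names are illustrative):

import java.util.ArrayList;
import java.util.List;

public class PageSplitter {
    // Splits [0, totalCount) into [start, end) slices; each slice becomes one item addition message.
    public List<int[]> splitIntoPages(int totalCount, int pageSize) {
        List<int[]> pages = new ArrayList<>();
        for (int start = 0; start < totalCount; start += pageSize) {
            int end = Math.min(start + pageSize, totalCount);
            pages.add(new int[]{start, end});
        }
        return pages;
    }
}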

But data sources are often not that simple. Take the slightly more complicated big promotion spot method as an example: the spot goods synchronized from investment promotion are stored in 64 tables, sharded by item ID, with the promotion activity ID as an indexed field. How can all item IDs of a given promotion activity be obtained efficiently? When the amount of data is small, count plus limit can be used to fetch data in batches; when the amount of data is large, limit runs into the deep page turning problem.

To avoid the poor performance of deep page turns with limit, circle product processes the data as shown in Figure 2.8 below, which can be thought of as horizontal paging. Count equals the sum of max(id) - min(id) over all tables, and tasks are then split by interval. The actual interval is 5000; for ease of drawing, the figure uses an interval of 10. Each item addition message therefore only needs to carry start and end.

Figure 2.8 Horizontal paging

When an item addition message is processed, the 64 tables are looped over, taking min and max of each, until the table containing start and end is found; the items are then fetched from that table according to start and end. The core code logic is as follows.

public List<CampaignItemRelationDTO> getCampaignItemRelationList(int start, int end,
        Function<Integer, Long> getMaxId, Function<Integer, Long> getMinId,
        Function<CampaignItemRelationQuery, List<CampaignItemRelationDTO>> queryItems) {
    List<CampaignItemRelationDTO> relationList = Lists.newArrayList();
    for (int i = 0; i < 64; i++) {
        long minId = getMinId.apply(i);
        long maxId = getMaxId.apply(i);
        long tableTotal = maxId - minId + 1;
        if (minId <= 0 || maxId <= 0) {
            continue;
        }
        // start/end do not fall in this table: skip it and shift the offsets
        if (start - tableTotal > 0) {
            start -= tableTotal;
            end -= tableTotal;
            continue;
        }
        // if the whole range fits in this table, i.e. minId + end <= maxId, fetch it and return;
        // otherwise take what this table has and carry the remainder to the next table
        if (minId + end <= maxId) {
            relationList.addAll(queryItems.apply(getQuery(start + minId, end + minId, i)));
            break;
        } else {
            relationList.addAll(queryItems.apply(getQuery(start + minId, maxId, i)));
            // the remaining range starts at the beginning of the next table: end - tableTotal
            end = (int) (end - tableTotal);
            start = 0;
        }
    }
    return relationList;
}

There are several disadvantages to this approach:

  1. This works reasonably well for tables with dense data, but it is very inefficient for sparse tables: if the data distribution is sparse, the count is very large and the batching produces a huge number of tasks, while each task may finally fetch only a few hundred item IDs. For example, the new retail circle product method, which reuses the same paging approach due to the framework, can produce up to 200,000 (20W) messages for one full run while actually fetching only a few hundred items.
  2. Processing each message requires querying many tables until the one containing start and end is found, and max and min are used to decide whether they belong to the current table. Taking max and min frequently also puts pressure on the DB, so max and min need to be cached to avoid it.

For the first disadvantage: sparse data sources produce too many messages, which can be improved without changing the framework. The total count only needs to be calculated from another angle, as shown in Figure 2.9 below: count is the difference between the maximum and minimum IDs across all tables, so the count does not become huge even for sparse data sources. During task processing, the start and end range is then applied to every table to fetch the corresponding item IDs. This also slightly reduces how often max and min have to be taken. However, if this paging approach were applied to a dense data source, a single page could contain a very large amount of data.

Figure 2.9 Vertical paging
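A minimal sketch of the vertical paging idea, under the same assumptions as the code above (the BiFunction stands in for a query that fetches rows of one sub-table whose IDs fall in the given range; the names are illustrative):

import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class VerticalPageFetcher {
    // count = global max(id) - global min(id); start/end are offsets within that global range.
    // One task applies the same id range to every sub-table and collects whatever falls inside it.
    public List<CampaignItemRelationDTO> fetch(long idRangeStart, long idRangeEnd,
            BiFunction<Integer, long[], List<CampaignItemRelationDTO>> queryRangeInTable) {
        List<CampaignItemRelationDTO> relationList = new ArrayList<>();
        for (int i = 0; i < 64; i++) {
            relationList.addAll(queryRangeInTable.apply(i, new long[]{idRangeStart, idRangeEnd}));
        }
        return relationList;
    }
}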

For the second disadvantage: since it cannot be decided directly which table start and end should be taken from, each table can in fact only be paged separately, and the message then needs to carry not only start and end but also the sub-table index. However, this method still produces too many tasks for sparse data sources, and the current batching framework does not support it.

2.5 Action Module

The function of the action module is to process Metaq messages. Actions correspond one-to-one with message types and are divided into rule change, item addition and item deletion.

2.5.1 Rule Change Action

The rule change action module processes messages of the rule change type. Its main flow is to call the batch processing module to split the work into batches and then send the information of each batch through Metaq, i.e. to produce the item addition and item deletion messages.

2.5.2 Item Addition Action

The item addition action module processes messages of the item addition type. The processing flow is shown in Figure 2.10 below.

Figure 2.10 Item addition action processing flow chart

2.5.3 Item Deletion Action

The item deletion action module processes messages of the item deletion type. Its processing flow is similar to Figure 2.10, except that at the end it calls the item deletion method of the business processing module.

2.6 Business Processing Module

The framework class diagram of the business processing module is shown in Figure 2.11. Each business needs to implement TargetHandler, where the handle method processes item additions and the rollback method processes item deletions. Several major businesses are already connected: category coupons, coupons, membership cards, etc.

Figure 2.11 Business processing class diagram
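A minimal sketch of the interface implied by the class diagram; the parameter list here is an assumption for illustration, not the engine's exact signature:

import java.util.List;

public interface TargetHandler {
    // Called when items are added to a circle product pool, e.g. tag the items or set the discount.
    void handle(Long itemPoolId, List<Long> itemIds);

    // Called when items are removed from a pool; rolls back whatever handle did.
    void rollback(Long itemPoolId, List<Long> itemIds);
}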

2.7 Phase Summary

In this phase circle product grew from nothing: it was born out of category coupons and then grew beyond them. On the business side it supported category coupons, coupons, membership cards and other businesses; on the performance side it could handle millions or even tens of millions of items. The system kept improving through continuous development, but the circle product of this phase still had the following shortcomings.

2.7.1 Performance of Commodity Change Message Processing

Section 2.2.4 explained why commodity change messages must be processed and the problems involved. As the number of valid pools grew, the QPS of change message processing grew with it and system performance got worse and worse; moreover, many rules require HSF calls, cache queries or other time-consuming operations, so those rules could not support change message processing at all. During this phase, the CPU of the cluster carrying circle product stayed above 50%, even though the cluster had more than 600 machines.

2.7.2 Complex Top-Level Rule Processing

Circle product had no good way to handle complex top-level rules, yet with the business changing rapidly it needed to be able to. Even without overly complex top-level rules, it already hit limitations when handling the seller list method.

2.7.3 System Stability and Controllability

  1. Stability problems: as section 2.4 shows, with a large number of items, any rule change immediately fissions into many more circle product messages, and the Metaq backlog could reach millions of messages. Because downstream systems throttle traffic, large numbers of exceptions were thrown, system load was high, message processing was slow, and Metaq message processing risked an avalanche; sometimes a single message was retried tens of thousands of times.
  2. Controllability problems: since triggering a change immediately produces more messages, circle product cannot selectively process, pause, or discard messages when they pile up, which means there is no handle to control the system when something goes wrong. For example, two messages, one deleting an item and one adding it, were each retried tens of thousands of times, repeatedly tagging and untagging the item; the resulting flood of item changes delayed search engine synchronization, and at the time we could only watch. As another example, a bug in one pool's rule code caused full GC, and the pools related to that rule generated a large number of messages; since messages could not be selectively processed, the whole circle product system broke down.

2.7.4 Batch Processing Defects

Section 2.4.2 discussed the shortcomings of the first-phase paging. Paging should not be generalized across different data sources, and the framework needs to be more flexible.

2.7.5 Data Consistency Problems

When large numbers of items are processed, exceptions in the system or in downstream systems are unavoidable, so data may become inconsistent. For the business, an item that should have been added but was not may be acceptable; an item that should have been deleted but was not is likely to cause losses.

3. Phase Two

3.1 Overview

The second phase optimizes the architecture for the problems of the first phase, as shown in Figure 3.1, where the yellow parts are new. Circle product is now divided into six blocks: the data source module, the action module, the rule module, the business processing module, the scheduling module and the setting side. The scheduling module is the most important addition. A task model is introduced, as shown in Figure 3.2: tasks are first saved to the DB, scheduleX triggers the scheduling logic at second-level granularity, and the tasks are finally distributed through Metaq for distributed processing. In the figure, the red lines represent the full processing flow and the orange lines represent the incremental flow. The new parts are described in detail below.

Figure 3.1 Circle product architecture diagram of the second phase

Figure 3.2 Task model

3.2 Commodity Change Message Optimization

The commodity change message processing flow is shown in Figure 3.3. Step one, establish a generalized relationship between circle product pools and items. Step two, use Blink to filter the change messages according to that generalized relationship, leaving only a small number of item changes. Step three, use the generalized relationship to determine which pools need to process each remaining change message. After filtering, the daily average QPS of commodity change messages is about 200, and only the pools related to an item need to process its change messages, so for any single pool the QPS of change message processing is below 100 and the performance cost drops sharply. The number of cluster machines has fallen from more than 700 at the peak to more than 300 now (since the cluster also carries other business, the actual number could be compressed to fewer than 100).

Figure 3.3 Flow chart of commodity change message

The key to filtering commodity change messages is how to establish the generalized relationship between pools and items. The idea is to delimit the possible items as tightly as the specific rules allow. For example, when operations staff fill in a seller list, the relationship between the pool and those sellers is already determined, and other sellers will definitely have nothing to do with the pool; the pool-seller relationship can therefore be stored in Tair for Blink to filter the change messages. The seller-to-pool relationship is a fairly general idea, and other circle product methods can also be converted to it. For example, for the commodity list method, once the item IDs are filled in, the sellers those items belong to are determined, and items of other sellers will have nothing to do with the pool.

Of course, the seller-to-pool relationship is not always applicable. For example, with the big promotion method, a promotion may involve hundreds of thousands of sellers and sellers keep signing up, so it is hard to maintain the pool-seller relationship. For big promotion pools, a relationship between tmc_tag and the pool can be established instead: all big promotion items carry a unified tmc_tag, so the tag-to-pool relationship can be saved in Diamond for Blink to filter the change messages. In short, every circle product method can find some generalized relationship between pools and items according to its rules, and whether an item is related to a pool is judged from the information on the item together with that relationship.
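A minimal sketch of the filtering decision, assuming the generalized relationships have already been loaded from Tair/Diamond into in-memory maps (in the real system this logic runs inside a Blink job; the class and method names are illustrative):

import java.util.*;

public class ItemChangeMessageFilter {
    // generalized relations: seller -> pool ids, and big-promotion tmc_tag -> pool ids
    private final Map<Long, Set<Long>> sellerToPoolIds = new HashMap<>();
    private final Map<String, Set<Long>> tmcTagToPoolIds = new HashMap<>();

    // Returns the pools that may care about this item change; an empty set means the message can be dropped.
    public Set<Long> relatedPools(long sellerId, Set<String> itemTmcTags) {
        Set<Long> pools = new HashSet<>(sellerToPoolIds.getOrDefault(sellerId, Collections.emptySet()));
        for (String tag : itemTmcTags) {
            pools.addAll(tmcTagToPoolIds.getOrDefault(tag, Collections.emptySet()));
        }
        return pools;
    }
}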

3.3 Complex top-level rule processing

3.3.1 Dimension definition

In the first stage, it has been explained that top-level rules are rules that can be used as data sources. In order to better support complex data source rules, the concept of dimension is introduced, and then complex rules are turned into simple rules by dimensionality reduction. Finally, the purpose is to obtain the commodity ID contained in the data source.

Definition 1: A single item ID is zero-dimensional, i.e. it has no dimension.

Definition 2: A rule from which multiple item IDs can be obtained directly is one-dimensional, such as the commodity list rule or the single-seller rule.

Definition 3: A two-dimensional rule is composed of multiple one-dimensional rules, such as the multi-seller rule.

Definition 4: A three-dimensional rule is composed of multiple two-dimensional rules, and in general a higher-dimensional rule is composed of multiple rules one dimension lower.

From these definitions, a seller list rule may be one-dimensional or two-dimensional: when it contains only one seller it is one-dimensional, and when it contains multiple sellers it is two-dimensional. To aid understanding, return to the complicated example above: a seller has many items, and many sellers sign up under a brand group. If operations now want to select all items under multiple brand groups, Figure 3.5 shows the dimension reduction process for this rule.

Figure 3.5 Rule dimension reduction process
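A sketch of what one dimension reduction step might look like, reusing the RelationRule sketch from section 2.3.1 (getters and setters assumed); the rule types, class names and comma-separated encoding are assumptions for illustration:

import java.util.ArrayList;
import java.util.List;

// Hypothetical contract: a handler reduces a top-level rule by exactly one dimension.
public interface DimensionReducible {
    List<RelationRule> reduceOneDimension(RelationRule topRule);
}

// Example: a multi-seller (two-dimensional) rule reduces to single-seller (one-dimensional) rules.
public class SellerListReducer implements DimensionReducible {
    @Override
    public List<RelationRule> reduceOneDimension(RelationRule topRule) {
        List<RelationRule> reduced = new ArrayList<>();
        for (String sellerId : topRule.getRuleValue().split(",")) {
            RelationRule single = new RelationRule();
            single.setRuleType("SINGLE_SELLER");
            single.setRuleValue(sellerId);
            reduced.add(single);
        }
        return reduced;
    }
}

A three-dimensional brand-group-list rule would be reduced the same way: first into single-brand-group rules, then each of those into single-seller rules that the batch module can page directly.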

3.3.2 Rule Change Action Adjustment

In the first phase, the rule change action called the batch processing module to split work into batches and then sent each batch's information through Metaq, i.e. produced the item addition and item deletion messages. The flow is now adjusted as shown in Figure 3.6 below: first determine whether the rule is one-dimensional; only one-dimensional rules go directly to batch processing, otherwise the rule is reduced by one dimension and the generated dimension reduction task is handled by the rule dimension reduction action.

Figure 3.6 New rule change action processing flow chart

3.3.3 New Rule Dimension Reduction Action

Figure 3.7 shows the processing flow of the rule dimension reduction action. The custom reduction method in RuleHandler produces the rules of the next lower dimension, so one reduction task reduces the rule by exactly one dimension. Dimension reduction only targets top-level rules that serve as data sources. Therefore the action first recursively finds the top-level rules, then calls the custom reduction method to obtain a set of lower-dimensional top-level rules, then replaces the top-level rules in the rule tree with the reduced ones to form new rule trees, and finally generates a rule change task for each new tree; the rule change action then decides whether further reduction is needed.

Figure 3.7 Processing flow chart of rule dimension reduction action

3.4 Adding a scheduling module

3.4.1 Task Scheduling

The newly added task model is shown in Figure 3.2. A task is the equivalent of a circle product message in the first phase, except that in the second phase tasks are first persisted to the database and then scheduled by the scheduler.

The scheduler is what drives tasks forward: tasks of all types are inserted into the DB and scheduled uniformly by the scheduler. Completed tasks are cleaned from the task table at intervals, but even so the table may hold millions of tasks, and the speed of circle product is largely determined by the scheduler, so its performance requirements are high; on top of that, the scheduler must also be flexible.

The scheduler's processing flow is shown in Figure 3.8. ScheduleX triggers the scheduling logic at second-level granularity, and task IDs are then distributed through Metaq. ScheduleX could in fact distribute the task IDs itself, but distribution was eventually switched to Metaq because scheduleX delivered large numbers of tasks with unacceptable delays.

Returning to Figure 3.8: the basis of task scheduling is knowing the distribution of unfinished tasks, and a very important design decision was made here to avoid the slow SQL that computing this distribution would otherwise produce. Step 1: obtain the distribution of pool IDs that unfinished tasks belong to; since only pool IDs are aggregated by status, and status and pool ID are covered by an index, this query is fast thanks to the covering index. Step 2: randomly pick ten pools, which keeps scheduling balanced and keeps the counting cheap. Step 3: count the unfinished tasks of those ten pools. Step 4: based on the counts from step 3 and the system configuration, allocate how many tasks of each pool participate in this scheduling round. Step 5: fetch the task IDs according to the allocated quota. Step 6: send the task IDs in batches through Metaq to different machines. Step 7: receive the Metaq message. Step 8: submit the task IDs for asynchronous processing; to speed this up a thread pool is maintained, and the task ID only needs to be put into its blocking queue. Step 9: immediately after submission, update the task status to "processing" so the task will not be scheduled again; tasks in processing do not count as unfinished.

Figure 3.8 Task scheduling flow chart
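A condensed sketch of one scheduling round as described above, with hypothetical DAO and Metaq wrapper interfaces standing in for the real persistence, configuration and messaging wiring:

import java.util.Collections;
import java.util.List;

public class TaskScheduler {

    interface TaskDao {
        List<Long> findPoolIdsWithUnfinishedTasks();          // covering index on (status, pool id)
        List<Long> findUnfinishedTaskIds(long poolId, int limit);
    }

    interface TaskIdSender {
        void send(long poolId, List<Long> taskIds);           // fan out via Metaq
    }

    private static final int POOLS_PER_ROUND = 10;
    private final TaskDao taskDao;
    private final TaskIdSender sender;

    public TaskScheduler(TaskDao taskDao, TaskIdSender sender) {
        this.taskDao = taskDao;
        this.sender = sender;
    }

    // One round, triggered by scheduleX every few seconds.
    public void scheduleOnce() {
        List<Long> poolIds = taskDao.findPoolIdsWithUnfinishedTasks();        // step 1
        Collections.shuffle(poolIds);                                         // step 2: balance across pools
        for (Long poolId : poolIds.subList(0, Math.min(POOLS_PER_ROUND, poolIds.size()))) {
            int quota = quotaFor(poolId);                                     // steps 3-4: count + config (omitted)
            List<Long> taskIds = taskDao.findUnfinishedTaskIds(poolId, quota); // step 5
            sender.send(poolId, taskIds);                                     // step 6
            // steps 7-9 happen on the consumer side: receive, submit to a worker
            // thread pool, and mark the tasks as "processing" right after submission
        }
    }

    private int quotaFor(long poolId) {
        return 200;   // placeholder for per-pool speed configuration
    }
}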

Compared with the message-driven approach of the first phase, tasks in the second phase are first saved to the DB and then scheduled by the scheduler, which provides much more flexibility and brings the following advantages:

  1. Task priority can be adjusted per circle product pool, so a problem in one pool does not affect the whole;
  2. Task scheduling speed can be adjusted, paused, or assigned per task type;
  3. Task scheduling can be monitored and counted accurately;
  4. The circle product process can be queried and traced back;

3.4.2 Task Statistics

A platform is not complete without visualization, and task processing progress is an important part of it. Counting tasks requires group by and count, and at its largest the task table may hold millions of rows, so synchronous statistics are out of the question; an asynchronous approach is used instead, as shown in Figure 3.9 below. The covering index technique is used to get the distribution of pool IDs first; no single pool has very many tasks, so counting each pool separately does not produce slow SQL.

Figure 3.9 Task statistical thinking
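A sketch of the two queries this implies, assuming a task table named item_pool_task with columns item_pool_id, status and task_type, a composite index on (status, item_pool_id), and status = 0 meaning unfinished (all of these names are illustrative):

-- Step 1: covering-index scan that returns only the pool ids that still have unfinished tasks
SELECT DISTINCT item_pool_id FROM item_pool_task WHERE status = 0;

-- Step 2: count one small pool at a time, so no group-by runs over millions of rows
SELECT task_type, COUNT(*) AS cnt
FROM item_pool_task
WHERE item_pool_id = 1024 AND status = 0
GROUP BY task_type;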

3.4.3 New idea of batch processing

The defects of batch processing were mentioned in the first phase; here we discuss how to solve them completely. The new batch processing method is still under development, and its design ideas are as described in this chapter.

3.4.3.1 Framework Design

Circle product has many data sources, each with different characteristics, so it is impossible to batch all data sources in one common way. The batching framework should therefore be more general and let every data source define its own batching method. As shown in Figure 3.11, a basic batch object Pageable is added; for compatibility with the old framework, Pageable contains the batch parameters start, end and pageSize used by the old framework. Custom batch objects such as TablePageable inherit from Pageable. RuleHandler gains a custom batching method getPageableList; the batching logic of the old framework can live in AbstractRuleHandler.getPageableList, and a data source that needs custom batching overrides getPageableList in its own RuleHandler.

Figure 3.11 New batching framework class diagram
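A minimal sketch of this design, reusing the RelationRule and RuleHandler sketches from section 2.3.1; the fields and the default implementation are assumptions that only illustrate the shape of the framework:

import java.util.ArrayList;
import java.util.List;

// Base batch object; start/end/pageSize keep compatibility with the old framework.
public class Pageable {
    public long start;
    public long end;
    public int pageSize;
}

// A data-source-specific batch object, e.g. carrying the sub-table index.
public class TablePageable extends Pageable {
    public int tableIndex;
}

public abstract class AbstractRuleHandler implements RuleHandler {
    // Default batching: the old fixed-interval start/end paging.
    // Data sources with special paging needs override this method.
    public List<Pageable> getPageableList(RelationRule topRule) {
        List<Pageable> pages = new ArrayList<>();
        long count = countMaxItems(topRule);
        for (long s = 0; s < count; s += 500) {
            Pageable page = new Pageable();
            page.start = s;
            page.end = Math.min(s + 500, count);
            page.pageSize = 500;
            pages.add(page);
        }
        return pages;
    }

    protected abstract long countMaxItems(RelationRule topRule);
}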

3.4.3.2 Page Processing Roadmap

In the first phase, paging was done by primary key ID to avoid the deep page turning problem of limit. Here we first discuss why limit has a deep page turning problem and how it can be optimized.

As shown in the following SQL, when N is large this query is inefficient and can even drag down the database when executed concurrently: the storage engine has to read N+M rows, return M rows according to the limit, and discard the first N rows. Normally a business facing this would simply forbid deep paging and require filtering conditions instead, but circle product has to fetch all the data in batches, so it cannot avoid the problem.

SELECT * FROM table WHERE campaign_id = 1024 LIMIT N,M

Two solutions are summarized here. Since circle product only needs to obtain all valid items, it does not have to page over the whole data set; each sub-table can be paged separately.

ID and LIMIT combination optimization

In the ideal case IDs increase continuously, and the offset can be replaced by an ID condition in the WHERE clause, as in the following SQL.

SELECT * FROM table WHERE campaign_id = 1024 and id > N LIMIT M

Each subsequent batch is queried with limit starting from the ID of the last row of the previous batch. Even if the IDs are not continuous, limit simply skips over the gaps, which reduces the number of queries.

Applied to circle product's actual situation, the idea works like this: min and max give the lower and upper bounds of the data distribution in a table. For sparse data the batch interval can be large (to solve the problem of too many tasks, for example an interval of 20,000, i.e. end - start = 2W); start and end are the first and last ID of a batch. Within the batch, limit is used with id > start as the WHERE condition to fetch one page, then the maximum ID of that page is used as the WHERE condition to fetch the next page, and so on until id > end, which may take only one or two iterations. The flow chart is shown in Figure 3.12 below.

Figure 3.12 Circle product paging optimization idea
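A sketch of that loop, assuming a query function that runs "SELECT id, item_id FROM ... WHERE ... AND id > ? ORDER BY id LIMIT ?" against one sub-table (class and method names are illustrative):

import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class IdRangeFetcher {

    public static class IdRow {
        public long id;       // primary key
        public long itemId;   // the item id we actually want
    }

    // Fetches all item ids of one batch, i.e. rows whose primary key lies in (start, end].
    public List<Long> fetchBatch(long start, long end, int pageSize,
            BiFunction<Long, Integer, List<IdRow>> queryAfterId) {
        List<Long> itemIds = new ArrayList<>();
        long lastId = start;
        while (true) {
            // "id > lastId LIMIT pageSize" skips sparse gaps without a large OFFSET
            List<IdRow> rows = queryAfterId.apply(lastId, pageSize);
            if (rows.isEmpty()) {
                return itemIds;                  // table exhausted
            }
            for (IdRow row : rows) {
                if (row.id > end) {
                    return itemIds;              // left this batch's id range
                }
                itemIds.add(row.itemId);
                lastId = row.id;
            }
            if (rows.size() < pageSize) {
                return itemIds;                  // last page of the table
            }
        }
    }
}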

Covering index optimization

When a SQL query is fully covered by an index, i.e. both the returned columns and the query conditions are in the index, a covering index query performs very well. The idea is to use limit to query only the primary key IDs and then fetch the corresponding rows by primary key; because the skipped rows never have to be read from disk, this performs better than the previous approach. The SQL is as follows:

SELECT * FROM table AS t1 
INNER JOIN (
  SELECT id FROM table WHERE campaign_id = 1024 LIMIT N,M 
) AS t2 ON t1.id = t2.id

Table item_pool_detail_0733 contains 10,953,646 rows; after filtering with item_pool_id = 1129181 and status = -1, 3,865,934 rows remain; item_pool_id and status form a composite index.

Test 1: with a small offset, the SQL and execution plan are shown below and the average execution time is 83ms, so performance is fine when the offset is small. SQL:

SELECT *  FROM `item_pool_detail_0733` WHERE item_pool_id = 1129181 and status = -1   LIMIT 1000,100

Execution Plan:

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE item_pool_detail_0733 ref idx_pool_status,idx_itempoolid idx_pool_status 12 const,const 5950397 100.00

Test 2: with a large offset, the SQL is shown below; the execution plan is the same as above and the average execution time is 6371ms, so performance is very poor. SQL:

SELECT *  FROM `item_pool_detail_0733` WHERE item_pool_id = 1129181 and status = -1  LIMIT 3860000,100

Test 3 applies the covering index optimization to the SQL of test 2; the SQL and execution plan are shown below. The average execution time is 1262ms, more than 5 times faster than the 6371ms before optimization, but an execution time above 1s is still unacceptable for circle product. In practice, sharding spreads the data evenly across all tables, so it is rare for a single table to still have more than 3 million rows after filtering; I therefore tested the same SQL with smaller offsets. With LIMIT 2000000,100 the average execution time is 697ms; with LIMIT 1000000,100 it is 390ms; with LIMIT 500000,100 it is 230ms. SQL:

SELECT * FROM `item_pool_detail_0733` as t1 INNER JOIN ( SELECT id FROM `item_pool_detail_0733` WHERE item_pool_id = 1129181 and status = -1 LIMIT 3860000,100 ) as t2 on t1.id = t2.id

Execution Plan:

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY ALL 3860100 100.00
1 PRIMARY t1 eq_ref PRIMARY PRIMARY 8 t2.id 1 100.00
2 DERIVED item_pool_detail_0733 ref idx_pool_status,idx_itempoolid idx_pool_status 12 const,const 5950397 100.00 Using index

From the tests above, the performance of a limit query with an offset around 500,000 (50W) is acceptable. Moreover, circle product data sources are sharded by item ID, so the data left in a single table after filtering can almost always be kept within 500,000. If circle product optimizes paging this way, the two shortcomings of the first phase can be solved completely, and the paging logic becomes much simpler than before.

3.5 Data Consistency Assurance

The data consistency guarantee fully reuses the circle product framework and needs only three additions: first, a consistency check action in the action module; second, a scheduleX job that automatically produces consistency check tasks; third, a custom consistency check method in the business processing module.

3.5.1 New Consistency Check Action

Figure 3.13 shows the flow chart of the consistency check task.

Figure 3.13 Consistency check action processing flow chart

3.5.2 Automatic Generation of Consistency Check Tasks

Circle product currently has more than 5,000 valid pools. Checking all of them at once could produce millions of check tasks, so a scheduled job monitors the number of unfinished tasks in the task table and, when it is low, picks a few pools and produces consistency check tasks for them. Producing the check tasks still reuses the rule change flow.

3.5.3 Custom Business Consistency Check

The new business processing class diagram adds a custom consistencyCheck method on top of the previous one; businesses that need custom consistency checking implement this method.

Figure 3.14 New business processing class diagram
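A minimal sketch of the extended interface, continuing the TargetHandler sketch from section 2.6; the default no-op and the parameter list are assumptions:

import java.util.List;

public interface TargetHandler {
    void handle(Long itemPoolId, List<Long> itemIds);
    void rollback(Long itemPoolId, List<Long> itemIds);

    // Compares the pool detail with the business side's own records and repairs any difference.
    // Default is a no-op; businesses that need a custom check override it.
    default void consistencyCheck(Long itemPoolId, List<Long> itemIds) {
    }
}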

3.6 Performance Data

The performance of the card coupon commodity setting engine is mainly measured by two indicators: task scheduling throughput and item processing speed.

3.6.1 Task Scheduling Throughput

A task may contain a handful of items or thousands of items, depending on how sparse the data source is. When the data is sparse the number of tasks is large, and the speed of circle product then depends on task scheduling speed. Task scheduling currently reaches 50,000 tasks per minute (5W/min), which is not the maximum; there is still room for improvement.

3.6.2 Commodity processing speed

Limited by downstream systems, item processing speed has to be throttled. Ignoring business processing time, item processing can theoretically reach 60,000 (6W) TPS.