1. Background

There is a batch task on line that reads yesterday’s data in batches, goes through a series of processes, and inserts it into today’s table. The table structure is as follows:

CREATE TABLE `detail_yyyyMMdd` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `batch_no` varchar(64) NOT NULL COMMENT 'Lot Number',
  `order_id` varchar(64) NOT NULL COMMENT 'order ID',
  `user_id` varchar(64) NOT NULL COMMENT 'user ID',
  `status` varchar(4) NOT NULL COMMENT 'state',
  `product_id` varchar(32) NOT NULL COMMENT 'product ID'.PRIMARY KEY (`id`),
  KEY `idx_batchno_userid` (`batch_no`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='List';
Copy the code

Because of the large amount of data, the batch read yesterday, the use of paging query limit statement, query SQL is as follows:

SELECT id,batch_no,order_id,user_id,status,product_id FROM detail_yyyyMMdd WHERE batch_no='batch_type_yyyyMMdd LIMIT? ,? ;Copy the code

From a certain day, customer service frequently received customer complaints, feedback data is not updated.

2. Troubleshooting

Preliminary investigation, customer feedback data did not insert. After some service logs were added online, the result set of the second query was found to have duplicate data with the result set of the first query.

After communicating with THE DBA, we confirmed that a database change was made the day before the outbreak of customer complaints. In order to solve a slow query problem, the brother team added an index and changed the SQL as follows:

ALTER TABLE `detail_yyyyMMdd` ADD INDEX `idx_batchno_status_productid` (`batch_no`, `status`, `product_id`);
Copy the code

DBA analysis shows that the idx_batchNO_userID index is used in paging query before the SQL change goes online, and idx_batchNO_status_producTID is used after the SQL change goes online. However, a large number of repeated records exist in this index. As a result, the data in each paging query may be repeated with the previous one.

Why is it possible that when an index has duplicate records, the data in a paging query is the same as before? In the search on the net, post an official document: dev.mysql.com/doc/refman/…

If multiple rows have identical values in the ORDER BY columns, the server is free to return those rows in any order, and may do so differently depending on the overall execution plan. In other words, the sort order of those rows is nondeterministic with respect to the nonordered columns.

One factor that affects the execution plan is LIMIT, so an ORDER BY query with and without LIMIT may return rows in different orders.

If it is important to ensure the same row order with and without LIMIT, include additional columns in the ORDER BY clause to make the order deterministic.

In MySQL 5.6, the optimizer optimizes the order by limit statement by using a Priority queue. The Priority Queue allows you to maintain a heap of size N according to limit N, and only keep N records during sorting. However, heap sort is an unstable sorting algorithm, so when the sorted field values have duplicate, the returned data order may be different, one of the influencing factors is limit.

Solution: The query statement plus order by ID (to ensure that the sorting field is unique), live, the problem is resolved.

But two doubts remain:

  • The original index idx_batCHNO_userID also has duplicate records. Why has no problem been exposed?
  • Online queries can be read and paginated according to index order (index cluster table) without triggering priority queue optimization.

3. Locate the fault

In the first paging query, idx_batchNO_userID was removed, and in the second paging query, idx_batchno_status_productid was removed. In the second paging query, idx_batchno_status_productid was removed. The order of data for the two indexes is different, resulting in duplicate data for the two paging queries. Talking to dbAs, MySQL’s optimizer selects the best index based on cost, and the cost of the two indexes is not much different.

This conclusion can be reproduced offline, but not stably. Under the condition that the total result set does not change, the root cause of the two paging queries taking different indexes respectively needs to be further explored.

End

SQL > select * from ‘order by’ where ‘limit’ = ‘limit’;