Problem version

MySQL 5.6

Problem of repetition

First, create a table with user information, id as the primary key, and other business fields.

CREATE TABLE `account_info` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key ID',
  `seq` bigint(20) unsigned DEFAULT NULL COMMENT 'number',
  `userId` varchar(64) NOT NULL COMMENT 'user ID',
  `amount` decimal(10.3) unsigned NOT NULL COMMENT 'the balance'.PRIMARY KEY (`id`),
  KEY `I_PIN` (`userId`),
  KEY `I_SEQ` (`seq`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8
Copy the code

Insert test data as follows,

  • Id, seqIt is the numeric type that maintains the increment difference
  • userIdIt is the character types that remain different
  • amountIt is numeric types that keep data consistent.



    Perform SQL query on page 1 dataSELECT * FROM account_info ORDER BY amount DESC LIMIT 0,5, the results are as follows:



    Perform SQL query on page 2 dataSELECT * FROM account_info ORDER BY amount DESC LIMIT 5,5, the results are as follows:



    So as you can see,id2, 3, 4, 5 are all present on page 1 and page 2.

Problem analysis

Sorting algorithm optimization

In MySQL 5.6, the optimizer uses the Priority queue when it encounters an order by x limit m,n statement.

The main purpose of the priority queue is to keep n records in order when index ordering is not possible and limit n is used. However, only a small amount of memory is needed to complete the sorting, which means that the priority queue only needs the final return limit N.

Heap sort is unstable

The reason for the problem is that the priority queue uses heap sort, which is an unstable sort, meaning that the same values can be sorted and read in a different order.

Sorting scene

Do not use ordering of order by conditions

SELECT * FROM account_info LIMIT 0.5
Copy the code



Do not useorder byThis parameter is used by defaultA primary keySo the query pages are also ordered and there is no duplication of data

Sort conditional pages using order by

Use order fields

There is no paging duplication of data by using unique indexes and sorting data fields without duplicating them

SELECT * FROM account_info ORDER BY userId LIMIT 0.5
Copy the code

Use non-ordered fields

When sorting with duplicate data fields, paging duplicate data occurs

SELECT * FROM account_info ORDER BY amount LIMIT 0.5
Copy the code

conclusion

The sorting criteria for summary paging are as follows:

Paging sort field Whether the sort field has unique data Whether the sort fields are in order Paging duplicate data
A primary key is is no
The only index is no no
Common field no is is
Common field no no is
Common field is is no
Common field is no no

Whether can get the conclusion is that the paging duplicate data related to sort field data and uniqueness, has nothing to do with the sort field is orderly, in other words, as long as the sort of field data to guarantee uniqueness, such as the primary key, unique index, don’t repeat the common field), the page will not duplicate data, otherwise it will likely repeat data in different pages.

The solution

Sort using or combining fields that are unique to the data

SELECT * FROM account_info order by id LIMIT 0.5
SELECT * FROM account_info order by amount,id LIMIT 0.5
Copy the code

Combined with the use of data-unique fields, the problem of paging data duplication can be solved by turning an otherwise non-unique sort condition into a combinatorial unique sort condition

reference

MySQL > limit+ ORDER by limit+order BY