In applications that page through results, queries with LIMIT and OFFSET are common, and they almost always include an ORDER BY clause. Sorting with an index helps performance considerably; otherwise the server must perform a lot of filesorts.

A frequent problem is a very large offset. A query such as LIMIT 10000, 20 generates 10,020 rows and discards the first 10,000, which is expensive. If all pages are accessed with equal frequency, such queries scan half the table on average. To optimize them, you can either cap the number of pages a paging view can reach, or make the large-offset queries themselves more efficient.
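For illustration, using the sakila.film table from the examples below, the expensive form of such a query looks like this (the server builds 10,020 rows and throws away the first 10,000):

SELECT film_id, description FROM sakila.film
ORDER BY title LIMIT 10000, 20;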

A simple trick to improve performance is to run the query against a covering index rather than the full rows. You can then join the result back to the full rows to retrieve the additional columns you need. This is more efficient than the following query:

SELECT film_id, description FROM sakila.film ORDER BY title LIMIT 50, 5;

If the table is large, it can be tuned as follows:

SELECT film.film_id, film.description
FROM sakila.film
    INNER JOIN (
        SELECT film_id FROM sakila.film
        ORDER BY title LIMIT 50, 5
    ) AS lim USING(film_id);

This “deferred join” works because it lets the server examine as little data as possible, scanning only the index to find the rows it needs. Once the required rows are found, they are joined back to the original table to retrieve the remaining columns of those rows.

LIMIT can sometimes be converted into a query at a known position, which the server can serve with an index range scan. For example, if you precompute a positional column called position, you can rewrite the query as follows:

SELECT film_id, description FROM sakila.film
WHERE position BETWEEN 50 AND 54 ORDER BY position;

Ranked data can be handled in a similar way, but it is usually also affected by GROUP BY operations. In most cases, the rank values need to be precomputed and stored ahead of time.
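One way to maintain the precomputed position column used above is a periodic renumbering pass. This is only a sketch: the position column is not part of the stock sakila schema, and the user-variable technique assumes MySQL:

-- Assumed: a position column added to sakila.film for this purpose
ALTER TABLE sakila.film ADD COLUMN position INT, ADD INDEX idx_position (position);

-- Renumber all rows in title order using a MySQL user variable
SET @pos := 0;
UPDATE sakila.film
SET position = (@pos := @pos + 1)
ORDER BY title;

The column must be refreshed whenever rows are inserted, deleted, or retitled, which is why this approach suits data that changes infrequently.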

The real problem with LIMIT and OFFSET is the OFFSET, which makes the server generate and discard a large number of rows. If an ordered bookmark records the position of the last row retrieved, subsequent pages can resume from that position. For example, if you need to page backward through rentals starting with the most recent, you can rely on the primary key being ever-increasing and fetch the first page of data like this:

SELECT * FROM sakila.rental
ORDER BY rental_id DESC LIMIT 20;

This query returns rentals with rental_id from 16,049 down to 16,030. The next page can start where the previous one ended:

SELECT * FROM sakila.rental
WHERE rental_id < 16030 
ORDER BY rental_id DESC LIMIT 20;

This technique performs well no matter how far into the result set you page, because it avoids the offset entirely.

Other techniques include using precomputed summary data, or joining against a redundant table that contains only the primary key and the columns needed for sorting. Both improve query efficiency by trading space for time.
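A redundant sort table might be sketched like this (the table name and maintenance approach are assumptions for illustration, not part of sakila):

-- Narrow table holding only the primary key and the sort column
CREATE TABLE sakila.film_sort_by_title (
    film_id SMALLINT UNSIGNED NOT NULL PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    INDEX idx_title (title)
);

-- Populate from the base table; must be refreshed when films change
INSERT INTO sakila.film_sort_by_title (film_id, title)
SELECT film_id, title FROM sakila.film;

Because the table is narrow, scanning and sorting it is cheap; paging queries read it first, then join back to sakila.film by primary key to fetch the remaining columns, much like the deferred join shown earlier.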