The query to be optimized looks something like this,

SELECT
	COL_B,
	MAX( CASE COL_A WHEN 'a' THEN 1 ELSE 0 END ) AS 'a',
	MAX( CASE COL_A WHEN 'b' THEN 1 ELSE 0 END ) AS 'b',
	MAX( CASE COL_A WHEN 'c' THEN 1 ELSE 0 END ) AS 'c' 
FROM
	Table 
GROUP BY
	COL_B 
ORDER BY 
    COL_C DESC
LIMIT 10
Copy the code

A normal index is created on COL_C. The execution time of this SQL is approximately 3s.

First optimization

I’m using paging here, so the first step is to unpack the paging SQL. Split into two SQL

  1. paging
SELECT
	COL_B 
FROM
	Table
GROUP BY
    COL_B
ORDER BY
	COL_C DESC
LIMIT 10
Copy the code

This SQL looks up the COL_B column value and then uses IN to include the COL_B column value IN the SQL where I really need to query the data. This SQL takes about 1.4s to execute.

  1. The query
SELECT
	COL_B,
	MAX( CASE COL_A WHEN 'a' THEN 1 ELSE 0 END ) AS 'a',
	MAX( CASE COL_A WHEN 'b' THEN 1 ELSE 0 END ) AS 'b',
	MAX( CASE COL_A WHEN 'c' THEN 1 ELSE 0 END ) AS 'c'
FROM
	Table 
WHERE
	COL_B IN (...) 
GROUP BY
	COL_B
ORDER BY 
    COL_C DESC
Copy the code

The query time is reduced to 0.5 seconds. After indexing COL_B, it is shortened to 0.1s.

After the first optimization, I can accept the speed of 2, but I still think the speed of 1 is too slow.

Second optimization

SQL > select * from ‘EXPLAIN’ where Extra has a Using filesort. Decided to use index overwrite, replacing the original COL_C index with COL_C,COL_B’s multi-column index. GROUP BY uses COL_B, so we can’t use COL_C. COL_C is to the left of COL_B.

So instead of GROUP BY, I can use DISTINCT

SELECT
    DISTINCT
	COL_B 
FROM
	Table
ORDER BY
	COL_C DESC
LIMIT 10
Copy the code

This change time is shortened from 1.4 seconds to 0.1 seconds.

An SQL query that used to take 3s was optimized to two 0.1s OF SQL.