The query to be optimized looks something like this,
SELECT
COL_B,
MAX( CASE COL_A WHEN 'a' THEN 1 ELSE 0 END ) AS 'a',
MAX( CASE COL_A WHEN 'b' THEN 1 ELSE 0 END ) AS 'b',
MAX( CASE COL_A WHEN 'c' THEN 1 ELSE 0 END ) AS 'c'
FROM
Table
GROUP BY
COL_B
ORDER BY
COL_C DESC
LIMIT 10
Copy the code
A normal index is created on COL_C. The execution time of this SQL is approximately 3s.
First optimization
I’m using paging here, so the first step is to unpack the paging SQL. Split into two SQL
- paging
SELECT
COL_B
FROM
Table
GROUP BY
COL_B
ORDER BY
COL_C DESC
LIMIT 10
Copy the code
This SQL looks up the COL_B column value and then uses IN to include the COL_B column value IN the SQL where I really need to query the data. This SQL takes about 1.4s to execute.
- The query
SELECT
COL_B,
MAX( CASE COL_A WHEN 'a' THEN 1 ELSE 0 END ) AS 'a',
MAX( CASE COL_A WHEN 'b' THEN 1 ELSE 0 END ) AS 'b',
MAX( CASE COL_A WHEN 'c' THEN 1 ELSE 0 END ) AS 'c'
FROM
Table
WHERE
COL_B IN (...)
GROUP BY
COL_B
ORDER BY
COL_C DESC
Copy the code
The query time is reduced to 0.5 seconds. After indexing COL_B, it is shortened to 0.1s.
After the first optimization, I can accept the speed of 2, but I still think the speed of 1 is too slow.
Second optimization
SQL > select * from ‘EXPLAIN’ where Extra has a Using filesort. Decided to use index overwrite, replacing the original COL_C index with COL_C,COL_B’s multi-column index. GROUP BY uses COL_B, so we can’t use COL_C. COL_C is to the left of COL_B.
So instead of GROUP BY, I can use DISTINCT
SELECT
DISTINCT
COL_B
FROM
Table
ORDER BY
COL_C DESC
LIMIT 10
Copy the code
This change time is shortened from 1.4 seconds to 0.1 seconds.
An SQL query that used to take 3s was optimized to two 0.1s OF SQL.