1. Union execution process

First we create a table T1

create table t1(id int primary key, a int, b int, index(a));
delimiter ;;
create procedure idata()
begin
 declare i int;
 set i=1;
 while(i<=1000)do
 insert into t1 values(i, i, i);
 set i=i+1;
 end while;
end;;
delimiter ;
call idata();Copy the code

And then let’s execute this statement

explain select 1000 as f union (select id from t1 order by id desc limit 2)
Copy the code

(key=PRIMARY) (key=PRIMARY) (key=PRIMARY)

Using temporary in line 3 shows that a temporary table is used



Let’s look at the flow of this statement:

1. Create a temporary table with f as the primary key

2. Insert 1000 into temporary table

3. Steps in sub-query:

1. Insert 1000 into the temporary table. The insert failed because primary keys conflict

2. Insert the second line 900

4. Return the temporary table data as the result and delete the temporary table

The flow chart of this process is as follows:



If we change union to union all, we don’t need to use temporary tables, because union all is repeated and kept,

You can see that the “extra” column is not Using temporary

explain select 1000 as f union all (select id from t1 order by id desc limit 2)Copy the code



2. Group by Execution process

Let’s look at the following statement:

explain select id%10 as m, count(*) as c from t1 group by m;Copy the code



You can see the explain results

Using index(a); Using temporary; Using filesort

Group statistics by id%10 and sort by m

The execution process is as follows:

1. Create a temporary table where m is the primary key and c is the primary key

2. Calculate the result of id%10 as x

Insert (x,1) into (x,1) if there is no primary key (x,1) in the temporary table

4. After the traversal is complete, sort the traversal by the M field and return the result to the client

The flow chart is as follows



Let’s look at the result of this statement

explain select id%10 as m, count(*) as c from t1 group by mCopy the code



In fact, if we do not need to sort the query results, we can add an order by NULL

Let’s execute this statement

explain select id%10 as m, count(*) as c from t1 group by m order by nullCopy the code



You can see that there is no sorting here, because the scan starts from table T with id at 1, so the first row is 1

What happens if we execute the following statement?

If we change the size of the memory temporary table to smaller than the size of the data we want to query, then disk temporary table will be used. The default engine of disk temporary table is InnoDB

set tmp_table_size=1024;
select id%100 as m, count(*) as c from t1 group by m order by null limit 10
Copy the code

Group by optimization — direct sorting

Actually in the above about the temporary table from memory disk into a temporary table is a waste of time, that is to say, mysql, found in space is not enough, the execution of the disk into a temporary table, but if we tell mysql directly, I want to query data is very large, so the mysql optimizer would have thought that, since you told me that the data is very big, If sort_buffer is not large enough, I will use temporary tables on disk to help sort.

select SQL_BIG_RESULT id%100 as m, count(*) as c from t1 group by m;Copy the code



To summarize:

1. If we do not need to sort the statistical results, we can add order by NULL to save the sorting process.

2. Try to use memory temporary tables in the sorting process. You can avoid using disk temporary tables by increasing tmp_table_size appropriately.

3. If the amount of data is too large, use SQL_BIG_RESULT to tell the optimizer to use the sorting algorithm directly.