Which performs better: IN or EXISTS?
SQL script:
/* Create the database */
create database testdb6;
use testdb6;

/* Users table */
drop table if exists users;
create table users(
    id int primary key auto_increment,
    name varchar(20)
);
insert into users(name) values ('A');
insert into users(name) values ('B');
insert into users(name) values ('C');
insert into users(name) values ('D');
insert into users(name) values ('E');
insert into users(name) values ('F');
insert into users(name) values ('G');
insert into users(name) values ('H');
insert into users(name) values ('I');
insert into users(name) values ('J');

/* Orders table */
drop table if exists orders;
create table orders(
    id int primary key auto_increment, /* order id */
    order_no varchar(20) not null,     /* order number */
    title varchar(20) not null,        /* order title */
    goods_num int not null,            /* order quantity */
    money decimal(7,4) not null,       /* order amount */
    user_id int not null               /* user id */
) engine=myisam default charset=utf8;

delimiter $$
drop procedure if exists batch_orders $$

/* Stored procedure that inserts `max` orders */
create procedure batch_orders(in max int)
begin
    declare start int default 0;
    declare i int default 0;
    set autocommit = 0;
    while i < max do
        set i = i + 1;
        insert into orders(order_no, title, goods_num, money, user_id)
        values (concat('NCS-', floor(1 + rand()*1000000000000)), concat('order title-', i), 50, i % (100.0000 + (i % 50)), i % 10);
    end while;
    commit;
end $$
delimiter ;

/* Insert 10 million orders */
call batch_orders(10000000);

/* Inserting the data takes 3 to 10 minutes, depending on the performance of the machine */
The orders table has a user_id column and the users table has an id column, so we use orders.user_id and users.id to compare in and exists.
Results
1. The table after where (the subquery table) is the small table
(1) select count(1) from orders o where o.user_id in (select u.id from users u);
(2) select count(1) from orders o where exists (select 1 from users u where u.id = o.user_id);
2. The table after where (the subquery table) is the large table
(1) select count(1) from users u where u.id in (select o.user_id from orders o);
(2) select count(1) from users u where exists (select 1 from orders o where o.user_id = u.id);
Analysis
We use the following two statements to analyze:
select count(1) from orders o where o.user_id in(select u.id from users u);
select count(1) from orders o where exists (select 1 from users u where u.id = o.user_id);
With in, MySQL runs the subquery first: it queries the users table and then, for each id returned, looks up the matching rows in the orders table by user_id.
That is, the query on the users table is the outer loop, and the main query on the orders table is the inner loop.
Summary: in executes the subquery first, that is, the statement inside in(). Once the subquery has returned its rows, the main query is split into n ordinary queries (n is the number of rows the subquery returns).
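The in behaviour summarized above can be sketched as a small simulation. This is only a simplified model of the description in this article, not MySQL's actual execution plan (a real optimizer may pick a different strategy). The function name `count_with_in`, the in-memory "index", and the sample data (1,000 orders instead of 10 million) are assumptions made for illustration.

```python
from collections import defaultdict


def count_with_in(order_user_ids, user_ids):
    """Model: run the subquery first, then do one lookup per returned id."""
    # Hypothetical stand-in for an index on orders.user_id:
    # maps each user_id to the number of orders that carry it.
    index = defaultdict(int)
    for user_id in order_user_ids:
        index[user_id] += 1

    lookups = 0   # number of "ordinary queries" the main query is split into
    matched = 0   # result of count(1)
    for uid in set(user_ids):          # outer loop: rows of the subquery
        lookups += 1
        matched += index.get(uid, 0)   # inner step: lookup in orders
    return matched, lookups


# Same shape as the test data: user_id = i % 10, users.id = 1..10
order_user_ids = [i % 10 for i in range(1, 1001)]
user_ids = list(range(1, 11))

in_matched, in_lookups = count_with_in(order_user_ids, user_ids)
print(in_matched, in_lookups)
```

In this model the number of lookups depends only on how many rows the subquery returns, which is why a small subquery table suits in.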
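With exists, the order is reversed: MySQL reads the orders table first and, for each row, runs the subquery against the users table to check whether a row with u.id = o.user_id exists. That is, the main query on the orders table is the outer loop, and the subquery on the users table is the inner loop.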
The outer and inner loops together are what we call a nested loop, and a nested loop should follow the “small on the outside, big on the inside” principle. The difference is similar to that between copying a lot of small files and copying a few big files.
Summary: an exists query executes the subquery once for each row the main query reads. If the subquery finds data, it returns true; if not, it returns false. Rows for which it returns true are kept, and the others are discarded.
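The exists behaviour summarized above can be sketched the same way. Again, this is a simplified model of the article's description rather than what MySQL is guaranteed to do; `count_with_exists`, the set used as a stand-in for the primary key on users.id, and the reduced data size are illustrative assumptions.

```python
def count_with_exists(order_user_ids, user_ids):
    """Model: for every row of the main query, probe the subquery once."""
    # Hypothetical stand-in for the primary key index on users.id.
    users_pk = set(user_ids)

    probes = 0    # how many times the subquery is executed
    matched = 0   # result of count(1)
    for user_id in order_user_ids:   # outer loop: rows of the main query
        probes += 1
        if user_id in users_pk:      # subquery returned true: keep the row
            matched += 1
    return matched, probes


# Same shape as the test data: user_id = i % 10, users.id = 1..10
order_user_ids = [i % 10 for i in range(1, 1001)]
user_ids = list(range(1, 11))

exists_matched, exists_probes = count_with_exists(order_user_ids, user_ids)
print(exists_matched, exists_probes)
```

Here the subquery is probed once per outer row, so the work grows with the size of the outer table. This is why the article recommends exists when the outer table is the small one.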
Conclusion
Small tables drive large tables.
in suits the case where the outer table is large and the inner (subquery) table is small; exists suits the case where the outer table is small and the inner (subquery) table is large.