I’ve been working on MySQL for a quarter, and I’m ruined by this SQL question

Today, my friend sent a SQL question in the group. I was confused and had no idea for a long time. I have been on the hook for the whole Q2 of MySQL, looking at various index optimizations, MVCC, locks, and B+ trees. “

Topic:

I have a table BThe fields include: user ID, year, month, please query the user ID of each month in 2020.Copy the code

Whether or not such a table would be used in any scenario is not discussed. Please use SQL to implement the above problem

When I saw it, the first thing that came to my mind was to use group by

select user_id from b where year=2020 group by user_id having count(month)=12;
Copy the code

I did not build a table to test, and I am not sure if it is correct

According to the usual thinking of writing business code, obtain the user ID of each month, and then take the intersection, find that there is no way to go.

So step by step, if you know what a SQL execution process is like each step, then there will be difficult SQL to write?

SQL query execution sequence in detail. SQL query execution sequence in detail

Create a test table with the following structure:

CREATE TABLE `b` (
  `user_id` int(11) NOT NULL,
  `month` int(10) DEFAULT NULL,
  `year` int(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Copy the code

Then build some test data:

insert into b ,2,2020,1,2020 values (1), (1), (1,3,2020),,4,2020 (1), (1,5,2020),,6,2020 (1), (1,7,2020),,8,2020 (1), (1,9,2020),,10,2020 (1), (1, 11202 0), (1,12,2020),,1,2020 (1);insert into b ,2,2020,1,2020 values (2), (2), (2),3,2020,,4,2020 (2), (2),5,2020,,6,2020 (2), (2,7,2020),,8,2020 (2), (2,9,2020),,10,2020 (2), (2, 11202 0), (2,12,2020);insert into b ,1,2020 values (6), (6,2,2020),,3,2020 (6), (6,4,2020),,5,2020 (6), (6,6,2020),,7,2020 (6), (6,8,2020),,9,2020 (6), (6,10,2020), (6, 11202 0), (6,12,2020);insert into b ,1,2020 values (10), (10,2,2020), (10,3,2020), (10,4,2020), (10,5,2020), (10,6,2020), (10,7,2020), (10,8,2020), (10,9,2020), (10, 10 , 2020), (10,11,2020), (10,12,2020);insert into b ,1,2020 values (25), (25,2,2020), (25,3,2020), (25,4,2020), (25,5,2020), (25,6,2020), (25,7,2020), (25,8,2020), (25,9,2020), (25, 10 , 2020), (25,11,2020), (25,12,2020);insert into b 66,2,2020,1,2020 values (66), (), (66,6,2020),,4,2020 (66), (66,5,2020),,6,2020 (66), (66,7,2020), (66,8,2020),,9,2020 (66), (66, 10 , 2020), (66,11,2020),,12,2020 (66);Insert into b values,1,2020 (7), (7,2,2020), (7,4,2020), (7,5,2020), (7,7,2020), (7,9,2020), (7,10,2020), (7,11,2020);Insert into b values,1,2020 (12), (12,4,2020), (12,5,2020), (12,7,2020), (12,9,2020), (12,10,2020), (12,11,2020);Insert into b values,1,2019 (12), (12,4,2019), (12,5,2019), (12,7,2019), (12,9,2019), (12,10,2019), (12,11,2019);Copy the code

The first step is to start with the simple one. I will check out the one whose year is 2020. However, since there may be duplicate data, I will go to duplicate data

select DISTINCT month,user_id from b where year=2020;
Copy the code

If you use DISTINCT, you must put it first. I get an error if I put user_id first

select user_id,DISTINCT month from b where year=2020;
Copy the code

SQL > execute SQL > execute SQL > execute SQL > execute SQL > execute SQL


Let’s assume we call this query result Table 1

When I saw this result, I thought, why don’t I just group by user_id and count the number? Since every month in 2020 is recorded, if a user has 12 records in Table 1, that’s the user I’m looking for

Select user_id from table 1 as c group by c. usser_id having count(month)=12;Copy the code

OK, so that’s pretty much it. Table 1 is a subquery

select user_id from (select DISTINCT month,user_id from b where year=2020) as c group by c.user_id having count(month)=12;
Copy the code

The execution result

The SQL THAT I originally thought of was clearly wrong and did not allow for repetition

Here’s a little tidbit

Select * from table group by ID. Select * from table group by ID. Select * from table group by ID

This may require an understanding of how group by is executed

Suppose we have a test table with the following records:

Remember to table1

Execute the following SQL to get Table2

SELECT name FROM test GROUP BY name;
Copy the code

SQL > select count (‘ test ‘, ‘group by’, ‘group by’, ‘group by’);

So what does MySQL do when it executes group by? You can imagine MySQL generating a virtual table like the following when executing group by

On the right is the virtual table generated after executing group by name, if we do not select name, but select *

A relational database is based on relationships, so you can’t have multiple values in a cell

  • If we look at the name column, there’s only one data per cell, so if we select name, there’s no problem. Why does the name column have only one value per cell? Because we’re using the name column to group by
  • What if there is more than one cell inside id and number? The answer is to use aggregate functions, which are used to input multiple data points and output one data point. For example, cout(ID),sum(number), and the input of each aggregate function is each multi-data cell
  • Select name,sum(number) from test group by name,sum(number) from test group by name,sum(number) from test group by name That is 2+3, return 5, the final result is as follows:

Group by performing principle reference: https://blog.csdn.net/qq403580298/article/details/90756352

It is the core competitiveness of a technician to seek invariance in the rapidly changing technology. Unity of knowledge and practice, theory and practice