SELECT column after GROUP BY

Standard SQL allows you to write only the following three items in the SELECT clause: aggregate keys specified BY the GROUP BY clause, aggregate functions (SUM, AVG, etc.), and constants. Let’s look at an example:

We have the student class table (TBL_student_class) with the following data:

DROP TABLE IF EXISTS tbl_student_class; CREATE TABLE tbl_student_class (id int(8) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Increment primary key ', Sno varchar(12) NOT NULL COMMENT 'sno varchar ', cno varchar(5) NOT NULL COMMENT' sno varchar ', cname varchar(20) NOT NULL COMMENT 'sno varchar ', PRIMARY KEY (id)) COMMENT=' 表 示 '; -- ------------------------------ Records of tbl_student_class-- ----------------------------INSERT INTO Tbl_student_values ('1', '20190607001', '0607', 'tBL_student_class '); INSERT INTO Tbl_student_class VALUES ('2', '20190607002', '0607', 'tBL_student_class '); INSERT INTO TBL_student_class VALUES ('3', '20190608003', '0608', 'tBL_student_class '); INSERT INTO TBL_student_class VALUES ('4', '20190608004', '0608', 'tBL_student_class '); INSERT INTO TBL_student_class VALUES ('5', '20190609005', '0609', 'tBL_student_class '); INSERT INTO TBL_student_class VALUES ('6', '20190609006', '0609', 'tBL_student_class ');Copy the code

We want to count each class (class number, class name) a number, as well as the maximum number of students, we should write this query SQL. I think everyone can

SELECT cno,cname,count(sno),MAX(sno) FROM tbl_student_classGROUP BY cno,cname;
Copy the code

But some people will think, cNO and Cname is originally one-to-one, cNO once determined, cname is also determined, that SQL can not be written like this?

SELECT cno,cname,count(sno),MAX(sno) FROM tbl_student_classGROUP BY cno;
Copy the code

Execution error:

[Err] 1055 – Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column ‘test.tbl_student_class.cname’ which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

Note: The second expression in the SELECT list (cname) is not in the GROUP BY clause, and it is not an aggregate function; This is incompatible with the SQL schema: ONLY_FULL_GROUP_BY.

** Why can’t columns in the original table (not in the GROUP BY clause) be referenced directly after GROUP BY? ** Don’t worry, let’s look down slowly.

SQL mode

The MySQL server can run in different SQL modes and can apply these modes in different ways for different clients, depending on the value of the SQL_mode system variable. The DBA can set the global SQL mode to match the site server operation requirements, and each application can set its session SQL mode to its own requirements.

Schema affects the SQL syntax MySQL supports and the data validation checks it performs, which makes it easier to use MySQL in different environments and with other database servers. For more details, please refer to Server SQL Modes.

Note The content varies with the MySQL version (including the default value). Ensure that the value is consistent with your own MySQL version.

SQL schemas are divided into two types: syntax support and data check

Syntax support class

  • ONLY_FULL_GROUP_BY

For GROUP BY aggregations, the SQL is illegal if the columns in the SELECT, HAVING, or ORDER BY clause do not appear in GROUP BY

  • ANSI_QUOTES

With ANSI_QUOTES enabled, strings cannot be quoted in double quotes because they are interpreted as identifiers, acting like ‘. Update t set f1=””… Syntax error Unknown column ‘ ‘in field list

  • PIPES_AS_CONCAT

The | | as a connection string operator rather than the or operator, it is the same and the Oracle database, and string concatenation function CONCAT also similar ()

  • NO_TABLE_OPTIONS

Using SHOW CREATE TABLE does not output mysql-specific syntax sections such as ENGINE, which needs to be considered when migrating across DB classes using mysqldump

  • NO_AUTO_CREATE_USER

Literals do not automatically create users. When granting privileges to MySQL users, we use GRANT… ON… TO dbuser creates users along the way. Setting this option is similar to the Oracle operation in that users must be created before authorization

Data checking class

  • NO_ZERO_DATE

The date ‘0000-00-00’ is considered illegal depending on whether the following strict mode is set

1. If the strict mode is set, NO_ZERO_DATE automatically meets the requirement. But for INSERT IGNORE or UPDATE IGNORE, ‘0000-00-00’ is still allowed and only warning is displayed.

2. If NO_ZERO_DATE is set in non-strict mode, the effect is the same as above, ‘0000-00-00’ allows but displays warning; If NO_ZERO_DATE is not set, no warning is considered a perfectly valid value.

3. The NO_ZERO_IN_DATE situation is similar to the above, except that the control date and day can be 0, that is, 2010-01-00 is valid.

  • NO_ENGINE_SUBSTITUTION

If an ENGINE is specified using ALTER TABLE or CREATE TABLE, the required storage ENGINE is disabled or not compiled. How to handle this problem? When NO_ENGINE_SUBSTITUTION is enabled, the error is thrown directly; When this value is not set, CREATE replaces the default storage engine, ATLER does not change, and raises a warning

  • STRICT_TRANS_TABLES

Set it to enable strict mode. Note that STRICT_TRANS_TABLES is not a combination of several policies. STRICT_TRANS_TABLES refers solely to how to handle low or invalid values for inserts or updates:

< int > < int > < int > < int > < int > < int > < int >

2, Out Of Range, become insert maximum boundary value;

3. When a new row to be inserted does not contain a value for a non-null column whose definition does not have an explicit DEFAULT clause, the column is missing a value.

The default mode

MySQL has its own default mode when we don’t modify the configuration file; The default mode varies with the version

SELECT VERSION(); -- Check sql_modeSELECT @@sql_mode;Copy the code

We can see that the default mode for 5.7.21 contains:

ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_EN GINE_SUBSTITUTIONCopy the code

The first constraint is **ONLY_FULL_GROUP_BY: when we do aggregate queries, the SELECT column cannot directly contain columns that are not in the GROUP BY clause. ** What if we removed the mode (from “strict mode” to “loose mode”)?

We found that the above error reporting SQL

-- In loose mode, you can run SELECT cNO,cname,count(sno),MAX(sno) FROM tbl_student_classGROUP BY cNO;Copy the code

It works, but is not recommended in general. The online environment is usually “strict mode”, not “loose mode”. Although in the case, no matter in “strict mode” or “loose mode”, the result is correct, that is because CNO and Cname correspond only. If CNO and Cname do not correspond only, then the value of Cname in “loose mode” will be random, which will cause problems that are difficult to investigate. Anyone interested can try it. So why the ONLY_FULL_GROUP_BY mode? Let’s move on

order

Order is a concept used to distinguish the order of a set or predicate. In predicate logic, predicates are classified according to the order of the input values.

A one-row predicate such as = or BETWEEEN is called a first-order predicate, while a one-row set such as EXISTS is called a second-order predicate (HAVING an input value is also a set, but is not a predicate). Similarly, third-order predicates = predicates whose input value is “set of sets”, fourth-order predicates = predicates whose input value is “set of sets of sets”, but SQL does not happen above third-order, so don’t worry too much.

Make it simple as shown below

When we talk about order, we have to talk about set theory; Set theory is the foundation of THE SQL language, and because of this feature, SQL is also known as the set-oriented language. Only by thinking in terms of collections can you understand the power of SQL. Through the figure above, I believe you can also see, here is not to do more in-depth explanation, interested can go to the relevant information.

Why can’t columns in the original table be referenced after aggregation

Many people are aware of the limitations of aggregated queries, but few understand correctly why they exist. The cname in the table tBL_student_class stores the class information for each student.

However, it should be noted that cname here is only the attribute of each student, not the attribute of the GROUP, while GROUP BY is an aggregation operation, and the object of operation is a GROUP composed of multiple students. Therefore, the attribute of the GROUP can only be statistical attributes such as average or sum, as shown in the following figure

It is ok to ask for the Cname of each student, but it is meaningless to ask for the Cname of a group of students. For groups, there are only “how many students in total” or “What is the maximum student number?” That’s the kind of question that makes sense.

Imposing attributes that apply to individuals over groups is simply a classification error; GROUP BY is used to divide each element into several subsets. After GROUP BY is used, the SQL operation object is changed from the “row” of order 0 to the “set of rows” of order 1. In this case, the row attribute cannot be used.

The world of SQL is really a hierarchical society, and using the properties of lower-order concepts on higher-order concepts leads to disorder, which is not allowed. At this point, I’m sure you all understand why you can no longer refer to columns in the original table after aggregation.

A set of one elements is also a set

The current set theory considers the set of one element as a normal set. Single-element sets, like empty sets, are defined primarily to preserve theoretical integrity. Therefore, for SQL based on set theory, it is also necessary to strictly distinguish between elements and single-element collections. Thus, there is a very striking hierarchical difference between element A and the set {a}.

Indicates a {a}Copy the code

The difference between these two levels corresponds to the difference between the WHERE clause and the HAVING clause in SQL. The WHERE clause works with zero-order objects like “rows” and the HAVING clause works with first-order objects like “collections”.

conclusion

1. SQL strictly distinguishes hierarchy, including hierarchy in predicate logic (EXISTS) and hierarchy in set theory (GROUP BY);

2. With hierarchy, attributes that apply to individuals do not apply to groups, which is why the SELECT clause of an aggregated query cannot directly reference columns from the original table.

3. In general, the attributes of a collection of elements are the same as those of its unique elements. This kind of set containing only one element makes it seem unnecessary to treat it as a set. However, in order to maintain the integrity of the theory, we should strictly distinguish between elements and sets of single elements.

Source: cnblogs.com/youzhibing/p/11516154.html