SQL Grouping solves the problem of aggregate and subtotal in OLAP scenarios. The syntax is grouped into several classes, but the problem is the same:

ROLLUP and CUBE are GROUPING SETS that encapsulate the rules. GROUPING SETS are the original rules.

To make it easier to understand, let’s start with one problem and work our way up.

At the end of table

The above is the bottom table of the example. There are 8 pieces of data, including city 1 and City 2. There are regions 1 ~ 4 below, and each piece of data has the population of the data.

Now I want to calculate the total population, and the subtotal population of each city. Before learning the grouping syntax, we could only use two select statements union:

SELECT city, sum(people) FROM test GROUP BY city
union
SELECT 'together' as city, sum(people) FROM test
Copy the code

The GROUPING SETS syntax is used to solve this problem.

GROUPING SETS

GROUPING BY GROUPING SETS allows you to specify any GROUPING term. For example, if you want to calculate the GROUPING sum and the GROUPING sum at the same time, you need to sum BY GROUPING BY empty contents and sum BY GROUPING BY city. GROUPING SETS describes the GROUPING SETS

SELECT 
city, area,
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
Copy the code

GROUPING SETS((), (city, area)) represents the GROUPING of () and (city, area). The return result is:

As you can see, rows with a value of NULL are the totals we want, and their values are computed without any GROUP BY constraints.

Similarly, GROUPING SETS((), (city), (city, area), (area)) can be set in any GROUPING.

The data calculated by this rule is called a “super grouping record”. SQL provides the GROUPING function to solve the problem of confusing NULL values with real NULL values.

The GROUPING function

The GROUPING() function identifies NULL as 1 for super-grouping records:

SELECT 
GROUPING(city),
GROUPING(area),
sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
Copy the code

See the following figure for specific effects:

As can be seen, all fields calculated by super grouping will be identified as 1. We use the SQL CASE expression we learned before to convert it into total and subtotal words, and then we can get a data analysis table:

SELECT 
CASE WHEN GROUPING(city) = 1 THEN 'total' ELSE city END.CASE WHEN GROUPING(area) = 1 THEN 'subtotal' ELSE area END.sum(people)
FROM test
GROUP BY GROUPING SETS((), (city, area))
Copy the code

Then, when the front table is displayed, the first row of “total” and “subtotal” cells are merged into “total” to complete the total BI visual analysis function.

ROLLUP

A ROLLUP is a GROUPING set of a particular set of rules. This is equivalent:

SELECT sum(people) FROM test
GROUP BY ROLLUP(city)

- equivalent to the
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))
Copy the code

Here’s another set of equivalent descriptions:

SELECT sum(people) FROM test
GROUP BY ROLLUP(city, area)

- equivalent to the
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (city, area))
Copy the code

See the pattern? ROLLUP “rolls up” the GROUP BY contents in order. Using the GROUPING function is also true for ROLLUP.

CUBE

CUBE is different in that it expands on all possibilities (hence the name CUBE).

By analogy with the above example, let’s write two more equivalent expansions:

SELECT sum(people) FROM test
GROUP BY CUBE(city)

- equivalent to the
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city))
Copy the code

Because only one item can not be seen in the above example, the following two items can be seen by grouping:

SELECT sum(people) FROM test
GROUP BY CUBE(city, area)

- equivalent to the
SELECT sum(people) FROM test
GROUP BY GROUPING SETS((), (city), (area), (city, area))
Copy the code

The so-called CUBE is a description of multidimensional shape, which has 2^1 expansion in two dimensions, 2^2 expansion in three dimensions, four dimensions, five dimensions and so on. As you can imagine, complexity explodes if you use CUBE to describe many combinations.

conclusion

Learning GROUPING grammar will save you from this problem:

The total and subtotal of the product have been opened. Should we take an extra number or get them together?

The standard answer and principle of this question are contained in this article. PS: For databases that do not support GROUPING syntax, block them like polyfill, a demotion scheme. As for how to mask this, refer to the two SELECT + unions mentioned at the beginning of this article.

The discussion is at SQL Grouping · Issue #406 · Ascoders /weekly

If you’d like to participate in the discussion, pleaseClick here to, with a new theme every week, released on weekends or Mondays. Front end Intensive Reading – Helps you filter the right content.

Copyright Notice: Freely reproduced – Non-commercial – Non-derivative – Remain signed (Creative Commons 3.0 License)