Use case when in MySQL

preface

The case keyword is used in two places in MySQL:

Flow Control Functions – CASE Operator
Flow Control Statements – CASE Statement

You cannot have an ELSE NULL clause in a CASE Statement and END it with END CASE, not END.

CASE statements are used in compound statements, such as stored procedures. CASE Operator is used as a function in a single statement.

This article focuses on the use of CASE Operator.

The basic syntax of case when

The first use:

CASE value
    WHEN compare_value THEN result
    [WHEN compare_value THEN result. ] [ELSE result]
END
Copy the code

Second usage:

CASE
    WHEN condition THEN result
    [WHEN condition THEN result. ] [ELSE result]
END
Copy the code

The difference between the two usages:

The first CASE syntax returns the result of the first branch where value=compare_value is true.

The second CASE syntax returns the result of the first branch where condition is true.

If no value=compare_value or condition is true, the ELSE is returned, or NULL is returned if there is no ELSE branch.

Case when

There must be no intersection between branches

This function is executed sequentially, with no overlap between conditions; It’s not that MySQL syntactically doesn’t allow intersections, but rather that once a successful match has been made the other branches don’t execute. If the logical relationship is not sorted out, the result of the query may not be the expected result.

The judgment of the NULL

The first use of CASE is to determine whether a field or expression is NULL.

Wrong way to write:

SELECT
    CASE(` ` field|'expression')WHEN NULL THEN 'Result false'
        ELSE 'Result is true'
    END
FROM `table_name`
Copy the code

The correct way to write this is:

SELECT
    CASE(` ` field|'expression')IS NULL
        WHEN TRUE THEN 'Result is true'
        ELSE 'Result false'
    END
FROM `table_name`
Copy the code

= = = = = = = = = = = = = = =

The default value problem

It is interesting to see this usage in this blog post on mysql Case When.

Statement 1:

UPDATE categories
SET
    display_order = CASE id 
                        WHEN 1 THEN 3 
                        WHEN 2 THEN 4 
                        WHEN 3 THEN 5 
                    END;
Copy the code

Statement 2:

UPDATE categories
SET
    display_order = CASE id 
                        WHEN 1 THEN 3 
                        WHEN 2 THEN 4 
                        WHEN 3 THEN 5 
                    END
WHERE
    id IN (1.2.3);
Copy the code

If you do not restrict ids with a WHERE statement, statement 1 sets the display_order field to NULL for all records whose ids are not 1, 2, or 3.

The value type returned by the branch can be inconsistent

SELECT
    CASE
        WHEN 5 % 3 = 0 THEN"Case 1"WHEN 5 % 3 = 1 THEN"Case 2"ELSE 12
    END AS result;
Copy the code

This SQL statement is executed without error on the DataGrip, nor is it followed by a table name query. The return type of the branch is inconsistent. Should I not report an error?

I then executed the SQL statement with the JdbcTemplate and found that there was no error, ELSE branch value was converted to a string.

Sure enough MySQL is not rigorous ah!

At this point, I went back and took a closer look at the MySQL documentation and found that it was quite detailed about this situation.

Here’s a quick translation:

CASE functions return values that can be aggregated types of all resultant value types:

If all values are of numeric type, then the aggregate type is also numeric:
- If at least one of the values is double, the result type is double.
- Otherwise, if at least one of the values isDECIMAL, then the type of result isDECIMAL.
- .
.
For all other types of combinations, the result isVARCHARType.
When a type is merged, it is ignoredNULLThe type to which the value belongs.

There are too many merge cases in the middle to list for space reasons. If you are interested, please go to operate_case.

Case when usage scenarios

Translate the field meaning based on the condition
Transfer line column

Field transformation

SELECT
    name 'name',
    age 'age'.CASE
        WHEN age < 18 THEN 'young'
        WHEN age < 30 THEN 'young'
        WHEN age > = 30 AND age < 50 THEN 'middle'
        ELSE 'older'
    END 'Age group'
FROM
    user_info;
Copy the code

A statement outputs multiple indicators

How many boys, how many girls, and how many of the boys pass, how many of the girls pass

The table structure is as follows: STU_SEX field 0 indicates male and 1 indicates female.

STU_CODE	STU_NAME	STU_SEX	STU_SCORE
XM	Xiao Ming	0	88
XL	Little lei	0	55
XF	Xiao feng	0	45
XH	The little red	1	66
XN	XiaoNi	1	77
XY	Small Iraq	1	99

SELECT 
    SUM (CASE WHEN STU_SEX = 0 THEN 1 ELSE 0 END) AS MALE_COUNT,
    SUM (CASE WHEN STU_SEX = 1 THEN 1 ELSE 0 END) AS FEMALE_COUNT,
    SUM (CASE WHEN STU_SCORE > = 60 AND STU_SEX = 0 THEN 1 ELSE 0 END) AS MALE_PASS,
    SUM (CASE WHEN STU_SCORE > = 60 AND STU_SEX = 1 THEN 1 ELSE 0 END) AS FEMALE_PASS
FROM 
    THTF_STUDENTS
Copy the code

The table structure in this example is not very reasonable: name, gender, and score are all in the same table; However, I rarely see “sum” used with “case”. “sum” is usually used with “group”.

Transfer line column

Show sales horizontally by month
Horizontal display of grades by subject

CASE WHEN alone cannot convert rows to columns, and clauses such as SUM and GROUP BY need to be used.

Statistical results of all subjects

SELECT
    st.stu_id 'student id',
    st.stu_name 'name'.sum(CASE co.course_name WHEN 'College Chinese' THEN sc.scores ELSE 0 END ) 'College Chinese'.sum(CASE co.course_name WHEN New Horizons English THEN sc.scores ELSE 0 END ) New Horizons English.sum(CASE co.course_name WHEN 'Discrete mathematics' THEN sc.scores ELSE 0 END ) 'Discrete mathematics'.sum(CASE co.course_name WHEN 'Probability theory' THEN sc.scores ELSE 0 END ) 'Probability theory'.sum(CASE co.course_name WHEN 'Linear algebra' THEN sc.scores ELSE 0 END ) 'Linear algebra'.sum(CASE co.course_name WHEN 'Advanced Mathematics' THEN sc.scores ELSE 0 END ) 'Advanced Mathematics'
FROM
    edu_student st
    LEFT JOIN edu_score sc ON st.stu_id = sc.stu_id
    LEFT JOIN edu_courses co ON co.course_no = sc.course_no
GROUP BY
    st.stu_id
ORDER BY
    NULL;
Copy the code

The sum of each department’s monthly performance

SELECT
    t1.dep,
    t2.depname,
    SUM(CASE mon WHEN '一月' THEN yj ELSE 0 END) ASJanuarySUM(CASE mon WHEN '一月' THEN yj ELSE 0 END) ASJanuarySUM(CASE mon WHEN '一月' THEN yj ELSE 0 END) ASJanuaryFROM
    table_1 t1
    LEFT JOIN table_2 t2 ON t1.dep = t2.dep
GROUP BY
    t1.dep;
Copy the code

SQL optimization

Examples of optimized statistical analysis

SQL > select * from sum case when;

SELECT 
    (
         SELECT SUM(total_fee) 
         FROM mall_order SS 
         WHERE SS.create_time = S.create_time AND SS.payment_method = 1
    ) AS 'zhifubaoTotalOrderAmount',
    (
        SELECT COUNT(*)
        FROM mall_order SS
        WHERE SS.create_time = S.create_time AND SS.payment_method = 1
    ) AS 'zhifubaoTotalOrderNum',
    (
        SELECT SUM(total_fee)
        FROM mall_order SS
        WHERE SS.create_time = S.create_time AND SS.payment_method = 2
    ) AS 'weixinTotalOrderAmount',
    (
        SELECT COUNT(*)
        FROM mall_order SS
        WHERE SS.create_time = S.create_time AND SS.payment_method = 2
    ) AS 'weixinTotalOrderNum'
 FROM mall_order S
 WHERE S.create_time > '2016-05-01' AND S.create_time < '2016-08-01' 
 GROUP BY
     S.create_time
 ORDER BY
     S.create_time ASC;

Copy the code

Execution: 50W pieces of data, about 10s; Full table scan, 4 DEPENDENT SUBQUERY, DEPENDENT on external query.

After using sum case when:

SELECT
    S.create_time,
    sum(case when S.payment_method =1 then 1 else 0 end) as 'zhifubaoOrderNum'.sum(case when S.payment_method =1 then total_fee else 0 end) as 'zhifubaoOrderAmount'.sum(case when S.payment_method =2 then 1 else 0 end) as 'weixinOrderNum'.sum(case when S.payment_method =2 then total_fee else 0 end) as 'weixinOrderAmount'
FROM
    mall_order S
WHERE
    S.create_time > '2015-05-01' and S.create_time < '2016-08-01' 
GROUP BY
    S.create_time
ORDER BY
    S.create_time asc;
Copy the code

Execution: Scan 50W pieces of data in the whole table for about 1s; Iterate through the entire table once to get the result.

Another example of optimization

SQL > alter TABLE SQL > alter table SQL

SELECT
    uid,
    sum(power) powerup
FROM t1 
WHERE
    date> ='2017-03-31' AND
    UNIX_TIMESTAMP(STR_TO_DATE(concat(date.' '.hour),'%Y-%m-%d %H'))> =1490965200 AND
    UNIX_TIMESTAMP(STR_TO_DATE(concat(date.' '.hour),'%Y-%m-%d %H'))<1492174801 AND
    aType in (1.6.9)
GROUP BY
    uid;
Copy the code

When the table is designed, the date and hour in the date and time are separated into two columns, and then merged into a new condition in the query. This results in a very inefficient SQL with full table scans, no indexes, temporary tables, and extra sorting.

Optimized SQL:

SELECT
    uid,
    sum(powerup+powerup1)
FROM
    (
        SELECT uid,
        CASE
            WHEN concat(date.' '.hour) > ='the 2017-03-24 13:00' THEN power ELSE '0' 
        END AS powerup,

        CASE
            WHEN concat(date.' '.hour) < 'the 2017-03-25 13:00' THEN power ELSE '0'
        END AS powerup1
        
        FROM t1
        WHERE date > = '2017-03-24' AND date AND aType in (1.6.9)
    ) a 
GROUP BY
    uid;
Copy the code

With case when optimization, the original index on date can be used.

conclusion

I don’t like using case WHEN in SQL statements in business code for two reasons:

Not very readable
Poor maintainability

But when you do statistical analysis, you use these functions and say, wow!

reference

Dev.mysql.com/doc/refman/…
Dev.mysql.com/doc/refman/…
www.cnblogs.com/echojson/p/…
Blog.csdn.net/qq_16142851…
Blog.csdn.net/u013514928/…
www.cnblogs.com/chenduzizho…
www.cnblogs.com/echojson/p/…
Blog.csdn.net/qq_16142851…
My.oschina.net/u/1187675/b…
Blog.csdn.net/weixin_3246…
Blog.csdn.net/rongtaoup/a…