A few days ago, I saw a mysql interview question from an Internet company. I think it is very useful in practical application. Now I will share it with you.

The title

There are three students whose Linux, MySQL and Java scores are in the MySQL data table, and the fields (ID), (student name), (course) and (Score) in the table are as follows:

id student_name course score
1 Zhang SAN linux 85
2 Zhang SAN mysql 98
3 Zhang SAN python 90
4 Li si linux 72
5 Li si mysql 90
6 Li si python 100
7 Cathy linux 85
8 Cathy mysql 88
9 Cathy python 87
SQL > alter table from table to table
student_name linux mysql python
: — — — — — : – : : – : — — — — — :
Zhang SAN 85 98 90
Li si 72 90 100
Cathy 85 88 87

title

The first thing THAT came to my mind was: column to row, PivotTable in Excel. I thought I could do this.

However, after thinking for a while, we could not get the desired effect by aggregating group by, so what should we do? The previous situation of processing mysql data for PivotTable was implemented in Python code, or directly in Excel table.

With a separate SQL statement, it’s a real hassle. But it can be done.

So how do we do that

Mysql > Mysql > mysql

The following data captures:To get the effect of a PivotTable, use the case when then end statement

SELECT student_name,
CASE WHEN course = 'linux' THEN score END,
CASE WHEN course = 'mysql' THEN score END,
CASE WHEN course = 'python' THEN score END
from student_score;
Copy the code

Get the following table, friends, is it a step closer to success? Think about it, what is almost?That’s right. That’s the last step of aggregation

SELECT student_name,
SUM(CASE WHEN course = 'linux' THEN score END) linux,
SUM(CASE WHEN course = 'mysql' THEN score END) mysql,
SUM(CASE WHEN course = 'python' THEN score END) python
from student_score
GROUP BY student_name;
Copy the code

Python implementation method

Hierarchical indexes play an important role in grouping operations such as reshaping data and array pivottables

import pandas as pd
from Dbconnection.DataBase import *

sql = """ SELECT student_name,course,score from student_score; "" "

col, res = get_mysql_test().query(sql)
# Create double layer index
df = pd.Series(
    [i[2] for i in res],
    index=[[i[0] for i in res], [i[1] for i in res]]
)
df = df.unstack()
df.index.name = 'student_name'
print(df)

# DataFrame pivot_table method
df = pd.DataFrame(res, columns=['student_name'.'course'.'score'])
df = df.pivot_table(['score'], index=['student_name'], columns=['course'])
df = df['score'].reset_index()
df.set_index(['student_name'], inplace=True)
df.columns.name = None
print(df)
Copy the code

The results are shown below

The implementation method of Excel table

Click Insert, PivotTable.As shown below, drag the names of the rows and columns to the position of the response.

Hope to be helpful to everybody, the place that has a problem also asks everybody to criticize and point out, thank!!

It would be nice to get some attention