Public account: You and the cabin by: Peter Editor: Peter
Hello, I’m Peter
The group_concAT operation can be implemented in Pandas. The group_concat operation can be implemented in Pandas.
group_concat
What does group_concat do in SQL or MySQL? Look at an example.
Here is a simple copy of the data stored in the table Information, with two fields ID and name:
+----+-----+
| id | name |
+------+---+
|1 | 10 |
|1 | 20 |
|1 | 20 |
|2 | 20 |
|3 | 200 |
|3 | 500 |
Copy the code
Method 1: Default
We group by ID, place names on the same line and separate them with commas:
select
id
,group_concat(name) as name
from information
group by id;
Copy the code
The result is:
|id| name|
|1 |10.20.20|
|2 |20 |
|3 |200.500|
Copy the code
Method 2: Specify symbols
The above results are separated by commas by default, and we can specify our own symbols:
select
id
,group_concat(name separator '; ') as name
from information
group by id;
Copy the code
The result is:
|id| name|
|1 |10;20;20| -- Separated by semicolons
|2 |20 |
|3 |200;500|
Copy the code
Method 3: Delete the display
We can also group by ID, remove redundant (duplicate) data, and put the rest together; For example, if id=1 repeats 20, we want to display only one 20:
The keyword distinct was added:
select
id
,group_concat(distinct name) as name
from information
group by id;
Copy the code
The corresponding results are shown as follows:
|id| name|
|1 |10.20| -- Only a 20 is displayed
|2 |20 |
|3 |200.500|
Copy the code
Mode 4: Arrange them in descending order
In all of the above cases, the data is sorted in ascending order, and we can also order in descending order:
select
id
,group_concat(name order by name desc) as name
from information
group by id;
Copy the code
Then the result displayed is:
The results are already in descending order
|id| name|
|1 |20.20.10|
|2 |20 |
|3 |500.200|
Copy the code
Pandas is used to implement group_concat.
Simulated data
import pandas as pd
import numpy as np
Copy the code
df = pd.DataFrame({
"name": ["Xiao Ming"."Xiao Ming"."Xiao Ming"."Little red"."Zhang"."Zhang"]."score": [10.20.20.20.200.500]
})
df
Copy the code
It is clear that we need to group The score of Xiao Ming, Xiao Hong and Xiao Zhang together.
Mode 1: Default group
Achieve default grouping, ascending order and no weight. There are three main steps:
1. Groupby groupby
2. After grouping, put all scores in a list through list
3. The third step is just index rearrangement
Method 2: Specify symbols
To specify a specific symbol, we use the join function. Since this function can only operate on strings, we need to convert numeric data in df to strings:
df.astype(str)
Copy the code
Method 3: Delete the display
Group by name field, and then use unique function for score. I’m just going to rearrange it
Mode 4: Arrange them in descending order
1. Let’s start with the default ascending order
2. Use apply function again for score field, and use sorted function for list to achieve descending order
Dear friend, have you learned?