Honestly, writing articles is really hard
Set aside some time every day to write good articles
Stick with it for a few years and you will be one of the big shots
Persist and you will succeed
Recently I hit a bottleneck: I have not been able to find a good way to run the public account. (Is there anyone who can guide me? By the way, Eraser has a small group chat of just over 100 people, and it is short of admins to handle the slackers and the chatterboxes…)
Does anyone have any experience with this?
Leave me a comment and point Eraser in the right direction
Hey hey ha hey ~
In the last article, we played around with the DataFrame calculation functions
In this article,
I’m going to chat a little about grouping and sorting
These two categories of functions
are ones you have to learn and have to know
in pandas
Let’s talk about sorting first
In English: sort
All right, that’s it. The rest is up to you to figure out
Oh ha ha ~
Just kidding
As a programming evangelist, I have to explain things clearly
Look, this is sorting
A DataFrame has two sorting functions
One is called sort_values, and the other is also called sort_values... sorry, sort_index
Just look at the names
Those of you who have read six of my blog posts
will know instantly what they do
- sort_index sorts by index
- sort_values sorts by value
So let’s start with sort_index
Start with some basic data
Everything depends on the data you are working with
import pandas as pd
df = pd.DataFrame([[4, 8, 3], [5, 6, 1], [1, 9, 2]], columns=['boys', 'girls', 'aboys'], index=['class2', 'class1', 'class3'])
Print it out and take a look; this example holds up really well
        boys  girls  aboys
class2     4      8      3
class1     5      6      1
class3     1      9      2
Notice that the column labels are boys, girls, aboys and the row index labels are class2, class1, class3
Both are out of order
Next, use sort_index
df = pd.DataFrame([[4, 8, 3], [5, 6, 1], [1, 9, 2]], columns=['boys', 'girls', 'aboys'], index=['class2', 'class1', 'class3'])
print(df)
print(df.sort_index())
print(df.sort_index(axis=1))
print(df.sort_index(axis=0))
Ta-da ~
Look at the results
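The results in the original post were shown as a screenshot that is not available here. As a rough sketch using the same df, the checks below show what each call reorders: the default (axis=0) sorts the row labels, while axis=1 sorts the column labels. The names by_rows and by_cols are only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[4, 8, 3], [5, 6, 1], [1, 9, 2]],
    columns=['boys', 'girls', 'aboys'],
    index=['class2', 'class1', 'class3'],
)

# Default (axis=0): rows are reordered by their index labels
by_rows = df.sort_index()
print(list(by_rows.index))    # ['class1', 'class2', 'class3']

# axis=1: columns are reordered by their labels
by_cols = df.sort_index(axis=1)
print(list(by_cols.columns))  # ['aboys', 'boys', 'girls']
```

In both cases the values travel with their labels; only the order of rows or columns changes.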
Of course, reverse order is also easy
Pass ascending=False and that’s it
I’m not going to demonstrate that
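For anyone who wants to see it anyway, here is a minimal sketch of descending order on the same df. The names desc_rows and desc_cols are made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    [[4, 8, 3], [5, 6, 1], [1, 9, 2]],
    columns=['boys', 'girls', 'aboys'],
    index=['class2', 'class1', 'class3'],
)

# Row labels in descending order
desc_rows = df.sort_index(ascending=False)
print(list(desc_rows.index))    # ['class3', 'class2', 'class1']

# Column labels in descending order
desc_cols = df.sort_index(axis=1, ascending=False)
print(list(desc_cols.columns))  # ['girls', 'boys', 'aboys']
```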
sort_index is done; it’s time for sort_values to take the stage
This function has one more parameter, by
Let’s run a demonstration
df = pd.DataFrame([[4, 8, 3], [5, 6, 1], [1, 9, 2]], columns=['boys', 'girls', 'aboys'], index=['class2', 'class1', 'class3'])
print(df)
print(df.sort_values(by='boys'))
print(df.sort_values(by='girls'))
print(df.sort_values(by='class1',axis=1))
One more small example: by also accepts a list of columns
print(df.sort_values(by=['boys', 'girls']))
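With the sample data every value in boys is different, so the second key never gets a chance to matter. The sketch below uses a hypothetical variation of the data (called tied here, not from the original post) in which two classes tie on boys, so that girls decides their order.

```python
import pandas as pd

# Hypothetical data: class2 and class1 both have boys == 4
tied = pd.DataFrame(
    [[4, 8, 3], [4, 6, 1], [1, 9, 2]],
    columns=['boys', 'girls', 'aboys'],
    index=['class2', 'class1', 'class3'],
)

# Sort by 'boys' first; rows that tie on 'boys' are ordered by 'girls'
result = tied.sort_values(by=['boys', 'girls'])
print(result)
print(list(result.index))  # ['class3', 'class1', 'class2']
```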
The sorting is done. Let’s start grouping
Grouping. In English: group
In pandas, this is one of the more advanced functions
groupby
The official documentation gives it one sentence
Ha ha, the official website wrote it wrong
You have to pass in either by or level
That is, what to group by
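A small sketch of that requirement, using the sorting data from earlier: calling groupby with neither argument raises a TypeError, while level=0 groups by the row index. The variable names error and by_index are only for this illustration, and since every index label here is unique, each group holds a single row.

```python
import pandas as pd

df = pd.DataFrame(
    [[4, 8, 3], [5, 6, 1], [1, 9, 2]],
    columns=['boys', 'girls', 'aboys'],
    index=['class2', 'class1', 'class3'],
)

# Neither by nor level: pandas refuses to group
try:
    df.groupby()
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# level=0 groups by the row index instead of by a column
by_index = df.groupby(level=0).sum()
print(by_index)
```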
But here comes another problem
If you’re going to group, the grouping has to make sense
What does that mean?
Take the data we’ve been testing with
How would you group it?
There’s no need to group that data
So before grouping, look at the data carefully
Right, this data needs grouping before I can do anything with it
So then I start grouping
Come on then
Look at the data
import pandas as pd
mydict = {
    'class_name': ['class1', 'class1', 'class2', 'class2'],
    'student': [20, 30, 10, 20]
}
df = pd.DataFrame(mydict)
print(df)
This data makes grouping easier to learn
At various points in time, the number of students in each class was recorded as shown
Next, I want to know how many students are in class1 and how many are in class2
So the rows need to be grouped by class name, that is, class_name
You see, now the idea of grouping appears
print("*"*100)
print(df.groupby(by='class_name'))
Take a look at the result
Hey hey, it really is unreadable
****************************************************************************************************
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x000001CB1D07CE80>
It’s a DataFrameGroupBy object
After grouping, you still need to apply an aggregation or calculation function
OK la
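Before aggregating, it can help to peek inside the DataFrameGroupBy object. The sketch below iterates over it and pulls out one group with get_group; the variable names grouped and class1 are only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'class_name': ['class1', 'class1', 'class2', 'class2'],
    'student': [20, 30, 10, 20],
})

grouped = df.groupby(by='class_name')

# Iterating yields (group key, sub-DataFrame) pairs
for name, group in grouped:
    print(name)
    print(group)

# get_group returns the rows of a single group
class1 = grouped.get_group('class1')
print(list(class1['student']))  # [20, 30]
```

Each group is just an ordinary DataFrame holding the rows that share a class_name, which is why an aggregate function is needed to turn it into one value per group.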
For example, a grouped sum
print(df.groupby(by='class_name').sum())
------------------------------------------
            student
class_name
class1           50
class2           30
Or grouped averages
print(df.groupby(by='class_name').mean())
Or counting the values in each group
print(df.groupby(by='class_name').count())
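The three aggregates above can also be computed in one call. This is a sketch using agg with a list of function names, which is not part of the original post; the name summary is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'class_name': ['class1', 'class1', 'class2', 'class2'],
    'student': [20, 30, 10, 20],
})

# One column per aggregate, one row per class
summary = df.groupby(by='class_name')['student'].agg(['sum', 'mean', 'count'])
print(summary)
```

For this data, class1 sums to 50 with a mean of 25.0 over 2 rows, and class2 sums to 30 with a mean of 15.0 over 2 rows.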
And there is also size
df.groupby(by='class_name').size()
Okay, here is a small case
At first glance, count and size seem to give the same result
Pay attention to the details. It’s the details that separate the experts from everyone else
print("*"*100)
print(type(df.groupby(by='class_name').size()))
print("*"*100)
print(type(df.groupby(by='class_name').count()))
The two return different types
****************************************************************************************************
<class 'pandas.core.series.Series'>
****************************************************************************************************
<class 'pandas.core.frame.DataFrame'>
One is a Series, one is a DataFrame
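The difference in shape can be sketched as follows: size gives one number per group, while count gives one column for each non-key column. The to_frame call at the end is just one way to turn the Series into a DataFrame if that is needed; the column name 'size' is an arbitrary choice for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'class_name': ['class1', 'class1', 'class2', 'class2'],
    'student': [20, 30, 10, 20],
})

grouped = df.groupby(by='class_name')

sizes = grouped.size()    # Series: rows per group
counts = grouped.count()  # DataFrame: non-null values per column per group

print(sizes.tolist())          # [2, 2]
print(counts.columns.tolist()) # ['student']

# Convert the Series to a one-column DataFrame
sizes_df = sizes.to_frame(name='size')
print(sizes_df)
```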
Let’s dig deeper
import pandas as pd
mydict = {
    'class_name': ['class1', 'class1', 'class2', 'class2', 'class3', 'class4', 'class4'],
    'student': [20, 30, 10, 20, 5, None, 12]
}
df = pd.DataFrame(mydict)
print(df)
print("*"*100)
print(df.groupby(by='class_name').size())
print("*"*100)
print(df.groupby(by='class_name').count())
Compare the results
Take out your pen and underline the key point
count does not count None values, while size counts every row
Beautiful, that’s the deep essence
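To make the comparison concrete, here is a small sketch on the same data that looks at class4, the group containing the None. pandas stores that None as NaN in the numeric student column. The variable names sizes and counts are only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'class_name': ['class1', 'class1', 'class2', 'class2', 'class3', 'class4', 'class4'],
    'student': [20, 30, 10, 20, 5, None, 12],
})

grouped = df.groupby(by='class_name')

sizes = grouped.size()               # counts rows, missing or not
counts = grouped['student'].count()  # counts only non-missing values

print(sizes['class4'])             # 2
print(counts['class4'])            # 1
print(df['student'].isna().sum())  # 1 missing value in total
```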
OK, that’s the end of this blog post
I know you haven’t mastered it yet; once you do, you’re the smartest
Tomorrow we’ll look at other uses of groupby
So far, the easy part is done
From now on, the rest is hard
Ha ha ha ha
You’ve followed the public account, haven’t you
Get your friends to follow it too…