I’ve been so busy lately, so busy, I can’t write my blog anymore
Time flies
In a flash, a week has passed
Call it progress or regress, I coasted on the technical side for a whole week
Today’s lesson is advanced groupby
"Advanced" really just means a little more complex than the beginner material
It’s a bit convoluted, and it’s not easy to understand
We call it advanced
But in fact, for Pandas
It should count as the basic part
What we’re going to learn today is
Richer, custom grouping operations
The apply method
Value of the apply method
For some kinds of data, yes, only some
Neither agg nor transform fits well, and that is where apply comes in
Exactly which cases those are, we’ll have to go through in detail
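Here is a minimal sketch of how the three differ, using just the A and C columns of the data built below. The nlargest(2) example is my own illustration, not something from the original post: agg gives one value per group, transform gives one value per original row, and apply accepts a function that returns whatever shape it likes.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'],
    'C': [3, 1, 4, 1, 5, 9, 2, 6],
})
grouped_c = df.groupby('A')['C']

# agg: collapses each group to a single value
totals = grouped_c.agg('sum')

# transform: returns one value per original row, aligned to df's index
centered = grouped_c.transform(lambda s: s - s.mean())

# apply: the function can return any shape, here the two largest values per group
largest = grouped_c.apply(lambda s: s.nlargest(2))

print(totals)
print(centered)
print(largest)
```

The "two largest values per group" result is neither one value per group nor one value per row, which is the kind of case apply is for.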
So first of all, let’s do a few examples of apply and see what it does
To test, build the data first
import pandas as pd
df = pd.DataFrame({'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'], 'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C': [3, 1, 4, 1, 5, 9, 2, 6], 'D': [1, 2, 3, 4, 5, 6, 7, 8]})
The data set is ready. Let’s start grouping
grouped = df.groupby('A')
# Each iteration yields the group key and that group's rows
for name, group in grouped:
    print(name)
    print(group)
To become a master, this is when you need to start writing code
Don’t just watch
If you only watch, you’ll never really learn it
Trust Eraser on this
# Summarize each group with describe()
d = grouped.apply(lambda x: x.describe())
print(d)
As for lambda expressions, search Baidu for "Python lambda"; a lambda is just an anonymous function, nothing difficult
Here it applies the describe method to each group of the grouped data
Ta-da, print(d) shows the describe statistics for every group
What apply() does is pass the groupby data into the function one group at a time
Look, one group goes in, then the next group goes in
The results are stacked back together under each group key, so there’s a multilevel structure
Hard to understand, isn’t it
Yeah, it’s just not easy to understand
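One way to see where the multilevel structure comes from is to do the split, the apply, and the combine by hand. This is only a sketch of the idea, not how Pandas implements it internally. Selecting the C and D columns first is my own assumption, made so the text columns stay out of the way.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'],
    'C': [3, 1, 4, 1, 5, 9, 2, 6],
    'D': [1, 2, 3, 4, 5, 6, 7, 8],
})

# Split: one sub-DataFrame per value of 'A'
# Apply: call describe() on each piece
pieces = {name: group[['C', 'D']].describe()
          for name, group in df.groupby('A')}

# Combine: stack the pieces; the dict keys become the outer index level
manual = pd.concat(pieces)

# The same thing in one step with apply
via_apply = df.groupby('A')[['C', 'D']].apply(lambda g: g.describe())

print(manual)
print(via_apply)
```

Both results have two index levels: the group key on the outside, and the row labels of each describe() table on the inside. So manual.loc['bob'] gives back the table for the bob group alone.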
Let me draw you a picture
New requirement
This time, we want the first two rows of each group
The complete code
import pandas as pd
df = pd.DataFrame({'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'], 'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C': [3, 1, 4, 1, 5, 9, 2, 6], 'D': [1, 2, 3, 4, 5, 6, 7, 8]})
grouped = df.groupby('A')
for name, group in grouped:
    print(name)
    print(group)
# Keep only the first two rows of each group
d = grouped.apply(lambda x: x.head(2))
print(d)
Look at the data that comes out
Good, good, although usually I only use the simplest form
Let’s write it as a named function instead of a lambda, maybe that’s a little clearer
The code can be changed to look like this
def get_top(df):
    # Return the first two rows of one group
    return df.head(2)

d = grouped.apply(get_top)
Look at that. You look like a master
And then, you can also pass in a parameter
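def get_top(df, n):
    # Return the first n rows of one group
    return df.head(n)

# Extra keyword arguments given to apply() are forwarded to get_top
d = grouped.apply(get_top, n=3)
print(d)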
The apply method can also be applied to a Series, for example a single column picked out after grouping
Try it yourself
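If you want something to start from, here is a minimal sketch. The value_range function is a made-up helper for illustration. Picking one column after groupby means each group reaches the function as a Series rather than a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'],
    'C': [3, 1, 4, 1, 5, 9, 2, 6],
})

def value_range(values):
    # `values` is the Series of C values belonging to one group
    return values.max() - values.min()

# df.groupby('A')['C'] is a grouped Series, so apply hands over one Series per group
spread = df.groupby('A')['C'].apply(value_range)
print(spread)
```

Because value_range returns a single number per group, the result is a Series indexed by the group key.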
Finally, here is the scenario where I use apply most often
It’s also where apply works best
Of course, Pandas is powerful
So there must be a lot of alternatives
Filling missing values
import pandas as pd
df = pd.DataFrame({'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'], 'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C': [3, 1, 4, 1, 5, 9, None, 6], 'D': [1, 2, 3, None, 5, 6, 7, 8]})
grouped = df.groupby('A')
for name, group in grouped:
    print(name)
    print(group)

def fill_none(one_group):
    # Fill missing values with the group's column mean; numeric_only=True skips the text columns
    return one_group.fillna(one_group.mean(numeric_only=True))

d = grouped.apply(fill_none)
print(d)
Perfect. Look at the data
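As for the alternatives mentioned earlier, here is one possible sketch, not necessarily the one the author had in mind: use transform to compute each row's group mean, then fill from that. The names numeric_cols, group_means and filled are my own.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['bob', 'sos', 'bob', 'sos', 'bob', 'sos', 'bob', 'bob'],
    'C': [3, 1, 4, 1, 5, 9, None, 6],
    'D': [1, 2, 3, None, 5, 6, 7, 8],
})

numeric_cols = ['C', 'D']

# transform('mean') returns the group mean repeated on every original row
group_means = df.groupby('A')[numeric_cols].transform('mean')

# fillna with a DataFrame fills each hole from the matching row and column
filled = df.copy()
filled[numeric_cols] = df[numeric_cols].fillna(group_means)
print(filled)
```

This version keeps the original flat index instead of the multilevel one that apply can produce, which is sometimes more convenient when the result goes straight back into the same table.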
Ok, have you learned apply?
No? Then read it one more time
Read a book a hundred times, and what you didn’t get, you still won’t get, so go write the code