The table used in this article looks like this before any ranking is applied:

    import pandas as pd

    # Load the sample table from the Excel file and print it unchanged
    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    print(df)

The result:

            Name    Age  Score
    0  Xiao Ming   23.0     78
    1  Xiao Gang    NaN     89
    2  Xiao Hong  876.0     65
    3     Li Hua   65.0     89
    4   Xiao Mei    NaN     43
    5  Zhang San   34.0     90
    6      Li Si    NaN     34
    7    Wang Wu   98.5     87

Numerical ranking corresponds to numerical sorting. A ranking is usually stored in an extra column that holds the rank of each value, and ranks start at 1.
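As a minimal sketch of storing the ranks in an extra column, the example below rebuilds the Name and Score columns inline instead of reading test.xlsx, so it runs without the file. The column name "Rank" is a hypothetical choice made for this example, not something taken from the original table.

```python
import pandas as pd

# Name and Score values copied from the table above, built inline
# so that no Excel file is needed.
df = pd.DataFrame({
    "Name": ["Xiao Ming", "Xiao Gang", "Xiao Hong", "Li Hua",
             "Xiao Mei", "Zhang San", "Li Si", "Wang Wu"],
    "Score": [78, 89, 65, 89, 43, 90, 34, 87],
})

# "Rank" is a hypothetical column name; rank() returns a new Series
# and leaves the Score column untouched.
df["Rank"] = df["Score"].rank()
print(df)
```

The lowest score gets rank 1, and the original row order is preserved, which is the main difference from sorting the table.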

The rank() method does the ranking. Two of its parameters are covered here. The first is ascending, which controls whether ranks are assigned in ascending order; the default is True, meaning the smallest value gets rank 1. The other is method, which controls how ranks are assigned when the values being ranked contain duplicates.
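A short sketch of the ascending parameter, assuming the same Score values as in the table (built inline here rather than read from the Excel file). With ascending=False the largest score gets rank 1.

```python
import pandas as pd

# Score values from the table, built inline for a self-contained example.
scores = pd.Series([78, 89, 65, 89, 43, 90, 34, 87], name="Score")

# Default: ascending=True, the smallest value gets rank 1.
ascending_rank = scores.rank()

# ascending=False: the largest value gets rank 1.
descending_rank = scores.rank(ascending=False)

print(pd.DataFrame({"Score": scores,
                    "ascending": ascending_rank,
                    "descending": descending_rank}))
```

This is the form you would typically want for exam scores, where the highest score should be ranked first.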

Since the effect of the method parameter cannot be seen without duplicate values, the data used here contains a duplicate (the score 89 appears twice), so that the effect of each parameter value can be observed.

1 method='average'

    import pandas as pd

    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    # Tied values share the average of the positions they occupy
    print(df["Score"].rank(method='average'))

The result:

    0    4.0
    1    6.5
    2    3.0
    3    6.5
    4    2.0
    5    8.0
    6    1.0
    7    5.0
    Name: Score, dtype: float64

Here is the explanation. Sorted in ascending order, the Score column is [34, 43, 65, 78, 87, 89, 89, 90].

The value 89 appears twice in this sorted list, in 6th and 7th place (counting from 1). method='average' takes the average of those two positions, (6 + 7) / 2 = 6.5, so both rows that contain 89 (the 2nd and 4th rows of the table, index 1 and 3) get a rank of 6.5.
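The arithmetic can be checked by hand. The sketch below, which builds the Score values inline instead of reading the Excel file, finds the 1-based positions of 89 in the sorted list and averages them, then compares the result with what rank() returns.

```python
import pandas as pd

# Score values from the table, built inline for a self-contained example.
scores = pd.Series([78, 89, 65, 89, 43, 90, 34, 87], name="Score")

# Sort ascending and collect the 1-based positions occupied by 89.
ordered = scores.sort_values().tolist()
positions = [i + 1 for i, value in enumerate(ordered) if value == 89]

# The 'average' rank is the mean of those positions.
average_rank = sum(positions) / len(positions)
print(ordered)
print(positions, average_rank)

# Compare with pandas: both rows holding 89 should carry this rank.
pandas_ranks = scores.rank(method="average")
print(pandas_ranks[scores == 89].tolist())
```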

2 method='first'

    import pandas as pd

    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    # Tied values are ranked in the order they appear in the column
    print(df["Score"].rank(method='first'))

The result:

    0    4.0
    1    6.0
    2    3.0
    3    7.0
    4    2.0
    5    8.0
    6    1.0
    7    5.0
    Name: Score, dtype: float64
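With method='first', tied values are not given the same rank: they are ranked in the order in which they appear in the column. The sketch below illustrates this on the Score values (built inline rather than read from the Excel file): the 89 at index 1 comes before the 89 at index 3, so it takes position 6 and the later one takes position 7.

```python
import pandas as pd

# Score values from the table, built inline for a self-contained example.
scores = pd.Series([78, 89, 65, 89, 43, 90, 34, 87], name="Score")

first_rank = scores.rank(method="first")

# Show only the tied rows: index 1 appears earlier than index 3.
tied = first_rank[scores == 89]
print(tied)
```

Because no two rows share a rank, method='first' always produces the ranks 1 to n exactly once each when the column has no missing values.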

3 method='min'

    import pandas as pd

    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    # Tied values all receive the lowest position they occupy
    print(df["Score"].rank(method='min'))

The result:

    0    4.0
    1    6.0
    2    3.0
    3    6.0
    4    2.0
    5    8.0
    6    1.0
    7    5.0
    Name: Score, dtype: float64

Here is the explanation. Sorted in ascending order, the Score column is [34, 43, 65, 78, 87, 89, 89, 90].

The value 89 appears twice in this sorted list, in 6th and 7th place (counting from 1). method='min' takes the lower of those two positions, which is 6, so both rows that contain 89 (the 2nd and 4th rows of the table, index 1 and 3) are ranked 6.

4 method='max'

    import pandas as pd

    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    # Tied values all receive the highest position they occupy
    print(df["Score"].rank(method='max'))

The result:

    0    4.0
    1    7.0
    2    3.0
    3    7.0
    4    2.0
    5    8.0
    6    1.0
    7    5.0
    Name: Score, dtype: float64

Here is the explanation. Sorted in ascending order, the Score column is [34, 43, 65, 78, 87, 89, 89, 90].

The value 89 appears twice in this sorted list, in 6th and 7th place (counting from 1). method='max' takes the higher of those two positions, which is 7, so both rows that contain 89 (the 2nd and 4th rows of the table, index 1 and 3) are ranked 7.
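To see the four values of method side by side, the following sketch ranks the same Score values (built inline instead of read from the Excel file) once per method and collects the results into one table. The variable name comparison is a hypothetical choice for this example.

```python
import pandas as pd

# Score values from the table, built inline for a self-contained example.
scores = pd.Series([78, 89, 65, 89, 43, 90, 34, 87], name="Score")

# One column of ranks per tie-handling method discussed above.
methods = ["average", "first", "min", "max"]
comparison = pd.DataFrame({m: scores.rank(method=m) for m in methods})
comparison.insert(0, "Score", scores)

print(comparison)
```

Only the two rows holding 89 (index 1 and 3) differ between the columns; every other row has the same rank under all four methods.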

5 Default value

    import pandas as pd

    df = pd.read_excel(r'C:\Users\admin\Desktop\test.xlsx')
    # No method argument: the default tie handling is used
    print(df["Score"].rank())

The result:

    0    4.0
    1    6.5
    2    3.0
    3    6.5
    4    2.0
    5    8.0
    6    1.0
    7    5.0
    Name: Score, dtype: float64

Comparing this with the earlier results shows that the default value of the method parameter is 'average'.
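The same comparison can be made in code rather than by eye. This is a small sketch on the Score values, built inline instead of read from the Excel file.

```python
import pandas as pd

# Score values from the table, built inline for a self-contained example.
scores = pd.Series([78, 89, 65, 89, 43, 90, 34, 87], name="Score")

# Series.equals checks that both results hold identical values.
same_as_average = scores.rank().equals(scores.rank(method="average"))
same_as_min = scores.rank().equals(scores.rank(method="min"))

print(same_as_average, same_as_min)
```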

Finally, the values of the method parameter described above only make a difference when the column being ranked contains duplicate values.
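A quick way to convince yourself of this is to rank a column without duplicates under each method. The scores below are hypothetical: they are the table's values with both 89s dropped, and the names unique_scores and all_same are chosen only for this sketch.

```python
import pandas as pd

# Hypothetical scores with no repeated value.
unique_scores = pd.Series([78, 65, 43, 90, 34, 87], name="Score")

# Rank the same data once per tie-handling method.
results = {
    m: unique_scores.rank(method=m).tolist()
    for m in ["average", "first", "min", "max"]
}
for m, ranks in results.items():
    print(m, ranks)

# With no ties, every method yields the same list of ranks.
all_same = len({tuple(r) for r in results.values()}) == 1
print(all_same)
```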