From: http://blog.csdn.net/wangying19911991/article/details/73928172
https://www.zhihu.com/question/58993137
How exactly is axis defined in pandas? Does it refer to the rows or the columns of a DataFrame? Consider the following code:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
...                   columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3
If we call df.mean(axis=1), we get the mean of each row:
>>> df.mean(axis=1)
0    1.0
1    2.0
2    3.0
dtype: float64
However, if we call df.drop(name, axis=1), we actually drop a column, not a row:
>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3
Can someone help me understand what is meant by an "axis" in pandas/numpy/scipy, all of which use the axis parameter?
The top-voted answer gets to the heart of the problem:
df.mean(axis=1) actually takes the mean across all the columns in each row, rather than the mean of each column. It may be easiest to remember that axis=0 means "down" and axis=1 means "across", as adverbs describing the direction in which the method acts.
In other words:
- Use axis=0 to apply a method down each column, or to the row labels (the index)
- Use axis=1 to apply a method across each row, or to the column labels
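The two rules above can be checked with a short script. This is a minimal sketch that reuses the DataFrame from the question; the variable names (col_means, row_means, without_row, without_col) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: the reduction runs down each column, giving one value per column.
col_means = df.mean(axis=0)

# axis=1: the reduction runs across each row, giving one value per row.
row_means = df.mean(axis=1)

# For drop, axis selects which set of labels is searched.
without_row = df.drop(0, axis=0)       # looks for 0 among the row labels (index)
without_col = df.drop("col4", axis=1)  # looks for "col4" among the column labels

print(col_means.tolist())   # [2.0, 2.0, 2.0, 2.0]
print(row_means.tolist())   # [1.0, 2.0, 3.0]
print(without_row.shape, without_col.shape)  # (2, 4) (3, 3)
```

In both cases the axis names the direction of travel: a reduction collapses that direction, while drop searches the labels that lie along it.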
The following figure represents what axis 0 and 1 represent in the DataFrame:
In addition, remember that pandas follows NumPy's usage of the axis keyword, as explained in the NumPy glossary:
Axes are defined for arrays with more than one dimension. A two-dimensional array has two corresponding axes: the first runs vertically downwards across rows (axis 0), and the second runs horizontally across columns (axis 1).
So in the first example, df.mean(axis=1) computes the mean along the horizontal direction, across the columns, while in the second example, df.drop(name, axis=1) searches along the horizontal direction, among the column labels, for name and drops the matching column.
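The same convention can be seen directly in NumPy. The sketch below uses an array with the same values as the DataFrame in the question; the names a, down and across are illustrative, not taken from the original post.

```python
import numpy as np

# Same values as the DataFrame above: 3 rows, 4 columns.
a = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]])

# Summing along axis 0 moves down the rows, so the rows are collapsed
# and one result per column remains.
down = a.sum(axis=0)

# Summing along axis 1 moves across the columns, so the columns are
# collapsed and one result per row remains.
across = a.sum(axis=1)

print(a.shape)       # (3, 4)
print(down)          # [6 6 6 6]
print(across)        # [ 4  8 12]
```

The axis passed to a reduction is the one that disappears from the shape: (3, 4) becomes (4,) for axis=0 and (3,) for axis=1.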
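If the numbers are hard to remember, pandas also accepts the strings "index" and "columns" for the axis parameter, and drop has a columns keyword. These spellings are not part of the original answer; the sketch below only illustrates that they give the same results as axis=1 on the example DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis="columns" is an alias for axis=1 (and axis="index" for axis=0).
by_number = df.mean(axis=1)
by_name = df.mean(axis="columns")

# drop can name the label set directly instead of passing an axis.
dropped_by_axis = df.drop("col4", axis="columns")
dropped_by_keyword = df.drop(columns="col4")

print(by_number.equals(by_name))                   # True
print(dropped_by_axis.equals(dropped_by_keyword))  # True
```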