This is the fifth day of my participation in the Gwen Challenge.

Once you have created a DataFrame, you can select and process its data. First, create a DataFrame with three rows and three columns, with row indexes r1, r2, and r3 and column indexes c1, c2, and c3, as shown below.

import pandas as pd
data = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=['r1', 'r2', 'r3'], columns=['c1', 'c2', 'c3'])

You can also create a DataFrame from a two-dimensional array. np.arange(1, 10) starts at 1 and stops before 10 (the end point is excluded), producing the nine numbers 1 through 9, which reshape(3, 3) arranges into three rows and three columns. The code is as follows.

import numpy as np
data = pd.DataFrame(np.arange(1, 10).reshape(3, 3), index=['r1', 'r2', 'r3'], columns=['c1', 'c2', 'c3'])

The data obtained by the two methods is the same, and the output is as follows:
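As a quick check of the claim above, the following sketch builds the table both ways and compares the values. The variable names `labels`, `data_from_list`, `data_from_array`, and `same_values` are introduced here for illustration only and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Shared row and column labels (helper dict introduced for this example)
labels = dict(index=["r1", "r2", "r3"], columns=["c1", "c2", "c3"])

# Method 1: from a nested list
data_from_list = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], **labels)

# Method 2: from a 2-D NumPy array (arange excludes the end point 10)
data_from_array = pd.DataFrame(np.arange(1, 10).reshape(3, 3), **labels)

# Compare the underlying values element by element
same_values = bool((data_from_list.values == data_from_array.values).all())

print(data_from_list)
#     c1  c2  c3
# r1   1   2   3
# r2   4   5   6
# r3   7   8   9
print(same_values)  # True
```

The comparison uses the raw values rather than the integer dtype, because the default integer width of np.arange can differ between platforms.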

Next, using the data created above as an example, we cover selecting, filtering, inspecting, computing on, sorting, and deleting data.

(1) Select data by column

Start by simply selecting a single column of data, as shown below.

a=data['c1']

The printout of a is as follows.

You can see that the selected data does not contain column index information, because selecting a single column with data['c1'] returns a one-dimensional Series. The following code returns a two-dimensional table (a DataFrame) instead.

b=data[['c1']]

The printout for b is as follows.
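To make the difference between the two selections concrete, here is a minimal sketch that inspects the type and shape of each result. It rebuilds the same `data` table so that it runs on its own.

```python
import pandas as pd

data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r1", "r2", "r3"],
    columns=["c1", "c2", "c3"],
)

a = data["c1"]    # single label: one-dimensional Series
b = data[["c1"]]  # list with one label: two-dimensional DataFrame

print(type(a).__name__, a.shape)  # Series (3,)
print(type(b).__name__, b.shape)  # DataFrame (3, 1)
```

The Series keeps the column label only as its name (a.name is 'c1'), while the DataFrame keeps it as a real column header.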

To select multiple columns, pass a list inside the brackets []. For example, to select columns c1 and c3, write data[['c1', 'c3']]. Note that the brackets [] must contain a list; data['c1', 'c3'] will not work. Here's the code.

c=data[['c1','c3']]

The output of c is as follows:
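The sketch below illustrates why the inner list matters. Without it, pandas treats 'c1', 'c3' as a single tuple key and, for a table like this one with plain string column labels, raises a KeyError. The flag `tuple_lookup_failed` is a name introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r1", "r2", "r3"],
    columns=["c1", "c2", "c3"],
)

# Correct: a list of labels selects both columns
c = data[["c1", "c3"]]
print(c)

# Incorrect: this is a lookup of the single key ('c1', 'c3')
try:
    data["c1", "c3"]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

print(tuple_lookup_failed)  # True
```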

(2) Select data by row

You can select data by row number, as shown below.

a = data[1:3]

The printout of a is as follows.

pandas recommends using iloc to select data by row number. It is more explicit and avoids the ambiguity of data[1:3], which looks like column selection but actually slices rows. Here's the code.

b = data.iloc[1:3]
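As a small check, the following sketch confirms that the two row selections return the same rows. It also shows loc, which is not used in the text above, as a label-based counterpart; note that a label slice includes its end point.

```python
import pandas as pd

data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r1", "r2", "r3"],
    columns=["c1", "c2", "c3"],
)

a = data[1:3]               # positional slice: rows at positions 1 and 2
b = data.iloc[1:3]          # the same selection, written explicitly
by_label = data.loc["r2":"r3"]  # label slice: 'r3' is included

print(a.equals(b))         # True
print(b.equals(by_label))  # True
```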

To select a single row by position, use iloc; data[-1] would be interpreted as a column label rather than a row position. For example, to take the last row, the code looks like this.

c = data.iloc[-1]
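A short sketch of what this single-row selection returns. As with columns, a scalar position gives a Series, while a one-element list keeps the result two-dimensional. The names `last_row_frame` and `plain_brackets_failed` are introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r1", "r2", "r3"],
    columns=["c1", "c2", "c3"],
)

c = data.iloc[-1]                 # last row as a Series named 'r3'
last_row_frame = data.iloc[[-1]]  # last row as a 1 x 3 DataFrame

print(c.name, c.tolist())    # r3 [7, 8, 9]
print(last_row_frame.shape)  # (1, 3)

# Plain brackets look for a column labelled -1, which does not exist here
try:
    data[-1]
    plain_brackets_failed = False
except KeyError:
    plain_brackets_failed = True
```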

(3) Select data by block

If you want to select both rows and columns, for example, the first two rows of columns c1 and c3, the code looks like this.

a = data[['c1','c3']][0:2]

This simply combines the column selection and row selection shown earlier. The printout of a is as follows.
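For comparison, the same block can also be selected in a single step. The sketch below is one possible alternative rather than the method used above: iloc takes row and column positions, and loc takes row and column labels. The names `by_position` and `by_label` are introduced here for illustration.

```python
import pandas as pd

data = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["r1", "r2", "r3"],
    columns=["c1", "c2", "c3"],
)

# Chained selection from the text: columns first, then rows
a = data[["c1", "c3"]][0:2]

# Single-step alternatives
by_position = data.iloc[0:2, [0, 2]]                # rows 0-1, columns 0 and 2
by_label = data.loc[["r1", "r2"], ["c1", "c3"]]     # the same block by labels

print(a)
print(a.equals(by_position), a.equals(by_label))  # True True
```

The single-step forms express the whole selection in one indexing operation, which can be easier to read when both rows and columns are restricted.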