How do I use pandas?
Practicing with a real data set is probably the fastest and most effective way to learn.
This walkthrough covers the basics of processing and manipulating data in pandas.
1. Get stock data
The data is obtained through the data API; see the BigQuant Data API for details.
stock = d.new holdings()[:20]
df = d.new holdings(stock, '2017-08-01', '2017-08-05', ['company_name', 'company_type', 'volume', 'fs_net_profit', 'fs_roe'])  # fs_roe is the return on equity
df.head()  # only look at the first 5 rows
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 |
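If you do not have access to the data API, you can still follow along with a small hand-built frame. The sketch below is only a stand-in: it copies the five rows printed above into a DataFrame named `sample_df` (a name chosen here for illustration), so it has far fewer rows than the real 80-row result.

```python
import pandas as pd

# Stand-in for the API result: the five rows printed above for 2017-08-01.
# The negative values for 000004.SZA follow the df.iloc[22] output shown later.
sample_df = pd.DataFrame({
    "instrument": ["000001.SZA", "000002.SZA", "000004.SZA", "000005.SZA", "000006.SZA"],
    "date": pd.to_datetime(["2017-08-01"] * 5),
    "volume": [203570991, 20952262, 653388, 7343560, 19458890],
    "company_name": [
        "Ping An Bank Co., Ltd.",
        "China Vanke Co., Ltd.",
        "Shenzhen Cau Technology Co., Ltd.",
        "Shenzhen Fountain Corporation",
        "Shenzhen Zhenye (Group) Co., Ltd.",
    ],
    "company_type": [
        "Public enterprise",
        "Public enterprise",
        "Private enterprise",
        "Private enterprise",
        "Local state-owned enterprise",
    ],
    "fs_net_profit": [6.214000e+09, 6.954116e+08, -1.177969e+06, 1.014027e+07, 1.038588e+08],
    "fs_roe": [3.0319, 0.6116, -0.9797, 0.7938, 2.0537],
})

print(sample_df.head())
```

Most of the commands in the following sections work on this stand-in as well, although row labels above 4 will not exist.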
2. Preliminary study of data
df.head(3)  # view the first 3 rows
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 |
df.tail(3)  # view the last 3 rows
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
77 | 000021.SZA | 2017-08-04 | 11471227 | Shenzhen Kaifa Technology Co., Ltd. | Central state-owned enterprise | 116724688.0 | 2.1430 |
78 | 000022.SZA | 2017-08-04 | 13885068 | Shenzhen Chiwan Wharf Holdings Limited | Central state-owned enterprise | 138844496.0 | 2.9044 |
79 | 000023.SZA | 2017-08-04 | 0 | Shenzhen Universe (Group) Co., Ltd | Public enterprise | 10531537.0 | 2.6827 |
df.sample(5)  # view 5 randomly selected rows
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
67 | 000009.SZA | 2017-08-04 | 84400932 | China Baoan Group Co., Ltd. | Public enterprise | 3.347200e+07 | 0.7437 |
51 | 000014.SZA | 2017-08-03 | 1209840 | Shahe Industrial Co., Ltd. | Local state-owned enterprise | 3.359943e+06 | 0.4574 |
38 | 000022.SZA | 2017-08-02 | 10208777 | Shenzhen Chiwan Wharf Holdings Limited | Central state-owned enterprise | 1.388445e+08 | 2.9044 |
19 | 000023.SZA | 2017-08-01 | 0 | Shenzhen Universe (Group) Co., Ltd | Public enterprise | 1.053154e+07 | 2.6827 |
74 | 000018.SZA | 2017-08-04 | 9099017 | Sino Great Wall Co., Ltd. | Private enterprise | 9.902958e+07 | 5.4350 |
df.shape  # check the number of rows and columns
(80, 7)
df.columns  # check the column names
Index(['instrument', 'date', 'volume', 'company_name', 'company_type',
       'fs_net_profit', 'fs_roe'],
      dtype='object')
df.describe()  # view summary statistics of the numeric columns
| | volume | fs_net_profit | fs_roe |
---|---|---|---|
count | 8.000000e+01 | 8.000000e+01 | 80.000000 |
mean | 2.153483e+07 | 3.934997e+08 | 1.595305 |
std | 3.982048e+07 | 1.353417e+09 | 2.877810 |
min | 0.000000e+00 | -2.550807e+07 | -1.158000 |
25% | 3.457600e+06 | 1.352505e+05 | 0.058325 |
50% | 7.385609e+06 | 1.838845e+07 | 0.821900 |
75% | 1.738763e+07 | 1.222546e+08 | 2.286550 |
max | 2.062069e+08 | 6.214000e+09 | 11.774600 |
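By default `describe()` only summarizes numeric columns, which is why the text columns are missing from the table above. As a small sketch on a toy frame (the values are illustrative, not the full data set), passing `include="all"` also reports `count`, `unique`, `top` and `freq` for text columns:

```python
import pandas as pd

# Toy frame with one text column and one numeric column (illustrative values).
frame = pd.DataFrame({
    "company_type": ["Private enterprise", "Public enterprise", "Private enterprise"],
    "fs_roe": [-0.9797, 3.0319, 0.7938],
})

numeric_summary = frame.describe()            # numeric columns only
full_summary = frame.describe(include="all")  # text columns are summarized too

print(numeric_summary)
print(full_summary)
```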
df.info()  # check the column data types and non-null counts
<class 'pandas.core.frame.DataFrame'>
Int64Index: 80 entries, 0 to 79
Data columns (total 7 columns):
instrument       80 non-null object
date             80 non-null datetime64[ns]
volume           80 non-null int64
company_name     80 non-null object
company_type     80 non-null object
fs_net_profit    80 non-null float64
fs_roe           80 non-null float64
dtypes: datetime64[ns](1), float64(2), int64(1), object(3)
memory usage: 5.0+ KB
3. Row/column selection
df.iloc[22]  # select a row by integer position with iloc
instrument                              000004.SZA
date                           2017-08-02 00:00:00
volume                                      792395
company_name     Shenzhen Cau Technology Co., Ltd.
company_type                    Private enterprise
fs_net_profit                         -1.17797e+06
fs_roe                                     -0.9797
Name: 22, dtype: object
df.loc[22:25]  # select rows by index label with loc (the end label is included)
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
22 | 000004.SZA | 2017-08-02 | 792395 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 |
23 | 000005.SZA | 2017-08-02 | 6313300 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 |
24 | 000006.SZA | 2017-08-02 | 15834085 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 |
25 | 000007.SZA | 2017-08-02 | 5313578 | Shenzhen Quanxinhao Co., Ltd | Private enterprise | 2.768926e+05 | 0.0747 |
df.loc[[22, 33, 44]]  # select several rows by passing a list of labels
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|---|---|---|
22 | 000004.SZA | 2017-08-02 | 792395 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 |
33 | 000017.SZA | 2017-08-02 | 3603500 | China Bicycle Company (Holdings) Limited | Private enterprise | 2.123222e+05 | 1.4668 |
44 | 000006.SZA | 2017-08-03 | 14414290 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 |
df['company_name']  # select a single column (returns a Series)
0     Ping An Bank Co., Ltd.
1     China Vanke Co., Ltd.
2     Shenzhen Cau Technology Co., Ltd.
3     Shenzhen Fountain Corporation
4     Shenzhen Zhenye (Group) Co., Ltd.
5     Shenzhen Quanxinhao Co., Ltd
6     China High-speed Railway Technology Co., Ltd.
7     China Baoan Group Co., Ltd.
8     Shenzhen Ecobeauty Co., Ltd.
9     Shenzhen Properties & Resources Development (Group) Ltd.
10    CSG Holding Co., Ltd.
11    Shahe Industrial Co., Ltd.
12    Konka Group Co., Ltd
13    China Bicycle Company (Holdings) Limited
14    Sino Great Wall Co., Ltd.
15    Shenzhen Shenbao Industrial Co., Ltd
16    Shenzhen Zhongheng Huafa Co., Ltd.
17    Shenzhen Kaifa Technology Co., Ltd.
18    Shenzhen Chiwan Wharf Holdings Limited
19    Shenzhen Universe (Group) Co., Ltd
20    Ping An Bank Co., Ltd.
21    China Vanke Co., Ltd.
22    Shenzhen Cau Technology Co., Ltd.
23    Shenzhen Fountain Corporation
24    Shenzhen Zhenye (Group) Co., Ltd.
25    Shenzhen Quanxinhao Co., Ltd
26    China High-speed Railway Technology Co., Ltd.
27    China Baoan Group Co., Ltd.
28    Shenzhen Ecobeauty Co., Ltd.
29    Shenzhen Properties & Resources Development (Group) Ltd.
...
50    CSG Holding Co., Ltd.
51    Shahe Industrial Co., Ltd.
52    Konka Group Co., Ltd
53    China Bicycle Company (Holdings) Limited
54    Sino Great Wall Co., Ltd.
55    Shenzhen Shenbao Industrial Co., Ltd
56    Shenzhen Zhongheng Huafa Co., Ltd.
57    Shenzhen Kaifa Technology Co., Ltd.
58    Shenzhen Chiwan Wharf Holdings Limited
59    Shenzhen Universe (Group) Co., Ltd
60    Ping An Bank Co., Ltd.
61    China Vanke Co., Ltd.
62    Shenzhen Cau Technology Co., Ltd.
63    Shenzhen Fountain Corporation
64    Shenzhen Zhenye (Group) Co., Ltd.
65    Shenzhen Quanxinhao Co., Ltd
66    China High-speed Railway Technology Co., Ltd.
67    China Baoan Group Co., Ltd.
68    Shenzhen Ecobeauty Co., Ltd.
69    Shenzhen Properties & Resources Development (Group) Ltd.
70    CSG Holding Co., Ltd.
71    Shahe Industrial Co., Ltd.
72    Konka Group Co., Ltd
73    China Bicycle Company (Holdings) Limited
74    Sino Great Wall Co., Ltd.
75    Shenzhen Shenbao Industrial Co., Ltd
76    Shenzhen Zhongheng Huafa Co., Ltd.
77    Shenzhen Kaifa Technology Co., Ltd.
78    Shenzhen Chiwan Wharf Holdings Limited
79    Shenzhen Universe (Group) Co., Ltd
Name: company_name, dtype: object
df[['instrument', 'company_name', 'fs_roe']]  # select several columns by passing a list of names
| | instrument | company_name | fs_roe |
---|---|---|---|
0 | 000001.SZA | Ping An Bank Co., Ltd. | 3.0319 |
1 | 000002.SZA | China Vanke Co., Ltd. | 0.6116 |
2 | 000004.SZA | Shenzhen Cau Technology Co., Ltd. | -0.9797 |
3 | 000005.SZA | Shenzhen Fountain Corporation | 0.7938 |
4 | 000006.SZA | Shenzhen Zhenye (Group) Co., Ltd. | 2.0537 |
5 | 000007.SZA | Shenzhen Quanxinhao Co., Ltd | 0.0747 |
6 | 000008.SZA | China High-speed Railway Technology Co., Ltd. | 0.1525 |
7 | 000009.SZA | China Baoan Group Co., Ltd. | 0.7437 |
8 | 000010.SZA | Shenzhen Ecobeauty Co., Ltd. | -1.1580 |
9 | 000011.SZA | Shenzhen Properties & Resources Development (Group) Ltd. | 11.7746 |
10 | 000012.SZA | CSG Holding Co., Ltd. | 2.1545 |
11 | 000014.SZA | Shahe Industrial Co., Ltd. | 0.4574 |
12 | 000016.SZA | Konka Group Co., Ltd | 0.9001 |
13 | 000017.SZA | China Bicycle Company (Holdings) Limited | 1.4668 |
14 | 000018.SZA | Sino Great Wall Co., Ltd. | 5.4350 |
15 | 000019.SZA | Shenzhen Shenbao Industrial Co., Ltd | 0.9659 |
16 | 000020.SZA | Shenzhen Zhongheng Huafa Co., Ltd. | 0.1317 |
17 | 000021.SZA | Shenzhen Kaifa Technology Co., Ltd. | 2.1430 |
18 | 000022.SZA | Shenzhen Chiwan Wharf Holdings Limited | 2.9044 |
19 | 000023.SZA | Shenzhen Universe (Group) Co., Ltd | 2.6827 |
20 | 000001.SZA | Ping An Bank Co., Ltd. | 3.0319 |
21 | 000002.SZA | China Vanke Co., Ltd. | 0.6116 |
22 | 000004.SZA | Shenzhen Cau Technology Co., Ltd. | -0.9797 |
23 | 000005.SZA | Shenzhen Fountain Corporation | 0.7938 |
24 | 000006.SZA | Shenzhen Zhenye (Group) Co., Ltd. | 2.0537 |
25 | 000007.SZA | Shenzhen Quanxinhao Co., Ltd | 0.0747 |
26 | 000008.SZA | China High-speed Railway Technology Co., Ltd. | 0.1525 |
27 | 000009.SZA | China Baoan Group Co., Ltd. | 0.7437 |
28 | 000010.SZA | Shenzhen Ecobeauty Co., Ltd. | -1.1580 |
29 | 000011.SZA | Shenzhen Properties & Resources Development (Group) Ltd. | 11.7746 |
... | ... | ... | ... |
50 | 000012.SZA | CSG Holding Co., Ltd. | 2.1545 |
51 | 000014.SZA | Shahe Industrial Co., Ltd. | 0.4574 |
52 | 000016.SZA | Konka Group Co., Ltd | 0.9001 |
53 | 000017.SZA | China Bicycle Company (Holdings) Limited | 1.4668 |
54 | 000018.SZA | Sino Great Wall Co., Ltd. | 5.4350 |
55 | 000019.SZA | Shenzhen Shenbao Industrial Co., Ltd | 0.9659 |
56 | 000020.SZA | Shenzhen Zhongheng Huafa Co., Ltd. | 0.1317 |
57 | 000021.SZA | Shenzhen Kaifa Technology Co., Ltd. | 2.1430 |
58 | 000022.SZA | Shenzhen Chiwan Wharf Holdings Limited | 2.9044 |
59 | 000023.SZA | Shenzhen Universe (Group) Co., Ltd | 2.6827 |
60 | 000001.SZA | Ping An Bank Co., Ltd. | 3.0319 |
61 | 000002.SZA | China Vanke Co., Ltd. | 0.6116 |
62 | 000004.SZA | Shenzhen Cau Technology Co., Ltd. | -0.9797 |
63 | 000005.SZA | Shenzhen Fountain Corporation | 0.7938 |
64 | 000006.SZA | Shenzhen Zhenye (Group) Co., Ltd. | 2.0537 |
65 | 000007.SZA | Shenzhen Quanxinhao Co., Ltd | 0.0747 |
66 | 000008.SZA | China High-speed Railway Technology Co., Ltd. | 0.1525 |
67 | 000009.SZA | China Baoan Group Co., Ltd. | 0.7437 |
68 | 000010.SZA | Shenzhen Ecobeauty Co., Ltd. | -1.1580 |
69 | 000011.SZA | Shenzhen Properties & Resources Development (Group) Ltd. | 11.7746 |
70 | 000012.SZA | CSG Holding Co., Ltd. | 2.1545 |
71 | 000014.SZA | Shahe Industrial Co., Ltd. | 0.4574 |
72 | 000016.SZA | Konka Group Co., Ltd | 0.9001 |
73 | 000017.SZA | China Bicycle Company (Holdings) Limited | 1.4668 |
74 | 000018.SZA | Sino Great Wall Co., Ltd. | 5.4350 |
75 | 000019.SZA | Shenzhen Shenbao Industrial Co., Ltd | 0.9659 |
76 | 000020.SZA | Shenzhen Zhongheng Huafa Co., Ltd. | 0.1317 |
77 | 000021.SZA | Shenzhen Kaifa Technology Co., Ltd. | 2.1430 |
78 | 000022.SZA | Shenzhen Chiwan Wharf Holdings Limited | 2.9044 |
79 | 000023.SZA | Shenzhen Universe (Group) Co., Ltd | 2.6827 |
80 rows × 3 columns
df.loc[:10, ['company_name', 'fs_roe']]  # select rows by label and columns by name at the same time
| | company_name | fs_roe |
---|---|---|
0 | Ping An Bank Co., Ltd. | 3.0319 |
1 | China Vanke Co., Ltd. | 0.6116 |
2 | Shenzhen Cau Technology Co., Ltd. | -0.9797 |
3 | Shenzhen Fountain Corporation | 0.7938 |
4 | Shenzhen Zhenye (Group) Co., Ltd. | 2.0537 |
5 | Shenzhen Quanxinhao Co., Ltd | 0.0747 |
6 | China High-speed Railway Technology Co., Ltd. | 0.1525 |
7 | China Baoan Group Co., Ltd. | 0.7437 |
8 | Shenzhen Ecobeauty Co., Ltd. | -1.1580 |
9 | Shenzhen Properties & Resources Development (Group) Ltd. | 11.7746 |
10 | CSG Holding Co., Ltd. | 2.1545 |
df.iloc[:5, 3:]  # first 5 rows, columns from position 3 onward
| | company_name | company_type | fs_net_profit | fs_roe |
---|---|---|---|---|
0 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 |
1 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 |
2 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 |
3 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 |
4 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 |
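The difference between `loc` and `iloc` is easy to miss when the index is 0, 1, 2, ... because labels and positions then coincide. The sketch below uses a toy frame whose labels start at 10 (an assumption made purely for illustration) to show that `loc` slices by label and includes the end label, while `iloc` slices by position and excludes the end position:

```python
import pandas as pd

# Toy frame: the index labels (10..15) deliberately differ from positions (0..5).
frame = pd.DataFrame(
    {"company_name": list("ABCDEF"), "fs_roe": [3.0, 0.6, -1.0, 0.8, 2.1, 0.1]},
    index=[10, 11, 12, 13, 14, 15],
)

by_label = frame.loc[11:13]     # labels 11, 12 and 13: the end label is included
by_position = frame.iloc[1:3]   # positions 1 and 2: the end position is excluded

print(by_label)
print(by_position)
```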
4. More row and column operations
Next, we practice using what we have learned above:
import numpy as np
df['open_int'] = np.nan  # add a new column filled with missing values
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int |
---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | NaN |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | NaN |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | NaN |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | NaN |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | NaN |
df['open_int'] = 999  # overwrite the whole column with a single value
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int |
---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 |
df['test'] = df.company_type == 'Private enterprise'  # boolean column built from a comparison
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | test |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | False |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | False |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | True |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | True |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | False |
df.loc[df.company_type == 'Private enterprise', 'test'] = 'He is a private enterprise, the tax burden is really not light'  # assign where the condition holds
df.loc[df.company_type != 'Private enterprise', 'test'] = 'It’s not a private enterprise. I don’t know what it is'  # assign to the remaining rows
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | test |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | He is a private enterprise, the tax burden is really not light |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | He is a private enterprise, the tax burden is really not light |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | It’s not a private enterprise. I don’t know what it is |
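The two-way labelling above can also be written in one step with `numpy.where`, which picks one value where a condition is true and another where it is false. This is only an alternative sketch on a toy frame; the short labels `"private"` and `"not private"` are placeholders chosen here, not the strings used in the walkthrough:

```python
import numpy as np
import pandas as pd

# Toy frame with just the column the condition needs.
frame = pd.DataFrame({
    "company_type": ["Private enterprise", "Public enterprise", "Private enterprise"],
})

is_private = frame["company_type"] == "Private enterprise"  # boolean mask
# np.where(condition, value_if_true, value_if_false) fills the whole column at once.
frame["test"] = np.where(is_private, "private", "not private")

print(frame)
```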
df.loc[2:4, 'test'] = 'I’m not listening. I’m not listening'  # select rows by label and assign a value
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | test |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
df.rename(columns={'test': 'A random column'}, inplace=True)  # rename a column; inplace=True modifies df directly
df.head()
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
df_test = df.head(6).copy()  # work on a copy of the first six rows
df_test.columns = ['Column %s' % str(i) for i in range(1, len(df_test.columns) + 1)]  # replace all column names at once
df_test
| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8 | Column 9 |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
5 | 000007.SZA | 2017-08-01 | 10004406 | Shenzhen Quanxinhao Co., Ltd | Private enterprise | 2.768926e+05 | 0.0747 | 999 | He is a private enterprise, the tax burden is really not light |
df_test.reindex(columns=['Column 1', 'Column 2', 'Column 4', 'Column 12', 'Column 3', 'Column 5', 'Column 6', 'Column 8', 'Column 7', 'Column 9'])  # rearrange the columns; this returns a new frame
df_test  # df_test itself is unchanged
| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8 | Column 9 |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
5 | 000007.SZA | 2017-08-01 | 10004406 | Shenzhen Quanxinhao Co., Ltd | Private enterprise | 2.768926e+05 | 0.0747 | 999 | He is a private enterprise, the tax burden is really not light |
df_test.reindex(index=[3, 4, 5, 0, 1, 2])  # rearrange the rows
| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8 | Column 9 |
---|---|---|---|---|---|---|---|---|---|
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
5 | 000007.SZA | 2017-08-01 | 10004406 | Shenzhen Quanxinhao Co., Ltd | Private enterprise | 2.768926e+05 | 0.0747 | 999 | He is a private enterprise, the tax burden is really not light |
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 000004.SZA | 2017-08-01 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
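Two properties of `reindex` are worth checking on a tiny example: it returns a new frame instead of changing the original, and any requested label that does not exist is filled with NaN. The frame and the column name `"missing"` below are made up for this sketch:

```python
import pandas as pd

# Toy frame with two columns; "missing" is not one of them.
frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

reordered = frame.reindex(columns=["b", "missing", "a"])

print(reordered)            # "missing" appears as a column of NaN
print(list(frame.columns))  # the original frame keeps its own column order
```

To keep the reordered result, assign it to a name as done here with `reordered`.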
5. Delete rows and columns
df_test.drop([2, 5], axis=0)  # drop rows by label
| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8 | Column 9 |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 000002.SZA | 2017-08-01 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
3 | 000005.SZA | 2017-08-01 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
df_test.drop(['Column 1', 'Column 2'], axis=1)  # drop columns by name
| | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 | Column 8 | Column 9 |
---|---|---|---|---|---|---|---|
0 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
1 | 20952262 | China Vanke Co., Ltd. | Public enterprise | 6.954116e+08 | 0.6116 | 999 | It’s not a private enterprise. I don’t know what it is |
2 | 653388 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | I’m not listening. I’m not listening |
3 | 7343560 | Shenzhen Fountain Corporation | Private enterprise | 1.014027e+07 | 0.7938 | 999 | I’m not listening. I’m not listening |
4 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
5 | 10004406 | Shenzhen Quanxinhao Co., Ltd | Private enterprise | 2.768926e+05 | 0.0747 | 999 | He is a private enterprise, the tax burden is really not light |
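Note that the second `drop` output still contains rows 2 and 5: `drop` returns a new frame and leaves the original untouched unless you reassign the result. A minimal sketch on a toy frame, using the `index=` and `columns=` keywords as an equivalent spelling of `axis=0` and `axis=1`:

```python
import pandas as pd

# Toy frame with three rows and two columns.
frame = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

without_row = frame.drop(index=[1])      # same as frame.drop([1], axis=0)
without_col = frame.drop(columns=["a"])  # same as frame.drop(["a"], axis=1)

print(without_row)
print(without_col)
print(frame.shape)  # the original frame still has all rows and columns
```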
6. Data type conversion
df['date'].sample(5)  # the date column holds datetime values
9 2017-08-01
35 2017-08-02
54 2017-08-03
23 2017-08-02
5 2017-08-01
Name: date, dtype: datetime64[ns]
print(type(df.date[0]))
df.date = df.date.map(lambda x: x.strftime('%Y-%m-%d'))  # convert each Timestamp to a string
print(type(df.date[0]))
<class 'pandas.tslib.Timestamp'>
<class 'str'>
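The same conversion can be done without a lambda through the `.dt` accessor, and `pd.to_datetime` converts the strings back again. The following is a small sketch on a stand-alone Series rather than on the walkthrough's `df`:

```python
import pandas as pd

# A small datetime Series standing in for the date column.
dates = pd.Series(pd.to_datetime(["2017-08-01", "2017-08-02"]))

as_text = dates.dt.strftime("%Y-%m-%d")  # vectorized version of the map/lambda call
back = pd.to_datetime(as_text)           # parse the strings back into datetimes

print(type(as_text[0]))
print(back.dtype)
```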
7. Data filtering
df[df['fs_roe'] > 1].head()  # select rows where fs_roe > 1
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
9 | 000011.SZA | 2017-08-01 | 10230327 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprise | 3.015978e+08 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
10 | 000012.SZA | 2017-08-01 | 15099069 | CSG Holding Co., Ltd. | Public enterprise | 1.701309e+08 | 2.1545 | 999 | It’s not a private enterprise. I don’t know what it is |
13 | 000017.SZA | 2017-08-01 | 2857907 | China Bicycle Company (Holdings) Limited | Private enterprise | 2.123222e+05 | 1.4668 | 999 | He is a private enterprise, the tax burden is really not light |
df[(df['fs_roe'] > 1) & (df['fs_roe'] < 4)].head()  # select rows where fs_roe lies within a range
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
4 | 000006.SZA | 2017-08-01 | 19458890 | Shenzhen Zhenye (Group) Co., Ltd. | Local state-owned enterprise | 1.038588e+08 | 2.0537 | 999 | I’m not listening. I’m not listening |
10 | 000012.SZA | 2017-08-01 | 15099069 | CSG Holding Co., Ltd. | Public enterprise | 1.701309e+08 | 2.1545 | 999 | It’s not a private enterprise. I don’t know what it is |
13 | 000017.SZA | 2017-08-01 | 2857907 | China Bicycle Company (Holdings) Limited | Private enterprise | 2.123222e+05 | 1.4668 | 999 | He is a private enterprise, the tax burden is really not light |
17 | 000021.SZA | 2017-08-01 | 7955667 | Shenzhen Kaifa Technology Co., Ltd. | Central state-owned enterprise | 1.167247e+08 | 2.1430 | 999 | It’s not a private enterprise. I don’t know what it is |
df[(df['fs_roe'] > 1) & (df['company_type'] != 'Local state-owned enterprise')].head()  # select rows that satisfy conditions on several columns
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
0 | 000001.SZA | 2017-08-01 | 203570991 | Ping An Bank Co., Ltd. | Public enterprise | 6.214000e+09 | 3.0319 | 999 | It’s not a private enterprise. I don’t know what it is |
10 | 000012.SZA | 2017-08-01 | 15099069 | CSG Holding Co., Ltd. | Public enterprise | 1.701309e+08 | 2.1545 | 999 | It’s not a private enterprise. I don’t know what it is |
13 | 000017.SZA | 2017-08-01 | 2857907 | China Bicycle Company (Holdings) Limited | Private enterprise | 2.123222e+05 | 1.4668 | 999 | He is a private enterprise, the tax burden is really not light |
14 | 000018.SZA | 2017-08-01 | 8572320 | Sino Great Wall Co., Ltd. | Private enterprise | 9.902958e+07 | 5.4350 | 999 | He is a private enterprise, the tax burden is really not light |
17 | 000021.SZA | 2017-08-01 | 7955667 | Shenzhen Kaifa Technology Co., Ltd. | Central state-owned enterprise | 1.167247e+08 | 2.1430 | 999 | It’s not a private enterprise. I don’t know what it is |
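Each condition in a combined filter needs its own parentheses because `&` binds more tightly than `>` or `!=` in Python. As a sketch on a four-row toy frame (values taken from rows 0, 4, 9 and 13 above), the same kind of filter can also be written with `query`, and `isin` is convenient when matching several categories:

```python
import pandas as pd

# Four rows taken from the walkthrough's data (rows 0, 4, 9 and 13).
frame = pd.DataFrame({
    "company_type": [
        "Public enterprise",
        "Local state-owned enterprise",
        "Local state-owned enterprise",
        "Private enterprise",
    ],
    "fs_roe": [3.0319, 2.0537, 11.7746, 1.4668],
})

# Boolean-mask form: every condition is wrapped in parentheses.
in_range = frame[(frame["fs_roe"] > 1) & (frame["fs_roe"] < 4)]

# The same range filter as a query string.
in_range_query = frame.query("fs_roe > 1 and fs_roe < 4")

# Keep rows whose company_type is one of several values.
selected_types = frame[frame["company_type"].isin(["Public enterprise", "Private enterprise"])]

print(in_range)
print(selected_types)
```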
8. Data sorting
df.sort_values(by='fs_roe').head()  # sort by fs_roe in ascending order (the default)
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
28 | 000010.SZA | 2017-08-02 | 5410561 | Shenzhen Ecobeauty Co., Ltd. | Private enterprise | -2.550807e+07 | -1.1580 | 999 | He is a private enterprise, the tax burden is really not light |
68 | 000010.SZA | 2017-08-04 | 5912089 | Shenzhen Ecobeauty Co., Ltd. | Private enterprise | -2.550807e+07 | -1.1580 | 999 | He is a private enterprise, the tax burden is really not light |
8 | 000010.SZA | 2017-08-01 | 4519425 | Shenzhen Ecobeauty Co., Ltd. | Private enterprise | -2.550807e+07 | -1.1580 | 999 | He is a private enterprise, the tax burden is really not light |
48 | 000010.SZA | 2017-08-03 | 4826822 | Shenzhen Ecobeauty Co., Ltd. | Private enterprise | -2.550807e+07 | -1.1580 | 999 | He is a private enterprise, the tax burden is really not light |
62 | 000004.SZA | 2017-08-04 | 711022 | Shenzhen Cau Technology Co., Ltd. | Private enterprise | -1.177969e+06 | -0.9797 | 999 | He is a private enterprise, the tax burden is really not light |
df.sort_values(by='fs_roe', ascending=False).head()  # ascending=False sorts in descending order
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
69 | 000011.SZA | 2017-08-04 | 5736572 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprise | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
49 | 000011.SZA | 2017-08-03 | 7180287 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprise | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
29 | 000011.SZA | 2017-08-02 | 8782902 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprise | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
9 | 000011.SZA | 2017-08-01 | 10230327 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprise | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
14 | 000018.SZA | 2017-08-01 | 8572320 | Sino Great Wall Co., Ltd. | Private enterprise | 99029584.0 | 5.4350 | 999 | He is a private enterprise, the tax burden is really not light |
Df.sort_values (by= ['fs_roe','fs_net_profit'],ascending= False). Head () #Copy the code
instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column | |
---|---|---|---|---|---|---|---|---|---|
9 | 000011.SZA | 2017-08-01 | 10230327 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprises | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
29 | 000011.SZA | 2017-08-02 | 8782902 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprises | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
49 | 000011.SZA | 2017-08-03 | 7180287 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprises | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
69 | 000011.SZA | 2017-08-04 | 5736572 | Shenzhen Properties & Resources Development (Group) Ltd. | Local state-owned enterprises | 301597824.0 | 11.7746 | 999 | It’s not a private enterprise. I don’t know what it is |
14 | 000018.SZA | 2017-08-01 | 8572320 | Sino Great Wall Co., Ltd. | The private enterprise | 99029584.0 | 5.4350 | 999 | He is a private enterprise, the tax burden is really not light |
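When sorting by several columns, `ascending` can also be a list with one flag per column. The sketch below uses a small made-up frame (the values are illustrative, not taken from the data set above) to sort by `fs_roe` in descending order and break ties by `volume` in ascending order.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the tutorial's frame.
toy = pd.DataFrame(
    {
        "instrument": ["A", "B", "C", "D"],
        "fs_roe": [2.0, 5.0, 5.0, 1.0],
        "volume": [300, 200, 100, 400],
    }
)

# fs_roe descending, ties broken by volume ascending.
ranked = toy.sort_values(by=["fs_roe", "volume"], ascending=[False, True])
print(ranked["instrument"].tolist())  # ['C', 'B', 'A', 'D']
```

Rows B and C share the same `fs_roe`, so the second key decides their order.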
9. Descriptive statistics of data
This section uses the 'fs_roe' column as an example to compute descriptive statistics.
df['fs_roe'].mean()  # mean of the fs_roe column
1.5953049875795842
df['fs_roe'].idxmax()  # index label of the largest fs_roe value
9
df.loc[df['fs_roe'].idxmin()]  # locate the row with the smallest fs_roe
instrument         000010.SZA
date               2017-08-01
volume             4519425
company_name       Shenzhen Ecobeauty Co.,Ltd.
company_type       The private enterprise
...
A random column    He is a private enterprise, the tax burden is really not light
Name: 8, dtype: object
df.fs_net_profit.corr(df.volume)  # correlation between net profit and volume
0.81501625795410615
df.company_type.unique()  # distinct values in the column
array(['The public enterprise', 'The private enterprise', 'Local state-owned enterprises', 'Central state-owned enterprise'], dtype=object)
df.company_type.value_counts()  # number of rows per distinct value
The private enterprise            28
The public enterprise             24
Local state-owned enterprises     16
Central state-owned enterprise    12
Name: company_type, dtype: int64
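The same calls can be tried on a small frame without platform access. This is a sketch on made-up numbers (not the BigQuant data). `describe()` is an extra pandas method, not used in the text above, that summarises a numeric column in one call.

```python
import pandas as pd

# Hypothetical toy data with non-default index labels.
toy = pd.DataFrame(
    {
        "company_type": ["private", "public", "private", "local"],
        "fs_roe": [1.0, 3.0, 0.5, 2.5],
        "volume": [10, 30, 5, 25],
    },
    index=[8, 9, 10, 11],
)

mean_roe = toy["fs_roe"].mean()              # arithmetic mean
best_label = toy["fs_roe"].idxmax()          # index label (not position) of the maximum
worst_row = toy.loc[toy["fs_roe"].idxmin()]  # full row holding the minimum fs_roe
corr = toy["fs_roe"].corr(toy["volume"])     # Pearson correlation by default
counts = toy["company_type"].value_counts()  # frequency of each category
summary = toy["fs_roe"].describe()           # count, mean, std, min, quartiles, max

print(mean_roe, best_label, worst_row.name)
```

Note that `idxmax` and `idxmin` return index labels, which is why the result is passed to `loc` rather than `iloc`.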
10. Handling missing data
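import numpy as np

df_test = df.sample(5)  # draw 5 random rows
df_test.loc[df_test['fs_roe'] <= 1, 'fs_roe'] = np.nan  # blank out fs_roe values of 1 or less
df_test.loc[66] = np.nan  # add a row (label 66) that is entirely NaN
df_test  # the practice data set is ready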
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
23 | 000005.SZA | 2017-08-02 | 6313300.0 | Shenzhen Fountain Corporation | The private enterprise | 1.014027e+07 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
47 | 000009.SZA | 2017-08-03 | 69392114.0 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | NaN | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | 0.0 | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | 10004406.0 | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
66 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
df_test.dropna()  # drop rows that contain any NaN value
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
df_test.dropna(how='all', inplace=True)  # drop only rows where every value is NaN; inplace=True modifies df_test itself
df_test
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
23 | 000005.SZA | 2017-08-02 | 6313300.0 | Shenzhen Fountain Corporation | The private enterprise | 1.014027e+07 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
47 | 000009.SZA | 2017-08-03 | 69392114.0 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | NaN | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | 0.0 | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | 10004406.0 | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
df_test.dropna(axis=1)  # drop columns that contain any NaN value
| | instrument | date | volume | company_name | company_type | fs_net_profit | open_int | A random column |
---|---|---|---|---|---|---|---|---|
23 | 000005.SZA | 2017-08-02 | 6313300.0 | Shenzhen Fountain Corporation | The private enterprise | 1.014027e+07 | 999.0 | He is a private enterprise, the tax burden is really not light |
47 | 000009.SZA | 2017-08-03 | 69392114.0 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | 0.0 | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | 10004406.0 | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 999.0 | It’s not a private enterprise. I don’t know what it is |
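Besides `how` and `axis`, `dropna` also accepts `subset` (only check certain columns) and `thresh` (keep rows with at least that many non-missing values). A minimal sketch on made-up data, so the exact effect of each option is easy to see:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in different places.
toy = pd.DataFrame(
    {
        "instrument": ["A", "B", None, "D"],
        "volume": [100.0, np.nan, np.nan, 400.0],
        "fs_roe": [np.nan, 2.0, np.nan, 4.0],
    }
)

any_nan = toy.dropna()                  # keep only rows without any missing value
all_nan = toy.dropna(how="all")         # drop only rows that are entirely missing
by_col = toy.dropna(subset=["volume"])  # look at the volume column only
at_least_two = toy.dropna(thresh=2)     # keep rows with >= 2 non-missing values

print(list(any_nan.index), list(all_nan.index), list(by_col.index))
```

None of these calls modifies `toy`. Each returns a new frame unless `inplace=True` is passed.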
When cleaning data, it is often better to fill missing values appropriately than to delete them. Let's construct a new data set to demonstrate.
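df_test.loc[23] = np.nan  # blank out an existing row
df_test.loc[49] = np.nan  # add an all-NaN row, matching the output below
df_test.loc[[16, 5], 'volume'] = np.nan  # blank out volume in two rows
df_test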
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
23 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
47 | 000009.SZA | 2017-08-03 | 69392114.0 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | NaN | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | NaN | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | NaN | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
49 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
df_test.fillna(0)  # fill all missing data with 0
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
23 | 0 | 0 | 0.0 | 0 | 0 | 0.000000e+00 | 0.0000 | 0.0 | 0 |
47 | 000009.SZA | 2017-08-03 | 69392114.0 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | 0.0000 | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | 0.0 | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | 0.0000 | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | 0.0 | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | 0.0000 | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 98421938.0 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
49 | 0 | 0 | 0.0 | 0 | 0 | 0.000000e+00 | 0.0000 | 0.0 | 0 |
df_test.fillna({'date': '1988-09-01', 'volume': '20000000'})  # fill missing data in different columns with different values
| | instrument | date | volume | company_name | company_type | fs_net_profit | fs_roe | open_int | A random column |
---|---|---|---|---|---|---|---|---|---|
23 | NaN | 1988-09-01 | 20000000 | NaN | NaN | NaN | NaN | NaN | NaN |
47 | 000009.SZA | 2017-08-03 | 6.93921e+07 | China Baoan Group Co., Ltd. | The public enterprise | 3.347200e+07 | NaN | 999.0 | It’s not a private enterprise. I don’t know what it is |
16 | 000020.SZA | 2017-08-01 | 20000000 | Shenzhen Zhongheng Huafa Co., Ltd. | The private enterprise | 4.211734e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
5 | 000007.SZA | 2017-08-01 | 20000000 | Shenzhen Quanxinhao Co., Ltd | The private enterprise | 2.768926e+05 | NaN | 999.0 | He is a private enterprise, the tax burden is really not light |
40 | 000001.SZA | 2017-08-03 | 9.84219e+07 | Ping An Bank Co., Ltd. | The public enterprise | 6.214000e+09 | 3.0319 | 999.0 | It’s not a private enterprise. I don’t know what it is |
49 | NaN | 1988-09-01 | 20000000 | NaN | NaN | NaN | NaN | NaN | NaN |
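In the call above, the fill value for volume is the string `'20000000'`, which is the likely reason the filled column prints in mixed formats. The sketch below, on made-up data, passes a number instead, so the column stays numeric. The fill values themselves are arbitrary.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one gap per column.
toy = pd.DataFrame(
    {
        "date": ["2017-08-01", None, "2017-08-03"],
        "volume": [100.0, np.nan, 300.0],
    }
)

# One fill value per column; columns not listed in the dict are left untouched.
filled = toy.fillna({"date": "1988-09-01", "volume": 20000000})

print(filled["volume"].dtype)  # float64
```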
df_test.volume.fillna(df_test.volume.mean())  # fill with the column mean
23 83907026.0
47 69392114.0
16 83907026.0
5 83907026.0
40 98421938.0
49 83907026.0
Name: volume, dtype: float64
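A single overall mean can be misleading when instruments trade at very different volumes. One common alternative, sketched here on made-up data, is to fill each gap with the mean of its own group using `groupby(...).transform('mean')`. Grouping by `instrument` is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two instruments with very different volume levels.
toy = pd.DataFrame(
    {
        "instrument": ["A", "A", "A", "B", "B"],
        "volume": [100.0, np.nan, 300.0, 10.0, np.nan],
    }
)

# Mean volume per instrument, aligned row by row with the original frame.
group_mean = toy.groupby("instrument")["volume"].transform("mean")

# fillna accepts a Series and aligns it on the index.
filled = toy["volume"].fillna(group_mean)

print(filled.tolist())  # [100.0, 200.0, 300.0, 10.0, 10.0]
```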
df_test.volume.fillna(method='ffill')  # forward fill ('ffill') or backward fill ('bfill')
23 NaN
47 69392114.0
16 69392114.0
5 69392114.0
40 98421938.0
49 98421938.0
Name: volume, dtype: float64
df_test.volume.fillna(method='ffill', limit=1)  # forward fill at most one consecutive NaN
23 NaN
47 69392114.0
16 69392114.0
5 NaN
40 98421938.0
49 98421938.0
Name: volume, dtype: float64
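Recent pandas releases deprecate the `method=` argument of `fillna`. The dedicated `ffill()` and `bfill()` methods do the same job and accept `limit` as well. The sketch below rebuilds the volume column shown above by hand, so it runs without the platform data.

```python
import numpy as np
import pandas as pd

# Same values and index labels as the volume column shown above.
volume = pd.Series(
    [np.nan, 69392114.0, np.nan, np.nan, 98421938.0, np.nan],
    index=[23, 47, 16, 5, 40, 49],
    name="volume",
)

forward = volume.ffill()             # carry the last valid value downwards
forward_one = volume.ffill(limit=1)  # fill at most one consecutive NaN
backward = volume.bfill()            # carry the next valid value upwards

print(forward_one)
```

The first row stays NaN under forward fill because there is no earlier value to carry over. For the same reason, the last row stays NaN under backward fill.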
11. Saving data
Save the cleaned data on the platform:
df.to_csv('df_Pandaslearning')  # save the data
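By default `to_csv` also writes the index as an unnamed first column. The sketch below writes to a temporary directory rather than the platform's storage. It passes `index=False` and reads the file back with `dtype` and `parse_dates` so that the column types survive the round trip. The file name is hypothetical and the two rows are copied from the first table of this tutorial.

```python
import os
import tempfile

import pandas as pd

toy = pd.DataFrame(
    {
        "instrument": ["000001.SZA", "000002.SZA"],
        "date": pd.to_datetime(["2017-08-01", "2017-08-01"]),
        "volume": [203570991, 20952262],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df_pandas_learning.csv")  # hypothetical file name
    toy.to_csv(path, index=False)  # do not write the index column
    # Read instrument codes as text and parse the date column on load.
    restored = pd.read_csv(path, dtype={"instrument": str}, parse_dates=["date"])

print(restored.dtypes)
```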
That concludes this introduction to the basic usage of pandas.