It is very convenient to be able to query a text file with a fixed delimiter in much the same way that SQL queries a table, and pandas makes this possible.
How do I load a TXT file?
Sample data file papa.txt
paxi_id grade
1 50
2 50
3 100
4 200
3 100
5 100
Install Jupyter and start Jupyter Notebook from the directory that contains the file. In the browser window that opens, create a new Python notebook.
import pandas # import pandas
papa=pandas.read_csv('papa.txt',sep='\t') # load papa.txt and specify a tab (\t) as its delimiter
papa.head() # display the first few rows of data
The loaded data is displayed as a table.
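To try the load step without creating a file, here is a minimal sketch. It assumes papa.txt holds the sample rows above separated by tabs; the PAPA_TXT string and the pd alias are not in the original and only stand in for the file.

```python
import io

import pandas as pd

# Assumption: papa.txt contains the sample rows shown above, tab-separated.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"

# read_csv accepts a path or a file-like object, so StringIO stands in for the file.
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

print(papa.head())   # first five rows
print(papa.dtypes)   # both columns are parsed as integers
```

With a real file, replacing io.StringIO(PAPA_TXT) with 'papa.txt' should behave the same way.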
How do I find out how many rows and how many columns were loaded?
The code is as follows
rowNum=papa.shape[0] # number of rows; the header row is not counted
colNum=papa.columns.size
The results are as follows
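Both numbers can also be read from shape in one step. A small sketch, assuming the sample data above (the inline PAPA_TXT string is a stand-in for papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

# shape is a (rows, columns) tuple; the header row is not counted as data.
row_num, col_num = papa.shape
print("rows:", row_num, "columns:", col_num)
```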
How do I remove rows that have a duplicate value in one column?
The code is as follows
uPapa=papa.drop_duplicates(['paxi_id'])
The results are as follows
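Which of the duplicated rows survives is controlled by the keep parameter of drop_duplicates. A short sketch on the sample data (the inline string is an assumed stand-in for papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

# Default keep='first': the first row for each paxi_id is kept.
first_kept = papa.drop_duplicates(subset=["paxi_id"])

# keep=False: every row whose paxi_id occurs more than once is dropped.
no_repeats = papa.drop_duplicates(subset=["paxi_id"], keep=False)

print(first_kept)
print(no_repeats)
```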
How do I get the distinct values of a column, and how many distinct values are there?
The code is as follows
uPaxiId=papa['paxi_id'].unique()
print("uPaxiId:",uPaxiId)
totalUPaxiIdNum=uPaxiId.size
print("num:",totalUPaxiIdNum)
The result is as follows
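If only the count is needed, nunique() gives it directly. A sketch on the sample data (inline string assumed in place of papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

distinct_ids = papa["paxi_id"].unique()      # distinct values, in order of first appearance
distinct_count = papa["paxi_id"].nunique()   # number of distinct values

print(distinct_ids, distinct_count)
```

This is roughly what SELECT DISTINCT and COUNT(DISTINCT ...) would give in SQL.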
How do I compute the sum of a column?
The code is as follows
papa['grade'].sum()
The results are as follows
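Note that duplicated rows are included in the sum. A sketch comparing the total before and after deduplication, assuming the sample data above (inline string in place of papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

# Sum over all six rows, including the repeated paxi_id 3.
total = papa["grade"].sum()

# Sum after keeping one row per paxi_id.
dedup_total = papa.drop_duplicates(subset=["paxi_id"])["grade"].sum()

print(total, dedup_total)
```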
How do I filter rows for specific values?
The code is as follows
papa[(papa['grade'] == 50) | (papa['grade'] == 100)]
The results are as follows
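When the list of wanted values grows, isin() is usually easier to read than a chain of | conditions. A sketch on the sample data (inline string assumed in place of papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

# Boolean mask built with | (each condition needs its own parentheses).
with_or = papa[(papa["grade"] == 50) | (papa["grade"] == 100)]

# Same filter with isin, similar to SQL's WHERE grade IN (50, 100).
with_isin = papa[papa["grade"].isin([50, 100])]

print(with_isin)
```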
How do I count how many times each value occurs in a column?
The code is as follows
gPapa=papa.groupby('grade').size()
The results are as follows
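value_counts() is a common alternative to groupby(...).size() for this. A sketch on the sample data (inline string assumed in place of papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")

# Rows per grade, indexed by grade in ascending order.
by_group = papa.groupby("grade").size()

# Same counts, ordered from most frequent to least frequent.
by_value = papa["grade"].value_counts()

print(by_group)
print(by_value)
```

Both correspond to SELECT grade, COUNT(*) ... GROUP BY grade; only the ordering of the result differs.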
How do I sum the counts for two of the values, or for all of them?
The code is as follows
v=gPapa[50]+gPapa[100]
print("The sum of the two:",v)
print("The sum of all:",gPapa.sum())
The results are as follows
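For more than two values, selecting by a list of labels with loc avoids writing out each term. A sketch, assuming the sample data above (inline string in place of papa.txt):

```python
import io

import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")
gPapa = papa.groupby("grade").size()

# The index of gPapa holds the grade values, so 50 and 100 are labels, not positions.
two_sum = gPapa.loc[[50, 100]].sum()
all_sum = gPapa.sum()

print("The sum of the two:", two_sum)
print("The sum of all:", all_sum)
```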
How do I plot the values as a chart?
The code is as follows
import matplotlib.pyplot as plt
fig=plt.figure()
gPapa.plot(kind='bar',grid=True) # kind='bar' draws vertical bars; kind='barh' swaps the axes and draws horizontal bars
plt.show() # render the figure; all pending figures are drawn at once
The results are as follows
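Outside a notebook it can be handy to write the chart to an image instead of opening a window. A sketch, assuming the sample data above; the Agg backend, the axis labels, and the in-memory buffer are choices made for this example, not part of the original.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: same tab-separated sample data as papa.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
papa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t")
gPapa = papa.groupby("grade").size()

# Draw on an explicit Axes so labels can be set afterwards.
fig, ax = plt.subplots()
gPapa.plot(kind="bar", grid=True, ax=ax)
ax.set_xlabel("grade")
ax.set_ylabel("number of rows")

# Save as PNG into memory; pass a filename instead to write a file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```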
How do I join two TXT files on a shared column?
The other file is xixi.txt
paxi_id type
1 3
2 4
3 4
4 5
3
Run the following code
xixi=pandas.read_csv('xixi.txt',sep='\t')
uXixi=xixi.drop_duplicates(['paxi_id'])
pandas.merge(uPapa,uXixi,on=['paxi_id']) # inner join on paxi_id
The results are as follows
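merge performs an inner join by default; the how parameter selects other join types. A sketch, assuming the papa sample data above and an xixi table made of four (paxi_id, type) pairs read off the sample; both inline strings are stand-ins for the real files.

```python
import io

import pandas as pd

# Assumption: tab-separated stand-ins for papa.txt and xixi.txt.
PAPA_TXT = "paxi_id\tgrade\n1\t50\n2\t50\n3\t100\n4\t200\n3\t100\n5\t100\n"
XIXI_TXT = "paxi_id\ttype\n1\t3\n2\t4\n3\t4\n4\t5\n"

uPapa = pd.read_csv(io.StringIO(PAPA_TXT), sep="\t").drop_duplicates(["paxi_id"])
uXixi = pd.read_csv(io.StringIO(XIXI_TXT), sep="\t").drop_duplicates(["paxi_id"])

# Inner join (default): only paxi_id values present in both tables.
inner = pd.merge(uPapa, uXixi, on=["paxi_id"])

# Left join: every row of uPapa is kept; type is NaN where xixi has no match.
left = pd.merge(uPapa, uXixi, on=["paxi_id"], how="left")

print(inner)
print(left)
```

Here paxi_id 5 has no row in the assumed xixi data, so it is missing from the inner join and has an empty type in the left join.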
How do I plot a dictionary as a bar chart?
period={'1': 100,'2': 200,'3':150}
import matplotlib.pyplot as plt
fig=plt.figure()
plt.bar(range(len(period)),list(period.values()),align='center') # bar heights are the dictionary values
plt.xticks(range(len(period)),list(period.keys())) # label each bar with its dictionary key
plt.show()
The official pandas documentation, which includes a tutorial, is linked below
Pandas.pydata.org/pandas-docs…