Import the file
import numpy as np
import pandas as pd
odata = pd.read_csv('example.csv')
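These three lines of code are enough to import a CSV file. Make sure the path passed to read_csv points to the file; a relative path such as 'example.csv' is resolved against the current working directory.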
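To try the import without a file on disk, in-memory text can stand in for the CSV. This is a minimal sketch: the column names and values are made up for illustration and are not the contents of the original example.csv.

```python
import io

import pandas as pd

# Hypothetical CSV contents; the real example.csv may look different.
csv_text = "date,latitude,value\n2020-01-01,31.2,7\n2020-01-02,39.9,9\n"

# read_csv accepts a file path or any file-like object.
odata = pd.read_csv(io.StringIO(csv_text))

print(odata.shape)          # (rows, columns)
print(list(odata.columns))  # column names taken from the header line
```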
Delete rows
data1 = odata.drop([16])
This code deletes the row whose index label is 16. The remaining rows are not renumbered to fill the gap: after 15, the index jumps straight to 17.
Unless inplace=True is passed, drop() returns a new DataFrame with the row removed and leaves the original DataFrame unchanged. To remove the row from the original, use inplace=True:
odata.drop(odata.index[[16]], inplace=True)
Note the difference between the two forms. Without inplace, we assign the result to a new variable that holds the processed data; with inplace, the call modifies the original DataFrame directly. Either way, the CSV file on disk is not touched, so the operation is safe: the row is removed only from the variable in memory, never from the file.
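The difference between the two forms can be checked on a small table. The following is a minimal sketch, assuming a made-up DataFrame named demo with a default integer index; the name and the values are illustrative only.

```python
import pandas as pd

# Hypothetical toy data standing in for the CSV contents.
demo = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

# Without inplace: drop() returns a new DataFrame and demo keeps all rows.
trimmed = demo.drop([2])
print(list(trimmed.index))  # the index labels are not renumbered
print(len(demo))            # demo still has every row

# With inplace=True: demo itself is modified and drop() returns None.
result = demo.drop(demo.index[[2]], inplace=True)
print(list(demo.index))
print(result)
```

If consecutive numbering is wanted after a deletion, reset_index(drop=True) returns a renumbered copy.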
Delete a column
del odata['date']
The del statement above removes the 'date' column from the DataFrame in place. Note that del accepts only one column name in the square brackets, so only one column can be deleted at a time.
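When several columns have to go at once, drop() with the columns argument is one alternative to repeating del. A minimal sketch, with made-up column names and values:

```python
import pandas as pd

# Hypothetical one-row table; the column names are assumptions.
demo = pd.DataFrame({
    "date": ["2020-01-01"],
    "latitude": [31.2],
    "longitude": [121.5],
    "value": [7],
})

# del removes exactly one column, in place.
del demo["date"]
print(list(demo.columns))

# drop(columns=[...]) removes several columns and returns a new DataFrame.
smaller = demo.drop(columns=["latitude", "longitude"])
print(list(smaller.columns))
```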
The pop() method
The pop() method removes the selected column from the original DataFrame and returns it, so the DataFrame no longer holds that column:
data1 = odata.pop('latitude')
pop() is useful when we want to pull a single column out of the data and work with it on its own.
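The effect of pop() on both sides can be seen in a small example. This is a sketch on made-up data; the names demo and lat are illustrative.

```python
import pandas as pd

# Hypothetical two-column table.
demo = pd.DataFrame({"latitude": [31.2, 39.9], "value": [7, 9]})

# pop() returns the column as a Series and removes it from demo.
lat = demo.pop("latitude")

print(type(lat).__name__)   # the popped column is a pandas Series
print(lat.tolist())
print(list(demo.columns))   # 'latitude' is gone from the DataFrame
```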
Using split()
Simple Python string splitting
When preprocessing data, we often encounter strings whose fields are joined by various symbols. To work with the individual fields we have to split them apart, which is what Python's split function is for.
text = 'www.google.com'  # named text to avoid shadowing the built-in str
print(text)
text_split = text.split('.')
print(text_split)
The result of this operation is:
www.google.com
['www', 'google', 'com']
To limit the number of splits, pass a second argument to split:
text_split = text.split('.', 1)
The result is:
www.google.com
['www', 'google.com']
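The second argument (maxsplit) counts splits from the left; rsplit does the same from the right. A short sketch of both, using the same example string:

```python
text = "www.google.com"

# Split at most once, starting from the left.
left = text.split(".", 1)
print(left)   # ['www', 'google.com']

# Split at most once, starting from the right.
right = text.rsplit(".", 1)
print(right)  # ['www.google', 'com']

# A single split conveniently unpacks into two names.
head, tail = text.split(".", 1)
print(head, tail)
```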
That is, the string is split only at the first separator; the second one is left alone. The separator does not have to be a single character, either. For example, if the fields are joined by the string "||", we split in the same way:
text = 'WinXP||Win7||Win8||Win8.1'
print(text)
text_split = text.split('||')
print(text_split)
The resulting list is (each element is shown in single quotes because it is a string):
['WinXP', 'Win7', 'Win8', 'Win8.1']
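Real data often has spaces around the separator, which split leaves attached to each field. One way to handle that, sketched here on a made-up string, is to strip each piece after splitting:

```python
# Hypothetical input with spaces around the "||" separator.
raw = "WinXP || Win7 || Win8 || Win8.1"

# Splitting alone keeps the surrounding spaces on each field.
pieces = raw.split("||")
print(pieces)

# Stripping each piece removes the leading and trailing whitespace.
parts = [p.strip() for p in pieces]
print(parts)  # ['WinXP', 'Win7', 'Win8', 'Win8.1']
```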
Using the re module
When we deal with real data, we often need to split on several different separators. For a string like 'Beautiful, is; better*than\nugly', for example, we want to end up with the individual words. The string split method cannot do this directly: it accepts only one separator, and it returns a list, which has no split method of its own, so we cannot simply split a second time. Python's built-in re module solves this problem. It is used as follows:
import re
a = 'Beautiful, is; better*than\nugly'
# Alternatives separated by |: ', ' or '; ' or a literal '*' or a newline
x = re.split(r', |; |\*|\n', a)
print(x)
The result is:
['Beautiful', 'is', 'better', 'than', 'ugly']
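A character class is another way to write the pattern, and it copes with a variable amount of whitespace around the separators. This is a sketch of that variant, not the only correct pattern; the second input string is made up to show the trailing-separator case.

```python
import re

a = "Beautiful, is; better*than\nugly"

# One or more characters from the set: comma, semicolon, asterisk, whitespace.
words = re.split(r"[,;*\s]+", a)
print(words)  # ['Beautiful', 'is', 'better', 'than', 'ugly']

# A separator at the end of the input leaves an empty string in the result.
trailing = re.split(r"[,;*\s]+", "a, b;")
print(trailing)  # ['a', 'b', '']

# Filtering out empty strings gives only the actual words.
cleaned = [w for w in trailing if w]
print(cleaned)  # ['a', 'b']
```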