Importing a file

import numpy as np
import pandas as pd
odata = pd.read_csv('example.csv')

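These three lines of code are enough to import a CSV file. Make sure the path to the file is correct: a relative path such as 'example.csv' is resolved against the current working directory.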
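As a rough sketch of the path issue, the snippet below writes a tiny CSV into a temporary directory and reads it back through an explicit path. The column names (date, latitude) and the sample values are made up for illustration and are not taken from the original file.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create a small CSV in a temporary directory (illustrative data only).
tmp_dir = Path(tempfile.mkdtemp())
csv_path = tmp_dir / "example.csv"
csv_path.write_text("date,latitude\n2020-01-01,10.5\n2020-01-02,11.0\n")

# An explicit path does not depend on the current working directory.
odata = pd.read_csv(csv_path)
print(odata.shape)  # (2, 2)
```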

Delete rows

data1 = odata.drop([16, 17])

This code deletes the rows labeled 16 and 17 and returns the result as a new DataFrame. The remaining rows are not renumbered: rows 18 and 19 do not move up to fill positions 16 and 17, so the index jumps from 15 straight to 18.
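The gap in the index can be seen in a small sketch. The 20-row frame below is a hypothetical stand-in for the imported CSV, and reset_index is shown only as one possible way to renumber the rows if consecutive labels are wanted.

```python
import pandas as pd

# Hypothetical 20-row frame standing in for the imported CSV.
odata = pd.DataFrame({"value": range(20)})

data1 = odata.drop([16, 17])          # returns a new DataFrame; odata is unchanged
print(data1.index[-4:].tolist())      # [14, 15, 18, 19]
print(len(odata), len(data1))         # 20 18

# If consecutive labels are wanted, renumber explicitly.
renumbered = data1.reset_index(drop=True)
print(renumbered.index[-2:].tolist())  # [16, 17]
```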

If drop() is called without inplace=True, the deletion only takes effect in the new DataFrame that is returned; the corresponding rows of the original DataFrame are untouched. To delete the rows from the original DataFrame itself, set inplace=True:

odata.drop(odata.index[[16, 17]], inplace=True)

Note the difference between the two approaches. Without inplace, we use another variable, data1, to hold the processed data; with inplace=True, the call modifies the original DataFrame directly. Note that inplace=True does not modify the original file, so it is safe in that respect: the rows are removed from the variable in memory, but nothing is deleted from the file on disk.
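A minimal sketch of the in-place behavior, again on a hypothetical 20-row frame. With inplace=True the call returns None and the frame itself shrinks; the CSV on disk would change only if the frame were written back explicitly, for example with to_csv.

```python
import pandas as pd

odata = pd.DataFrame({"value": range(20)})  # hypothetical data

# odata.index[[16, 17]] looks up the labels stored at positions 16 and 17.
result = odata.drop(odata.index[[16, 17]], inplace=True)
print(result)      # None: with inplace=True nothing is returned
print(len(odata))  # 18: the in-memory frame itself was modified
```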

Delete a column

del odata['date']

A column is deleted as shown above. Note that the square brackets after del accept only one column label, so this form removes only one column at a time.
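When several columns have to go at once, drop with the columns argument is one alternative to repeated del statements. The sketch below uses hypothetical column names (longitude and value are not from the original data).

```python
import pandas as pd

# Hypothetical columns for illustration.
odata = pd.DataFrame({
    "date": ["2020-01-01"],
    "latitude": [10.5],
    "longitude": [20.1],
    "value": [1],
})

del odata["date"]  # removes a single column from odata itself

# drop(columns=...) removes several columns and returns a new DataFrame.
data1 = odata.drop(columns=["latitude", "longitude"])
print(list(data1.columns))  # ['value']
```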

The pop() method

The pop() method removes the selected column from the original DataFrame and returns it; the original DataFrame no longer holds that column:

data1 = odata.pop('latitude')

The pop() method pulls out a single column and is useful when we are interested in one particular piece of the data.
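A short sketch of what pop returns and what it leaves behind, using a hypothetical two-column frame:

```python
import pandas as pd

odata = pd.DataFrame({"latitude": [10.5, 11.0], "value": [1, 2]})  # hypothetical

latitude = odata.pop("latitude")  # returns the removed column as a Series
print(type(latitude).__name__)    # Series
print(list(odata.columns))        # ['value']
```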

Using split()

Simple Python string splitting

When we preprocess data, we often have to handle strings whose fields are joined by various symbols. To work with the individual fields we need to split them apart, which is what Python's split function is for.

str = 'www.google.com'
print(str)
str_split = str.split('.')
print(str_split)

The result of this operation is:

www.google.com
['www', 'google', 'com']

If we want to set the number of splits, we add a parameter to split:

str_split = str.split('.', 1)

The result is:

www.google.com
['www', 'google.com']

That is, the string is split only at the first separator; the second separator is left alone. The separator passed to split can also be a string of several characters. For example, if the fields in our data are separated by '||', we split in the same way:

str = 'WinXP||Win7||Win8||Win8.1'
print(str)
str_split = str.split('||')
print(str_split)

We get the following (each element is displayed in single quotation marks because it is a string):

['WinXP', 'Win7', 'Win8', 'Win8.1']
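The variants discussed in this section can be compared side by side. In the sketch below, the variable names host and systems are chosen for illustration, and rsplit is an addition not covered above: it is the standard string method that splits from the right instead of the left.

```python
host = "www.google.com"

print(host.split("."))      # ['www', 'google', 'com']
print(host.split(".", 1))   # ['www', 'google.com']
print(host.rsplit(".", 1))  # ['www.google', 'com']

systems = "WinXP||Win7||Win8||Win8.1"
print(systems.split("||"))  # ['WinXP', 'Win7', 'Win8', 'Win8.1']
```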

Using the re module

When we deal with real data, we often need to split a string on several different separators. For example, given the string 'Beautiful, is; better*than\nugly', we want to split it into separate words. A single call to split cannot do this: split accepts only one separator, and it returns a list, which has no split method of its own, so the call cannot simply be chained a second time. Python's built-in re module solves this problem. It is used as follows:

import re

a = 'Beautiful, is; better*than\nugly'
# Split on any run of commas, semicolons, asterisks, or whitespace (spaces, newlines)
x = re.split(r'[,;*\s]+', a)
print(x)

The result is:

['Beautiful', 'is', 'better', 'than', 'ugly']
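The contrast between the two approaches can be sketched as follows. The names text, parts, and separators are illustrative, and compiling the pattern first is just one option that may be convenient when the same separators are applied to many strings.

```python
import re

text = "Beautiful, is; better*than\nugly"

# Chaining str.split fails because the first call already returns a list.
parts = text.split(",")
print(hasattr(parts, "split"))  # False

# A compiled character class splits on any run of the listed separators.
separators = re.compile(r"[,;*\s]+")
print(separators.split(text))  # ['Beautiful', 'is', 'better', 'than', 'ugly']
```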