SimpleImputer parameter description
class sklearn.impute.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None, verbose=0, copy=True, add_indicator=False)
Parameter meaning
- missing_values: int, float, str, np.nan (default) or None. The placeholder that marks a missing value; every occurrence of it is imputed.
- strategy: the imputation strategy, with four options: mean (default), median, most_frequent and constant. mean fills a column's missing values with the mean of that column, median uses the column median, and most_frequent uses the column's most frequent value (the mode). constant fills missing values with a custom value, which is defined through fill_value. mean and median apply to numerical data only.
- fill_value: str or numerical value, default None. When strategy == "constant", fill_value is used to replace all occurrences of missing_values. If fill_value is left as None, missing values are replaced by 0 for numerical data and by the string "missing_value" for string or object data types.
- verbose: int, default 0. Controls the verbosity of the imputer. This parameter was deprecated in scikit-learn 1.1 and removed in 1.3, so the signature above applies to older releases.
- copy: boolean, default True. True means a copy of the data is processed; False modifies the data in place where possible.
- add_indicator: boolean, default False. If True, n columns of 0s and 1s are appended after the imputed data, one for each feature that had missing values during fit and with the same number of rows as the data. 0 means the value at that position was not missing; 1 means it was missing.
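The four strategies described above can be compared on a single column. This is a minimal sketch rather than a recommended workflow; the sample values and the fill value -1 are assumptions chosen only so that each strategy produces a different result.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# One column with a single missing entry; the observed values are 1, 1, 4, 10
X = np.array([[1.0], [1.0], [4.0], [10.0], [np.nan]])

filled_by_strategy = {}
for strategy in ("mean", "median", "most_frequent"):
    imp = SimpleImputer(missing_values=np.nan, strategy=strategy)
    # Keep only the value that replaced the missing entry in the last row
    filled_by_strategy[strategy] = float(imp.fit_transform(X)[-1, 0])

# "constant" needs fill_value; -1 is an arbitrary illustrative choice
imp = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=-1)
filled_by_strategy["constant"] = float(imp.fit_transform(X)[-1, 0])

print(filled_by_strategy)
```

Here the mean of the observed values is 4.0, the median is 2.5, and the most frequent value is 1.0, so the missing entry is filled differently under each strategy.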
Commonly used methods
fit(X)
Returns the fitted SimpleImputer instance itself. fit(X) computes the per-column statistics of the X matrix (for example, the column means), which can then be used to fill the missing values of other matrices.
transform(X)
Fills in the missing values of X. The imputer must be fitted with fit() before transform() is called.
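from sklearn.impute import SimpleImputer
import numpy as np

# Complete matrix used to learn the column means
X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
# Matrix with missing values to be filled
X1 = np.array([[1, 2, np.nan],
               [4, np.nan, 6],
               [np.nan, 8, 9]])
imp = SimpleImputer(missing_values=np.nan, strategy='mean')
imp.fit(X)  # learns the column means of X
print(imp.transform(X1))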
Run result
[[1. 2. 6.]
[4. 5. 6.]
[4. 8. 9.]]
Because the imputer was fitted with fit(X) and strategy='mean', the fill values are the column means of the X matrix (4, 5 and 6), not those of X1.
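The values learned by fit() can also be inspected directly. The sketch below assumes the same two matrices as the example above and reads the fitted statistics_ attribute of the imputer.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
X1 = np.array([[1, 2, np.nan], [4, np.nan, 6], [np.nan, 8, 9]])

imp = SimpleImputer(missing_values=np.nan, strategy='mean')
imp.fit(X)

# statistics_ holds one fill value per column, learned from X
learned_means = imp.statistics_
print(learned_means)  # [4. 5. 6.]

# The same learned values are reused for any matrix passed to transform()
filled = imp.transform(X1)
```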
fit_transform(X)
Equivalent to fit() + transform().
X1 = np.array([[1, 2, np.nan],
               [4, np.nan, 6],
               [np.nan, 8, 9]])
imp = SimpleImputer(missing_values=np.nan, strategy='mean')
# Fit on X1 and fill it in one step, so the means come from X1 itself
print(imp.fit_transform(X1))
Run result
[[1. 2. 7.5]
[4. 5. 6. ]
[2.5 8. 9. ]]
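As a quick check of that equivalence, the following sketch runs both routes on the same matrix and compares the results. The variable names are illustrative only.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X1 = np.array([[1, 2, np.nan], [4, np.nan, 6], [np.nan, 8, 9]])

# One call that fits and transforms
one_step = SimpleImputer(strategy='mean').fit_transform(X1)
# Separate fit() and transform() calls on the same data
two_step = SimpleImputer(strategy='mean').fit(X1).transform(X1)

same = np.array_equal(one_step, two_step)
print(same)  # True
```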
get_params()
Returns the parameters of the SimpleImputer as a dictionary.
imp = SimpleImputer(missing_values=np.nan, strategy='mean')
print(imp.get_params())
Run result
{'add_indicator': False, 'copy': True, 'fill_value': None, 'missing_values': nan, 'strategy': 'mean', 'verbose': 0}
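Since the result is an ordinary dictionary, individual parameters can be read by key. The sketch below also uses set_params(), the counterpart that every scikit-learn estimator provides, to change a parameter afterwards. The exact set of keys depends on the installed scikit-learn version.

```python
from sklearn.impute import SimpleImputer

imp = SimpleImputer(strategy='mean')

# Read a single parameter from the dictionary
strategy_before = imp.get_params()['strategy']

# Change the parameter on the existing object, then read it back
imp.set_params(strategy='median')
strategy_after = imp.get_params()['strategy']

print(strategy_before, strategy_after)  # mean median
```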
inverse_transform(X)
Converts data back to its original representation by inverting the transform operation performed on an array. This method can only be called when the SimpleImputer was instantiated with add_indicator=True. Note: the inversion only applies to features that have a binary indicator for missing values. If a feature had no missing values at fit time, it gets no binary indicator, and any imputation done on it at transform time is not reverted. Simply put, only values that were actually flagged as missing and filled in are restored.
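X1 = np.array([[1, 2, np.nan],
               [4, np.nan, 6],
               [np.nan, 8, 9]])
imp = SimpleImputer(missing_values=np.nan, strategy='mean', add_indicator=True)
# The result holds the filled data followed by three indicator columns
X1 = imp.fit_transform(X1)
print(X1)
# The indicator columns tell inverse_transform which values to set back to nan
print(imp.inverse_transform(X1))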
Run result
[[1. 2. 7.5 0. 0. 1. ]
[4. 5. 6. 0. 1. 0. ]
[2.5 8. 9. 1. 0. 0. ]]
[[ 1. 2. nan]
[ 4. nan 6.]
[nan 8. 9.]]
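The note about features without an indicator can be illustrated with a small sketch. The matrices X_fit and X_new are hypothetical: the first column of X_fit has no missing values, so no indicator column is created for it, and a value imputed in that column later is not restored.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Column 0 is complete at fit time; columns 1 and 2 each have a missing value
X_fit = np.array([[1, 2, np.nan],
                  [4, np.nan, 6],
                  [7, 8, 9]])
imp = SimpleImputer(missing_values=np.nan, strategy='mean', add_indicator=True)
imp.fit(X_fit)

# Only columns 1 and 2 get an indicator
indicator_features = list(imp.indicator_.features_)

# New data with a missing value in column 0 and another in column 1
X_new = np.array([[np.nan, 2, 3],
                  [5, np.nan, 1]])
filled = imp.transform(X_new)             # 3 data columns + 2 indicator columns
restored = imp.inverse_transform(filled)

# Column 0 keeps the imputed mean (4.0); column 1 is set back to nan
print(restored)
```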
Custom value fill
With a user-defined fill_value:
X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
# Treat the value 1 as missing and replace it with 666
imp = SimpleImputer(missing_values=1, strategy='constant', fill_value=666)
print(imp.fit_transform(X))
Run result
[[666 2 3]
[4 5 6]
[7 8 9]]
With fill_value left at its default value, None:
X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
# Numerical data with fill_value=None: the value 1 is replaced by 0
imp = SimpleImputer(missing_values=1, strategy='constant', fill_value=None)
print(imp.fit_transform(X))
Run result
[[0 2 3]
[4 5 6]
[7 8 9]]
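For string or object data the default replacement is the string "missing_value" rather than 0. A minimal sketch, assuming a small object array X_str made up for illustration:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Object array holding strings, with np.nan marking the missing entry
X_str = np.array([['a', 'b'],
                  [np.nan, 'c']], dtype=object)

# fill_value is left at None, so the default for object data applies
imp = SimpleImputer(missing_values=np.nan, strategy='constant')
filled = imp.fit_transform(X_str)
print(filled)
```

The missing entry in the second row is expected to come back as "missing_value", while the other strings are left unchanged.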
If you are new to Python or want to get started with it, you can search on WeChat for [A new vision of Python].
Sometimes a simple problem keeps you stuck for a long time, and a pointer from someone else makes it suddenly clear. I sincerely hope that we can all make progress together.