Quick reference: sklearn.decomposition.PCA parameters


sklearn.decomposition.PCA(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)

Parameters


n_components

The number of principal components n to keep in the PCA algorithm, which is also the number of features retained after dimension reduction.

Values: int, float, or string. Defaults to None, in which case all components are kept. An int, such as n_components=1, reduces the raw data to that many dimensions. The string 'mle' chooses the number of components automatically using Minka's MLE. A float between 0 and 1 (with svd_solver='full') keeps as many components as are needed for the explained variance to exceed that fraction.
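As a rough illustration, the sketch below fits PCA with an int, a float fraction, and 'mle' as n_components and prints how many components each setting keeps. The array X and its per-feature scales are made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# with most of the variance concentrated in the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 0.5, 0.2, 0.1])

pca_int = PCA(n_components=2).fit(X)                         # keep exactly 2 components
pca_frac = PCA(n_components=0.95, svd_solver="full").fit(X)  # keep enough for 95% of the variance
pca_mle = PCA(n_components="mle", svd_solver="full").fit(X)  # let Minka's MLE choose

print(pca_int.n_components_)
print(pca_frac.n_components_, pca_frac.explained_variance_ratio_.sum())
print(pca_mle.n_components_)
```

The fitted attribute n_components_ reports the number that was actually kept, which is the value to check when n_components is not an int.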



whiten

Whether to whiten the output, so that each projected component has the same (unit) variance.

Values: bool. Defaults to False. Whitening is worth considering if further data processing follows the PCA dimension reduction.
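To see what whitening changes, the following sketch compares the variance of the projected data with and without whiten=True. The data is synthetic and only serves as an example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration) with very different feature scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4)) * np.array([10.0, 4.0, 1.0, 0.5])

Z_plain = PCA(n_components=2).fit_transform(X)
Z_white = PCA(n_components=2, whiten=True).fit_transform(X)

# Without whitening the components keep their own variances;
# with whitening each component is rescaled to unit variance.
print(Z_plain.var(axis=0, ddof=1))
print(Z_white.var(axis=0, ddof=1))
```

Whitening discards the relative scale of the components, which can help downstream estimators that assume comparable feature variances.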



svd_solver

The method used to compute the singular value decomposition (SVD).

Values:

'auto': the PCA class chooses automatically among the following three algorithms, trading off between them.

'randomized': suitable for PCA dimension reduction on large datasets with many dimensions, where only a small proportion of principal components is kept.

'full': SVD in the traditional sense, computed with the corresponding SciPy routine.

'arpack': calls SciPy's sparse SVD directly; its use cases are similar to those of 'randomized'.
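A minimal sketch of how the solvers compare on the same data: the example below fits PCA with 'full', 'arpack', and 'randomized' on synthetic low-rank data (assumed here for illustration) and prints the explained variances, which should agree closely.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 3 strong latent directions
# embedded in 20 features, plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3)) * np.array([10.0, 5.0, 2.0])
X = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(500, 20))

variances = {}
for solver in ("full", "arpack", "randomized"):
    # random_state only matters for the 'arpack' and 'randomized' solvers
    pca = PCA(n_components=3, svd_solver=solver, random_state=0).fit(X)
    variances[solver] = pca.explained_variance_
    print(solver, variances[solver])
```

Note that 'arpack' requires n_components to be strictly smaller than both the number of samples and the number of features.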



copy

Whether a copy of the original training data is made while the algorithm runs.

Values: bool. Defaults to True. If True, the original training data is unchanged after PCA runs, because the computation is performed on a copy. If False, the original training data may be overwritten after PCA runs, because the computation is performed on the original array.
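The effect of copy can be checked directly. The sketch below uses synthetic data and forces svd_solver='full'; whether and how the array is overwritten with copy=False may depend on the solver and the scikit-learn version, so treat the second result as an observation rather than a guarantee.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration) with a non-zero mean.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(50, 3))

# copy=True (the default): the input array is left untouched.
X_kept = X.copy()
PCA(n_components=2, copy=True).fit(X_kept)
unchanged = np.array_equal(X_kept, X)

# copy=False: the estimator is allowed to work on the input array in place.
X_inplace = X.copy()
PCA(n_components=2, copy=False, svd_solver="full").fit(X_inplace)
overwritten = not np.array_equal(X_inplace, X)

print("copy=True left X unchanged:", unchanged)
print("copy=False overwrote X:", overwritten)
```

With copy=False it is safer to call fit_transform(X) than fit(X).transform(X), since the array passed to transform may no longer hold the original values.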


tol

Stopping criterion for the solver. float, defaults to 0.0. When svd_solver is 'arpack', this is the error tolerance used when running the SVD algorithm.


iterated_power

int or str. Defaults to 'auto'. Number of iterations of the power method run when svd_solver is 'randomized'.


random_state

int or None. Defaults to None. Seed of the pseudorandom number generator; it is used when svd_solver is 'arpack' or 'randomized'.
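The solver-specific parameters can be sketched as follows. The data is synthetic, and the particular values of tol, iterated_power, and random_state are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))

# 'randomized' solver: iterated_power sets the number of power iterations,
# and a fixed random_state makes the result reproducible.
pca_a = PCA(n_components=2, svd_solver="randomized",
            iterated_power=3, random_state=42).fit(X)
pca_b = PCA(n_components=2, svd_solver="randomized",
            iterated_power=3, random_state=42).fit(X)
same = np.allclose(pca_a.components_, pca_b.components_)
print("same components with the same seed:", same)

# 'arpack' solver: tol is the tolerance passed to the iterative SVD.
pca_arpack = PCA(n_components=2, svd_solver="arpack",
                 tol=1e-6, random_state=42).fit(X)
print(pca_arpack.explained_variance_)
```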



Attributes

components_

The principal axes in feature space, that is, the directions of largest variance in the data, sorted by explained variance.

explained_variance_

Variance of each principal component after dimensionality reduction. The larger the variance, the more important the principal component.

explained_variance_ratio_

The proportion of the total variance accounted for by each principal component after dimensionality reduction. The larger the proportion, the more important the principal component.

singular_values_

Singular values corresponding to each of the selected components.

mean_

Empirical mean of each feature, estimated from the training set; equal to X.mean(axis=0).

n_components_

The number of components n that were kept.

n_features_

Number of features in the training data. (Recent scikit-learn releases expose this as n_features_in_.)

n_samples_

Number of samples in the training data.

noise_variance_

The estimated noise covariance.
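The fitted attributes can be inspected after calling fit. The sketch below uses synthetic data (assumed for illustration) and reads n_features_in_ rather than n_features_, since the latter is not available in recent scikit-learn releases.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration): 100 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2).fit(X)

print(pca.components_.shape)          # one row per kept component
print(pca.explained_variance_)        # variance along each component
print(pca.explained_variance_ratio_)  # share of the total variance
print(pca.singular_values_)
print(pca.mean_)                      # per-feature mean of the training data
print(pca.n_components_, pca.n_features_in_, pca.n_samples_)
print(pca.noise_variance_)

# Relationship between singular values and explained variance.
from_singular = pca.singular_values_ ** 2 / (pca.n_samples_ - 1)
```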


Methods

fit(self, X[, y])

Train the PCA model on the data X

fit_transform(self, X[, y])

Train the PCA model on X and return the data after dimension reduction
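A small sketch of the two training entry points on synthetic data (assumed for illustration): calling fit followed by transform should give the same projection as fit_transform, up to floating-point error.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))

# Two-step version: train first, then project.
Z_two_step = PCA(n_components=2).fit(X).transform(X)

# One-step version: train and project in a single call.
Z_one_step = PCA(n_components=2).fit_transform(X)

print(Z_one_step.shape)
print(np.allclose(Z_two_step, Z_one_step))
```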


get_covariance(self)

Compute the data covariance with the generative model

get_params(self[, deep])

Get the parameters of PCA


get_precision(self)

Compute the data precision matrix with the generative model
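The covariance and precision matrices of the fitted model are inverses of each other, which the sketch below checks on synthetic data (assumed for illustration) using the get_covariance and get_precision methods.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2).fit(X)

cov = pca.get_covariance()   # model covariance, shape (n_features, n_features)
prec = pca.get_precision()   # model precision, the inverse of the covariance

print(cov.shape, prec.shape)
print(np.allclose(cov @ prec, np.eye(4), atol=1e-8))
```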

inverse_transform(self, X)

Transform the dimension-reduced data back to the original space; the result may not be exactly the same as the original data
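To illustrate why the reconstruction may differ from the original, the sketch below (synthetic data, assumed for illustration) compares the reconstruction error when only some components are kept with the error when all of them are kept.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

# Keep 2 of 4 components: information in the dropped directions is lost.
pca_2 = PCA(n_components=2).fit(X)
X_rec_2 = pca_2.inverse_transform(pca_2.transform(X))
err_2 = np.mean((X - X_rec_2) ** 2)

# Keep all components: the round trip recovers X up to floating-point error.
pca_all = PCA().fit(X)
X_rec_all = pca_all.inverse_transform(pca_all.transform(X))
err_all = np.mean((X - X_rec_all) ** 2)

print(err_2, err_all)
```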

score(self, X[, y])

Return the average log-likelihood of all samples

score_samples(self, X)

Return the log-likelihood of each sample
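The relationship between the two scoring methods can be sketched as follows, on synthetic data assumed for illustration: score should equal the mean of the per-sample values from score_samples.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

pca = PCA(n_components=2).fit(X)

per_sample = pca.score_samples(X)  # one log-likelihood per sample
average = pca.score(X)             # a single averaged value

print(per_sample.shape)
print(average, per_sample.mean())
```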

set_params(self, **params)

Set the parameters of PCA
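A brief sketch of reading and updating parameters through get_params and set_params; the parameter values are arbitrary examples.

```python
from sklearn.decomposition import PCA

pca = PCA(n_components=2)

# get_params returns the constructor parameters as a dict.
params = pca.get_params()
print(params["n_components"], params["whiten"])

# set_params updates them in place and returns the estimator itself.
returned = pca.set_params(n_components=3, whiten=True)
print(pca.n_components, pca.whiten)
```

Changing parameters this way does not refit the model; fit has to be called again for the new settings to take effect.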

transform(self, X)

Transform the data X into its dimension-reduced form. Once the model is trained, the transform method can also be used to reduce the dimension of newly input data
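The sketch below applies a model trained on one synthetic array to a second, unseen one (both assumed for illustration), and compares the result with the projection computed by hand from mean_ and components_.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic training data and new data (assumptions for illustration).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
X_new = rng.normal(size=(10, 5))

pca = PCA(n_components=2).fit(X_train)

# Reduce the dimension of data the model has not seen before.
Z_new = pca.transform(X_new)

# Without whitening, this is centering followed by projection onto the components.
Z_manual = (X_new - pca.mean_) @ pca.components_.T

print(Z_new.shape)
print(np.allclose(Z_new, Z_manual))
```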
