class sklearn.ensemble.RandomForestClassifier(
        n_estimators=10,               # number of decision trees in the random forest
        criterion='gini',              # split criterion for the trees (CART uses Gini impurity: if P is the probability that two randomly drawn samples belong to the same class, the impurity is 1-P; the larger the value, the more mixed the node)
        max_depth=None,                # maximum depth of each tree
        min_samples_split=2,           # minimum number of samples required to split an internal node
        min_samples_leaf=1,            # minimum number of samples required at a leaf node
        min_weight_fraction_leaf=0.0,
        max_features="auto",
        max_leaf_nodes=None,
        min_impurity_decrease=0.0,
        min_impurity_split=None,
        bootstrap=True,
        oob_score=False,               # use out-of-bag samples to estimate generalization accuracy
        n_jobs=1,                      # number of parallel jobs; affects training speed, not model performance
        random_state=None,             # set for reproducible results
        verbose=0,
        warm_start=False,
        class_weight=None)
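For context, a minimal usage sketch of the class (the Iris dataset and the specific parameter values are illustrative choices, not from the original text):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Load a small benchmark dataset (illustrative)
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit a forest using the parameters discussed below
    clf = RandomForestClassifier(n_estimators=100, criterion='gini',
                                 max_depth=None, max_features='sqrt',
                                 oob_score=True, n_jobs=-1, random_state=0)
    clf.fit(X_train, y_train)

    print(clf.oob_score_)             # out-of-bag accuracy estimate
    print(clf.score(X_test, y_test))  # accuracy on held-out data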

1/ Parameter Introduction:

<1>n_estimators

Integer, optional (default=10). The number of trees in the forest. Simply put: how many trees vote on the prediction? A larger value generally improves the model, because more decision trees means more 'experts' participating in the decision, so the probability of error goes down. However, more trees also means more computation. There is a trade-off, but in general, bigger tends to be better. A quick comparison is shown after this paragraph.
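A simple sketch of the trade-off, comparing cross-validated accuracy and training time for a few forest sizes (the dataset and the values of n_estimators are illustrative):

    import time
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    for n in (10, 50, 200):
        clf = RandomForestClassifier(n_estimators=n, random_state=0)
        start = time.time()
        scores = cross_val_score(clf, X, y, cv=5)
        # more trees -> usually equal or better accuracy, but longer training
        print(n, round(scores.mean(), 3), round(time.time() - start, 2), 's')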

<2>criterion

String, optional (default="gini"). The function used to measure the quality of a split. The supported criteria are "gini" for Gini impurity and "entropy" for information gain. Note: this parameter is tree-specific.
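For a node with class proportions p_k, the Gini impurity is 1 - sum(p_k^2), i.e. the probability that two samples drawn at random from the node belong to different classes; entropy is -sum(p_k * log2(p_k)). A tiny sketch of both measures (illustrative helper functions, not part of scikit-learn's API):

    import numpy as np

    def gini(p):
        # Gini impurity for class proportions p: 1 - sum(p_k^2)
        p = np.asarray(p, dtype=float)
        return 1.0 - np.sum(p ** 2)

    def entropy(p):
        # Shannon entropy (in bits) for class proportions p
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    print(gini([0.5, 0.5]), entropy([0.5, 0.5]))  # most mixed node: 0.5, 1.0
    print(gini([1.0, 0.0]), entropy([1.0, 0.0]))  # pure node: 0.0, 0.0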

<3>max_features

Int, float, string or None, optional (default="auto"). The number of features to consider when looking for the best split. If int, max_features features are considered at each split. If float, max_features is a fraction, and int(max_features * n_features) features are considered at each split. If "auto", then max_features=sqrt(n_features), the square root of n_features. If "log2", then max_features=log2(n_features). If None, then max_features=n_features. Note: the search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features.
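As a concrete illustration, for a dataset with, say, 64 features, the settings work out roughly as follows (a sketch mirroring the rules above; the feature count is invented for the example):

    import math

    n_features = 64                       # illustrative number of features

    print(int(0.25 * n_features))         # float 0.25       -> 16 features per split
    print(int(math.sqrt(n_features)))     # "auto" / "sqrt"  -> 8 features per split
    print(int(math.log2(n_features)))     # "log2"           -> 6 features per split
    print(n_features)                     # None             -> all 64 features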

<4>max_depth

Integer or None, optional (default=None). The maximum depth of the tree. If None, nodes are expanded until all leaves are pure, or until all leaves contain fewer than min_samples_split samples.
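A quick sketch of the effect, inspecting the depth each fitted tree actually reached (assumes a reasonably recent scikit-learn; dataset and values are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)

    # With max_depth=None, trees grow until leaves are pure (or too small to split)
    clf = RandomForestClassifier(n_estimators=10, max_depth=None, random_state=0).fit(X, y)
    print([tree.get_depth() for tree in clf.estimators_])

    # With max_depth=3, every tree is capped at depth 3
    clf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0).fit(X, y)
    print([tree.get_depth() for tree in clf.estimators_])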

2/ Why random forests do not over-fit

When building each decision tree, two things matter: random sampling and complete splitting.

First, there are two random sampling steps: the random forest samples the input data both by row and by column. Row sampling is done with replacement, so the sample set drawn for a tree may contain duplicate rows. If the total number of input samples is N, then N samples are drawn for each tree. In this way every tree is trained on a sample of the same size as the original data, yet not on exactly the same samples, which already makes over-fitting harder. Column sampling then selects m features out of the M available features (m << M). Finally, a decision tree is built by splitting the sampled data completely, so that each leaf node either cannot be split any further or contains samples that all belong to the same class.

Normally, pruning is an important step in many decision tree algorithms, but it is not needed here: the randomness introduced by the two sampling steps means over-fitting does not occur even without pruning. Each tree in the random forest is weak, but together the trees are strong. The algorithm can be pictured like this: every decision tree is an expert in a narrow field (since each tree learns from a subset of m of the M features), so the random forest contains many experts, each proficient in a different area. A new problem (a new input) can then be looked at from different perspectives, and the final result is obtained by a vote of the individual experts.
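A minimal sketch of the two sampling steps described above, in plain NumPy (illustrative only; scikit-learn performs this internally, and the sizes N, M, m are invented for the example):

    import numpy as np

    rng = np.random.RandomState(0)
    N, M = 100, 16                  # N samples, M features (illustrative sizes)
    X = rng.randn(N, M)

    # Row sampling: draw N rows with replacement (bootstrap), so duplicates can appear
    row_idx = rng.randint(0, N, size=N)
    print(len(np.unique(row_idx)))  # typically only ~63% of the rows are unique

    # Column sampling: pick m << M features, e.g. m = sqrt(M)
    m = int(np.sqrt(M))
    col_idx = rng.choice(M, size=m, replace=False)

    # The data actually seen by one tree
    X_tree = X[row_idx][:, col_idx]
    print(X_tree.shape)             # (100, 4)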