Preface
Test environment:
1. Python 3.7
2. numpy >= '1.16.4'
3. sklearn >= '0.23.1'
I. Introduction to KNN
1.1 How KNN works
1.1.1: Given a test sample, calculate its distance to every sample in the training set.
1.1.2: Find the K training samples with the smallest distances. These are the nearest neighbors of the test sample.
1.1.3: Determine the category of the test sample from the categories of its K neighbors.
1.2 Classification decision
1.2.1 Majority voting: the minority yields to the majority. The category that occurs most often among the K neighbors becomes the category of the test sample.
1.2.2 Weighted voting: each neighbor's vote is weighted according to its calculated distance. The closer the neighbor, the greater the weight. The weight is the reciprocal of the squared distance.
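The three steps and the two voting rules above can be sketched in a few lines of numpy. This is only an illustrative sketch, not how sklearn implements KNN: the helper name knn_predict, the toy data, and the small eps term (added to avoid division by zero) are assumptions made for this example.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3, weighted=False):
    """Classify one sample x from its k nearest training samples (illustrative helper)."""
    # Step 1: Euclidean distance from x to every training sample
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    # Step 2: indices of the k closest training samples
    nearest = np.argsort(dists)[:k]
    if not weighted:
        # Step 3a: plain majority vote among the k neighbors
        return Counter(y_train[nearest].tolist()).most_common(1)[0][0]
    # Step 3b: each neighbor votes with weight 1 / d^2
    # (eps is an assumption of this sketch, to avoid dividing by zero)
    eps = 1e-12
    votes = {}
    for i in nearest:
        label = y_train[i].item()
        votes[label] = votes.get(label, 0.0) + 1.0 / (dists[i] ** 2 + eps)
    return max(votes, key=votes.get)

# Toy data: one class-0 point close to the query, two class-1 points farther away
X_train = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
y_train = np.array([0, 1, 1])
query = np.array([0.5, 0.5])

majority_vote = knn_predict(X_train, y_train, query, k=3)
weighted_vote = knn_predict(X_train, y_train, query, k=3, weighted=True)
print(majority_vote, weighted_vote)
```

On this toy data the two rules disagree: majority voting picks class 1 (two of the three neighbors), while weighted voting picks class 0 because the single class-0 neighbor is much closer to the query.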
II. Test process
2.1: Import library functions
2.2: Import data
2.3: Model training & visualization
III. KNN for classification
3.1 KNN Classification Demo:
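from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
# Import the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
print("X:", X)
print("y:", y)
# Split the data: 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Train the classifier (p=2 with the Minkowski metric is the Euclidean distance)
clf = KNeighborsClassifier(n_neighbors=5, p=2, metric="minkowski")
clf.fit(X_train, y_train)
# Predict on the test set and compute the accuracy
X_pred = clf.predict(X_test)
acc = sum(X_pred == y_test) / X_pred.shape[0]
print("Prediction accuracy acc: %.3f" % acc)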
Output of X (150 rows, 4 features each; the middle rows are omitted here):
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 ...
 [6.7 3.  5.2 2.3]
 [6.3 2.5 5.  1.9]
 [6.5 3.  5.2 2. ]
 [6.2 3.4 5.4 2.3]
 [5.9 3.  5.1 1.8]]
Output of y:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
3.2 Analysis of the KNN classification task
X is the input, stored as an array; y is the corresponding label. This is a classic three-class problem. To build your own dataset later, for example by reading data from a table (see juejin.cn/post/707996…), you only need to prepare the inputs and outputs; the rest is just assembling "building blocks."
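As a minimal sketch of the "building blocks" idea, the example below reads a small table into X and y and reuses the same classifier. The table contents and the column names f1, f2, and label are hypothetical, and an in-memory string stands in for a real file so that the example runs on its own.

```python
import io
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical table: two feature columns followed by a label column.
# In practice this text would come from a file on disk.
table = io.StringIO(
    "f1,f2,label\n"
    "1.0,1.1,0\n"
    "1.2,0.9,0\n"
    "0.9,1.0,0\n"
    "5.0,5.1,1\n"
    "5.2,4.9,1\n"
    "4.9,5.0,1\n"
)
data = np.genfromtxt(table, delimiter=",", skip_header=1)

# Inputs are every column except the last; labels are the last column
X = data[:, :-1]
y = data[:, -1].astype(int)

# From here on, the steps are the same as in the iris demo
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)
pred = clf.predict([[1.0, 1.0], [5.0, 5.0]])
print(pred)
```

Once X and y have the right shapes, nothing else in the pipeline needs to change.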
IV. KNN for regression
4.1 Demo: this demo is taken from the official sklearn website
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsRegressor
np.random.seed(0)
# Randomly generate 40 numbers in [0, 1), multiply by 5, and sort them
X = np.sort(5 * np.random.rand(40, 1), axis=0)
# Create an evenly spaced array of 500 points on [0, 5]
T = np.linspace(0, 5, 500)[:, np.newaxis]
# Use the sin function to get the y values
y = np.sin(X).ravel()
# Add noise to targets (every 5th y value gets noise)
y[::5] += 1 * (0.5 - np.random.rand(8))
# Fit regression model
# Set up several values of k
n_neighbors = [1, 3, 5, 8, 10, 40]
plt.figure(figsize=(10, 20))
for i, k in enumerate(n_neighbors):
    clf = KNeighborsRegressor(n_neighbors=k, p=2, metric="minkowski")
    # Train on the noisy samples, then predict on the dense grid T
    clf.fit(X, y)
    y_ = clf.predict(T)
    plt.subplot(6, 1, i + 1)
    plt.scatter(X, y, color='red', label='data')
    plt.plot(T, y_, color='navy', label='prediction')
    plt.axis('tight')
    plt.legend()
    plt.title("KNeighborsRegressor (k = %i)" % (k))
plt.tight_layout()
plt.show()
KNN regression analysis:
Several values of K are set for comparison. Comparing the resulting plots shows:
When K=1: the function overfits and generalizes poorly to new data.
When K=40: the function underfits and cannot follow the data.
When K=3, 5, 8, 10: the fit is reasonable. Judging from the plots, these four rank in fit and generalization as (K=3) > (K=5) > (K=8) > (K=10).
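One way to back up this visual comparison with numbers is to compute the mean squared error for each K. The sketch below regenerates the same data as the demo; measuring against the noise-free sin curve and the names train_mse and grid_mse are assumptions of this example, not part of the original demo.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Same data generation as in the demo
np.random.seed(0)
X = np.sort(5 * np.random.rand(40, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 1 * (0.5 - np.random.rand(8))

# Dense grid and the noise-free target used as a reference
T = np.linspace(0, 5, 500)[:, np.newaxis]
true_y = np.sin(T).ravel()

train_mse, grid_mse = {}, {}
for k in [1, 3, 5, 8, 10, 40]:
    model = KNeighborsRegressor(n_neighbors=k).fit(X, y)
    # Error on the (noisy) training samples themselves
    train_mse[k] = float(np.mean((model.predict(X) - y) ** 2))
    # Error on the grid against the noise-free sin curve
    grid_mse[k] = float(np.mean((model.predict(T) - true_y) ** 2))
    print("k=%2d  train MSE=%.4f  MSE vs sin(x)=%.4f" % (k, train_mse[k], grid_mse[k]))
```

With K=1 the training error is zero because every training point is its own nearest neighbor, so the model memorizes the noise. With K=40 every prediction is the mean of all 40 targets, so the error against the sin curve is large.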