This post is a code reproduction of the book *Statistical Learning Methods* [1] by Li Hang.
Author: Huang Haiguang [2]
Note: the code can be downloaded from GitHub [3].
I will also release the code on the WeChat official account "Machine Learning Beginners"; please follow it.
Code directory
- Chapter 1: Introduction to Statistical Learning Methods
- Chapter 2: Perceptron
- Chapter 3: k-Nearest Neighbors
- Chapter 4: Naive Bayes
- Chapter 5: Decision Tree
- Chapter 6: Logistic Regression
- Chapter 7: Support Vector Machines
- Chapter 8: Boosting Methods
- Chapter 9: EM Algorithm and Its Extensions
- Chapter 10: Hidden Markov Model
- Chapter 11: Conditional Random Field
- Chapter 12: Summary of Supervised Learning Methods
Code references: wzyonggege [4], WenDesi [5], tudaodiaozhale [6]
Chapter 2: Perceptron
1. The perceptron is a linear classification model that classifies input instances according to their feature vectors:

$$f(x) = \operatorname{sign}(w \cdot x + b)$$

The perceptron model corresponds to a separating hyperplane $w \cdot x + b = 0$ in the input space (feature space).
2. The learning strategy of the perceptron is to minimize the loss function:

$$L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$

where $M$ is the set of misclassified points. Since the distance from a point $x_0$ to the hyperplane is $\frac{1}{\|w\|}\,|w \cdot x_0 + b|$, this loss corresponds (up to the factor $\frac{1}{\|w\|}$) to the total distance from the misclassified points to the separating hyperplane.
3. The perceptron learning algorithm minimizes this loss function by stochastic gradient descent, and has both an original (primal) form and a dual form (a sketch of the dual form follows this list). The algorithm is simple and easy to implement. In the primal form, an arbitrary initial hyperplane is chosen, and the objective is then minimized by gradient descent: at each step one misclassified point is selected at random and the parameters are updated along the negative gradient at that point.
4. When the training data set is linearly separable, the perceptron learning algorithm converges. By Novikoff's theorem, the number of misclassifications $k$ of the perceptron algorithm on the training data set satisfies the inequality:

$$k \le \left(\frac{R}{\gamma}\right)^2$$

where $R$ is the largest norm among the training instances and $\gamma$ is the minimum margin to a separating hyperplane. When the training data set is linearly separable, the perceptron learning algorithm has infinitely many solutions, which differ depending on the initial values and on the order in which misclassified points are selected.
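The code later in this post implements only the primal form. As a complement, here is a minimal sketch of the dual form under the same setup; the function name `dual_perceptron` and its parameters are my own choices, not from the book's code, and it assumes features `X` and labels `y` in {-1, +1} as prepared below.

```python
import numpy as np

def dual_perceptron(X, y, l_rate=0.1, max_epochs=1000):
    """Dual-form perceptron (sketch): alpha[i] accumulates, scaled by the
    learning rate, how often point i has been misclassified."""
    n = len(X)
    gram = np.dot(X, X.T)  # precompute the Gram matrix of inner products
    alpha, b = np.zeros(n), 0.0
    for _ in range(max_epochs):
        wrong = 0
        for i in range(n):
            # the prediction uses only inner products, read from the Gram matrix
            if y[i] * (np.sum(alpha * y * gram[:, i]) + b) <= 0:
                alpha[i] += l_rate
                b += l_rate * y[i]
                wrong += 1
        if wrong == 0:  # no misclassified points left: converged
            break
    w = np.dot(alpha * y, X)  # recover primal weights w = sum_i alpha_i y_i x_i
    return w, b
```

Because only inner products of training points appear, the Gram matrix can be computed once up front, which is the practical advantage of the dual form.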
Binary classification model:

$$f(x) = \operatorname{sign}(w \cdot x + b)$$

Given a training set:

$$T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}, \quad y_i \in \{-1, +1\}$$

define the loss function of the perceptron:

$$L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$$

where $M$ is the set of misclassified points.

The algorithm is stochastic gradient descent: a misclassified point is selected at random and the parameters are moved along the negative gradient of the loss at that point:

$$w \leftarrow w + \eta\, y_i x_i, \qquad b \leftarrow b + \eta\, y_i$$

When an instance point is misclassified, i.e. lies on the wrong side of the separating hyperplane, the values of $w$ and $b$ are adjusted so that the hyperplane moves toward that misclassified point, until it is correctly classified.
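As a concrete one-step illustration of this update rule (the point, label, and learning rate below are made-up values, not from the book):

```python
import numpy as np

w, b, eta = np.array([1.0, 1.0]), 0.0, 0.1   # current parameters, learning rate
x_i, y_i = np.array([3.0, 3.0]), -1          # y_i * (w.x_i + b) = -6 <= 0: misclassified
if y_i * (np.dot(w, x_i) + b) <= 0:
    w = w + eta * y_i * x_i                  # w becomes [0.7, 0.7]
    b = b + eta * y_i                        # b becomes -0.1
print(w, b)                                   # [0.7 0.7] -0.1
```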
We take the first two classes of the iris data set, with [sepal length, sepal width] as features.
```python
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
%matplotlib inline

# load data
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['label'] = iris.target
df.columns = [
    'sepal length', 'sepal width', 'petal length', 'petal width', 'label'
]
df.label.value_counts()
```
```
2    50
1    50
0    50
Name: label, dtype: int64
```
```python
plt.scatter(df[:50]['sepal length'], df[:50]['sepal width'], label='0')
plt.scatter(df[50:100]['sepal length'], df[50:100]['sepal width'], label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
```
```python
data = np.array(df.iloc[:100, [0, 1, -1]])
X, y = data[:, :-1], data[:, -1]
y = np.array([1 if i == 1 else -1 for i in y])
```
Perceptron
```python
# Linearly separable, binary data
# The decision boundary here is a line in the two-feature plane
class Model:
    def __init__(self):
        self.w = np.ones(len(data[0]) - 1, dtype=np.float32)
        self.b = 0
        self.l_rate = 0.1
        # self.data = data

    def sign(self, x, w, b):
        y = np.dot(x, w) + b
        return y

    # stochastic gradient descent
    def fit(self, X_train, y_train):
        is_wrong = False
        while not is_wrong:
            wrong_count = 0
            for d in range(len(X_train)):
                X = X_train[d]
                y = y_train[d]
                if y * self.sign(X, self.w, self.b) <= 0:
                    self.w = self.w + self.l_rate * np.dot(y, X)
                    self.b = self.b + self.l_rate * y
                    wrong_count += 1
            if wrong_count == 0:
                is_wrong = True
        return 'Perceptron Model!'

    def score(self):
        pass
```
```python
perceptron = Model()
perceptron.fit(X, y)
```
```
'Perceptron Model!'
```
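The `score` method above is left unimplemented (`pass`). One plausible completion, assuming mean accuracy as the metric (the `X_test, y_test` signature is my choice, not the original's):

```python
def score(self, X_test, y_test):
    # hypothetical completion: fraction of points on the correct side
    preds = np.where(np.dot(X_test, self.w) + self.b >= 0, 1, -1)
    return np.mean(preds == y_test)

Model.score = score                  # patch onto the class defined above
print(perceptron.score(X, y))        # expect 1.0, since fit ran to convergence
```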
```python
x_points = np.linspace(4, 7, 10)
y_ = -(perceptron.w[0] * x_points + perceptron.b) / perceptron.w[1]
plt.plot(x_points, y_)
plt.plot(data[:50, 0], data[:50, 1], 'bo', color='blue', label='0')
plt.plot(data[50:100, 0], data[50:100, 1], 'bo', color='orange', label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
```
scikit-learn example
```python
from sklearn.linear_model import Perceptron

clf = Perceptron(fit_intercept=False, max_iter=1000, shuffle=False)
clf.fit(X, y)
```
```
Perceptron(alpha=0.0001, class_weight=None, early_stopping=False, eta0=1.0,
           fit_intercept=False, max_iter=1000, n_iter=None, n_iter_no_change=5,
           n_jobs=None, penalty=None, random_state=0, shuffle=False, tol=None,
           validation_fraction=0.1, verbose=0, warm_start=False)
```
```python
# Weights assigned to the features.
print(clf.coef_)
```
```
[[ 74.6 127.2]]
```
```python
# Intercept of the decision function.
print(clf.intercept_)
```
```
[0.]
```
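Once fitted, the scikit-learn `Perceptron` can be queried through the standard estimator API, for example:

```python
# predict labels and report mean accuracy on the training data
print(clf.predict(X[:5]))   # predicted labels for the first five samples
print(clf.score(X, y))      # mean accuracy; it may be well below 1.0 here,
                            # since fit_intercept=False forces the separating
                            # line through the origin, away from this data
```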
```python
x_points = np.arange(4, 8)
y_ = -(clf.coef_[0][0] * x_points + clf.intercept_) / clf.coef_[0][1]
plt.plot(x_points, y_)
plt.plot(data[:50, 0], data[:50, 1], 'bo', color='blue', label='0')
plt.plot(data[50:100, 0], data[50:100, 1], 'bo', color='orange', label='1')
plt.xlabel('sepal length')
plt.ylabel('sepal width')
plt.legend()
```
References

[1] Statistical Learning Methods: https://baike.baidu.com/item/统计学习方法/10430179
[2] Huang Haiguang: https://github.com/fengdu78
[3] GitHub: https://github.com/fengdu78/lihang-code
[4] wzyonggege: https://github.com/wzyonggege/statistical-learning-method
[5] WenDesi: https://github.com/WenDesi/lihang_book_algorithm
[6] tudaodiaozhale: https://blog.csdn.net/tudaodiaozhale