Background knowledge needed to read this article: the perceptron learning algorithm, and a little programming knowledge.
I. Introduction
In the previous article in this series, Machine Learning Algorithms (I) – Perceptron Learning Algorithm (PLA), we studied an algorithm that can perfectly divide a data set into two classes, but only under the premise that the data set is linearly separable. When data is collected in practice, it may contain erroneous points for a variety of reasons (for example, in spam filtering, an email may be manually mislabeled as spam or not spam), and the data set may then no longer be linearly separable, in which case the perceptron learning algorithm never stops. The pocket algorithm was therefore designed, on the basis of the perceptron learning algorithm, to handle linearly inseparable data.
II. Model Introduction
The pocket algorithm is a binary classification algorithm that divides a data set into two classes using a linear combination of the features, as shown in the figure below.
The algorithm is an improvement on the perceptron learning algorithm and shares its core idea of being mistake-driven: whenever the current result is better than the result kept in the "pocket", the pocketed result is replaced by the current one, so the pocket always holds the best result seen so far. In the end a relatively good answer is returned; hence the name pocket algorithm.
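Concretely, the hypothesis has the same form as in the perceptron learning algorithm (a standard formulation, consistent with the sign condition used in the algorithm steps below): the prediction is the sign of a linear combination of the features,

$$h(x) = \operatorname{sign}\left(w^{T} x\right)$$

where $w$ is the weight vector the algorithm learns and $x$ is a feature vector.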
III. Algorithm Steps
1. Initialize the weight vector $w$, for example to the zero vector.
2. Loop over $t = 0, 1, 2, \ldots$
   1. Find a random misclassified point $x_{n(t)}$, i.e. one where $h(x)$ does not match the target value $y$:
      $$\operatorname{sign}\left(w_{t}^{T} x_{n(t)}\right) \neq y_{n(t)}$$
   2. Try to correct the vector $w$ with the update below. If the updated $w$ misclassifies fewer points than before the update, keep it; otherwise go on to the next iteration.
      $$w_{t+1} \leftarrow w_{t} + y_{n(t)} x_{n(t)}$$
3. Exit the loop when the preset maximum number of iterations is reached; the resulting $w$ is the desired solution.
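To make the update step concrete, here is a worked example with made-up numbers. Suppose $w_{0} = (0, 0)^{T}$ and the randomly chosen misclassified point is $x_{n(0)} = (1, 2)^{T}$ with label $y_{n(0)} = +1$ (it is misclassified because $\operatorname{sign}(w_{0}^{T} x_{n(0)}) = \operatorname{sign}(0) \neq +1$). The update gives

$$w_{1} \leftarrow w_{0} + y_{n(0)} x_{n(0)} = (0, 0)^{T} + (+1)(1, 2)^{T} = (1, 2)^{T}$$

and $w_{1}$ replaces the $w_{0}$ kept in the pocket only if it misclassifies no more points on the full training set.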
As can be seen from the steps above, since we do not know in advance when the loop should stop, we must artificially set a maximum number of iterations as the exit condition. Moreover, every iteration has to recount the misclassified points over the entire training set in order to compare with the result in the pocket, so the pocket algorithm runs slower than the perceptron learning algorithm. Finally, because the misclassified point is chosen at random in each iteration, the output is not the same from one run to the next.
IV. Code Implementation
Implementing the pocket algorithm in Python:
```python
import numpy as np


def errorIndexes(w, X, y):
    """Get the index set of misclassified points.
    args:
        w - weight vector
        X - training data set
        y - target label values
    return:
        errorIndexes - index set of misclassified points
    """
    errorIndexes = []
    # Walk through the training data set
    for index in range(len(X)):
        x = X[index]
        # Check whether the prediction disagrees with the target value
        if x.dot(w) * y[index] <= 0:
            errorIndexes.append(index)
    return errorIndexes


def pocket(X, y, iteration, maxIterNoChange=10):
    """Pocket algorithm implementation.
    args:
        X - training data set
        y - target label values
        iteration - maximum number of iterations
        maxIterNoChange - number of non-improving iterations before stopping early
    return:
        w - weight vector
    """
    np.random.seed(42)
    # Initialize the weight vector
    w = np.zeros(X.shape[1])
    # Get the index set of misclassified points
    errors = errorIndexes(w, X, y)
    iterNoChange = 0
    # Main loop
    for i in range(iteration):
        # Stop if nothing is misclassified (the data turned out to be separable)
        if len(errors) == 0:
            break
        iterNoChange = iterNoChange + 1
        # Pick a misclassified point at random
        errorIndex = np.random.randint(0, len(errors))
        # Compute the tentative weight vector
        tmpw = w + y[errors[errorIndex]] * X[errors[errorIndex]]
        # Get the index set of misclassified points under the tentative weights
        tmpErrors = errorIndexes(tmpw, X, y)
        # If the number of misclassified points does not increase, update the weights
        if len(errors) >= len(tmpErrors):
            iterNoChange = 0
            # Put the better result in the pocket
            w = tmpw
            errors = tmpErrors
        # Stop early
        if iterNoChange >= maxIterNoChange:
            break
    return w
```
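A quick usage sketch follows. The tiny data set here is made up purely for illustration (two features plus a constant bias feature); any labeled point set in the same layout would work:

```python
import numpy as np

# Hypothetical toy data: two features plus a constant bias feature
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [3.0, 1.0, 1.0],
              [4.0, 2.0, 1.0]])
y = np.array([1, 1, -1, -1])

# Run the pocket algorithm for at most 100 iterations
w = pocket(X, y, iteration=100)
print("weights:", w)
print("misclassified points:", errorIndexes(w, X, y))
```

Note that because pocket fixes the random seed internally with np.random.seed(42), repeated runs on the same data produce the same result; removing that call restores the run-to-run variability described in the previous section.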
V. Animation Demonstration
Classification of a simple training data set:
Classification of a complex training data set:
VI. Mind Map
VII. References
- zh.wikipedia.org/wiki/%E6%84…
- www.coursera.org/learn/ntuml…
The full demo can be found here.
Note: This article strives to be accurate and easy to understand, but as the author is a beginner, his level is limited. If there are any errors or omissions in the article, readers are encouraged to point them out in the comments.