
Linearly separable data sets

As we showed in the previous chapter of our machine learning tutorial, a neural network consisting of a single perceptron was sufficient to separate our example classes. Of course, we carefully designed those classes to make that possible; there are many clusters of classes for which it will not work. We will look at some further examples and discuss cases in which classes cannot be separated in this way.

Our classes were linearly separable. Linear separability makes sense in Euclidean geometry: two sets of points (or classes) are called linearly separable if there is at least one straight line in the plane such that all points of one class are on one side of the line and all points of the other class are on the other side.

More formally:

If two data clusters (classes) can be separated by a decision boundary in the form of a linear equation

$\sum_{i=1}^{n} w_i \cdot x_i = 0$

they are called linearly separable.

Otherwise, i.e. if no such decision boundary exists, the two classes are called linearly inseparable. In this case, we cannot use a simple neural network.

In our next example, we will write a neural network in Python that implements the logical AND function. It is defined for two inputs as follows:
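Input1  Input2  Output
0       0       0
0       1       0
1       0       0
1       1       1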

We learned in the previous chapter that a neural network with one perceptron and two input values can be interpreted as a decision boundary, i.e. a straight line dividing two classes. The two classes we want to classify in our example look like this:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.4
X = np.arange(xmin, xmax, 0.1)
ax.scatter(0, 0, color="r")
ax.scatter(0, 1, color="r")
ax.scatter(1, 0, color="r")
ax.scatter(1, 1, color="g")
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
m = -1
#ax.plot(X, m * X + 1.2, label="decision boundary")
plt.plot()

Output:

We also found that such a primitive neural network can only create straight lines through the origin. Candidate dividing lines therefore look like this:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.4
X = np.arange(xmin, xmax, 0.1)
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
# draw a bundle of lines through the origin with different slopes
for m in np.arange(0, 6, 0.1):
    ax.plot(X, m * X)
ax.scatter(0, 0, color="r")
ax.scatter(0, 1, color="r")
ax.scatter(1, 0, color="r")
ax.scatter(1, 1, color="g")
plt.plot()

Output:

We can see that none of these lines can be used as a decision boundary, and neither can any other line through the origin.

We need a line of the form

$y = m \cdot x + c$

where the intercept $c$ is not equal to 0. For example, the line

$y = -x + 1.2$

can be used as a separating line for our problem:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.4
X = np.arange(xmin, xmax, 0.1)
ax.scatter(0, 0, color="r")
ax.scatter(0, 1, color="r")
ax.scatter(1, 0, color="r")
ax.scatter(1, 1, color="g")
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
m, c = -1, 1.2
ax.plot(X, m * X + c)
plt.plot()

Output:

The question now is, can we find a solution with a slight modification of the network model? Or to put it another way: Can we create a perceptron that defines arbitrary decision boundaries?

The solution involves adding bias nodes.

A single perceptron with a bias

A perceptron with two input values and a bias corresponds to a general straight line. With the aid of the bias value b, we can train the perceptron to determine a decision boundary with a non-zero intercept c.

While the input values can change, the bias value always remains constant. Only the weight of the bias node is adjusted during training.
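A minimal sketch of what this means in code: the bias is simply appended to the input vector as one more constant input, and its weight is learned like any other weight. The weight values below are made up for illustration; the Perceptron class further down does the same thing with np.concatenate.

import numpy as np

x = np.array([0.4, 0.9])                   # the two "real" input values
bias = 1                                   # the bias node: a constant input
x_with_bias = np.concatenate((x, [bias]))  # -> [0.4, 0.9, 1.0]

weights = np.array([0.5, 0.25, -0.45])     # hypothetical weights; the last one belongs to the bias
weighted_sum = weights @ x_with_bias       # the value the perceptron passes to its step function
print(weighted_sum)                        # negative here, so the unit step would output 0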

With the bias, the linear equation of the perceptron now looks like this:

$\sum_{i=1}^{n} w_i \cdot x_i + w_{n+1} \cdot b = 0$

In our example, it looks like this:

$w_1 \cdot x_1 + w_2 \cdot x_2 + w_3 \cdot b = 0$

This is equivalent to

$x_2 = -\frac{w_1}{w_2} \cdot x_1 - \frac{w_3}{w_2} \cdot b$

This means:

$m = -\frac{w_1}{w_2}$

and

$c = -\frac{w_3}{w_2} \cdot b$
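A quick numerical check of these two formulas, using made-up (not trained) weight values:

w1, w2, w3, b = 0.5, 0.25, -0.45, 1
m = -w1 / w2      # slope: -2.0
c = -w3 / w2 * b  # intercept: 1.8
print(m, c)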

import numpy as np
from collections import Counter

class Perceptron:

    def __init__(self, weights, bias=1, learning_rate=0.3):
        """
        'weights' can be a numpy array, list or tuple with the actual
        values of the weights. The number of input values is derived
        from the length of 'weights'.
        """
        self.weights = np.array(weights)
        self.bias = bias
        self.learning_rate = learning_rate

    @staticmethod
    def unit_step_function(x):
        if x <= 0:
            return 0
        else:
            return 1

    def __call__(self, in_data):
        # append the bias as a constant extra input
        in_data = np.concatenate((in_data, [self.bias]))
        result = self.weights @ in_data
        return Perceptron.unit_step_function(result)

    def adjust(self, target_result, in_data):
        if type(in_data) != np.ndarray:
            in_data = np.array(in_data)
        calculated_result = self(in_data)
        error = target_result - calculated_result
        if error != 0:
            in_data = np.concatenate((in_data, [self.bias]))
            correction = error * in_data * self.learning_rate
            self.weights += correction

    def evaluate(self, data, labels):
        evaluation = Counter()
        for sample, label in zip(data, labels):
            result = self(sample)  # prediction
            if result == label:
                evaluation["correct"] += 1
            else:
                evaluation["wrong"] += 1
        return evaluation

Let’s assume that the Python code above with the Perceptron class is stored in your current working directory as “perceptrons.py”.

import numpy as np
from perceptrons import Perceptron

def labelled_samples(n):
    for _ in range(n):
        s = np.random.randint(0, 2, (2,))
        yield (s, 1) if s[0] == 1 and s[1] == 1 else (s, 0)

p = Perceptron(weights=[0.3, 0.3, 0.3],
               learning_rate=0.2)

for in_data, label in labelled_samples(30):
    p.adjust(label, in_data)

test_data, test_labels = list(zip(*labelled_samples(30)))

evaluation = p.evaluate(test_data, test_labels)
print(evaluation)

Output:

Counter({'correct': 30})

We can visualize the decision boundary that the perceptron has learned:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.4
X = np.arange(xmin, xmax, 0.1)
ax.scatter(0, 0, color="r")
ax.scatter(0, 1, color="r")
ax.scatter(1, 0, color="r")
ax.scatter(1, 1, color="g")
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
m = -p.weights[0] / p.weights[1]
c = -p.weights[2] / p.weights[1]
print(m, c)
ax.plot(X, m * X + c)
plt.plot()

Output:

-3.0000000000000004 3.0000000000000013

We will create another example with linearly separable data sets, which need a bias node to be separable. We will use the make_blobs function from sklearn.datasets:

from sklearn.datasets import make_blobs

n_samples = 250
samples, labels = make_blobs(n_samples=n_samples,
                             centers=([2.5, 3], [6.7, 7.9]),
                             random_state=0)

Let's visualize the previously created data:

import matplotlib.pyplot as plt

colours = ('green', 'magenta', 'blue', 'cyan', 'yellow', 'red')
fig, ax = plt.subplots()

for n_class in range(2):
    ax.scatter(samples[labels == n_class][:, 0],
               samples[labels == n_class][:, 1],
               c=colours[n_class], s=40, label=str(n_class))

n_learn_data = int(n_samples * 0.8)  # 80 % of the available data points
learn_data, test_data = samples[:n_learn_data], samples[-n_learn_data:]
learn_labels, test_labels = labels[:n_learn_data], labels[-n_learn_data:]

from perceptrons import Perceptron

p = Perceptron(weights=[0.3, 0.3, 0.3],
               learning_rate=0.8)

for sample, label in zip(learn_data, learn_labels):
    p.adjust(label, sample)

evaluation = p.evaluate(learn_data, learn_labels)
print(evaluation)

Output:

Counter({'correct': 200})

Let's visualize the decision boundary:

fig, ax = plt.subplots()

# plotting the learn data
colours = ('green', 'blue')
for n_class in range(2):
    ax.scatter(learn_data[learn_labels == n_class][:, 0],
               learn_data[learn_labels == n_class][:, 1],
               c=colours[n_class], s=40, label=str(n_class))

# plotting the test data
colours = ('lightgreen', 'lightblue')
for n_class in range(2):
    ax.scatter(test_data[test_labels == n_class][:, 0],
               test_data[test_labels == n_class][:, 1],
               c=colours[n_class], s=40, label=str(n_class))

X = np.arange(np.max(samples[:, 0]))
m = -p.weights[0] / p.weights[1]
c = -p.weights[2] / p.weights[1]
print(m, c)
ax.plot(X, m * X + c)
plt.plot()
plt.show()

Output:

1.5513529034664024 11.736643489707035

In the next section, we will introduce the XOR problem for neural networks. It is the simplest example of a problem that is not linearly separable. It can be solved by adding an extra layer of neurons, the so-called hidden layer.

The XOR problem for neural networks

The XOR (exclusive or) function is defined by the following truth table:
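Input1  Input2  Output
0       0       0
0       1       1
1       0       1
1       1       0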

This problem cannot be solved by a simple neural network, as shown in the figure below:

No matter which straight line you choose, you will never succeed in having the blue points on one side and the orange points on the other. This is shown in the figure: the orange points lie on an orange line, so that line cannot be a dividing line. If we move this line parallel in either direction, there will always be two orange points and one blue point on one side and only one blue point on the other. If we move the orange line in a non-parallel way, there will be one blue and one orange point on each side, unless the line passes through an orange point. So there is no way to separate these points with a single straight line.
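The same conclusion can be reached algebraically. This short derivation uses the perceptron form introduced above, $w_1 \cdot x_1 + w_2 \cdot x_2 + w_3 \cdot b$ with $b = 1$, and the unit step function that fires only for a positive weighted sum. If a single perceptron computed XOR, the four rows of the truth table would require

$w_3 \le 0$ (input (0, 0) must give 0),
$w_1 + w_3 > 0$ (input (1, 0) must give 1),
$w_2 + w_3 > 0$ (input (0, 1) must give 1),
$w_1 + w_2 + w_3 \le 0$ (input (1, 1) must give 0).

Adding the second and third inequalities gives $w_1 + w_2 + 2 w_3 > 0$, i.e. $w_1 + w_2 + w_3 > -w_3 \ge 0$, which contradicts the fourth condition. So no single perceptron can compute XOR.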

To solve this problem, we need to introduce a new kind of neural network, one with a so-called hidden layer. The hidden layer allows the network to reorganize or rearrange input data.

All we need is a hidden layer with two neurons. One works like an AND gate and the other like an OR gate. The output will "fire" when the OR gate fires while the AND gate does not.

As we already mentioned, we cannot find a single line that separates the orange points from the blue ones. But they can be separated by two lines, such as L1 and L2 in the picture below:

To solve this problem, we need a network of the following kind, i.e. with a hidden layer consisting of the neurons N1 and N2:

Neuron N1 will determine one of the lines, say L1, and neuron N2 will determine the other line, L2. N3 will finally solve our problem:
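A minimal sketch with hand-picked (not trained) weights shows why two hidden neurons are enough. The unit step function and the bias handling mirror the Perceptron class above; these weights are chosen by hand purely for illustration.

import numpy as np

def step(x):
    # unit step function: fires (returns 1) only for a positive weighted sum
    return 1 if x > 0 else 0

def neuron(weights, bias_weight, inputs):
    return step(np.dot(weights, inputs) + bias_weight)

def xor_net(x1, x2):
    n1 = neuron([1, 1], -0.5, [x1, x2])     # N1 behaves like an OR gate
    n2 = neuron([1, 1], -1.5, [x1, x2])     # N2 behaves like an AND gate
    return neuron([1, -1], -0.5, [n1, n2])  # N3 fires if OR fires but AND does not

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, '->', xor_net(x1, x2))    # prints 0, 1, 1, 0: the XOR truth table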

Implementing this in Python will have to wait for the next chapter in our machine learning tutorial.

Exercises

Exercise 1

We can extend the logical AND to floating point values between 0 and 1 in the following way: a sample belongs to class 1 if both input values are at least 0.5, and to class 0 otherwise.

Try training a neural network with only one perceptron. Why doesn’t it work?

Exercise 2

A point belongs to class 0 if x1 < 0.5 and belongs to class 1 if x1 >= 0.5. Train a network with one perceptron to classify arbitrary points. What do you notice about the decision boundary? What about the input value x2?

Solutions to the exercises

The solution to the first exercise

import numpy as np
from perceptrons import Perceptron

p = Perceptron(weights=[0.3, 0.3, 0.3],
               bias=1,
               learning_rate=0.2)

def labelled_samples(n):
    for _ in range(n):
        s = np.random.random((2,))
        yield (s, 1) if s[0] >= 0.5 and s[1] >= 0.5 else (s, 0)

for in_data, label in labelled_samples(30):
    p.adjust(label, in_data)

test_data, test_labels = list(zip(*labelled_samples(60)))

evaluation = p.evaluate(test_data, test_labels)
print(evaluation)

Output:

Counter({'correct': 52, 'wrong': 8})

The easiest way to see why it doesn’t work is to visualize the data.

import matplotlib.pyplot as plt

ones = [test_data[i] for i in range(len(test_data)) if test_labels[i] == 1]
zeroes = [test_data[i] for i in range(len(test_data)) if test_labels[i] == 0]

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.2
X, Y = list(zip(*ones))
ax.scatter(X, Y, color="g")
X, Y = list(zip(*zeroes))
ax.scatter(X, Y, color="r")
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
c = -p.weights[2] / p.weights[1]
m = -p.weights[0] / p.weights[1]
X = np.arange(xmin, xmax, 0.1)
ax.plot(X, m * X + c, label="decision boundary")

Output:

[<matplotlib.lines.Line2D at 0x7fba8a295790>]

We can see that the green points and the red points cannot be separated by a single straight line.

The solution to the second exercise

import numpy as np
from collections import Counter
from perceptrons import Perceptron

def labelled_samples(n):
    for _ in range(n):
        s = np.random.random((2,))
        yield (s, 0) if s[0] < 0.5 else (s, 1)

p = Perceptron(weights=[0.3, 0.3, 0.3],
               learning_rate=0.4)

for in_data, label in labelled_samples(300):
    p.adjust(label, in_data)

test_data, test_labels = list(zip(*labelled_samples(500)))

print(p.weights)
p.evaluate(test_data, test_labels)

Output:

[ 2.22622234 -0.05588858 -0.9       ]
Counter({'correct': 460, 'wrong': 40})

ones = [test_data[i] for i in range(len(test_data)) if test_labels[i] == 1]
zeroes = [test_data[i] for i in range(len(test_data)) if test_labels[i] == 0]

fig, ax = plt.subplots()
xmin, xmax = -0.2, 1.2
X, Y = list(zip(*ones))
ax.scatter(X, Y, color="g")
X, Y = list(zip(*zeroes))
ax.scatter(X, Y, color="r")
ax.set_xlim([xmin, xmax])
ax.set_ylim([-0.1, 1.1])
c = -p.weights[2] / p.weights[1]
m = -p.weights[0] / p.weights[1]
X = np.arange(xmin, xmax, 0.1)
ax.plot(X, m * X + c, label="decision boundary")

Output:

[<matplotlib.lines.Line2D at 0x7fba8a1fbac0>]

p.weights, m

Output:

(array([ 2.22622234, -0.05588858, -0.9       ]), 39.83322163376969)

The slope m has to become larger and larger in cases like this: the ideal decision boundary is the vertical line x1 = 0.5, and the input value x2 has no influence on the class.