1 Task

First, let's talk about the learning task for the network we are going to build: we want our neural network to learn the logical XOR operation. XOR is commonly summarized as "same inputs give 0, different inputs give 1". Concretely, we need to build a neural network that outputs 0 when we input (1, 1), outputs 1 when we input (1, 0), and so on.
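The full truth table we want the network to learn:

(0, 0) -> 0
(0, 1) -> 1
(1, 0) -> 1
(1, 1) -> 0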

2 Implementation Ideas

Since the task has two inputs and one output, we need two nodes in the input layer and one node in the output layer. Because the problem is relatively simple, 10 nodes in the hidden layer are enough to get a good result. We use the ReLU function as the activation for the hidden layer and the Sigmoid function for the output layer, which keeps the output in the range 0 to 1. If the output is greater than 0.5, we round it to 1; if it is less than 0.5, we round it to 0. A small sketch of that rounding step follows.
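As a minimal sketch of the final thresholding step (the function name round_output is just an illustration I added, not part of the network):

def round_output(value):
    # Map the Sigmoid output in (0, 1) to a hard 0/1 prediction
    return 1 if value > 0.5 else 0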

3 Implementation Process

We use the simple, quick build method that nn.Sequential provides.

3.1 Introducing necessary libraries

import torch
import torch.nn as nn
import numpy as np

To use PyTorch you of course need to import the torch package, and we alias torch.nn as nn for coding convenience. nn is short for neural network; it is the package used to build neural networks. NumPy is imported so we can create the input matrices.

3.2 Creating a Training Set

# Build input set
x = np.mat('0 0; '
           '0 1; '
           '1 0; '
           '1 1')
x = torch.tensor(x).float()
# Build label set: the XOR of each input row
y = np.mat('0; '
           '1; '
           '1; '
           '0')
y = torch.tensor(y).float()

Personally, I prefer the np.mat way of constructing a matrix because it feels easy to write, but you can use other methods too. Whatever you use, after building the matrix you have to do this: torch.tensor(x).float(), converting the input you created into a tensor variable.

What is a tensor? You can think of it as PyTorch's basic variable type: you have to convert your data into tensors to use the PyTorch framework. Our neural network also requires the inputs and outputs to be floating point, meaning float-typed tensors. The input created with np.mat holds integers, and torch.tensor will automatically produce an integer tensor from it, so we append .float() at the end to convert it to floating point.
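A quick illustration of the dtypes involved (the exact integer dtype can vary by platform; torch.int64 is typical):

a = np.mat('0 0; 0 1')
t = torch.tensor(a)          # integer tensor, typically dtype torch.int64
f = torch.tensor(a).float()  # floating-point tensor, dtype torch.float32
print(t.dtype, f.dtype)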

So we have built the input and output (the x matrix and the y matrix, respectively). x is a matrix with four rows and two columns; each row is one input of two values, and together the rows list all possible input cases. The output y is a matrix with four rows and one column; each row is the output corresponding to the same row of the x matrix.

3.3 Setting up the Network

myNet = nn.Sequential(
    nn.Linear(2, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
    nn.Sigmoid()
    )
print(myNet)

Output results:

Sequential(
  (0): Linear(in_features=2, out_features=10, bias=True)
  (1): ReLU()
  (2): Linear(in_features=10, out_features=1, bias=True)
  (3): Sigmoid()
)

We build the network with Sequential from the nn package; this function lets us assemble a neural network like stacking building blocks.

nn.Linear(2, 10) builds the first layer, where 2 is the number of input nodes and 10 is the number of output nodes. Linear means the layer applies no activation function of its own: it outputs a plain linear transformation of what you put in. nn.ReLU() then adds an activation layer that passes the values through the ReLU function. After that come another Linear layer and the Sigmoid. The numbers 2, 10, and 1 are the node counts of each layer, simple and clear.
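For comparison, the same network can also be written as a subclass of nn.Module, the more general style you will see in larger projects. This is just a sketch of the equivalent, not what this article uses:

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(2, 10)   # input layer -> hidden layer
        self.out = nn.Linear(10, 1)      # hidden layer -> output layer

    def forward(self, x):
        x = torch.relu(self.hidden(x))     # ReLU activation on the hidden layer
        return torch.sigmoid(self.out(x))  # Sigmoid keeps the output in (0, 1)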

3.4 Setting the Optimizer

optimizer = torch.optim.SGD(myNet.parameters(), lr=0.05)
loss_func = nn.MSELoss()

The idea behind this step is that you need an optimization method to train your network, so here we set up the one we will use.

torch.optim.SGD means we train with SGD (stochastic gradient descent). All you need to do is pass in your network's parameters and the learning rate: myNet.parameters() and lr, respectively. The loss_func line sets the cost function; since our problem is relatively simple, we use MSELoss, the mean squared error cost function.
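If you want to experiment, other choices plug in the same way. For example, Adam with a binary cross-entropy loss is a common alternative for a 0/1 target; this is shown only as an option (with an assumed lr of 0.01), not what this article uses:

optimizer = torch.optim.Adam(myNet.parameters(), lr=0.01)
loss_func = nn.BCELoss()  # valid here because the Sigmoid output is already in (0, 1)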

3.5 Training Network

for epoch in range(5000):
    out = myNet(x)
    loss = loss_func(out, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

I set up a loop of 5000 iterations here (it probably does not need that many) so the training step runs 5000 times. In each iteration, out = myNet(x) throws the input into the network and gives you the output out. Then the cost function compares out with your expected output y to compute the error. The zero_grad step clears the gradients left over from the previous iteration; at first you can just memorize this step without worrying too much about why. loss.backward() propagates the error backward, and optimizer.step() then lets the optimizer we just configured do its work.
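If you want to watch the training converge, a variation of the same loop prints the loss every few hundred iterations (purely optional; the print interval of 500 is an arbitrary choice):

for epoch in range(5000):
    out = myNet(x)
    loss = loss_func(out, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 500 == 0:
        print(f'epoch {epoch}: loss = {loss.item():.6f}')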

3.6 Testing the Network

print(myNet(x).data)

Running result: a 4x1 tensor whose four values are close to 0, 1, 1, 0 (the exact numbers vary from run to run).

You can see that this is very close to what we expect, and if you change the input data the results will follow the same pattern. Here is a quick explanation of why we append .data to the code: a tensor variable actually has two parts, the tensor's data and the autograd information attached to it. Appending .data means we take only the data held in the tensor; without it, the printed output also carries the autograd metadata (a grad_fn entry).
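To turn the raw outputs into the hard 0/1 answers described in the task, you can threshold them at 0.5. A small sketch (the comparison against y is just a sanity check I added):

with torch.no_grad():
    predictions = (myNet(x) > 0.5).float()  # round each output to 0 or 1
print(predictions)                       # should match y: 0, 1, 1, 0
print((predictions == y).all().item())   # True if every case was learned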