Deep learning 001 - The basic unit of neural networks: the perceptron
(Python libraries and versions used in this article: Python 3.6, NumPy 1.14, scikit-learn 0.19, Matplotlib 2.2)
In the field of artificial intelligence, deep learning has emerged as the preferred solution for large and complex problems. Deep learning differs from traditional machine learning in that it builds models with neural networks that mimic the human brain. Early shallow neural networks could be regarded as just one branch of machine learning, but as networks have grown deeper and their parameters more numerous, they have become increasingly capable of solving practical problems. For this reason neural networks, and deep neural networks in particular, are now treated as a separate branch: deep learning.
1. Introduction to perceptron
To better understand neural networks, let’s first take a look at their basic building block, the neuron. “Neuron” is a biological term; in computer science, the corresponding model is called a perceptron.
The perceptron is an early neural network model proposed by the American scholar F. Rosenblatt in 1957. It introduced the concept of learning for the first time, allowing the learning function of the human brain to be simulated mathematically to a certain extent, and so it attracted extensive attention.
Perceptron structure:
1. Input: a perceptron has one or more inputs that receive feature values.
2. Weight: each input has a weight, representing that input’s influence on the result.
3. Bias: although the weights capture each input’s influence on the result, to fit the data better it is sometimes necessary to shift the weighted sum up or down as a whole; this shift is the bias.
4. Output: one or more outputs are produced after the calculation.
Generally speaking, the perceptron behaves like a somewhat complicated function y = f(u, v), where u and v are the input feature values, f represents the calculation combining weights and bias, and y is the output. So the perceptron can be represented by the following formula:

y = f(w1*x1 + w2*x2 + … + wn*xn + θ)

where x represents the inputs, w the weights, θ (theta) the bias, f the activation function, and y the output. So it is perfectly reasonable to think of the perceptron as a somewhat complicated function.
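To make this concrete, here is a minimal sketch of a single perceptron’s forward calculation in plain NumPy. The input values, weights, and bias are made-up numbers for illustration, and a step function is used as the activation, a common choice for the classic perceptron.

import numpy as np

def perceptron_output(x, w, theta):
    # Weighted sum of the inputs plus the bias: w1*x1 + ... + wn*xn + theta
    z = np.dot(w, x) + theta
    # Step activation: fire (1) if the weighted sum is positive, else 0
    return 1 if z > 0 else 0

x = np.array([0.5, -1.0])  # input feature values (hypothetical)
w = np.array([0.8, 0.3])   # one weight per input (hypothetical)
theta = 0.1                # bias (hypothetical)
print(perceptron_output(x, w, theta))  # -> 1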
2. Perceptron training
If we want to fit a series of data points with a line whose equation is y = a*x + b, then all we need to do is calculate the most suitable a and b. Similarly, the perceptron’s function has two kinds of unknown variables (weights and bias), so what we need to do is calculate the most appropriate weights and bias, and this calculation process is the training of the perceptron.
Perceptron training is based on the perceptron rule. The perceptron convergence theorem states that if the training samples are linearly separable, the perceptron learning algorithm will converge after a finite number of iterations to weights and a bias that classify all samples correctly. In other words, no matter what the weights and bias are at first, after enough iterations you always arrive at a working set of weights and bias. This is why we can initialize the weights and bias to random numbers or to 0.
The training process of the perceptron can be summarized as follows: on each step, take one sample’s input vector from the training data, compute its output with the perceptron, and then adjust the weights according to the perceptron rule. The weights are adjusted once per sample; after several rounds of iteration (that is, after all training data has been processed repeatedly for several epochs), the perceptron’s weights are trained to realize the objective function. A sketch of this loop follows below.
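As an illustration, here is a small from-scratch sketch of the perceptron rule, in which the weights and bias are nudged in proportion to the prediction error on each sample. The toy AND-style dataset and the step activation are assumptions for this demo, not the article’s actual data.

import numpy as np

def train_perceptron(X, t, lr=0.01, epochs=50):
    w = np.zeros(X.shape[1])  # weights may start at zero (or random)
    theta = 0.0               # bias
    for epoch in range(epochs):
        for x_i, t_i in zip(X, t):
            y_i = 1 if np.dot(w, x_i) + theta > 0 else 0  # current output
            # Perceptron rule: move weights/bias in proportion to the error
            w += lr * (t_i - y_i) * x_i
            theta += lr * (t_i - y_i)
    return w, theta

# Toy linearly separable data (logical AND)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([0, 0, 0, 1])
w, theta = train_perceptron(X, t)
print(w, theta)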
For the mathematical derivation and basic theory of perceptrons, see the earlier post on the perceptron.
So, how do you train a perceptron in code? Install the neurolab module with pip install neurolab before running the following code.
First prepare the dataset and display its distribution. This part of the code is simple enough to read directly; a stand-in version is sketched below.
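The original data-loading code is kept on GitHub rather than shown here, so as a stand-in, the following sketch builds a small linearly separable two-class dataset (the cluster centers and sizes are assumptions) and plots its distribution:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(37)
class0 = np.random.randn(20, 2)            # class 0 scattered around (0, 0)
class1 = np.random.randn(20, 2) + [4, 4]   # class 1 scattered around (4, 4)
dataset_X = np.vstack([class0, class1])
dataset_y = np.array([0] * 20 + [1] * 20)

# Display the distribution of the dataset
plt.scatter(dataset_X[:, 0], dataset_X[:, 1], c=dataset_y, cmap='bwr')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Dataset distribution')
plt.show()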
# Build the perceptron model
import numpy as np
import matplotlib.pyplot as plt
import neurolab as nl

perceptron = nl.net.newp([[dataset_X[:, 0].min(), dataset_X[:, 0].max()],   # min and max of feature 1
                          [dataset_X[:, 1].min(), dataset_X[:, 1].max()]],  # min and max of feature 2
                         1)                                                 # only 1 perceptron

# dataset_y needs a second dimension to be suitable for train()
dataset_y = dataset_y[:, np.newaxis]

# Train the single perceptron: 50 epochs, show progress every 10 epochs, learning rate 0.01
cost = perceptron.train(dataset_X, dataset_y, epochs=50, show=10, lr=0.01)

# Display the trend of the training cost over the course of training
plt.plot(cost)
plt.xlabel('Number of epochs')
plt.ylabel('Training cost')
plt.grid()
plt.title('Training cost progress')
plt.show()
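Once trained, the perceptron can be used for prediction with neurolab’s sim() method; a quick sanity check on the training data might look like this (assuming dataset_X and dataset_y from above):

# Inference: sim() returns the network's output for each input row
predictions = perceptron.sim(dataset_X)
accuracy = (predictions == dataset_y).mean()
print('Accuracy on training data: {:.2f}'.format(accuracy))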
Summary:
1. The main thing is to understand the basic meaning and theory of the perceptron, because the perceptron is the basic building block of neural networks.
2. Training a perceptron by hand is of little direct use in later deep learning work: mature deep learning frameworks have already integrated this functionality, so it is shown here only for demonstration.
Note: The code for this section has been uploaded to my GitHub; you are welcome to download it.
References:
1. Python Machine Learning Cookbook, by Prateek Joshi; Chinese translation by Tao Junjie and Chen Xiaoli.