Hello everyone, I'm Yufeng. Over the next few days I will share some fundamentals of deep learning and a range of applications, including regression, image classification, image segmentation, semantic segmentation, DCGAN, Pix2Pix, SRGAN, and so on. They are all fairly basic, which is why I call this the "hand-holding" tutorial series. I hope it helps you build a preliminary understanding of deep learning and its applications, and find a direction that interests you. Today I would like to share my first introduction to neural networks. As always, I am Yufeng, and I hope this article helps you and your friends. You are welcome to share or repost it! Welcome to "Yufeng Code Word".

Human brain neural network

Artificial neural networks are inspired by their biological counterparts. Biological neural networks enable the brain to process large amounts of information in complex ways. The brain's biological neural network is made up of roughly 100 billion neurons, the brain's basic processing units. Neurons perform their functions through an enormous number of connections with one another, called synapses. The human brain has about 100 trillion synapses, roughly 1,000 per neuron!

Receptive zone:

The dendrites receive input.

Trigger zone:

The junction of the axon and the cell body, where it is decided whether a nerve impulse is generated.

Conducting zone:

The axon transmits nerve impulses.

Output zone:

A nerve impulse ends with the nerve endings, the synapses, releasing neurotransmitters (or electrical signals) to affect the next receiving cell (a neuron, muscle cell, or gland cell). This is called synaptic transmission.

The main function of a neuron is to become excited when stimulated and to transmit that excitement to other neurons. A nerve impulse travels along the route: synapse of the previous neuron → dendrites → cell body → axon → synapse of the next neuron.

So what is an artificial neural network? There are many definitions. Teuvo Kohonen, a Finnish computer scientist, defined an artificial neural network as a widely interconnected network of simple adaptive units, organized so as to simulate the way the biological nervous system interacts with the real world. An artificial neural network works much like the brain's neurons: it also has an input layer (corresponding to a neuron's receptive zone), an output layer (corresponding to the output zone), and hidden layers (corresponding to the conducting zone).

The input layer:

The input layer receives the feature vector X.

Output layer:

The output layer produces the final prediction H.

Hidden layer:

The hidden layer sits between the input layer and the output layer. It is called "hidden" because the values it produces are not directly visible, unlike the sample matrix X used by the input layer or the label matrix Y used by the output layer.

Putting this together: the input layer receives data from external sources (data files, images, hardware sensors, microphones, etc.), one or more hidden layers process the data, and the output layer provides one or more outputs depending on the network's function. For example, a neural network that detects people, cars, and animals will have an output layer with three nodes, while a network that classifies bank transactions as legitimate or fraudulent will have only one output node. A minimal sketch of this structure follows.
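To make the layer structure concrete, here is a minimal NumPy sketch of one forward pass. Everything here is an illustrative assumption of mine, not from the original text: 4 input features, 5 hidden units, a tanh activation, random weights, and 3 output nodes matching the person/car/animal example.

```python
import numpy as np

# Minimal forward pass through input -> hidden -> output.
# All sizes and weights are made up purely for illustration.
rng = np.random.default_rng(0)

x = rng.normal(size=4)            # input layer: feature vector X
W1 = rng.normal(size=(5, 4))      # input -> hidden connection weights
b1 = np.zeros(5)
W2 = rng.normal(size=(3, 5))      # hidden -> output connection weights
b2 = np.zeros(3)

h = np.tanh(W1 @ x + b1)          # hidden layer: values not directly visible
y = W2 @ h + b2                   # output layer: one score per class

print("class scores:", y)         # e.g. argmax picks person / car / animal
```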

Learning about neural networks

1. M-P model

Based on the human brain's neurons, we can build the M-P model. To simplify the modeling and allow a formal expression, we ignore complicating factors such as temporal integration and the refractory period, and treat each neuron's synaptic delay and synaptic strength as constants. Below is a diagram of the M-P model.

Listing the characteristics of the M-P model side by side with those of biological neurons makes the correspondence between the two easier to understand.

Six characteristics of the M-P model:

  1. Each neuron is a multi-input, single-output information-processing unit

  2. A neuron's inputs are of two types, excitatory and inhibitory (determined by the weight w)

  3. Neurons exhibit spatial summation and threshold behavior (a neuron fires only when its combined input reaches the threshold)

  4. There is a fixed time lag between a neuron's input and output, caused mainly by synaptic delay

  5. Temporal integration and the refractory period are ignored

  6. The neuron itself is time-invariant, i.e. its synaptic delay and synaptic strength are constants

The first four points are consistent with biological neurons. This "thresholded weighted sum" neuron model is called the M-P model (McCulloch-Pitts model), also known as a processing element (PE) of a neural network. The M-P model can implement the logical operations AND, OR, and NOT, but its parameters must be set by hand; they cannot be obtained through "learning".

2. Single-layer perceptron

In 1958, the American psychologist Frank Rosenblatt proposed a neural network with a single layer of computing units, called the perceptron. It is essentially a structure built on the M-P model. The structure of the single-layer perceptron is shown in the figure below.

Unlike the M-P model, whose parameters must be set by hand, the perceptron can determine its parameters automatically through training, that is, it can obtain them by "learning". The training method is supervised learning: training samples and their expected outputs are provided, and the weights, bias, and other parameters are adjusted according to the difference between the actual output and the expected output (error-correction learning).

Perceptron training process:

1. Prepare for training
   a. Prepare N training samples x_i and their expected outputs r_i
   b. Initialize the parameters w and b
2. Adjust the parameters
   a. Iterate until the error is 0 or falls below a specified value
      1) Add the samples one by one and compute the actual output. When the actual output equals the expected output, the parameters are left unchanged; when they differ, adjust the parameters by error-correction learning. Repeat step 1)
   b. Repeat step a
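The procedure above maps almost line for line onto code. Here is a minimal sketch, assuming logical AND as the training data, a step activation, and the classic perceptron update rule w ← w + lr·(r − y)·x; the learning rate and data are my own illustrative choices.

```python
import numpy as np

# Perceptron trained by error-correction learning on logical AND.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # N training samples x_i
R = np.array([0, 0, 0, 1])                      # expected outputs r_i (AND)

w = np.zeros(2)   # initialize parameters w and b
b = 0.0
lr = 0.1

for epoch in range(100):
    errors = 0
    for x, r in zip(X, R):                 # add samples one by one
        y = 1 if w @ x + b > 0 else 0      # actual output (step function)
        if y != r:                         # adjust only on a mismatch
            w += lr * (r - y) * x          # error-correction learning
            b += lr * (r - y)
            errors += 1
    if errors == 0:                        # stop when the error reaches 0
        break

print("learned w =", w, "b =", b)
```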

The perceptron was the first artificial neural network to be designed and implemented. Its ability to obtain parameters automatically through error-correction learning was the great revolution it triggered. But while the single-layer perceptron is simple and elegant, it is clearly not clever enough: it can only classify linear problems. What is a linear problem? In a nutshell, it is one whose classes can be separated by a straight line. For example, logical AND and logical OR are linear problems:

y = f(w1x1 + w2x2 − θ)

(1) The AND operation: when w1 = w2 = 1 and θ = 1.5, the equation above implements logical AND.

(2) The OR operation: when w1 = w2 = 1 and θ = 0.5, the equation above implements logical OR.

As with many algebraic equations, the expression above has a geometric meaning: the decision boundary w1x1 + w2x2 = θ is a straight line in the plane, with one class on each side of it.
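A quick verification of these weight settings in Python; the step function f and the truth-table loop are my own illustrative choices:

```python
# y = f(w1*x1 + w2*x2 - theta), with f a step function.
# theta = 1.5 realises AND; theta = 0.5 realises OR.
def mp_neuron(x1, x2, w1, w2, theta):
    return 1 if w1 * x1 + w2 * x2 - theta >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        and_out = mp_neuron(x1, x2, 1, 1, 1.5)
        or_out = mp_neuron(x1, x2, 1, 1, 0.5)
        print(f"x=({x1},{x2})  AND={and_out}  OR={or_out}")
```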

3. Multilayer perceptron

The limitation of the single-layer perceptron is that it can only solve linearly separable problems; for anything else it no longer applies. To handle more complex tasks, such as linearly non-separable problems, the multilayer perceptron model was proposed, as shown in the figure.

The multilayer perceptron consists of an input layer, a hidden layer, and an output layer. Each perceptron in the hidden layer is connected to every unit of the input layer by a weight, and the output of each hidden unit is computed by a threshold function; the hidden layer and the output layer are likewise connected by weights.

The multilayer perceptron also determines the connection weights between two adjacent layers through error-correction learning, but this method cannot adjust weights across layers, so the full multilayer network cannot be trained this way. Early multilayer perceptrons therefore used random numbers for the connection weights between the input layer and the hidden layer, and applied error correction only to the weights between the hidden layer and the output layer. This produces large errors; sometimes different input data yield the same hidden-layer outputs, making accurate classification impossible.

To train multilayer networks, the error backpropagation algorithm was proposed. Backpropagation obtains an error signal by comparing the actual output with the expected output, propagates that error signal backward from the output layer, layer by layer, and reduces the error by adjusting each layer's connection weights. The weight adjustment itself mainly uses the gradient descent algorithm. However, as the number of hidden layers increased, training ran into the vanishing (and sometimes exploding) gradient problem. With the technology of the time this could not be solved, and deep learning research entered a low tide.

In 2006, Hinton et al. published the paper "Reducing the Dimensionality of Data with Neural Networks" in the journal Science, opening the prelude to new algorithms for training deep neural networks. Their use of unsupervised RBM pre-training to reduce the dimensionality of images, achieving better results than PCA, is generally regarded as the beginning of the rise of deep learning.
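To make backpropagation concrete, here is a compact sketch: a two-layer network trained by gradient descent on XOR, the classic linearly non-separable problem. The hidden-layer size, sigmoid activation, learning rate, and epoch count are all arbitrary choices of mine, not from the original text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))   # input -> hidden
W2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))   # hidden -> output
lr = 0.5

for epoch in range(20000):
    # forward pass
    H = sigmoid(X @ W1 + b1)            # hidden activations
    out = sigmoid(H @ W2 + b2)          # network output
    # backward pass: error signal flows from the output layer back
    d_out = (out - Y) * out * (1 - out)
    d_H = (d_out @ W2.T) * H * (1 - H)
    # gradient descent on each layer's connection weights
    W2 -= lr * H.T @ d_out; b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_H;   b1 -= lr * d_H.sum(axis=0, keepdims=True)

print(np.round(out.ravel(), 3))         # should approach [0, 1, 1, 0]
```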

In 2011, Glorot et al. proposed the ReLU activation function, which effectively mitigates the vanishing gradient problem in deep networks. Today many of the most widely used activation functions come from the ReLU family; they are simple and effective.
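A quick numeric illustration of why ReLU helps (my own toy demonstration, not from the original): the sigmoid derivative is at most 0.25, so it shrinks the error signal at every layer, while the ReLU derivative is exactly 1 for any positive input.

```python
import numpy as np

z = np.linspace(-4, 4, 9)

sigmoid = 1 / (1 + np.exp(-z))
d_sigmoid = sigmoid * (1 - sigmoid)      # peaks at 0.25 when z = 0
d_relu = (z > 0).astype(float)           # 1 for z > 0, else 0

print("sigmoid':", np.round(d_sigmoid, 3))
print("relu'   :", d_relu)
print("0.25 ** 10 =", 0.25 ** 10)        # ten sigmoid layers, at best
```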

In 2012, Hinton's student Alex Krizhevsky proposed the AlexNet network, crushing the runner-up (an SVM-based method) in classification performance at the ImageNet competition. Thanks to that competition, CNNs attracted the attention of many researchers.

Since then, all kinds of neural networks have been proposed, and machine learning and deep learning have become more and more popular.

That's all for today. Next time we'll talk about the various applications of deep learning, so stay tuned.

I am Yufeng. Follow my official account, "Yufeng Code Word", and feel free to get in touch.