Original text: theaisummer.com/Deep-Learni…

By Sergios Karagiannakos

Because the article is fairly long, it is split into two parts.

Introduction

Deep learning took off after a neural network decisively won an image-recognition contest (ImageNet) in 2012, but few could have predicted what would happen next.

Over the past decade, more and more algorithms have become available, and more and more companies are starting to incorporate them into their everyday businesses.

This article attempts to introduce the most important deep learning algorithms and network structures of recent years, including those used in computer vision and natural language processing applications. Some are more widely used than others, and each has its own strengths and weaknesses.

The main goal of this article is to give you a general understanding of the field and a sense of which algorithm to use in which situation, since learning all of this from scratch can be confusing. After reading it, you should know what these algorithms are and when to use them.

Contents

The contents of this article are as follows:

  1. What is deep learning?
  2. Neural Networks
  3. Feedforward Neural Networks (FNN)
  4. Convolutional Neural Networks (CNN)
  5. Recurrent Neural Networks (RNN)
  6. Recursive Neural Network
  7. AutoEncoders
  8. Deep Belief Networks and Restricted Boltzmann Machines
  9. Generative Adversarial Networks
  10. Transformers
  11. Graph Neural Networks
  12. Natural Language Processing based on deep learning
    • Word Embedding
    • Sequence Modeling
  13. Computer vision based on deep learning
    • Localization and Object Detection
    • Single shot detectors (SSD)
    • Semantic Segmentation
    • Pose Estimation

1. What is deep learning?

According to Wikipedia’s definition [1]:

Deep learning (also known as deep structured learning or differential programming) is a member of the machine learning algorithm family. It is based on artificial neural networks and representation learning, and its learning can be supervised, semi-supervised, or unsupervised.

In my view, deep learning is a collection of algorithms inspired by the way the human brain processes data and forms patterns for decision making. These algorithms extend and improve on a single model structure called the artificial neural network.

2. Neural Networks

Like the human brain, a neural network [2] contains many neurons. Each neuron takes its inputs, multiplies each one by a corresponding weight, sums the results, and feeds the sum into a nonlinear function. The neurons are organized in layers that are stacked on top of each other, as shown below on the left:

What do we gain by computing the way neurons in the brain do? The result, as shown on the right, is that neural networks are excellent function approximators.

We can assume that every behavior and every system can ultimately be represented by a mathematical function (some of them very complex). If we could find that function, we would know everything about the system. Finding it exactly is very difficult, however, so we use a neural network to approximate it.
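The neuron described at the start of this section can be sketched in a few lines of Python. This is only a minimal illustration: the sigmoid nonlinearity, the bias term, and all the numbers are assumptions chosen for the example, not something the original article specifies.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Multiply each input by its weight, sum, then apply a nonlinearity."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical values, for illustration only.
x = [0.5, -1.0, 2.0]
hidden = layer(x, [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]], [0.0, 0.1])
output = neuron(hidden, [1.0, -1.0], 0.0)  # a second layer stacked on the first
print(hidden, output)
```

Stacking is just function composition: the outputs of one layer become the inputs of the next.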

Backpropagation

The neural network learns the target function from a large amount of data using an iterative algorithm called backpropagation [3]. We feed data into the network, the network produces an output, we compare that output with the expected result (via a loss function), and we then adjust the weights based on the difference.

This process is repeated over and over. The weights are adjusted using a nonlinear optimization technique called stochastic gradient descent [4].

After training for a while, the network produces good results, at which point training ends. In other words, we have obtained an approximation of the function. When we give the network an input whose result is unknown, it produces an output according to the approximation it has learned.
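The loop described above (forward pass, compare with the expected result, adjust the weights) can be sketched for the simplest possible case: a single sigmoid neuron. With only one neuron the chain rule has a single step; backpropagation applies the same rule layer by layer in a deeper network. The toy data set (logical OR), the squared-error loss, the learning rate, and the function names are all assumptions made for this illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Forward pass of one sigmoid neuron."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def mean_loss(weights, bias, data):
    """Average squared-error loss over the data set."""
    return sum(0.5 * (predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)

def train(data, epochs=2000, learning_rate=1.0, seed=0):
    """Stochastic gradient descent: update after every single example."""
    rng = random.Random(seed)
    weights, bias = [0.0] * len(data[0][0]), 0.0
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)            # visit the examples in random order
        for x, y in examples:
            p = predict(weights, bias, x)
            dloss_dp = p - y             # derivative of 0.5 * (p - y)^2
            dp_dz = p * (1.0 - p)        # derivative of the sigmoid
            delta = dloss_dp * dp_dz     # chain rule: dloss/dz
            for i, xi in enumerate(x):
                weights[i] -= learning_rate * delta * xi
            bias -= learning_rate * delta
    return weights, bias

# Toy labelled data (logical OR), standing in for real training examples.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

weights, bias = train(DATA)
print([round(predict(weights, bias, x)) for x, _ in DATA])
```

The loss after training should be lower than the loss of the untrained (all-zero) neuron, which is the sense in which the network "makes fewer mistakes" as training proceeds.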

Here is an example that illustrates the process. Suppose our task is to identify images that contain trees. We feed all kinds of pictures (the training images) into the network, and the network outputs a result for each one. Because we already know whether each picture contains a tree, we only need to compare the network's output with the picture's true category (tree or no tree) and then adjust the network.

As the number of training images increases, the network will make fewer and fewer mistakes. Now we can send the network an unknown image (non-training image) and the network will tell us whether the image contains a tree.

Over the years, researchers have come up with a number of remarkable improvements to this original idea, with each new network structure addressing specific problems and achieving better accuracy and speed. We will look at these models one by one.

3. Feedforward Neural Networks (FNN)

Feedforward neural networks usually use fully connected layers [5]; that is, every neuron in one layer is connected to every neuron in the next layer. This structure, also known as the multilayer perceptron, dates back to 1958 and is shown below. A single-layer perceptron can only learn linearly separable patterns, but a multilayer perceptron can learn nonlinear relationships in the data.

Multilayer perceptrons perform well on classification and regression tasks, but compared with other machine learning algorithms they do not converge as easily. In addition, the more training data they are given, the higher their accuracy.
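To make the difference between a single-layer and a multilayer perceptron concrete, here is a small sketch using XOR, a classic pattern that is not linearly separable. XOR is not mentioned in the original article; it is chosen here only as an illustration, and the weights are set by hand rather than learned.

```python
def step(z):
    """Threshold activation used by the classic perceptron."""
    return 1 if z > 0 else 0

def perceptron(inputs, weights, bias):
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)

def xor_mlp(x1, x2):
    """Two fully connected layers with hand-picked (not learned) weights."""
    # Hidden layer: every hidden neuron sees every input.
    h_or = perceptron([x1, x2], [1.0, 1.0], -0.5)    # fires if x1 OR x2
    h_and = perceptron([x1, x2], [1.0, 1.0], -1.5)   # fires if x1 AND x2
    # Output layer: "OR but not AND" is XOR.
    return perceptron([h_or, h_and], [1.0, -1.0], -0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

No single call to `perceptron` can compute XOR on its own, because one neuron can only draw a single straight decision boundary; the hidden layer is what makes the nonlinear relationship representable.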

4. Convolutional Neural Networks (CNN)

A convolutional neural network uses the convolution operation [6]. Instead of connecting every neuron in one layer to every neuron in the next, a convolutional layer connects each neuron only to a small region of the previous layer (its receptive field).
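The idea of a receptive field can be sketched with a plain 2D convolution, written here without any library so that every connection is visible. As in most deep learning code, the kernel is not flipped, so strictly speaking this is a cross-correlation. The image and the edge-detecting kernel are made-up values for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on a kh x kw patch of the
            # input: that patch is its receptive field.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A 4x4 image that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

for row in conv2d(image, kernel):
    print(row)
```

The same four kernel weights are reused at every position, which is why a convolutional layer has far fewer parameters than a fully connected layer over the same image.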

To some extent, a CNN is a regularized version of the FNN that guards against overfitting (a model that generalizes poorly beyond its training data), and it is good at capturing spatial relationships in the data. A simple CNN architecture is shown in the figure below.

Because they capture spatial relationships so well, CNNs are mainly used in computer vision applications such as image classification, video recognition, medical image analysis, and autonomous driving [7], and on some of these tasks they have reached accuracy above human level.

In addition, CNNs combine well with other types of models, such as recurrent neural networks and autoencoders. One example of such a combination is sign language recognition [8].

5. Recurrent Neural Networks (RNN)

Recurrent neural networks are well suited to time-dependent data and are used for time series prediction. The network has a feedback connection: its output is fed back in as input. You can think of it as a loop from output to input that passes information back into the network, which gives the model the ability to remember past data and use it in its predictions.
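The feedback loop can be sketched with a single recurrent unit whose previous output (its hidden state) is fed back in alongside each new input. The scalar weights and the tanh nonlinearity are assumptions made for this example.

```python
import math

def rnn_step(x, h_prev, w_x=1.0, w_h=0.5, b=0.0):
    """One time step: combine the new input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, h0=0.0):
    """Feed a sequence through the unit, carrying the state forward."""
    h, states = h0, []
    for x in inputs:
        h = rnn_step(x, h)   # the output loops back as the next h_prev
        states.append(h)
    return states

# A single pulse followed by silence.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Even though the last two inputs are zero, the hidden state stays positive and fades gradually: the unit still "remembers" the pulse it saw at the first step.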

To improve the performance of the model, researchers modified the original neurons to create more complex structures, such as GRU units [9] and LSTM units [11], as shown in the figure below. LSTM is widely used in natural language processing tasks, including translation, speech generation, and text-to-speech synthesis.
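As a rough sketch of what such a modified unit looks like, here is one step of a standard LSTM cell in scalar form. The parameter layout (a dictionary of per-gate `(w_x, w_h, b)` tuples) and all numeric values are hypothetical and chosen only for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell; returns (h, c)."""
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # the new information itself
    c = f * c_prev + i * g             # updated cell state (long-term memory)
    h = o * math.tanh(c)               # updated hidden state (the output)
    return h, c

# Hypothetical parameters: forget gate held open, input gate held shut.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.8, params=params)
print(h, c)
```

With these settings the cell state barely changes from its previous value of 0.8, which illustrates how the gates let an LSTM carry information across many time steps.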

6. Recursive Neural Network

A recursive neural network is another form of recurrent neural network. The difference is that a recursive neural network is structured as a tree, so it can model hierarchical structure in the training set.
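The tree structure can be sketched as follows: the same small network is applied at every internal node of a binary tree, combining the vectors of its two children into one parent vector. The weight matrix `W`, the bias `B`, the word vectors, and the nested-tuple tree encoding are all hypothetical choices for this illustration.

```python
import math

# Hypothetical shared parameters: a 2x4 weight matrix and a bias of size 2.
W = [[0.5, -0.3, 0.2, 0.1],
     [-0.2, 0.4, 0.1, 0.3]]
B = [0.0, 0.1]

def compose(left, right):
    """Merge two child vectors into one parent vector of the same size."""
    children = left + right  # concatenate the two child vectors
    return [math.tanh(sum(w * c for w, c in zip(row, children)) + b)
            for row, b in zip(W, B)]

def encode(tree):
    """Encode a binary tree bottom-up; leaves are lists, nodes are 2-tuples."""
    if isinstance(tree, tuple):
        left, right = tree
        return compose(encode(left), encode(right))
    return tree  # a leaf is already a vector

# Made-up 2-dimensional word vectors.
the, cat, sleeps = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]

print(encode(((the, cat), sleeps)))   # ((the cat) sleeps)
print(encode((the, (cat, sleeps))))   # (the (cat sleeps))
```

The two trees contain the same words in the same order but yield different vectors, because the result depends on how the words are grouped, not only on their sequence.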

Recursive neural networks are commonly used in natural language processing applications such as speech-to-text transcription and semantic analysis, because these tasks involve binary trees, context, and natural-language parsing. However, recursive neural networks tend to be slower than recurrent neural networks.


References

  1. en.wikipedia.org/wiki/Deep_l…
  2. karpathy.github.io/neuralnets/
  3. brilliant.org/wiki/backpr…
  4. ruder.io/optimizing-…
  5. theaisummer.com/Neural_Netw…
  6. theaisummer.com/Neural_Netw…
  7. theaisummer.com/Self_drivin…
  8. theaisummer.com/Sign-Langua…
  9. www.coursera.org/lecture/nlp…
  10. theaisummer.com/Bitcon_pred…
