Understanding deep learning (plain-English explanation + 8 pros and cons + 4 typical algorithms)

This article is from easyAI – Knowledge base of artificial intelligence

Deep learning performs very well and is leading the third wave of artificial intelligence. Most of the best-performing applications today use deep learning, the famous AlphaGo among them.

This article introduces the basic concepts of deep learning, its advantages and disadvantages, and four typical mainstream algorithms in detail.

Deep learning, neural networks, machine learning, artificial intelligence

Deep learning, machine learning, artificial intelligence

In a nutshell:

Deep learning is a branch of machine learning (the most important branch)
Machine learning is a branch of artificial intelligence

Most of the best-performing applications today are built on deep learning, and that is what is driving the third wave of AI. For more information, see The History of Artificial Intelligence: The Three Waves of AI.

Deep learning, neural networks

The concept of deep learning originated in research on artificial neural networks, but it is not identical to the traditional neural network.

Even so, many deep learning algorithms carry the term “neural network” in their names, such as the convolutional neural network and the recurrent neural network.

Deep learning can therefore be described as an upgrade built on the traditional neural network, and the two terms are roughly equivalent.

Explain deep learning in plain English

I have read many versions of the explanation and found that the one by Kai-Fu Lee in Artificial Intelligence is the easiest to understand, so I quote his explanation directly below:

Let’s take the example of recognizing Chinese characters in pictures.

Suppose the information that deep learning processes is “water flow”, and the deep learning network that processes the data is a vast water-pipe network made of pipes and valves. The inlet of the network is a set of pipe openings, and so is the outlet. The pipe network has many layers, and each layer has many regulating valves that control the direction and volume of the flow. Depending on the task, the number of layers and the number of valves per layer can be combined in different ways. For complex tasks, the total number of valves can reach tens of thousands or more. Every valve in each layer is connected by pipes to all the valves in the next layer, forming a flow system that is fully connected, layer by layer, from front to back.

So how does a computer use this vast network of water pipes to learn to read?

For example, when the computer sees a picture of the character “tian” (the Chinese character for “field”), it simply converts all the numbers that make up the picture (in a computer, every colored dot in a picture is represented by a number made of zeros and ones) into a stream of information and pours it into the pipe network through the inlets.

We place a label at each outlet of the pipe network in advance, one for each Chinese character we want the computer to recognize. In this case, because the input is the character “tian”, once the water has flowed through the network the computer checks whether the outlet labeled “tian” has the most water coming out. If so, the pipe network meets the requirement. If not, it adjusts every flow-regulating valve in the network until the “tian” outlet has the largest outflow.

Now the computer will be busy for a while, with so many valves to adjust! Fortunately, computers are fast, and brute-force computation combined with algorithmic optimization can always quickly produce a way to adjust all the valves so that the flow at the outlets meets the requirement.

Next, to learn the character “shen”, we use a similar method: every picture containing “shen” is turned into a large stream of water and poured into the pipe network, and we check whether the outlet labeled “shen” has the most water. If not, we adjust all the valves again. This time we must make sure that the “tian” character learned earlier is not affected, while the new “shen” character is also handled correctly.

This is repeated until the water for every character flows through the whole pipe network in the desired way. At that point, we say the pipe network is a trained deep learning model. Once a large number of characters have been processed by the network and all the valves have been adjusted into place, the whole network can be used to recognize characters. We can then “weld” all the adjusted valves in place and wait for new water to arrive.

Just as during training, an unknown picture is converted by the computer into a stream of data and poured into the trained pipe network. Now the computer only has to observe which outlet has the most water flowing out: the label on that outlet is the character written in the picture.

Deep learning is essentially a half-theoretical, half-empirical modeling method. It uses human mathematical knowledge and computer algorithms to build the overall architecture, then combines as much training data as possible with the large-scale computing power of computers to adjust the internal parameters and get as close as possible to the target of the problem.
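To make the pipe analogy concrete, here is a minimal NumPy sketch, not taken from the quoted book. It assumes three made-up 3x3 “characters” as training data, a hypothetical network of two fully connected layers (the “valves”), and plain gradient descent as the way of adjusting them. The sizes, learning rate and number of steps are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three made-up 3x3 "characters", flattened to 9 pixels (illustrative stand-ins).
prototypes = np.array([
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
], dtype=float)

def make_batch(n):
    """Return n noisy copies of the prototypes together with their labels."""
    labels = rng.integers(0, 3, size=n)
    images = prototypes[labels] + 0.1 * rng.normal(size=(n, 9))
    return images, labels

# The "valves": 9 inlets -> 8 hidden pipes -> 3 labeled outlets.
w1 = rng.normal(scale=0.3, size=(9, 8))
b1 = np.zeros(8)
w2 = rng.normal(scale=0.3, size=(8, 3))
b2 = np.zeros(3)

def forward(x):
    """Let the 'water' flow through both layers; returns hidden flow and outlet flow."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden, hidden @ w2 + b2

# Training: keep adjusting the valves until the correct outlet gets the most water.
learning_rate = 0.2
for _ in range(500):
    x, y = make_batch(32)
    hidden, logits = forward(x)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)      # softmax over the outlets
    d_logits = probs.copy()
    d_logits[np.arange(len(y)), y] -= 1.0             # gradient of cross-entropy
    d_logits /= len(y)
    d_hidden = (d_logits @ w2.T) * (1.0 - hidden ** 2)  # backpropagate through tanh
    w2 -= learning_rate * hidden.T @ d_logits
    b2 -= learning_rate * d_logits.sum(axis=0)
    w1 -= learning_rate * x.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0)

# Inference: the valves are now "welded"; just read off the outlet with the most water.
def predict(images):
    return forward(images)[1].argmax(axis=1)

test_images, test_labels = make_batch(200)
accuracy = float((predict(test_images) == test_labels).mean())
```

Real character recognition needs far larger networks and real images; the sketch only shows the loop of “pour in data, compare outlets, adjust valves” and the later read-only use of the trained network.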

Traditional machine learning VS deep learning

Similarities between traditional machine learning and deep learning

In terms of data preparation and preprocessing, they are very similar.

Both may apply the following steps to the data:

Data cleaning
Data labeling
Normalization
Denoising
Dimensionality reduction
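As a rough illustration of three of these steps (cleaning, normalization and dimensionality reduction), here is a small NumPy sketch. The function name `preprocess`, the random data and the choice of PCA via SVD are assumptions made for the example. Labeling and denoising depend heavily on the data set and are left out.

```python
import numpy as np

def preprocess(data, n_components):
    """Clean, normalize, and reduce the dimension of a 2-D data matrix."""
    # Data cleaning: drop rows that contain missing values.
    clean = data[~np.isnan(data).any(axis=1)]

    # Normalization: zero mean and unit variance per column.
    std = clean.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant columns
    normalized = (clean - clean.mean(axis=0)) / std

    # Dimensionality reduction: project onto the top principal components (PCA via SVD).
    _, _, vt = np.linalg.svd(normalized, full_matrices=False)
    return normalized @ vt[:n_components].T

rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 5))               # 50 samples, 5 features (synthetic)
raw[3, 2] = np.nan                           # simulate one corrupted sample
reduced = preprocess(raw, n_components=2)    # 49 clean samples, 2 features each
```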

For those interested in data preprocessing, check out the 6 Most Common Problems with AI Data Sets (with Solutions)

Core differences between traditional machine learning and deep learning

In traditional machine learning, feature extraction relies mainly on manual work. This is simple and effective for a specific, simple task, but it does not generalize.

In deep learning, feature extraction is not done by hand; the machine extracts features automatically. This is also why deep learning is often said to have poor interpretability: it sometimes works very well, yet we do not know exactly why.

Pros and cons of deep learning

Advantage 1: Strong learning ability

Judging by the results, deep learning performs very well and its learning ability is very strong.

Advantage 2: wide coverage, good adaptability

Deep learning networks have many layers and can be very wide. In theory they can approximate any function, so they can tackle very complex problems.

Advantage 3: Data-driven, high upper limit

Deep learning relies heavily on data, and the more data it has, the better it performs. On some tasks, such as image recognition, facial recognition and NLP, it has even surpassed human performance. Its upper limit can also be raised further through parameter tuning.

Advantage 4: Good portability

Because deep learning performs so well, many frameworks are available, such as TensorFlow and PyTorch. These frameworks are compatible with many platforms.

Disadvantage 1: Heavy computation, poorly suited to portable devices

Deep learning requires a lot of data and a lot of computing power, so it is expensive. Many applications are also not yet suitable for use on mobile devices. Many companies and teams are already working on chips for portable devices, so this problem should be solved in the future.

Disadvantage 2: High hardware requirements

Deep learning places high demands on computing power, and ordinary CPUs can no longer meet them. Mainstream computing relies on GPUs and TPUs, so the hardware requirements are very high and so is the cost.

Disadvantage 3: Complex model design

Model design in deep learning is very complicated. Developing new algorithms and models takes a great deal of manpower, material resources and time, so most people can only use off-the-shelf models.

Disadvantage 4: No “humanity”, prone to bias

Deep learning relies on data and is not very interpretable. When the training data is unbalanced, problems such as gender discrimination and racial discrimination can appear.

Four typical deep learning algorithms

Convolutional Neural Networks – CNN

The value of CNN:

It can effectively reduce large images to a small amount of data (without affecting the result)
It can retain the features of the image, similar to the principle of human vision

How CNN works:

Convolution layer – its main function is to preserve the features of the image
Pooling layer – its main function is to reduce the dimension of the data, which helps to avoid overfitting
Fully connected layer – outputs the results we want for different tasks
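The three layer types can be sketched in a few lines of NumPy. This is a simplified illustration and not how a production library implements them: the 8x8 random image, the hand-written edge kernel, the ReLU activation and the random fully connected weights are all assumptions made for the example, and nothing here is trained.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation most libraries call convolution."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows or columns that do not fit are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((8, 8))                    # stand-in for a grayscale picture
kernel = np.array([[1.0, 0.0, -1.0]] * 3)     # simple vertical-edge detector
fc_weights = rng.normal(size=(9, 4))          # 4 output classes (arbitrary)

features = np.maximum(0.0, conv2d(image, kernel))  # convolution + ReLU: 8x8 -> 6x6
pooled = max_pool(features)                        # pooling: 6x6 -> 3x3
scores = pooled.reshape(-1) @ fc_weights           # fully connected: 9 values -> 4 scores
```

In a trained CNN the kernel values and the fully connected weights are learned from data instead of being written by hand or drawn at random.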

Practical application of CNN:

Image classification and retrieval
Object localization and detection
Object segmentation
Face recognition
Skeleton (body pose) recognition

Learn more about “Understanding Convolutional Neural Networks — CNN (Fundamentals + Unique Values + Practical Applications)”

Recurrent Neural Networks – RNN

RNN is an efficient algorithm for processing sequence data, such as article text, audio and stock price movements.

It can process sequence data because earlier inputs in the sequence also affect later outputs, which amounts to a kind of “memory”. But RNNs have a serious short-term memory problem: information from far back in the sequence has little influence, even when it is important.
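The recurrence and the fading memory can both be seen in a small NumPy sketch. It assumes a plain tanh RNN with random input weights, and the recurrent matrix is deliberately built as half of an orthogonal matrix so that the state shrinks at every step. That is a hand-picked assumption to make the fading visible, not a property of every RNN.

```python
import numpy as np

def rnn_forward(inputs, w_x, w_h, b):
    """Run a plain tanh RNN over a sequence and return every hidden state."""
    h = np.zeros(w_h.shape[0])
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state ("memory").
        h = np.tanh(x @ w_x + h @ w_h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
w_x = rng.normal(scale=0.5, size=(n_in, n_hidden))
q, _ = np.linalg.qr(rng.normal(size=(n_hidden, n_hidden)))
w_h = 0.5 * q                        # assumption: recurrent map halves the state norm
b = np.zeros(n_hidden)

sequence = rng.normal(size=(20, n_in))
changed = sequence.copy()
changed[0] += 1.0                    # perturb only the first time step

gap = rnn_forward(sequence, w_x, w_h, b) - rnn_forward(changed, w_x, w_h, b)
influence = np.linalg.norm(gap, axis=1)   # effect of step 0 on each later state
```

With these weights, `influence` at least halves from one step to the next, so the first input has almost no effect on the final state. This is the short-term memory problem in miniature.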

For this reason, variants such as LSTM and GRU were developed on top of the RNN. These variants have two main characteristics:

They can retain long-term information effectively
They keep the important information and choose to “forget” the less important information
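The “keep or forget” behavior comes from gates. Below is a sketch of a single GRU step in NumPy, using one common convention for the update gate (some papers and libraries swap the roles of `z` and `1 - z`). The parameter names and the random weights are illustrative assumptions, and the update-gate bias is set by hand to -20 to show what a nearly closed gate does. In a trained GRU that value would be learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, p):
    """One GRU update; p holds input weights W_*, recurrent weights U_*, biases b_*."""
    z = sigmoid(x @ p["W_z"] + h_prev @ p["U_z"] + p["b_z"])   # update gate
    r = sigmoid(x @ p["W_r"] + h_prev @ p["U_r"] + p["b_r"])   # reset gate
    candidate = np.tanh(x @ p["W_h"] + (r * h_prev) @ p["U_h"] + p["b_h"])
    # z near 0 keeps the old memory; z near 1 overwrites it with the candidate.
    return (1.0 - z) * h_prev + z * candidate

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
params = {}
for gate in ("z", "r", "h"):
    params["W_" + gate] = rng.normal(scale=0.1, size=(n_in, n_hidden))
    params["U_" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    params["b_" + gate] = np.zeros(n_hidden)

# Hand-set for illustration: a very negative bias keeps the update gate almost closed.
params["b_z"] = np.full(n_hidden, -20.0)

h_start = np.array([0.9, -0.5, 0.3, 0.0])
h = h_start.copy()
for x in rng.normal(size=(50, n_in)):
    h = gru_step(x, h, params)
drift = float(np.abs(h - h_start).max())   # how far the memory moved in 50 steps
```

With the gate held almost shut, the state after 50 inputs is practically unchanged, which is how such a unit can carry information across a long sequence.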

Several typical applications of RNN are as follows:

Text generation
Speech recognition
Machine translation
Image caption generation
Video tagging

Learn more about “Understanding Recurrent Neural Networks – RNN (Unique Value + Optimization Algorithm + Practical Application)”

Generative Adversarial Networks – GAN

Imagine a city with poor public order, where countless thieves soon appear. Some of these thieves may be master thieves, others may have no skill at all. If the city starts to restore order and suddenly launches a “campaign” against crime, with the police resuming their patrols, a batch of “unskilled” thieves will soon be caught. It is hard to say how the city’s overall security has changed after catching these low-level thieves, but the average skill level of the remaining thieves has clearly risen sharply.

The police kept honing their crime-solving skills and began to catch thieves who were themselves becoming more sophisticated. By arresting these professional criminals, the police developed a special skill: they could quickly pick suspicious people out of a crowd, step forward to question them, and eventually arrest the suspects. The thieves were having a hard time too, because the police had improved so much that anyone who kept behaving furtively would soon be caught. To avoid arrest, the thieves tried to act less “suspicious”, while the police kept raising their game, trying to tell the thieves apart from innocent ordinary people. Through this back-and-forth contest between police and thieves, the thieves became extremely careful: their stealing skills were excellent and they behaved exactly like ordinary people. The police, for their part, developed a sharp eye: once they noticed a suspicious person, they could spot and control him immediately. In the end we got, at the same time, the strongest thieves and the strongest police.
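The story does not name the two players. In the usual reading, the thieves correspond to the generator G, which produces fake samples from random noise z, and the police correspond to the discriminator D, which estimates the probability that a sample is real. Under that reading, the contest is commonly written as the following minimax objective, where p_data is the distribution of real data and p_z is the noise distribution (the symbols are standard notation, not taken from this article):

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to make V large (catch every fake), while G is trained to make it small (fool D), which mirrors the escalation between police and thieves.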

Learn more about Generative Adversarial Networks – GAN (Basic Concepts + How They Work)

Deep Reinforcement Learning – RL

The idea behind reinforcement learning algorithms is very simple. Take a game as an example. If a strategy is used in a game to achieve a higher score, then the strategy is further “reinforced” in order to continue to achieve better results. This strategy is very similar to the various “performance rewards” in everyday life. We often use this strategy to improve our game.

In Flappy Bird, simple taps control the bird so that it dodges the water pipes and flies as far as possible, because the farther it flies, the more points it earns.

This is a typical reinforcement learning scenario:

The machine controls a clear character, the bird – agent
The bird needs to fly as far as possible – goal
Throughout the game it has to avoid all kinds of water pipes – environment
The way to avoid the pipes is to make the bird flap upward – action
The farther it flies, the more points it gets – reward

You’ll find that the biggest difference between reinforcement learning and supervised or unsupervised learning is that it does not need to be fed large amounts of data. Instead, the agent learns skills through constant trial and error.
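Learning by trial and error can be sketched with tabular Q-learning, a classic reinforcement learning algorithm. This is not Flappy Bird and not deep reinforcement learning: the environment is a hypothetical six-cell corridor invented for the example, and the values are stored in a table, where a deep version would use a neural network. The learning rate, discount and exploration rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 6, 2          # actions: 0 = move left, 1 = move right
goal = n_states - 1                 # reaching the last cell ends the episode
q = np.zeros((n_states, n_actions)) # estimated value of each action in each state
alpha, gamma, epsilon = 0.5, 0.9, 0.2

def step(state, action):
    """Hypothetical environment: reward 1 only when the agent reaches the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == goal else 0.0
    return nxt, reward, nxt == goal

for episode in range(300):
    state = 0
    for _ in range(100):                               # cap the episode length
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))      # explore: try something random
        else:
            best = np.flatnonzero(q[state] == q[state].max())
            action = int(rng.choice(best))             # exploit, breaking ties randomly
        nxt, reward, done = step(state, action)
        # Reinforce the action in proportion to the reward it led to.
        target = reward + (0.0 if done else gamma * q[nxt].max())
        q[state, action] += alpha * (target - q[state, action])
        state = nxt
        if done:
            break

policy = q[:goal].argmax(axis=1)    # preferred action in each non-goal state
```

No labeled data set is involved: the only feedback is the reward, and the table of values improves as the agent keeps trying.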

Learn more about Reinforcement Learning (Basic Concepts + Application Scenarios + Mainstream Algorithms)

Conclusion

Deep learning is a branch of machine learning. It can be described as an upgrade built on the traditional neural network, and the two terms are roughly equivalent.

Deep learning and traditional machine learning are similar in data preprocessing. The core difference lies in feature extraction. In deep learning, feature extraction is completed by the machine itself without manual extraction.

Advantages of deep learning:

Strong learning ability
Wide coverage and good adaptability
Data driven with high upper limit
Good portability

Disadvantages of deep learning:

Heavy computation, poorly suited to portable devices
High hardware requirements
Complex model design
No “humanity”, prone to bias

Four typical algorithms of deep learning:

Convolutional Neural Networks – CNN
Recurrent Neural Networks – RNN
Generative Adversarial Networks – GAN
Deep Reinforcement Learning – RL