Since AlexNet was proposed by Alex Krizhevsky and colleagues at the University of Toronto in 2012, deep learning has grown into the powerful machine learning method behind today's AI boom. Because the technology has been applied across so many different domains, so many new models and architectures have appeared that it is hard to keep track of how the network types relate to one another. Recently, researchers from the University of Dayton published a comprehensive review of the development of deep learning in recent years and identified the main technical challenges currently faced. Heart of the Machine considers this a very detailed survey paper, suitable both for readers learning deep learning from scratch and for those who already have some background.
Introduction
A. Types of deep learning methods
In supervised learning, the environment supplies input–output pairs. For example, given an input x_t, the agent makes a prediction ŷ_t = f(x_t) and receives the loss value l(y_t, ŷ_t). The agent then iteratively adjusts the network parameters to better approximate the desired output. After successful training, the agent can answer questions from the environment correctly. Supervised learning mainly includes the following types of networks: deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks (RNN, including LSTM), and gated recurrent units (GRU). These networks are detailed in sections 2, 3, 4, and 5, respectively.
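The supervised loop described above can be sketched minimally with a linear model and squared loss; the environment, model, and learning rate here are our own illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy environment: inputs x_t with targets y_t = 2*x_t + 1 (unknown to the agent).
X = rng.uniform(-1, 1, size=(100, 1))
y = 2.0 * X[:, 0] + 1.0

# Agent: a linear model y_hat = w*x + b, adjusted iteratively to reduce the loss.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    y_hat = w * X[:, 0] + b                      # agent's prediction f(x_t)
    loss = np.mean((y - y_hat) ** 2)             # loss value l(y_t, y_hat_t)
    # Gradient of the squared loss drives the parameter adjustment.
    grad_w = -2 * np.mean((y - y_hat) * X[:, 0])
    grad_b = -2 * np.mean(y - y_hat)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # approaches w ≈ 2.0, b ≈ 1.0
```

After training, the agent reproduces the environment's desired outputs: its parameters have converged to the rule the environment used to generate the data.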
In reinforcement learning (RL), after taking an action the agent receives a cost c_t ~ P(c_t | x_t, a_t), where P is an unknown probability distribution: the environment asks the agent a question and returns a noisy score as the answer. This approach is sometimes called semi-supervised learning, and many semi-supervised and unsupervised learning methods have been implemented based on this concept (section 8). In RL we do not have a simple forward loss function, which makes learning harder than in traditional supervised methods. The fundamental differences between RL and supervised learning are, first, that the agent cannot access the function it is optimizing and must query it through interaction; and second, that the agent interacts with a state-based environment, where the input x_t depends on previous actions.
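The first difference — that the cost can only be sampled by interacting, never differentiated directly — can be illustrated with a minimal two-action bandit; the costs and strategy here are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two actions with unknown expected costs (0.2 and 0.8); the agent only ever
# sees noisy samples c_t ~ P, never the distribution or a gradient of it.
TRUE_COSTS = [0.2, 0.8]

def pull(action):
    return TRUE_COSTS[action] + rng.normal(0, 0.1)

estimates = [0.0, 0.0]
counts = [0, 0]
for t in range(500):
    # Epsilon-greedy: mostly take the action with the lowest estimated cost,
    # but occasionally explore, since the cost function cannot be inspected.
    action = int(rng.integers(2)) if rng.random() < 0.1 else int(np.argmin(estimates))
    cost = pull(action)                  # queried through interaction only
    counts[action] += 1
    estimates[action] += (cost - estimates[action]) / counts[action]  # running mean

print(int(np.argmin(estimates)))  # the agent identifies action 0 as cheaper
```

A full RL setting would add the second difference — a state that depends on past actions — but even this stateless sketch shows why learning from sampled, noisy costs is harder than minimizing a known supervised loss.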
B. Feature learning
In deep learning, by contrast, these features are learned automatically and represented hierarchically across multiple layers. This is why deep learning outperforms traditional machine learning methods. The table above shows the relationship between the different feature learning methods and their learning steps.
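Automatic feature learning can be seen in even the smallest network. In the numpy sketch below (an illustrative setup of our own, not from the paper), XOR is not linearly separable, so no fixed linear feature solves it; the hidden layer must learn a useful intermediate representation, driven only by the loss:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic task where hand-crafted linear features fail.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(0, 0.5, (2, 4)); b1 = np.zeros(4)   # layer 1: learned features
W2 = rng.normal(0, 0.5, (4, 1)); b2 = np.zeros(1)   # layer 2: readout

lr, losses = 0.2, []
for _ in range(5000):
    h = np.tanh(X @ W1 + b1)           # hidden activations = learned features
    out = h @ W2 + b2
    losses.append(float(np.mean((out - Y) ** 2)))
    # Backpropagation: every layer's parameters are adjusted from the loss
    # gradient, so the features are learned rather than designed by hand.
    d_out = 2 * (out - Y) / len(X)
    dW2, db2 = h.T @ d_out, d_out.sum(0)
    dz = (d_out @ W2.T) * (1 - h ** 2)
    dW1, db1 = X.T @ dz, dz.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(losses[0], "->", losses[-1])     # loss drops as features are learned
```

Stacking more such layers yields the layered representations the paragraph above describes, with each layer building features out of the layer below it.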
C. The timing and domain of applying deep learning
D. Cutting-edge development of deep learning
2) Automatic speech recognition
E. Why use deep learning
F. Challenges of deep learning:
- Using deep learning for big data analytics.
- Scalability of deep learning methods.
- The ability to generate data, which matters when data is not available for learning the system (especially for computer vision tasks such as inverse graphics).
- Low-energy techniques for special-purpose devices, such as mobile intelligence and FPGAs.
- Multi-task and transfer learning (generalization), or multi-module learning — that is, learning from different domains or with different models together.
- Dealing with causality in learning.
Second, in most cases solutions to large-scale problems are deployed on high-performance computing (HPC) systems (supercomputers and clusters, sometimes referred to as cloud computing), which offer great potential for data-intensive business computing. But as data grows in velocity, variety, veracity, and volume, the storage and compute capacity of enterprise servers becomes increasingly strained. Most papers take these requirements into account and propose efficient HPC on heterogeneous computing systems. For example, Lawrence Livermore National Laboratory (LLNL) has developed the Livermore Big Artificial Neural Network (LBANN) framework for large-scale (supercomputer-scale) deployment of deep learning, which answers definitively whether deep learning is scalable [24].