The author, Pan Zheng, is a computer vision engineer at DeepGlint and a PhD student in the Department of Automation at Tsinghua University, where he studies under Zhang Changshui, deputy director of the State Key Laboratory of Intelligent Technology and Systems.

The term “deep learning” recently gained new popularity through AlphaGo’s man-versus-machine match with Lee Sedol. Deep learning is a branch of machine learning, a discipline that studies relationships within data. For example, it can be used to mine the mathematical relationship between income and factors such as age, gender, occupation and education. Many traditional machine learning methods, however, capture only simple relationships, typically linear ones. The world can rarely be described by linear relationships: even a question as simple as how income depends on age, gender, occupation and education cannot be expressed clearly by a linear model. The emergence of deep learning has changed this situation. Deep learning represents the relationships in data with complex models built from many nonlinear layers, and then uses a large amount of data to determine what those relationships actually are.
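To make the limits of a linear relationship concrete, here is a minimal sketch in Python. It is an illustration only, not taken from any system mentioned in this article. The toy task (XOR, where the answer is 1 only when exactly one of two inputs is 1), the function names and the hand-picked weights are all assumptions chosen for the example. The best linear fit ends up predicting about 0.5 for every case, while a tiny model with two nonlinear units reproduces the targets exactly.

```python
def relu(x):
    """A simple nonlinearity: negative values become zero."""
    return max(0.0, x)

# The four XOR cases: the target is 1 only when exactly one input is 1.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = [0, 1, 1, 0]

def fit_linear(points, targets, steps=5000, lr=0.1):
    """Fit y = a*x1 + b*x2 + c by gradient descent on the mean squared error."""
    a = b = c = 0.0
    n = len(points)
    for _ in range(steps):
        grad_a = grad_b = grad_c = 0.0
        for (x1, x2), t in zip(points, targets):
            err = a * x1 + b * x2 + c - t
            grad_a += err * x1
            grad_b += err * x2
            grad_c += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
        c -= lr * grad_c / n
    return a, b, c

def nonlinear_model(x1, x2):
    """Two hidden ReLU units with hand-picked (not learned) weights, then a linear read-out."""
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1)
    return h1 - 2 * h2

a, b, c = fit_linear(POINTS, TARGETS)
linear_predictions = [a * x1 + b * x2 + c for x1, x2 in POINTS]
nonlinear_predictions = [nonlinear_model(x1, x2) for x1, x2 in POINTS]

print("linear fit:     ", [round(p, 3) for p in linear_predictions])
print("nonlinear model:", nonlinear_predictions)
```

In a real deep learning system the weights would be learned from data rather than chosen by hand; the point here is only that adding a nonlinear step lets the model express a relationship that no straight-line fit can.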

Deep learning is inspired by the neural networks of the brain, so in a sense our brain is itself an extremely complex deep learning model. The brain’s network consists of tens of billions of interconnected neurons, and deep learning borrows the same structure: each artificial neuron applies a simple linear or nonlinear operation to its inputs and passes the result on to the neurons that follow. After a dozen or even hundreds of layers of such transfers, the final prediction is obtained.
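The layer-by-layer transfer described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the layer sizes, the random (untrained) weights, the input values and the choice of ReLU and sigmoid as nonlinearities are all made up for the example, and a real network would learn its weights from data.

```python
import math
import random

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases, activation):
    """Each neuron takes a weighted sum of its inputs plus a bias, then applies a nonlinearity."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def make_layer(n_in, n_out, rng):
    """Random weights stand in for values that training would normally determine."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the signal through every layer in turn; the last layer yields the prediction."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = dense_layer(activations, weights, biases, relu)
    weights, biases = layers[-1]
    return dense_layer(activations, weights, biases, sigmoid)

rng = random.Random(0)
sizes = [4, 8, 8, 1]  # 4 inputs, two hidden layers of 8 neurons, 1 output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
prediction = forward([0.5, -1.2, 3.0, 0.7], layers)
print(prediction)
```

A production network differs mainly in scale, with many more layers and neurons, and in the fact that its weights are fitted to large amounts of data.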



Deep learning is not a recent invention. Scholars such as Geoffrey Hinton and Yann LeCun were already using it to recognize handwritten digits at the end of the 1980s. Unfortunately, in the 1990s the performance of deep learning failed to improve substantially and was even worse than that of many simple linear models, so research on it fell silent. It was not until 2006 that Professor Hinton published a landmark paper in Science that reexamined deep learning methods and raised their performance to a new level. Since then, deep learning has surpassed traditional machine learning methods in speech recognition, computer vision, robotics, natural language processing and other fields, and has even surpassed human recognition ability on the LFW face verification benchmark and in the ImageNet natural image classification competition. AlphaGo’s defeat of Lee Sedol is another example of deep learning surpassing humans.

So what has allowed deep learning to rise again and surpass humans?

First, of course, credit goes to the decades of persistent research by Hinton and other scholars. In addition, two objective factors are extremely important:

The first is big data.

Besides connecting billions of people, the Internet has also brought together vast amounts of data. The relationship between deep learning and big data is like that between a rocket and its fuel: however powerful the rocket, it is useless without fuel, and deep learning is just as useless without big data. Because big data is essential, deep learning has so far worked best at the IT giants we all know, such as Google, Facebook, Microsoft and Baidu, which hold large amounts of data. In the era of deep learning, whoever possesses the data occupies the commanding heights of artificial intelligence.

The second is high performance computing.

Moore’s law describes how quickly computing power grows. The rapid development of computing platforms such as GPUs, supercomputers and cloud computing over the past few years has made deep learning practical. For example, in 2011 Google Brain used 1,000 machines with 16,000 CPU cores to train a deep learning model with about 1 billion connections. Today the same computation can be done on a handful of GPUs. In fact, deep learning is already in our pockets: the GPUs in our smartphones can already run some moderately sophisticated deep learning methods. I think it won’t be long before every one of us can play against AlphaGo on our phones, and in a few years our phones will be able to run neural networks as complex as the human brain.

There are many excellent Chinese scientists and Chinese companies in the field of deep learning. Among scientists, familiar names include Andrew Ng, chief scientist of Baidu; Kai Yu, founder of Baidu’s Institute of Deep Learning (IDL); Jia Yangqing, author of Caffe; Professor Xiaoou Tang; Professor Wang Xiaogang; and Sun Jian and Kaiming He, who won several ImageNet titles last year. Among companies, BAT (Baidu, Alibaba and Tencent), Qihoo 360, Sogou and Didi are all investing in deep learning. At the same time, a group of new companies built on deep learning has emerged in China, such as DeepGlint (security and autonomous driving), Megvii (face recognition), SenseTime (face recognition) and Horizon Robotics (ADAS).



Deep learning is not just about playing board games. Because it is loosely modeled on the human brain, it can take on many of the brain’s functions.

The first is vision. Our cameras can see the world as our eyes do, but they cannot understand it as our brains do. Deep learning makes up for this shortcoming. With deep learning, Google Photos, Baidu Shitu and Taobao’s Pailitao can accurately identify the categories of objects in your photos and automatically organize or search them. With deep learning, we can pay on Alipay simply by showing our face. With deep learning, DeepGlint’s behavior analysis system can track the movements of every person and vehicle in a scene and raise timely alerts about suspicious or dangerous events. With deep learning, self-driving cars will be able to recognize surrounding road conditions accurately enough. With deep learning, apps like FaceU know where faces and facial features are.

Beyond vision, deep learning is also widely used in speech recognition. Baidu’s Deep Speech 2 has outperformed human listeners in some tests. Google, Apple, Microsoft and China’s iFlytek have also launched their own speech recognition products. With the help of deep learning, computers are becoming ever better at recognizing speech, which will gradually change today’s keyboard-based model of human-computer interaction.

Deep learning is also profoundly changing the field of robotics. Visual and speech recognition based on deep learning can help robots perceive the world better. In addition, deep learning is being combined with reinforcement learning.

Reinforcement learning is a method in which a robot, rewarded and punished through its interaction with the environment, learns better strategies on its own. To take a simple example, AlphaGo is a product of reinforcement learning: it learns better strategies by playing against other players or against itself. With the introduction of deep learning, reinforcement learning can find more complex strategies. As AlphaGo’s victory over Lee Sedol shows, deep learning combined with reinforcement learning can enable robots to learn highly optimized decision-making strategies on their own in quite complex environments.
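The idea of learning from reward through interaction can be illustrated with a very small Python sketch. It uses tabular Q-learning, a classic reinforcement learning method, on a made-up five-cell corridor. This is not how AlphaGo works: AlphaGo replaces the small table with deep neural networks and adds search. The environment, the reward of 1 at the goal and the parameters `alpha`, `gamma` and `epsilon` are all assumptions for illustration.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # action 0 steps left, action 1 steps right

def step(state, action_index):
    """Move one cell, staying inside the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Mostly pick the best-known action, sometimes explore; break ties at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([i for i, value in enumerate(q_row) if value == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values purely from trial, error and reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# The greedy policy for every non-goal position, read off the learned table.
greedy_policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(greedy_policy)
```

Nobody tells the agent that the goal lies to the right; it discovers this from the reward alone. In games as large as Go a table of this kind is impossibly big, which is where deep networks come in to estimate the values instead.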

These are only some of the applications we can see; many other applications of deep learning are shaping the world out of our sight. Internet search, advertising recommendations, quantitative financial trading, machine translation, medical big data analysis, intelligent legal advice… It can be said that any field that requires predicting unknown information from a large amount of data is a place where deep learning can play a big role. In the future, artificial intelligence technology represented by deep learning may drive a new round of scientific and technological revolution, just as the steam engine, the electric motor, the computer and the Internet did, and raise productivity to a higher level.

Of course, as a practitioner, I also worry that deep learning will be overhyped, especially now that AlphaGo has introduced the technology to the general public. Deep learning has only just started, like a baby that has just learned to walk. We can imagine it growing into a giant one day, but for now many of its techniques are immature and a large number of its applications are unsatisfactory, and this will remain true for a long time to come. The development of artificial intelligence requires not a burst of collective enthusiasm but sustained commitment and effort.