
This article is from ali_tech and is published here with authorization by the Python Chinese community. All rights reserved.

Jia Yangqing may be best known for Caffe, the AI framework he wrote six years ago. After years of accumulated experience, and now a newcomer at Alibaba, how does he view artificial intelligence? Recently, Jia Yangqing shared his thoughts and insights at Alibaba. We welcome you to discuss and exchange ideas with us.


Jia Yangqing, a native of Shangyu, Zhejiang Province, graduated from the Department of Automation at Tsinghua University and received his PhD in computer science from the University of California, Berkeley.

The recent popularity of deep learning is generally traced to a milestone: AlexNet's success in image recognition in 2012. AlexNet greatly increased industry acceptance of machine learning: while many earlier machine learning algorithms were at a "barely good enough for a demo" level, AlexNet's results crossed the usability threshold for many applications, creating an explosion of interest in the application world.

Of course, nothing happens overnight, and before 2012 many of the ingredients of success had already begun to emerge: the ImageNet database, released in 2009, laid the foundation of large-scale annotated data; in 2010, Dan Ciresan of IDSIA used GPGPU for object recognition for the first time; in 2011, at the ICDAR conference in Beijing, neural networks made a big impact on offline Chinese handwriting recognition. Even the ReLU layer used in AlexNet appeared in the neuroscience literature back in 2001. To a certain extent, then, the success of neural networks was a natural progression. Much has been written about what happened after 2012, so I won't repeat it here.

Successes and Limitations

While acknowledging the success of neural networks, we should also dig into the theoretical and engineering background behind it. Why did neural networks and deep learning fail decades ago but succeed now? What accounts for their success, and what are their limitations? I can only mention a few key points here:

  • Part of the reason for the success is big data; part is high-performance computing.
  • The limitations lie partly in structured understanding, and partly in efficient learning algorithms for small data.


Large amounts of data, enabled by the rise of the mobile Internet and by low-cost annotated-data platforms such as AWS, allow machine learning algorithms to break through their data constraints. The rise of high-performance computing, such as GPGPU, has made it possible to train complex networks requiring exaflop-scale computation in a manageable amount of time (days or less). Note that high-performance computing is not limited to GPUs: large-scale vectorized computation on CPUs and MPI abstractions in distributed computing all build on research results from the field of HPC, which began to rise in the 1960s.
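To make the vectorization point concrete, here is a minimal sketch (my own illustration, not from the article) comparing a plain Python loop with a NumPy vectorized dot product; exact timings are machine-dependent, but the vectorized path is typically orders of magnitude faster:

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Scalar loop: one Python-level multiply-add per iteration.
t0 = time.perf_counter()
s_loop = 0.0
for i in range(n):
    s_loop += a[i] * b[i]
t_loop = time.perf_counter() - t0

# Vectorized: a single call dispatching to optimized native (SIMD/BLAS) code.
t0 = time.perf_counter()
s_vec = float(np.dot(a, b))
t_vec = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s  vectorized: {t_vec:.4f}s")
```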


However, we should also look at the limitations of deep learning. Today, many deep learning algorithms have achieved breakthroughs at the level of perception: recognizing speech and images from unstructured data. For more structured problems, naively applying deep learning algorithms may not work well. Some readers may ask why systems like AlphaGo and StarCraft agents succeed: on the one hand, deep learning solves the perception problem; on the other, many traditional, non-deep-learning techniques, such as Q-learning and other reinforcement learning algorithms, work alongside it to hold up the whole system. Moreover, when data is scarce, the complex networks of deep learning often fail to achieve good results; yet in many fields, especially medicine, data is very hard to obtain, and learning effectively from small data may be a very meaningful research direction in the future.
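To show the kind of traditional reinforcement learning referred to above, here is a minimal tabular Q-learning sketch (a generic textbook formulation, not AlphaGo's actual algorithm; the `reset`/`step` environment interface is a Gym-style assumption):

```python
import random
from collections import defaultdict

def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning; assumes env.reset() -> state and
    env.step(action) -> (next_state, reward, done)."""
    Q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: explore with probability eps, otherwise exploit.
            if random.random() < eps:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
            best_next = max(Q[(next_state, a)] for a in actions)
            target = reward + gamma * best_next * (not done)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state
    return Q
```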


Where does deep learning, or AI more broadly, go next? My personal feeling is that while AI frameworks drew a great deal of attention a few years ago, their recent homogenization means that frameworks are no longer the problem most in need of solving. The widespread industrial adoption of frameworks like TensorFlow, and the heavy use of Python for modeling across frameworks, have solved many problems we once had to program ourselves. As AI engineers, we therefore need to look beyond the framework and find value in a broader field.
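As a small illustration of how little code modern frameworks require for modeling, here is a toy classifier in TensorFlow's Keras API (my own sketch; the architecture is arbitrary):

```python
import tensorflow as tf

# A small image classifier in a few declarative lines; the framework
# handles differentiation, optimization, and hardware placement.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```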

Challenges

As we move up the stack, we will encounter many new challenges in products and in research, such as:


  • How should traditional deep learning applications, such as speech and vision, deliver products and value? Computer vision, for example, still largely stops at the level of security and surveillance; how can it penetrate medicine, traditional industry, and even social care (how can we help the blind see the world?)? These areas need not only technology but also product thinking.
  • How to solve problems beyond speech and images. At Alibaba, and at many Internet companies, the "silent majority" application is the recommendation system: it often accounts for more than 80% or even 90% of machine learning compute. How to integrate deep learning further with traditional recommendation systems, how to find new models, and how to model the effects of search and recommendation may be less well known than speech and images, but they are indispensable skills for companies.
  • Even on the scientific side, our challenges are just beginning. Berkeley professor Jitendra Malik once said: "We used to tune algorithms by hand; now we tune network architectures by hand. If we remain constrained by this model, artificial intelligence won't progress." How to escape manual tuning and use intelligence to improve intelligence is a very interesting problem. The original AutoML systems still used enormous compute to brute-force through model structures; now that more efficient AutoML techniques are emerging, this is worth watching (see the sketch after this list).
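To ground the brute-force end of AutoML, here is a minimal random-architecture-search sketch (entirely hypothetical: the search space is invented, and the scoring function is a dummy stand-in for a real training-plus-validation run):

```python
import random

# Hypothetical search space: depth and width of a simple MLP.
SEARCH_SPACE = {
    "layers": [1, 2, 3, 4],
    "width": [32, 64, 128, 256],
}

def evaluate(config):
    # Stand-in for training a model built from `config` and returning
    # validation accuracy; a random score keeps the sketch runnable.
    return random.random()

def random_search(trials=20):
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        score = evaluate(config)  # in practice: one full training run per trial
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

print(random_search())
```

Each trial costs a full training run, which is why this brute-force style consumed so much compute, and why more sample-efficient AutoML methods matter.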

Opportunities

Moving forward, we will find that traditional knowledge of systems and architecture, as well as the practice of computer software engineering, will bring many new opportunities to AI, such as:


  • Traditional AI frameworks are hand-written high-performance code, but as models change and new hardware platforms emerge, how can we further improve software efficiency? We are already seeing compiler technology, and traditional AI search methods applied in reverse, used to optimize AI frameworks themselves, such as Google's XLA and the University of Washington's TVM. These are at an early stage but are already showing their potential (see the sketch after this list).
  • How can platforms improve their integration capabilities? In the open source world, the norm is one person, one machine, and a few GPUs, training fairly academic models. In large-scale applications, however, the data volume is huge, the models are complex, and the cluster raises all kinds of scheduling challenges (can 256 GPUs be requisitioned at once? can computing resources be scheduled elastically?). This poses many challenges both for our own machine learning platform and for the services we provide to customers on the cloud.
  • How to co-design software and hardware. As deep learning's computational patterns begin to solidify (CNNs, for example), the advantages of new and specialized hardware (such as ASICs) start to emerge. Co-designing hardware and software, so as to avoid "the hardware is out, but we don't know how to program it" or "the model has changed, and the hardware was obsolete the moment it shipped", will be a great direction over the next few years.
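As a tiny illustration of the compiler-based approach that XLA takes, here is a JAX sketch (JAX compiles through XLA; the function itself is my own toy example, not from the article):

```python
import jax
import jax.numpy as jnp

# A chain of elementwise ops: a compiler like XLA can fuse these into a
# single kernel instead of materializing each intermediate array.
def fused_ops(x):
    return jnp.tanh(x * 2.0 + 1.0).sum()

compiled = jax.jit(fused_ops)  # trace once, compile with XLA

x = jnp.arange(1024.0)
print(compiled(x))  # first call compiles; subsequent calls reuse the binary
```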


Artificial intelligence is a fast-changing field; we joke that the scientific achievements of 2012 are already ancient history. Rapid iteration brings a great many opportunities and challenges, which is very exciting. In today's era of cloud and intelligence, researchers and engineers, whether experienced or new to AI, who can quickly learn and keep refreshing their grasp of algorithmic and engineering challenges will be able to lead and empower every field of society through algorithmic innovation. Here, the open source code, research papers, and platforms of the artificial intelligence field have lowered the barriers to entry more than ever before, and the opportunities are in our own hands.
