In 2012, the world learned of a remarkable project at Google’s secretive X lab: a simulated neural network with three million neurons that learned to identify cats and people in images taken from YouTube, without human help.

The project’s members formed a new research group, Google Brain, under the company’s search division. Together with other researchers, they quickly demonstrated to the world that artificial neural networks, a decades-old invention, could bring image recognition and speech recognition to unprecedented levels of accuracy. The success of deep learning prompted Google and other companies to invest heavily in artificial intelligence, and even led some experts to declare that “we should be ready for software that is smarter than humans.”

Google’s cat detector, however, turned out to be something of a dead end. The recent success of deep learning has been built on software that requires humans to help it learn, which greatly limits how far AI can go.

Google’s experiment used a method called unsupervised learning, in which the software is fed raw data and must make sense of it without human help. Although the system learned to recognize cats, faces and other objects, it was not accurate enough for practical use. Both deep learning research and the explosion of products built on it rely instead on supervised learning, in which data is manually labeled before being fed to the software; every object in a picture is named, for example.
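To make the distinction concrete, here is a minimal sketch, assuming Python with scikit-learn (neither is mentioned in the article, and the digits dataset and model choices are purely illustrative): a supervised classifier is trained on human-provided labels, while an unsupervised clustering algorithm must find structure in the same images with the labels withheld.

```python
# Illustrative sketch of supervised vs. unsupervised learning.
# scikit-learn and the digits dataset are assumptions, not from the article.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

digits = load_digits()                 # small images of handwritten digits
images, labels = digits.data, digits.target

# Supervised: the model sees a human-provided label for every image.
classifier = LogisticRegression(max_iter=1000)
classifier.fit(images, labels)
print("supervised prediction for first image:", classifier.predict(images[:1]))

# Unsupervised: the model sees only raw pixels and must find structure itself.
clusterer = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(images)   # no labels are used here
print("unsupervised cluster for the same image:", cluster_ids[0])
```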

This has proved very effective for problems such as identifying objects in pictures, filtering out spam and even suggesting replies to text messages (a feature Google introduced last year). But if software is to gain a deeper understanding of the world, it may need unsupervised learning, says Jeff Dean, who now heads the Google Brain project and worked on Google X’s “cat detector” project.

“I’m pretty sure we need it,” Dean says. “Supervised learning works great when you have the right data set, but ultimately unsupervised learning is going to be an important part of building truly intelligent systems; if you look at how humans learn, it’s all unsupervised.”

A clear example is the way learning in infancy forms the foundation of adult intelligence. We know, for instance, that an object continues to exist when it moves out of sight, and that it falls to the ground when unsupported; these are things we learn by observing the world, without explicit guidance. Robots, like animals, will need this kind of common sense if they are to operate in the real world, and it can also underpin more abstract tasks such as language understanding.


Yann LeCun, director of Facebook’s AI research group, says that if AI is to live up to people’s larger ambitions, researchers will have to figure out how software can do the things that come easily to human babies. “We all know that in the end the answer is unsupervised learning,” he says. “Solving the problem of unsupervised learning will take us to the next level.”

Although they don’t have the final answer, researchers at companies like Facebook and Google, as well as in academia, are experimenting with some limited forms of unsupervised learning.

One branch of research aims to create artificial neural networks that digest videos and images and use the knowledge they gain about the world to produce new images, a sign that they have formed internal representations of how the world works. Making accurate predictions about the world is a fundamental feature of human intelligence.
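As a rough illustration of what “forming an internal representation” can mean in practice, the sketch below trains a tiny autoencoder on unlabeled images: it compresses each image into a small code and then tries to reproduce the image from that code alone. PyTorch, the layer sizes and the random stand-in data are all assumptions made for the example; none of the labs’ actual models are shown.

```python
# Minimal autoencoder sketch: learn a compact internal representation of
# unlabeled images and reproduce the images from it. PyTorch and every
# architecture choice here are illustrative assumptions.
import torch
from torch import nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(),
                        nn.Linear(64, 16))            # 16-number internal code
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                        nn.Linear(64, 28 * 28), nn.Sigmoid())

optimizer = torch.optim.Adam(list(encoder.parameters()) +
                             list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

images = torch.rand(256, 1, 28, 28)    # stand-in for real unlabeled images

for step in range(100):                # note: no labels anywhere in this loop
    codes = encoder(images)                        # compress to internal code
    reconstructions = decoder(codes).view_as(images)   # rebuild from the code
    loss = loss_fn(reconstructions, images)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```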

Researchers at Facebook built a piece of software called EyeScream that generates recognizable images from prompts such as “church” or “plane”. They are also working on software that predicts what happens next in videos. Researchers at Google’s DeepMind have developed software that, when given partially obscured images, can fill in the missing parts with realistic detail.

DeepMind is also working on reinforcement learning, an approach that does without labeled data: software learns by receiving automatic feedback on its own performance, such as the score in a computer game. Other researchers, working outside deep learning, have shown that software can learn to recognize handwritten characters from a single example (see “AI Finally Learns like Humans”).
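For readers unfamiliar with the idea, here is a bare-bones reinforcement learning sketch in which the only feedback is a game-like score rather than labeled examples. The toy “walk to the goal” environment and the Q-learning parameters are illustrative assumptions, not anything DeepMind has published.

```python
# Tabular Q-learning on a toy 5-cell corridor: the agent only ever sees a
# score-like reward (+1 at the goal), never labeled examples. The environment
# and hyperparameters are illustrative assumptions.
import random

N_STATES, GOAL = 5, 4            # cells 0..4; reaching cell 4 scores a point
ACTIONS = [-1, +1]               # step left or right
q_table = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != GOAL:
        # Explore occasionally; otherwise pick the action with the best score so far.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_table[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0      # the "game score"
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += alpha * (reward + gamma * best_next
                                             - q_table[(state, action)])
        state = next_state

print("learned direction in each cell:",
      [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)])
```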

But so far, none of these attempts has revealed a path to human-level unsupervised learning: software that can learn complex things about the real world through observation or experiment alone. “There seems to be a key idea missing,” said Adam Coates, director of Baidu’s Silicon Valley AI Lab.

As the search continues, Coates says, supervised learning still has a lot to offer: Internet companies have access to vast amounts of data about what people do and care about, and can use that material to build products such as voice interfaces and personal assistants that are far more useful than today’s offerings. “There’s a lot you can do with tagged data in the near future,” he said. Big companies already spend a lot of money paying contractors to tag data for their machine learning systems.

Facebook’s LeCun believes researchers won’t rely on tagging data forever. But he declined to say how long it would take for software to reach the level of human intelligence. “We know the ingredients, but we don’t know the recipe,” he said. “It may take some time.”