Computers are really good at recognizing objects. But a new paper draws attention to situations where state-of-the-art algorithms fall flat. It details how researchers fooled top-performing deep neural networks with simple, randomly generated images. Again and again, the algorithms mistook jumbled abstract shapes for parrots, ping-pong paddles, bagels, and butterflies.

These findings drive home an obvious but critically important fact: computer vision is fundamentally different from human vision. Yet because computers increasingly rely on neural networks that teach themselves to see, we aren't quite sure how the two differ. As Jeff Clune, one of the researchers who carried out the study, puts it, with AI, “we can get results without knowing how to get results.”

Look at the black and yellow stripes below and tell me what you see. Nothing much, right? Ask a state-of-the-art image-recognition network the same question, though, and it will tell you it’s a school bus. It will also say it is more than 99 percent certain of that answer. And that answer is 100 percent wrong.

Evolving images to fool artificial intelligence

One way to find out why these self-taught algorithms are so smart is to find out where they’re dumb. In this case, Clune and PhD students Anh Nguyen and Jason Yosinski wanted to see whether a state-of-the-art image-recognition neural network was susceptible to false positives. We know a computer can recognize a koala. But can it be made to recognize something else as a koala?

To find the answer, the team used an evolutionary algorithm to generate random images. These images essentially work as visual bait. The program generates a picture and then makes a slightly altered copy of it (a mutation). Both the original and the mutated copy are shown to a neural network trained on ImageNet, a collection of 1.3 million images that has become a go-to resource for training computer-vision AI. If the network is more confident about the mutated copy, the researchers keep it and repeat the process. Otherwise, they step back and try again. “It’s not survival of the fittest,” Clune says. “It’s survival of the prettiest images.” Or, more precisely, survival of the images the computer recognizes with the highest confidence.
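A minimal sketch of that keep-it-if-confidence-goes-up loop might look like the Python below. It assumes a hypothetical `classifier` callable that maps an image to per-class confidences; the names, patch size, and stopping threshold are illustrative, and the actual study used more sophisticated evolutionary algorithms and image encodings.

```python
import numpy as np

def evolve_fooling_image(classifier, target_class, steps=10_000, size=(224, 224, 3)):
    """Hill-climbing sketch: mutate an image and keep the mutation only if the
    classifier's confidence in the target class goes up.

    `classifier(image)` is assumed to return a mapping of class -> confidence.
    This is a simplified stand-in for the paper's evolutionary algorithms.
    """
    rng = np.random.default_rng(0)
    image = rng.random(size)                      # start from random noise
    best_conf = classifier(image)[target_class]

    for _ in range(steps):
        # Mutate: nudge a random 16x16 patch of pixels by a small random amount.
        candidate = image.copy()
        x, y = rng.integers(0, size[0] - 16), rng.integers(0, size[1] - 16)
        candidate[x:x + 16, y:y + 16] += rng.normal(0, 0.1, (16, 16, size[2]))
        candidate = np.clip(candidate, 0.0, 1.0)

        conf = classifier(candidate)[target_class]
        if conf > best_conf:                      # "survival of the prettiest"
            image, best_conf = candidate, conf    # keep the better-scoring copy

        if best_conf > 0.99:                      # stop once the net is >99% sure
            break

    return image, best_conf
```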

In the end, the technique produced dozens of images that the neural network recognized with more than 99 percent confidence. To your eyes, they don’t look like much of anything: a series of wavy blue and orange lines, a cluster of ovals, some yellow and black stripes. But to the AI, they were all obvious matches: a goldfish, a remote control, and a school bus.

A peek inside the black box

In some cases, you can start to see how the AI was fooled. If you squint, a school bus does look a bit like alternating bands of yellow and black. Similarly, the randomly generated image the network labeled a “monarch butterfly” does contain shapes resembling butterfly wings, and the “ski mask” image does look like an exaggerated human face.

But things get more complicated. The researchers found that the AI could also be fooled by images of pure static. Using a slightly different evolutionary technique, they generated another set of images. To the human eye, these all look nearly identical, like the snow on a broken TV. Yet the same state-of-the-art neural network declared, with more than 99 percent confidence, that they showed centipedes, cheetahs, and peacocks.
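As a rough illustration of probing a modern classifier with static, here is a hypothetical check using an off-the-shelf pretrained torchvision ResNet; this is not the network or code from the study, and preprocessing is skipped for brevity. Note that plain, un-evolved noise usually earns only a low-confidence guess; it is the evolutionary search sketched above that pushes confidence past 99 percent.

```python
import torch
from torchvision import models

# Feed pure random "broken TV" static to a pretrained ImageNet classifier
# and see what it guesses, and how confidently.
weights = models.ResNet50_Weights.IMAGENET1K_V2
model = models.resnet50(weights=weights)
model.eval()

noise = torch.rand(1, 3, 224, 224)          # random static in [0, 1]
with torch.no_grad():
    probs = torch.softmax(model(noise), dim=1)

confidence, class_idx = probs.max(dim=1)
label = weights.meta["categories"][class_idx.item()]
print(f"Top guess: {label} ({confidence.item():.1%} confident)")
```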

To Clune, these findings suggest that neural networks recognize objects through a handful of visual cues. Those cues may or may not resemble the ones humans rely on (the yellow and black stripes of a school bus, say). The results with the static images show that, at least some of the time, these cues are extremely fine-grained. Perhaps during training the network noticed that a run of “green pixel, green pixel, purple pixel, green pixel” was common in photos of peacocks. When Clune and his team generated images that happened to contain the same run, it triggered the “peacock” label. The researchers could likewise trigger the “lizard” label with abstract images that looked nothing alike, suggesting the network relies on only a few such cues to identify an object, and that any one of them can set off a match on its own.



The fact that it takes elaborate schemes like this to fool these algorithms also points to a larger truth about artificial intelligence today: even when they work, we don’t always know why they work. “These models are getting very large, very complex, and they’re learning by themselves,” said Clune, who heads the Evolving Artificial Intelligence Laboratory at the University of Wyoming. “There are millions of neurons in a neural network, and they’re all doing their own thing. And we don’t quite understand why they’ve done so spectacularly well.”

Studies like this one attempt to reverse engineer these models, to understand the broad outlines of how the artificial intelligence works. “In the last year or two, we’ve learned a lot about what’s going on inside neural network black boxes,” Clune explains. “It’s all still very vague, but we’re starting to see it.”



Why is computer misjudgment such a big deal, anyway?

Clune discussed these findings with fellow researchers at the Conference on Neural Information Processing Systems in Montreal earlier this month, a gathering of some of the brightest minds in artificial intelligence. The reaction fell into two camps. One camp, older and more experienced in the field, found the research significant. They might not have predicted these exact results, but they considered them entirely plausible.

The second camp, made up of people who have not spent as much time thinking about what makes today’s computer brains tick, was shocked by the findings. At least initially, they were surprised that such powerful algorithms could be so plainly wrong. These were people publishing papers on neural networks and attending this year’s top AI conference, mind you.

To Clune, the split response is evidence of a generational shift in AI. A few years ago, the people working with artificial intelligence were the ones building it. These days, the networks are good enough that researchers simply take them off the shelf and put them to work. “A lot of times, you can solve a problem directly with these algorithms,” Clune said. “It’s a gold rush for people to come in and use AI.”



This is not necessarily a bad thing. But as more and more is built on top of AI, it becomes more and more important to probe its flaws. If an algorithm can decide an image shows an animal based on a single line of pixels, imagine how easily pornography could slip past a safe-search filter. In the short term, Clune hopes the study will encourage other researchers to develop algorithms that take images’ overall structure into account. In other words, algorithms that make computer vision more like human vision.

The study also invites us to consider where else these flaws might show up. Does facial recognition, for example, rely on the same kind of technology? “It’s the same problem that affects facial recognition algorithms,” Clune said.




You can also imagine more mischievous applications of the discovery. Maybe a 3D-printed prosthetic nose would be enough to trick a computer into thinking you’re someone else. Maybe clothing printed with the right geometric patterns would make a surveillance system ignore you entirely. The finding underscores that as the use of computer vision grows, so does the potential for subverting it.

More broadly, the finding is a reminder of a rapidly emerging reality as we move into the era of self-learning systems. Today we still have control over what we create. But as these systems continue to build themselves, we may soon find they are too complex for us to fully comprehend. “Humans can’t read the computer code anymore,” Clune said. “It’s like an economy of interactive parts, from which intelligence emerges.”

We will certainly put that intelligence to use right away. Whether we will fully understand it when we do is less clear.