Tom B. Brown and Catherine Olsson, research engineers on the Google Brain team
Source | Google Developers public account

Machine learning is increasingly being applied to real-world problems, including in medicine, chemistry, and agriculture, but significant challenges remain when it comes to deploying it in security-critical settings. In particular, all known machine learning algorithms are vulnerable to adversarial examples (AI.Google/Research/PU…): inputs deliberately designed by an attacker to cause the model to make a mistake. Most previous research on adversarial examples has focused on errors induced by small modifications of an input, with the aim of building more robust models, but real-world adversaries are often not bound by the "small modification" constraint. Moreover, machine learning models often make confident errors when facing an adversary, so we urgently need classifiers that make no confident errors at all, even against an adversary who can submit arbitrary inputs designed to fool the system.
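To make the "small modification" setting concrete, here is a minimal sketch of a classic perturbation attack in the style of the fast gradient sign method. The gradient argument and the epsilon budget are illustrative assumptions; a real attack would obtain the gradient from the model by backpropagation:

```python
import numpy as np

def fgsm_perturb(image, grad_wrt_input, epsilon=0.01):
    """Small-modification attack sketch: nudge each pixel a tiny step
    in the direction that increases the model's loss (FGSM-style).

    `grad_wrt_input` stands in for the gradient of the model's loss
    with respect to the input image.
    """
    adversarial = image + epsilon * np.sign(grad_wrt_input)
    # Keep the result a valid image. Because the change is tiny,
    # researchers can assume the true label is unchanged.
    return np.clip(adversarial, 0.0, 1.0)
```

The challenge described below removes exactly this constraint: attackers are not limited to perturbations of existing labeled images.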
Today we are announcing the launch of the Unrestricted Adversarial Examples Challenge, a community-based challenge designed to encourage and measure progress toward the goal of machine learning models that make zero confident classification errors. Previous adversarial-example research has focused only on small modifications of pre-labeled data points (researchers can assume that an image should keep the same label after a small perturbation is applied). This challenge instead allows unrestricted inputs: participants may submit arbitrary images of the target classes, so defenses can be developed and tested against a far broader range of adversarial examples.
Structure of the challenge
Contestants can participate in either of two roles: as a defender, submitting a classifier that is hard to fool, or as an attacker, submitting arbitrary inputs intended to fool the defenders' models. In a "warm-up" phase before the full challenge, we will provide a set of fixed attacks against which participants can design defense networks. Once the community can reliably defeat those fixed attacks, we will launch the full two-sided challenge, with prizes for both attackers and defenders.
The defender's goal is to correctly label a clean test set of birds and bicycles with high accuracy, while making no confident errors on any bird or bicycle image an attacker provides. The attacker's goal is to find an image of a bird that the defending classifier confidently labels as a bicycle (or vice versa). We want to make the defender's task as tractable as possible, so we exclude all images that are ambiguous (such as a bird riding a bicycle) or that are not obviously either class (such as an aerial view of a park, or random noise).
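As an illustration of the defender's contract, the sketch below wraps a hypothetical `predict_probs` function and answers only when the model is sure, which is one simple way to trade raw accuracy against the risk of a confident error. The threshold value is our assumption for illustration, not part of the challenge rules:

```python
BIRD, BICYCLE, ABSTAIN = "bird", "bicycle", "abstain"

def defend(image, predict_probs, threshold=0.95):
    """Return a confident label only when the model is sure; otherwise abstain.

    `predict_probs` is a hypothetical callable returning
    (p_bird, p_bicycle) for an input image. Abstaining is always safe:
    a defense is only broken by a *confident* wrong answer.
    """
    p_bird, p_bicycle = predict_probs(image)
    if p_bird >= threshold:
        return BIRD
    if p_bicycle >= threshold:
        return BICYCLE
    return ABSTAIN
```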
Attackers may submit absolutely any image of a bird or a bicycle in an attempt to fool the defending classifier. For example, an attacker could photograph birds, use 3D rendering software, composite images with image-editing software, or use generative models, among other techniques, to produce novel images of birds.
To verify a new image provided by an attacker, we ask an ensemble of human raters to label it. This procedure lets attackers submit arbitrary images, not just test-set images with small modifications. A defense is broken if the defending classifier confidently classifies an image provided by an attacker as a "bird" while the human raters consistently label it as a bicycle (or vice versa). You can find more details on the structure of the challenge in our paper (drive.google.com/file/d/1T0y…).
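The verification rule can be summarized in a few lines. The rater-agreement criterion below (unanimity among raters) is our assumption for illustration, not the challenge's exact protocol:

```python
def defense_is_broken(model_label, human_labels):
    """A defense is broken when the model confidently outputs one class
    while human raters consistently assign the other.

    `model_label` is the defender's answer ("bird", "bicycle", or
    "abstain"); `human_labels` is the list of labels from human raters.
    """
    if model_label == "abstain":
        return False  # an abstention can never break the defense
    humans_agree = len(set(human_labels)) == 1
    return humans_agree and human_labels[0] != model_label
```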
How to participate
If you're interested in participating, you can find the getting-started guide in the GitHub project. We have released the warm-up dataset, the evaluation pipeline, and the baseline attacks, and we will keep the leaderboard up to date and publish the community's best defense models. We look forward to your participation! Note: GitHub project link: github.com/google/unre…
Acknowledgements
The Unrestricted Adversarial Examples Challenge was organized by Tom Brown, Catherine Olsson, Nicholas Carlini, Chiyuan Zhang, and Ian Goodfellow of Google, and Paul Christiano of OpenAI.