Learning Universal Adversarial Perturbations with Generative Models
Jamie Hayes & George Danezis UCL
Adversarial examples transfer between different models. An adversarial example crafted against one model will generally fool other models.
[SZS13] Szegedy et al. Intriguing properties of neural networks.
[Diagram: an Adversarial Example crafted against Model 1 is fed to Model 2 and fools it as well]
Why do adversarial examples transfer?
[GSS15] Goodfellow et al. Explaining and Harnessing Adversarial Examples [LCL17] Liu et al. Delving into Transferable Adversarial Examples and Black-Box Attacks
In the most extreme case, it is possible to construct a single perturbation that will fool a model when added to any image!
[GSS15] Goodfellow et al. Explaining and Harnessing Adversarial Examples. [MFF16] Moosavi-Dezfooli et al. Universal Adversarial Perturbations.
[Figure: the same universal perturbation added to images labeled Banana, Truck, Cat, Hammer, Dog, Football]
Can a neural network learn universal adversarial perturbations?
[Pipeline: Adversarial Model generates a perturbation, which is Scaled, added to the input, Clipped to the valid pixel range, and passed to the Target Model to Classify]
Given a model f and an image x correctly classified as c0, the attacker model is trained to minimize a loss that pushes f's prediction on the perturbed image away from c0. We scale the perturbation δ so that its norm never exceeds 0.04.
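Below is a minimal PyTorch sketch of one training step of this pipeline, a guess at the setup rather than the authors' code: the fixed noise input z, the global l_inf rescaling, and the negative cross-entropy fooling loss are all assumptions for illustration, and uan_step is an illustrative name.

```python
import torch.nn.functional as F

def uan_step(generator, target_model, x, c0, z, eps=0.04):
    """One (hypothetical) training step for the adversarial model.

    x: batch of images in [0, 1]; c0: their correct labels;
    z: fixed noise vector fed to the generator.
    """
    delta = generator(z)                  # raw universal perturbation
    # Scale: rescale so the perturbation's l_inf norm never exceeds eps.
    delta = eps * delta / delta.abs().max()
    x_adv = (x + delta).clamp(0.0, 1.0)   # Clip to the valid pixel range
    logits = target_model(x_adv)          # Classify with the frozen target
    # Untargeted fooling loss (assumption): minimizing the *negative*
    # cross-entropy of the true class pushes predictions away from c0.
    return -F.cross_entropy(logits, c0)
```

Each optimizer step would backpropagate this loss into the generator's weights while the target model stays frozen.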
Learned Universal Adversarial Perturbations
[Figure: the perturbations learned against Inception-V3, VGG-19, and ResNet-152]
ImageNet test accuracy
              Inception-V3   VGG-19   ResNet-152
Original:         77.2%       78.4%      71.0%
Adversarial:      22.7%       11.1%      15.1%
[Figure: sample misclassifications under the universal perturbations. Inception-V3: Fire engine (54.6%), VGG-19: Radio telescope (97.5%), ResNet-152: Table lamp (87.2%); Inception-V3: Wrecker (79.4%), ResNet-152: Tabby cat (41.9%), VGG-19: Great Pyrenees (36.7%)]
We can perform targeted attacks that force the model to always classify an input as a chosen label c by flipping the loss term: instead of pushing the prediction away from the true class c0, it pulls the prediction toward the target class c.
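Continuing the sketch above, the targeted change amounts to one line; targeted_loss is an illustrative name, and plain cross-entropy toward c is an assumed stand-in for the paper's exact targeted term.

```python
import torch.nn.functional as F

def targeted_loss(logits, c):
    # c: batch of labels all set to the chosen target class
    # (e.g. "golf ball"). Minimizing ordinary cross-entropy toward c
    # pulls f(x + delta) to the target instead of away from c0.
    return F.cross_entropy(logits, c)
```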
[Figure: targeted attack results. Clean predictions: American egret (95.0%, Inception-V3), Indian cobra (99.9%, VGG-19), Binoculars (99.9%, ResNet-152); with the targeted perturbation: Golf ball at 98.8% (Inception-V3), 62.9% (ResNet-152), and 99.7% (VGG-19)]
Target class: Golf ball
Include adversarial examples during training to improve robustness. Instead of optimizing the loss on clean inputs alone, L(f(x), y), optimize it on both the clean and the perturbed inputs, L(f(x), y) + L(f(x + δ), y).
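A sketch of that augmented objective, assuming the common formulation that simply sums the clean and adversarial cross-entropy terms; adv_training_loss is an illustrative name.

```python
import torch.nn.functional as F

def adv_training_loss(model, x, y, delta):
    # Clean term: the usual classification loss.
    clean = F.cross_entropy(model(x), y)
    # Adversarial term: the same loss on the perturbed input.
    adv = F.cross_entropy(model((x + delta).clamp(0.0, 1.0)), y)
    return clean + adv  # equal weighting of the two terms is an assumption
```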
Play a cat-and-mouse game (see the loop sketch below):
1) Train the generative model to create perturbations; report target model accuracy.
2) Use adversarial training to defend the target model; report target model accuracy.
3) Go to (1).
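The loop at a high level; train_generator, adversarially_train, evaluate, and num_rounds are hypothetical helpers standing in for the steps above, not the authors' code.

```python
for r in range(num_rounds):                    # num_rounds: hypothetical
    # (1) Attacker: fit the generative model against the current target,
    #     then measure the damage.
    generator = train_generator(target_model)  # hypothetical helper
    print(f"round {r}: accuracy under attack = "
          f"{evaluate(target_model, generator):.3f}")
    # (2) Defender: adversarially train the target model on the
    #     generator's perturbations, then re-measure.
    target_model = adversarially_train(target_model, generator)
    print(f"round {r}: accuracy after defense = "
          f"{evaluate(target_model, generator):.3f}")
    # (3) Loop back to (1).
```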
Three pre-prints using the same technique appeared online within a few days of one another: this work, Poursaeed et al. [1], and Mopuri et al. [2].
[1] Poursaeed et al. Generative Adversarial Perturbations. [2] Mopuri et al. NAG: Network for Adversary Generation.
                      VGG-19   Inception-V1
This work              0.846      0.809
Poursaeed et al. [1]   0.801      0.792
Mopuri et al. [2]      0.838      0.904
j.hayes@cs.ucl.ac.uk @_jamiedh