Adversarial examples are not mysterious, generalization is
Angus Galloway, University of Guelph
gallowaa@uoguelph.ca
THE ADVERSARIAL EXAMPLES PHENOMENON
Machine learning models generalize well to an unseen test set, yet every input of a particular class is extremely close to an input of another class.

“Accepted” informal definition: any input designed to fool a machine learning system.
FORMAL DEFINITIONS
A “misclassification” adversarial candidate $\hat{x}$ for a neural network $F$ is formed from input $x$ via some perturbation $\delta$:

$$\hat{x} = x + \delta$$

where $\delta$ is usually derived from the gradient of the loss $\nabla_x \mathcal{L}(\theta, x, y)$ w.r.t. $x$, with $\|\delta\|_p \leq \epsilon$ for some small scalar $\epsilon$ and $p \in \{1, 2, \infty\}$, such that $F(x) \neq F(\hat{x})$.
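A minimal sketch of the $p = \infty$ case in TensorFlow 2, assuming a Keras classifier that outputs logits (the fgsm_perturb name and the [0, 1] input range are illustrative assumptions, not the talk's exact code):

```python
import tensorflow as tf

def fgsm_perturb(model, x, y, eps):
    """One-step L-infinity attack: delta = eps * sign(grad_x L(theta, x, y))."""
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    with tf.GradientTape() as tape:
        tape.watch(x)                             # x is a plain tensor, not a Variable
        loss = loss_fn(y, model(x))
    grad = tape.gradient(loss, x)
    x_adv = x + eps * tf.sign(grad)               # guarantees ||delta||_inf <= eps
    return tf.clip_by_value(x_adv, 0.0, 1.0)      # keep inputs in a valid pixel range
```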
GOODFELLOW ET AL. 2015
For input $x \in \mathbb{R}^n$, there is an adversarial example $\tilde{x} = x + \eta$ subject to the constraint $\|\eta\|_\infty < \epsilon$. The dot product between a weight vector $w$ and the adversarial example is then:

$$w^T \tilde{x} = w^T x + w^T \eta$$

Choosing $\eta = \epsilon \, \mathrm{sign}(w)$, if the elements of $w$ have average magnitude $m$, the adversarial activation $w^T \eta$ grows linearly with $\epsilon m n$.
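A numeric illustration of that growth (an assumed setup, not from the paper): draw weights with mean magnitude $m$ and check that $w^T \eta$ tracks $\epsilon m n$.

```python
import numpy as np

rng = np.random.default_rng(0)
eps, m = 0.1, 0.5
for n in [10, 100, 1000]:
    w = rng.uniform(-2 * m, 2 * m, size=n)   # uniform on [-2m, 2m] => E|w_i| = m
    eta = eps * np.sign(w)                   # max-norm-constrained perturbation
    print(n, w @ eta, eps * m * n)           # the two values track each other
```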
TANAY & GRIFFIN 2016
But both $w^T x$ and $w^T \eta$ grow linearly with dimension $n$, provided the distributions of $w$ and $x$ do not change. The relative effect of the perturbation therefore does not increase with dimensionality, so the linearity argument alone cannot explain adversarial examples.
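An illustrative counterpart to the earlier sketch (again an assumed setup): for a typical class member $x$ that correlates with $w$, the clean activation grows linearly too, so the ratio of adversarial to clean activation stays flat as $n$ increases.

```python
import numpy as np

rng = np.random.default_rng(0)
eps, s = 0.1, 1.0                                  # perturbation size, signal strength
for n in [10, 100, 1000, 10000]:
    w = rng.normal(size=n)
    x = s * np.sign(w) + 0.1 * rng.normal(size=n)  # input with positive margin
    eta = eps * np.sign(w)
    print(n, (w @ eta) / (w @ x))                  # ratio ~ eps / s, independent of n
```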
THE BOUNDARY TILTING PERSPECTIVE
[Figure (a): a dense data distribution in image space with “low probability pockets”]
[Figure (b): the submanifold of sampled data in image space; the decision boundary lies “outside the box”]
Recall the manifold learning hypothesis: the training data lies on a sub-manifold of finite topological dimension, much lower than that of the ambient image space.
[Figures (c), (d): boundary tilting diagrams showing image classes I and J with members i and j, decision boundaries B and C, and adversarial distances m(i, B), m(I, B), m(i, C), m(I, C)]
TAXONOMY
[Figure: taxonomy of classifiers by the tilting angle of the boundary L, from poorly performing classifiers (angle near π/2) to optimal classifiers: Type 0 (no tilting of L, v_z = 0), Type 1 and Type 2 (slight tilting of L, 0 < v_z ≪ 1, e.g. low-regularization minima)]
ATTACKING BINARIZED NEURAL NETWORKS
[Figure: block diagram of a binarized network, with full-precision Conv2D and binary Conv2D blocks connected through ReLU, Batch Norm, scalar scaling, and tf.sign() binarization stages]
Empirical observation: BNNs with low-precision weights and activations are at least as robust as their full-precision counterparts.
ATTACKING BINARIZED NEURAL NETWORKS (2)
Two candidate explanations:
1. A regularizing effect due to the decoupling between the continuous parameters and the quantized parameters used in the forward pass, together with the biased gradient estimator (STE? see the sketch below).
2. A better trade-off on the information bottleneck (IB) curve in the over-parameterized regime, achieved by discarding irrelevant information.
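A minimal sketch of the binarization in point 1 with a straight-through estimator, assuming the usual BNN recipe (illustrative, not the exact training code from the talk):

```python
import tensorflow as tf

@tf.custom_gradient
def binarize(x):
    """Sign binarization with a straight-through estimator (STE).

    Forward: quantize to {-1, +1} with tf.sign(). Backward: pass the
    gradient through where |x| <= 1, since d sign(x)/dx is zero almost
    everywhere -- a deliberately biased gradient estimator.
    """
    def grad(dy):
        return dy * tf.cast(tf.abs(x) <= 1.0, x.dtype)
    return tf.sign(x), grad
```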
WHY ONLY CONSIDER SMALL PERTURBATIONS?
Fault-tolerant engineering design: we want performance degradation to be proportional to perturbation magnitude, regardless of the attacker’s strategy.
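One way to operationalize this (a sketch; the attack(model, x, y, eps) interface is an assumption, e.g. the FGSM sketch earlier):

```python
import numpy as np

def robustness_curve(model, x, y, attack, eps_grid):
    """Accuracy as a function of perturbation magnitude.

    A graceful, roughly proportional decline is the fault-tolerant
    behaviour we want; a sudden cliff is not.
    """
    accs = []
    for eps in eps_grid:
        x_adv = attack(model, x, y, eps)
        preds = np.argmax(model(x_adv), axis=-1)
        accs.append(float(np.mean(preds == np.ravel(y))))
    return accs
```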
[Plot: accuracy (%) vs. shift magnitude (0.1–0.9), comparing Natural training and Train w/PGD]
HUMAN-DRIVEN ATTACKS
A PRACTICAL BLACK-BOX ATTACK
TRADE-OFFS
[Plot (i): accuracy (%) vs. number of pixels changed (10–50)]
[Plot (j): accuracy (%) vs. FGSM attack epsilon (0.0–0.5), comparing Expert-L2, Natural, and FGSM-trained models]
INTERPRETABILITY OF LOGISTIC REGRESSION
CANDIDATE EXAMPLES
CIFAR-10 ARCHITECTURE
Table: Simple fully-convolutional architecture adapted from the CleverHans library. Model uses ReLU activations, and does not use batch normalization or pooling.
Layer   h  w  c_in  c_out  s   params
Conv1   8  8     3     32  2     6.1k
Conv2   6  6    32     64  2    73.7k
Conv3   5  5    64     64  1   102.4k
Fc1     1  1   256     10  1     2.6k
Total   –  –     –      –  –   184.8k

The model has 0.4% as many parameters as WideResNet.
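A hedged Keras reconstruction of the table (the padding choices are assumptions, picked so the shapes and parameter counts match the table):

```python
import tensorflow as tf

def build_cnn(num_classes=10):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 8, strides=2, padding="same",   # Conv1: ~6.1k params
                               activation="relu", input_shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(64, 6, strides=2, padding="valid",  # Conv2: ~73.7k params
                               activation="relu"),
        tf.keras.layers.Conv2D(64, 5, strides=1, padding="valid",  # Conv3: ~102.4k params
                               activation="relu"),
        tf.keras.layers.Flatten(),                                 # 2 x 2 x 64 = 256
        tf.keras.layers.Dense(num_classes),                        # Fc1: ~2.6k params, logits
    ])
```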
L∞ ADVERSARIAL EXAMPLES
[Plot: accuracy (%) vs. epsilon (25–100) for WRN FGSM, WRN PGD, CNN-L2 FGSM, CNN-L2 PGD, and WRN-Nat PGD]
ROBUSTNESS
[Plot: accuracy (%) vs. fraction of pixels swapped (0.2–0.8) for WRN, CNN, CNN-L2, and WRN-Nat]
NOISY EXAMPLES
WITH L2 WEIGHT DECAY
The “independent components” of natural scenes are edge filters (Bell & Sejnowski 1997).
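A sketch of how one might inspect this (assumes the `model` from the architecture sketch above, trained with L2 weight decay):

```python
import matplotlib.pyplot as plt

w = model.layers[0].get_weights()[0]                # first-layer kernels, shape (8, 8, 3, 32)
fig, axes = plt.subplots(4, 8, figsize=(8, 4))
for i, ax in enumerate(axes.flat):
    f = w[..., i]
    f = (f - f.min()) / (f.max() - f.min() + 1e-8)  # rescale each filter to [0, 1]
    ax.imshow(f)
    ax.axis("off")
plt.show()                                          # decayed filters tend to look edge-like
```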
WITHOUT WEIGHT DECAY
FOOLING IMAGES
“4 years ago I didn’t think small-perturbation adversarial examples were going to be so hard to solve. I thought after another n months of working on those, I’d be basically done with them and would move on to fooling attacks.”

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images (CVPR 2015)
FOOLING IMAGES (CIFAR-10)
FOOLING IMAGES (SVHN)
FOOLING IMAGES (SVHN)
The robust training procedure does not fit random labels, suggesting lower Rademacher complexity.
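A sketch of the randomization test behind this claim, in the style of Zhang et al. (2017) (CIFAR-10 is used here only for convenience, since the slide’s experiment is on SVHN; build_cnn is the earlier architecture sketch):

```python
import numpy as np
import tensorflow as tf

(x, y), _ = tf.keras.datasets.cifar10.load_data()
y_rand = np.random.permutation(y.ravel())           # destroy the image-label pairing
model = build_cnn()
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
# Standard training typically memorizes the random labels; a robust
# training procedure that stays near 10% training accuracy suggests
# lower effective (Rademacher) complexity.
model.fit(x.astype("float32") / 255.0, y_rand, epochs=20, batch_size=128)
```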
FOOLING IMAGES
[Plot: attack success rate (ASR) and margin (M) vs. epsilon (50–250) for WRN and CNN-L2]
DIVIDE AND CONQUER?
Image from Dube (2018).
REMARKS
◮ Test accuracy on popular ML benchmarks is a weak measure of generalization.
◮ The plethora of band-aid fixes to standard DNNs does not yield compelling results (e.g. the provably robust framework).
◮ Incorporate expert knowledge, e.g. by explicitly modeling part-whole relationships and other priors that relate to known causal features, such as edges in natural scenes.
◮ Good generalization implies some level of privacy, and more “fair” models, assuming the original intent is fair.