SLIDE 1

Toward Adversarial Robustness by Diversity in an Ensemble of Specialized Deep Neural Networks

presented at Canadian AI 2020

Mahdieh Abbasi 1, Arezoo Rajabi 2, Christian Gagné 1,3, and Rakesh B. Bobba 2

  • 1. IID, Université Laval, Québec, Canada
  • 2. Oregon State University, Corvallis, USA
  • 3. Mila, Canada CIFAR AI Chair

SLIDE 2

[Figure: a legitimate sample plus an imperceptible perturbation r yields an adversarial instance]

  • Adversarial attacks aim at fooling a model through imperceptible modifications
  • Ensemble of specialists trained on diverse subsets of classes
  • Reject samples with low ensemble classification certainty

SLIDE 3

Adversarial example

Adding the right imperceptible perturbation ε to a clean sample x creates an adversarial example x′ = x + ε, such that the NN is fooled into confidently misclassifying it!

P(C = panda | x) > 0.9   →   P(C = gibbon | x′) > 0.9

SLIDE 4

Examples of attacks

Fast Gradient Sign (FGS)
  • Algorithm: gradient ascent – maximize the loss of the neural network (NN)
  • Properties: fast, one step; non-optimal adversaries

Targeted FGS
  • Algorithm: gradient descent – minimize the loss of the NN toward a target class
  • Properties: fast, one step; non-optimal adversaries

DeepFool
  • Algorithm: project the sample onto the nearest decision boundary
  • Properties: moderate (iterative); better adversaries

Carlini & Wagner (CW)
  • Algorithm: directly optimize an objective function
  • Properties: slow (iterative); optimal adversaries
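A minimal sketch of the one-step FGS update in PyTorch; `model`, `x`, `y`, and the budget `eps` are assumed placeholders, not anything specified on the slides:

```python
import torch
import torch.nn.functional as F

def fgs_attack(model, x, y, eps=0.1):
    """Untargeted Fast Gradient Sign: one gradient-ascent step on the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # loss w.r.t. the true labels y
    loss.backward()
    # Move every pixel one eps-sized step in the direction that increases the loss
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()     # stay in the valid pixel range
```

The targeted variant would instead take a gradient-descent step toward a chosen target class, i.e. subtract `eps * x_adv.grad.sign()` with the target label in place of `y`.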

SLIDE 5

Attack models

Black-box attack
  • Attacker does not know anything about the victim classifier
  • Another NN is used as a proxy for the victim classifier

Gray-box attack
  • Attacker knows the defense mechanism but not the victim's specific model parameters

White-box attack
  • Attacker knows the defense mechanism and all the victim's model parameters

SLIDE 6

The goal

Without any specific adversarial training, detect adversarial instances by calibrating the model's predictive confidence

Reduce predictive confidence over adversarial examples while keeping it high for clean samples

How: leveraging diversity in an ensemble of specialists
SLIDE 7

Ensemble of specialists

[Figure: a generalist NN over 3 classes and two specialist NNs over 2 classes each]

Build an ensemble of specialists, each trained on a different set of classes
  • Train several specialist neural networks, each on a subset of the classes
  • A generalist network is also trained on all the classes
  • Example: a 3-class classification problem (sketched below)
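A self-contained sketch of that 3-class example in PyTorch; the toy data, architecture, and training loop are illustrative assumptions, not the authors' setup:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(600, 10)               # toy features
y = torch.randint(0, 3, (600,))        # toy labels in {0, 1, 2}

# One generalist over all classes plus two 2-class specialists
subsets = [(0, 1, 2), (0, 1), (1, 2)]
ensemble = []
for classes in subsets:
    mask = torch.isin(y, torch.tensor(classes))
    Xs, ys = X[mask], y[mask]
    # Re-index the labels to 0..len(classes)-1 for this member's softmax
    remap = {c: i for i, c in enumerate(classes)}
    ys = torch.tensor([remap[int(c)] for c in ys])
    net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, len(classes)))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):               # brief full-batch training
        opt.zero_grad()
        F.cross_entropy(net(Xs), ys).backward()
        opt.step()
    ensemble.append((net, classes))    # keep each member's class subset alongside it
```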

SLIDE 8

Schematic explanation (1/2)

A black-box attack that fools the generalist classifier (left) can be classified as different classes by the specialists, creating diversity (entropy) in their predictions → a low-confidence prediction

SLIDE 9

Schematic explanation (2/2)

Generating high-confidence white-box adversaries becomes harder, since the different class subsets push the specialists to be fooled toward distinct classes

SLIDE 10

Our approach

1. Create the ensemble of specialists
2. Voting mechanism: merge the members' predictions to compute the ensemble decision
3. Reject a sample when its predictive confidence < τ

SLIDE 11

Ensemble creation

[Figure: fooling matrix, true classes × fooling classes]

1. Build a fooling matrix from FGS adversaries
2. For each class, define two class subsets: its most likely fooling classes and the remaining classes

Resulting ensemble: 20 specialists + 1 generalist
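A sketch of these two steps in NumPy; `model_predict`, `adv_by_class`, and the cut-off `thresh` are assumed placeholders (the slides do not specify how "most likely" is thresholded):

```python
import numpy as np

def fooling_matrix(model_predict, adv_by_class, num_classes):
    """C[i, j] = fraction of FGS adversaries crafted from class i
    that the model classifies as class j."""
    C = np.zeros((num_classes, num_classes))
    for i, advs in adv_by_class.items():   # advs: FGS adversaries built from class i
        preds = model_predict(advs)        # assumed to return predicted class indices
        for j in range(num_classes):
            C[i, j] = np.mean(preds == j)
    return C

def class_subsets(C, thresh=0.1):
    """For each class i, two subsets: class i with its most likely
    fooling classes, and the remaining classes (which also contain i)."""
    K = C.shape[0]
    subsets = []
    for i in range(K):
        likely = {j for j in range(K) if j != i and C[i, j] >= thresh}
        subsets.append(likely | {i})            # subset 1: i + likely fooling classes
        subsets.append(set(range(K)) - likely)  # subset 2: the remaining classes
    return subsets  # 2K subsets -> 2K specialists, plus the generalist
```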

SLIDE 12

Voting mechanism

Principle: a sample should be classified by all the models relevant to its given class

Agreement: all models relevant to a given class vote for it; only their prediction confidences are averaged as the output

Disagreement: no class receives the votes of all its relevant models; the prediction confidences of all members are averaged as the output
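A NumPy sketch of this voting rule together with the rejection step from the approach slide; it assumes each member's softmax output has already been expanded to the full class set (zeros outside its subset), which is an implementation choice, not something the slides prescribe:

```python
import numpy as np

def ensemble_confidence(probs, subsets, num_classes):
    """probs: list of M per-member probability vectors over all classes,
    subsets: subsets[m] = set of classes member m was trained on."""
    M = len(probs)
    votes = [int(np.argmax(p)) for p in probs]
    for k in range(num_classes):
        relevant = [m for m in range(M) if k in subsets[m]]
        # Agreement: every model relevant to class k votes for k
        if relevant and all(votes[m] == k for m in relevant):
            return np.mean([probs[m] for m in relevant], axis=0)
    # Disagreement: average all M members (confidence provably bounded)
    return np.mean(probs, axis=0)

def classify_or_reject(h_bar, tau=0.5):
    """Reject when the ensemble's top confidence falls below tau."""
    k = int(np.argmax(h_bar))
    return (k, float(h_bar[k])) if h_bar[k] >= tau else ("reject", float(h_bar[k]))
```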

SLIDE 13

Confidence upper bound

In the presence of disagreement, the ensemble predictive confidence is upper bounded (M is the ensemble size):

h̄(x) ≤ 0.5 + 1/(2M)

Based on this corollary, we set a fixed global threshold (τ = 0.5) for rejecting adversaries
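As a worked instance (assuming the 21-member ensemble above, i.e. 20 specialists plus the generalist), the bound evaluates to just above one half:

```latex
\bar{h}(x) \;\le\; \frac{1}{2} + \frac{1}{2M}
\;\xrightarrow{\,M = 21\,}\;
\bar{h}(x) \;\le\; \frac{1}{2} + \frac{1}{42} \approx 0.524
```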

SLIDE 14

Evaluation metrics

Risk rate for clean samples (E_D): rate of clean samples that are either correctly classified but rejected (confidence below τ) or misclassified but not rejected (confidence above τ)

Risk rate for adversaries (E_A): percentage of misclassified adversaries that are not rejected (confidence above τ)
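A NumPy sketch of both risk rates; the array names and shapes are assumed for illustration:

```python
import numpy as np

def risk_rates(conf, pred, true, is_adv, tau=0.5):
    """conf: ensemble confidence per sample, pred/true: class labels,
    is_adv: boolean mask marking adversarial samples."""
    rejected = conf < tau
    correct = pred == true
    # E_D over clean samples: correct-but-rejected or wrong-but-accepted
    e_d = np.mean(((correct & rejected) | (~correct & ~rejected))[~is_adv])
    # E_A over adversaries: misclassified and not rejected
    e_a = np.mean((~correct & ~rejected)[is_adv])
    return e_d, e_a
```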

SLIDE 15

Experiments

MNIST black-box attacks

[Figure: results on clean samples and FGS adversaries]

SLIDE 16

Experiments

CIFAR-10 black-box attacks

[Figure: results on clean samples and FGS adversaries]

SLIDE 17

Experiments

White-box attacks

  • Generating adversaries specifically targeted at our ensemble model, a pure ensemble, or a vanilla CNN
  • A lower success rate is better: it shows that generating adversaries is harder

SLIDE 18

Experiments

Gray-box attack

  • Generate 100 CW adversarial examples using another ensemble of specialists
  • 74% of them have confidence lower than 0.5 (rejected)
  • The remaining 26% have confidence higher than 0.5 (not rejected)

Some non-rejected adversaries are hard to recognize even for a human observer

SLIDE 19

Our contributions

A method for building an ensemble of diverse specialists, along with a simple and computationally efficient voting mechanism, to calibrate the predictive confidence for distinguishing clean and adversarial examples

Detecting adversaries using a provable fixed global threshold on the predictive confidence
SLIDE 20

!"#$%&'()*'your #++,$+-)$'.

Mahdieh Abbasi, PhD student mahdieh.abbasi.1@ulaval.ca Arezoo Rajabi, PhD student rajabia@oregonstate.edu Rakesh B. Booba, professor rakesh.bobba@oregonstate.edu !""#$%&&''($)*+',*-$"."')'/0&#'*#1'&2*22.3+.4'$! Christian Gagné, professor christian.gagne@gel.ulaval.ca https://vision.gel.ulaval.ca/~cgagne https://iid.ulaval.ca