CS109B Data Science 2
Pavlos Protopapas and Mark Glickman
Lecture 21: Adversarial Networks
CS109B, PROTOPAPAS, GLICKMAN
How vulnerable are Neural Networks?
Uses of Neural Networks
Explaining Adversarial Examples
[Goodfellow et al. ’15] 1. Robust attacks with FGSM
Some of these adversarial examples can even fool humans:
Attacking with Fast Gradient Sign Method (FGSM)
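FGSM fits in a few lines of code. Below is a minimal sketch on a toy logistic model (an illustrative stand-in for the deep networks in the lecture; the weights and epsilon are made-up numbers): the attack adds eps times the sign of the gradient of the loss with respect to the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """FGSM for a logistic model p = sigmoid(w.x + b): x_adv = x + eps * sign(dL/dx)."""
    p = sigmoid(w @ x + b)
    # For cross-entropy loss, the gradient w.r.t. the input x is (p - y) * w.
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Toy example: a confidently-classified point becomes misclassified.
w = np.array([2.0, -3.0])
b = 0.5
x = np.array([1.0, -1.0])            # w.x + b = 5.5, so class 1
x_adv = fgsm(x, y=1.0, w=w, b=b, eps=1.2)
print(sigmoid(w @ x + b) > 0.5)      # True  (original prediction: class 1)
print(sigmoid(w @ x_adv + b) > 0.5)  # False (adversarial prediction flips)
```

Note that the perturbation moves every input dimension by the same amount eps; only its sign depends on the gradient, which is what makes the attack fast.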
Defending with Adversarial Training
1. Generate adversarial examples
“Panda”
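Adversarial training can be sketched as a loop that regenerates FGSM examples for the current weights at every step and then fits on a mix of clean and perturbed data. A toy logistic model again stands in for the lecture's networks, and all hyperparameters here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train(X, y, eps=0.3, lr=0.5, steps=200):
    """Gradient descent on clean + FGSM-perturbed copies of each batch."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        # Step 1: generate adversarial examples for the *current* weights.
        X_adv = X + eps * np.sign((p - y)[:, None] * w)
        # Step 2: train on the union of clean and adversarial examples.
        X_mix = np.vstack([X, X_adv])
        y_mix = np.concatenate([y, y])
        g = sigmoid(X_mix @ w + b) - y_mix
        w -= lr * X_mix.T @ g / len(y_mix)
        b -= lr * g.mean()
    return w, b

# Toy linearly separable data (illustrative).
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b = adv_train(X, y)
print((sigmoid(X @ w + b) > 0.5).astype(float))  # matches y on clean data
```

The key point is that the adversarial examples must be regenerated inside the loop: examples crafted against an earlier version of the weights stop being adversarial as training progresses.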
Attack methods after Goodfellow 2015
White box attacks
“Black Box” Attacks [Papernot et al. ’17]
“Black Box” Attacks
Examine inputs and outputs of the model
“Black Box” Attacks
Train a model that performs the same as the black box
(Figure: black-box model outputs: “Panda”, “Gibbon”, “Ostrich”)
Now attack the substitute model you just trained with a white-box attack
Then feed those adversarial examples to the “black” box model
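The whole pipeline (query the black box, train a substitute, attack the substitute white-box, transfer the examples) can be sketched end to end. The “black box” here is a hypothetical linear classifier the attacker never inspects directly, and the substitute is a logistic regression; both are toy stand-ins for the models in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    """The victim model: the attacker can query it but never sees its weights."""
    return (X @ np.array([1.5, -2.0]) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: label a synthetic query set using only the black box's outputs.
X = rng.normal(size=(500, 2))
y = black_box(X)

# Step 2: train a logistic-regression substitute on the (input, label) pairs.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= 0.5 * X.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()

# Step 3: run white-box FGSM against the substitute's gradients.
x = np.array([1.0, 0.0])                  # black box says class 1
p = sigmoid(w @ x + b)
x_adv = x + 0.8 * np.sign((p - 1.0) * w)

# Step 4: the adversarial example transfers to the black box.
print(black_box(x[None])[0], black_box(x_adv[None])[0])  # prints: 1.0 0.0
```

The attack works because the substitute learns a decision boundary close to the black box's, so gradients of the substitute point in a transferable adversarial direction.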
CleverHans
A Python library to benchmark machine learning systems’ vulnerability to adversarial examples.
https://github.com/tensorflow/cleverhans
http://www.cleverhans.io/
More Defenses
Mixup: train on convex combinations of pairs of examples, x̃ = λxᵢ + (1−λ)xⱼ and ỹ = λyᵢ + (1−λ)yⱼ, which smooths the decision boundaries between classes.
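Mixup can be sketched as a simple batch transform. The alpha value and helper name below are illustrative assumptions; the mixing coefficient λ is drawn from a Beta(α, α) distribution, as in the original mixup formulation.

```python
import numpy as np

def mixup_batch(X, y, alpha=0.2, rng=None):
    """Return convex combinations of a batch with a shuffled copy of itself."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    idx = rng.permutation(len(X))         # random pairing of examples
    X_mix = lam * X + (1 - lam) * X[idx]  # mix inputs...
    y_mix = lam * y + (1 - lam) * y[idx]  # ...and labels identically
    return X_mix, y_mix

# Toy batch: two points with one-hot-style scalar labels.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
X_mix, y_mix = mixup_batch(X, y, rng=np.random.default_rng(1))
```

Because the labels are mixed with the same λ as the inputs, the model is trained to interpolate linearly between classes instead of switching abruptly, which is where the smoother decision boundary comes from.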
Physical attacks