
SLIDE 1

Lecture 21: Adversarial Networks

CS109B Data Science 2

Pavlos Protopapas and Mark Glickman

SLIDE 2

How vulnerable are Neural Networks?

Uses of Neural Networks

SLIDE 3

How vulnerable are Neural Networks?

SLIDE 4

Explaining Adversarial Examples

[Goodfellow et al. ‘15]

  1. Robust attacks with FGSM
  2. Robust defense with adversarial training
SLIDE 5

Explaining Adversarial Examples

SLIDE 6

Some of these adversarial examples can even fool humans:

SLIDE 7

Attacking with Fast Gradient Sign Method (FGSM)

[Figure: network taking input x, with weights W and loss L]

x + λ · sign(∇ₓL) → x∗
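
The slide's update rule, as a minimal TensorFlow 2 sketch; the function name fgsm_attack, the step size lam, and the [0, 1] pixel range are illustrative assumptions, not from the slides:

    import tensorflow as tf

    def fgsm_attack(model, x, y, lam=0.01):
        # One FGSM step: x* = x + lam * sign(∇ₓL).
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
        x = tf.convert_to_tensor(x)
        with tf.GradientTape() as tape:
            tape.watch(x)                        # gradient w.r.t. the input, not the weights
            loss = loss_fn(y, model(x))
        grad = tape.gradient(loss, x)            # ∇ₓL
        x_adv = x + lam * tf.sign(grad)          # step of size lam along the gradient's sign
        return tf.clip_by_value(x_adv, 0.0, 1.0) # assumes pixels live in [0, 1]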


SLIDE 12

Defending with Adversarial Training

  1. Generate adversarial examples
  2. Adjust labels
  3. Add them to the training set
  4. Train a new network

[Figure: the adversarial image is added to the training set with its correct label, “Panda”]
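
A sketch of one adversarial-training step built on the fgsm_attack function sketched earlier; the batching scheme and all names are illustrative:

    import tensorflow as tf

    def adversarial_training_step(model, optimizer, x, y, lam=0.01):
        # 1-2. Generate adversarial examples; they keep the original, correct labels.
        x_adv = fgsm_attack(model, x, y, lam)
        # 3. Add them to the training batch.
        x_all = tf.concat([x, x_adv], axis=0)
        y_all = tf.concat([y, y], axis=0)
        # 4. Train the network on the clean + adversarial mix.
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
        with tf.GradientTape() as tape:
            loss = loss_fn(y_all, model(x_all))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss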

SLIDE 13

Attack methods post Goodfellow 2015

  • FGSM [Goodfellow et al. ‘15]
  • JSMA [Papernot et al. ‘16]
  • C&W [Carlini + Wagner ‘16]
  • Step-LL [Kurakin et al. ‘17]
  • I-FGSM [Tramer et al. ‘18]
SLIDE 14

White box attacks

[Figure: the attacker knows the model’s weights W and its loss L]

x + λ · ∇ₓL → x∗ (fast gradient method)

x + λ · sign(∇ₓL) → x∗ (fast gradient sign method)

SLIDE 15

“Black Box” Attacks [Papernot et al. ‘17]

SLIDE 16

“Black Box” Attacks

Examine inputs and outputs of the model

SLIDE 19

“Black Box” Attacks

[Figure: query images that the black box labels “Panda”, “Gibbon”, and “Ostrich”]

SLIDE 21

“Black Box” Attacks

Train a substitute model that behaves the same as the black box, using the labels the black box assigns.

[Figure: substitute model fit to inputs the black box labeled “Panda”, “Gibbon”, “Ostrich”]
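
A sketch of that substitute training; the two-layer architecture, the 10-class output, and all names are illustrative assumptions:

    import tensorflow as tf

    def train_substitute(black_box, x_queries, epochs=5):
        # Query the target model and keep only its predicted labels.
        y_queries = tf.argmax(black_box(x_queries), axis=-1)
        # Fit a local model to mimic those input/output pairs.
        substitute = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(10),   # assumes a 10-class task, e.g. MNIST-like
        ])
        substitute.compile(
            optimizer="adam",
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
        substitute.fit(x_queries, y_queries, epochs=epochs)
        return substitute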

SLIDE 22

“Black Box” Attacks

Now attack the model you just trained with a white-box attack: the substitute’s weights W and loss L are fully known.

x + λ · ∇ₓL → x∗

x + λ · sign(∇ₓL) → x∗

SLIDE 23

“Black Box” Attacks

Use those adversarial examples against the black box: attacks crafted on the substitute often transfer (a sketch of the full loop follows).
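
A hypothetical end-to-end check, reusing fgsm_attack and train_substitute from the earlier sketches; black_box (the target model) and x_queries (our probe inputs) are assumed to exist:

    import numpy as np
    import tensorflow as tf

    y_bb = tf.argmax(black_box(x_queries), axis=-1)       # the black box's own labels
    substitute = train_substitute(black_box, x_queries)   # local copy of the target
    x_adv = fgsm_attack(substitute, x_queries, y_bb, lam=0.05)  # white-box attack on the copy
    # Measure how often the adversarial examples fool the original black box.
    fooled = np.mean(tf.argmax(black_box(x_adv), axis=-1).numpy() != y_bb.numpy())
    print(f"black-box fooling rate: {fooled:.2%}")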

SLIDE 24

CleverHans

A Python library to benchmark machine learning systems’ vulnerability to adversarial examples.

https://github.com/tensorflow/cleverhans
http://www.cleverhans.io/
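
A sketch of crafting an FGSM example with the library, assuming the CleverHans 4.x TF2 API; check the repository for the exact module path and signature:

    import numpy as np
    import tensorflow as tf
    from cleverhans.tf2.attacks.fast_gradient_method import fast_gradient_method

    # Toy model and input; any callable mapping a batch of images to logits works.
    model = tf.keras.Sequential([tf.keras.layers.Flatten(),
                                 tf.keras.layers.Dense(10)])
    x = tf.random.uniform((1, 28, 28, 1))

    # Perturbation budget eps under the L-infinity norm.
    x_adv = fast_gradient_method(model, x, eps=0.05, norm=np.inf)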

SLIDE 25

More Defenses

Mixup:

  • Mix two training examples: x̃ = λxᵢ + (1 − λ)xⱼ, ỹ = λyᵢ + (1 − λ)yⱼ
  • Augment the training set with the mixed examples (see the sketch below)

Smooth decision boundaries:

  • Regularize the derivatives with respect to x
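
A minimal numpy sketch of mixup; drawing λ from a Beta distribution is the usual choice, and the function name and alpha value are illustrative:

    import numpy as np

    def mixup(x1, y1, x2, y2, alpha=0.2):
        # Mixing weight λ ~ Beta(alpha, alpha).
        lam = np.random.beta(alpha, alpha)
        x_mix = lam * x1 + (1 - lam) * x2   # x̃ = λxᵢ + (1 − λ)xⱼ
        y_mix = lam * y1 + (1 - lam) * y2   # ỹ = λyᵢ + (1 − λ)yⱼ (labels must be one-hot/soft)
        return x_mix, y_mix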

SLIDE 26

Physical attacks

  • Object Detection
  • Adversarial Stickers
SLIDE 27

Thank you.