SLIDE 1

Feature Denoising for Improving Adversarial Robustness

Cihang Xie, Johns Hopkins University

SLIDE 2
  • Background
  • Towards Robust Adversarial Defense
SLIDE 3

Deep Networks

Label: King Penguin

Deep networks are GOOD

SLIDE 4

Deep Networks

Label: King Penguin → Label: Chihuahua

Deep networks are FRAGILE to small & carefully crafted perturbations

SLIDE 5

Deep networks are FRAGILE to small & carefully crafted perturbations

We call such images

Adversarial Examples

SLIDE 6

Adversarial Examples can exist across Different Tasks

semantic segmentation · pose estimation · text classification

[1] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. "Adversarial Examples for Semantic Segmentation and Object Detection." In ICCV, 2017.
[2] Moustapha Cisse, Yossi Adi, Natalia Neverova, and Joseph Keshet. "Houdini: Fooling Deep Structured Prediction Models." In NeurIPS, 2018.
[3] Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing Dou. "HotFlip: White-Box Adversarial Examples for Text Classification." In ACL, 2018.

SLIDE 7

Adversarial Examples can be created by means other than Adding Perturbations

[4] Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, and Dawn Song. "Spatially Transformed Adversarial Examples." In ICLR, 2018.
[5] Jianyu Wang, Zhishuai Zhang, Cihang Xie, et al. "Visual Concepts and Compositional Voting." In Annals of Mathematical Sciences and Applications, 2018.

(Figure: object-detection outputs on example images, e.g. bird: 0.663, bird: 0.629, bird: 0.628, person: 0.736, tvmonitor: 0.998, bird: 0.342, person: 0.817)

SLIDE 8

Adversarial Examples can exist in The Physical World

[6] Lifeng Huang, Chengying Gao, Yuyin Zhou, Changqing Zou, Cihang Xie, Alan Yuille, and Ning Liu. "UPC: Learning Universal Physical Camouflage Attacks on Object Detectors." arXiv, 2019.

SLIDE 9

Generating an Adversarial Example is SIMPLE:

maximize_r loss(f(x + r), y_true; θ)

Maximize the loss function w.r.t. the Adversarial Perturbation r

SLIDE 10

Generating an Adversarial Example is SIMPLE:

maximize_r loss(f(x + r), y_true; θ)    vs.    minimize_θ loss(f(x), y_true; θ)

Maximize the loss function w.r.t. the Adversarial Perturbation r; ordinary training instead Minimizes the loss function w.r.t. the Network Parameters θ
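The later slides evaluate against multi-step PGD attacks, so a minimal PyTorch-style sketch of this maximization follows; `model`, `eps`, `alpha`, and `num_steps` are illustrative placeholders, not the presenter's exact settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y_true, eps=8/255, alpha=2/255, num_steps=10):
    """Maximize loss(f(x + r), y_true) over the perturbation r, keeping ||r||_inf <= eps."""
    r = torch.empty_like(x).uniform_(-eps, eps)             # random start inside the eps-ball
    for _ in range(num_steps):
        r.requires_grad_(True)
        loss = F.cross_entropy(model(x + r), y_true)
        grad, = torch.autograd.grad(loss, r)
        with torch.no_grad():
            r = (r + alpha * grad.sign()).clamp(-eps, eps)  # ascend the loss, project to the ball
            r = (x + r).clamp(0, 1) - x                     # keep x + r a valid image
    return (x + r).detach()
```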

SLIDE 11
  • Background
  • Towards Robust Adversarial Defense

Deep Networks

Label: King Penguin

SLIDE 12

(Figure: a clean image, its adversarial counterpart, and their feature maps)

Observation: Adversarial perturbations are SMALL in the pixel space

SLIDE 13

(Figure: a clean image, its adversarial counterpart, and their feature maps)

Observation: Adversarial perturbations are BIG in the feature space

SLIDE 14

(Figure: a clean image, its adversarial counterpart, and their feature maps)

Observation: Adversarial perturbations are BIG in the feature space

We should DENOISE these feature maps
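To make the observation concrete, here is a small sketch that compares the relative perturbation size in pixel space and in feature space; `feature_extractor` is a hypothetical truncated network returning an intermediate feature map.

```python
import torch

def relative_perturbation(feature_extractor, x_clean, x_adv):
    """Relative perturbation magnitude in pixel space vs. in an intermediate feature map."""
    with torch.no_grad():
        pixel_gap = (x_adv - x_clean).abs().mean() / x_clean.abs().mean()
        f_clean, f_adv = feature_extractor(x_clean), feature_extractor(x_adv)
        feature_gap = (f_adv - f_clean).abs().mean() / f_clean.abs().mean()
    # The slides' observation: pixel_gap stays small while feature_gap is large.
    return pixel_gap.item(), feature_gap.item()
```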

SLIDE 15

Our Solution: Denoising at the feature level

Traditional Image Denoising Operations:

Local filters (predefine a local region Ω_i for each pixel i):

  • Bilateral filter: y_i = (1/C(x)) · Σ_{∀j ∈ Ω_i} f(x_i, x_j) · x_j
  • Median filter: y_i = median{ x_j : ∀j ∈ Ω_i }
  • Mean filter: y_i = (1/C(x)) · Σ_{∀j ∈ Ω_i} x_j

Non-local filters (the local region Ω_i is the whole image I):

  • Non-local means: y_i = (1/C(x)) · Σ_{∀j ∈ I} f(x_i, x_j) · x_j
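As an illustration only, here is a compact PyTorch sketch of these operations applied to a feature map x of shape N×C×H×W; the softmax-of-dot-product weighting used for the non-local mean corresponds to the "gaussian" variant mentioned on a later slide, and is an assumption about f.

```python
import torch
import torch.nn.functional as F

def mean_filter(x, k=3):
    # Local mean over a k x k window Ω_i around each position.
    return F.avg_pool2d(x, kernel_size=k, stride=1, padding=k // 2)

def median_filter(x, k=3):
    # Local median over Ω_i, computed by unfolding the k x k neighborhood.
    n, c, h, w = x.shape
    patches = F.unfold(x, kernel_size=k, padding=k // 2).view(n, c, k * k, h * w)
    return patches.median(dim=2).values.view(n, c, h, w)

def nonlocal_means(x):
    # Non-local weighted mean: y_i = (1/C(x)) Σ_j f(x_i, x_j) x_j over all positions j,
    # with f taken as a softmax over dot products ("gaussian" weighting).
    n, c, h, w = x.shape
    flat = x.view(n, c, h * w)
    weights = torch.softmax(torch.einsum("nci,ncj->nij", flat, flat), dim=-1)
    return torch.einsum("nij,ncj->nci", weights, flat).view(n, c, h, w)
```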

SLIDE 16

Denoising Block Design

(Figure: a denoising operation followed by a 1×1 conv, wrapped with a residual connection)

Denoising operations may lose information

  • We add a residual connection to balance the tradeoff between removing noise and retaining the original signal
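A minimal sketch of such a block, assuming `denoise_op` is one of the filters sketched above (for example `nonlocal_means`): the denoised signal passes through a 1×1 convolution and is added back to the input. A call like `DenoisingBlock(256, nonlocal_means)` could then be appended after a 256-channel stage of a backbone.

```python
import torch
import torch.nn as nn

class DenoisingBlock(nn.Module):
    """Denoising operation + 1x1 conv, wrapped with a residual connection."""
    def __init__(self, channels, denoise_op):
        super().__init__()
        self.denoise_op = denoise_op
        self.conv1x1 = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # The residual connection balances removing noise against retaining the original signal.
        return x + self.conv1x1(self.denoise_op(x))
```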

SLIDE 17

Training Strategy: Adversarial training

  • Core Idea: train with adversarial examples
SLIDE 18

Training Strategy: Adversarial training

  • Core Idea: train with adversarial examples

min_θ max_r loss(f(x + r), y_true; θ)

max step: generate adversarial perturbation
SLIDE 19

Training Strategy: Adversarial training

  • Core Idea: train with adversarial examples

min_θ max_r loss(f(x + r), y_true; θ)

max step: generate adversarial perturbation
min step: optimize network parameters
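A minimal sketch of one such min-max step, reusing the `pgd_attack` sketch from earlier; the optimizer and attack settings are assumptions, not the presenter's configuration.

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y_true):
    """One min-max step of adversarial training."""
    x_adv = pgd_attack(model, x, y_true)          # max step: generate adversarial perturbation r
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y_true)  # min step: loss(f(x + r), y_true; θ)
    loss.backward()                               # optimize network parameters θ
    optimizer.step()
    return loss.item()
```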
SLIDE 20

Two Ways for Evaluating Robustness

Defending Against White-box Attacks

  • Attackers know everything about models
  • Directly maximize loss(f(x+r), ytrue; θ)
SLIDE 21

Two Ways for Evaluating Robustness

Defending Against White-box Attacks

  • Attackers know everything about models
  • Directly maximize loss(f(x+r), ytrue; θ)

Defending Against Blind Attacks

  • Attackers know nothing about models
  • Attackers generate adversarial examples using substitute networks

(rely on transferability)

SLIDE 22

Defending Against White-box Attacks

  • Evaluation against adversarial attackers with up to 2000 attack iterations

(more attack iterations indicate stronger attacks)

SLIDE 23

Defending Against White-box Attacks – Part I

(Figure: accuracy (%) vs. attack iterations, from 10 to 2000, for ours, R-152 baseline and ALP, Inception-v3; under the 2000-iteration PGD attack the two curves end at 39.2% and 27.9%, respectively)

A successful adversarial training can give us a STRONG baseline

SLIDE 24

Defending Against White-box Attacks – Part I

(Figure: accuracy (%) vs. attack iterations, from 10 to 2000. Ours, R-152 denoise: 45.5% at 100 iterations, 42.6% under the 2000-iteration PGD attack; ours, R-152 baseline: 41.7% at 100 iterations, 39.2% at 2000; ALP, Inception-v3: 27.9% at 2000)

Feature Denoising can give us additional benefits

SLIDE 25

Defending Against White-box Attacks – Part II

(Figure: accuracy (%) vs. attack iterations, from 10 to 100, comparing the ResNet-152 baseline with +4 bottleneck (ResNet-164), +4 denoise: null (1×1 only), +4 denoise: 3×3 mean, +4 denoise: 3×3 median, +4 denoise: bilateral, dot prod, +4 denoise: bilateral, gaussian, +4 denoise: nonlocal, dot prod, and +4 denoise: nonlocal, gaussian. The baseline reads 52.5% / 41.7% at 10 / 100 iterations; the best denoising variant reads 55.7% / 45.5%)

All denoising operations can help

SLIDE 26

Defending Against White-box Attacks – Part III

(Figure: accuracy (%) vs. attack iterations, from 10 to 100. ResNet-152: 52.5% / 41.7%; ResNet-152, denoise: 55.7% / 45.5%; ResNet-638: 57.3% / 46.1% at 10 / 100 iterations)

Feature Denoising is nearly as powerful as adding ~500 additional layers

SLIDE 27

Defending Against White-box Attacks – Part III

(Figure: accuracy (%) vs. attack iterations, from 10 to 100. ResNet-152: 52.5% / 41.7%; ResNet-152, denoise: 55.7% / 45.5%; ResNet-638: 57.3% / 46.1%; ResNet-638, denoise: 61.3% / 49.9% at 10 / 100 iterations)

Feature Denoising can still provide benefits for the VERY deep ResNet-638

SLIDE 28

Defending Against Blind Attacks

  • Offline evaluation against 5 BEST attackers from NeurIPS Adversarial Competition 2017
  • Online competition against 48 UNKNOWN attackers in CAAD 2018
SLIDE 29

Defending Against Blind Attacks

  • Offline evaluation against 5 BEST attackers from NeurIPS Adversarial Competition 2017
  • Online competition against 48 UNKNOWN attackers in CAAD 2018

CAAD 2018 “all or nothing” criterion: an image is considered correctly classified only if the model correctly classifies all adversarial versions of this image created by all attackers
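As a sketch of how this criterion could be scored; `attacker_fns`, a list of adversarial-example generators, is a hypothetical stand-in for the competition attackers.

```python
import torch

def all_or_nothing_accuracy(model, images, labels, attacker_fns):
    """An image counts as correct only if the model classifies every attacker's version of it correctly."""
    correct = torch.ones(len(labels), dtype=torch.bool)
    for attack in attacker_fns:
        adv = attack(images, labels)            # each attacker crafts its own adversarial set
        with torch.no_grad():
            preds = model(adv).argmax(dim=1)
        correct &= (preds == labels).cpu()
    return correct.float().mean().item()
```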

SLIDE 30

Defending Against Blind Attacks --- CAAD 2017 Offline Evaluation

SLIDE 31

Defending Against Blind Attacks --- CAAD 2017 Offline Evaluation

SLIDE 32

Defending Against Blind Attacks --- CAAD 2017 Offline Evaluation

SLIDE 33

Defending Against Blind Attacks --- CAAD 2018 Online Competition

(Figure: CAAD 2018 final standings, accuracy (%): 1st place 50.6, 2nd 40.8, 3rd 8.6, 4th 3.6, 5th 0.6)

SLIDE 34

Visualization

(Figure: feature maps of adversarial examples before and after the denoising operations)

SLIDE 35

There is still a long way to go in defending against adversarial attacks…

SLIDE 36

Questions?