
Feature Denoising for Improving Adversarial Robustness - PowerPoint PPT Presentation



  1. Feature Denoising for Improving Adversarial Robustness. Cihang Xie, Johns Hopkins University

  2. ● Background ● Towards Robust Adversarial Defense

  3. Deep networks are GOOD. [figure: a deep network correctly labels the image as King Penguin]

  4. Deep networks are FRAGILE to small & carefully crafted perturbations. [figure: the clean image is labeled King Penguin; the perturbed image is labeled Chihuahua]

  5. Deep networks are FRAGILE to small & carefully crafted perturbations. We call such images Adversarial Examples.

  6. Adversarial Examples exist across DIFFERENT tasks: semantic segmentation, pose estimation, text classification. [1] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. "Adversarial examples for semantic segmentation and object detection." In ICCV, 2017. [2] Moustapha Cisse, Yossi Adi, Natalia Neverova, and Joseph Keshet. "Houdini: Fooling deep structured prediction models." In NeurIPS, 2018. [3] Javid Ebrahimi, Anyi Rao, Daniel Lowd, and Dejing Dou. "HotFlip: White-Box Adversarial Examples for Text Classification." In ACL, 2018.

  7. Adversarial Examples can be created by means other than ADDING perturbations (e.g., spatial transformations). [figure: object-detector outputs with confidence scores such as person: 0.817, bird: 0.342, person: 0.736, bird: 0.629, bird: 0.628, tvmonitor: 0.998, bird: 0.663] [4] Chaowei Xiao, Jun-Yan Zhu, Bo Li, Warren He, Mingyan Liu, and Dawn Song. "Spatially transformed adversarial examples." In ICLR, 2018. [5] Jianyu Wang, Zhishuai Zhang, Cihang Xie, et al. "Visual concepts and compositional voting." In Annals of Mathematical Sciences and Applications, 2018.

  8. Adversarial Examples exist in THE PHYSICAL WORLD. [6] Lifeng Huang, Chengying Gao, Yuyin Zhou, Changqing Zou, Cihang Xie, Alan Yuille, and Ning Liu. "UPC: Learning Universal Physical Camouflage Attacks on Object Detectors." arXiv, 2019.

  9. Generating an Adversarial Example is SIMPLE: maximize_r loss(f(x + r), y_true; θ), i.e., maximize the loss function w.r.t. the adversarial perturbation r.

  10. Generating an Adversarial Example is SIMPLE: maximize_r loss(f(x + r), y_true; θ), i.e., maximize the loss function w.r.t. the adversarial perturbation r. Compare with standard training: minimize_θ loss(f(x), y_true; θ), i.e., minimize the loss function w.r.t. the network parameters θ.
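As a concrete illustration of the "maximize the loss w.r.t. r" step on this slide, here is a minimal PGD-style sketch in PyTorch under an L_inf budget. The `model` argument, the epsilon/step-size/iteration values, and the [0, 1] input range are illustrative assumptions, not the talk's exact settings.

```python
# Sketch of a PGD attack: maximize loss(f(x + r), y_true; theta) w.r.t. r.
# Assumes a PyTorch classifier `model`; epsilon/alpha/steps are illustrative.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y_true, epsilon=8/255, alpha=2/255, steps=10):
    """Return x + r with ||r||_inf <= epsilon that (approximately) maximizes the loss."""
    x_adv = x.clone().detach()
    # Random start inside the epsilon ball.
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-epsilon, epsilon)
    x_adv = x_adv.clamp(0, 1)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_true)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient *ascent* on the loss, then project back into the epsilon ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.clone().detach() + (x_adv - x).clamp(-epsilon, epsilon)
        x_adv = x_adv.clamp(0, 1)
    return x_adv
```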

  11. ● Background ● Towards Robust Adversarial Defense

  12. Observation: adversarial perturbations are SMALL in the pixel space. [figure: a clean image and its adversarial counterpart with their corresponding feature maps]

  13. Observation: adversarial perturbations are BIG in the feature space. [figure: feature maps of the clean vs. the adversarial image]

  14. Observation: adversarial perturbations are BIG in the feature space. [figure: feature maps of the clean vs. the adversarial image] We should DENOISE these feature maps.
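A small sketch of how one might quantify this observation: compare the relative change a perturbation causes in pixel space against the change it causes in an intermediate feature map. The torchvision ResNet-50 here is only a stand-in feature extractor (the talk's models are adversarially trained ResNets), and `x_clean` / `x_adv` are assumed to be prepared image batches.

```python
# Sketch: a perturbation that is small in pixel space can be large in feature space.
import torch
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

# Truncate the network at an intermediate stage to expose feature maps.
feature_extractor = torch.nn.Sequential(
    model.conv1, model.bn1, model.relu, model.maxpool,
    model.layer1, model.layer2, model.layer3,
)

@torch.no_grad()
def relative_change(x_clean, x_adv):
    pixel = (x_adv - x_clean).norm() / x_clean.norm()
    f_clean = feature_extractor(x_clean)
    f_adv = feature_extractor(x_adv)
    feature = (f_adv - f_clean).norm() / f_clean.norm()
    return pixel.item(), feature.item()

# Typically the feature-space relative change is much larger than the
# pixel-space one, which motivates denoising at the feature level.
```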

  15. Our Solution: Denoising at the feature level. Traditional image denoising operations: ● Local filters (predefine a local region Ω_i for each pixel i): Bilateral filter: y_i = (1/C(x)) Σ_{∀j∈Ω_i} f(x_i, x_j) · x_j; Median filter: y_i = median{x_j : ∀j ∈ Ω_i}; Mean filter: y_i = (1/|Ω_i|) Σ_{∀j∈Ω_i} x_j ● Non-local filters (the local region Ω_i is the whole image I): Non-local means: y_i = (1/C(x)) Σ_{∀j∈I} f(x_i, x_j) · x_j
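A sketch of the non-local means operation from this slide, applied to a feature map: each position is replaced by a weighted mean over all positions, with weights f(x_i, x_j) computed from dot products of feature vectors and normalized so they play the role of 1/C(x). This follows the generic formula above; the softmax normalization and the tensor layout are illustrative choices, not necessarily the exact variant used in the talk.

```python
# Sketch of non-local means on a feature map:
# y_i = (1/C(x)) * sum_j f(x_i, x_j) * x_j, with the whole map as the neighborhood.
import torch

def nonlocal_means(x):
    """x: feature map of shape (N, C, H, W). Returns a denoised map of the same shape."""
    n, c, h, w = x.shape
    flat = x.reshape(n, c, h * w)                          # (N, C, HW)
    # Pairwise affinities f(x_i, x_j) from dot products of feature vectors.
    affinity = torch.einsum('nci,ncj->nij', flat, flat)    # (N, HW, HW)
    weights = torch.softmax(affinity, dim=-1)              # rows sum to 1, i.e. 1/C(x)
    # Weighted mean over all positions j for every position i.
    out = torch.einsum('nij,ncj->nci', weights, flat)      # (N, C, HW)
    return out.reshape(n, c, h, w)
```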

  16. Denoising Block Design: a denoising operation followed by a 1×1 conv, wrapped in a residual connection. ● Denoising operations may lose information, so we add a residual connection to balance the tradeoff between removing noise and retaining the original signal.
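A sketch of such a denoising block as described on this slide: a denoising operation followed by a 1×1 convolution, added back to the input through a residual connection. The class name and the way the denoising operation is passed in are assumptions for illustration.

```python
# Sketch of a denoising block: denoising op -> 1x1 conv -> residual add.
import torch.nn as nn

class DenoisingBlock(nn.Module):
    def __init__(self, channels, denoise_op):
        super().__init__()
        self.denoise_op = denoise_op                       # e.g. nonlocal_means above
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # Residual: original features + (1x1 conv of the denoised features).
        return x + self.conv(self.denoise_op(x))
```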

  17. Training Strategy: Adversarial Training. ● Core idea: train with adversarial examples.

  18. Training Strategy: Adversarial Training. ● Core idea: train with adversarial examples. min_θ max_r loss(f(x + r), y_true; θ); the max step generates the adversarial perturbation.

  19. Training Strategy: Adversarial Training. ● Core idea: train with adversarial examples. min_θ max_r loss(f(x + r), y_true; θ); the max step generates the adversarial perturbation, the min step optimizes the network parameters.
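Putting the two steps together, a minimal adversarial-training loop might look like the sketch below: the inner max step crafts perturbations (reusing the `pgd_attack` sketch from earlier), and the outer min step updates θ on the perturbed inputs. `model`, `train_loader`, and the optimizer hyperparameters are placeholders, not the talk's settings.

```python
# Sketch of adversarial training: min over theta of the max over r of the loss.
# Assumes `pgd_attack` from the earlier sketch is in scope.
import torch
import torch.nn.functional as F

def adversarial_training(model, train_loader, epochs=1, lr=0.1, device='cuda'):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            # max step: generate the adversarial perturbation r.
            x_adv = pgd_attack(model, x, y)
            # min step: optimize the network parameters theta on x + r.
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x_adv), y)
            loss.backward()
            optimizer.step()
    return model
```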

  20. Two Ways for Evaluating Robustness. Defending against White-box Attacks: ● attackers know everything about the models ● directly maximize loss(f(x + r), y_true; θ).

  21. Two Ways for Evaluating Robustness. Defending against White-box Attacks: ● attackers know everything about the models ● directly maximize loss(f(x + r), y_true; θ). Defending against Blind Attacks: ● attackers know nothing about the models ● attackers generate adversarial examples using substitute networks (relying on transferability).

  22. Defending Against White-box Attacks. ● Evaluating against adversarial attackers with up to 2000 attack iterations (more attack iterations indicate stronger attacks).

  23. Defending Against White-box Attacks, Part I. A successful adversarial training can give us a STRONG baseline. [plot: accuracy (%) vs. attack iterations from 10 to 2000 under PGD; ours (R-152 baseline) stays around 39-42%, while ALP (Inception-v3) drops to 27.9%]

  24. Defending Against White-box Attacks, Part I. Feature Denoising can give us additional benefits. [plot: accuracy (%) vs. attack iterations from 10 to 2000 under PGD; ours (R-152 denoise) stays around 42-46%, above the R-152 baseline at 39-42%, while ALP (Inception-v3) drops to 27.9%]

  25. Defending Against White-box Attacks, Part II. All denoising operations can help. [plot: accuracy (%) vs. attack iterations from 10 to 100 for ResNet-152 baseline, +4 bottleneck (ResNet-164), and +4 denoise variants: null (1×1 only), 3×3 mean, 3×3 median, bilateral (dot product / gaussian), and non-local (dot product / gaussian); the denoising variants reach up to 55.7% at 10 iterations vs. 52.5% for the baseline]

  26. Defending Against White-box Attacks, Part III. Feature Denoising is nearly as powerful as adding ~500 additional layers. [plot: accuracy (%) vs. attack iterations from 10 to 100; at 10 iterations: 52.5% (ResNet-152), 55.7% (ResNet-152 denoise), 57.3% (ResNet-638); at 100 iterations: 41.7%, 45.5%, 46.1%]

  27. Defending Against White-box Attacks, Part III. Feature Denoising can still provide benefits for the VERY deep ResNet-638. [plot: adding ResNet-638 denoise*, which reaches 61.3% at 10 iterations and 49.9% at 100 iterations, above ResNet-638 at 57.3% and 46.1%]

  28. Defending Against Blind Attacks. ● Offline evaluation against the 5 BEST attackers from the NeurIPS 2017 Adversarial Competition. ● Online competition against 48 UNKNOWN attackers in CAAD 2018.

  29. Defending Against Blind Attacks. ● Offline evaluation against the 5 BEST attackers from the NeurIPS 2017 Adversarial Competition. ● Online competition against 48 UNKNOWN attackers in CAAD 2018. ● CAAD 2018 "all or nothing" criterion: an image is considered correctly classified only if the model correctly classifies all adversarial versions of this image created by all attackers.
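The "all or nothing" criterion can be made concrete with a short sketch: an image only counts as correct if the model classifies every attacker's adversarial version of it correctly. The tensor layout and names below are assumptions for illustration.

```python
# Sketch of the "all or nothing" evaluation criterion.
import torch

@torch.no_grad()
def all_or_nothing_accuracy(model, adv_versions, labels):
    """
    adv_versions: tensor of shape (num_attackers, num_images, C, H, W)
    labels:       tensor of shape (num_images,)
    """
    correct_everywhere = torch.ones_like(labels, dtype=torch.bool)
    for adv_batch in adv_versions:                  # one attacker at a time
        preds = model(adv_batch).argmax(dim=1)
        correct_everywhere &= (preds == labels)
    # An image counts only if it survived every attacker.
    return correct_everywhere.float().mean().item()
```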

  30. Defending Against Blind Attacks: CAAD 2017 Offline Evaluation

  31. Defending Against Blind Attacks: CAAD 2017 Offline Evaluation

  32. Defending Against Blind Attacks: CAAD 2017 Offline Evaluation

  33. Defending Against Blind Attacks: CAAD 2018 Online Competition. [bar chart of accuracy (%): 1st place 50.6, 2nd 40.8, 3rd 8.6, 4th 3.6, 5th 0.6]

  34. Visualization. [figure: adversarial examples and their feature maps before and after the denoising operations]

  35. There is still a long way to go in defending against adversarial attacks…

  36. Questions?
