MagNet: A Two-Pronged Defense against Adversarial Examples
Dongyu Meng, Hao Chen
ShanghaiTech University, China; University of California, Davis, USA
2
Neural networks in real-life applications: user authentication, autonomous vehicles
3
A classifier maps an input to an output probability distribution over labels.
[Figure: a panda image classified with the distribution Panda 0.62, Tiger 0.03, Gibbon 0.11; p(x is panda) = 0.58.]
4
Adversarial examples: inputs carefully crafted to fool the classifier.
[Figure: adding a small crafted perturbation to the panda image x yields an adversarial example classified with p(x is gibbon) = 0.99.]
[ICLR 15] Goodfellow, Shlens, and Szegedy. Explaining and Harnessing Adversarial Examples
Attacks:
Fast gradient sign method (FGSM) [Goodfellow, 2015]
Carlini's attack [Carlini, 2017]
Iterative gradient sign [Kurakin, 2016]
DeepFool [Moosavi-Dezfooli, 2015]
……

5
Defenses:
Adversarial training [Goodfellow, 2015]
Defensive distillation [Papernot, 2016]
Detecting specific attacks [Metzen, 2017]
……

6
[Table: prior defenses compared on whether they target a specific attack and whether they modify the classifier; each listed defense answers Yes to at least one.]

7
MagNet:
Does not modify the target classifier.
Does not rely on attack-specific properties.
8
Possible inputs fill a dense sample space, but the inputs we care about lie on a low-dimensional manifold.
9
Some adversarial examples are far away from the manifold; classifiers are not trained to work on these inputs.
10
Other adversarial examples are close to the manifold boundary, where the classifier generalizes poorly.
11
12
Detector: decides whether the example is far from the manifold.
13
Reformer: draws the example towards the manifold.
14
[Figure: MagNet workflow. If the detector flags the input, MagNet rejects it; otherwise the reformer and classifier produce y, and MagNet returns y.]
15
Reconstruction error: ||X - X'||2, where X' is the autoencoder's reconstruction of input X.
16
Autoencoders
Train autoencoders on normal examples only; use them as building blocks.
17
Detector (reconstruction error): feed input X through the autoencoder to obtain reconstruction X'.
||X - X'||2 < threshold?
yes: Input is normal. MagNet accepts the input.
no: Input is adversarial. MagNet rejects the input.
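The threshold test above can be sketched in a few lines (a minimal sketch, not the authors' code; `autoencoder` stands for any reconstruction function trained on normal examples, and the threshold is assumed to be tuned on held-out normal data):

```python
# Sketch of the reconstruction-error detector.
import numpy as np

def reconstruction_error(x, autoencoder):
    """L2 distance between an input and its autoencoder reconstruction."""
    x_rec = autoencoder(x)
    return np.linalg.norm(x - x_rec)

def detect(x, autoencoder, threshold):
    """Return True if the input looks adversarial (far from the manifold)."""
    return reconstruction_error(x, autoencoder) >= threshold

# Toy illustration: an "autoencoder" that projects onto the first coordinate
# axis, so the x-axis plays the role of the data manifold.
toy_ae = lambda v: np.array([v[0], 0.0])
normal = np.array([1.0, 0.05])       # near the manifold
adversarial = np.array([1.0, 2.0])   # far from the manifold
print(detect(normal, toy_ae, threshold=0.5))       # prints: False
print(detect(adversarial, toy_ae, threshold=0.5))  # prints: True
```

The detector never consults the classifier here; it only measures distance from the manifold learned by the autoencoder.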
18
Detector (probability divergence): classify both the input and its autoencoder reconstruction, giving distributions P and Q (e.g. Panda 0.62, Tiger 0.03, Gibbon 0.11).
DKL(P||Q) < threshold?
yes: Input is normal. MagNet accepts the input.
no: Input is adversarial. MagNet rejects the input.
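The divergence test can be sketched as follows (a sketch under our own naming; `classifier` is assumed to return a softmax probability distribution, and refinements such as a softmax temperature are omitted):

```python
# Sketch of the probability-divergence detector.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) between two discrete distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def divergence_detect(x, autoencoder, classifier, threshold):
    """Reject inputs whose class distribution shifts after reconstruction."""
    p = classifier(x)               # P: distribution for the raw input
    q = classifier(autoencoder(x))  # Q: distribution for the reconstruction
    return kl_divergence(p, q) >= threshold
```

The intuition: reconstruction barely changes a normal input, so P and Q nearly coincide; for an adversarial input near the decision boundary the reconstruction can flip the predicted class, making the divergence large.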
19
Reformer: pass the input through the autoencoder, then classify the reconstruction to obtain distribution Q.
MagNet returns Q as the final classification result.
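Combining the detector and reformer stages gives the whole pipeline; a minimal sketch (function names are ours, not the authors' code; `detectors` is a list of boolean tests such as the reconstruction-error and divergence checks):

```python
# Minimal MagNet pipeline sketch: detect, then reform and classify.
def magnet(x, detectors, autoencoder, classifier):
    """Return the class distribution Q, or None when MagNet rejects the input."""
    if any(detect(x) for detect in detectors):
        return None                    # detector prong: reject off-manifold inputs
    return classifier(autoencoder(x))  # reformer prong: classify the reformed input
```

Note that the target classifier is called unmodified, which is the first design goal stated earlier.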
20
[Table: threat models, by whether the attacker knows the parameters of the target classifier and of the defense; a blackbox attacker does not know the defense parameters, a whitebox attacker knows both.]
21
[Figure: classification accuracy on adversarial examples.]
22
[Figure: classification accuracy on adversarial examples.]
23
Detector and reformer complement each other.
Large distortion makes attacks more transferable; small distortion makes them less noticeable.
[Figure: accuracy vs. distortion and attack confidence, for no defense, reformer only, detector only, and complete MagNet (detector + reformer).]
24
To defeat a whitebox attacker, the defender has to either:
25
[Table: threat models (blackbox, graybox, whitebox), distinguished by whether the attacker knows the parameters of the classifier and of the defense.]
26
With MagNet, this means training diverse autoencoders.
Train n autoencoders at the same time. Minimize the reconstruction error while penalizing each autoencoder's resemblance to the average reconstructed image (autoencoder diversity).
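One way to write this objective down (a sketch: the slide names the three ingredients but not the exact norms or the weighting, so the squared-error terms and `alpha` are our assumptions):

```python
# Sketch of a diversity objective for n autoencoders.
import numpy as np

def diversity_loss(x, reconstructions, alpha=0.1):
    """reconstructions: list of outputs ae_i(x) from the n autoencoders."""
    avg = np.mean(reconstructions, axis=0)  # average reconstructed image
    recon_err = sum(np.sum((x - r) ** 2) for r in reconstructions)
    # Reconstructions close to the average are penalized, so minimizing the
    # total loss pulls each ae_i(x) towards x while pushing them apart.
    resemblance = sum(np.sum((r - avg) ** 2) for r in reconstructions)
    return recon_err - alpha * resemblance
```

A graybox attacker who fits one deployed autoencoder then faces a different one at test time, which is where the diversity pays off.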
27
Idea: penalize the resemblance of the autoencoders.
[Table: graybox classification accuracy, by which autoencoder the attack is generated on and which autoencoder defends.]
28
The effectiveness of MagNet depends on the assumptions that normal inputs lie on a low-dimensional manifold and that adversarial examples lie off this manifold or near its boundary.
We show empirically that these assumptions are likely correct.
29
We propose the MagNet framework: a detector that rejects inputs far from the manifold, and a reformer that draws inputs towards the manifold.
We demonstrated an effective defense against adversarial examples in the blackbox scenario with MagNet. Instead of the whitebox model, we advocate the graybox model, where security rests on model diversity.
30
Find out more about MagNet: