Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning (PowerPoint PPT Presentation)


SLIDE 1

Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

Battista Biggio

Pattern Recognition and Applications Lab University of Cagliari, Italy

Winter School on Quantitative Systems Biology: Learning and Artificial Intelligence, Nov. 15-16, Trieste, Italy. * Slides from this talk are inspired by the tutorial I prepared with Fabio Roli on this topic: https://www.pluribus-one.it/sec-ml/wild-patterns/

SLIDE 2

93

Countering Evasion Attacks

What is the rule? The rule is protect yourself at all times (from the movie “Million dollar baby”, 2004)

SLIDE 3

Security Measures against Evasion Attacks

  • 1. Reduce sensitivity to input changes with robust optimization
    – Adversarial training / regularization
  • 2. Introduce rejection / detection of adversarial examples

94

$$\min_{w} \; \sum_i \; \max_{\|\delta_i\| \leq \varepsilon} \; \ell\big(y_i, f_w(x_i + \delta_i)\big) \qquad \text{(bounded perturbation!)}$$

(Figure: decision regions of SVM-RBF with rejection, higher rejection rate, vs. SVM-RBF with no reject option.)

SLIDE 4

95

Countering Evasion: Reducing Sensitivity to Input Changes with Robust Optimization

SLIDE 5

  • Robust optimization (a.k.a. adversarial training)
  • Robustness and regularization (Xu et al., JMLR 2009)
    – under linearity of ℓ and f(x), regularization is equivalent to robust optimization

Reducing Input Sensitivity via Robust Optimization

$$\min_{w} \; \max_{\|\delta_i\| \leq \varepsilon} \; \sum_i \ell\big(y_i, f_w(x_i + \delta_i)\big) \qquad \text{(bounded perturbation!)}$$

$$\min_{w} \; \sum_i \ell\big(y_i, f_w(x_i)\big) + \varepsilon \, \|\nabla_x f_w\|_q$$

where $\|\nabla_x f_w\|_q$ is the dual norm of the perturbation; for a linear $f$, $\|\nabla_x f_w\|_q = \|w\|_q$.
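Below is a minimal sketch of how the inner maximization can be approximated in practice with projected gradient descent (PGD) adversarial training. It is an illustrative sketch under simplifying assumptions (PyTorch, L-infinity-bounded perturbations, cross-entropy loss); the model, data and hyper-parameters are hypothetical placeholders, not part of the original slides.

```python
import torch
import torch.nn.functional as F

def pgd_perturbation(model, x, y, eps=0.3, alpha=0.05, steps=10):
    """Approximate the inner max over ||delta||_inf <= eps with iterative gradient ascent."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend on the loss
            delta.clamp_(-eps, eps)              # project back onto the eps-ball
        delta.grad.zero_()
    return delta.detach()

def adversarial_training_step(model, optimizer, x, y, eps=0.3):
    """One outer step of: min_w sum_i max_{||delta_i|| <= eps} loss(y_i, f_w(x_i + delta_i))."""
    delta = pgd_perturbation(model, x, y, eps=eps)
    optimizer.zero_grad()                        # clear gradients accumulated while crafting delta
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```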

SLIDE 6

Experiments on Android Malware

  • Infinity-norm regularization is the optimal regularizer against sparse evasion attacks
    – Sparse evasion attacks penalize the ℓ1 norm of the perturbation, promoting the manipulation of only a few features

Results on Adversarial Android Malware

[Demontis, Biggio et al., Yes, ML Can Be More Secure!..., IEEE TDSC 2017]

Why? Because it bounds the maximum absolute weight value! (Figure: absolute weight values $|w|$ in descending order.)

$$\min_{w, b} \; \|w\|_\infty + C \sum_i \max\big(0, 1 - y_i f(x_i)\big), \qquad \|w\|_\infty = \max_{i=1,\dots,d} |w_i|$$
Sec-SVM

97
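Below is a minimal sketch of an infinity-norm regularized linear SVM in the spirit of Sec-SVM, trained by plain subgradient descent. It is an illustrative re-implementation under simplifying assumptions (NumPy, fixed learning rate), not the authors' code; all names and hyper-parameters are hypothetical.

```python
import numpy as np

def train_linf_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Subgradient descent on  ||w||_inf + C * sum_i max(0, 1 - y_i (w.x_i + b)).

    X: (n, d) feature matrix; y: labels in {-1, +1}.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                                        # margin violations
        grad_w = -C * (y[viol][:, None] * X[viol]).sum(axis=0)    # hinge-loss subgradient
        grad_b = -C * y[viol].sum()
        j = np.argmax(np.abs(w))                                  # ||w||_inf acts on the largest |w_j|
        grad_w[j] += np.sign(w[j])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```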

SLIDE 7

Adversarial Training and Regularization

  • Adversarial training can also be seen as a form of regularization, which penalizes the (dual) norm of the input gradients, $\|\nabla_x \ell\|_q$

  • Known as double backprop or gradient/Jacobian regularization

– see, e.g., Simon-Gabriel et al., Adversarial vulnerability of neural networks increases with input dimension, ArXiv 2018; and Lyu et al., A unified gradient regularization family for adversarial examples, ICDM 2015.

98

(Figure: prediction function g(x) around a point x, before and after adversarial training.) Take-home message: the net effect of these techniques is to make the prediction function of the classifier smoother.
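As a complement, here is a minimal sketch of input-gradient (double-backpropagation) regularization, which adds a penalty on $\|\nabla_x \ell\|$ to the training loss. It is an illustrative sketch assuming PyTorch; the penalty weight and model are hypothetical.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the norm of the input gradient of the loss."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # double backprop: differentiate the loss w.r.t. the input and keep the graph
    grad_x = torch.autograd.grad(loss, x, create_graph=True)[0]
    penalty = grad_x.flatten(1).norm(p=2, dim=1).mean()
    return loss + lam * penalty
```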
SLIDE 8

Ineffective Defenses: Obfuscated Gradients

  • Work by Carlini & Wagner (IEEE S&P ’17) and Athalye et al. (ICML ’18) has shown that
    – some recently-proposed defenses rely on obfuscated / masked gradients, and
    – they can be circumvented

99

g(") "’ " Obfuscated gradients do not allow the correct execution of gradient-based attacks... " g(") "’ ... but substitute models and/or smoothing can correctly reveal meaningful input gradients!

SLIDE 9

100

Countering Evasion: Detecting & Rejecting Adversarial Examples

SLIDE 10

Detecting & Rejecting Adversarial Examples

  • Adversarial examples tend to occur in blind spots

– Regions far from training data that are anyway assigned to ‘legitimate’ classes

101

(Figure, left: blind-spot evasion, not even required to mimic the target class. Right: rejection of adversarial examples through enclosing of the legitimate classes.)
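A minimal sketch of distance-based rejection in representation space follows: inputs whose (deep) features fall too far from the training data of the predicted class are rejected. The feature extractor, threshold, and helper names are illustrative assumptions, not the method from the cited papers.

```python
import numpy as np

def fit_class_centroids(train_feats, train_labels):
    """One centroid per class, computed from training-set feature representations."""
    return {c: train_feats[train_labels == c].mean(axis=0)
            for c in np.unique(train_labels)}

def predict_with_reject(feat, predicted_class, centroids, threshold):
    """Reject samples lying too far from the centroid of their predicted class."""
    dist = np.linalg.norm(feat - centroids[predicted_class])
    return "reject" if dist > threshold else predicted_class
```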

SLIDE 11

Detecting & Rejecting Adversarial Examples

(Figure: results as a function of the input perturbation, measured as Euclidean distance.)

102

[Melis, Biggio et al., Is Deep Learning Safe for Robot Vision? ICCVW ViPAR 2017]

SLIDE 12

[S. Sabour et al., ICLR 2016]

Why Is Rejection (in Representation Space) Not Enough?

103

SLIDE 13

Why Is Rejection (in Representation Space) Not Enough?

Slide credit: David Evans, DLS 2018 - https://www.cs.virginia.edu/~evans/talks/dls2018/

104

SLIDE 14

105

Adversarial Examples against Machine Learning Web Demo

https://sec-ml.pluribus-one.it/demo

SLIDE 15

106

Poisoning Machine Learning

SLIDE 16

Poisoning Machine Learning

107

(Diagram: training data with labels → pre-processing and feature extraction (x1, x2, ..., xd) → classifier learning.)

Example spam email: "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ..."

Feature vector x (SPAM): start 1, bang 1, portfolio 1, winner 1, year 1, ...
Learned weights w: start +2, bang +1, portfolio +1, winner +1, year +1, ...

The classifier generalizes well on test data.
SLIDE 17

Poisoning Machine Learning

108

(Diagram: corrupted training data → pre-processing and feature extraction (x1, x2, ..., xd) → classifier learning is compromised...)

Example poisoned spam email: "Start 2007 with a bang! Make WBFS YOUR PORTFOLIO's first winner of the year ... university campus ..."

Feature vector x (SPAM): start 1, bang 1, portfolio 1, winner 1, year 1, ..., university 1, campus 1
Learned weights w: start +2, bang +1, portfolio +1, winner +1, year +1, ..., university +1, campus +1

Poisoning data is injected into the training set ... to maximize error on test data.

SLIDE 18

  • Goal: to maximize classification error
  • Knowledge: perfect / white-box attack
  • Capability: injecting poisoning samples into TR
  • Strategy: find an optimal attack point xc in TR that maximizes classification error

Poisoning Attacks against Machine Learning

(Figure: decision boundary and classification error, 0.039 vs. 0.022, for two locations of the attack point xc; the right panel shows the classification error as a function of xc.)

[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]

109

SLIDE 19

Poisoning is a Bilevel Optimization Problem

  • Attacker’s objective

– to maximize generalization error on untainted data, w.r.t. poisoning point xc

  • Poisoning problem against (linear) SVMs:

– The loss is estimated on validation data (with no attack points!), while the algorithm is trained on surrogate data (including the attack point)

[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012] [Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015] [Munoz-Gonzalez, Biggio, Roli et al., Towards poisoning of deep learning..., AISec 2017]

$$\max_{x_c} \; L\big(D_{\mathrm{val}}, w^\ast\big) \qquad \text{s.t.} \quad w^\ast = \arg\min_{w} \; \mathcal{L}\big(D_{\mathrm{tr}} \cup \{(x_c, y_c)\}, \, w\big)$$

For the (linear) SVM:

$$\max_{x_c} \; \sum_{j=1}^{m} \max\big(0, \, 1 - y_j \, f^\ast(x_j)\big)$$

$$\text{s.t.} \quad (w^\ast, b^\ast) = \arg\min_{w, b} \; \tfrac{1}{2} w^\top w + C \sum_{i=1}^{n} \max\big(0, \, 1 - y_i f(x_i)\big) + C \max\big(0, \, 1 - y_c f(x_c)\big)$$

110
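A minimal sketch of the outer gradient-ascent loop on the poisoning point is shown below. The gradient of the validation loss with respect to x_c (obtained, e.g., by retraining and differentiating through the KKT conditions of the inner problem) is abstracted behind a hypothetical helper function; this is an illustrative sketch, not the attack code from the cited papers.

```python
import numpy as np

def optimize_poisoning_point(x_c0, y_c, grad_val_loss_wrt_xc, lr=0.1, steps=100,
                             box=(0.0, 1.0)):
    """Gradient ascent on the validation loss L(D_val, w*) w.r.t. the poisoning point x_c.

    grad_val_loss_wrt_xc(x_c, y_c) is assumed to retrain the classifier on
    D_tr U {(x_c, y_c)} and return dL(D_val, w*)/dx_c (e.g., via the KKT conditions).
    """
    x_c = np.array(x_c0, dtype=float)
    for _ in range(steps):
        g = grad_val_loss_wrt_xc(x_c, y_c)
        x_c = np.clip(x_c + lr * g, *box)   # ascend and keep the point in the feasible box
    return x_c
```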

SLIDE 20


Gradient-based Poisoning Attacks

  • The gradient is not easy to compute
    – the training point affects the classification function
  • Trick:
    – replace the inner learning problem with its equilibrium (KKT) conditions
    – this enables computing the gradient in closed form

  • Example for (kernelized) SVM

– similar derivation for Ridge, LASSO, Logistic Regression, etc.

111

(Figure: gradient-based trajectory of the poisoning point from its initialization $x_c^{(0)}$ to the final $x_c$.)

[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012] [Xiao, Biggio, Roli et al., Is feature selection secure against training data poisoning? ICML, 2015]

SLIDE 21

Experiments on MNIST digits

Single-point attack

  • Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000

– ‘0’ is the malicious (attacking) class – ‘4’ is the legitimate (attacked) one

(Figure: the initial poisoning digit $x_c^{(0)}$ and the optimized poisoning digit $x_c$.)

112

[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]

SLIDE 22

Experiments on MNIST digits

Multiple-point attack

  • Linear SVM; 784 features; TR: 100; VAL: 500; TS: about 2000

– ‘0’ is the malicious (attacking) class – ‘4’ is the legitimate (attacked) one

113

[Biggio, Nelson, Laskov. Poisoning attacks against SVMs. ICML, 2012]

SLIDE 23

How about Poisoning Deep Nets?

  • The ICML 2017 Best Paper by Koh et al., “Understanding Black-box Predictions via Influence Functions”, has derived adversarial training examples against a DNN
    – they were constructed by attacking only the last layer (a KKT-based attack against logistic regression) and assuming the rest of the network to be “frozen”

114

SLIDE 24

Towards Poisoning Deep Neural Networks

  • Solving the poisoning problem without exploiting KKT conditions (back-gradient)

– Muñoz-González, Biggio, Roli et al., AISec 2017 https://arxiv.org/abs/1708.08689

115

SLIDE 25

116

Countering Poisoning Attacks

What is the rule? The rule is protect yourself at all times (from the movie “Million dollar baby”, 2004)

SLIDE 26

Security Measures against Poisoning

  • Rationale: poisoning injects outlying training samples
  • Two main strategies for countering this threat

1. Data sanitization: remove poisoning samples from training data

  • Bagging for fighting poisoning attacks
  • Reject-On-Negative-Impact (RONI) defense (see the sketch after this list)

2. Robust Learning: learning algorithms that are robust in the presence of poisoning samples
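A minimal sketch of the RONI idea: each candidate training point is kept only if adding it does not degrade accuracy on a held-out validation set. The train/evaluate helper functions are hypothetical placeholders and the tolerance is illustrative.

```python
import numpy as np

def roni_filter(candidates, base_train, val_set, train_fn, acc_fn, tol=0.0):
    """Keep a candidate training point only if it does not reduce validation accuracy."""
    X_tr, y_tr = base_train
    base_acc = acc_fn(train_fn(X_tr, y_tr), val_set)
    kept = []
    for x, y in candidates:
        model = train_fn(np.vstack([X_tr, x]), np.append(y_tr, y))
        if acc_fn(model, val_set) >= base_acc - tol:   # negligible negative impact
            kept.append((x, y))
    return kept
```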


117

SLIDE 27

Robust Regression with TRIM

  • TRIM learns the model by retaining only training points with the smallest residuals

$$\arg\min_{w, b, I} \; L(w, b, I) = \frac{1}{|I|} \sum_{i \in I} \big(f(x_i) - y_i\big)^2 + \lambda \, \Omega(w)$$

$$n = (1 + \alpha) N, \qquad I \subset \{1, \dots, n\}, \qquad |I| = N$$

[Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018]

118
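A minimal sketch of the TRIM alternating scheme: fit the regression model on the current subset, then re-select the N points with the smallest residuals. This is an illustrative re-implementation using ridge regression, not the authors' code; all names and parameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def trim_regression(X, y, n_keep, lam=1.0, iters=20, seed=0):
    """Alternate between fitting on the subset I and keeping the n_keep smallest residuals."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y), n_keep, replace=False)      # initial subset I
    model = Ridge(alpha=lam)
    for _ in range(iters):
        model.fit(X[idx], y[idx])
        residuals = (model.predict(X) - y) ** 2
        idx = np.argsort(residuals)[:n_keep]             # re-select subset I
    return model, idx
```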

SLIDE 28

Experiments with TRIM (Loan Dataset)

  • TRIM MSE is within 1% of original model MSE

(Figure: MSE of existing methods, our defense (TRIM), and no defense; a lower MSE under poisoning means a better defense.)

119

[Jagielski, Biggio et al., IEEE Symp. Security and Privacy, 2018]

SLIDE 29

120

Other Attacks against ML

SLIDE 30

Attacks against Machine Learning

[Biggio & Roli, Wild Patterns, 2018, https://arxiv.org/abs/1712.03141]

Attacker's goal:
  • Integrity: misclassifications that do not compromise normal system operation
  • Availability: misclassifications that compromise normal system operation
  • Privacy / Confidentiality: querying strategies that reveal confidential information on the learning model or its users

Attacker's capability vs. goal:
  • Test data, integrity: evasion (a.k.a. adversarial examples)
  • Test data, privacy: model extraction / stealing; model inversion (hill-climbing); membership inference attacks
  • Training data, integrity: poisoning (to allow subsequent intrusions), e.g., backdoors or neural network trojans
  • Training data, availability: poisoning (to maximize classification error)

Attacker's knowledge:
  • perfect-knowledge (PK) white-box attacks
  • limited-knowledge (LK) black-box attacks (transferability with surrogate/substitute learning models)

121

SLIDE 31

Model Inversion Attacks

Privacy Attacks

  • Goal: to extract users’ sensitive information

(e.g., face templates stored during user enrollment)

– Fredrikson, Jha, Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. ACM CCS, 2015

  • Also known as hill-climbing attacks in the biometric community

– Adler. Vulnerabilities in biometric encryption systems. 5th Int’l Conf. AVBPA, 2005 – Galbally, McCool, Fierrez, Marcel, Ortega-Garcia. On the vulnerability of face verification systems to hill-climbing attacks. Patt. Rec., 2010

  • How: by repeatedly querying the target system and adjusting the input sample to maximize its output score (e.g., a measure of the similarity of the input sample with the user templates)

122

(Figure: reconstructed image vs. training image.)
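A minimal sketch of the hill-climbing loop described above: repeatedly query the target model and keep a random modification only when it increases the returned confidence score. The black-box query function, step size, and query budget are illustrative assumptions.

```python
import numpy as np

def hill_climb(query_score, shape, steps=5000, sigma=0.05, seed=0):
    """Reconstruct an input that maximizes the score returned by a black-box model."""
    rng = np.random.default_rng(seed)
    x = rng.random(shape)                   # random starting image in [0, 1]
    best = query_score(x)
    for _ in range(steps):
        candidate = np.clip(x + sigma * rng.standard_normal(shape), 0.0, 1.0)
        score = query_score(candidate)
        if score > best:                    # keep the change only if the score improves
            x, best = candidate, score
    return x, best
```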

SLIDE 32

Membership Inference Attacks

Privacy Attacks (Shokri et al., IEEE Symp. SP 2017)

  • Goal: to identify whether an input sample is part of the training set used to learn a deep neural network, based on the observed prediction scores for each class

123
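A minimal sketch of a simple confidence-thresholding baseline for membership inference (a strong simplification of the shadow-model approach of Shokri et al.); the threshold and the shape of the model outputs are illustrative assumptions.

```python
import numpy as np

def membership_inference(pred_probs, threshold=0.9):
    """Guess 'member' when the model's top predicted probability is suspiciously high.

    pred_probs: (n, num_classes) softmax outputs observed from the target model.
    Returns a boolean array: True = predicted to belong to the training set.
    """
    return pred_probs.max(axis=1) >= threshold
```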

SLIDE 33

(Figure: poisoned training data and a backdoored stop sign labeled as speed limit.)

Backdoor Attacks

Poisoning Integrity Attacks

124

Backdoor / poisoning integrity attacks place mislabeled training points in a region of the feature space far from the rest of the training data. The learning algorithm labels such a region as desired, allowing for subsequent intrusions / misclassifications at test time. (Companion figure: training data with no poisoning.)

  • T. Gu, B. Dolan-Gavitt, and S. Garg. BadNets: Identifying vulnerabilities in the machine learning model supply chain. In NIPS Workshop on Machine Learning and Computer Security, 2017.
  • X. Chen, C. Liu, B. Li, K. Lu, and D. Song. Targeted backdoor attacks on deep learning systems using data poisoning. ArXiv e-prints, 2017.
  • M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can machine learning be secure? In Proc. ACM Symp. Information, Computer and Comm. Sec., ASIACCS ’06, pages 16–25, New York, NY, USA, 2006. ACM.
  • M. Barreno, B. Nelson, A. Joseph, and J. Tygar. The security of machine learning. Machine Learning, 81:121–148, 2010.
  • B. Biggio, B. Nelson, and P. Laskov. Poisoning attacks against support vector machines. In J. Langford and J. Pineau, editors, 29th Int’l Conf. on Machine Learning, pages 1807–1814. Omnipress, 2012.
  • B. Biggio, G. Fumera, and F. Roli. Security evaluation of pattern classifiers under attack. IEEE Transactions on Knowledge and Data Engineering, 26(4):984–996, April 2014.
  • H. Xiao, B. Biggio, G. Brown, G. Fumera, C. Eckert, and F. Roli. Is feature selection secure against training data poisoning? In F. Bach and D. Blei, editors, JMLR W&CP - Proc. 32nd Int’l Conf. Mach. Learning (ICML), volume 37, pages 1689–1698, 2015.
  • L. Munoz-Gonzalez, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee, E. C. Lupu, and F. Roli. Towards poisoning of deep learning algorithms with back-gradient optimization. In 10th ACM Workshop on Artificial Intelligence and Security, AISec ’17, pp. 27–38, 2017. ACM.
  • B. Biggio and F. Roli. Wild patterns: Ten years after the rise of adversarial machine learning. ArXiv e-prints, 2018.
  • M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li. Manipulating machine learning: Poisoning attacks and countermeasures for regression learning. In 39th IEEE Symp. on Security and Privacy, 2018.

(The references above cover attacks referred to as ‘backdoor’ and as ‘poisoning integrity’.)

SLIDE 34

125

Are Adversarial Examples a Real Security Threat?

SLIDE 35

The World Is Not Digital…

  • … Previous cases of adversarial examples have a common characteristic: the adversary is able to precisely control the digital representation of the input to the machine learning tools…

126

[M. Sharif et al., ACM CCS 2016]

(Figure: school bus x + adversarial noise r, classified as ostrich, Struthio camelus.)

SLIDE 36

127

Do Adversarial Examples Exist in the Physical World?

SLIDE 37

  • Do adversarial images fool deep networks even when they operate in the physical world, for example when images are taken with a cell-phone camera?
    – Alexey Kurakin et al. (2016, 2017) explored the possibility of creating adversarial images for machine learning systems which operate in the physical world. They used images taken with a cell-phone camera as input to an Inception v3 image classification neural network
    – They showed that, in such a set-up, a significant fraction of adversarial images crafted using the original network are misclassified even when fed to the classifier through the camera

[Alexey Kurakin et al., ICLR 2017]

Adversarial Images in the Physical World

128

SLIDE 38

[M. Sharif et al., ACM CCS 2016]

The adversarial perturbation is applied only to the eyeglasses image region

Adversarial Glasses

129

SLIDE 39

Should We Be Worried?

130

SLIDE 40

No, We Should Not…

131

In this paper, we show experiments that suggest that a trained neural network classifies most of the pictures taken from different distances and angles of a perturbed image correctly. We believe this is because the adversarial property of the perturbation is sensitive to the scale at which the perturbed picture is viewed, so (for example) an autonomous car will misclassify a stop sign only from a small range of distances. [arXiv:1707.03501; CVPR 2017]

SLIDE 41

Yes, We Should...

132

[https://blog.openai.com/robust-adversarial-inputs/]

SLIDE 42

Yes, We Should...

133

[Athalye et al., Synthesizing robust adversarial examples. ICLR, 2018]

SLIDE 43

Yes, We Should...

134

SLIDE 44

Yes, We Should...

135

SLIDE 45

  • Adversarial examples can exist in the physical world: we can fabricate concrete adversarial objects (glasses, road signs, etc.)

  • But the effectiveness of attacks carried out by adversarial objects is still to be investigated with large-scale experiments in realistic security scenarios

  • Gilmer et al. (2018) have recently discussed the realism of the security threat posed by adversarial examples, pointing out that it should be carefully investigated
    – Are indistinguishable adversarial examples a real security threat?
    – For which real security scenarios are adversarial examples the best attack vector, better than attacking components outside the machine learning component?
    – …

136

Is This a Real Security Threat?

[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, https://arxiv.org/abs/1807.06732]

SLIDE 46

137

Are Indistinguishable Perturbations a Real Security Threat?

SLIDE 47

$f(x + r) \neq y$

The adversarial image x + r is visually hard to distinguish from x

…There is a torrent of work that views increased robustness to restricted perturbations as making these models more secure. While not all of this work requires completely indistinguishable modifications, many of the papers focus on specifically small modifications, and the language in many suggests or implies that the degree of perceptibility of the perturbations is an important aspect of their security risk…

Indistinguishable Adversarial Examples

138

[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]

SLIDE 48

  • The attacker can benefit from a minimal perturbation of a legitimate input; e.g., she could use the attack for a longer period of time before it is detected

  • But is minimal perturbation a necessary constraint for the attacker?

Indistinguishable Adversarial Examples

139

SLIDE 49

  • Is minimal perturbation a necessary constraint for the attacker?

Indistinguishable Adversarial Examples

140

SLIDE 50

Indistinguishable Adversarial Examples

141

[Flickr user “faungg”. Stop? Accessed: 2018-7-18. Aug. 2014. https://www.flickr.com/photos/44534236@N00/15536855528/]

  • Is minimal perturbation a necessary constraint for the attacker?
SLIDE 51

Attacks with Content Preservation

142

There are well known security applications where minimal perturbations and indistinguishability of adversarial inputs are not required at all…

SLIDE 52

…At the time of writing, we were unable to find a compelling example that required indistinguishability… To have the largest impact, we should both recast future adversarial example research as a contribution to core machine learning and develop new abstractions that capture realistic threat models.

143

Are Indistinguishable Perturbations a Real Security Threat?

[Justin Gilmer et al., Motivating the Rules of the Game for Adversarial Example Research, arXiv 2018]

SLIDE 53

To Conclude…

This is a recent research field…

144

Dagstuhl Perspectives Workshop on “Machine Learning in Computer Security” Schloss Dagstuhl, Germany, Sept. 9th-14th, 2012

SLIDE 54

Timeline of Learning Security

145

(Timeline figure: Adversarial ML → Security of DNNs.)

2004-2005: pioneering work. Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005
2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar, The Security of Machine Learning
2013: Srndic & Laskov, NDSS, claim nonlinear classifiers are secure
2013-2014: Biggio et al., ECML, IEEE TKDE, high-confidence and black-box evasion attacks to show the vulnerability of nonlinear classifiers
2014: Srndic & Laskov, IEEE S&P, show the vulnerability of nonlinear classifiers with our ECML ’13 gradient-based attack
2014: Szegedy et al., ICLR, adversarial examples vs. DL
2016: Papernot et al., IEEE S&P, evasion attacks / adversarial examples
2017: Papernot et al., ASIACCS, black-box evasion attacks
2017: Carlini & Wagner, IEEE S&P, high-confidence evasion attacks
2017: Grosse et al., ESORICS, application to Android malware
2017: Demontis et al., IEEE TDSC, secure learning for Android malware

SLIDE 55

Timeline of Learning Security

146

(This slide repeats the timeline from the previous slide.)

SLIDE 56

Timeline of Learning Security

Adversarial ML

2004-2005: pioneering work. Dalvi et al., KDD 2004; Lowd & Meek, KDD 2005. Main contributions:
  • minimum-distance evasion of linear classifiers
  • notion of adversary-aware classifiers

2006: Globerson & Roweis, ICML; 2009: Kolcz et al., CEAS; 2010: Biggio et al., IJMLC. Main contributions:
  • evasion attacks against linear classifiers in spam filtering

2006-2010: Barreno, Nelson, Rubinstein, Joseph, Tygar, The Security of Machine Learning (and references therein). Main contributions:
  • first consolidated view of the adversarial ML problem
  • attack taxonomy
  • exemplary attacks against some learning algorithms

2013: Srndic & Laskov, NDSS. Main contributions:
  • evasion of linear PDF malware detectors
  • claims nonlinear classifiers can be more secure

2013: Biggio et al., ECML-PKDD. Demonstrated the vulnerability of nonlinear algorithms to gradient-based evasion attacks, also under limited knowledge. Main contributions:
  • 1. gradient-based adversarial perturbations (against SVMs and neural nets)
  • 2. projected gradient descent / iterative attack (also on discrete features from malware data); transfer attack with surrogate/substitute model
  • 3. maximum-confidence evasion (rather than minimum-distance evasion)

2014: Biggio et al., IEEE TKDE. Main contributions:
  • framework for security evaluation of learning algorithms
  • attacker's model in terms of goal, knowledge, capability

2014: Srndic & Laskov, IEEE S&P. Used Biggio et al.'s ECML-PKDD ’13 gradient-based evasion attack to demonstrate the vulnerability of nonlinear PDF malware detectors

2014: Szegedy et al., ICLR. Independent discovery of (gradient-based) minimum-distance adversarial examples against deep nets; earlier implementation of adversarial training

Security of DNNs

2015: Goodfellow et al., ICLR. Maximin formulation of adversarial training, with adversarial examples generated iteratively in the inner loop

2016: Papernot et al., IEEE S&P. Framework for security evaluation of deep nets

2016: Papernot et al., Euro S&P. Distillation defense (gradient masking)

2016: Kurakin et al. Basic iterative attack with projected gradient to generate adversarial examples

2017: Papernot et al., ASIACCS. Black-box evasion attacks with substitute models (breaks distillation with transfer attacks on a smoother surrogate classifier)

2017: Carlini & Wagner, IEEE S&P. Breaks distillation again with maximum-confidence evasion attacks (rather than using minimum-distance adversarial examples)

2017: Grosse et al., ESORICS. Adversarial examples for malware detection

2017: Demontis et al., IEEE TDSC. Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection. Main contributions:
  • secure SVM against adversarial examples in malware detection

2018: Madry et al., ICLR. Improves the basic iterative attack from Kurakin et al. by adding noise before running the attack; first successful use of adversarial training to generalize across many attack algorithms

Legend: work on security evaluation of learning algorithms; work on evasion attacks (a.k.a. adversarial examples); pioneering work on adversarial machine learning; ... in malware detection (PDF / Android)

Biggio and Roli, Wild Patterns: Ten Years After The Rise of Adversarial Machine Learning, Pattern Recognition, 2018

147

SLIDE 57

Black Swans to the Fore

148

[Szegedy et al., Intriguing properties of neural networks, 2014]

After this “black swan”, the issue of the security of DNNs came to the fore… and not only in specialized scientific journals…

SLIDE 58

The Safety Issue to the Fore…

149

The black box of AI

  • D. Castelvecchi, Nature, Vol. 538, p. 20, Oct. 2016

Machine learning is becoming ubiquitous in basic research as well as in industry. But for scientists to trust it, they first need to understand what the machines are doing. Ellie Dobson, director of data science at the big-data firm Arundo Analytics in Oslo: if something were to go wrong as a result of setting the UK interest rates, she says, “the Bank of England can’t say, ‘the black box made me do it’”.

SLIDE 59

Why So Much Interest?

150

Before the deep net “revolution”, people were not surprised when machine learning was wrong; they were rather amazed when it worked well… Now that it seems to work for real applications, people are disappointed, and worried, by errors that humans would not make…

SLIDE 60

Errors of Humans and Machines…

151

Machine learning decisions are affected by several sources of bias that cause “strange” errors. But we should keep in mind that humans are biased too…

SLIDE 61

The Bat and the Ball Problem

152

A bat and a ball together cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost? Please give me the first answer that comes to your mind!

SLIDE 62

The Bat and the Ball Problem

153

The exact solution is $0.05 (5 cents). The wrong solution ($0.10) is due to attribute substitution, a psychological process thought to underlie a number of cognitive biases. It occurs when an individual has to make a judgment (of a target attribute) that is computationally complex, and instead substitutes a more easily calculated heuristic attribute.

$$\begin{cases} \text{bat} + \text{ball} = \$1.10 \\ \text{bat} = \text{ball} + \$1.00 \end{cases}$$
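Substituting the second equation into the first gives the worked solution (consistent with the $0.05 answer stated above):

$$(\text{ball} + 1.00) + \text{ball} = 1.10 \;\Rightarrow\; 2\,\text{ball} = 0.10 \;\Rightarrow\; \text{ball} = \$0.05, \quad \text{bat} = \$1.05$$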

SLIDE 63

Trust in Humans or Machines?

Algorithms are biased, but humans are as well… When should you trust humans, and when should you trust algorithms?

154

SLIDE 64

Learning Comes at a Price!

155

The introduction of novel learning functionalities increases the attack surface of computer systems and produces new vulnerabilities. The safety of machine learning will be more and more important in future computer systems, as will accountability, transparency, and the protection of fundamental human values and rights.

SLIDE 65

Thanks for Listening!

Any questions?

156

Engineering isn't about perfect solutions; it's about doing the best you can with limited resources (Randy Pausch, 1960-2008)