Safe and Robust Deep Learning
Gagandeep Singh, PhD Student, Department of Computer Science
SafeAI @ ETH Zurich (safeai.ethz.ch)
Joint work with: Martin Vechev, Markus Püschel, Timon Gehr, Matthew Mirman, Mislav Balunović, Maximilian Baader, Petar Tsankov, Dana Drachsler-Cohen
Publications:
- S&P'18: AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation
- NeurIPS'18: Fast and Effective Robustness Certification
- POPL'19: An Abstract Domain for Certifying Neural Networks
- ICLR'19: Boosting Robustness Certification of Neural Networks
- ICML'18: Differentiable Abstract Interpretation for Provably Robust Neural Networks
- ICML'19: DL2: Training and Querying Neural Networks with Logic
Systems:
- ERAN: generic neural network verifier
- DiffAI: system for training provably robust networks
- DL2: system for training and querying networks with logical constraints
Self-driving cars (https://waymo.com/tech/), voice assistants (https://www.amazon.com/Amazon-Echo-And-Alexa-Devices), translation (https://translate.google.com)
- Adding small noise to the input audio makes the network transcribe any arbitrary phrase (Audio Adversarial Examples: Targeted Attacks on Speech-to-Text, ICML 2018).
- The self-driving car incorrectly decides to turn right on Input 2 and crashes into the guardrail (DeepXplore: Automated Whitebox Testing of Deep Learning Systems, SOSP'17).
- The Ensemble model is fooled by the addition of an adversarial distracting sentence in blue (Adversarial Examples for Evaluating Reading Comprehension Systems, EMNLP'17).
Example networks and regions:
- Image classification network 𝒈: region ℛ based on changes to pixel intensity, or based on geometric transformations, e.g., rotation
- Speech recognition network 𝒈: region ℛ based on noise added to the audio signal
- Aircraft collision avoidance network 𝒈: region ℛ based on input sensor values
Attacks: e.g., Goodfellow et al. 2014, Carlini & Wagner 2016, Madry et al. 2017
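The attack side can be sketched with a minimal FGSM-style one-step perturbation (in the spirit of Goodfellow et al. 2014) on a toy linear classifier; the weights `W`, input `x`, and `eps` below are illustrative, not from the talk:

```python
import numpy as np

def fgsm(x, W, label, eps):
    """One-step attack: move x against the margin of the true class."""
    scores = W @ x
    # strongest competing class
    other = int(np.argmax([s if i != label else -np.inf
                           for i, s in enumerate(scores)]))
    grad = W[label] - W[other]          # gradient of the margin w.r.t. x
    return np.clip(x - eps * np.sign(grad), 0.0, 1.0)

W = np.array([[1.0, -1.0], [-1.0, 1.0]])  # toy 2-class linear classifier
x = np.array([0.6, 0.4])                  # classified as class 0
x_adv = fgsm(x, W, label=0, eps=0.25)
print(W @ x, W @ x_adv)                   # the predicted class flips to 1
```

For deep networks the gradient comes from backpropagation instead of the closed form used here, but the principle (a small signed step against the classification margin) is the same.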
Verification: e.g., Reluplex [2017], Wong et al. 2018, AI2 [2018]
ERAN verification framework (https://github.com/eth-sri/eran)
- Input region: based on pixel intensity changes, geometric transformations (vector fields, rotations, etc.), or audio processing; e.g., possible aircraft sensor values
- Supported networks: fully connected, convolutional, residual, LSTM; activations: ReLU, Sigmoid, Tanh, Maxpool
- Analyses: Box, DeepZ [NeurIPS'18], DeepPoly [POPL'19], RefineZono [ICLR'19] (MILP + DeepZ), K-Poly [submitted] (MILP + DeepPoly), GPUPoly [submitted]
- Output: Yes/No — does the safety property hold for every input in the region?
- Sound w.r.t. floating point arithmetic
- State-of-the-art complete and incomplete verification
- Extensible to other verification tasks
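The final check such a verifier performs can be sketched as follows: once the analysis has interval bounds on the pre-softmax outputs over the whole input region, the property "class k is always chosen" holds if k's lower bound beats every other class's upper bound. The helper `certified` and the bounds below are illustrative, not ERAN's API:

```python
# 'lower' and 'upper' are per-class output bounds valid for every input
# in the region; certification succeeds only if class k always dominates.
def certified(lower, upper, k):
    return all(lower[k] > upper[j] for j in range(len(lower)) if j != k)

lower = [2.1, -0.3, 0.4]   # illustrative bounds
upper = [3.0,  1.9, 1.8]
print(certified(lower, upper, 0))  # True: 2.1 beats 1.9 and 1.8
```

Note the check is one-sided: if it fails, the property may still hold — the bounds may simply be too loose, which is why tighter domains (DeepPoly, RefineZono) matter.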
[Figure: example network with inputs y1, y2 ∈ [−1, 1]; y3 = y1 + y2, y4 = y1 − y2; y5 = max(0, y3), y6 = max(0, y4); y7 = y5 + y6, y8 = y5 − y6; y9 = max(0, y7), y10 = max(0, y8); outputs y11 = y9 + y10 + 1, y12 = y10.]
[Figure: the example network again (inputs y1, y2 ∈ [−1, 1], two affine/ReLU layer pairs, outputs y11, y12).]
Certification:
- Attacker region ℛ: ℓ∞ ball with ϑ = 0.1 around the input, i.e., one interval per pixel: [0.1, 0.3], [0.4, 0.6], [0.18, 0.36], …, [0.7, 0.9]
- All possible outputs (before softmax), as affine expressions over the input variables θ, where ∀j. θj ∈ [0, 1]:
  2.60 + 0.015·θ1 + 0.023·θ2 + 5.181·θ3 + ⋯
  4.63 − 0.005·θ1 − 0.006·θ2 + 0.023·θ3 + ⋯
  …
  0.12 − 0.125·θ1 + 0.102·θ2 + 3.012·θ3 + ⋯
- Output constraint: checked over these symbolic output bounds
[Figure: the example network again.]
Box (interval) analysis of the example network: inputs [−1, 1], [−1, 1]; after the first affine layer [−2, 2], [−2, 2]; after ReLU [0, 2], [0, 2]; after the second affine layer [0, 4], [−2, 2]; after ReLU [0, 4], [0, 2]; outputs [1, 7], [0, 2].
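Assuming the running example network (inputs in [−1, 1], two affine layers with weights ±1 followed by ReLUs, outputs y9 + y10 + 1 and y10), plain interval (Box) propagation reproduces the bounds above:

```python
import numpy as np

def affine_box(l, u, W, b):
    """Interval bounds through an affine layer: split W by sign."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

def relu_box(l, u):
    return np.maximum(l, 0), np.maximum(u, 0)

W = np.array([[1.0, 1.0], [1.0, -1.0]])   # both hidden affine layers
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
l, u = affine_box(l, u, W, np.zeros(2))    # [-2,2], [-2,2]
l, u = relu_box(l, u)                      # [0,2], [0,2]
l, u = affine_box(l, u, W, np.zeros(2))    # [0,4], [-2,2]
l, u = relu_box(l, u)                      # [0,4], [0,2]
Wo = np.array([[1.0, 1.0], [0.0, 1.0]])    # y11 = y9 + y10 + 1, y12 = y10
l, u = affine_box(l, u, Wo, np.array([1.0, 0.0]))
print(l, u)                                # [1. 0.] [7. 2.]
```

The final bounds [1, 7] and [0, 2] match the slide; Box is cheap but ignores correlations between neurons, which is exactly what the relational domain below recovers.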
The DeepPoly domain associates with each 𝑦i a lower polyhedral constraint 𝑎i ≤ 𝑦i, an upper polyhedral constraint 𝑦i ≤ 𝑏i, and concrete bounds 𝑙i ≤ 𝑦i ≤ 𝑢i.
Transformer costs, Polyhedra vs. our domain:
- Affine: O(o·n²) vs. O(x_max²·M)
- ReLU: O(exp(o, n)) vs. O(1)
[Figure: the example network again.]
[Figure: the example network again.]
ReLU transformer for 𝑦j = max(0, 𝑦i):
- If 𝑢i ≤ 0 (always inactive): 𝑎j = 𝑏j = 0 and 𝑙j = 𝑢j = 0.
- If 𝑙i ≥ 0 (always active): 𝑎j = 𝑏j = 𝑦i, 𝑙j = 𝑙i, 𝑢j = 𝑢i.
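The two cases above, plus the mixed case from the POPL'19 paper (lower slope λ ∈ {0, 1} chosen to minimize area, chord as upper bound), can be sketched as follows; as a simplification the concrete lower bound in the mixed case is set to 0 here:

```python
# DeepPoly-style ReLU transformer for y_j = max(0, y_i), given concrete
# bounds l_i <= y_i <= u_i. Returns the lower and upper relational
# constraints as (coeff, offset) pairs meaning coeff*y_i + offset,
# plus the new concrete bounds (l_j, u_j).
def relu_deeppoly(l_i, u_i):
    if u_i <= 0:                  # always inactive: y_j = 0
        return (0.0, 0.0), (0.0, 0.0), 0.0, 0.0
    if l_i >= 0:                  # always active: y_j = y_i
        return (1.0, 0.0), (1.0, 0.0), l_i, u_i
    # mixed: upper bound is the chord u_i*(y_i - l_i)/(u_i - l_i);
    # lower bound lam*y_i with lam in {0, 1}, chosen to minimize area
    s = u_i / (u_i - l_i)
    lam = 1.0 if u_i > -l_i else 0.0
    return (lam, 0.0), (s, -s * l_i), 0.0, u_i

print(relu_deeppoly(-1.0, 2.0))   # mixed case: chord (2/3)*y_i + 2/3
```

Each case costs constant time per neuron, which is where the O(1) ReLU transformer in the comparison with Polyhedra comes from.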
[Figure: backsubstituting the polyhedral constraints through the preceding affine layer.]
[Figure: backsubstitution, continued.]
[Figure: backsubstitution continued through the ReLU and the first affine layer, reaching the inputs y1, y2.]
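The gain from backsubstitution can be checked numerically on the running example (weights assumed from the reconstruction above): the ReLU chord bounds over [−2, 2] give y5 ≤ 0.5·y3 + 1 and y6 ≤ 0.5·y4 + 1; substituting into y7 = y5 + y6 and rewriting y3 = y1 + y2, y4 = y1 − y2 in terms of the inputs yields y7 ≤ y1 + 2 ≤ 3, tighter than the Box bound of 4:

```python
import numpy as np

# Affine upper bounds, one row per neuron, last column the constant term.
U_relu = np.array([[0.5, 0.0, 1.0],    # y5 <= 0.5*y3 + 1
                   [0.0, 0.5, 1.0]])   # y6 <= 0.5*y4 + 1
A1 = np.array([[1.0, 1.0, 0.0],        # y3 = y1 + y2
               [1.0, -1.0, 0.0]])      # y4 = y1 - y2
w7 = np.array([1.0, 1.0])              # y7 = y5 + y6

c = w7 @ U_relu                        # bound in terms of y3, y4: [0.5, 0.5, 2.0]
coeffs = c[:2] @ A1[:, :2]             # in terms of the inputs: [1.0, 0.0]
const = c[2] + c[:2] @ A1[:, 2]        # 2.0
ub = const + np.sum(np.abs(coeffs))    # inputs range over [-1, 1]
print(coeffs, const, ub)               # [1. 0.] 2.0 3.0
```

The cancellation of y2's coefficient is the whole point: the relational constraints remember that y5 and y6 depend on the same inputs, which pure interval propagation forgets.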
[Figure: the example network again.]
[Figure: the concrete bounds l, u obtained for the output-layer neurons after backsubstitution.]
Geometric robustness: rotate the image between 0° and +5°. [Figure: three examples comparing the input region ℛ and the output over-approximation Q(ℛ) computed by Sampling + Lipschitz vs. ERAN.]
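Why sampling alone is insufficient can be illustrated with a toy version of the rotation experiment: rotating a point through sampled angles and taking coordinate-wise min/max only estimates the region from finitely many samples, whereas a certified analysis must enclose every intermediate angle. The point and angle grid here are illustrative:

```python
import math

# Rotate a 2D point by a given angle in degrees.
def rotate(p, deg):
    a = math.radians(deg)
    x, y = p
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

# Sample rotations of (1, 0) between 0 and 5 degrees in 0.5-degree steps
# and take the per-coordinate interval hull of the samples.
pts = [rotate((1.0, 0.0), i * 0.5) for i in range(11)]
xs, ys = zip(*pts)
print(min(xs), max(xs), min(ys), max(ys))
```

For this smooth transformation the sample hull happens to be close to exact, but for a pixel grid with interpolation the true reachable set between samples can poke outside it, which is why sampling must be combined with Lipschitz bounds (or replaced by a sound analysis) to certify anything.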