Safe and Robust Deep Learning
Mislav Balunović Department of Computer Science
SafeAI @ ETH Zurich (safeai.ethz.ch)
Joint work with: Martin Vechev, Markus Püschel, Gagandeep Singh, Timon Gehr, Maximilian Baader, Petar Tsankov, Dana Drachsler, Matthew Mirman
Publications:
S&P’18: AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation
NeurIPS’18: Fast and Effective Robustness Certification
POPL’19: An Abstract Domain for Certifying Neural Networks
ICLR’19: Boosting Robustness Certification of Neural Networks
ICML’18: Differentiable Abstract Interpretation for Provably Robust Neural Networks
ICML’19: DL2: Training and Querying Neural Networks with Logic
Systems:
ERAN: Generic neural network verifier https://github.com/eth-sri/eran/
DiffAI: System for training provably robust networks https://github.com/eth-sri/diffai
DL2: System for training and querying networks with logical constraints https://github.com/eth-sri/dl2
Self-driving cars: https://waymo.com/tech/
Voice assistant: https://www.amazon.com/Amazon-Echo-And-Alexa-Devices
Translation: https://translate.google.com
The self-driving car incorrectly decides to turn right on Input 2 and crashes into the guardrail. (DeepXplore: Automated Whitebox Testing of Deep Learning Systems, SOSP’17)
The Ensemble model is fooled by the addition of an adversarial distracting sentence in blue. (Adversarial Examples for Evaluating Reading Comprehension Systems, EMNLP’17)
Adding small noise to the input audio makes the network transcribe any arbitrary phrase. (Audio Adversarial Examples: Targeted Attacks on Speech-to-Text, ICML 2018)
Example networks and regions:
Image classification network f: region R based on changes to pixel intensity, or on geometric transformations, e.g., rotation
Speech recognition network f: region R based on added noise to audio signal
Aircraft collision avoidance network f: region R based on input sensor values
Attacks, e.g.: Goodfellow 2014, Carlini & Wagner 2016, Madry et al. 2017
Certification, e.g.: Reluplex [2017], Wong et al. 2018, AI2 [2018]
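To make the notion of a region R concrete, here is a minimal sketch for the pixel-intensity case. It assumes an L-infinity ball of radius eps around an image whose intensities lie in [0, 1]; the function names, the radius, and the sample values are illustrative and not taken from the slides.

```python
def linf_region(pixels, eps):
    """Per-pixel [lo, hi] bounds of an L-infinity ball of radius eps,
    clipped to the valid intensity range [0, 1] (an assumption)."""
    return [(max(0.0, p - eps), min(1.0, p + eps)) for p in pixels]


def in_region(candidate, region):
    """True if every pixel of the candidate lies inside its bounds."""
    return len(candidate) == len(region) and all(
        lo <= c <= hi for c, (lo, hi) in zip(candidate, region)
    )


# Hypothetical three-pixel "image" and radius.
image = [1.0, 0.125, 0.5]
region = linf_region(image, 0.025)
```

An attack searches this region for one input that changes the prediction, while certification has to reason about every input in it at once.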
ERAN verification framework https://github.com/eth-sri/eran
Input region: based on pixel intensity changes; based on geometric transformations (vector fields, rotations, etc.); based on audio processing; possible sensor values (aircraft sensors)
Neural network: fully connected, convolutional, residual, LSTM; ReLU, Sigmoid, Tanh, Maxpool
Box, DeepZ, DeepPoly, GPUPoly, RefineZono (MILP + DeepZ), K-Poly (MILP + DeepPoly)
Safety property: Yes / No
Sound w.r.t. floating point arithmetic
Extensible to other verification tasks
State-of-the-art complete and incomplete verification
[Figure: example network with neurons $y_1, \dots, y_{12}$; inputs $y_1, y_2 \in [-1,1]$; ReLU nodes $\max(0, y_3)$, $\max(0, y_4)$, $\max(0, y_7)$, $\max(0, y_8)$; edge weights $1$ and $-1$; a bias of $1$]
Attacker region:
$y_0 = 0$
$y_1 = 0.975 + 0.025\,\vartheta_1$
$y_2 = 0.125$
$\dots$
$y_{784} = 0.938 + 0.062\,\vartheta_{784}$
$\forall j.\ \vartheta_j \in [0,1]$
Certification
All possible outputs (before softmax):
$y_0 = 0$
$y_1 = 2.60 + 0.015\,\vartheta_0 + 0.023\,\vartheta_1 + 5.181\,\vartheta_2 + \cdots$
$y_2 = 4.63 - 0.005\,\vartheta_0 - 0.006\,\vartheta_1 + 0.023\,\vartheta_2 + \cdots$
$\dots$
$y_9 = 0.12 - 0.125\,\vartheta_0 + 0.102\,\vartheta_1 + 3.012\,\vartheta_2 + \cdots$
$\forall j.\ \vartheta_j \in [0,1]$
Output constraint $\chi_o$
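Both the attacker region and the set of outputs are written as affine forms: a constant plus coefficients on noise symbols that range over [0, 1]. The following is a minimal sketch of how such forms pass exactly through one affine layer and how an output constraint can then be checked. The 2x2 weights, the bias, and the constraint "output 0 beats output 1" are hypothetical, and activation layers (which need a relaxation) are not covered.

```python
def affine_layer(centers, gens, W, b):
    """Push affine forms x_i = centers[i] + sum_j gens[i][j] * e_j
    through y = W x + b; the result is again a list of affine forms."""
    n_in, n_noise = len(centers), len(gens[0])
    out_centers = [
        b[k] + sum(W[k][i] * centers[i] for i in range(n_in))
        for k in range(len(W))
    ]
    out_gens = [
        [sum(W[k][i] * gens[i][j] for i in range(n_in)) for j in range(n_noise)]
        for k in range(len(W))
    ]
    return out_centers, out_gens


def concretize(center, gen):
    """Bounds of center + sum_j gen[j] * e_j with every e_j in [0, 1]."""
    lo = center + sum(min(0.0, g) for g in gen)
    hi = center + sum(max(0.0, g) for g in gen)
    return lo, hi


# Two inputs in the style of the attacker region: 0.975 + 0.025*e1 and 0.125.
centers = [0.975, 0.125]
gens = [[0.025], [0.0]]

# Hypothetical affine layer (not from the slides).
W = [[2.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.5]
out_c, out_g = affine_layer(centers, gens, W, b)

# Hypothetical output constraint: output 0 is larger than output 1 everywhere.
# Taking the difference of the two forms keeps their shared noise symbols.
diff_center = out_c[0] - out_c[1]
diff_gen = [g0 - g1 for g0, g1 in zip(out_g[0], out_g[1])]
certified = concretize(diff_center, diff_gen)[0] > 0
```

Checking the difference of two outputs, rather than comparing their separate bounds, is what lets the shared noise symbols cancel.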
Interval bounds per neuron: $y_1, y_2 \in [-1,1]$; $y_3, y_4 \in [-2,2]$; $y_5, y_6 \in [0,2]$; $y_7 \in [0,4]$; $y_8 \in [-2,2]$; $y_9 \in [0,4]$; $y_{10} \in [0,2]$; $y_{11} \in [1,7]$; $y_{12} \in [0,2]$
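The twelve intervals can be reproduced with plain interval (Box) arithmetic. The sketch below assumes a network structure reconstructed from the figure residue and from the intervals themselves: two affine layers with weights (1, 1) and (1, -1), each followed by ReLU, and a final layer with a bias of 1 on the first output. Treat the exact weights as an assumption, not as a transcription of the slide.

```python
def affine(bounds, weights, bias):
    """Interval of bias + sum_i weights[i] * y_i, given (lo, hi) per y_i."""
    lo = bias + sum(w * (l if w >= 0 else u) for w, (l, u) in zip(weights, bounds))
    hi = bias + sum(w * (u if w >= 0 else l) for w, (l, u) in zip(weights, bounds))
    return (lo, hi)


def relu(bound):
    """Interval of max(0, y) for y in bound."""
    return (max(0, bound[0]), max(0, bound[1]))


# Inputs.
y1 = y2 = (-1, 1)
# Assumed first affine layer and its ReLU.
y3 = affine([y1, y2], [1, 1], 0)
y4 = affine([y1, y2], [1, -1], 0)
y5, y6 = relu(y3), relu(y4)
# Assumed second affine layer and its ReLU.
y7 = affine([y5, y6], [1, 1], 0)
y8 = affine([y5, y6], [1, -1], 0)
y9, y10 = relu(y7), relu(y8)
# Assumed output layer: bias 1 on the first output, second output copies y10.
y11 = affine([y9, y10], [1, 1], 1)
y12 = affine([y9, y10], [0, 1], 0)

bounds = [y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12]
```

Interval arithmetic treats every neuron independently, so it forgets that $y_5$ and $y_6$ both depend on the same inputs; that loss is what the polyhedral constraints on the following slides aim to reduce.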
A lower polyhedral constraint $b_j^{\le}$ and an upper polyhedral constraint $b_j^{\ge}$ are associated with each $y_j$
ReLU $y_k = \max(0, y_j)$ (in the figure: $y_5 = \max(0, y_3)$, $y_6 = \max(0, y_4)$), where $m_j$ and $v_j$ are the concrete lower and upper bounds of $y_j$:
If $v_j \le 0$: $b_k^{\le} = b_k^{\ge} = 0$, $m_k = v_k = 0$
If $m_j \ge 0$: $b_k^{\le} = b_k^{\ge} = y_j$, $m_k = m_j$, $v_k = v_j$
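The two ReLU cases listed on the slide can be written as a small function. For the remaining case, where the input bounds straddle zero, the sketch assumes the relaxation described in the POPL’19 paper (an upper line through the two end points, and a lower line of slope 0 or 1 chosen by which gives the smaller area); that third case is not in the slide text, and the tuple layout of the return value is purely illustrative.

```python
def relu_relaxation(m, v):
    """Constraints for y_k = max(0, y_j) given concrete bounds m <= y_j <= v.

    Returns (lower, upper, m_k, v_k), where lower and upper are
    (slope, offset) pairs meaning slope * y_j + offset.
    """
    if v <= 0:
        # Neuron is always inactive: both constraints are the constant 0.
        return (0.0, 0.0), (0.0, 0.0), 0.0, 0.0
    if m >= 0:
        # Neuron is always active: both constraints are y_j itself.
        return (1.0, 0.0), (1.0, 0.0), float(m), float(v)
    # Assumed third case (m < 0 < v), following the POPL'19 relaxation.
    slope = v / (v - m)
    upper = (slope, -slope * m)      # line through (m, 0) and (v, v)
    lam = 1.0 if v > -m else 0.0     # lower line: y_j or 0
    lower = (lam, 0.0)
    m_k = float(m) if lam == 1.0 else 0.0
    return lower, upper, m_k, float(v)
```

For example, with bounds $[-2, 2]$ the sketch gives the lower constraint $0$ and the upper constraint $0.5\,y_j + 1$.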
[Figure: affine node $y_7$ with incoming edges of weight $1$ from $y_5$ and $y_6$; later panels also show $y_5 = \max(0, y_3)$, $y_6 = \max(0, y_4)$ and the inputs $y_1$, $y_2$ with edge weights $1$, $-1$, $1$, $1$]
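One way to read these panels is as substitution of constraints backwards until only the inputs remain. The sketch below does this for $y_7 = y_5 + y_6$. It assumes $y_3 = y_1 + y_2$ and $y_4 = y_1 - y_2$, and for the two ReLU nodes it hard-codes the constraints $0 \le y_5 \le 0.5\,y_3 + 1$ and $0 \le y_6 \le 0.5\,y_4 + 1$ (the POPL’19-style relaxation for bounds $[-2, 2]$). The data layout and helper names are hypothetical.

```python
# constraints[k] = (lower_expr, upper_expr); an expr is (coeffs, const),
# with coeffs mapping neuron index -> coefficient. Inputs 1 and 2 have no entry.
constraints = {
    3: (({1: 1.0, 2: 1.0}, 0.0), ({1: 1.0, 2: 1.0}, 0.0)),
    4: (({1: 1.0, 2: -1.0}, 0.0), ({1: 1.0, 2: -1.0}, 0.0)),
    5: (({}, 0.0), ({3: 0.5}, 1.0)),   # assumed ReLU relaxation
    6: (({}, 0.0), ({4: 0.5}, 1.0)),   # assumed ReLU relaxation
    7: (({5: 1.0, 6: 1.0}, 0.0), ({5: 1.0, 6: 1.0}, 0.0)),
}
input_bounds = {1: (-1.0, 1.0), 2: (-1.0, 1.0)}


def substitute(expr, want_upper):
    """Rewrite expr over the inputs only, replacing the latest neuron first."""
    coeffs, const = dict(expr[0]), expr[1]
    while any(var in constraints for var in coeffs):
        var = max(v for v in coeffs if v in constraints)
        c = coeffs.pop(var)
        lower, upper = constraints[var]
        # A negative coefficient flips which constraint is the safe one.
        repl = upper if (c >= 0) == want_upper else lower
        for v, rc in repl[0].items():
            coeffs[v] = coeffs.get(v, 0.0) + c * rc
        const += c * repl[1]
    return coeffs, const


def bound(neuron, want_upper):
    """Concrete upper (or lower) bound of a neuron over the input box."""
    coeffs, const = substitute(({neuron: 1.0}, 0.0), want_upper)
    total = const
    for v, c in coeffs.items():
        lo, hi = input_bounds[v]
        total += c * (hi if (c >= 0) == want_upper else lo)
    return total


y7_upper = bound(7, want_upper=True)
y7_lower = bound(7, want_upper=False)
```

Under these assumptions the upper expression collapses to $y_1 + 2$, because the $y_2$ terms cancel, whereas interval arithmetic on the same network yields $[0, 4]$ for $y_7$.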
Attacks on Deep Learning
Neural Network Verification Framework