Safety verification for deep neural networks with provable guarantees
Prof. Marta Kwiatkowska
ERTS 2020, Toulouse, 29th January 2020
− 1940s: first proposed − 1998: convolutional nets − 2006: deep nets trained − 2011: rectifier units − 2015: vision breakthrough − 2016: win at Go − 2019: Turing Award
− Big data − Flexible, easy to build models − Availability of GPUs − Efficient inference
www.bsfilms.me - Black Sheep Films
Credit: Anita Dufala/Public source
− hybrid combination of continuous and discrete dynamics, with stochasticity − autonomous control
− achieved through learning − parameter estimation − continuous adaptation
− model-based design − automated verification via model checking − correct-by-construction model synthesis from specifications
− randomisation, uncertainty, risk
− safety, security, reliability, performance, resource usage, trust, authentication, …
− (reliability) “the probability of the car crashing in the next hour is less than 0.001” − (energy) “energy usage is below 2000 mA per minute”
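In the simplest case, a reliability property like the one above reduces to a reachability probability in a discrete-time Markov chain. The sketch below is illustrative only: the three-state chain, its transition probabilities and the step bound are assumptions invented for this example, not a model from the talk, and PRISM uses its own modelling language and algorithms.

```python
# Hypothetical 3-state discrete-time Markov chain (all numbers are illustrative).
# State 0: driving, state 1: hazard, state 2: crash (absorbing).
P = [
    [0.99, 0.01, 0.0],
    [0.95, 0.04, 0.01],
    [0.0, 0.0, 1.0],
]

def bounded_reach_probability(P, start, target, steps):
    """Probability of reaching the absorbing `target` state within `steps` transitions."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(steps):
        nxt = [0.0] * len(P)
        for i, p_i in enumerate(dist):
            for j, p_ij in enumerate(P[i]):
                nxt[j] += p_i * p_ij
        dist = nxt
    # Because `target` is absorbing, the mass in it after `steps` transitions
    # equals the probability of having reached it at any earlier point.
    return dist[target]

def satisfies_bound(P, start, target, steps, bound):
    """Check a property of the form 'probability of reaching target is less than bound'."""
    return bounded_reach_probability(P, start, target, steps) < bound
```

For instance, `satisfies_bound(P, 0, 2, 2, 0.001)` asks whether the probability of a crash within two steps stays below 0.001 in this toy chain.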
− probabilistic model checker PRISM, www.prismmodelchecker.org − HVC 2016 Award (joint with Dave Parker and Gethin Norman)
− reads electrical signals through sensors in the right atrium and right ventricle − monitors the timing of heart beats and local electrical activity − generates artificial pacing signal as necessary
60-100 beats per minute − Killed by code: FDA recalls 23 defective pacemaker devices because of adverse
Quantitative verification of implantable cardiac pacemakers over hybrid heart models. Chen et al, Information and Computation 2014
− (basic safety) maintain 60-100 beats per minute − (energy usage) detailed analysis, plotted against timing parameters
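As a rough illustration of the basic safety property, the sketch below checks a single recorded trace of beat timestamps against the 60-100 beats per minute range. It is a trace check under assumed inputs (timestamps in seconds, hypothetical function names), not the model-based quantitative verification over hybrid heart models described in the cited work.

```python
def instantaneous_rates(beat_times_s):
    """Beats per minute derived from consecutive beat timestamps (in seconds)."""
    return [60.0 / (b - a) for a, b in zip(beat_times_s, beat_times_s[1:])]

def within_safe_rate(beat_times_s, low_bpm=60.0, high_bpm=100.0):
    """True if every inter-beat interval corresponds to a rate in [low_bpm, high_bpm]."""
    return all(low_bpm <= r <= high_bpm for r in instantaneous_rates(beat_times_s))
```

A trace with beats every 0.8 s (about 75 beats per minute) passes, while one with beats every 1.5 s (40 beats per minute) fails.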
− rate-adaptive pacemaker, for patients with chronotropic deficiency − (advanced safety) adapt the rate to exercise and stress levels − in silico testing
Closed-Loop Quantitative Verification of Rate-Adaptive Pacemakers. Paoletti et al, ACM Transactions on Cyber-Physical Systems 2018
− minimise energy usage? − maximise cardiac output? − explore trade-offs?
− (optimal timing delay synthesis): find values for timing delays that
adapted to patient’s ECG
Synthesising robust and optimal parameters for cardiac pacemakers using symbolic and evolutionary computation techniques. Kwiatkowska et al, HSB’16
− are they secure?
− ECG used as a biometric identifier − biometric template created first − compared with real ECG signal
− for access into buildings and restricted spaces − for payment − etc
Broken Hearted: How to Attack ECG Biometrics. Eberz et al, In Proc. NDSS, 2017.
− build model from data, 41 volunteers − inject synthetic signals to break authentication − 80% success rate
− serious weakness − countermeasures needed
− ECG − eye movements − mouse movements − touchscreen dynamics − gait − etc
− easy for eye movements − ECG more chaotic
When your fitness tracker betrays you. Eberz et al, In Proc. S&P, 2018.
Lidar image, Credit: Oxford Robotics Institute
− often imperceptible changes to the image [Szegedy et al 2014, Biggio et al 2013 …] − sometimes artificial white noise − practical attacks, potential security risk − transferable between different architectures − not just image classification: also image segmentation, pose recognition, sentiment analysis…
− Nexar Traffic Light Challenge: Red light classified as green with 68%/95%/78% confidence after one pixel change.
Feature-Guided Black-Box Safety Testing of Deep Neural Networks. Wicker et al, In Proc. TACAS, 2018.
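To make the idea of a one-pixel change concrete, the sketch below brute-forces single-pixel perturbations against a classifier treated as a black box. Both `toy_classifier` and `one_pixel_attack` are hypothetical names made up for this example, and the exhaustive search is a simple stand-in, not the feature-guided method of the cited paper.

```python
def toy_classifier(image):
    """Hypothetical stand-in for a trained network:
    compares the total brightness of the top and bottom rows."""
    top = sum(image[0])
    bottom = sum(image[-1])
    return "red" if top > bottom else "green"

def one_pixel_attack(classify, image, values=(0.0, 1.0)):
    """Return (row, col, value) for the first single-pixel change that flips
    the predicted label, or None if no such change exists among `values`."""
    original = classify(image)
    for r, row in enumerate(image):
        for c, old in enumerate(row):
            for v in values:
                if v == old:
                    continue  # not a change
                perturbed = [list(x) for x in image]  # copy so the input is untouched
                perturbed[r][c] = v
                if classify(perturbed) != original:
                    return (r, c, v)
    return None
```

A result of `None` only says that no flip was found among the candidate values tried, which is weaker than a robustness guarantee.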
[Figure: traffic sign images labelled stop, 30m speed limit, 80m speed limit, 30m speed limit, go right, go straight]
Confidence: 0.999964, 0.99
Safety Verification of Deep Neural Networks. Huang et al, In Proc. CAV, 2017.
− same as pointwise robustness… η
− trained network f : D → {c1,…,ck} − diameter for support region η − norm, e.g. L2, L∞
− i.e. ∄y ∈ η such that f(x) ≠ f(y)
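The safety condition above can be probed by searching the region η for a point y with a different classification. The sketch below does this over a finite grid in an L∞ ball; `toy_network`, the grid resolution and the radius are assumptions for illustration. A `None` result covers only the grid points, so it is not a guarantee over the whole region without further assumptions.

```python
from itertools import product

def toy_network(x):
    """Hypothetical classifier f: class 1 if x0 + x1 > 1, else class 0."""
    return 1 if x[0] + x[1] > 1.0 else 0

def find_counterexample(f, x, radius, steps=5):
    """Search a finite grid in the L-infinity ball of the given radius around x
    for a point y with f(y) != f(x). Returns y, or None if the grid has none."""
    label = f(x)
    # Evenly spaced per-dimension offsets from -radius to +radius.
    offsets = [radius * (2 * k / (steps - 1) - 1) for k in range(steps)]
    for delta in product(offsets, repeat=len(x)):
        y = tuple(xi + di for xi, di in zip(x, delta))
        if f(y) != label:
            return y
    return None
```

For example, `find_counterexample(toy_network, (0.45, 0.45), 0.1)` returns a nearby point that is classified differently, while the same call at `(0.2, 0.2)` returns `None`.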
− e.g. scratches, weather conditions, camera angle, etc
TACAS 2018, https://arxiv.org/abs/1710.07859
A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees. Wu et al, CoRR abs/1807.03571, 2018.
− Classified as Car, 85% confidence − Iterative Sample Occlusion only removes 56 points − Random Occlusion removes 1385 − Misclassified: Bathtub, 28% confidence − Misclassified: Airplane, 12% confidence
Robustness of 3D Deep Learning in an Adversarial Setting. Wicker & Kwiatkowska, In Proc. CVPR, 2019.
− account for noise, uncertainty, etc − return an uncertainty measure along with the output
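One common way to return an uncertainty measure along with the output is to sample model parameters from a distribution and report the spread of the resulting predictions. The sketch below assumes a hypothetical one-weight logistic model with a Gaussian distribution over that weight; the function name and all parameters are illustrative, not taken from the talk.

```python
import math
import random
import statistics

def predict_with_uncertainty(x, weight_mean, weight_std, n_samples=1000, seed=0):
    """Monte Carlo estimate of the predictive mean and standard deviation
    for a one-weight logistic model with a Gaussian distribution over the weight."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    outputs = []
    for _ in range(n_samples):
        w = rng.gauss(weight_mean, weight_std)  # sample one candidate weight
        outputs.append(1.0 / (1.0 + math.exp(-w * x)))
    return statistics.mean(outputs), statistics.stdev(outputs)
```

With `weight_std=0.0` the model is deterministic and the reported standard deviation is zero; a larger `weight_std` widens the spread of predictions.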
− often intractable − can we do better?
− compare robustness and accuracy trade-offs for different inference methods
IJCAI 2019, https://arxiv.org/abs/1903.01980
ICRA 2020, https://arxiv.org/abs/1909.09884
− formal verification, safety assurance
− rigorous foundations, methodology
− ethics, fairness and morality
− ERC Advanced Grant − EPSRC Mobile Autonomy Programme Grant
− PRISM www.prismmodelchecker.org