N-version machine learning models for safety critical systems
Fumio Machida, University of Tsukuba
June 24, 2019, in Dependable and Secure Machine Learning 2019

Machine learning (ML) in AV
For safe driving, a red light on the road ahead should be recognized accurately.
[Figure: an autonomous vehicle (AV) feeds diverse sensor inputs to diverse ML models for recognition (e.g., CNN, SVM), which output "Red light!"]

Outline
1. Background
2. N-version machine learning architecture
3. Reliability model
4. Numerical example
5. Conclusion

Quality assurance of ML systems
Quality control is an emerging challenge for ML system providers.
◼ ML systems
  • Information systems increasingly employ ML modules as the core of intelligent functions
    ➢ Prediction, classification, decision making, etc.
◼ Threats to dependability
  • Outputs of ML models are generally uncertain and very sensitive to input data
  • ML models can be fooled easily (e.g., by adversarial examples)

Related studies
◼ Improving the robustness of ML models
  • Adversarial learning [Goodfellow et al. 2014]
  • Safety verification [Huang et al. 2017]
  • Robust optimization method [Mądry et al. 2017]
  • …
◼ White-box testing method for ML systems
  • DeepXplore [Pei et al. 2017]
◼ Falsifying the execution of ML models
  • Falsification framework for CPS [Dreossi et al. 2017]

Our approach: N-version architecture
Different versions of ML models are used in a system to improve the output reliability.
◼ Focus
  • Not on training a robust model
  • But on reliable system processing with multiple ML models whose outputs may be inaccurate
◼ Approach
  • Taking a multi-version system architecture
  • Exploiting the diversity of ML models and input data
    ➢ Even if one ML model fails to recognize a red light, another model can recognize it accurately

Contributions
◼ Our study is the first to formally define two types of diversity (model diversity and input diversity) that should be considered in an N-version ML architecture
◼ We present a reliability model for the N-version architecture based on these diversity metrics
◼ Our numerical results on the reliability model show that combining the two diversities can achieve the best system reliability

N-version ML models
Motivated by N-version programming

                     | N-version programming | N-version ML
Target               | Software program (generated from specification) | ML module (constructed from data)
Mitigation for       | Software faults | Prediction errors
Components to use    | More than two functionally equivalent programs from the same specification | More than two ML models for the same task
Sources of diversity | Development teams, programming languages, libraries and tools, etc. | ML algorithms, hyperparameters and input data

Two-version architecture
Use two independent versions of ML models.
◼ Double model with single input (DMSI): one input x1 is fed to both models m1 and m2
◼ Double model with double input (DMDI): input x1 is fed to m1 and input x2 is fed to m2
◼ The system fails when neither module outputs the expected answer (e.g., red signal)

Three-version architecture
Use three versions with majority voting.
◼ Triple model with single input (TMSI): one input x1 is fed to models m1, m2 and m3
◼ Triple model with triple input (TMTI): inputs x1, x2 and x3 are fed to m1, m2 and m3, respectively
◼ The system fails when two or more modules output errors (by majority voting)
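The voting step of a three-version system can be sketched in Python as below. This is a minimal illustration, not the author's implementation: the function names `majority_vote` and `system_fails` and the string labels are hypothetical, and the sketch assumes that the absence of a strict majority counts as a system failure.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the label that a strict majority of modules agree on, else None."""
    label, count = Counter(outputs).most_common(1)[0]
    return label if count > len(outputs) / 2 else None

def system_fails(outputs, expected):
    """Assumption: the system fails when the voted output is not the expected label."""
    return majority_vote(outputs) != expected

# One wrong module out of three is masked by the other two.
print(system_fails(["red", "red", "green"], expected="red"))
# Two wrong modules defeat the vote.
print(system_fails(["red", "green", "green"], expected="red"))
```

The same function applies to TMSI, TMTI and SMTI, since they differ only in which models and inputs produce the three outputs.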

Single model architecture
Use the same model in parallel with different inputs.
◼ Single model with double input (SMDI): inputs x1 and x2 are each fed to a copy of m1
◼ Single model with triple input (SMTI): inputs x1, x2 and x3 are each fed to a copy of m1
◼ SMDI fails when both outputs are errors
◼ SMTI fails when two or more outputs are errors

Notations
◼ System reliability
  • Probability that the output of the system is correct
  • $R_{i,j}$: reliability of an ML system with $i$ versions and $j$ diverse inputs
◼ Probability of error output
  • $f_k$: probability that the ML model $m_k$ outputs an error
  • $E_k$: the set of input data that leads $m_k$ to output an error
  • $S$: total sample space of inputs in a given context
  • $f_k = |E_k| / |S|$

Definition of diversity
Intersection of errors (model diversity)
Let $E_1$ and $E_2$ be the subsets of the input space $S$ that make models $m_1$ and $m_2$ output errors, respectively. Define the intersection of errors $\alpha_{1,2} \in [0,1]$ as the ratio of the size of the intersection to the smaller of the sizes of $E_1$ and $E_2$:
$$\alpha_{1,2} = \frac{|E_1 \cap E_2|}{\min(|E_1|, |E_2|)}$$
Conjunction of errors (input diversity)
Let $x_1$ and $x_2$ be inputs from the same sample space $S$ to model $m_1$. Define the conjunction of errors $\beta_1 \in [0,1]$ as the probability that $m_1$ outputs an error on $x_2$ given that $m_1$ outputs an error on $x_1$:
$$\beta_1 = \Pr[x_2 \in E_1 \mid x_1 \in E_1]$$
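Both metrics can be estimated from labeled test data. The sketch below is illustrative only: the function names, the integer sample identifiers and the toy error sets are assumptions, and returning 0 when a model has an empty error set is a convention chosen here, not part of the definition.

```python
def intersection_of_errors(e1: set, e2: set) -> float:
    """alpha_{1,2} = |E1 ∩ E2| / min(|E1|, |E2|)."""
    smaller = min(len(e1), len(e2))
    if smaller == 0:
        return 0.0  # assumption: no errors means no shared errors
    return len(e1 & e2) / smaller

def conjunction_of_errors(pairs, e1: set) -> float:
    """Estimate beta_1 = Pr[x2 in E1 | x1 in E1] from paired inputs (x1, x2)."""
    hits = [x2 in e1 for x1, x2 in pairs if x1 in e1]
    if not hits:
        raise ValueError("no pair has x1 in E1, so beta_1 cannot be estimated")
    return sum(hits) / len(hits)

# Toy data: sample identifiers on which each model errs (hypothetical).
E1 = {1, 2, 3, 4}
E2 = {3, 4, 5, 6, 7, 8}
alpha = intersection_of_errors(E1, E2)   # 2 shared errors / min(4, 6)

# Paired inputs (x1, x2) observed together, e.g. from two sensors (hypothetical).
pairs = [(1, 2), (2, 9), (3, 4), (5, 1), (4, 10)]
beta = conjunction_of_errors(pairs, E1)  # 2 of the 4 pairs with x1 in E1

print(alpha, beta)
```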

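Reliabilities of DMSI and SMDI
◼ DMSI (model diversity): input $x_1$ is fed to $m_1$ and $m_2$
  • Failure probability:
$$f_{DMSI}(m_1, m_2) = \frac{|E_1 \cap E_2|}{|S|} = \alpha_{1,2} \cdot \frac{\min(|E_1|, |E_2|)}{|S|} = \alpha_{1,2} \cdot f_1$$
  • Reliability:
$$R_{2,1}(m_1, m_2) = 1 - \alpha_{1,2} \cdot f_1$$
◼ SMDI (input diversity): inputs $x_1$ and $x_2$ are each fed to $m_1$
  • Failure probability:
$$f_{SMDI}(m_1) = \Pr[x_1 \in E_1, x_2 \in E_1] = \Pr[x_2 \in E_1 \mid x_1 \in E_1] \cdot \Pr[x_1 \in E_1] = \beta_1 \cdot f_1$$
  • Reliability:
$$R_{1,2}(m_1) = 1 - \beta_1 \cdot f_1$$
*) We assume $|E_1| \le |E_2|$.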
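The two reliability formulas on this slide translate directly into code. The sketch below is a minimal illustration with hypothetical function names; it uses min(f1, f2) so that the assumption |E1| ≤ |E2| is not required, and the parameter values are illustrative.

```python
def reliability_dmsi(alpha12: float, f1: float, f2: float) -> float:
    """R_{2,1} = 1 - alpha_{1,2} * min(f1, f2): both models err on the same input."""
    return 1.0 - alpha12 * min(f1, f2)

def reliability_smdi(beta1: float, f1: float) -> float:
    """R_{1,2} = 1 - beta_1 * f1: the same model errs on both inputs."""
    return 1.0 - beta1 * f1

# Illustrative values: each model errs on 20% of inputs.
f1 = f2 = 0.2
print(f"R_2,1 = {reliability_dmsi(0.5, f1, f2):.3f}")  # alpha_{1,2} = 0.5
print(f"R_1,2 = {reliability_smdi(0.4, f1):.3f}")      # beta_1 = 0.4
```

A single model with the same error probability would have reliability 1 - f1 = 0.8, so both two-version configurations improve on it whenever the corresponding diversity metric is below 1.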

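Reliability of DMDI
Model diversity and input diversity: input $x_1$ is fed to $m_1$ and input $x_2$ is fed to $m_2$.
Failure probability:
$$f_{DMDI}(m_1, m_2) = \Pr[x_1 \in E_1, x_2 \in E_2] = \Pr[x_2 \in E_2 \mid x_1 \in E_1] \cdot \Pr[x_1 \in E_1]$$
• When $x_2$ has conjunction with $x_1$:
$$\Pr[x_2 \in E_1 \mid x_1 \in E_1] \cdot \Pr[x_2 \in E_2 \mid x_2 \in E_1] = \beta_1 \cdot \alpha_{1,2} \cdot \frac{\min(f_1, f_2)}{f_1}$$
• When $x_2$ has no conjunction with $x_1$:
$$\Pr[x_2 \notin E_1 \mid x_1 \in E_1] \cdot \Pr[x_2 \in E_2 \mid x_2 \notin E_1] = (1 - \beta_1) \cdot \frac{f_2 - \alpha_{1,2} \cdot \min(f_1, f_2)}{1 - f_1}$$
Summing the two cases and assuming $f_1 \le f_2$:
$$\therefore f_{DMDI}(m_1, m_2) = \left[ \beta_1 \cdot \alpha_{1,2} + (1 - \beta_1) \cdot \frac{f_2 - \alpha_{1,2} \cdot f_1}{1 - f_1} \right] \cdot f_1$$
Reliability:
$$R_{2,2}(m_1, m_2) = 1 - \frac{(\beta_1 - f_1) \cdot \alpha_{1,2} + (1 - \beta_1) \cdot f_2}{1 - f_1} \cdot f_1$$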
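As a sanity check on the DMDI derivation, the sketch below computes the failure probability from the two cases (conjunction and no conjunction) and compares it with the closed-form reliability. The function names and parameter values are illustrative assumptions, and the code assumes 0 < f1 < 1 and f1 ≤ f2.

```python
def failure_dmdi_by_cases(alpha12, beta1, f1, f2):
    """Sum of the two cases; assumes 0 < f1 < 1 and f1 <= f2."""
    shared = alpha12 * min(f1, f2)          # Pr[x in E1 and x in E2]
    conj = beta1 * shared / f1              # x2 also falls in E1
    no_conj = (1 - beta1) * (f2 - shared) / (1 - f1)  # x2 falls outside E1
    return (conj + no_conj) * f1

def reliability_dmdi(alpha12, beta1, f1, f2):
    """Closed form: R_{2,2} = 1 - ((beta1 - f1)*alpha12 + (1 - beta1)*f2) / (1 - f1) * f1."""
    return 1 - ((beta1 - f1) * alpha12 + (1 - beta1) * f2) / (1 - f1) * f1

# Illustrative values.
alpha12, beta1, f1, f2 = 0.5, 0.4, 0.2, 0.2
by_cases = 1 - failure_dmdi_by_cases(alpha12, beta1, f1, f2)
closed = reliability_dmdi(alpha12, beta1, f1, f2)
print(f"{by_cases:.4f} {closed:.4f}")
```

With these values the conjunction case contributes 0.2 and the no-conjunction case 0.075, giving a failure probability of 0.275 × 0.2 = 0.055 under either form.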

Reliability impacts of model diversity
◼ Varying $\alpha_{1,2}$ with $f_1 = f_2 = 0.2$ and $\beta_1 = 0.4$
  • $R_{2,1}$ achieves complete reliability when the error sets of the two models do not intersect (i.e., $\alpha_{1,2} = 0$)
  • $R_{2,2}$ generally achieves better reliability
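The sweep over the intersection of errors can be reproduced numerically. The sketch below is an illustration under the closed-form reliabilities of the model, with hypothetical function names and a step of 0.1 chosen here; it does not reproduce the slide's original plot.

```python
def r21(alpha12, f1, f2):
    """DMSI reliability: 1 - alpha_{1,2} * min(f1, f2)."""
    return 1 - alpha12 * min(f1, f2)

def r22(alpha12, beta1, f1, f2):
    """DMDI reliability (closed form, assumes f1 <= f2 and f1 < 1)."""
    return 1 - ((beta1 - f1) * alpha12 + (1 - beta1) * f2) / (1 - f1) * f1

F1 = F2 = 0.2
BETA1 = 0.4

# Sweep alpha_{1,2} from 0.0 to 1.0 in steps of 0.1.
rows = [(a / 10, r21(a / 10, F1, F2), r22(a / 10, BETA1, F1, F2)) for a in range(11)]

print("alpha  R_2,1  R_2,2")
for alpha, dmsi, dmdi in rows:
    print(f"{alpha:.1f}    {dmsi:.3f}  {dmdi:.3f}")
```

Under these formulas, DMSI is perfectly reliable at an intersection of 0 while DMDI is not (about 0.97), the two coincide near 0.2, and DMDI is the more reliable of the two for larger intersections.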

Reliability impacts of input diversity
◼ Varying $\beta_1$ with $f_1 = f_2 = 0.2$ and $\alpha_{1,2} = 0.5$
  • When $\beta_1 = 0.2$ ($= f_1$), there is no conjunction and the two modules output errors independently
  • As $\beta_1$ increases, both $R_{1,2}$ and $R_{2,2}$ decrease
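The two observations about the conjunction of errors can be checked numerically. The sketch below is illustrative, with hypothetical function names and a sweep grid chosen here; it verifies that both reliabilities reduce to the independent-error value 1 - f1·f2 when the conjunction equals f1, and that both fall as the conjunction grows.

```python
def r12(beta1, f1):
    """SMDI reliability: 1 - beta_1 * f1."""
    return 1 - beta1 * f1

def r22(alpha12, beta1, f1, f2):
    """DMDI reliability (closed form, assumes f1 <= f2 and f1 < 1)."""
    return 1 - ((beta1 - f1) * alpha12 + (1 - beta1) * f2) / (1 - f1) * f1

F1 = F2 = 0.2
ALPHA12 = 0.5

# Sweep beta_1 from 0.2 (= f1, no conjunction) to 1.0 in steps of 0.1.
betas = [b / 10 for b in range(2, 11)]
smdi = [r12(b, F1) for b in betas]
dmdi = [r22(ALPHA12, b, F1, F2) for b in betas]

print("beta  R_1,2  R_2,2")
for b, s, d in zip(betas, smdi, dmdi):
    print(f"{b:.1f}   {s:.3f}  {d:.3f}")
```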

Conclusion
◼ For the N-version machine learning architecture, two types of diversity are formally presented
◼ A numerical example on the proposed reliability model shows that both diversities contribute to improving the reliability of the two-version architecture
◼ Future work will address an empirical study to show the reliability improvement achieved by the N-version architecture

Q & A
