Testing Deep Neural Networks
Xiaowei Huang, University of Liverpool
Outline
◮ Safety Problem of AI
◮ Verification (brief)
◮ Testing
◮ Conclusions and Future Works
Human-Level Intelligence
Robotics and Autonomous Systems
all implemented with deep neural networks
Figure: safety in image classification networks
Figure: safety in natural language processing networks
Figure: safety in voice recognition networks
Figure: safety in security systems
Safety Definition: Human Driving vs. Autonomous Driving
Traffic image from “The German Traffic Sign Recognition Benchmark”
Safety Definition: Human Driving vs. Autonomous Driving
Image generated from our tool
Safety Problem: Incidents
Safety Definition: Illustration
Safety Requirements
◮ Pointwise Robustness (this talk)
  ◮ whether the decision of a pair (input, network) is invariant with respect to perturbations of the input
◮ Network Robustness
  ◮ or, more fundamentally, Lipschitz continuity, mutual information, etc.
◮ model interpretability
Certification of DNN
https://github.com/TrustAI
Outline
◮ Safety Problem of AI
◮ Verification (brief)
◮ Testing
◮ Conclusions and Future Works
Safety Definition: Traffic Sign Example
Maximum Safe Radius
Definition
The maximum safe radius problem is to compute the minimum distance from the original input α to an adversarial example, i.e.,

MSR(α) = min_{α′ ∈ D} { ||α − α′||_k : α′ is an adversarial example }    (1)
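Any adversarial example that has been found gives an upper bound on MSR(α). The sketch below illustrates this with a hypothetical toy `model` and a random-perturbation search (illustrative only, not the actual computation of MSR):

```python
import numpy as np

def msr_upper_bound(model, x, norm=2, trials=1000, radius=1.0, seed=0):
    """Estimate an upper bound on MSR(x): the distance to the closest
    adversarial example found by random perturbation.  The true MSR is
    the infimum over all adversarial examples, so any hit bounds it."""
    rng = np.random.default_rng(seed)
    label = model(x)
    best = np.inf
    for _ in range(trials):
        delta = rng.uniform(-radius, radius, size=x.shape)
        if model(x + delta) != label:                 # misclassified
            best = min(best, np.linalg.norm(delta.ravel(), ord=norm))
    return best

# toy "model": a linear classifier over R^2 (hypothetical)
model = lambda x: int(x[0] + x[1] > 1.0)
x0 = np.array([0.2, 0.2])                             # classified as 0
print(msr_upper_bound(model, x0))
```

For this toy model the true MSR under the L2 norm is the distance from x0 to the decision boundary, so the returned value always sits at or above it.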
Existing Approaches
◮ layer-by-layer exhaustive search, see e.g. [2] (Huang, Kwiatkowska, Wang, Wu, CAV 2017)
◮ SMT, MILP, SAT based constraint solving, see e.g. [3] (Katz, Barrett, Dill, Julian, Kochenderfer, CAV 2017)
◮ global optimisation, see e.g. [6] (Ruan, Huang, Kwiatkowska, IJCAI 2018)
◮ abstract interpretation, see e.g. [1] (Gehr, Mirman, Drachsler-Cohen, Tsankov, Chaudhuri, Vechev, S&P 2018)
Outline
◮ Safety Problem of AI
◮ Verification (brief)
◮ Testing
  ◮ Test Coverage Criteria
  ◮ Test Case Generation
◮ Conclusions and Future Works
Deep Neural Networks (DNNs)
[Figure: a fully-connected network with input layer (v_{1,1}, v_{1,2}), two hidden layers (n_{2,1}, n_{2,2}, n_{2,3} and n_{3,1}, n_{3,2}, n_{3,3}), and output layer (u_{4,1}, u_{4,2})]

label = argmax_{1 ≤ l ≤ s_K} u_{K,l}

1) neuron activation value: a weighted sum plus a bias, where the weights w and biases b are parameters learned in training:

u_{k,i} = b_{k,i} + Σ_{1 ≤ h ≤ s_{k−1}} w_{k−1,h,i} · v_{k−1,h}

2) rectified linear unit (ReLU):

v_{k,i} = max{u_{k,i}, 0}
DNN as a program
. . .
// 1) neuron activation value
u[k][i] = b[k][i];
for (unsigned h = 1; h <= s[k-1]; h += 1) {
  u[k][i] += w[k-1][h][i] * v[k-1][h];
}
// 2) ReLU
v[k][i] = 0;
if (u[k][i] > 0) {
  v[k][i] = u[k][i];
}
. . .
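The pseudocode above corresponds to the following runnable sketch of a full forward pass (NumPy; the function name, weight layout, and tiny network are illustrative, not part of any tool):

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass of a fully-connected ReLU network.
    weights[k] has shape (s_k, s_{k+1}); biases[k] has shape (s_{k+1},).
    Returns the pre-activation values u and ReLU outputs v per layer."""
    v, us, vs = x, [], []
    for W, b in zip(weights, biases):
        u = v @ W + b           # 1) weighted sum plus bias
        v = np.maximum(u, 0.0)  # 2) ReLU
        us.append(u)
        vs.append(v)
    return us, vs

# tiny hypothetical network: 2 inputs -> 3 hidden -> 2 outputs
W1 = np.array([[1.0, -1.0, 0.5], [0.0, 2.0, -0.5]])
b1 = np.array([0.0, 0.1, 0.0])
W2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b2 = np.zeros(2)
us, vs = forward(np.array([1.0, 2.0]), [W1, W2], [b1, b2])
label = int(np.argmax(us[-1]))  # label = argmax over the output layer
```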
Testing Framework
◮ Test Coverage Criteria
◮ Test Case Generation
Examples of Test Coverage Criteria
◮ Neuron coverage [5] (Pei, Cao, Yang, Jana, SOSP 2017)
◮ Neuron boundary coverage [4] (Ma, Xu, Zhang, Sun, Xue, Li, Chen, Su, Li, Liu, Zhao, Wang, ASE 2018)
◮ MC/DC for DNNs [8] (Sun, Huang, Kroening, ASE 2018)
◮ Lipschitz continuity
Neuron coverage
For any hidden neuron n_{k,i}, there exists a test case t ∈ T such that the neuron n_{k,i} is activated: u_{k,i} > 0.
Test coverage conditions: {∃x. u[x]_{k,i} > 0 | 2 ≤ k ≤ K − 1, 1 ≤ i ≤ s_k}

◮ ≈ statement (line) coverage:

. . .
// 1) neuron activation value
u[k][i] = b[k][i];
for (unsigned h = 1; h <= s[k-1]; h += 1) {
  u[k][i] += w[k-1][h][i] * v[k-1][h];
}
// 2) ReLU
v[k][i] = 0;
if (u[k][i] > 0) {
  v[k][i] = u[k][i];   // ⇐ this line is covered
}
. . .
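A minimal sketch of measuring neuron coverage, assuming we have recorded the activation value u of each hidden neuron for every test case (the function name and data layout are illustrative):

```python
import numpy as np

def neuron_coverage(activation_values):
    """Neuron coverage of a test suite: the fraction of hidden neurons
    activated (u > 0) by at least one test case.
    activation_values: list over test cases; each entry is a list of
    per-hidden-layer arrays of u values."""
    covered = None
    for layers in activation_values:
        flat = np.concatenate([np.asarray(u).ravel() > 0 for u in layers])
        covered = flat if covered is None else (covered | flat)
    return covered.mean()

# two hypothetical test cases over hidden layers of size 3 and 2
t1 = [np.array([1.0, -2.0, 0.5]), np.array([-1.0, 3.0])]
t2 = [np.array([-0.5, 4.0, -1.0]), np.array([-2.0, 1.0])]
print(neuron_coverage([t1, t2]))  # 4 of 5 neurons activated -> 0.8
```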
Neuron Coverage
Problem of neuron coverage:
◮ too easy to reach 100% coverage
MC/DC in Software Testing
MC/DC was developed by NASA and has been widely adopted, e.g., in avionics software development guidance, to ensure adequate testing of applications with the highest criticality. The idea: if a choice can be made, all the possible factors (conditions) that contribute to that choice (decision) must be tested. In traditional software, both the conditions and the decision are usually Boolean variables or Boolean expressions.
MC/DC Example
Example: the decision

d ⇔ ((a > 3) ∨ (b = 0)) ∧ (c = 4)    (2)

contains the three conditions (a > 3), (b = 0) and (c = 4). The following two test cases provide 100% condition coverage (i.e., every condition takes both truth values):

1. (a > 3)=True, (b = 0)=True, (c = 4)=True, d = True
2. (a > 3)=False, (b = 0)=False, (c = 4)=False, d = False
MC/DC Example
Example: the decision

d ⇔ ((a > 3) ∨ (b = 0)) ∧ (c = 4)    (3)

contains the three conditions (a > 3), (b = 0) and (c = 4). The following six test cases provide 100% MC/DC coverage:

1. (a > 3)=True, (b = 0)=True, (c = 4)=True, d = True
2. (a > 3)=False, (b = 0)=False, (c = 4)=False, d = False
3. (a > 3)=False, (b = 0)=False, (c = 4)=True, d = False
4. (a > 3)=False, (b = 0)=True, (c = 4)=True, d = True
5. (a > 3)=False, (b = 0)=True, (c = 4)=False, d = False
6. (a > 3)=True, (b = 0)=False, (c = 4)=True, d = True
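The claim can be checked mechanically: a condition is MC/DC-covered when some pair of tests differs only in that condition and flips the decision. A small checker (illustrative, not a standard tool):

```python
from itertools import combinations

def decision(a_gt3, b_eq0, c_eq4):
    # d <=> ((a > 3) or (b = 0)) and (c = 4)
    return (a_gt3 or b_eq0) and c_eq4

tests = [  # the six test cases above, as condition truth values
    (True, True, True), (False, False, False), (False, False, True),
    (False, True, True), (False, True, False), (True, False, True),
]

def mcdc_covered(tests):
    """Each condition must have a pair of tests differing only in that
    condition, with the two tests giving different decision outcomes."""
    n = len(tests[0])
    covered = set()
    for t1, t2 in combinations(tests, 2):
        diff = [i for i in range(n) if t1[i] != t2[i]]
        if len(diff) == 1 and decision(*t1) != decision(*t2):
            covered.add(diff[0])
    return covered == set(range(n))

print(mcdc_covered(tests))  # True
```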
MC/DC for DNNs – General Idea
The core idea of our criteria is to ensure that not only the presence of a feature is tested, but also the effects of less complex features on a more complex feature.

[Figure: a network with inputs v_{1,1}, v_{1,2}, hidden neurons n_{2,1}, n_{2,2}, n_{2,3} and n_{3,1}, n_{3,2}, n_{3,3}, and outputs v_{4,1}, v_{4,2}]
For example, check the impact of n2,1, n2,2, n2,3 on n3,1.
MC/DC for DNNs – Neuron Pair and Sign Change
A neuron pair (n_{k,i}, n_{k+1,j}) consists of two neurons in adjacent layers k and k + 1 such that 1 ≤ k ≤ K − 1, 1 ≤ i ≤ s_k, and 1 ≤ j ≤ s_{k+1}.
(Sign Change of a neuron) Given a neuron n_{k,l} and two test cases x1 and x2, we say that the sign change of n_{k,l} is exploited by x1 and x2, denoted sc(n_{k,l}, x1, x2), if sign(v_{k,l}[x1]) ≠ sign(v_{k,l}[x2]).
MC/DC for DNNs – Value Change and Distance Change
(Value Change of a neuron) Given a neuron n_{k,l} and two test cases x1 and x2, we say that the value change of n_{k,l} is exploited with respect to a value function g by x1 and x2, denoted vc(g, n_{k,l}, x1, x2), if g(u_{k,l}[x1], u_{k,l}[x2]) = True.
MC/DC for DNNs – Sign-Sign Cover, or SS Cover
A neuron pair α = (n_{k,i}, n_{k+1,j}) is SS-covered by two test cases x1, x2, denoted covSS(α, x1, x2), if the following conditions are satisfied by the network instances N[x1] and N[x2]:

◮ sc(n_{k,i}, x1, x2);
◮ ¬sc(n_{k,l}, x1, x2) for all n_{k,l} ∈ P_k \ {i};
◮ sc(n_{k+1,j}, x1, x2).
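A direct transcription of the SS-cover conditions into code, assuming we already have the neuron values for the two test cases (all names and values are illustrative):

```python
def sign(x):
    return x > 0  # ReLU sign: activated or not

def sc(v1, v2):
    """Sign change of a neuron between two test cases."""
    return sign(v1) != sign(v2)

def ss_covered(k_vals1, k_vals2, i, j_val1, j_val2):
    """SS-cover for the pair (n_{k,i}, n_{k+1,j}): neuron i in layer k
    changes sign, every other neuron in layer k keeps its sign, and
    neuron j in layer k+1 changes sign."""
    others_stable = all(
        not sc(a, b)
        for h, (a, b) in enumerate(zip(k_vals1, k_vals2)) if h != i
    )
    return sc(k_vals1[i], k_vals2[i]) and others_stable and sc(j_val1, j_val2)

# hypothetical layer-k values under two test cases x1, x2
layer_k_x1 = [0.5, 1.0, 2.0]
layer_k_x2 = [-0.2, 1.1, 1.9]  # only neuron 0 flips sign
print(ss_covered(layer_k_x1, layer_k_x2, 0, 0.3, -0.1))  # True
```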
MC/DC for DNNs – Other Covering Methods
◮ Value-Sign Cover, or VS Cover
◮ Sign-Value Cover, or SV Cover
◮ Value-Value Cover, or VV Cover
Relation
M^N denotes the neuron coverage metric; arrows represent the “weaker than” relation between metrics.
Activation Pattern [8]

◮ Given a concrete input x, N[x] corresponds to a linear model C
◮ C represents the set of inputs following the same activation pattern
◮ one DNN activation pattern corresponds to a program execution path
◮ traversal of all activation patterns ⇒ formal verification
◮ but there are too many patterns: e.g., 2^10,000 ...

[8] Sun, Huang, Kroening. “Testing Deep Neural Networks.” (2018).
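A sketch of extracting an activation pattern, assuming a fully-connected ReLU network given as weight/bias lists (illustrative names):

```python
import numpy as np

def activation_pattern(x, weights, biases):
    """Activation pattern of input x: the tuple of ReLU on/off decisions,
    one boolean per neuron.  Inputs sharing a pattern fall in the same
    linear region of the network, i.e. follow the same 'execution path'."""
    v, pattern = x, []
    for W, b in zip(weights, biases):
        u = v @ W + b
        pattern.append(tuple(u > 0))
        v = np.maximum(u, 0.0)
    return tuple(pattern)

# a net with only 10 ReLUs already has up to 2**10 = 1024 patterns;
# with >10,000 ReLUs, exhaustive traversal of patterns is infeasible
W = np.array([[1.0, -1.0], [0.5, 0.5]])
b = np.zeros(2)
p1 = activation_pattern(np.array([1.0, 0.0]), [W], [b])
p2 = activation_pattern(np.array([2.0, 0.0]), [W], [b])
print(p1 == p2)  # same linear region -> True
```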
Safety Coverage [10]
Definition
Let each hyper-rectangle rec contain those inputs with the same ReLU activation pattern, i.e., for all x1, x2 ∈ rec we have sign(n_{k,l}, x1) = sign(n_{k,l}, x2) for all n_{k,l} ∈ H(N). A hyper-rectangle rec is safe-covered by a test case x, denoted covS(rec, x), if x ∈ rec.
(Wicker, Huang, Kwiatkowska, TACAS 2018)
Relation
M^S denotes the safety coverage metric
Safety Coverage
Problem of safety coverage:
◮ exponential number of hyper-rectangles to be covered
Therefore, our MC/DC based criteria strike a balance between intensive testing and computational feasibility (justified by the experimental results).
Relation with a few other criteria from [4]
◮ M^MN: multi-section neuron coverage
◮ M^NB: neuron boundary coverage
◮ M^TN: top-k neuron coverage
What can we do?
◮ bug finding
◮ DNN safety statistics
◮ testing efficiency
◮ DNN internal structure analysis
Test Case Generation
◮ optimisation based (symbolic) approach
◮ concolic testing
◮ Monte Carlo tree search based input mutation testing
◮ ...
Optimisation based symbolic approach
Formalise the search for the next test case as an optimisation problem, which can then be solved by, e.g.,
◮ Linear Programming (LP) based, see e.g. [8] (Sun, Huang, Kroening. Testing Deep Neural Networks. https://arxiv.org/abs/1803.04792)
◮ Global Optimisation (GO) based, see e.g. [7] (Sun, Wu, Ruan, Huang, Kwiatkowska, Kroening. Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for L0 Norm. http://cn.arxiv.org/abs/1805.00089)
Concolic approach [9] (Sun, Wu, Ruan, Huang, Kwiatkowska, Kroening, ASE 2018)

Concolic testing: concrete execution + symbolic analysis

[Figure: the concolic loop: a seed input {t0} and test coverage conditions R; a heuristic δ ranks R; symbolic analysis on the top-ranked pair (t, r) produces a new input t′; an oracle flags adversarial examples]
Concrete execution (neuron coverage)

◮ The pair (t, r) is chosen by concrete executions such that, although the specified neuron is not activated by t, it is very close to being activated. Intuitively, we look for the neuron that is closest to being activated.
◮ E.g., u_{k,i} = −1.0 is ranked higher than u_{k,j} = −100.0

. . .
// 1) neuron activation value
u[k][i] = b[k][i];
for (unsigned h = 1; h <= s[k-1]; h += 1) {
  u[k][i] += w[k-1][h][i] * v[k-1][h];
}
// 2) ReLU
v[k][i] = 0;
if (u[k][i] > 0) {   // ⇐ not satisfied
  v[k][i] = u[k][i];
}
. . .

◮ i.e., select the branching point that is most likely to be satisfied
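The ranking heuristic can be sketched as follows, assuming we track, per neuron, the best (largest) activation value u seen over the concrete executions so far (the data layout is illustrative):

```python
def rank_uncovered_neurons(activations):
    """Rank neurons not yet activated by how close they are to being
    activated: the larger (less negative) the best value u seen so far,
    the easier it should be to find an input that activates the neuron.
    activations: dict mapping neuron id -> best u value seen so far."""
    uncovered = {n: u for n, u in activations.items() if u <= 0}
    return sorted(uncovered, key=lambda n: activations[n], reverse=True)

# hypothetical best-seen activation values per neuron
best_u = {"n2,1": 3.0, "n2,2": -1.0, "n2,3": -100.0, "n3,1": -0.1}
print(rank_uncovered_neurons(best_u))  # ['n3,1', 'n2,2', 'n2,3']
```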
Symbolic execution (neuron coverage)

Given (t, r), find a new input t′ such that r is satisfied:

{u′_{k,i} > 0 ∧ ∀k1 < k: ⋀_{0 ≤ i1 ≤ s_{k1}} ap′_{k1,i1} = ap[t]_{k1,i1}} ∧ min ||t′ − t||_p

⇒ the symbolic engine
◮ The CPLEX Linear Programming (LP) solver (Sun, Huang, Kroening. Testing Deep Neural Networks. https://arxiv.org/abs/1803.04792)
  ◮ L∞-norm: the maximum difference among all pixels
◮ The global optimisation method (Sun, Wu, Ruan, Huang, Kwiatkowska, Kroening. Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for L0 Norm. http://cn.arxiv.org/abs/1805.00089)
  ◮ L0-norm: the number of pixels that have been changed
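A toy version of the LP encoding, for a single neuron and the L∞-norm, using `scipy.optimize.linprog` in place of CPLEX (the network, input, and margin are hypothetical; the real encoding also fixes the activation pattern of all earlier layers):

```python
import numpy as np
from scipy.optimize import linprog

# Assumed setup: a single neuron u = w.x + b is not activated by input t;
# find the closest input t' in the L-infinity norm that activates it.
# LP variables: t' (n entries) followed by the distance eps.
w = np.array([1.0, 2.0])
b = -5.0
t = np.array([1.0, 1.0])      # u = 1 + 2 - 5 = -2, i.e. not activated
n = len(t)
delta = 1e-3                  # activation margin: require u' >= delta

c = np.zeros(n + 1)
c[-1] = 1.0                   # objective: minimise eps

A, rhs = [], []
for j in range(n):            # |t'_j - t_j| <= eps, as two linear rows
    row = np.zeros(n + 1); row[j] = 1.0; row[-1] = -1.0
    A.append(row); rhs.append(t[j])
    row = np.zeros(n + 1); row[j] = -1.0; row[-1] = -1.0
    A.append(row); rhs.append(-t[j])
row = np.zeros(n + 1)         # w.t' + b >= delta, as -w.t' <= b - delta
row[:n] = -w
A.append(row); rhs.append(b - delta)

res = linprog(c, A_ub=np.array(A), b_ub=np.array(rhs),
              bounds=[(None, None)] * n + [(0, None)])
t_new, eps = res.x[:n], res.x[-1]
print(eps, w @ t_new + b)     # minimal distance and the new activation value
```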
Comparison with DeepXplore
           DeepConcolic          DeepXplore
           L∞-norm   L0-norm    light    occlusion   blackout
MNIST      97.89%    97.24%     80.5%    82.5%       81.6%
CIFAR-10   89.59%    99.69%     77.9%    86.8%       89.5%
Monte Carlo tree search based test case generation [10] (Wicker, Huang, Kwiatkowska, TACAS 2018)
Pixel Manipulation
Define pixel manipulations δ_{X,i} : D → D, for X ⊆ P0 a subset of input dimensions and i ∈ I:

δ_{X,i}(α)(x, y, z) =
  α(x, y, z) + τ,  if (x, y) ∈ X and i = +
  α(x, y, z) − τ,  if (x, y) ∈ X and i = −
  α(x, y, z),      otherwise
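A direct implementation sketch of δ_{X,i} on a NumPy image (the function name is illustrative):

```python
import numpy as np

def delta(alpha, X, i, tau=0.1):
    """Pixel manipulation delta_{X,i}: add (i == '+') or subtract
    (i == '-') tau on the pixels in X, across all channels z; leave
    every other pixel unchanged."""
    out = alpha.copy()
    for (x, y) in X:
        out[x, y, :] += tau if i == "+" else -tau
    return out

img = np.zeros((2, 2, 3))
out = delta(img, {(0, 1)}, "+", tau=0.5)
print(out[0, 1, 0], out[0, 0, 0])  # 0.5 0.0
```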
Safety Testing as Two-Player Turn-based Game
Rewards under Strategy Profile σ = (σ1, σ2)
◮ For terminal nodes ρ ∈ PathF_I:

  R(σ, ρ) = 1 / sev_α(α′_ρ)

  where sev_α(α′) is the severity of an image α′ compared to the original image α
◮ For non-terminal nodes, the reward is computed by applying the suitable strategy σ_i to the rewards of the children nodes
Players’ Objectives
The goal of the game is for player I to choose a strategy σI that maximises the reward R((σI, σII), s0) of the initial state s0, based on the strategy σII of player II, i.e.,

arg max_{σI} opt_{σII} R((σI, σII), s0)    (4)

where the operator opt_{σII} can be max_{σII}, min_{σII}, or nat_{σII}, according to whether player II acts as a cooperator, an adversary, or nature, who samples the distribution G(Λ(α)) for pixels and randomly chooses the manipulation instruction.
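The reward backup for the two players can be sketched as a recursion over a game tree, with player II's operator passed in as max (cooperator) or min (adversary); nature's sampling case is omitted, and the tree and values are hypothetical:

```python
def reward(node, opt_II, depth=0):
    """Back up rewards in the two-player turn-based game: player I
    maximises, player II applies opt_II; players alternate by depth.
    A node is either a terminal reward (float, i.e. 1/severity) or a
    list of children."""
    if isinstance(node, float):
        return node
    vals = [reward(c, opt_II, depth + 1) for c in node]
    return max(vals) if depth % 2 == 0 else opt_II(vals)

# tiny hypothetical tree: player I picks a branch, player II replies
tree = [[0.2, 0.8], [0.5, 0.6]]
print(reward(tree, min))  # adversary: max(min(0.2,0.8), min(0.5,0.6)) = 0.5
print(reward(tree, max))  # cooperator: max(max(0.2,0.8), max(0.5,0.6)) = 0.8
```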
Outline
◮ Safety Problem of AI
◮ Verification (brief)
◮ Testing
◮ Conclusions and Future Works
Conclusions and Future Works
◮ Conclusions
  ◮ Testing DNNs is a one-year-old baby.
  ◮ It has attracted attention from both academia and industry.
  ◮ Both the criteria and the test case generation need further validation.
◮ Future Works
  ◮ safety problems other than robustness
  ◮ DNN-specific criteria, to complement the existing ones which borrow ideas from traditional software engineering
  ◮ more light-weight test case generation algorithms
  ◮ ...
References

[1] T. Gehr, M. Mirman, D. Drachsler-Cohen, P. Tsankov, S. Chaudhuri, and M. Vechev. AI2: Safety and robustness certification of neural networks with abstract interpretation. In 2018 IEEE Symposium on Security and Privacy (SP), pages 948–963.
[2] Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. Safety verification of deep neural networks. In CAV 2017, pages 3–29, 2017.
[3] Guy Katz, Clark Barrett, David Dill, Kyle Julian, and Mykel Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In CAV 2017, 2017.
[4] L. Ma, F. Juefei-Xu, F. Zhang, J. Sun, M. Xue, B. Li, C. Chen, T. Su, L. Li, Y. Liu, J. Zhao, and Y. Wang. DeepGauge: Multi-granularity testing criteria for deep learning systems. ArXiv e-prints, March 2018.
[5] Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. DeepXplore: Automated whitebox testing of deep learning systems. CoRR, abs/1705.06640, 2017.
[6] Wenjie Ruan, Xiaowei Huang, and Marta Kwiatkowska. Reachability analysis of deep neural networks with provable guarantees. In IJCAI 2018, 2018.
[7] Wenjie Ruan, Min Wu, Youcheng Sun, Xiaowei Huang, Daniel Kroening, and Marta Kwiatkowska. Global robustness evaluation of deep neural networks with provable guarantees for L0 norm. CoRR, abs/1804.05805, 2018.
[8] Youcheng Sun, Xiaowei Huang, and Daniel Kroening. Testing deep neural networks. https://arxiv.org/abs/1803.04792, 2018.
[9] Youcheng Sun, Min Wu, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska, and Daniel Kroening. Concolic testing for deep neural networks. CoRR, abs/1805.00089, 2018.