

  1. Approximate Verification of Deep Neural Networks with Provable Guarantees Xiaowei Huang, University of Liverpool

  2. Outline Background and Challenges Safety Definition and Layer-by-Layer Refinement Game-based Approach for a Single Layer Verification Experimental Results

  3. Human-Level Intelligence

  4. Robotics and Autonomous Systems

  5. Deep neural networks: the preceding applications are all implemented with deep neural networks

  6. Major problems and critiques
  ◮ unsafe, e.g., lack of robustness (this talk)
  ◮ hard to explain to human users
  ◮ ethics, trustworthiness, accountability, etc.

  7. Figure: safety in image classification networks

  8. Figure: safety in natural language processing networks

  9. Figure: safety in voice recognition networks

  10. Figure: safety in security systems

  11. Outline Background and Challenges Safety Definition and Layer-by-Layer Refinement Safety Definition Challenges Approaches Game-based Approach for a Single Layer Verification Experimental Results

  12. Certification of DNN

  13. Safety Requirements
  ◮ Pointwise robustness (this talk): whether the decision of a pair (input, network) is invariant with respect to perturbations of the input
  ◮ Network robustness
  ◮ or, more fundamentally, Lipschitz continuity, mutual information, etc.
  ◮ model interpretability

  14. Safety Definition: Human Driving vs. Autonomous Driving Traffic image from “The German Traffic Sign Recognition Benchmark”

  15. Safety Definition: Human Driving vs. Autonomous Driving Image generated from our tool

  16. Safety Problem: Incidents

  17. Safety Definition: Illustration

  18. Safety Definition: Deep Neural Networks
  ◮ Rⁿ is a vector space of inputs (points)
  ◮ f : Rⁿ → C, where C is a (finite) set of class labels, models the human perception capability
  ◮ a neural network classifier is a function f̂(x) which approximates f(x)

  19. Safety Definition: Deep Neural Networks
  A (feed-forward) neural network N is a tuple (L, T, Φ), where
  ◮ L = {L_k | k ∈ {0, ..., n}}: a set of layers,
  ◮ T ⊆ L × L: a set of sequential connections between layers,
  ◮ Φ = {φ_k | k ∈ {1, ..., n}}: a set of activation functions φ_k : D_{L_{k−1}} → D_{L_k}, one for each non-input layer.
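The layer structure above can be sketched in code. The following is a minimal, illustrative Python rendering (not the authors' implementation): each activation function φ_k maps the previous layer's activation vector to the next, and the network is their composition. The toy weights are arbitrary assumptions.

```python
# Illustrative sketch of a feed-forward network as a tuple of activation
# functions phi_k, each mapping D_{L_{k-1}} to D_{L_k}.

def relu(v):
    return [max(0.0, x) for x in v]

def dense(weights, bias):
    """Return phi_k for a toy fully-connected layer with ReLU."""
    def phi(v):
        return relu([sum(w * x for w, x in zip(row, v)) + b
                     for row, b in zip(weights, bias)])
    return phi

def forward(phis, v):
    """Compose the activation functions layer by layer."""
    for phi in phis:
        v = phi(v)
    return v

# A toy 2-2-1 network (weights chosen arbitrarily for illustration).
phis = [dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
        dense([[1.0, 1.0]], [0.0])]
print(forward(phis, [2.0, 1.0]))  # [2.5]
```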

  20. Safety Definition: Traffic Sign Example

  21. Maximum Safe Radius
  Definition. The maximum safe radius problem is to compute the minimum distance from the original input α to an adversarial example:
  MSR(α) = min { ||α − α′||_k | α′ ∈ D, α′ is an adversarial example }   (1)
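As a hedged illustration of the MSR definition, the sketch below computes the minimum distance from α to a differently-classified input over a finite candidate pool. The classifier and candidate pool are toy stand-ins, not the paper's networks.

```python
# Sketch: MSR over a finite pool is the minimum distance from alpha to any
# candidate the classifier labels differently; infinity if none exists.

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def msr_estimate(classify, alpha, candidates):
    base = classify(alpha)
    dists = [l2(alpha, a2) for a2 in candidates if classify(a2) != base]
    return min(dists) if dists else float("inf")

# Toy classifier: label depends on the sign of the first coordinate.
classify = lambda v: int(v[0] > 0)
alpha = [1.0, 0.0]
pool = [[0.5, 0.0], [-0.5, 0.0], [-2.0, 0.0]]
print(msr_estimate(classify, alpha, pool))  # 1.5 (distance to [-0.5, 0.0])
```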

  22. Challenges
  Challenge 1: continuous space — there are infinitely many points to be tested
  Challenge 2: the spaces are high-dimensional
  Challenge 3: the functions f and f̂ are highly non-linear, i.e., safety risks may exist in pockets of the spaces
  Challenge 4: we need not only heuristic search but also verification

  23. Approach 1: Single Layer – Discretisation
  Define manipulations δ_k : D_{L_k} → D_{L_k} over the activations in the vector space of layer k.
  Figure: example of a set {δ₁, δ₂, δ₃, δ₄} of valid manipulations in a 2-dimensional space
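The set of valid manipulations in the figure can be sketched as axis-aligned steps of magnitude τ. This is a minimal Python illustration under assumed semantics (the actual manipulations need not be axis-aligned):

```python
# Sketch: four axis-aligned manipulations {delta_1, ..., delta_4} of
# magnitude tau in a 2-dimensional activation space.

def make_deltas(tau):
    steps = [(tau, 0.0), (-tau, 0.0), (0.0, tau), (0.0, -tau)]
    # Bind each step via a default argument so every lambda keeps its own.
    return [lambda v, s=s: [v[0] + s[0], v[1] + s[1]] for s in steps]

deltas = make_deltas(0.5)
point = [1.0, 1.0]
print([d(point) for d in deltas])
```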

  24. Exploring a Finite Number of Points
  Figure: a sequence of activations α_{x,k} = α_{x₀,k}, α_{x₁,k}, ..., α_{x_{j+1},k}, each reached by applying a manipulation δ_k, explored within the region η_k(α_{x,k})

  25. Finite Approximation
  Definition. Let τ ∈ (0, 1] be a manipulation magnitude. The finite maximum safe radius problem FMSR(τ, α) is defined over the manipulation magnitude τ (details to be given later).
  Lemma. For any τ ∈ (0, 1], we have MSR(α) ≤ FMSR(τ, α).

  26. Approach 2: Single Layer – Exhaustive Search
  Figure: exhaustive search (verification) vs. heuristic search over the manipulated activations within η_k(α_{x,k})
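The exhaustive-search idea can be sketched as a breadth-first sweep of the τ-grid inside a bounded region, recording the closest misclassified grid point. Everything below — the decision boundary, the L∞ region — is a toy assumption used only to make the sweep concrete.

```python
# Sketch: exhaustive BFS over a 2-D tau-grid within an L_inf ball of the
# given radius; returns the distance to the closest misclassified point.
from collections import deque

def exhaustive_fmsr(classify, alpha, tau, radius):
    base = classify(alpha)
    seen = {tuple(alpha)}
    queue = deque([alpha])
    best = float("inf")
    steps = [(tau, 0.0), (-tau, 0.0), (0.0, tau), (0.0, -tau)]
    while queue:
        v = queue.popleft()
        for dx, dy in steps:
            w = [round(v[0] + dx, 9), round(v[1] + dy, 9)]
            d = max(abs(w[0] - alpha[0]), abs(w[1] - alpha[1]))  # L_inf
            if tuple(w) in seen or d > radius:
                continue
            seen.add(tuple(w))
            if classify(w) != base:
                best = min(best, d)   # misclassified: record, don't expand
            else:
                queue.append(w)
    return best

classify = lambda v: int(v[0] > 1.4)   # toy decision boundary
print(exhaustive_fmsr(classify, [1.0, 1.0], 0.5, 2.0))  # 0.5
```

Unlike a heuristic search, this visits every grid point in the region, which is what makes the result a verified bound rather than a best effort.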

  27. Approach 3: Single Layer – Anytime Algorithms

  28. Approach 4: Layer-by-Layer Refinement
  We will explain how to determine τ₀* later.

  29. Approach 2: Layer-by-Layer Refinement

  30. Approach 2: Layer-by-Layer Refinement

  31. Outline Background and Challenges Safety Definition and Layer-by-Layer Refinement Game-based Approach for a Single Layer Verification Experimental Results

  32. Preliminaries: Lipschitz Network
  Definition. Network N is a Lipschitz network with respect to distance function L_k if for every class c ∈ C there exists a constant ℓ_c > 0 such that, for all α, α′ ∈ D, we have
  |N(α′, c) − N(α, c)| ≤ ℓ_c · ||α′ − α||_k.   (2)
  Most known types of layers, including fully-connected, convolutional, ReLU, max-pooling, sigmoid, softmax, etc., are Lipschitz continuous [4].
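A minimal sketch of the Lipschitz condition, using a toy linear class scorer (an assumption, not one of the paper's networks): finite differences over sample pairs give a lower bound on the constant ℓ_c.

```python
# Sketch: check |N(a',c) - N(a,c)| <= l_c * ||a' - a|| empirically and
# estimate l_c from finite differences over sampled input pairs.

def score(v, c):
    # Toy class scorer: a fixed linear function per class (assumption).
    weights = {0: [1.0, -2.0], 1: [0.5, 0.5]}
    return sum(w * x for w, x in zip(weights[c], v))

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def empirical_lipschitz(c, pairs):
    # Finite-difference lower bound on the Lipschitz constant for class c.
    return max(abs(score(a2, c) - score(a1, c)) / l2(a1, a2)
               for a1, a2 in pairs)

pairs = [([0.0, 0.0], [1.0, 0.0]), ([0.0, 0.0], [0.0, 1.0])]
print(empirical_lipschitz(0, pairs))  # 2.0
```

For a linear scorer this lower bound is tight once the sample pairs span the coordinate directions; for a real network it only bounds ℓ_c from below.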

  33. Preliminaries: Feature-Based Partitioning
  Partition the input dimensions with respect to a set of features. In the simplest case the features can form a uniform partition, i.e., they need not follow any particular method. This is useful for the reduction to a two-player game, in which Player One chooses a feature and Player Two chooses how to manipulate the selected feature.
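In its simplest form, the uniform partition might look like this — a sketch under the assumption that features are equal-sized blocks of consecutive input dimensions:

```python
# Sketch: split the input dimensions P_0 into equal-sized blocks, one
# block per "feature"; Player One would pick a block, Player Two a
# manipulation within it.

def uniform_partition(num_dims, block):
    dims = list(range(num_dims))
    return [dims[i:i + block] for i in range(0, num_dims, block)]

print(uniform_partition(6, 2))  # [[0, 1], [2, 3], [4, 5]]
```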

  34. Preliminaries: Input Manipulation
  Let τ > 0 be a positive real number representing the manipulation magnitude. For X ⊆ P₀, a subset of input dimensions, and i : P₀ → N, an instruction function, define input manipulation operations δ_{τ,X,i} : D → D by
  δ_{τ,X,i}(α)(j) = α(j) + i(j) · τ, if j ∈ X;  α(j), otherwise
  for all j ∈ P₀.
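The manipulation δ_{τ,X,i} translates almost directly into code; the concrete encodings of X (a set of indices) and i (a dict) below are assumptions made for the sketch.

```python
# Sketch of delta_{tau,X,i}: dimensions in X move by i(j) * tau, all
# other dimensions are unchanged.

def manipulate(alpha, tau, X, i):
    """alpha is a list of dimension values, X a set of indices,
    i a dict from indices to integer instructions."""
    return [alpha[j] + i[j] * tau if j in X else alpha[j]
            for j in range(len(alpha))]

alpha = [0.2, 0.4, 0.6]
print(manipulate(alpha, 0.1, {0, 2}, {0: 1, 2: -2}))
```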

  35. Approximation Based on Finite Optimisation
  Definition. Let τ ∈ (0, 1] be a manipulation magnitude. The finite maximum safe radius problem FMSR(τ, α) based on input manipulation is:
  FMSR(τ, α) = min_{Λ′ ⊆ Λ(α)} min_{X ⊆ ⋃_{λ ∈ Λ′} P_λ} min_{i ∈ I} { ||α − δ_{τ,X,i}(α)||_k | δ_{τ,X,i}(α) is an adversarial example }   (3)
  Lemma. For any τ ∈ (0, 1], we have MSR(α) ≤ FMSR(τ, α).
  We need to determine the condition that τ must satisfy so that FMSR(τ, α) = MSR(α).

  36. Grid Space
  Definition. An image α′ ∈ η(α, L_k, d) is a τ-grid input if for all dimensions p ∈ P₀ we have |α′(p) − α(p)| = n · τ for some integer n ≥ 0. Let G(α, k, d) be the set of τ-grid inputs in η(α, L_k, d).
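Enumerating τ-grid inputs can be sketched as follows, under the added assumption that we bound the number of grid steps per dimension to keep the set finite:

```python
# Sketch: every dimension of alpha' differs from alpha by an integer
# multiple of tau, bounded here by max_steps per dimension.
from itertools import product

def tau_grid(alpha, tau, max_steps):
    offsets = [n * tau for n in range(-max_steps, max_steps + 1)]
    return [[a + o for a, o in zip(alpha, combo)]
            for combo in product(offsets, repeat=len(alpha))]

grid = tau_grid([0.0, 0.0], 0.5, 1)
print(len(grid))  # 9 grid points: 3 offsets in each of 2 dimensions
```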

  37. Misclassification Aggregator
  Definition. An input α₁ ∈ η(α, L_k, d) is a misclassification aggregator with respect to a number β > 0 if, for any α₂ ∈ η(α₁, L_k, β), we have that N(α₂) ≠ N(α) implies N(α₁) ≠ N(α).
  Lemma. If all τ-grid inputs are misclassification aggregators with respect to ½ d(k, τ), then MSR(k, d, α, c) ≥ FMSR(τ, k, d, α, c) − ½ d(k, τ).

  38. Conditions for Achieving Misclassification Aggregation
  Given a class label c, let
  g(α′, c) = min_{c′ ∈ C, c′ ≠ c} { N(α′, c) − N(α′, c′) }   (4)
  be a function maintaining, for an input α′, the minimum confidence margin between class c and any other class c′ ≠ c.
  Lemma. Let N be a Lipschitz network with a Lipschitz constant ℓ_c for every class c ∈ C. If
  d(k, τ) ≤ 2 g(α′, N(α′)) / max_{c ∈ C, c ≠ N(α′)} (ℓ_{N(α′)} + ℓ_c)   (5)
  for all τ-grid inputs α′ ∈ G(α, k, d), then all τ-grid inputs are misclassification aggregators with respect to ½ d(k, τ).

  39. Main Theorem
  Theorem. Let N be a Lipschitz network with a Lipschitz constant ℓ_c for every class c ∈ C. If
  d(k, τ) ≤ 2 g(α′, N(α′)) / max_{c′ ∈ C, c′ ≠ N(α′)} (ℓ_{N(α′)} + ℓ_{c′})
  for all τ-grid inputs α′ ∈ G(α, k, d), then we can use FMSR(τ, k, d, α, c) to estimate MSR(k, d, α, c) with an error bound of ½ d(k, τ).
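The margin function g and the grid-fineness condition of the theorem can be sketched as follows; the class scores and Lipschitz constants below are toy assumptions chosen for illustration.

```python
# Sketch: g(alpha', c) is the minimum gap between the score of class c and
# every other class; the grid is fine enough when
# d(k, tau) <= 2 * g / max over c' != c of (l_c + l_c').

def margin(scores, c):
    """g(alpha', c) = min over c' != c of scores[c] - scores[c']."""
    return min(scores[c] - s for c2, s in scores.items() if c2 != c)

def grid_fine_enough(scores, c, lip, d_k_tau):
    worst = max(lip[c] + lip[c2] for c2 in scores if c2 != c)
    return d_k_tau <= 2 * margin(scores, c) / worst

scores = {0: 0.75, 1: 0.25, 2: 0.0}   # class 0 wins with margin 0.5
lip = {0: 1.0, 1: 1.0, 2: 3.0}        # assumed Lipschitz constants
print(margin(scores, 0))              # 0.5
print(grid_fine_enough(scores, 0, lip, 0.2))  # True: 0.2 <= 0.25
```

Intuitively, a larger confidence margin or smaller Lipschitz constants tolerate a coarser grid, which is what makes the finite optimisation tractable.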

  40. Two-Player Game
  Figure: game tree with alternating Player-I and Player-II moves. MCTS: random simulation. Admissible A* / alpha-beta pruning: more tree expansion.
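The alternation between the two players can be sketched, in much-simplified form, as a two-level optimisation over a toy reward table. The real approach computes upper bounds with MCTS and lower bounds with admissible A* or alpha-beta pruning; this plain enumeration only illustrates the turn structure.

```python
# Sketch: Player I picks a feature, Player II picks a manipulation within
# it; here both cooperate to minimise the distance reward (as for MSR).

def game_value(rewards):
    # rewards: feature -> (manipulation -> distance reward).
    return min(min(moves.values()) for moves in rewards.values())

rewards = {"feature_a": {"m1": 0.8, "m2": 0.3},
           "feature_b": {"m1": 0.6, "m2": 0.9}}
print(game_value(rewards))  # 0.3
```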

  41. Flow of Reductions
  The MSR or FR problem is reduced, via Lipschitz constants, to the finite MSR or finite FR problem; this in turn is reduced to a two-player turn-based game whose rewards for Player I are optimised. Monte-Carlo tree search yields an upper bound; admissible A* or alpha-beta pruning yields a lower bound.

  42. Outline Background and Challenges Safety Definition and Layer-by-Layer Refinement Game-based Approach for a Single Layer Verification Experimental Results

  43. Convergence of Lower and Upper Bounds

  44. Experimental Results: GTSRB Image Classification Network for The German Traffic Sign Recognition Benchmark Total params: 571,723

  45. Experimental Results: GTSRB

  46. Experimental Results: ImageNet Image Classification Network for the ImageNet dataset, a large visual database designed for use in visual object recognition research. Total params: 138,357,544

  47. Experimental Results: ImageNet
