Machine learning for instance selection in SMT solving (Work in Progress) Jasmin Christian Blanchette 1,2 Daniel El Ouraoui 2 Pascal Fontaine 2 Cezary Kaliszyk 3 Vrije Universiteit Amsterdam, Amsterdam, The Netherlands University of Lorraine,
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
2 / 32
Motivations
Satisfiability modulo theories (SMT)
Automation
Proof assistant Verification conditions Model checking
Solvers
Z3, CVC4, veriT, ...
Instantiation
Hard for SMT solvers; solved heuristically
Challenge
Improve instantiation techniques Solve more problems Be more efficient
4 / 32
Our tool
Université de Lorraine/UFRN (http://www.verit-solver.org)
5 / 32
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
6 / 32
Context
Ground part: b ≠ a ∧ f(a) = f(b)    Instantiation part: ∀x y. f(x) = f(y) ⇒ x = y
7 / 32
Ground problem

How to efficiently check the satisfiability of a ground formula?

(f(a, b) = g(a) ∨ d ≠ b) ∧ d = g(b) ∧ d = f(a, b) ∧ b = a ∧ d = g(a)

Abstracting the atoms as propositional literals l1: f(a, b) = g(a), l2: d = b, l3: d = g(b), l4: d = f(a, b), l5: b = a, l6: d = g(a) gives the Boolean skeleton

(l1 ∨ ¬l2) ∧ l3 ∧ l4 ∧ l5 ∧ l6

8 / 32
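For intuition, the theory reasoning behind such a ground check can be sketched with a small congruence-closure procedure (a toy sketch, not veriT's implementation; the tuple encoding of terms is ours):

```python
from itertools import product

# Terms are nested tuples: ("f", ("a",), ("b",)) stands for f(a, b).
def congruence_closure(equalities, terms):
    """Return a function mapping each term to its class representative."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    def union(s, t):
        parent[find(s)] = find(t)

    for s, t in equalities:
        union(s, t)
    # Propagate congruence: f(s1..sn) ~ f(t1..tn) whenever si ~ ti for all i.
    changed = True
    while changed:
        changed = False
        for s, t in product(terms, repeat=2):
            if (len(s) > 1 and len(s) == len(t) and s[0] == t[0]
                    and find(s) != find(t)
                    and all(find(x) == find(y) for x, y in zip(s[1:], t[1:]))):
                union(s, t)
                changed = True
    return find

def subterms(t):
    yield t
    for arg in t[1:]:
        yield from subterms(arg)

# Ground conjuncts from the slide: d = g(b), d = f(a, b), b = a, d = g(a)
a, b, d = ("a",), ("b",), ("d",)
f_ab, g_a, g_b = ("f", a, b), ("g", a), ("g", b)
terms = set()
for t in (f_ab, g_a, g_b, d):
    terms |= set(subterms(t))
find = congruence_closure([(d, g_b), (d, f_ab), (b, a), (d, g_a)], terms)
assert find(f_ab) == find(g_a)  # l1, f(a, b) = g(a), is entailed: the clause holds
```

Since the four equalities already force f(a, b) ~ d ~ g(a), the clause l1 ∨ ¬l2 is satisfied and the conjunction is theory-consistent.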
CDCL(T)
Figure: ground solver. The SAT solver exchanges Boolean models and conflict clauses with the theory solvers.

Formulas are embedded into SAT. The SAT solver produces a Boolean model; theory solvers produce conflict clauses; conflict clauses guide the SAT solver.
9 / 32
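The CDCL(T) loop can be sketched as follows (a toy version: the SAT solver is replaced by brute-force model enumeration and the theory solver is a callback; all names are ours):

```python
from itertools import product

def cdcl_t(clauses, nvars, theory_check):
    """Toy CDCL(T) loop: enumerate Boolean models of `clauses`; on a theory
    conflict, learn the blocking clause and continue.
    Literals are +i / -i for variable i (1-based)."""
    learned = []
    for bits in product([False, True], repeat=nvars):
        model = {i + 1: v for i, v in enumerate(bits)}
        sat = lambda cl: any(model[abs(l)] == (l > 0) for l in cl)
        if not all(sat(cl) for cl in clauses + learned):
            continue                  # not a Boolean model
        conflict = theory_check(model)
        if conflict is None:
            return model              # theory-consistent model found
        learned.append(conflict)      # conflict clause guides the search
    return None                       # unsatisfiable

# Variables: 1 is "a = b", 2 is "b = c", 3 is "a = c".
def eq_theory(model):
    if model[1] and model[2] and not model[3]:  # transitivity violated
        return [-1, -2, 3]
    return None

assert cdcl_t([[1], [2], [-3]], 3, eq_theory) is None  # a=b ∧ b=c ∧ a≠c: UNSAT
```

A real solver interleaves decision, propagation, and clause learning instead of enumerating; the interface shown (Boolean model out, conflict clause in) is the same.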
First-Order problem
b ≠ a ∧ f(a) = f(b) ∧ ∀x y. f(x) = f(y) ⇒ x = y    [Instantiation]
10 / 32
First-Order problem

How to find an instance such that the problem becomes UNSAT?

b ≠ a ∧ f(a) = f(b) ∧ ∀x y. f(x) = f(y) ⇒ x = y

The ground part b ≠ a ∧ f(a) = f(b) is SAT. Instantiating the quantifier with x → a, y → b yields f(a) ≠ f(b) ∨ a = b, and

b ≠ a ∧ f(a) = f(b) ∧ (f(a) ≠ f(b) ∨ a = b)

is UNSAT.

11 / 32
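Instantiation itself is just substitution into the quantifier body. A minimal sketch (the tuple encoding of terms and the '?' convention for variables are ours):

```python
def substitute(term, subst):
    """Apply a substitution to a term encoded as nested tuples;
    strings starting with '?' are variables."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])

# Body of ∀x y. f(x) = f(y) ⇒ x = y, written as ¬(f(x) = f(y)) ∨ x = y
body = ("or", ("not", ("=", ("f", "?x"), ("f", "?y"))), ("=", "?x", "?y"))

# The instance with x → a, y → b: f(a) ≠ f(b) ∨ a = b
instance = substitute(body, {"?x": "a", "?y": "b"})
assert instance == ("or", ("not", ("=", ("f", "a"), ("f", "b"))), ("=", "a", "b"))
```

Conjoined with the ground part b ≠ a ∧ f(a) = f(b), the first disjunct is falsified and a = b contradicts b ≠ a, so the ground solver reports UNSAT.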
First-Order CDCL(T)
Figure: first-order CDCL(T). The ground solver passes a first-order model to the instantiation module, which returns instances to the SMT solver.

12 / 32
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
13 / 32
State of the art
Conflict-based instantiation. Introduced by Reynolds et al., this technique produces relevant sets of instances. The idea is that, given a ground model M and a quantified formula ∀(x̄n : τ̄n). ϕ, we find a substitution σ such that M ⊨ ¬ϕσ. Congruence Closure with Free Variables (CCFV). Introduced by Barbosa et al., it generalizes conflict-based instantiation by reasoning over equivalence classes.
14 / 32
State of the art

Enumerative instantiation

∀x : τ. ψ[x] ≡ ⋀_{t ∈ Dτ} ψ[t]

Enumerate all ground terms over the domain of x (a.k.a. the Herbrand universe).
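Enumerative instantiation needs the ground terms of the Herbrand universe; a minimal generator, bounded by term depth (names and the tuple encoding of terms are ours):

```python
from itertools import product

def herbrand_terms(constants, functions, depth):
    """Ground terms up to `depth` applications over the given constants
    and (symbol, arity) function symbols, encoded as nested tuples."""
    terms = {(c,) for c in constants}
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions:
            for args in product(terms, repeat=arity):
                new.add((f,) + args)
        terms = new
    return terms

# Depth 2 over {a, b} with unary g: a, b, g(a), g(b), g(g(a)), g(g(b))
ts = herbrand_terms(["a", "b"], [("g", 1)], 2)
assert len(ts) == 6
```

In practice the universe is infinite as soon as a function symbol exists, so solvers enumerate it lazily in increasing depth, as the bound here suggests.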
Trigger-based instantiation

Triggers: a trigger T for a quantified formula ∀x̄n. ψ is a set of non-ground terms u1, . . . , un ∈ T(ψ) such that {x̄n} ⊆ FV(u1) ∪ . . . ∪ FV(un).

Example: E = {f(a) ≃ g(b), a ≃ g(b)}, Q = ∀x. f(g(x)) ≃ g(x), T = {f(g(x))}. The ground term f(a) E-matches f(g(x)) under x → b.

15 / 32
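E-matching as used for triggers can be sketched as follows (a toy version assuming congruence classes are given as sets over all ground subterms; veriT's CCFV is far more elaborate):

```python
def ematch(pattern, term, classes, subst):
    """Yield substitutions extending `subst` under which `pattern`
    is E-equal to the ground `term`.  Variables are strings starting
    with '?'; `classes` maps each ground term to its congruence class."""
    if isinstance(pattern, str):              # a variable
        if pattern in subst:
            if subst[pattern] in classes[term]:
                yield subst
        else:
            yield {**subst, pattern: term}
        return
    for t in classes[term]:                   # any E-equal term, same head/arity
        if t[0] == pattern[0] and len(t) == len(pattern):
            substs = [subst]
            for p_arg, t_arg in zip(pattern[1:], t[1:]):
                substs = [s2 for s in substs
                             for s2 in ematch(p_arg, t_arg, classes, s)]
            yield from substs

# E = {f(a) ≃ g(b), a ≃ g(b)}: one class {a, f(a), g(b)}, plus {b}
a, b = ("a",), ("b",)
f_a, g_b = ("f", a), ("g", b)
cls = {a, f_a, g_b}
classes = {a: cls, f_a: cls, g_b: cls, b: {b}}

# f(a) E-matches the trigger f(g(x)) under x → b, as on the slide
matches = list(ematch(("f", ("g", "?x")), f_a, classes, {}))
assert matches == [{"?x": b}]
```

The key step is the loop over the congruence class: matching g(?x) against a succeeds because a's class contains g(b), which binds ?x to b.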
Strategy

Figure: instantiation strategy. CCFV feeds the ground solver when it works; when it fails, trigger-based and enumerative instantiation take over.

16 / 32
Summary

Conflict-based instantiation and CCFV:
Pro: efficient; a found substitution kills the current model
Pro: all generated instances are useful
Con: only finds contradictions involving a single instance

Enumerative and trigger-based instantiation:
Pro: useful when CCFV fails
Con: relies on many heuristics
Con: generates a lot of junk, and many instances

17 / 32
This is what we want to improve!
17 / 32
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
18 / 32
Problem

How many lemmas are generated to solve a problem?
Around 300 for the UF category of the SMT-LIB; some problems generate more than 100 000 instances.

How many lemmas are needed to solve a problem?
Only about 10% of that number, and sometimes much less.

Question: could we select the good ones?
19 / 32
Our approach
Instances are kept in a priority queue
Instances are encoded as feature vectors
The predictor ranks them
Several strategies for selection
Figure: ML-Solver. Instantiation produces instances Inst1 ... Instn; the selection step encodes them, calls the predictor to rank them, and filters; selected instances go to the ground solver, the rest are delayed.

20 / 32
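The priority-queue selection above can be sketched as follows (a toy version: the predictor is a stand-in scoring function, not veriT's trained model):

```python
import heapq

def select_instances(instances, predict, k):
    """Keep the k instances with the highest predicted usefulness;
    the rest are delayed for a later instantiation round."""
    # Negate scores: heapq is a min-heap, we want the best first.
    scored = [(-predict(inst), i, inst) for i, inst in enumerate(instances)]
    heapq.heapify(scored)
    selected = [heapq.heappop(scored)[2] for _ in range(min(k, len(scored)))]
    delayed = [inst for _, _, inst in scored]
    return selected, delayed

# Stand-in predictor: pretend shorter instances are more useful.
insts = ["f(a)=f(b) v a=b", "P(g(g(a)))", "Q(b)"]
sel, delayed = select_instances(insts, lambda s: -len(s), 2)
assert sel == ["Q(b)", "P(g(g(a)))"]
```

The enumeration index in the heap entries doubles as a tie-breaker, so equal scores are resolved in instantiation order and instance payloads are never compared.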
State description

(⟨l1, . . . , ln⟩, ∀x̄n. ψ[x̄n], {x1 → t1, . . . , xn → tn})
Model, Formula, Instance

Each instantiation round contributes one tuple per quantified formula and instance:

round 1: (model1, Qformula1, Inst¹_{1,1}, . . . , Inst¹_{1,m}), (model1, Qformula2, Inst²_{1,1}, . . . , Inst²_{1,m}), . . .
round k: (modelk, Qformulai, Instⁱ_{k,1}, . . . , Instⁱ_{k,m})

These tuples are flattened into a feature matrix with one row of features x_{i,2}, . . . , x_{i,n} per instance.
21 / 32
Experiments
Data set: extracted from veriT runs with small proofs
Preprocessing: balancing the data by over-sampling and under-sampling
Classification: train XGBoost; the trained model is exported as C code
Outputs: feature importances and XGBoost predictions
22 / 32
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
23 / 32
Time evaluation
Experiments run on the UF SMT-LIB benchmarks with a 120 s timeout: veriT without learning solves 2923 problems; veriT with learning solves 2939.
24 / 32
Evaluation on test + training set
Figure: comparison of veriT configurations on UF SMT-LIB benchmarks.
25 / 32
Evaluation on test set only
Figure: comparison of veriT configurations on UF SMT-LIB benchmarks.
26 / 32
Contents
1 Introduction 2 CDCL(T) 3 Instantiation techniques 4 Machine learning for instance selection 5 Evaluation 6 Conclusion
27 / 32
Conclusions and future directions
Instance selection could bring a significant improvement
Reduces the number of instances by a factor of two on average
Future work: reinforcement learning
The feature embedding can be improved
28 / 32
Thank you for your attention. Questions or suggestions?
29 / 32
Evaluation
                  |        All           |      Test only
                  | unsat   avg    less  | unsat   avg    less
with learning     | 1443    113    1317  | 423     130    363
without learning  | 1443    318    128   | 423     264    62
Table: veriT configurations on UF SMTLIB benchmarks with 30s timeout.
30 / 32
Features encoding
Term abstraction

Variables ↦ ∗, Skolem constants ↦ ⊙, polarity ↦ ⊕/⊖

Features

FEATURE : Literal → Σ³
FEATURES : Σ³ → ℕ, the occurrences of term walks

Example

FEATURES(f(x, y) = g(sk1, sk2(x))) = {(⊕, =, f) ↦ 1, (⊕, =, g) ↦ 1, (=, f, ∗) ↦ 2, (=, g, ⊙) ↦ 2, (g, ⊙, ∗) ↦ 1}
31 / 32
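The term-walk features above can be sketched as follows (a toy reconstruction: the tuple term encoding and the '?'/'sk' naming conventions for variables and Skolem constants are ours):

```python
from collections import Counter

def head(t):
    """Abstract a (sub)term's head symbol: variables ('?...') to '*',
    Skolem constants ('sk...') to '⊙'; other symbols kept as-is."""
    s = t[0] if isinstance(t, tuple) else t
    if s.startswith("?"):
        return "*"
    if s.startswith("sk"):
        return "⊙"
    return s

def features(polarity, atom):
    """Count length-3 downward walks (term walks) in the literal's tree,
    rooted at the polarity symbol ('⊕' or '⊖')."""
    counts = Counter()

    def walk(grandparent, parent, t):
        counts[(grandparent, parent, head(t))] += 1
        if isinstance(t, tuple):
            for arg in t[1:]:
                walk(parent, head(t), arg)

    for arg in atom[1:]:
        walk(polarity, atom[0], arg)
    return counts

# The slide's example: FEATURES(f(x, y) = g(sk1, sk2(x))) with polarity ⊕
lit = ("=", ("f", "?x", "?y"), ("g", "sk1", ("sk2", "?x")))
fv = features("⊕", lit)
assert fv == Counter({("=", "f", "*"): 2, ("=", "g", "⊙"): 2,
                      ("⊕", "=", "f"): 1, ("⊕", "=", "g"): 1,
                      ("g", "⊙", "*"): 1})
```

Each triple records a grandparent-parent-child path in the term tree, so the counts reproduce the example's feature map exactly.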
State description version 2
(⟨Et1, Dt1, . . . , Etn, Dtn⟩, ⟨T1, . . . , Tn⟩, {x1 → t1, . . . , xn → tn})
Model, Triggers, Instance

Eti is the congruence class of ti
Dti is the set of all terms explicitly disequal to ti
Ti is the set of triggers of xi
This description drastically reduces the size of the problem representation
32 / 32