Scaling up SAT/SMT Application to Industry
R Venkatesh
8/12/2019

Acknowledgements
Kumar Madhukar, Afzal Mohammad, Sumanth Prabhu, Shrawan Kumar, Muqsit Azeem, Divyesh Unadkat, Bharti Chimdyalwar, Advaita Datar, Priyanka Darke, Avriti Chauhan

Objective
- Interesting applications
  - Proving properties of programs
  - Constrained optimization
- Encoding strategies peculiar to domains

Central Theme
- Naive encoding may not work
- Exploit domain properties
  - Verification: invariant templates, small model property, equivalences
  - Optimization: extending partial solutions

Verification Problem
Given a program with an assert statement, check whether the assert holds.

    int x = y = 0
    while (*) {
      x = x + 1
      y = y + x
    }
    assert(y >= 0)

- Encode as a SAT problem (CBMC)
- Loops are a challenge, as the number of iterations is not known
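To make the loop challenge concrete, here is a minimal Python sketch of bounded checking for the example program. It is an explicit-state stand-in for the SAT encoding, not CBMC itself: the function name `bounded_check` and the unrolling bound are assumptions made for illustration, and Python integers do not overflow as C `int` values can.

```python
def bounded_check(max_unroll):
    """Check assert(y >= 0) on every run that leaves the loop after at most
    max_unroll iterations. Returns a violating (iterations, x, y) or None."""
    x, y = 0, 0                          # int x = y = 0
    for iterations in range(max_unroll + 1):
        # while (*) may exit here, so the assert is checked at this point
        if not y >= 0:
            return (iterations, x, y)
        x = x + 1                        # loop body, one more unrolling
        y = y + x
    return None


print(bounded_check(10))
```

Finding no violation up to the bound says nothing about runs with more iterations, which is why the following slides turn to invariants.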

Invariants and Abstractions
- An invariant is a property that holds for every run of the program. It defines an abstract set of states.
- A loop invariant holds at the loop head, on every iteration, and at the end of the loop:

      {I ∧ B} S {I}
      --------------------------
      {I} while(B) S {I ∧ ¬B}

- An abstraction of a program P is any program P′ that has all the runs of P and possibly more. A property that holds in P′ also holds in P.
- Invariants can help eliminate loops in a program by abstracting it.

Loop Elimination - Example

Original program:

    int x = y = 0
    while (*) {
      x = x + 1
      y = y + x
    }
    assert(y >= 0)

Loop eliminated:

    int x = y = 0
    assert(x >= 0)
    x = *
    y = *
    assume(x >= 0)
    if (*) {
      x = x + 1
      y = y + x
    }
    assert(x >= 0)
    assert(y >= 0)

How to discover these invariants?

Invariant Synthesis
- Search in a carefully constructed space (given by a grammar)
  - Similar to program synthesis and Daikon
- Restrict language size by deriving the grammar from code
- Possible assistance from data

An Example

    int x = y = 0
    while (*) {
      x = x + 1
      y = y + x
    }
    assert(y >= 0)

Safe inductive invariants:
- (x ≥ 0 ∧ y ≥ 0)
- (x ≥ 0 ∧ y − x ≥ 0)

Program reference: Understanding IC3
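The three conditions behind "safe inductive invariant" (holds initially, preserved by the loop body, implies the assertion) can be tried out with a small Python sketch. It enumerates a bounded box of states, so it is only a sanity check; a real tool would discharge the same conditions with an SMT solver over all integers. The names `body` and `is_safe_inductive` and the bound of 20 are assumptions for illustration.

```python
from itertools import product


def body(x, y):
    """One iteration of the loop body: x = x + 1; y = y + x."""
    x = x + 1
    y = y + x
    return x, y


def is_safe_inductive(inv, bound=20):
    """Check initiation, consecution and safety of inv on a bounded box."""
    states = list(product(range(-bound, bound + 1), repeat=2))
    init_ok = inv(0, 0)                                          # int x = y = 0
    step_ok = all(inv(*body(x, y)) for x, y in states if inv(x, y))
    safe_ok = all(y >= 0 for x, y in states if inv(x, y))        # assert(y >= 0)
    return init_ok and step_ok and safe_ok


inv1 = lambda x, y: x >= 0 and y >= 0
inv2 = lambda x, y: x >= 0 and y - x >= 0
only_y = lambda x, y: y >= 0          # the property alone, without x >= 0

print(is_safe_inductive(inv1), is_safe_inductive(inv2), is_safe_inductive(only_y))
```

Both invariants from the slide pass, while `y >= 0` on its own is not preserved by the body (a negative `x` can drive `y` below zero), which is why the conjunct on `x` is needed.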

Expression Syntax & Probability

    int x = y = 0
    while (*) {
      x = x + 1
      y = y + x
    }
    assert(y >= 0)

    Syntax        Probability
    x ≥ 0         0.4
    −x ≥ 0        0.1
    y ≥ 0         0.2
    −y ≥ 0        0.1
    x + y ≥ 0     0.2
    y − x ≥ 0     0.1

Grammar and Probabilities

Sampling grammar:

    c      ::= 0
    k      ::= 0 | 1 | −1
    v      ::= x | y
    lincom ::= k · v + · · · + k · v
    ineq   ::= lincom ≥ c | lincom > c
    cand   ::= ineq ∨ ineq ∨ . . . ineq

Weights from frequencies:
- Occurrences of formulas of arity i
- Occurrences of an operator op ∈ {>, ≥} among inequalities
- Occurrences of a variable v with coefficient k

Probabilities from weights:
- At any level, if the available choices have weights a, b, and c, they are sampled with probabilities a/(a+b+c), b/(a+b+c), and c/(a+b+c)

More details: Fedyukovich, Kaufman, and Bodík, FMCAD 2017
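The step from weights to probabilities, and sampling with them, can be sketched in a few lines of Python. The weight tables below are made-up numbers, not frequencies measured from any program, and the helper names are hypothetical; only single-term inequalities are sampled to keep the sketch short.

```python
import random


def normalize(weights):
    """Turn weights a, b, c into probabilities a/(a+b+c), b/(a+b+c), ..."""
    total = sum(weights.values())
    return {choice: w / total for choice, w in weights.items()}


def sample(weights, rng):
    """Draw one choice with probability proportional to its weight."""
    choices = list(weights)
    return rng.choices(choices, weights=[weights[c] for c in choices], k=1)[0]


# Hypothetical weights, standing in for occurrence counts in the code.
OP_WEIGHTS = {">=": 3, ">": 1}
TERM_WEIGHTS = {("x", 1): 4, ("x", -1): 1, ("y", 1): 2, ("y", -1): 1}


def sample_inequality(rng):
    """Sample a candidate of the form k*v op 0 with a single term."""
    var, k = sample(TERM_WEIGHTS, rng)
    op = sample(OP_WEIGHTS, rng)
    term = var if k == 1 else f"-{var}"
    return f"{term} {op} 0"


rng = random.Random(0)
print(normalize(OP_WEIGHTS))
print([sample_inequality(rng) for _ in range(5)])
```

Normalizing at each level of the grammar separately keeps the tables small: one table per production rather than one per complete candidate.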

Additional Heuristics
- Iterative learning: conjoin already proven invariants with the candidates
- Probabilities can be adjusted; for example, having derived (x > 5), do not sample (x > 4), which is weaker
- Learn from executions (dynamic/symbolic)
  - algebraic invariants from traces (Prabhu et al., SAS 2018; Sharma et al., ESOP 2013)
  - interpolants from bounded proofs (Fedyukovich et al., TACAS 2017)
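The "do not sample weaker candidates" heuristic can be illustrated for the simplest case, single-variable lower bounds over the integers. This is a sketch under that assumption only; the tuple encoding `(variable, operator, constant)` and the name `is_redundant` are hypothetical, and a real implementation would adjust sampling probabilities rather than filter a list.

```python
def lower_bound(op, c):
    """Over the integers, v > c is the same as v >= c + 1."""
    return c + 1 if op == ">" else c


def is_redundant(candidate, learned):
    """True if some learned bound on the same variable implies candidate."""
    var, op, c = candidate
    return any(
        lvar == var and lower_bound(lop, lc) >= lower_bound(op, c)
        for lvar, lop, lc in learned
    )


learned = [("x", ">", 5)]                       # already proven: x > 5
candidates = [("x", ">", 4), ("x", ">=", 6), ("x", ">", 6), ("y", ">", 4)]
fresh = [c for c in candidates if not is_redundant(c, learned)]
print(fresh)
```

Here `x > 4` and `x >= 6` are dropped because `x > 5` already implies them, while the stronger `x > 6` and the unrelated `y > 4` remain worth sampling.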

Multiple Loops

    int x = y = 0
    int m = n = *; assume(m >= 0);
    while (n != 0) {
      n--;
      if (*) then x++; else y++;
    }
    while (x != 0) { m--; x--; }
    while (y != 0) { m--; y--; }
    assert(m == 0);

Invariants needed:
- for the first loop: (x + y + n = m)
- for the second loop: (x + y + n = m) ∧ (n = 0)
- for the third loop: (x + y + n = m) ∧ (n = 0) ∧ (x = 0)

Multiple Loops
(Program as on the first Multiple Loops slide.)

Expressions in the code before and at the first loop are normalized into inequalities:

    x = 0    →   x ≥ 0, −x ≥ 0
    y = 0    →   y ≥ 0, −y ≥ 0
    m ≥ 0    →   m ≥ 0
    m = n    →   m ≥ n, −m ≥ −n
    n ≠ 0    →   −n > 0 ∨ n > 0

Multiple Loops
(Program as on the first Multiple Loops slide.)

Seeds for the first loop:

    { x ≥ 0, −x ≥ 0, y ≥ 0, −y ≥ 0, m ≥ 0, m ≥ n, −m ≥ −n, −n > 0 ∨ n > 0 }

Grammar derived from the seeds:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | y | m | n
    e    ::= k · v | k · v + k · v
    cand ::= e ≥ c | e > c ∨ e > c

Multiple Loops
(Program as on the first Multiple Loops slide.)

Seeds for the second loop:

    { n ≥ 0, −n ≥ 0, −x > 0 ∨ x > 0 }

Multiple Loops
(Program as on the first Multiple Loops slide.)

Seeds for the second loop:

    { n ≥ 0, −n ≥ 0, −x > 0 ∨ x > 0 }

Grammar derived from the seeds:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | n
    e    ::= k · v
    cand ::= e ≥ c | e > c ∨ e > c

Multiple Loops
(Program as on the first Multiple Loops slide.)

Seeds for the third loop:

    { x ≥ 0, −x ≥ 0, −y > 0 ∨ y > 0, y ≥ 0, −y ≥ 0, m ≥ 0, −m ≥ 0 }

Multiple Loops
(Program as on the first Multiple Loops slide.)

Seeds for the third loop:

    { x ≥ 0, −x ≥ 0, −y > 0 ∨ y > 0, y ≥ 0, −y ≥ 0, m ≥ 0, −m ≥ 0 }

Grammar derived from the seeds:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | y | m
    e    ::= k · v
    cand ::= e ≥ c | e > c ∨ e > c

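Insufficiency of the grammars

First loop:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | y | m | n
    e    ::= k · v | k · v + k · v
    cand ::= e ≥ c | e > c ∨ e > c

    Invariant needed: (x + y + n = m)

Second loop:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | n
    e    ::= k · v
    cand ::= e ≥ c | e > c ∨ e > c

    Invariant needed: (x + y + n = m) ∧ (n = 0)

Third loop:

    c    ::= 0
    k    ::= 1 | −1
    v    ::= x | y | m
    e    ::= k · v
    cand ::= e ≥ c | e > c ∨ e > c

    Invariant needed: (x + y + n = m) ∧ (n = 0) ∧ (x = 0)

None of the grammars can produce (x + y + n = m): each e has at most two terms, while the equality relates four variables.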

Learning from Traces and Propagation
- (x + y + n = m), for the first loop, can be obtained by fitting program behaviors into a polynomial
- This works for other loops as well (no change in variables between the loops)
- Propagate candidates to neighboring loops

More details: Fedyukovich, Prabhu, Madhukar, and Gupta, FMCAD 2018
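One way to picture "fitting program behaviors" is to collect loop-head states of the first loop and look for linear equalities that every state satisfies, i.e. the null space of the data matrix. The Python sketch below does this with exact rational arithmetic. It is an illustration of the general idea under simplifying assumptions (linear equalities only, all runs with a small initial value enumerated), not the algorithm of the cited papers; `first_loop_states` and `null_space` are hypothetical helper names.

```python
from fractions import Fraction


def first_loop_states(m0, pattern):
    """Loop-head states (x, y, n, m) of the first loop, starting from
    m = n = m0. Bit i of `pattern` resolves the i-th `if (*)`."""
    x, y, n, m = 0, 0, m0, m0
    states = [(x, y, n, m)]
    step = 0
    while n != 0:
        n -= 1
        if (pattern >> step) & 1:
            x += 1
        else:
            y += 1
        step += 1
        states.append((x, y, n, m))
    return states


def null_space(rows):
    """Basis of all coefficient vectors c with row . c = 0 for every row,
    computed by exact Gauss-Jordan elimination."""
    mat = [[Fraction(v) for v in row] for row in rows]
    ncols = len(mat[0])
    pivots = []                      # pivot column of each reduced row
    r = 0
    for c in range(ncols):
        if r == len(mat):
            break
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue                 # no pivot here: c is a free column
        mat[r], mat[pivot] = mat[pivot], mat[r]
        scale = mat[r][c]
        mat[r] = [v / scale for v in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                factor = mat[i][c]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -mat[row_idx][free]
        basis.append(vec)
    return basis


# Every run with m0 in {0, 1, 2, 3}; each row is (x, y, n, m, 1).
rows = [state + (1,)
        for m0 in range(4)
        for pattern in range(2 ** m0)
        for state in first_loop_states(m0, pattern)]
basis = null_space(rows)
print(basis)
```

Each basis vector (cx, cy, cn, cm, c0) stands for the candidate cx·x + cy·y + cn·n + cm·m + c0 = 0. A candidate found this way is only consistent with the observed traces and still has to be checked for inductiveness.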

Experimental Results
- 101 (safe) benchmarks (81 LIA, 20 non-linear)
- FreqHorn solved 81, Spacer solved 45, µZ 42, and Eldarica 71
- FreqHorn solved
  - 41 on which Spacer diverged
  - 44 on which µZ diverged
  - 22 on which Eldarica diverged
  - 16 on which all others diverged (10 over NIA)
- When run without probabilities, FreqHorn solved 65 (with the same 5-minute timeout)

More details: Fedyukovich, Prabhu, Madhukar, and Gupta, FMCAD 2018

Extending to Programs with Arrays
- Generation of quantified candidates: ∀Q. range(Q, I) ⟹ cell(Q, A, I)
  - Example: ∀j. i < j ≤ N − 1 ⟹ m ≤ A[j]
- A better solution is needed than just extending the grammar with quantifiers
  - Adding quantifiers directly to the grammar may produce more spurious candidates
  - Checking quantified invariant candidates is costly

Quantified Invariants

    int N, A[N], B[N];
    int s = 0, m = 0;
    int i;
    for(i=N-1; i>=0; i=i-1){
      if(m > A[i])
        m = A[i];
    }
    for(i=0; i<N; i++){
      B[N-i-1] = A[i] - m;
    }
    for(i=0; i<N; i++){
      s = s + B[i]
    }
    assert(s >= 0);

Invariants:
- First loop: ∀j. i < j ≤ N − 1 ⟹ m ≤ A[j]
- Second loop: ∀j. 0 ≤ j ≤ N − 1 ⟹ m ≤ A[j] ∧ ∀j. 0 ≤ j < i ⟹ B[N − j − 1] = A[j] − m
- Third loop: ∀j. 0 ≤ j < N ⟹ m ≤ A[j] ∧ ∀j. 0 ≤ j < N ⟹ B[N − j − 1] = A[j] − m ∧ s ≥ 0

Quantified Invariants
(Program as on the previous Quantified Invariants slide.)

- For each counter variable of a loop, add a new quantified variable to Q
  - Single quantified variable j for each loop
- Compute the range based on the bound on the counter variable
  - i < j ≤ N − 1, 0 ≤ j ≤ N − 1 and 0 ≤ j < N
- Sample the cell formula using a grammar constructed from syntax
  - m ≤ A[j], B[N − j − 1] = A[j] − m, s ≥ 0, etc.
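The range/cell shape of a quantified candidate can be tried out on concrete runs. The Python sketch below builds the candidate ∀j. i < j ≤ N − 1 ⟹ m ≤ A[j] from a range predicate and a cell predicate and evaluates it on every loop-head state of the first loop for one input array. Testing on a concrete run is an assumption made here for illustration: it can only discard candidates, and the deck itself checks candidates with an SMT solver. All function names are hypothetical.

```python
def first_loop_states(A):
    """Yield the state at the head of the first loop on every iteration:
    for(i=N-1; i>=0; i=i-1){ if(m > A[i]) m = A[i]; }"""
    N = len(A)
    m, i = 0, N - 1
    yield {"A": A, "N": N, "m": m, "i": i}
    while i >= 0:
        if m > A[i]:
            m = A[i]
        i = i - 1
        yield {"A": A, "N": N, "m": m, "i": i}


def holds(range_pred, cell_pred, state):
    """Evaluate  forall j. range(j) => cell(j)  on one concrete state.
    Only array indices are enumerated, since the range bounds j to them."""
    return all(cell_pred(j, state)
               for j in range(state["N"])
               if range_pred(j, state))


def survives(range_pred, cell_pred, A):
    """True if the candidate holds at every loop-head state of this run."""
    return all(holds(range_pred, cell_pred, s) for s in first_loop_states(A))


in_range = lambda j, s: s["i"] < j <= s["N"] - 1     # from the counter bound
cell_ok = lambda j, s: s["m"] <= s["A"][j]           # sampled cell formula
cell_strict = lambda j, s: s["m"] < s["A"][j]        # a spurious variant

A = [3, -2, 5, 0]
print(survives(in_range, cell_ok, A), survives(in_range, cell_strict, A))
```

The strict variant is refuted by the element `0` at the end of the array (where `m` is still `0`), which shows how a cheap concrete check could weed out spurious candidates before any costly quantified query.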

Quantified Invariants
(Program as on the previous Quantified Invariants slides.)

Propagate inductive invariants between loops:
- ∀j. i < j ≤ N − 1 ⟹ m ≤ A[j] (first loop) is propagated to the second and third loops as ∀j. 0 ≤ j ≤ N − 1 ⟹ m ≤ A[j]
- ∀j. 0 ≤ j < i ⟹ B[N − j − 1] = A[j] − m (second loop) is propagated to the third loop as ∀j. 0 ≤ j < N ⟹ B[N − j − 1] = A[j] − m

Scaling SMT Checks
- We may still need to scale SMT checks to sample more candidates
- Two main techniques:
  - Reduction to quantifier-free formulas
  - Generalizing sub-ranges

Experimental Results
- 137 (safe) benchmarks (79 single loops, 58 multiple loops)
- FreqHorn solved 129, Spacer solved 81, VIAP 70, and Booster 48
- FreqHorn solved
  - 54 on which Spacer diverged
  - 60 on which VIAP diverged
  - 83 on which Booster diverged

More details: Fedyukovich, Prabhu, Madhukar, and Gupta, CAV 2019
