12th International Satisfiability Modulo Theories Competition (SMT-COMP 2017)


  1. 12th International Satisfiability Modulo Theories Competition (SMT-COMP 2017)
     Matthias Heizmann (co-organizer), Giles Reger (co-organizer), Tjark Weber (chair)

  2. Outline
     ◮ Main changes over last competition
     ◮ Benchmarks with 'unknown' status
     ◮ Logics with algebraic data-types: AUFBVDTLIA, AUFDTLIA, QF_DT, UFDT, UFDTLIA
     ◮ Unsat-core Track
     ◮ Statistics and selected results of the competition
     ◮ Short presentation of solvers: Boolector, COLIBRI, CVC4, SMTInterpol, veriT, Yices

  3. SMT-COMP – Procedure
     ◮ Users submit SMT-LIB benchmarks; the benchmark library is curated by Clark Barrett, Pascal Fontaine, Cesare Tinelli, and Christoph Wintersteiger, who upload the benchmarks to StarExec.
     ◮ Solver developers upload their SMT solvers to StarExec.
     ◮ StarExec (maintained by Aaron Stump) runs the solvers on the benchmarks and produces the competition results.

  4. Solvers, Logics, and Benchmarks
     ◮ 15 teams participated
     ◮ Solvers: Main track 19 (2 non-competitive), Application track 4 (2 non-competitive), Unsat-core track 2 (2 non-competitive)
     ◮ Logics: Main track 40 (5 experimental), Application track 14, Unsat-core track 39
     ◮ Benchmarks: Main track 256973, Application track 5971, Unsat-core track 114233

  5. StarExec
     Cluster of machines at the University of Iowa.
     Hardware:
     ◮ Intel Xeon CPU E5-2609 @ 2.4 GHz, 10 MB cache
     ◮ 2 processors per node, 4 cores per processor
     ◮ main memory capped at 60 GB per job pair
     Software:
     ◮ Red Hat Enterprise Linux Server release 7.2
     ◮ kernel 3.10.0-514, gcc 4.8.5, glibc 2.17

  6. Main Track benchmark
     (set-logic ...)
     (set-info ...)        ; any number of set-info, declare-sort,
     (declare-sort ...)    ; define-sort, declare-fun, define-fun,
     (define-sort ...)     ; and assert commands, in any order
     (declare-fun ...)
     (define-fun ...)
     (assert term0)
     (assert term1)
     (assert term2)
     ...
     (check-sat)           ; one single check-sat command
     (exit)
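     For illustration, a minimal benchmark of this shape; the logic (QF_LIA), the constant x, and the constraints are invented for the example, not taken from the competition:

     ; hypothetical Main Track benchmark
     (set-logic QF_LIA)
     (set-info :status unsat)
     (declare-fun x () Int)
     (assert (> x 0))
     (assert (< x 0))      ; contradicts the previous assertion
     (check-sat)           ; the single check-sat command; answer: unsat
     (exit)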

  7. Benchmarks with 'unknown' status
     Some benchmarks in the SMT-LIB repository do not have a sat/unsat status.
     Benchmarks with 'unknown' status in SMT-COMP:
     ◮ 2015: not used in competition
     ◮ 2016: separate experimental track
     ◮ 2017: included in Main Track

  8. New logics: algebraic data-types
     ◮ defined in the SMT-LIB 2.6 draft
     ◮ "experimental" this year (i.e., no winner determined)

     Logic        Benchmarks   Solvers
     AUFBVDTLIA   1709         CVC4
     AUFDTLIA     728          CVC4, vampire
     QF_DT        8000         CVC4
     UFDT         4535         CVC4, vampire
     UFDTLIA      303          vampire, CVC4
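     For a flavor of these logics, a small hypothetical QF_DT benchmark using the declare-datatypes command from the SMT-LIB 2.6 draft; the Tree datatype and all names are invented for the example:

     ; hypothetical QF_DT benchmark with an algebraic data-type
     (set-logic QF_DT)
     ; binary trees: either a leaf, or a node with two subtrees
     (declare-datatypes ((Tree 0))
       (((leaf)
         (node (left Tree) (right Tree)))))
     (declare-fun t () Tree)
     (assert ((_ is node) t))     ; tester: t is built with the node constructor
     (assert (= (left t) leaf))   ; selector: the left subtree of t is a leaf
     (check-sat)                  ; sat, e.g. t = (node leaf leaf)
     (exit)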

  9. Benchmarks with 'unknown' status – Rules
     ◮ we trust the results of the solver(s)
     ◮ in case of disagreement, we trust solvers that are sound on benchmarks with known status
     ◮ if there is disagreement between otherwise sound solvers, we exclude the benchmark

  10. Benchmarks with 'unknown' status – Outcome
     ◮ There were 29 benchmarks with unknown status on which solvers disagreed on the result.
     ◮ On one benchmark (in BV), the disagreeing solvers were sound on all benchmarks with known status.
     ◮ On 28 benchmarks (all in QF_FP), the presumably wrong answers were given by unsound solvers.

  11. Competition run of Main Track
     ◮ run all job pairs with a 10 min timeout
     ◮ made preliminary results available
     ◮ rerun all job pairs that timed out with a 20 min timeout
     ◮ made final results available on Friday (21st July)

  12. Main Track – Selected results – QF_ABV
     http://smtcomp.sourceforge.net/2017/results-QF_ABV.shtml

  13. Main Track: Competition-Wide Scoring

     Rank   Solver        Score (sequential)   Score (parallel)
     –      Z3            171.99               171.99
     1      CVC4          161.38               161.76
     2      Yices2        110.63               110.63
     3      SMTInterpol   65.96                66.00

     (Z3 entered non-competitively and is therefore listed without a rank.)

  14. Application Track

  15. Unsat-core Track
     Motivation:
     ◮ important application of SMT-LIB
     ◮ one step towards verifiable proofs
     History:
     ◮ 2012: introduced, later discontinued
     ◮ 2016: reinstated as experimental track
     ◮ 2017: "regular" track

  16. Unsat-core Track – Solver input
     The solver input is obtained from a Main Track benchmark (shape as on slide 6) by enabling unsat-core production, naming every assertion, and requesting the core after check-sat; a concrete sketch follows below.
     (set-option :produce-unsat-cores true)
     (set-logic ...)
     (set-info ...)
     ...
     (declare-sort ...)
     (define-sort ...)
     (declare-fun ...)
     (define-fun ...)
     (assert (! term0 :named y0))
     (assert (! term1 :named y1))
     (assert (! term2 :named y2))
     ...
     (check-sat)
     (get-unsat-core)
     (exit)
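     To make this concrete, a small invented solver input; the logic (QF_LIA), the constant x, and the constraints are assumptions for the example, not competition data:

     ; hypothetical Unsat-core Track solver input
     (set-option :produce-unsat-cores true)
     (set-logic QF_LIA)
     (declare-fun x () Int)
     (assert (! (>= x 1) :named y0))   ; part of a minimal core
     (assert (! (<= x 5) :named y1))   ; irrelevant to unsatisfiability
     (assert (! (<= x 0) :named y2))   ; contradicts y0
     (check-sat)        ; unsat
     (get-unsat-core)   ; a solver could report (y0 y2)
     (exit)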

  17. Unsat-core Track – Solver output
     The solver runs on this input with a timeout of 40 min. Example output:
     unsat
     (y0 y2)

  18. Unsat-core Track – Validation script
     From the reported unsat core, a validation script is derived: the original benchmark restricted to the assertions whose names appear in the core (here term0 and term2; the assertion term1 is dropped). A concrete sketch follows below.
     (set-logic ...)
     (set-info ...)
     ...
     (declare-sort ...)
     (define-sort ...)
     (declare-fun ...)
     (define-fun ...)
     (assert term0)
     (assert term2)
     (check-sat)
     (exit)
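     Continuing the invented QF_LIA example from slide 16: for the reported core (y0 y2), the validation script would be:

     ; hypothetical validation script for the core (y0 y2)
     (set-logic QF_LIA)
     (declare-fun x () Int)
     (assert (>= x 1))   ; was named y0
     (assert (<= x 0))   ; was named y2; the assertion named y1 is dropped
     (check-sat)         ; the validating solvers should answer unsat
     (exit)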

  19. Unsat-core Track – Validation
     The validation script is given to four validation solvers, with a timeout of 5 min each; every validation solver answers sat, unknown, or unsat.

  20. Unsat-core Track – Scoring scheme
     n = (number of assert commands) - (size of the unsatisfiable core)
     e = 1 if the result is erroneous, i.e., the check-sat answer is wrong or the unsatisfiable core is rejected by the validating solvers; e = 0 otherwise
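     For the invented example above, a run reporting the core (y0 y2) would score n = 3 - 2 = 1 (three assert commands, core of size 2) and, assuming the validating solvers confirm the core, e = 0.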

  21. Unsat-core Track – Statistics
     Of 245483 job pairs:
     ◮ 18982 (8%): timeout/crash/unknown
     ◮ 226501 (92%): correct check-sat responses
     ◮ 29 (~0.01%): incorrect check-sat responses
     Of the 226501 correct check-sat responses:
     ◮ 30 (~0.01%): timeout/crash
     ◮ 226471 (~99.99%): get-unsat-core responses
     Of the 226471 get-unsat-core responses:
     ◮ 226463 (~99.99%): unsatisfiable core validated
     ◮ 8 (~0.01%): unsatisfiable core rejected by the validating solvers

  22. Unsat-core Track – Statistics (continued)
     ◮ 19 times there was no consensus among the validating solvers (→ majority decision)
     ◮ 27525 times (~12%) no independent validating solver approved the correctness of the unsatisfiable core

  23. (Very) short presentations of solvers that sent us slides: Boolector, COLIBRI, CVC4, SMTInterpol, veriT, Yices
