
SLIDE 1

Understanding Random SAT

Beyond the Clauses-to-Variables Ratio

Eugene Nudelman, Stanford University

joint work with…

Kevin Leyton-Brown, Holger Hoos (University of British Columbia); Alex Devkar, Yoav Shoham (Stanford University)

SLIDE 2

Introduction

  • SAT is one of the most studied problems in CS

  • Lots is known about its worst-case complexity

    – But often, particular instances of NP-hard problems like SAT are easy in practice

  • A "Drosophila" for average-case and empirical (typical-case) complexity studies

  • (Uniformly) random SAT provides a way to bridge analytical and empirical work

CP 2004

SLIDE 3

Previously…

  • Easy-hard-less-hard transitions discovered in the behaviour of DPLL-type solvers [Selman, Mitchell, Levesque]

    – Strongly correlated with phase transition in solvability
    – Spawned a new enthusiasm for using empirical methods to study algorithm performance

  • Follow-up included study of:

    – Islands of tractability [Kolaitis et al.]
    – SLS search space topologies [Frank et al., Hoos]
    – Backbones [Monasson et al., Walsh and Slaney]
    – Backdoors [Williams et al.]
    – Random restarts [Gomes et al.]
    – Restart policies [Horvitz et al., Ruan et al.]
    – …

[Plot: 4 × Pr(SAT) and -2 log(Kcnfs runtime) vs. c/v over 3.3–5.3]

SLIDE 4

Empirical Hardness Models

  • We proposed building regression models as a disciplined way of predicting and studying algorithms' behaviour [Leyton-Brown, Nudelman, Shoham, CP-02]

  • Applications of this machine learning approach:

    1) Predict running time
       • Useful to know how long an algorithm will run
    2) Gain theoretical understanding
       • Which variables are important to the hardness model?
    3) Build algorithm portfolios
       • Can select the right algorithm on a per-instance basis
    4) Tune distributions for hardness
       • Can generate harder benchmarks by rejecting easy instances
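The per-instance portfolio idea (application 3) can be sketched in a few lines: given one learned runtime model per solver, evaluate every model on an instance's feature vector and run the solver with the lowest prediction. The stand-in models and the `cv_ratio_dist` feature below are hypothetical toys, not the deck's actual learned models.

```python
# Sketch of per-instance algorithm selection, assuming one runtime model
# per solver already exists. Models here are hypothetical stand-ins.

def select_solver(models, features):
    """Return the solver whose model predicts the lowest runtime."""
    predictions = {name: model(features) for name, model in models.items()}
    return min(predictions, key=predictions.get)

# Toy stand-in models: each maps a feature dict to a predicted runtime.
models = {
    "kcnfs":    lambda f: 2.0 * f["cv_ratio_dist"] + 0.5,
    "oksolver": lambda f: 1.0 * f["cv_ratio_dist"] + 1.0,
}

# Instances near vs. far from the phase transition (small vs. large |c/v - 4.26|).
near = {"cv_ratio_dist": 0.1}
far = {"cv_ratio_dist": 2.0}
print(select_solver(models, near))  # kcnfs predicted faster here
print(select_solver(models, far))   # oksolver predicted faster here
```

This is the mechanism behind SATzilla-style portfolios mentioned in the conclusions: the models, not the code structure, carry all the difficulty.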

SLIDE 5

Outline

  • Features

  • Experimental Results

    – Variable Size Data
    – Fixed Size Data

SLIDE 6

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number, showing a long plateau and a short plateau]

SLIDE 7

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number]

Best Solution (mean, CV)

SLIDE 8

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number]

Number of Steps to Optimal (mean, median, CV, 10%, 90%)

SLIDE 9

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number]

Average Improvement to Best Per Step (mean, CV)

SLIDE 10

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number]

First LM Ratio (mean, CV)

SLIDE 11

Features: Local Search Probing

[Plot: best # unsat clauses vs. step number]

BestCV (CV of Local Minima) (mean, CV)
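The probing features on slides 6–11 all follow one pattern: run several short local-search probes, record the trajectory of the best number of unsatisfied clauses, and summarize it with statistics such as (mean, CV). The sketch below uses a toy random walk rather than SAPS or GSAT, and a tiny hand-made CNF, purely to show the bookkeeping.

```python
# Minimal sketch of local-search probing: run short random-walk probes on a
# small CNF, track the best # unsat clauses seen, summarize by (mean, CV).
# Toy walk, not the SAPS/GSAT probers used in the deck.
import random
import statistics

def num_unsat(clauses, assignment):
    """Count clauses with no true literal under the assignment."""
    return sum(not any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)

def probe(clauses, n_vars, steps, rng):
    """One probe: random start, random flips, return best # unsat seen."""
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    best = num_unsat(clauses, assignment)
    for _ in range(steps):
        v = rng.randrange(1, n_vars + 1)   # flip a random variable
        assignment[v] = not assignment[v]
        best = min(best, num_unsat(clauses, assignment))
    return best

rng = random.Random(0)
clauses = [(1, 2, 3), (-1, 2, -3), (1, -2, 3), (-1, -2, -3)]
bests = [probe(clauses, n_vars=3, steps=20, rng=rng) for _ in range(10)]
mean = statistics.mean(bests)
cv = statistics.stdev(bests) / mean if mean > 0 else 0.0
print(mean, cv)
```

The other slide features (steps to optimal, improvement per step, first-LM ratio) come from the same trajectories, just summarized differently.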

SLIDE 12

Features: DPLL, LP

  • DPLL search space size estimate

    – Random probing with unit propagation
    – Compute mean depth till contradiction
    – Estimate log(#nodes)

  • Cumulative number of unit propagations at different depths (DPLL with Satz heuristic)

  • LP relaxation

    – Objective value
    – Stats of integer slacks
    – #vars set to an integer
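The DPLL search-space estimate above can be sketched as follows: each probe branches on random literals, runs unit propagation, and stops at a contradiction; if the mean contradiction depth is d, a crude log-scale size estimate is that of a binary tree of depth d, about 2^(d+1) nodes. This is a toy illustration under that assumption, not the deck's exact estimator.

```python
# Sketch of random probing with unit propagation for the DPLL
# search-space size estimate. Toy estimator, not the paper's exact one.
import random

def unit_propagate(clauses, assignment):
    """Simplify clauses under assignment; return ([], False) on conflict."""
    changed = True
    while changed:
        changed = False
        new = []
        for c in clauses:
            if any(l in assignment for l in c):        # clause satisfied
                continue
            c = tuple(l for l in c if -l not in assignment)
            if not c:                                   # empty clause: conflict
                return [], False
            if len(c) == 1:                             # unit clause: force it
                assignment.add(c[0])
                changed = True
            else:
                new.append(c)
        clauses = new
    return clauses, True

def estimate_log2_nodes(clauses, n_probes, rng):
    """Mean depth-to-contradiction d, reported as log2(#nodes) ~ d + 1."""
    depths = []
    for _ in range(n_probes):
        cur, assignment, depth, ok = list(clauses), set(), 0, True
        while ok and cur:
            free = sorted({abs(l) for c in cur for l in c})
            v = rng.choice(free)                        # random branch literal
            assignment.add(v if rng.random() < 0.5 else -v)
            depth += 1
            cur, ok = unit_propagate(cur, assignment)
        depths.append(depth)
    return sum(depths) / len(depths) + 1

rng = random.Random(1)
clauses = [(1, 2), (-1, 2), (1, -2), (-1, -2)]   # unsatisfiable 2-CNF
print(estimate_log2_nodes(clauses, n_probes=20, rng=rng))
```

On this tiny unsatisfiable formula every probe reaches a contradiction after one branch, so the estimate is log2(#nodes) = 2, i.e. a four-node tree.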

SLIDE 13

Other Features

  • Problem Size:

    – v (#vars)
    – c (#clauses)
    – Powers of c/v, v/c, |c/v - 4.26|

  • Graphs:

    – Variable-Clause (VCG, bipartite)
    – Variable (VG, edge whenever two variables occur in the same clause)
    – Clause (CG, edge iff two clauses share a variable with opposite sign)

  • Balance

    – #pos vs. #neg literals
    – Unary, binary, ternary clauses

  • Proximity to Horn formula

  (used for normalizing many other features)
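The graph features above reduce to building a graph from the CNF and summarizing node statistics. A minimal sketch for the variable graph (VG), with a mean/max degree summary; the VCG and CG follow the same pattern with different edge rules.

```python
# Sketch of the VG feature: edge whenever two variables occur in the same
# clause, summarized by mean and max node degree. Toy CNF for illustration.
from itertools import combinations
from statistics import mean

def variable_graph(clauses):
    """Edges between variables that co-occur in some clause."""
    edges = set()
    for c in clauses:
        for u, v in combinations(sorted({abs(l) for l in c}), 2):
            edges.add((u, v))
    return edges

def degree_stats(edges, n_vars):
    """(mean degree, max degree) over all variables."""
    deg = {v: 0 for v in range(1, n_vars + 1)}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return mean(deg.values()), max(deg.values())

clauses = [(1, -2, 3), (2, 3, -4), (-1, 2, 4)]
print(degree_stats(variable_graph(clauses), n_vars=4))
```

Here the three clauses happen to connect every pair of the four variables, so every degree is 3.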


SLIDE 14

Outline

  • Features

  • Experimental Results

    – Variable Size Data
    – Fixed Size Data

SLIDE 15

Experimental Setup

  • Uniform random 3-SAT, 400 vars

  • Datasets (20000 instances each)

    – Variable-ratio dataset (1 CPU-month)
      • c/v uniform in [3.26, 5.26] (∴ c ∈ [1304, 2104])
    – Fixed-ratio dataset (4 CPU-months)
      • c/v = 4.26 (∴ v = 400, c = 1704)

  • Solvers

    – Kcnfs [Dubois and Dequen]
    – OKsolver [Kullmann]
    – Satz [Chu Min Li]

  • Quadratic regression with logistic response function

  • Training : test : validation split – 70 : 15 : 15
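The modeling step can be sketched with synthetic data: expand the raw features to a quadratic basis (all squares and pairwise products) and fit least squares to log runtime. The logistic response transform mentioned above is omitted here for brevity, and the "runtime" below follows a made-up quadratic law, so this only illustrates the basis expansion.

```python
# Sketch of quadratic regression on log runtime: quadratic basis expansion
# plus ordinary least squares. Synthetic data; logistic response omitted.
import numpy as np

def quadratic_basis(X):
    """Append an intercept plus all squared and pairwise feature products."""
    n, d = X.shape
    cols = [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack([np.ones(n), X] + cols)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))                    # two synthetic features
log_rt = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] * X[:, 1]   # made-up quadratic law
Phi = quadratic_basis(X)
w, *_ = np.linalg.lstsq(Phi, log_rt, rcond=None)         # fit on log runtime
pred = Phi @ w
print(float(np.max(np.abs(pred - log_rt))))              # near-zero residual
```

Because the basis contains exactly the terms of the generating law, the fit is essentially exact; real runtime data would of course leave residual error.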

SLIDE 16

Kcnfs Data

[Plot: 4 × Pr(SAT) and -2 log(Kcnfs runtime) vs. c/v over 3.3–5.3]

SLIDE 17

Kcnfs Data

[Scatter plot: runtime (s), log scale 0.01–1000, vs. clauses-to-variables ratio 3.26–5.26]


SLIDE 21

Variable Ratio Prediction (Kcnfs)

[Scatter plot: predicted vs. actual runtime, 0.01–1000 CPU sec, log scale]

SLIDE 22

Variable Ratio – UNSAT

[Scatter plot: predicted vs. actual runtime, 0.01–1000 CPU sec, log scale]

SLIDE 23

Variable Ratio – SAT

[Scatter plot: predicted vs. actual runtime, 0.01–1000 CPU sec, log scale]

SLIDE 24

Kcnfs vs. Satz (UNSAT)

[Scatter plot: Satz time vs. Kcnfs time, 0.01–1000 CPU sec, log scale]

SLIDE 25

Kcnfs vs. Satz (SAT)

[Scatter plot: Satz time vs. Kcnfs time, 0.01–1000 CPU sec, log scale]

SLIDE 26

Feature Importance – Variable Ratio

  • Subset selection can be used to identify features sufficient for approximating full model performance

  • Other (correlated) sets could potentially achieve similar performance

  Variable                          Cost of Omission
  |c/v - 4.26|                      100
  |c/v - 4.26|²                     69
  (v/c)² × SapsBestCVMean           53
  |c/v - 4.26| × SapsBestCVMean     33
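One common way to do the subset selection mentioned above is greedy forward selection: repeatedly add whichever feature most reduces training error, then read off a cost-of-omission-style ranking by refitting without each chosen feature. The sketch below uses synthetic data and plain least squares; it is not the deck's actual selection procedure.

```python
# Sketch of greedy forward feature selection with a least-squares model.
# Synthetic data: only features 0 and 2 actually influence the target.
import numpy as np

def rmse_with(X, y, idx):
    """Training RMSE of a least-squares fit using the feature columns idx."""
    Phi = np.column_stack([np.ones(len(X))] + [X[:, i] for i in idx])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return float(np.sqrt(np.mean((Phi @ w - y) ** 2)))

def forward_select(X, y, k):
    """Greedily add the feature that reduces RMSE most, k times."""
    chosen = []
    for _ in range(k):
        rest = [i for i in range(X.shape[1]) if i not in chosen]
        chosen.append(min(rest, key=lambda i: rmse_with(X, y, chosen + [i])))
    return chosen

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 4.0 * X[:, 2] + 1.0 * X[:, 0] + 0.1 * rng.normal(size=300)
print(forward_select(X, y, k=2))   # recovers the two informative features
```

As the slide notes, correlated features make such rankings non-unique: a different but correlated subset can achieve nearly the same error.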


SLIDE 29

Fixed Ratio Data

[Scatter plot: runtime (s), log scale 0.01–1000]

SLIDE 30

Fixed Ratio Prediction (Kcnfs)

[Scatter plot: predicted vs. actual runtime, 0.01–1000 CPU sec, log scale]

SLIDE 31

Feature Importance – Fixed Ratio

  Variable                                  Cost of Omission
  SapsBestSolMean²                          100
  SapsBestSolMean × MeanDPLLDepth           74
  GsatBestSolCV × MeanDPLLDepth             21
  VCGClauseMean × GsatFirstLMRatioMean      9


SLIDE 34

SAT vs. UNSAT

  • Training models separately for SAT and UNSAT instances:

    – Good models require fewer features
    – Model accuracy improves
    – c/v is no longer an important feature for VR data
    – Completely different features are useful for SAT than for UNSAT

  • Feature importance on SAT instances:

    – Local search features sufficient
      • 7 features for good VR model
      • 1 feature for good FR model (SAPSBestSolCV × SAPSAveImpMean)
    – If LS features omitted, LP + DPLL search space probing

  • Feature importance on UNSAT instances:

    – DPLL search space probing
    – Clause graph features

SLIDE 35

Beyond Ratio: Weighted CG Clustering Coefficient

  • Byproduct of our analysis: a very strong correlation between the weighted CG clustering coefficient and v/c

  • The clustering coefficient is a more fundamental concept than v/c, since it describes the structure of the constraints explicitly, not implicitly

    – Correlation between the (unweighted) clustering coefficient and hardness has been shown for other constraint problems (e.g., graph coloring, combinatorial auctions)

  • We have a proof sketch of this correlation
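For concreteness, the clustering coefficient in its simpler unweighted form: for a node, the fraction of pairs of its neighbours that are themselves connected. The slide's feature is a weighted variant computed on the clause graph; this toy version only illustrates the local quantity itself.

```python
# Sketch of the (unweighted) local clustering coefficient on a small graph.
from itertools import combinations

def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves adjacent."""
    nbrs = adj[node]
    pairs = list(combinations(sorted(nbrs), 2))
    if not pairs:
        return 0.0
    closed = sum(1 for u, v in pairs if v in adj[u])
    return closed / len(pairs)

# Triangle 0-1-2 plus a pendant node 3: of node 0's three neighbour pairs,
# only (1, 2) is connected.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_coefficient(adj, 0))   # 1 closed pair out of 3 -> 0.333...
```

Averaging this quantity over nodes (with edge weights, on the CG) gives the kind of explicit local-structure statistic the slide argues is more fundamental than the global c/v ratio.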

SLIDE 36

Conclusions

  • Can construct good models for DPLL solvers

  • These models can be analyzed to gain understanding of what makes instances hard or easy for solvers

  • Algorithm portfolios can be constructed (SATzilla)

  • More specifically:

    – Strong relationship between LS and DPLL search spaces
    – Our approach automatically identified the importance of c/v
    – SAT/UNSAT instances have very different performance characteristics; it helps to model them separately
    – The clustering coefficient explains why c/v is important in terms of local properties of the constraint graph
