The SAT 2005 Competition


SLIDE 1

The SAT 2005 Competition What’s new this year The benchmarks First stage results

All categories Random category Crafted category Industrial category

Second stage results

Random category Crafted category Industrial category

Certified UNSAT Special track Non clausal special track Next contest? Pseudo Boolean evaluation

The SAT 2005 Competition

Fourth Edition

Daniel Le Berre and Laurent Simon

Eighth International Conference on Theory and Applications of Satisfiability Testing, SAT’05

1 / 55

SLIDE 2

Agenda

◮ What’s new this year
◮ The benchmarks
◮ First stage results
  ◮ All categories
  ◮ Random category
  ◮ Crafted category
  ◮ Industrial category
◮ Second stage results
  ◮ Random category
  ◮ Crafted category
  ◮ Industrial category
◮ Certified UNSAT special track
◮ Non clausal special track
◮ Next contest?
◮ Pseudo Boolean evaluation

2 / 55

SLIDE 3

They support us

Thank you!

3 / 55

SLIDE 4

The new judges

Armin Biere: Specialist in industrial benchmarks and solvers.
Oliver Kullmann: Specialist in k-SAT. Generated all the benchmarks for the random category.
Allen van Gelder: Well aware of the CASC competition. Proposed the new scoring scheme. Managed the certified UNSAT special track.

All decisions were taken in agreement with the judges.

4 / 55

SLIDE 5

The special tracks

Certified UNSAT: a specific category in which the solvers must output a certificate of unsatisfiability. The proof format and a proof checker were provided by Allen van Gelder. Only two participants: zchaff and ttsp-3.0.

Pseudo Boolean evaluation: dedicated to solvers handling pseudo-Boolean constraints and optimization functions. Managed by Vasco Manquinho and Olivier Roussel (http://www.cril.univ-artois.fr/PB05/). 8 solvers (17 variants) from 8 submitters.

Non clausal evaluation: dedicated to solvers able to take gates as input. The input format was provided by Fahiem Bacchus and Toby Walsh. No solver submission. One benchmark submission.

5 / 55

SLIDE 6

What’s new in the rules

◮ Competition and Demonstration divisions.
  Competition: the source code of the solver must be available after the competition.
  Demonstration: a binary version of the solver must be available for research purposes.
◮ Participation in the competition must benefit the community
  ◮ By providing source code, binary or benchmarks
  ◮ By supporting the conference and the competition

6 / 55

SLIDE 7

The new scoring scheme

Benchmark purse: divided equally among the solvers able to solve the benchmark.
Speed purse: divided unequally among the solvers able to solve a given benchmark (faster solvers receive a larger share).
Series: an extra credit is given for each series solved.
Solver: its score is the sum of the credits obtained for the benchmarks it solved.
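The purse idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the purse sizes (1000, 1000, 3000), the speed weight `time_limit / (1 + cpu_seconds)`, and the rule that a series counts for a solver once it solves at least one instance of it are not given on the slide, and the names `purse_scores` and `series_of` are hypothetical.

```python
from collections import defaultdict


def purse_scores(results, series_of, time_limit,
                 solution_purse=1000.0, speed_purse=1000.0, series_purse=3000.0):
    """Sketch of a purse-based score.

    results:   benchmark -> {solver: cpu_seconds}, listing only solvers that solved it
    series_of: benchmark -> series name
    All purse sizes and the speed weighting are assumptions, not the official values.
    """
    scores = defaultdict(float)
    solved_series = defaultdict(set)  # series -> solvers with at least one solved instance
    for bench, times in results.items():
        if not times:
            continue  # nobody solved it: the purses of this benchmark are not awarded
        # Benchmark purse: equal shares among the solvers that solved the benchmark.
        equal_share = solution_purse / len(times)
        # Speed purse: unequal shares, proportional to an assumed speed weight.
        weights = {s: time_limit / (1.0 + t) for s, t in times.items()}
        total_weight = sum(weights.values())
        for solver in times:
            scores[solver] += equal_share + speed_purse * weights[solver] / total_weight
            solved_series[series_of[bench]].add(solver)
    # Series credit: assumed to be shared equally among the solvers that solved the series.
    for solvers in solved_series.values():
        for solver in solvers:
            scores[solver] += series_purse / len(solvers)
    return dict(scores)
```

With this design a solver that is the only one to solve a benchmark collects both purses of that benchmark in full, which rewards solving hard instances more than solving easy ones quickly.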

7 / 55

SLIDE 8

The new award scheme

◮ Three categories: industrial, crafted and random
◮ Three specialties: SAT, UNSAT and SAT+UNSAT
◮ Three medals: gold, silver and bronze

So we have a total of 27 awards this year!
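The count follows from multiplying the three lists (3 × 3 × 3). A small Python enumeration, given only as an illustration, makes the individual awards explicit:

```python
from itertools import product

categories = ["industrial", "crafted", "random"]
specialties = ["SAT", "UNSAT", "SAT+UNSAT"]
medals = ["gold", "silver", "bronze"]

# One award per (category, specialty, medal) combination.
awards = list(product(categories, specialties, medals))
print(len(awards))  # 27
```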

8 / 55

SLIDE 9

Invariants

◮ Only 3 solvers per submitter can enter the first stage, competition division.
◮ Only 1 solver per submitter can enter the second stage, competition division.

9 / 55

SLIDE 10

Random category

◮ 3-SAT, 5-SAT, 7-SAT
◮ From 400 to 10000 variables.
◮ 285 SAT and 105 UNSAT benchmarks
◮ Answers known in advance
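For readers unfamiliar with random k-SAT, the following Python sketch draws a uniform random instance and prints it in DIMACS CNF form. It is not the generator used for the competition (the slides do not describe it); the function names and the clause count in the usage note are assumptions chosen for illustration.

```python
import random


def random_ksat(num_vars, num_clauses, k, seed=None):
    """Uniform random k-SAT: each clause has k distinct variables with random signs."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)  # k distinct variables
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


def to_dimacs(num_vars, clauses):
    """Render the clauses as a DIMACS CNF string."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(str(lit) for lit in clause) + " 0" for clause in clauses]
    return "\n".join(lines) + "\n"
```

For example, `to_dimacs(400, random_ksat(400, 1704, 3, seed=0))` yields a 3-SAT instance with 400 variables; the ratio of about 4.26 clauses per variable is a commonly cited hard region for random 3-SAT and is used here only as an example value.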

10 / 55

SLIDE 11

Industrial category

Zarpas: New formal verification benchmarks from IBM (FV 2004)
Velev: Known VLIW-SAT (2.0 and 4.0), VLIW-UNSAT 2.0 and Liveness UNSAT 2.0
Grieu: VMPC inversion, open cryptographic problem
Narain: VPN models generated from Alloy
Maris: Planning benchmarks

Wider range of problems than in the previous edition.

11 / 55

SLIDE 12

Crafted category

Sat’04: Last year’s hard, unsolved benchmarks
Biere: LinvRinv benchmarks (proposed by Cook last year)
Sabharwal: Counting/Ordering/Pebbling problems
Jarvisalo: Based on 3-regular graphs
Lynce: Social Golfer problem (A golf problem in St Andrews?)
Sorge: Algebraic benchmarks
Markstrom: Problems generating long learned clauses.
Roussel: PHNF form of last year’s medium benchmarks

Wider range of problems than in the previous edition.

12 / 55

SLIDE 13

Environment

The hardware:

LRI  16 Athlon 1800+ with 1 GB RAM
UC   8 Athlon 1800+ with 2 GB RAM
     32 Pentium III 450 with 1 GB RAM

◮ Running GNU/Linux (RH flavor).
◮ Solvers compiled with GCC 3.3.5.
◮ Java solver using the Java 1.5.0_02 JVM.

Provided by:

◮ LINC Lab, Department of ECECS, University of Cincinnati
◮ LRI, Université de Paris-Sud

13 / 55

SLIDE 14

The first stage

◮ Aim: to detect the most promising solvers for a given (category, specialty)
◮ 20-minute timeout (longer than in previous years)
◮ Solvers answering incorrectly move to the demonstration division
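The slides do not say how incorrect answers were detected. One standard check, sketched here in Python as an assumption, is to verify that the model reported with a SAT answer really satisfies every clause; the function name `satisfies` is hypothetical.

```python
def satisfies(clauses, model):
    """Return True when the model makes at least one literal true in every clause.

    clauses: list of clauses, each a list of non-zero signed integers (DIMACS literals)
    model:   iterable of signed integers, one per assigned variable
    """
    true_literals = set(model)
    return all(any(lit in true_literals for lit in clause) for clause in clauses)
```

For instance, with the clauses `[[1, -2], [1, 2]]` the model `[1, -2]` passes the check, while `[-1, 2]` fails it because the first clause has no true literal.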

14 / 55

SLIDE 15

Overview on all the benchmarks (see posters)

[Figure: All solvers on All benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: tts-3-0 (153), lsatv1.1 (217), (SatELite_release) (294), kcnfs-2004 (305), adaptnovelty (315), saps (333), (rpaws5) (337), (kcnfs) (377), wllsatv1 (386), (rpaws40) (394), hsatrr (402), rpaws10 (402), rrsaps (432), vw (435), Dew_Satz_1c (441), Dew_Satz_1b (445), g2wsat (445), ranov (457), (Dew_Satz_1e) (467), (Dew_Satz_1d) (476), Dew_Satz_1a (488), compsat (587), hsat.5 (590), hsat.1 (591), sat4j.jar (596), (CirCUsB) (606), (CirCUsA) (621), march_dl (621), zchaff (630), HaifaSat2 (637), Jerusat1.31_A (641), Jerusat1.31_B (641), (CirCUsD) (643), (midisat_static) (644), HaifaSat (653), zchaff_rand (655), vallst.sh (667), csat (694), (eureka_B) (696), (eureka_C) (697), (eureka_A) (710), minisat_static (780), SatELiteGTI (818)
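A plot with these two axes is usually built from each solver's sorted run times. The Python sketch below shows one way to derive the points of a single curve; it assumes per-benchmark (not cumulative) CPU times and is not taken from the organizers' plotting scripts. The name `cactus_points` is hypothetical.

```python
def cactus_points(cpu_times, timeout):
    """Points (number solved, CPU seconds) for one solver.

    cpu_times: CPU seconds per benchmark, or None when the benchmark was not solved
    timeout:   runs above this limit are treated as unsolved
    """
    solved = sorted(t for t in cpu_times if t is not None and t <= timeout)
    # Point (i, t): the i-th fastest solved benchmark needed t seconds.
    return [(i, t) for i, t in enumerate(solved, start=1)]
```

The last point of each curve gives the total number of benchmarks the solver handled within the timeout, which corresponds to the counts listed next to the solver names.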

SLIDE 16

Clustering on all the benchmarks

[Figure: SAT 2005 Clustering of all solvers on all benchmarks. Axes: Solvers, Distance (#Benchs over 1657).]

Solvers in leaf order: Dew_Satz_1b, Dew_Satz_1c, Dew_Satz_1a, (Dew_Satz_1d), (Dew_Satz_1e), wllsatv1, (kcnfs), kcnfs-2004, adaptnovelty, saps, g2wsat, rpaws10, vw, (rpaws40), rrsaps, ranov, (rpaws5), lsatv1.1, tts-3-0, (siege4), (SatELite_release), (CirCUsA), (CirCUsB), (CirCUsD), vallst.sh, compsat, zchaff, csat, (eureka_A), (eureka_B), HaifaSat, HaifaSat2, zchaff_rand, (eureka_C), Jerusat1.31_A, Jerusat1.31_B, (midisat_static), sat4j.jar, minisat_static, SatELiteGTI, hsat.1, hsat.5, hsatrr, march_dl
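The axis label suggests that the distance between two solvers is counted in benchmarks. The slide does not define it, so the Python sketch below assumes one plausible reading: the number of benchmarks solved by exactly one of the two solvers. Both function names are hypothetical.

```python
def solver_distance(solved_a, solved_b):
    """Assumed distance: benchmarks solved by exactly one of the two solvers."""
    return len(set(solved_a) ^ set(solved_b))


def distance_matrix(solved_by):
    """Pairwise distances for a mapping solver -> set of solved benchmarks."""
    names = sorted(solved_by)
    return {(a, b): solver_distance(solved_by[a], solved_by[b])
            for a in names for b in names}
```

Under this reading, two solvers that solve exactly the same benchmarks are at distance 0 even if their run times differ, so the clustering groups solvers by what they solve rather than by how fast they are.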

SLIDE 17

Overview of the solvers on random benchmarks

[Figure: All solvers on Random benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: (CirCUsA) (3), compsat (3), HaifaSat (3), lsatv1.1 (3), (CirCUsD) (4), (CirCUsB) (5), (eureka_B) (5), HaifaSat2 (5), hsatrr (5), Jerusat1.31_A (7), (eureka_A) (9), (eureka_C) (9), Jerusat1.31_B (11), (SatELite_release) (12), zchaff_rand (12), zchaff (16), hsat.1 (17), vallst.sh (17), hsat.5 (19), csat (27), (midisat_static) (46), sat4j.jar (50), SatELiteGTI (54), minisat_static (56), wllsatv1 (67), (Dew_Satz_1d) (73), march_dl (74), Dew_Satz_1b (84), Dew_Satz_1c (85), Dew_Satz_1a (87), saps (101), adaptnovelty (107), (Dew_Satz_1e) (107), (rpaws40) (112), rrsaps (116), (rpaws5) (139), (kcnfs) (140), kcnfs-2004 (140), vw (148), rpaws10 (151), g2wsat (158), ranov (178)

SLIDE 18

Overview of the size of crafted benchmarks

[Figure: Distribution of the sizes of crafted benchmarks. Axis: # of literals, from 10^3 to 10^6, one row per benchmark series.]

Series labels as they appear in the plot (some are cut off on the left): biere05/linvrinv, jarvisalo05/mod2-3cage-unsat, jarvisalo05/mod2-3g14-sat, jarvisalo05/mod2-rand3bip-sat, arvisalo05/mod2-rand3bip-unsat, jarvisalo05/mod2c-3cage-unsat, jarvisalo05/mod2c-rand3bip-sat, rvisalo05/mod2c-rand3bip-unsat, lynce05/social-golfer-problem, markstrom05/eulcbip, markstrom05/pmg, roussel05/cnfcolor-PHNF, roussel05/equilarge-PHNF, roussel05/visbmc-PHNF, sabharwal05/counting/clqcolor/sat, al05/counting/clqcolor/unsat/set-a, al05/counting/clqcolor/unsat/set-b, abharwal05/counting/fclqcolor/sat, 05/counting/fclqcolor/unsat/set-a, 05/counting/fclqcolor/unsat/set-b, sabharwal05/counting/fphp/sat, rwal05/counting/fphp/unsat/easier, wal05/counting/fphp/unsat/harder, sabharwal05/counting/php/sat, rwal05/counting/php/unsat/easier, rwal05/counting/php/unsat/harder, harwal05/ordering/gt-ordering/sat, wal05/ordering/gt-ordering/unsat, rwal05/pebbling/grid-pebbling/sat, al05/pebbling/grid-pebbling/unsat, 05/pebbling/random-pebbling/sat, /pebbling/random-pebbling/unsat, abharwal05/planning/logistics/sat, harwal05/planning/logistics/unsat, sat04/gomes03, sorge05/QG6, sorge05/QG7/, sorge05/QG7a, sorge05/QG8

SLIDE 19

Overview of the solvers on crafted benchmarks

[Figure: All solvers on Crafted benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: kcnfs-2004 (84), (rpaws5) (97), tts-3-0 (98), lsatv1.1 (103), adaptnovelty (108), (kcnfs) (120), saps (121), rpaws10 (126), (SatELite_release) (140), (rpaws40) (148), g2wsat (156), vw (156), ranov (163), wllsatv1 (166), rrsaps (176), (Dew_Satz_1e) (192), hsatrr (206), Dew_Satz_1c (227), Dew_Satz_1b (232), Dew_Satz_1a (233), (Dew_Satz_1d) (234), sat4j.jar (256), compsat (269), HaifaSat2 (292), HaifaSat (294), hsat.5 (306), hsat.1 (307), (CirCUsB) (309), Jerusat1.31_B (312), zchaff (312), (eureka_C) (316), (CirCUsA) (318), (CirCUsD) (320), zchaff_rand (320), Jerusat1.31_A (321), csat (324), (midisat_static) (325), (eureka_B) (337), march_dl (337), (eureka_A) (342), minisat_static (348), SatELiteGTI (362), vallst.sh (368)

SLIDE 20

Overview of the size of industrial benchmarks

[Figure: Distribution of the sizes of industrial benchmarks. Axis: # of literals, from 10^3 to 10^7, one row per benchmark series.]

Series labels as they appear in the plot (one is cut off on the left): grieu05/vmpc, maris05/Depots, maris05/DriverLog, maris05/Ferry, maris05/Rovers, maris05/Satellite, narain05/vpn, v05/liveness-unsat-2-0, velev05/vliw-sat-2-0, velev05/vliw-sat-4-0, velev05/vliw-unsat-2-0, zarpas05/01, zarpas05/07, zarpas05/18, zarpas05/1_11, zarpas05/20, zarpas05/23, zarpas05/26, zarpas05/29, zarpas05/2_14

SLIDE 21

Overview of the solvers on industrial benchmarks

[Figure: All solvers on Industrial benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: tts-3-0 (55), kcnfs-2004 (81), adaptnovelty (100), (rpaws5) (101), lsatv1.1 (111), saps (111), ranov (116), (kcnfs) (117), rpaws10 (125), Dew_Satz_1b (129), Dew_Satz_1c (129), g2wsat (131), vw (131), (rpaws40) (134), rrsaps (140), (SatELite_release) (142), wllsatv1 (153), Dew_Satz_1a (168), (Dew_Satz_1e) (168), (Dew_Satz_1d) (169), hsatrr (191), march_dl (210), hsat.5 (265), hsat.1 (267), (midisat_static) (273), vallst.sh (282), sat4j.jar (290), (CirCUsB) (292), (CirCUsA) (300), zchaff (302), Jerusat1.31_A (313), compsat (315), Jerusat1.31_B (318), (CirCUsD) (319), zchaff_rand (323), HaifaSat2 (340), csat (343), (eureka_B) (354), HaifaSat (356), (eureka_A) (359), (eureka_C) (372), minisat_static (376), SatELiteGTI (402)

SLIDE 22

The second stage

22 / 55

SLIDE 23

And Now....

The final results

23 / 55

SLIDE 24

Overview of the solvers on random benchmarks

[Figure: Second Stage: All solvers on Random benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: minisat_static (78), SatELiteGTI (79), march_dl (99), saps (104), wllsatv1 (104), Dew_Satz_1a (118), adaptnovelty (119), kcnfs-2004 (167), vw (170), g2wsat (178), ranov (209)

SLIDE 25

Random SAT specialty, the winners

1. ranov
2. g2wsat
3. vw

25 / 55

SLIDE 26

Random SAT specialty, the complete ranking

Solver        Score   SAT answers  UNSAT answers
ranov         163903  209          -
g2wsat        101286  178          -
vw            76002   170          -
adaptnovelty  21748   119          -
saps          15603   104          -
kcnfs-2004    14604   92           -
dewSatz-1a    8943    68           -
march-dl      7444    56           -
wllsatv1      7202    59           -
satELiteGTI   5198    46           -
minisat       5147    45           -

26 / 55

SLIDE 27

Random UNSAT specialty, the winners

1. kcnfs-2004
2. march-dl
3. dewSatz-1a

27 / 55

SLIDE 28

Random UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
kcnfs-2004   97930  -            75
march-dl     25228  -            43
dewSatz-1a   19456  -            50
wllsatv1     12902  -            45
minisat      7369   -            33
satELiteGTI  7335   -            33

28 / 55

SLIDE 29

Random SAT+UNSAT specialty, the winners

1. kcnfs-2004
2. march-dl
3. dewSatz-1a

29 / 55

SLIDE 30

Random SAT+UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
kcnfs-2004   95075  92           75
march-dl     27141  56           43
dewSatz-1a   22940  68           50
wllsatv1     16145  59           45
satELiteGTI  10074  46           33
minisat      10058  45           33

30 / 55

SLIDE 31

Overview of the solvers on crafted benchmarks

based on a selection of the first stage benchmarks

[Figure: Second Stage: All solvers on Crafted benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: tts-3-0 (102), hsat.1 (324), Jerusat1.31_A (347), zchaff (356), march_dl (357), zchaff_rand (358), csat (371), minisat_static (379), vallst.sh (387), SatELiteGTI (400)

SLIDE 32

Crafted SAT specialty, the winners

1. vallst
2. march-dl
3. hsat-1

32 / 55

SLIDE 33

Crafted SAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
vallst       31258  138          -
march-dl     27656  138          -
hsat-1       20156  130          -
satELiteGTI  17418  122          -
minisat      17210  122          -
csat         13791  113          -
zchaff       13692  112          -
zchaff-rand  11431  107          -
jerusat-A    10702  104          -
tts          475    5            -

33 / 55

SLIDE 34

Crafted UNSAT specialty, the winners

1. satELiteGTI
2. minisat
3. vallst and march-dl

34 / 55

SLIDE 35

Crafted UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
satELiteGTI  35639  -            126
minisat      26159  -            121
vallst       25532  -            100
march-dl     25371  -            99
csat         23878  -            112
tts-3-0      20765  -            54
hsat-1       19936  -            90
zchaff       14359  -            89
zchaff-rand  12419  -            78
jerusat-A    9275   -            77

35 / 55

SLIDE 36

Crafted SAT+UNSAT specialty, the winners

1. vallst
2. satELiteGTI
3. march-dl

36 / 55

SLIDE 37

Crafted SAT+UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
vallst       56445  138          100
satELiteGTI  53128  122          126
march-dl     52432  138          99
minisat      43691  122          121
hsat-1       39497  130          90
csat         38324  113          112
zchaff       27455  112          89
zchaff-rand  24171  107          78
tts-3-0      21298  5            54
jerusat-A    19632  104          77

37 / 55

SLIDE 38

Overview of the solvers on original industrial benchmarks

[Figure: Second Stage: All solvers on renamed Industrial benchmarks. Axes: #Solved, CPU time needed (s).]

Solvers in legend order, with the number of benchmarks solved: wllsatv1 (92), hsat.5 (153), vallst.sh (154), sat4j.jar (180), compsat (189), zchaff (197), zchaff_rand (226), csat (231), HaifaSat (242), Jerusat1.31_B (243), minisat_static (250), SatELiteGTI (267)

SLIDE 39

Industrial SAT specialty, the winners

1. satELiteGTI
2. minisat
3. jerusat-B and haifaSat

39 / 55

SLIDE 40

Industrial SAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
satELiteGTI  73506  180          -
minisat      50985  166          -
jerusat-B    38625  163          -
haifaSat     28428  151          -
zchaff-rand  24885  132          -
csat         21997  140          -
zchaff       19236  121          -
compsat      16715  114          -
sat4j        12898  110          -
wllsatv1     11390  86           -
hsat-5       11046  99           -
vallst       7757   85           -

40 / 55

SLIDE 41

Industrial UNSAT specialty, the winners

1. satELiteGTI
2. zchaff-rand
3. haifaSat

41 / 55

SLIDE 42

Industrial UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
satELiteGTI  27518  -            87
zchaff-rand  26792  -            94
haifaSat     23666  -            91
minisat      19863  -            84
csat         15892  -            91
zchaff       13829  -            76
jerusat-B    10225  -            80
hsat-5       10029  -            54
vallst       9192   -            69
compsat      9097   -            75
sat4j        8654   -            70
wllsatv1     1053   -            6

42 / 55

SLIDE 43

Industrial SAT+UNSAT specialty, the winners

1. satELiteGTI
2. minisat
3. zchaff-rand and haifaSat

43 / 55

SLIDE 44

Industrial SAT+UNSAT specialty, the complete ranking

Solver       Score  SAT answers  UNSAT answers
satELiteGTI  99662  180          87
minisat      69485  166          84
haifaSat     50931  151          91
zchaff-rand  50515  132          94
jerusat-B    47487  163          80
csat         36526  140          91
compsat      25399  114          75
zchaff       31702  121          76
sat4j        21097  110          70
hsat-5       20995  99           54
vallst       16874  85           69
wllsatv1     12467  86           6

44 / 55

SLIDE 45

45 / 55

SLIDE 46

The proof format

All of this work was done by Allen van Gelder.

◮ For each benchmark bench found UNSAT, a proof file bench.proof must be given.
◮ There are two possibilities for the proof format:
  Resolution format: every resolution step is given, and the resolvents are given explicitly.
  Trace format: only the resolution steps are given, not the resolvents. More compact format.
◮ A checker can check a proof using the resolution format.
◮ The trace format can be converted into the resolution format.

46 / 55

SLIDE 47

Examples of the proof formats

Example (Original input file)

p cnf 2 3
1 -2 0
1 2 0
-1 0

Example (possible proof, resolution format)

4 2 1 2 2 1 1 2 5 1 3 4 0 0

Example (possible proof, trace format)

4 2 1 2 5 1 3 4
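The difference between the two formats can be illustrated with a small Python sketch that replays resolution steps and computes the resolvents a trace leaves out. It assumes the three input clauses are (1 or -2), (1 or 2) and (-1). The step representation `(new_id, [antecedent ids])` is hypothetical and is not a parser for the proof file syntax shown above.

```python
def resolve(c1, c2):
    """Resolvent of two clauses (frozensets of literals) that clash on exactly one variable."""
    pivots = [lit for lit in c1 if -lit in c2]
    if len(pivots) != 1:
        raise ValueError("clauses must clash on exactly one literal")
    pivot = pivots[0]
    return (c1 - {pivot}) | (c2 - {-pivot})


def replay(clauses, steps):
    """Recompute the resolvents of a trace.

    clauses: dict id -> frozenset of literals (the input clauses)
    steps:   list of (new_id, [antecedent ids]), resolved left to right
    Returns the clause database including every derived clause.
    """
    db = dict(clauses)
    for new_id, antecedents in steps:
        current = db[antecedents[0]]
        for clause_id in antecedents[1:]:
            current = resolve(current, db[clause_id])
        db[new_id] = current
    return db


# Input clauses numbered 1 to 3, then two derived clauses numbered 4 and 5.
example_input = {1: frozenset({1, -2}), 2: frozenset({1, 2}), 3: frozenset({-1})}
example_db = replay(example_input, [(4, [1, 2]), (5, [3, 4])])
```

Here clause 4 is the unit clause (1) and clause 5 is the empty clause, which certifies unsatisfiability. Writing the recomputed resolvents next to each step is, in spirit, what turns a trace into a resolution-style proof.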

47 / 55

SLIDE 48

The contestants

zchaff: uses the resolution format
ttsp: uses the trace format

◮ The solvers behave quite differently: only 21 benchmarks are found UNSAT by both solvers.
◮ They do not use the same output format.

48 / 55

slide-49
SLIDE 49

Zchaff vs TTS: running time

CPU time    TTS                                ZCHAFF
Benchmark   CERT      ORIG    Cost      Size   CERT    ORIG    Cost    Size
sat05-561   0,01      0,02    -50,00    5                      -25,01  7
sat05-562   6,83      4,38    55,94     415    0,09    0,01    633,33  982
sat05-1273  (471,36)  286,59  64,47            12,55   8,42    49,13   54066
sat05-1171  1,4       1,36    2,94      107    6,42    5,06    26,94   19907
sat05-1172  7,89      7,06    11,76     594    41,11   34,     19,84   91341
sat05-1185  2,15      1,87    14,97     245    6,28    5       25,66   15153
sat05-1186  12,26     9,71    26,26     1377   65,28   56,11   16,35   132093
sat05-1213  90,97     3,83    2275,20   9254   12,18   9,57    27,20   28493
sat05-1214  (403,76)  21,41   1785,85          45,39   40,07   13,29   84726
sat05-1227  (523,35)  6,69    7722,87          8,08    6,33    27,53   24433
sat05-1228  (600,91)  30,27   1885,17          60,79   50,88   19,50   102763
sat05-2308  143,54    7,76    1749,74   41533  30,23   26,4    14,51   43531
sat05-2309  (153,31)  7,56    1927,91          16,03   13,37   19,91   22741
sat05-2323  (162,45)  12,75   1174,12          85,7    69,31   23,65   53554
sat05-2325  (124,59)  12,78   874,88           232,9   195,33  19,23   91270
sat05-2594  7,93      0,96    726,04    5153   34,92   27,41   27,37   50009
sat05-2595  45,5      1,12    3962,50   16756  5,75    3,97    44,74   18110
sat05-2596  161,96    1,23    13067,48  49384  25,8    19,66   31,23   39092
sat05-2610  221,81    2,08    10563,94  61406  71,13   58,1    22,42   65084
sat05-2625  50,79     2,6     1853,46   27962  40,02   32,01   25,03   41484
sat05-2654  (196,8)   7,39    2563,06          205,39  174,87  17,45   89099

Size in KB.

49 / 55
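
The Cost column appears to be the relative overhead of producing the certificate, in percent: (CERT - ORIG) / ORIG × 100. This reading is inferred from the numbers, not stated on the slide. It reproduces several rows exactly, and the small deviations in other rows are presumably due to rounding of the displayed times. The helper names below are illustrative.

```python
def parse_time(text):
    """Convert a time written with a decimal comma, e.g. '6,83', to a float."""
    return float(text.replace(",", "."))


def overhead_percent(cert, orig):
    """Relative extra CPU time of the certified run over the original run."""
    return (cert - orig) / orig * 100.0


# (CERT, ORIG) pairs copied from the TTS columns of the table.
tts_times = {
    "sat05-561": ("0,01", "0,02"),
    "sat05-562": ("6,83", "4,38"),
    "sat05-1171": ("1,4", "1,36"),
    "sat05-1213": ("90,97", "3,83"),
}

tts_cost = {
    name: round(overhead_percent(parse_time(cert), parse_time(orig)), 2)
    for name, (cert, orig) in tts_times.items()
}
```

A negative value, as for sat05-561, means the certified run happened to be faster than the original one, which is plausible only for very short running times.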

slide-50
SLIDE 50

Size of the Problems

sat05-561   24   61
sat05-562   90   262
sat05-1273  116  1362
sat05-1171  80   370
sat05-1172  120  672
sat05-1185  90   415
sat05-1186  132  738
sat05-1213  80   650
sat05-1214  120  1212
sat05-1227  90   775
sat05-1228  132  1398
sat05-2308  120  480
sat05-2309  120  480
sat05-2323  140  560
sat05-2325  140  560
sat05-2594  90   240
sat05-2595  90   240
sat05-2596  90   240
sat05-2610  105  280
sat05-2625  120  320
sat05-2654  150  400

50 / 55

slide-51
SLIDE 51

Zchaff certificates on easy UNSAT IBM benchmarks

Bench             Cert      Orig  Cost (%)  Size (KB)
01 SAT dat.k10    0,9       0,23  284,19    12656
07 SAT dat.k30    4,21      3,41  23,41     11353
07 SAT dat.k35    5,11      4,29  19,02     11836
18 SAT dat.k10    35,93     0,32  10957,58  (711381)
18 SAT dat.k15    (103,42)  3,64  2743,20   -
1 11 SAT dat.k10  3,59      0,64  462,38    54786
1 11 SAT dat.k15  30,08     3,65  724,87    (525236)
20 SAT dat.k10    3,02      0,29  931,40    54839
23 SAT dat.k10    0,78      0,23  236,80    10251
23 SAT dat.k15    (98,71)   1,23  7913,41   -
26 SAT dat.k10    12,34     7,95  55,27     148
2 14 SAT dat.k10  15,09     0,48  3017,36   (286783)
2 14 SAT dat.k15  (102,2)   1,49  6778,59   -

51 / 55
slide-52
SLIDE 52

Size of the "easy" IBM benchmarks

Benchmark               # var  # clauses  Size
01 SAT dat.k10.cnf      9275   38802      12656
07 SAT dat.k30.cnf      11081  31034      11353
07 SAT dat.k35.cnf      12116  33469      11836
1 11 SAT dat.k10.cnf    28280  111519     54786
(1 11 SAT dat.k15.cnf)  44993  178110     (525236)
(18 SAT dat.k10.cnf)    17141  69989      (711381)
—18 SAT dat.k15.cnf—    25915  106325
20 SAT dat.k10.cnf      17567  72087      54839
(2 14 SAT dat.k10.cnf)  12859  49351      (286783)
—2 14 SAT dat.k15.cnf—  20302  78395
23 SAT dat.k10.cnf      18612  76086      10251
—23 SAT dat.k15.cnf—    29106  119635
26 SAT dat.k10.cnf      55591  277611     148

52 / 55

slide-53
SLIDE 53

The reasons for failure

Input format debatable: Several standard formats already exist for gate descriptions (netlists, trace format, ...).
No benchmarks to play with: The only benchmarks currently available in the edimacs format are the ones submitted for the special track.
No solver to play with: There is currently no solver able to read the edimacs input format.

53 / 55

slide-54
SLIDE 54

Next SAT contest?

Will be in 2007... book your T-shirts now!

54 / 55

slide-55
SLIDE 55

The first pseudo boolean solver evaluation

To be presented by Olivier Roussel and Vasco Manquinho

55 / 55