
Methodology for Comparison and Ranking of SAT Solvers

Mladen Nikolić, Third Workshop on Formal and Automated Theorem Proving and Applications, January 29, 2010.


  1. Title slide: Methodology for Comparison and Ranking of SAT Solvers. Mladen Nikolić, Third Workshop on Formal and Automated Theorem Proving and Applications, January 29, 2010.

  2. Overview: 1 Introduction, 2 Preliminaries, 3 Methodology, 4 Evaluation, 5 Related work, 6 Conclusions

  3. Overview (start of section 1, Introduction)

  4. Comparison of SAT solvers
     - SAT solvers
     - Importance of SAT solver comparison
       - Large number of proposed modifications each year
       - Their usefulness is not self-evident
       - We need to discriminate better between good and bad ideas
     - Current approach
       - Unreliable
       - Sometimes inconclusive
       - No discussion of whether the observed difference could arise by chance

  5. Motivation

     | Solver         | Graph coloring (best) | Graph coloring (worst) | Industrial (best) | Industrial (worst) |
     |----------------|-----------------------|------------------------|-------------------|--------------------|
     | MiniSAT 09z    | 180                   | 157                    | 159               | 112                |
     | minisat cumr r | 190                   | 180                    | 150               | 108                |
     | minisat2       | 200                   | 183                    | 140               | 93                 |
     | MiniSat2hack   | 200                   | 183                    | 141               | 94                 |

  6. Main goals
     - Eliminate chance effects from the comparison
     - Decide whether there is an overall positive or negative effect
     - Provide information on the statistical significance of the difference

  7. Main difficulties
     - Censored observations
     - Comparison of distributions of solving times for one instance
     - Combining conclusions obtained on individual instances

  8. Overview (start of section 2, Preliminaries)

  9. Statistical hypothesis testing
     - Null hypothesis $H_0$
     - Test statistic $T$
     - $p = P(|T| \ge t \mid H_0)$
     - If $p < \alpha$, reject $H_0$
     - Effect size

  10. Comparing two distributions

  11. Point biserial correlation
      - The point biserial correlation $\rho_{pb}$ can be estimated by

        $$r_{pb} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}$$

      - $\rho_{pb}, r_{pb} \in [-1, +1]$
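
As a minimal sketch of the estimator on this slide, the function below pools the solving times of two solvers into $X$ and uses a group indicator as $Y$. The 0/1 coding of the indicator, the function name `point_biserial`, and the sample times are assumptions made for illustration; the slide does not fix them. Censored runs (timeouts) are not handled here.

```python
from math import sqrt

def point_biserial(times_a, times_b):
    """Sample point biserial correlation between solving time and solver label.

    Assumption: runs of the first solver are labelled 0 and runs of the
    second solver are labelled 1, so a positive value means the second
    solver tends to have the larger solving times.
    """
    values = list(times_a) + list(times_b)
    labels = [0.0] * len(times_a) + [1.0] * len(times_b)
    n = len(values)
    mean_x = sum(values) / n
    mean_y = sum(labels) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(values, labels))
    denominator = (sqrt(sum((x - mean_x) ** 2 for x in values))
                   * sqrt(sum((y - mean_y) ** 2 for y in labels)))
    return numerator / denominator

# Hypothetical solving times (seconds) of two solvers on one formula.
r = point_biserial([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])  # about 0.878
```

Swapping the two arguments flips the sign of the result, which matches the antisymmetric $\rho_{pb}$ entries one would expect in a pairwise comparison table.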

  12. Point biserial correlation

  13. Handling censored data
      - Gehan statistic $W_G$
      - $E(W_G) = P(X > Y) - P(X < Y)$
      - $\frac{1 - E(W_G)}{2} = P(X < Y)$
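
The sketch below shows one way a Gehan-style statistic can be computed when some runs hit a timeout. It assumes right-censoring (a timed-out run is only known to take longer than the cutoff), represents each run as a hypothetical `(time, solved)` pair, and divides the pairwise score by $n_1 n_2$ so that its expectation has the form given on the slide. These representation choices and the function names are assumptions, not taken from the talk.

```python
def gehan_statistic(xs, ys):
    """Normalized Gehan score for two samples of (time, solved) pairs.

    A pair contributes +1 when the X run certainly took longer, -1 when it
    certainly took less time, and 0 when censoring leaves the order unknown.
    """
    total = 0
    for tx, solved_x in xs:
        for ty, solved_y in ys:
            if solved_y and (tx > ty or (not solved_x and tx == ty)):
                total += 1   # Y finished at ty, X was still running at ty
            elif solved_x and (tx < ty or (not solved_y and tx == ty)):
                total -= 1   # X finished at tx, Y was still running at tx
    return total / (len(xs) * len(ys))

def estimate_pi(xs, ys):
    """Estimate P(X < Y) from the slide's relation (1 - E(W_G)) / 2."""
    return (1 - gehan_statistic(xs, ys)) / 2

# Hypothetical runs with a 10-second timeout; False marks a timed-out run.
runs_x = [(2.0, True), (5.0, True), (10.0, False)]
runs_y = [(1.0, True), (3.0, True), (10.0, False)]
w = gehan_statistic(runs_x, runs_y)   # 2/9 for this data
pi_hat = estimate_pi(runs_x, runs_y)
```

Two timed-out runs contribute nothing to the score, since their order cannot be determined from the observed data.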

  14. Overview (start of section 3, Methodology)

  15. Sketch of the methodology
      - $H_0$: no difference in solver performance
      - Choose the level of statistical significance $\alpha$
      - Calculate the differences $d_i$ between the samples of solving times on each formula $F_i$
      - Under the null hypothesis, the average of the $d_i$ should not be too large
      - Estimate the $p$ value and check the significance of the average difference
      - Check and interpret the effect size

  16. Choice of function $d$
      - What could be a good choice for the function $d$?
      - $\rho_{pb}$?
      - $\pi = P(X < Y)$?

  17. Choice of function $d$
      - Theorem: under some reasonable conditions the following relations hold

        $$W_G = \frac{S_R S_Y}{n_1 n_2}\, r_{pb} \qquad (1)$$

        $$\frac{\mathrm{var}(W_G)}{\dfrac{S_R^2 S_Y^2}{n_1^2 n_2^2}\, \mathrm{var}(r_{pb})} \to 1 \quad (n_1 + n_2 \to \infty) \qquad (2)$$

        where

        $$S_X = \sqrt{\sum_{i=1}^{n_1 + n_2} (X_i - \bar{X})^2}$$

  18. Determining statistical significance
      - How is the average of the $d_i$ distributed (choosing $r_{pb}$ for $d_i$)?

        $$\bar{z} = \frac{1}{M} \sum_{i=1}^{M} z(r_i)$$

        $$\bar{z} \sim N\left(\frac{1}{M} \sum_{i=1}^{M} z(\rho_i),\ \frac{1}{M^2} \sum_{i=1}^{M} \frac{\mathrm{var}(r_i)}{(1 - r_i^2)^2}\right)$$
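
The following sketch turns the distribution above into a two-sided significance check. It assumes that $z$ is Fisher's transformation $z(r) = \operatorname{artanh}(r)$, which is consistent with the $(1 - r_i^2)^2$ factor in the variance but is not stated on the slide. It also assumes that the null hypothesis sets every $\rho_i$ to 0, and that the per-formula variances $\mathrm{var}(r_i)$ are supplied by the caller. The function name and the numbers are hypothetical.

```python
from math import atanh, erfc, sqrt

def combined_significance(rs, variances):
    """Mean transformed correlation, its variance, and a two-sided p value.

    rs        -- per-formula estimates r_i, each strictly between -1 and 1
    variances -- per-formula estimates of var(r_i), supplied by the caller
    """
    m = len(rs)
    z_bar = sum(atanh(r) for r in rs) / m
    var_z = sum(v / (1 - r * r) ** 2 for r, v in zip(rs, variances)) / m ** 2
    statistic = z_bar / sqrt(var_z)
    # Under H0 the statistic is approximately standard normal, so the
    # two-sided p value is 2 * (1 - Phi(|statistic|)) = erfc(|statistic| / sqrt(2)).
    p_value = erfc(abs(statistic) / sqrt(2))
    return z_bar, var_z, p_value

# Hypothetical per-formula correlations and variances.
z_bar, var_z, p_value = combined_significance([0.5, 0.5], [0.01, 0.01])
significant = p_value < 0.05
```

With these inputs the averaged value lies several standard deviations from zero, so the difference would be reported as significant at $\alpha = 0.05$.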

  19. Determining effect size
      - Averages of the estimates of $\rho_{pb}$ or $\pi$ on individual formulae

  20. Ranking
      - Potential problems with transitivity: $P(A > B) > \frac{1}{2},\ P(B > C) > \frac{1}{2} \not\Rightarrow P(A > C) > \frac{1}{2}$
      - Kendall-Wei method
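
The Kendall-Wei method ranks the participants of a round-robin tournament by the principal eigenvector of the win matrix, so that a win over a strong opponent counts for more than a win over a weak one. The sketch below approximates that eigenvector by power iteration. The slide does not say how the talk builds the matrix from pairwise solver comparisons, so the 0/1 win matrix, the function name, and the fixed iteration count are all assumptions. The example tournament is strongly connected, which the iteration needs in order to converge to a positive score vector.

```python
def kendall_wei_scores(wins, iterations=200):
    """Approximate the principal eigenvector of a win matrix by power iteration.

    wins[i][j] is 1 if participant i beat participant j, otherwise 0.
    """
    n = len(wins)
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [sum(wins[i][j] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(scores)
        scores = [s / total for s in scores]  # keep the vector normalized
    return scores

# Hypothetical tournament of four solvers containing cycles.
wins = [
    [0, 1, 1, 0],  # solver 0 beat solvers 1 and 2
    [0, 0, 1, 1],  # solver 1 beat solvers 2 and 3
    [0, 0, 0, 1],  # solver 2 beat solver 3
    [1, 0, 0, 0],  # solver 3 beat solver 0
]
scores = kendall_wei_scores(wins)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

In this example solvers 2 and 3 each have a single win, but solver 3 ends up ranked above solver 2 because its win is over the top-ranked solver.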

  21. Overview (start of section 4, Evaluation)

  22. Results of comparison
      - $\alpha = 0.05$
      - Only the difference between $S_3$ and $S_4$ is insignificant

      $\rho_{pb}$:

      |       | $S_1$  | $S_2$  | $S_3$  | $S_4$ |
      |-------|--------|--------|--------|-------|
      | $S_1$ | -      | 0.326  | 0.636  | 0.636 |
      | $S_2$ | -0.326 | -      | 0.465  | 0.464 |
      | $S_3$ | -0.636 | -0.465 | -      | 0.010 |
      | $S_4$ | -0.636 | -0.464 | -0.010 | -     |

      $\pi$:

      |       | $S_1$ | $S_2$ | $S_3$ | $S_4$ |
      |-------|-------|-------|-------|-------|
      | $S_1$ | -     | 0.320 | 0.140 | 0.141 |
      | $S_2$ | 0.680 | -     | 0.239 | 0.239 |
      | $S_3$ | 0.860 | 0.761 | -     | 0.506 |
      | $S_4$ | 0.859 | 0.761 | 0.494 | -     |

  23. How many shuffles do we need?

  24. Overview (start of section 5, Related work)
