The Rôle of Benchmarking in Symbolic Computation (Position Paper)
James Davenport
Thanks to Alessandro Cimatti for the SMT graphs
University of Bath
21 September 2018
James Davenport (Bath) Rôle of Benchmarking in Symbolic Computation
1 / 26
1 Summary
2 The sparse weakness of complexity theory
3 Benchmarking against problem sets
4 Conclusions
1 deciding if a polynomial is square-free
2 deciding if two polynomials have a non-trivial g.c.d.
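For dense univariate polynomials over the rationals, both decision problems above reduce to a Euclidean g.c.d. computation: a polynomial f is square-free exactly when gcd(f, f') is constant. The following is a minimal sketch under that assumption. The function names and the dense coefficient-list representation (lowest degree first) are illustrative choices, not taken from the talk, and the sketch says nothing about the cost measured in the size of a sparse encoding.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (dense list, lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a divided by b, with exact rational arithmetic."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)  # the leading term has cancelled exactly
    return a


def poly_gcd(a, b):
    """Monic g.c.d. by the Euclidean algorithm (empty list for gcd(0, 0))."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    if a:
        lead = Fraction(a[-1])
        a = [Fraction(c) / lead for c in a]
    return a


def derivative(p):
    """Formal derivative of a dense coefficient list."""
    return trim([i * c for i, c in enumerate(p)][1:])


def has_nontrivial_gcd(a, b):
    """True if a and b share a factor of positive degree."""
    return len(poly_gcd(a, b)) > 1


def is_square_free(p):
    """For nonzero p over the rationals: square-free iff gcd(p, p') is constant."""
    return not has_nontrivial_gcd(p, derivative(p))
```

For example, `is_square_free([-1, 0, 1])` tests x^2 - 1 and `is_square_free([1, -2, 1])` tests (x - 1)^2. The dense list has one entry per degree, so its length grows with the degree rather than with the number of non-zero terms.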
[Figure: two plots of time (logarithmic scale, 0.01 to 1000) against # of instances.
Panel 1: SMT(NRA), SAT+UNSAT Benchmarks (no-meti-tarski). Solvers: MathSAT, CVC4, Z3, Yices, SMT-RAT, dReal, iSAT3, virtual-best.
Panel 2: VMT(NTA), Safe+Unsafe Benchmarks. Solvers: IncreLin-nuXmv, StaticLin-nuXmv, K-induction-NTA-MathSAT, K-induction-NTA-dReal, BMC-NTA-MathSAT, BMC-NTA-dReal, virtual-best.]
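The "virtual-best" series in plots of this kind is conventionally the per-instance minimum over all solvers, and each curve plots sorted solving times against the number of instances solved. The sketch below shows how such series could be computed from raw timings. The solver names, instance names and timings are hypothetical, and `None` is assumed to mark an unsolved instance; none of this is data from the talk.

```python
def virtual_best(results):
    """Per-instance minimum time over all solvers; None if nobody solved it.

    `results` maps solver name -> {instance name -> time in seconds or None}.
    """
    instances = sorted({name for times in results.values() for name in times})
    best = {}
    for name in instances:
        solved = [times[name] for times in results.values()
                  if times.get(name) is not None]
        best[name] = min(solved) if solved else None
    return best


def cactus_curve(times):
    """Sorted solve times: entry k-1 is the time limit needed to solve k instances."""
    return sorted(t for t in times if t is not None)


def solved_within(times, limit):
    """Number of instances solved within `limit` seconds."""
    return sum(1 for t in times if t is not None and t <= limit)


# Hypothetical timings, for illustration only.
results = {
    "solver_a": {"p1": 0.5, "p2": None, "p3": 12.0},
    "solver_b": {"p1": 2.0, "p2": 30.0, "p3": None},
}
vbs = virtual_best(results)
curve = cactus_curve(vbs.values())
```

In this toy data each solver solves two instances, while the virtual best solves all three, which is the kind of gap between individual solvers and the portfolio that such a plot makes visible.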
[Figure: two scatter plots of Yices against MathSAT, both axes logarithmic from 0.1 to 1000.]
1 Complexity theory has served us well
2 But it has its limits, especially in the presence of NP-hardness
3 Other fields (SAT, SMT) handle this differently
4 But for this we need large corpora of problems, both “hard”
5 And a better (and more honest) benchmarking methodology
6 Contests certainly haven’t hurt SAT
7 But these evolve, unlike “Top 500”, which has hurt HPC