DM841 DISCRETE OPTIMIZATION Part 2 – Heuristics
Experimental Analysis
Marco Chiarandini
Department of Mathematics & Computer Science University of Southern Denmark
Outline Inferential Statistics Sequential Testing Algorithm Selection
◮ We work with samples (instances, solution-quality measurements) ◮ But we want sound conclusions: generalization to a given population
◮ Thus we need statistical inference
◮ There is a competition in which two stochastic algorithms A1 and A2 take part.
◮ We run both algorithms once on n instances.
◮ p: probability that A1 wins on each instance (+) ◮ n: number of runs without ties ◮ Y : number of wins of algorithm A1
[Figure: binomial distribution with Trials = 30 and probability of success = 0.5; x-axis: number of successes, y-axis: probability mass]
[Figure: binomial distribution with Trials = 30 and probability of success = 0.5, lower tail; x-axis: number of successes y, y-axis: Pr[Y = y]]
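Under the null hypothesis that both algorithms are equally good (p = 0.5), the win count Y follows this binomial distribution. A minimal Python sketch of the resulting two-sided sign test, using only the standard library (the function name and the example counts are illustrative, not from the slides):

```python
from math import comb

def sign_test_p(wins: int, n: int) -> float:
    """Two-sided sign (binomial) test for H0: p = 0.5.

    wins: number of instances on which A1 beat A2,
    n: number of runs without ties. Returns the exact p-value.
    """
    # probability of an outcome at least as extreme as `wins` in either tail
    k = min(wins, n - wins)
    tail = sum(comb(n, i) for i in range(0, k + 1)) * 0.5 ** n
    return min(1.0, 2 * tail)

# A1 wins 22 of 30 tie-free runs: is that compatible with p = 0.5?
print(round(sign_test_p(22, 30), 4))  # 0.0161, so H0 is rejected at alpha = 0.05
```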
◮ Assume that the data are consistent with a null hypothesis H0 (e.g., the two samples come from populations with the same mean)
◮ Use a statistical test to compute how likely the observed data are under H0 (the p-value)
◮ Do not reject H0 if the p-value is larger than a user-defined threshold α
◮ Otherwise (p-value < α), H0 is rejected in favor of an alternative hypothesis H1
◮ Consequences:
◮ allows inference from a sample ◮ allows modeling errors in measurements: X = μ + ε
◮ Issues:
◮ n should be large enough ◮ μ and σ must be known
[Figure: density of a Weibull distribution (shape = 1.4), and densities of the standardized sample mean

Z = (X̄ − μ) / (σ/√n)

for n = 1, 5, 15, 50: as n grows, the distribution of Z approaches the standard normal]
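The convergence in the figure can be reproduced by simulation. A sketch in Python using only the standard library (the shape parameter 1.4 matches the figure; sample sizes and counts are illustrative choices):

```python
import math
import random
import statistics

random.seed(1)
shape, scale = 1.4, 1.0
# population mean and standard deviation of the Weibull distribution
mu = scale * math.gamma(1 + 1 / shape)
sigma = scale * math.sqrt(math.gamma(1 + 2 / shape) - (mu / scale) ** 2)

def standardized_mean(n: int) -> float:
    """One draw of Z = (sample mean - mu) / (sigma / sqrt(n))."""
    xs = [random.weibullvariate(scale, shape) for _ in range(n)]
    return (statistics.fmean(xs) - mu) / (sigma / math.sqrt(n))

# for n = 50 the standardized means are already close to N(0, 1)
zs = [standardized_mean(50) for _ in range(2000)]
print(round(statistics.fmean(zs), 2), round(statistics.stdev(zs), 2))
```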
[Figure: sampling distributions of the sample means X̄1, X̄2, X̄3]
T = ((X̄1 − X̄2) − (μ1 − μ2)) / √( Sp² (1/n1 + 1/n2) )

T ∼ Student's t distribution with n1 + n2 − 2 degrees of freedom (Sp² is the pooled sample variance)
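Computing the statistic is mechanical. A small Python sketch of the pooled two-sample t statistic under H0: μ1 = μ2, with made-up runtime samples:

```python
import math
import statistics

def pooled_t(x1, x2):
    """Two-sample t statistic under H0: mu1 = mu2, equal variances assumed."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    # pooled sample variance Sp^2
    sp2 = ((n1 - 1) * statistics.variance(x1)
           + (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

a = [10.2, 9.8, 10.5, 10.1, 9.9]   # illustrative runtimes of algorithm 1
b = [10.8, 10.6, 11.0, 10.7, 10.9] # illustrative runtimes of algorithm 2
t = pooled_t(a, b)  # compare against Student's t with n1 + n2 - 2 = 8 d.o.f.
```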
[Figure: empirical cumulative distribution functions F(x) of two samples, labelled 1 and 2]
◮ Parametric tests assume: independence, homoscedasticity, normality
◮ Non-parametric alternatives assume only: independence, homoscedasticity
◮ Rank-based tests ◮ Permutation tests:
◮ exact ◮ conditional Monte Carlo
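A conditional Monte Carlo permutation test is easy to sketch: under H0 the group labels are exchangeable, so we compare the observed difference of means against the same statistic over random relabelings. An illustrative Python version (function name and defaults are assumptions):

```python
import random
import statistics

def perm_test(x1, x2, n_perm=10000, seed=0):
    """Conditional Monte Carlo permutation test for H0: same distribution.

    Two-sided, statistic = absolute difference of the group means.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(x1) - statistics.fmean(x2))
    pooled = list(x1) + list(x2)
    n1, hits = len(x1), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the pooled sample
        d = abs(statistics.fmean(pooled[:n1]) - statistics.fmean(pooled[n1:]))
        if d >= observed:
            hits += 1
    # add-one correction keeps the Monte Carlo p-value valid
    return (hits + 1) / (n_perm + 1)

print(perm_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15]))
```

For clearly separated samples like the ones above the p-value is near the exact value 2/C(10,5) ≈ 0.008.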
◮ Blocking on instances ◮ Same pseudo-random seed
◮ If the sample size is large enough (in the limit, infinite), any difference in the means becomes statistically significant
◮ Real vs. statistical significance
◮ Desired statistical power + practical precision ⇒ sample size
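The last point can be made concrete with the usual normal-approximation formula n = 2((z_{1−α/2} + z_{1−β})σ/δ)² per group, for a two-sided two-sample test detecting a mean difference δ. A sketch in Python (standard library only; function name is illustrative):

```python
import math
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided two-sample z-test to detect
    a mean difference delta with the given power (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_b = NormalDist().inv_cdf(power)          # quantile for the power
    return math.ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)

# detecting a half-standard-deviation difference with 80% power
print(sample_size(delta=0.5, sigma=1.0))  # 63 runs per algorithm
```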
◮ Statement of the objectives of the experiment
◮ Comparison of different algorithms ◮ Impact of algorithm components ◮ How instance features affect the algorithms
◮ Identification of the sources of variance
◮ Treatment factors (qualitative and quantitative) ◮ Controllable nuisance factors ⇐ blocking ◮ Uncontrollable nuisance factors ⇐ measuring
◮ Definition of factor combinations to test
◮ Running a pilot experiment and refine the design
◮ Bugs and no external biases ◮ Ceiling or floor effects ◮ Rescaling levels of quantitative factors ◮ Determine the number of experiments needed to obtain the desired power
One run per algorithm-instance pair:

             Algorithm 1   Algorithm 2   ...   Algorithm k
Instance 1   X11           X12           ...   X1k
...          ...           ...                 ...
Instance b   Xb1           Xb2           ...   Xbk

r replicated runs per algorithm-instance pair:

             Algorithm 1        Algorithm 2        ...   Algorithm k
Instance 1   X111, ..., X11r    X121, ..., X12r    ...   X1k1, ..., X1kr
Instance 2   X211, ..., X21r    X221, ..., X22r    ...   X2k1, ..., X2kr
...          ...                ...                      ...
Instance b   Xb11, ..., Xb1r    Xb21, ..., Xb2r    ...   Xbk1, ..., Xbkr
◮ Protected versions: global test + no adjustments ◮ Bonferroni: α = αEX/c (conservative) ◮ Tukey's Honest Significant Difference method (for parametric analysis) ◮ Holm (step-wise) ◮ Other step-wise procedures
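The Holm step-down adjustment is simple enough to sketch in a few lines. An illustrative Python version (in R the same is available via p.adjust(p, method = "holm")):

```python
def holm_adjust(pvalues):
    """Holm step-down adjustment of a list of raw p-values.

    Returns adjusted p-values in the original order; reject H0_i at
    family-wise level alpha iff adjusted[i] < alpha.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # multiply the k-th smallest p-value by (m - k + 1), enforce monotonicity
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

print(holm_adjust([0.01, 0.04, 0.03, 0.005]))
```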
◮ Matched-pairs versions: when to use them and when not ◮ t-test with different variances
◮ Matched-pairs versions: when to use them and when not ◮ t-test, Welch variant: no assumption of equal variances
Instance         HEA        TSN1       ILS        MinConf    XRLF
                 Succ.  k   Succ.  k   Succ.  k   Succ.  k   Succ.  k
flat300_20_0     10    20   10    20   10    20   10    20    6    20
flat300_26_0     10    26   10    26   10    26   10    26    1    33
flat300_28_0      6    31    4    31    2    31    1    31    1    34
flat1000_50_0     4    50    2    85    6    88    4    87    1    84
flat1000_60_0     4    87    3    88    1    89    4    89    6    87
flat1000_76_0     1    88    1    88    1    89    8    90    6    87

Instance         GLS        SAN2       Novelty    TSN3
                 Succ.  k   Succ.  k   Succ.  k   Succ.  k
flat300_20_0     10    20   10    20    1    22    1    33
flat300_26_0     10    33    1    32    4    29    6    35
flat300_28_0      8    33    8    33   10    35    4    35
flat1000_50_0    10    50    1    86    6    54    1    95
flat1000_60_0     4    90    1    88    4    64    1    96
flat1000_76_0     8    92    4    89    8    98    1    96
[Figure: box plots of the number of colors (col) used by Novelty, HEA, TSinN1, ILS, MinConf, GLS2, XRLF, SAKempeFI, TSinN3, one panel per flat graph-coloring instance]
> load("gcp-all-classes.dataR")
> G <- F[F$class == "Flat", ]
> bwplot(alg ~ col | inst, data = G,
+        scales = list(x = list(relation = "free")), pch = "|")
> boxplot(err3 ~ alg, data = G, horizontal = TRUE,
+        main = expression(paste("Invariant error: ",
+                                frac(x - x^(opt), x^(worst) - x^(opt)))),
+        notch = TRUE, col = "pink")
> boxplot(rank ~ alg, data = G, horizontal = TRUE,
+        main = "Ranks", notch = TRUE, col = "pink")
[Figure: box plots per algorithm (HEA, TSinN1, ILS, MinConf, GLS2, XRLF, SAKempeFI, TSinN3) of the invariant error (x − x^opt)/(x^worst − x^opt) and of the ranks]
> pairwise.wilcox.test(G$err3, G$alg, paired = TRUE)

        Pairwise comparisons using Wilcoxon signed rank test

data:  G$err3 and G$alg
> par(las = 1, mar = c(3, 8, 3, 1))
> plot(TukeyHSD(aov(err3 ~ alg * inst, data = G), which = "alg"),
+      las = 1, mar = c(3, 7, 3, 1))
[Figure: 95% family-wise confidence intervals (TukeyHSD) for all pairwise differences between the nine algorithms]
[Figure: simultaneous confidence intervals per algorithm for the average invariant error (Tukey's Honest Significant Difference), the average invariant error (permutation test), and the average rank (Friedman test)]
◮ F-Race uses the Friedman test ◮ The Holm adjustment method is typically the most powerful
race(wrapper.file, maxExp = 0,
     stat.test = c("friedman", "t.bonferroni", "t.holm", "t.none"),
     conf.level = 0.95, first.test = 5,
     interactive = TRUE, log.file = "", no.slaves = 0, ...)
[Figure: race on class GEOMb (11 instances): candidate configurations (S_D_s_Y, S_D_g_Y, O_CCRB, O_CCRA, O_DCRB, ...) surviving at each stage 1-18]
◮ Each configurable parameter has an associated sampling distribution from which candidate values are drawn:
◮ numerical parameters: truncated normal distribution ◮ categorical parameters: discrete distribution.
◮ The update of the distributions consists in modifying the mean and the standard deviation (numerical parameters) or the probabilities (categorical parameters)
◮ The update biases the distributions to increase the probability of sampling the values used by the best configurations found so far
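The update rules above can be sketched schematically. This is an illustrative Python sketch, not the actual tuner: the function names, the shrink factor, and the learning rate are assumptions.

```python
import random

def update_numerical(mean, sd, best_value, shrink=0.9):
    """Re-center a (truncated) normal sampling distribution on the value
    used by the best configuration and shrink its spread."""
    return best_value, sd * shrink

def sample_numerical(mean, sd, lo, hi, rng):
    """Draw from a normal truncated to [lo, hi] by rejection sampling."""
    while True:
        v = rng.gauss(mean, sd)
        if lo <= v <= hi:
            return v

def update_categorical(probs, best_level, rate=0.2):
    """Move probability mass towards the level of the best configuration."""
    return {lvl: (1 - rate) * p + (rate if lvl == best_level else 0.0)
            for lvl, p in probs.items()}

# probability of the best level grows, the distribution still sums to one
probs = update_categorical({"first-improvement": 0.5, "best-improvement": 0.5},
                           "best-improvement")
mean, sd = update_numerical(0.5, 0.2, best_value=0.7)
v = sample_numerical(mean, sd, 0.0, 1.0, random.Random(0))
```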
◮ the portfolio of algorithms is a set of (complementary) SAT solvers ◮ the instances are Boolean formulas ◮ the cost metric is, for example, average runtime or number of unsolved instances
◮ Portfolio algorithms are commonly among the winners of SAT competitions
◮ The algorithm selection problem is mainly solved with machine learning techniques:
◮ represent the problem instances by numerical features f, ◮ then algorithm selection can be seen as a multi-class classification problem (predict the best algorithm from f)
◮ Instance features are numerical representations of instances. For SAT, for example: number of variables, number of clauses, their ratio, ...
◮ Regression Approach: learn a model ĉ(A, f) of the cost of each algorithm A; on a new instance with features f, select

A* ∈ argmin_{A∈P} ĉ(A, f)

◮ Clustering Approach
◮ Pairwise Cost-Sensitive Classification Approach
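The regression approach works with any regression model for ĉ; here is a deliberately simple sketch using a 1-nearest-neighbour cost predictor in plain Python. Solver names, feature values, and costs are made up for illustration.

```python
def dist(f, g):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(f, g))

def predict_cost(train, features):
    """1-NN cost model: train is a list of (features, cost) pairs
    observed for one algorithm; return the cost of the nearest instance."""
    return min(train, key=lambda fc: dist(fc[0], features))[1]

def select(portfolio, features):
    """Pick the algorithm with minimum predicted cost on the new instance."""
    return min(portfolio, key=lambda alg: predict_cost(portfolio[alg], features))

# features: (number of variables, clause/variable ratio); costs: runtimes
portfolio = {
    "solver_A": [((10, 2.0), 1.0), ((100, 4.5), 50.0)],
    "solver_B": [((10, 2.0), 5.0), ((100, 4.5), 8.0)],
}
print(select(portfolio, (90, 4.2)))  # prints solver_B
```

On large, high-ratio instances the model predicts solver_B to be cheaper, so it is selected even though solver_A dominates on small ones.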
◮ Online Selection
◮ Computation of Schedules
◮ Selection of Parallel Portfolios