SLIDE 1 Tuning numerical parameters of algorithms: sampling and stochasticity handling
Z. Yuan, T. Stützle, M. Birattari, M. Montes de Oca
IRIDIA, CoDE, Université Libre de Bruxelles, Brussels, Belgium
zyuan@ulb.ac.be iridia.ulb.ac.be/~zyuan
SLIDE 2 Outline
- 1. The tuning problem
- 2. Tuning algorithm
  - Sampling in parameter space
  - Budget allocation for ranking and selection: F-Race
  - Combining F-Race with sampling methods: iterated F-Race (Birattari et al. 2010) and the post-selection mechanism
SLIDE 4 Configuration of parameterized algorithms
Algorithm components
◮ categorical parameters
  ◮ choice of neighborhood in local search
  ◮ choice of crossover and mutation in EAs
  ◮ type of perturbation in iterated local search
◮ numerical parameters (real-valued or integer)
  ◮ crossover and mutation rates
  ◮ tabu list length
  ◮ perturbation strength
SLIDE 5
Importance of the tuning problem
◮ improvement over default settings and manual tuning
◮ reduction of development time and human intervention
◮ empirical studies, comparisons of algorithms
◮ support for end users of algorithms
SLIDES 6-13 Tuning problem: formal definition (Birattari et al. 2002)
The tuning problem can be defined as a tuple ⟨Θ, I, P_I, P_C, C⟩:
◮ Θ: set of candidate configurations.
◮ I: set of instances. P_I: probability measure over I.
◮ c(θ, i): random variable representing the cost measure of a configuration θ ∈ Θ on instance i ∈ I.
◮ C ⊂ ℜ: range of c. P_C: probability measure over the set C.
◮ C(θ) = C(θ | Θ, I, P_I, P_C): performance expectation:

    C(θ) = E_{I,C}[c] = ∫_I ∫_C c dP_C(c | θ, i) dP_I(i)    (1)

◮ The objective is to find a performance-optimizing configuration θ̄:

    θ̄ = arg min_{θ ∈ Θ} C(θ)    (2)

◮ An analytical solution is not possible; hence the expected cost is estimated in a Monte Carlo fashion.
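The Monte Carlo estimation mentioned in the last bullet can be sketched as follows; `sample_instance` and `run` are hypothetical stand-ins for drawing an instance i ~ P_I and observing one realization of the cost c(θ, i):

```python
import random

def estimate_cost(theta, sample_instance, run, n_runs=100, rng=None):
    """Monte Carlo estimate of C(theta): average the observed cost of
    configuration theta over instances drawn from P_I, one run each."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_runs):
        i = sample_instance(rng)        # draw an instance i ~ P_I
        total += run(theta, i, rng)     # observe one realization of c(theta, i)
    return total / n_runs

# toy stand-ins: an "instance" is a target value, cost is noisy distance to it
sample_instance = lambda rng: rng.uniform(0.0, 1.0)
run = lambda theta, i, rng: abs(theta - i) + rng.gauss(0.0, 0.01)

cost = estimate_cost(0.5, sample_instance, run)
```

More independent runs tighten the estimate, but each run costs budget; this is exactly the trade-off the later slides address.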
SLIDE 14 Tuning problem is an optimization problem
Variables:
mixed discrete-continuous, conditional variables
Objective:
◮ black-box
◮ stochastic
  ◮ due to stochasticity of the algorithm
  ◮ due to sampling of instances
SLIDE 16
Solving the tuning problem: our approach
◮ sampling in parameter space
◮ budget allocation for ranking and selection under stochasticity: F-Race
◮ combining the budget allocator with sampling methods
Open question: the trade-off between allocating budget to sampling new points versus evaluating already-sampled points.
SLIDE 17 Sampling in parameter space
◮ focus on numerical parameters
◮ usually low dimension, low budget
◮ sampling in established tuners: ad-hoc, factorial design, Kriging approximation
◮ our work studies state-of-the-art derivative-free optimizers: BOBYQA, CMA-ES, and MADS (Yuan et al. 2010, 2012a)
[Figures: average rank of CMAES, MADS, IRS, URS, and BOBYQA across numbers of parameters, tuning MMAS (left) and cPSO (right)]
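To make the role of a derivative-free optimizer concrete, here is a minimal compass-search sketch for a single parameter; it merely stands in for the far more sophisticated optimizers named above (BOBYQA, CMA-ES, MADS), and the `objective` and averaging scheme are illustrative assumptions:

```python
import random

def compass_search(objective, x0, bounds, step=0.25, min_step=0.02,
                   n_reps=5, budget=300, rng=None):
    """Compass-search sketch for one parameter: poll left and right of the
    incumbent, move on improvement, shrink the mesh otherwise. Noise in the
    tuning objective is smoothed by averaging n_reps evaluations per point."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    evals = 0

    def avg(x):
        nonlocal evals
        evals += n_reps
        return sum(objective(x, rng) for _ in range(n_reps)) / n_reps

    x, fx = x0, avg(x0)
    while step >= min_step and evals < budget:
        improved = False
        for cand in (x - step, x + step):      # poll both directions
            if lo <= cand <= hi:
                fc = avg(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step *= 0.5                        # shrink the mesh

    return x

# toy tuning objective: optimum at 0.3, with small evaluation noise
x = compass_search(lambda p, rng: (p - 0.3) ** 2 + rng.gauss(0.0, 0.002),
                   x0=0.9, bounds=(0.0, 1.0))
```

Even this crude scheme must spend budget on repeated evaluations to cope with noise, which motivates the more careful budget-allocation mechanisms that follow.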
SLIDES 18-40 F-Race (Birattari et al. 2002)
[Figure: racing profile over the candidate set Θ and the instance stream i]
◮ start with a set of initial candidates
◮ consider a stream of instances
◮ sequentially evaluate candidates
◮ discard statistically worse candidates, as detected by the Friedman test
◮ ... repeat until a winner is selected or until the computation budget is consumed
Open question: What are the power and the actual type I error of sequential hypothesis testing?
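The racing loop above can be sketched as follows. To stay self-contained, this sketch replaces the Friedman test and its post-tests with a simple mean-rank cutoff after each block of instances; the elimination rule is therefore an illustrative simplification, not the actual F-Race criterion:

```python
import random

def ranks(costs):
    """Within-instance ranks: lowest cost gets rank 1; ties share the mean rank."""
    order = sorted(range(len(costs)), key=lambda j: costs[j])
    out = [0.0] * len(costs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and costs[order[j + 1]] == costs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def f_race(candidates, run, instances, keep_fraction=0.5, block=5):
    """F-Race sketch: stream instances, rank the surviving candidates on each
    one, and after every block of instances drop the worse-ranked half.
    The real algorithm instead keeps every candidate whose Friedman post-test
    against the current best is not statistically significant."""
    alive = list(range(len(candidates)))
    rank_sums = {j: 0.0 for j in alive}
    for step, inst in enumerate(instances, 1):
        costs = [run(candidates[j], inst) for j in alive]
        for j, r in zip(alive, ranks(costs)):
            rank_sums[j] += r
        if step % block == 0 and len(alive) > 1:
            alive.sort(key=lambda j: rank_sums[j])
            alive = alive[: max(1, int(len(alive) * keep_fraction))]
            rank_sums = {j: 0.0 for j in alive}   # ranks restart among survivors
        if len(alive) == 1:
            break
    return candidates[alive[0]]

# toy: candidates are true mean costs, observed with evaluation noise
rng = random.Random(1)
candidates = [0.1, 0.4, 0.8, 1.2]
winner = f_race(candidates, lambda theta, inst: theta + rng.gauss(0.0, 0.05),
                instances=range(30))
```

The point of the racing structure is visible even in this sketch: bad candidates stop consuming budget early, so the surviving candidates are evaluated on many more instances than a uniform allocation would allow.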
SLIDE 42
Iterated F-Race
◮ sample configurations iteratively
◮ in each iteration, use F-Race to rank the configurations and select the best ones to bias the sampling
◮ I/F-Race, devised in Birattari et al. 2010, tunes numerical, categorical, and conditional parameters
◮ F-Race has also been hybridized with existing sampling methods, such as MADS/F-Race and CMAES/F-Race (Yuan et al. 2012a)
◮ such hybrids increase the probability of type I error; this is handled by incumbent protection (Yuan et al. 2012a)
◮ open question: what if the sampling method does not need ranking and selection, e.g. response surface methodologies?
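The iterated sampling idea from the first two bullets can be sketched for a single numerical parameter as follows; the deterministic `evaluate` and the halving schedule for the standard deviation are illustrative assumptions, and selecting the elite by a single comparison stands in for the F-Race that I/F-Race would actually run:

```python
import random

def iterated_sampling(bounds, evaluate, n_iterations=5, n_samples=20, rng=None):
    """Iterated-sampling sketch for one numerical parameter: each iteration
    samples around the current elite with a shrinking standard deviation,
    biasing the search toward good regions of the parameter space."""
    rng = rng or random.Random(0)
    lo, hi = bounds
    elite = (lo + hi) / 2.0                 # start from the centre of the range
    sigma = (hi - lo) / 2.0
    for _ in range(n_iterations):
        samples = [min(hi, max(lo, rng.gauss(elite, sigma)))
                   for _ in range(n_samples)]
        samples.append(elite)               # the incumbent always competes
        elite = min(samples, key=evaluate)  # stand-in for a race among samples
        sigma *= 0.5                        # tighten sampling around the elite
    return elite

# toy objective with its optimum at 0.3
best = iterated_sampling((0.0, 1.0), lambda x: (x - 0.3) ** 2)
```

Shrinking sigma trades exploration for exploitation: early iterations cover the whole range, later ones refine around the incumbent.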
SLIDE 43
Post-selection (Yuan et al. 2012a, 2012b)
◮ use few instances during the sampling phase to identify a number of elite configurations
◮ in the final post-selection phase, use F-Race to carefully select the best from the set of elite configurations
◮ can be applied together with iterated F-Race
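The two phases can be sketched as follows; the slides use F-Race for the final phase, so the plain mean over repeated runs below is an illustrative stand-in, and the noise model is a toy assumption:

```python
import random

def post_selection(configurations, run, n_elites=3, n_final_runs=20, rng=None):
    """Post-selection sketch: screen every configuration with a single noisy
    run (n_r = 1), keep the best n_elites, then spend the remaining budget
    re-evaluating only those elites to pick the final winner."""
    rng = rng or random.Random(0)
    # screening phase: one run per configuration, cheap but noisy
    screened = sorted(configurations, key=lambda c: run(c, rng))
    elites = screened[:n_elites]
    # final phase: average many runs per elite configuration
    mean_cost = lambda c: sum(run(c, rng) for _ in range(n_final_runs)) / n_final_runs
    return min(elites, key=mean_cost)

# toy: a configuration's true mean cost, observed with evaluation noise
configs = [0.2, 0.25, 0.3, 0.6, 0.7, 0.8, 0.9, 1.0]
best = post_selection(configs, lambda c, rng: c + rng.gauss(0.0, 0.1))
```

The design rationale: single-run screening is enough to weed out clearly bad configurations, so the expensive careful comparison is reserved for the few that plausibly compete for first place.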
SLIDE 45
Experimental setup
◮ case studies of tuning numerical parameters of MMAS: α, β, ρ, m, γ, nn, q0
◮ 3 case studies for each number of tuned parameters from 2 to 6, resulting in 3 · 5 = 15 case studies
◮ in each case study, 7 budget levels are studied, ranging from tens to thousands of runs
SLIDE 46 Repeated evaluation: case studies in URS
◮ fixed numbers of repetitions nr ∈ {1, 3, 5, 10, 20, 40}
◮ the study uses uniform random sampling (URS) and also includes U/F-Race
◮ the best nr differs considerably depending on the budget level
◮ U/F-Race outperforms any fixed number of repeated evaluations
[Figure: average rank of U1, U3, U5, U10, U20, U40, and UF across the 7 budget levels]
SLIDE 47 Post-selection: case studies in MADS
◮ Post-selection with low nr outperforms repeated evaluation
◮ Post-selection with nr = 1 results in the best performance
◮ Post-selection with nr = 1 outperforms the F-Race hybrid
[Figures: comparison of M1-M40 against MP1-MP40, and average rank of MP1, MP3, MP5, and MF across budget levels]
SLIDE 48 Comparisons of all tuners
◮ BOBYQA, CMA-ES, and MADS use post-selection and nr = 1
◮ I/F-Race and U/F-Race are also included for comparison
◮ BOBYQA with post-selection and nr = 1 appears to be the best setting
[Figure: average rank of BP1, CP1, MP1, IF, and UF across numbers of parameters]
SLIDE 49
Conclusions and future work
Conclusions
◮ state-of-the-art derivative-free optimizers, e.g. BOBYQA and CMA-ES, show good performance for sampling in parameter space
◮ post-selection using F-Race is a simple and effective mechanism for handling stochasticity
Future work
◮ further investigation into the post-selection mechanism
◮ a detailed survey of derivative-free continuous optimizers and statistical ranking and selection techniques
◮ sampling techniques also for categorical parameters
◮ better understanding and addressing the trade-off between allocating budget to sampling new configurations and evaluating sampled configurations
SLIDE 50 Some References
◮ Birattari, M., Stützle, T., Paquete, L., Varrentrapp, K. (2002): A racing algorithm for configuring metaheuristics. In Langdon, W.B., et al., eds.: GECCO 2002, 11-18.
◮ Birattari, M., Yuan, Z., Balaprakash, P., Stützle, T. (2010): F-Race and iterated F-Race: An overview. In Bartz-Beielstein, T., et al., eds.: Experimental Methods for the Analysis of Optimization Algorithms. Springer, Berlin, Germany, 311-336.
◮ Yuan, Z., Montes de Oca, M., Birattari, M., Stützle, T. (2012): Continuous optimization algorithms for tuning real and integer parameters of swarm intelligence algorithms. Swarm Intelligence 6(1), 49-75.
◮ Yuan, Z., Stützle, T., Montes de Oca, M., Birattari, M. (2012): An analysis of a post-selection mechanism for handling stochasticity in tuning numerical parameters of MAX-MIN Ant System. Technical Report TR/IRIDIA/2012-007, IRIDIA, ULB, Brussels, Belgium.