
HEURISTIC OPTIMIZATION

Automatic Algorithm Configuration

Design choices and parameters everywhere

Today's high-performance optimizers involve a large number of design choices and parameter settings.

  • exact solvers
      • design choices include alternative models, pre-processing, variable selection, value selection, branching rules, . . .
      • many design choices have associated numerical parameters
      • example: CPLEX 10.1.1 has 159 user-specifiable parameters, about 80 of which influence the solver's search mechanism
  • approximate algorithms
      • design choices include solution representation, operators, neighborhood, pre-processing, strategies, . . .
      • many design choices have associated numerical parameters
      • example: multi-objective ACO algorithms with 22 parameters (plus several still hidden ones): see later

Heuristic Optimization 2011 2


Example: Ant Colony Optimization


Probabilistic solution construction

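[Figure: an ant at node i chooses among the candidate next nodes j, k, g; each edge (i, j) carries a pheromone value τ_ij and heuristic information η_ij.]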

ACO design choices and numerical parameters

  • solution construction
      • choice of pheromone model
      • choice of heuristic information
      • choice of constructive procedure
      • numerical parameters
          • α, β influence the weight of pheromone and heuristic information, respectively
          • q0 determines greediness of construction procedure
          • m, the number of ants
  • pheromone update
      • which ants deposit pheromone and how much?
      • numerical parameters
          • ρ: evaporation rate
          • τ0: initial pheromone level
  • local search
  • . . . many more . . .
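To make the role of α, β and q0 concrete, here is a minimal sketch of one construction step, assuming the pseudo-random proportional rule commonly used in ACO. The function name `choose_next` and the nested-dict layout of `tau` and `eta` are illustrative assumptions, not taken from the slides.

```python
import random

def choose_next(current, candidates, tau, eta, alpha=1.0, beta=2.0, q0=0.0, rng=random):
    """Pick the next node for an ant located at `current` (hypothetical helper).

    tau[i][j] is the pheromone on edge (i, j), eta[i][j] the heuristic information.
    """
    # Weight of each candidate: pheromone^alpha * heuristic^beta.
    weights = [(tau[current][j] ** alpha) * (eta[current][j] ** beta) for j in candidates]
    # With probability q0 the ant acts greedily and takes the best-weighted candidate.
    if rng.random() < q0:
        best = max(range(len(candidates)), key=weights.__getitem__)
        return candidates[best]
    # Otherwise it samples a candidate with probability proportional to its weight.
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy usage: node 0 with two candidate successors.
tau = {0: {1: 1.0, 2: 1.0}}
eta = {0: {1: 0.5, 2: 2.0}}
greedy_choice = choose_next(0, [1, 2], tau, eta, q0=1.0)
```

Larger q0 makes construction greedier; larger β shifts weight from the learned pheromone to the heuristic information.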


Designing optimization algorithms

Challenges

  • many alternative design choices
  • nonlinear interactions among algorithm components and/or parameters
  • performance assessment is difficult

Traditional design approach

  • trial-and-error design guided by expertise/intuition

This approach is prone to over-generalizations, implicit independence assumptions, and limited exploration of design alternatives.

Can we make this approach more principled and automatic?

Towards automatic algorithm configuration

Automated algorithm configuration

  • apply powerful search techniques to design algorithms
  • use computation power to explore algorithm design spaces
  • free human creativity for higher level tasks

Why automatic algorithm configuration?

  • improvement over manual, ad-hoc methods for tuning
  • reduction of development time and human intervention
  • increase in the number of degrees of freedom that can be considered
  • empirical studies, comparisons of algorithms
  • support for end users of algorithms

Configuration is a stochastic optimization/learning problem

Random influences

  • stochasticity of the parameterized algorithm
  • stochasticity due to the "sampling" of the instance to be tackled

Learning aspects

  • the chosen algorithm configuration should perform well on unseen instances

The configuration problem is a stochastic mixed discrete-continuous optimization problem with machine learning aspects.

Main configuration approaches

Offline configuration

  • configure algorithm before deploying it
  • configuration done on training instances

Online configuration

  • adapt parameter setting while solving an instance
  • typically limited to a set of known crucial algorithm parameters

Offline configuration

Remark: A configuration scenario requires the definition of a performance measure to be optimized

  • maximize solution quality (within given computation time)
  • minimize computation time (to reach optimal solution)

Towards a shift of paradigm in algorithm design


Approaches to configuration and tuning

  • numerical optimization techniques
      • e.g. CMA-ES [Hansen & Ostermeier, 2001], MADS [Audet & Orban, 2006]
  • heuristic search methods
      • e.g. ParamILS [Hutter, Hoos, Leyton-Brown, Stützle, 2009], genetic programming [Fukunaga, 2002], gender-based GA [Sellmann et al., 2010], . . .
  • experimental design, ANOVA
      • e.g. CALIBRA [Adenso-Diaz & Laguna, 2006], [Ridge, Kudenko, 2007; Ruiz, Maroto, 2006; Coy et al., 2000]
  • response surface methods (model-based optimization)
      • e.g. SPO [Bartz-Beielstein, 2006], SMAC [Hutter, Hoos, Leyton-Brown, 2011]
  • sequential statistical testing, F-race, iterated F-race
      • e.g. [Birattari, Stützle, Paquete, Varrentrapp, 2002; Balaprakash, Birattari, Stützle, 2007]

Example of application scenario

  • Mario collects phone orders for 30 minutes
  • scheduling deliveries is an optimization problem
  • a different instance arises every 30 minutes
  • limited amount of time for scheduling, say one minute
  • good news: Mario has an SLS algorithm!
  • . . . but the SLS algorithm must be tuned
  • you have a limited amount of time for configuring it, say one week

Criterion:

Good configurations find good solutions for future instances!

Brute-force approach to configuration

  1. sample a set of configurations Θ0 ⊆ Θ
  2. estimate C(θ) for each θ ∈ Θ0
  3. return the configuration with the lowest estimate
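The three steps can be sketched as follows. This is a minimal illustration, assuming C(θ) is estimated by the mean cost over a fixed number of runs; `sample_configuration` and `run_algorithm` are hypothetical callables supplied by the user, not part of any library.

```python
import random
import statistics

def brute_force_configure(sample_configuration, run_algorithm, instances,
                          n_configs=20, runs_per_config=10, seed=0):
    """Brute-force configuration: same fixed budget for every candidate."""
    rng = random.Random(seed)
    # 1. sample a set of configurations
    candidates = [sample_configuration(rng) for _ in range(n_configs)]
    # 2. estimate the expected cost of each one from a fixed number of runs,
    #    each run on a randomly drawn instance with its own run seed
    estimates = []
    for theta in candidates:
        costs = [run_algorithm(theta, rng.choice(instances), rng.randrange(2**31))
                 for _ in range(runs_per_config)]
        estimates.append(statistics.mean(costs))
    # 3. return the configuration with the lowest estimate
    best = min(range(n_configs), key=estimates.__getitem__)
    return candidates[best], estimates[best]

# Toy usage: one parameter in [0, 1], cost minimized at 0.3.
theta_best, cost_best = brute_force_configure(
    lambda rng: rng.uniform(0.0, 1.0),
    lambda theta, instance, run_seed: (theta - 0.3) ** 2 + instance,
    instances=[0.0],
)
```

Note that `runs_per_config` has to be fixed in advance and is spent on every candidate regardless of how it performs.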

Disadvantages of brute-force configuration

  1. one needs to determine a priori how many runs to perform on each candidate configuration
  2. poorly performing candidate configurations are evaluated with the same computational effort as good ones

Remark: Estimation of expected cost

Given

  • n runs for estimating the expected cost of configuration θ
  • a large number of instances

Question

  • how many runs on how many instances to minimize the variance of the estimate?

Answer

  • one run on each of n instances (Birattari, 2004)
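A brief sketch of why this holds, under assumptions not stated on the slide: suppose k instances are sampled independently and h runs are performed on each, with n = k h. The symbols σ²_AI (variance across instances of the per-instance expected cost) and σ²_WI (expected within-instance variance) are introduced here for illustration only.

```latex
\operatorname{Var}\bigl[\hat{C}(\theta)\bigr]
  = \frac{\sigma^2_{AI}}{k} + \frac{\sigma^2_{WI}}{k\,h}
  = \frac{\sigma^2_{AI}}{k} + \frac{\sigma^2_{WI}}{n}
```

For fixed n the second term is constant and the first decreases as k grows, so the variance is smallest at k = n and h = 1.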


The racing approach

[Figure: candidate configurations Θ evaluated on a stream of instances i]

  • start with a set of initial candidates
  • consider a stream of instances
  • sequentially evaluate candidates
  • discard inferior candidates as sufficient evidence is gathered against them
  • . . . repeat until a winner is selected or until computation time expires
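The racing loop can be sketched as below. This is an illustration only: the elimination rule used here (drop candidates whose mean cost exceeds the best mean by more than `margin`) is a deliberately simple stand-in and not a statistical test, and `evaluate` is a hypothetical user-supplied callable.

```python
import statistics

def race(candidates, instances, evaluate, min_instances=5, margin=0.5, max_evals=None):
    """Race candidates over a stream of instances, discarding inferior ones."""
    alive = list(range(len(candidates)))
    costs = {c: [] for c in alive}
    evals = 0
    for instance in instances:
        if len(alive) == 1:
            break  # a winner has been selected
        if max_evals is not None and evals + len(alive) > max_evals:
            break  # computation budget expired
        # evaluate every surviving candidate on the new instance
        for c in alive:
            costs[c].append(evaluate(candidates[c], instance))
        evals += len(alive)
        # start discarding only after a minimum number of instances
        if len(costs[alive[0]]) >= min_instances:
            means = {c: statistics.mean(costs[c]) for c in alive}
            best_mean = min(means.values())
            alive = [c for c in alive if means[c] <= best_mean + margin]
    if not costs[alive[0]]:
        raise ValueError("no evaluations were performed")
    winner = min(alive, key=lambda c: statistics.mean(costs[c]))
    return candidates[winner], [candidates[c] for c in alive]

# Toy usage: the cost is lowest for the candidate value 1.0.
winner, survivors = race(
    [0.0, 1.0, 2.0, 3.0], range(10),
    lambda theta, instance: (theta - 1.0) ** 2 + 0.01 * instance,
)
```

Unlike the brute-force scheme, poor candidates stop consuming evaluations as soon as they are discarded.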


The F-Race algorithm

Statistical testing

  1. family-wise test for differences among configurations
      • Friedman two-way analysis of variance by ranks
  2. if Friedman rejects H0, perform pairwise comparisons to the best configuration
      • apply Friedman post-hoc test

Predecessors

  • racing algorithms in model selection, Maron & Moore (1994)
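As an illustration of the family-wise step, the sketch below computes the basic Friedman statistic from a table of costs, with instances as blocks and configurations as treatments. It is a simplified version written for this note: ties receive average ranks but no tie correction is applied, and the post-hoc comparisons are omitted. The function names are illustrative.

```python
def average_ranks(row):
    """Rank the costs within one block (instance); ties get the average rank."""
    order = sorted(range(len(row)), key=row.__getitem__)
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def friedman_statistic(costs):
    """costs[b][j] is the cost of configuration j on instance b."""
    n_blocks, k = len(costs), len(costs[0])
    rank_sums = [0.0] * k
    for row in costs:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    stat = (12.0 / (n_blocks * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n_blocks * (k + 1)
    return stat, rank_sums

# Toy usage: three configurations ranked identically on four instances.
stat, rank_sums = friedman_statistic([[1.0, 2.0, 3.0]] * 4)
```

Under H0 the statistic is approximately chi-square distributed with k − 1 degrees of freedom, so for three configurations a value above 5.991 would reject H0 at the 0.05 level; with few instances this approximation is rough.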


Sampling configurations

F-race is a method for selecting the best configuration; it is independent of the way the set of configurations is sampled.

Sampling configurations and F-race

  • full factorial design
  • random sampling design
  • iterative refinement of a sampling model (iterated F-race)

Full factorial design

  • full factorial design (FFD) was used in the first applications of F-race to make comparisons to other ways of doing races
  • levels for FFD can be determined manually, randomly, etc.
  • FFD has significant disadvantages
      • expertise is needed for selecting the levels of each parameter
      • exponential growth with the number d of parameters: l^d configurations for l levels per parameter
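The exponential growth is easy to see in a small sketch. The parameter names and levels below are made up for illustration.

```python
from itertools import product

def full_factorial(levels):
    """Enumerate every combination of the given parameter levels."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]

# Hypothetical levels: d = 3 parameters with l = 3 levels each.
levels = {
    "alpha": [0.5, 1.0, 2.0],
    "beta": [1.0, 2.0, 5.0],
    "rho": [0.1, 0.5, 0.9],
}
configs = full_factorial(levels)
n_configs = len(configs)  # 3 ** 3 = 27 candidate configurations
```

Adding a fourth parameter with three levels would already give 81 candidates, each of which has to enter the race.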

Random sampling design

  • define a probability measure PX on the space X of parameter values
  • sample the configurations
  • apply F-Race to select the best
  • performance depends on the number of samples
  • advantages
      • an arbitrary number of candidate configurations can be sampled
      • no a priori definition of levels necessary
      • covers the parameter space uniformly
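A minimal sketch of the sampling step, assuming PX is uniform over each parameter's domain. The `space` layout (a tuple for a continuous range, a list for categorical values) is a convention chosen for this example.

```python
import random

def sample_configurations(space, n, seed=None):
    """Draw n configurations uniformly at random from the parameter space."""
    rng = random.Random(seed)
    configs = []
    for _ in range(n):
        theta = {}
        for name, domain in space.items():
            if isinstance(domain, tuple):
                # (low, high): continuous parameter, sampled uniformly
                theta[name] = rng.uniform(*domain)
            else:
                # list of values: categorical parameter, sampled uniformly
                theta[name] = rng.choice(domain)
        configs.append(theta)
    return configs

# Hypothetical mixed space: two numerical parameters and one design choice.
space = {"alpha": (0.0, 5.0), "rho": (0.01, 1.0), "local_search": ["none", "2-opt"]}
sampled = sample_configurations(space, 50, seed=1)
```

The number of candidates is now a free choice, and no levels have to be fixed beforehand.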

Iterative refinement

  • modify the probability measure:
      • use previously seen promising configurations to bias the search towards promising regions
  • sample from the newly defined distribution
  • apply F-race
  • iterate through this process

  • sample configurations from initial distribution

While not terminate()

  1. apply F-Race
  2. modify the distribution
  3. sample configurations with selection probability
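The loop above can be sketched for a single continuous parameter as follows. Several simplifications are assumptions of this sketch rather than the method on the slide: the race is replaced by a plain ranking on mean cost, the distribution is a Gaussian around the elites whose spread is halved each iteration, and elites are selected with rank-based probability.

```python
import random
import statistics

def iterated_configuration(bounds, evaluate, instances, iterations=5,
                           n_samples=20, n_elites=3, seed=0):
    """Iteratively refine a sampling distribution around promising values."""
    rng = random.Random(seed)
    low, high = bounds
    # sample configurations from the initial (uniform) distribution
    population = [rng.uniform(low, high) for _ in range(n_samples)]
    sigma = (high - low) / 2
    elites = []
    for _ in range(iterations):
        # 1. stand-in for F-Race: rank new samples and elites by mean cost
        scored = sorted(population + elites,
                        key=lambda t: statistics.mean(evaluate(t, i) for i in instances))
        elites = scored[:n_elites]
        # 2. modify the distribution: narrow the spread around the elites
        sigma /= 2
        # 3. sample around elites chosen with rank-based selection probability
        #    (the samples drawn in the final iteration are never evaluated)
        weights = list(range(len(elites), 0, -1))
        population = [
            min(high, max(low, rng.gauss(rng.choices(elites, weights=weights)[0], sigma)))
            for _ in range(n_samples)
        ]
    return elites[0]

# Toy usage: cost is minimized at theta = 2.0 on every instance.
def toy_cost(theta, instance):
    return (theta - 2.0) ** 2 + instance

best_theta = iterated_configuration((0.0, 10.0), toy_cost, [0.0, 1.0])
```

Because the elites are carried over between iterations, the best configuration found never gets worse from one iteration to the next.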


Tuning MMAS for TSP

  • 3 continuous parameters; only continuous part
  • FFD: {a priori knowledge, random}

[Figure: axes labeled "Computational budget" and "percentage deviation from the reference cost"; tick labels and legend not recoverable.]

Successful applications

From IRIDIA

  • winning algorithm in a time-tabling competition
  • improving vehicle routing and scheduling software of SAP
  • new state-of-the-art metaheuristics for the probabilistic TSP and stochastic VRPs
  • various state-of-the-art algorithms in continuous optimization
      • PSO for large-scale continuous functions
      • continuous ACO algorithms

Other groups

  • SATenstein: configurable software framework for SAT solving
  • configuration of MIP solvers (CPLEX etc.) with speed-ups of up to two orders of magnitude
  • . . .

Conclusions

Status

  • using automatic configuration tools is rewarding in terms of development time and algorithm performance
  • interactive usage of configurators allows humans to focus on the creative part of algorithm design
  • many application opportunities also in areas other than optimization

Future work

  • more powerful configurators
  • more and more complex applications
  • best practice