Structure at the meta-level: Observations on the structure of design spaces of high-performance solvers for hard combinatorial problems. Holger H. Hoos, BETA Lab, Department of Computer Science, University of British Columbia, Canada; based on joint work with Chris Fawcett, Frank Hutter, Kevin Leyton-Brown and Thomas Stützle.


SLIDE 1

Structure at the meta-level:

Observations on the structure of design spaces
of high-performance solvers
for hard combinatorial problems

Holger H. Hoos

BETA Lab, Department of Computer Science, University of British Columbia, Canada

based on joint work with

Chris Fawcett, Frank Hutter, Kevin Leyton-Brown, Thomas Stützle

SLIDE 2

Acknowledgements

Collaborators:

  • Domagoj Babić
  • Sam Bayless
  • Chris Fawcett
  • Quinn Hsu
  • Frank Hutter
  • Erez Karpas
  • Chris Nell
  • Eugene Nudelman
  • Steve Ramage
  • Gabriele Röger
  • Marius Schneider
  • James Styles
  • Dave Tompkins
  • Mauro Vallati
  • Lin Xu
  • Thomas Bartz-Beielstein (FH Köln, Germany)
  • Marco Chiarandini (Syddansk Universitet, Denmark)
  • Alfonso Gerevini (Università degli Studi di Brescia, Italy)
  • Malte Helmert (Universität Basel, Switzerland)
  • Alan Hu
  • Kevin Leyton-Brown
  • Kevin Murphy
  • Alessandro Saetti (Università degli Studi di Brescia, Italy)
  • Thomas Stützle (Université Libre de Bruxelles, Belgium)

Research funding: NSERC, Mprime, GRAND, CFI; IBM, Actenum Corp.

Computing resources: Arrow, BETA, ICICS clusters; Compute Canada / WestGrid

SLIDE 3

Take-home message:

  • exploiting structure in problem instances permits practical solution of hard problems (instance-level structure)
  • structure in the space of algorithms (+ human creativity) facilitates effective construction of good solvers for hard problems (meta-level structure)
  • meta-level structure may differ substantially from instance-level structure
  • PbO (rich algorithm design space + automated configuration) permits (partial) automation of building effective solvers; efficacy depends on exploitation of meta-level structure

Holger Hoos: Structure at the Meta-Level 38

SLIDE 4

[Diagram: a single application context]

SLIDE 5

[Diagram: a single application context served by one solver]

SLIDE 6

[Diagram: three application contexts]

SLIDE 7

[Diagram: three application contexts, each served by its own solver]

SLIDE 8

[Diagram: three application contexts served by one parameterised solver, solver[·]]

SLIDE 9

[Diagram: the parameterised solver solver[·] instantiated as solver[p1], solver[p2], solver[p3], one configuration per application context]

SLIDE 10

[Diagram: for a given application context, an optimised solver is obtained from a design space of solvers]

SLIDES 11–12

[Diagram: as Slide 10, where the design space of solvers can yield not only a single optimised solver but also composed designs such as a parallel portfolio of planners/solvers or an instance-based selector]

SLIDE 13

Programming by Optimisation (PbO)

  • program a (large) space of programs
  • encourage software developers to
      • avoid premature commitment to design choices
      • seek & maintain design alternatives
  • automatically find performance-optimising designs for given use context(s)

SLIDE 14

Communications of the ACM, 55(2), pp. 70–80, February 2012

www.prog-by-opt.net

SLIDE 15

Levels of PbO:

Level 4: Make no design choice prematurely that cannot be justified compellingly.
Level 3: Strive to provide design choices and alternatives.
Level 2: Keep and expose design choices considered during software development.
Level 1: Expose design choices hardwired into existing code (magic constants, hidden parameters, abandoned design alternatives).
Level 0: Optimise settings of parameters exposed by existing software.

SLIDE 16

Success in optimising speed:

  • SAT-based software verification (Spear), 41 design choices – Hutter, Babić, HH, Hu (2007): speedup 4.5–500×, PbO level 2–3
  • AI Planning (LPG), 62 design choices – Vallati, Fawcett, Gerevini, HH, Saetti (2011): speedup 3–118×, PbO level 1
  • Mixed integer programming (CPLEX), 76 design choices – Hutter, HH, Leyton-Brown (2010): speedup 2–52×

... and solution quality:

  • University timetabling, 18 design choices, PbO level 2–3: new state of the art; UBC exam scheduling – Fawcett, Chiarandini, HH (2009)
  • Machine learning / classification, 803 design choices, PbO level 0–1: outperforms specialised model selection & hyper-parameter optimisation methods from machine learning – Thornton, Hutter, HH, Leyton-Brown (2012)

SLIDE 17

Outline

  • 1. Introduction
  • 2. Design spaces & design optimisation
  • 3. Which choices matter? Global perspectives
  • 4. Which choices matter? A local perspective
  • 5. Speculation and open questions


SLIDE 18

Design optimisation

Simplest case: Configuration / tuning

  • Standard optimisation techniques (e.g., CMA-ES – Hansen & Ostermeier 01; MADS – Audet & Orban 06)
  • Advanced sampling methods (e.g., REVAC, REVAC++ – Nannen & Eiben 06–09)
  • Racing (e.g., F-Race – Birattari, Stützle, Paquete, Varrentrapp 02; Iterative F-Race – Balaprakash, Birattari, Stützle 07)
  • Model-free search (e.g., ParamILS – Hutter, HH, Stützle 07; Hutter, HH, Leyton-Brown, Stützle 09)
  • Sequential model-based optimisation (e.g., SPO – Bartz-Beielstein 06; SMAC – Hutter, HH, Leyton-Brown 11–12)

SLIDES 19–24

Iterated Local Search

[Animation: repeated alternation of local search and perturbation phases; after each perturbation + local search, selection between the current and the new candidate using an acceptance criterion]

SLIDE 25

ParamILS

  • iterated local search in configuration space
  • initialisation: pick best of default + R random configurations
  • subsidiary local search: iterative first improvement, change one parameter in each step
  • perturbation: change s randomly chosen parameters
  • acceptance criterion: always select better configuration
  • number of runs per configuration increases over time; ensure that incumbent always has same number of runs as challengers
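The loop described above can be sketched in a few lines. This is a minimal, self-contained illustration on an invented toy configuration space: `PARAMS` and `performance` are stand-ins (real ParamILS treats the target algorithm as a black box and compares configurations via multiple timed runs, with the run-count intensification mechanism noted above, which this sketch omits).

```python
import random

# Hypothetical toy configuration space and cost surrogate (stand-ins).
PARAMS = {"a": [0, 1, 2, 3], "b": [0, 1, 2], "c": [0, 1, 2, 3, 4]}

def performance(cfg):
    # toy surrogate for "run the solver, measure cost" (lower is better)
    return (cfg["a"] - 2) ** 2 + (cfg["b"] - 1) ** 2 + abs(cfg["c"] - 3)

def neighbours(cfg):
    # one-exchange neighbourhood: change one parameter at a time
    for p, dom in PARAMS.items():
        for v in dom:
            if v != cfg[p]:
                yield {**cfg, p: v}

def local_search(cfg):
    # subsidiary local search: iterative first improvement
    improved = True
    while improved:
        improved = False
        for n in neighbours(cfg):
            if performance(n) < performance(cfg):
                cfg, improved = n, True
                break
    return cfg

def param_ils(default, R=5, s=2, iterations=20, seed=0):
    rng = random.Random(seed)
    rand_cfg = lambda: {p: rng.choice(dom) for p, dom in PARAMS.items()}
    # initialisation: best of default + R random configurations
    incumbent = min([default] + [rand_cfg() for _ in range(R)], key=performance)
    incumbent = local_search(incumbent)
    for _ in range(iterations):
        # perturbation: change s randomly chosen parameters
        cand = dict(incumbent)
        for p in rng.sample(list(PARAMS), s):
            cand[p] = rng.choice(PARAMS[p])
        cand = local_search(cand)
        # acceptance criterion: always keep the better configuration
        if performance(cand) <= performance(incumbent):
            incumbent = cand
    return incumbent
```

On this separable toy cost, the subsidiary first-improvement search alone already reaches the optimum; the perturbation/acceptance cycle matters on rugged configuration landscapes with multiple local optima.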


SLIDE 26

Sequential Model-based Optimisation

e.g., Jones (1998), Bartz-Beielstein (2006)

  • Key idea: use a predictive performance model (response surface model) to find good configurations
  • perform runs for selected configurations (initial design) and fit model (e.g., noise-free Gaussian process model)
  • iteratively select promising configuration, perform run and update model
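The select-evaluate-update loop can be sketched as follows. To stay short, this sketch replaces the Gaussian process with a deliberately crude surrogate (predict the value of the nearest evaluated point, minus a distance-based exploration bonus, a stand-in for a lower confidence bound); everything here is illustrative, not the procedure from any specific tool.

```python
import random

def smbo(f, candidates, n_init=3, iters=10, kappa=0.5, seed=0):
    # Sequential model-based optimisation (minimisation) over a finite
    # candidate set, with a crude distance-based surrogate standing in
    # for a Gaussian-process response surface model.
    rng = random.Random(seed)
    evaluated = {}
    # initial design: a few randomly chosen configurations
    for x in rng.sample(candidates, n_init):
        evaluated[x] = f(x)
    for _ in range(iters):
        def acquisition(x):
            # predict value of nearest evaluated point, reward distance
            d, y = min((abs(x - xe), ye) for xe, ye in evaluated.items())
            return y - kappa * d
        x = min((c for c in candidates if c not in evaluated),
                key=acquisition, default=None)
        if x is None:
            break          # every candidate already evaluated
        evaluated[x] = f(x)  # perform run, update model data
    return min(evaluated, key=evaluated.get)
```

With a budget smaller than the candidate set, the surrogate decides which runs are performed at all; that is the whole point of model-based configuration when each run is expensive.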


SLIDES 27–35

Sequential Model-based Optimisation

[Animation: parameter vs. response plot; measured points from the initial design, a fitted model with its predicted best, new runs performed at predicted-best configurations, repeated model updates, and eventually a new incumbent found]

SLIDE 36

Sequential Model-based Algorithm Configuration (SMAC)

Hutter, HH, Leyton-Brown (2011)

  • uses random forest model to predict performance of parameter configurations
  • predictions based on algorithm parameters and instance features, aggregated across instances
  • finds promising configurations based on expected improvement criterion, using multi-start local search and random sampling
  • initialisation with single configuration (algorithm default or randomly chosen)
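For reference, the standard closed-form expected improvement for a Gaussian predictive distribution (minimisation) is easy to compute with only the standard library. Note this is the textbook criterion, not SMAC's exact variant (SMAC works with a random-forest model over log running times):

```python
import math

def expected_improvement(mu, sigma, best):
    # EI for minimisation: E[max(best - Y, 0)] with Y ~ N(mu, sigma^2).
    # Closed form: (best - mu) * Phi(z) + sigma * phi(z),
    # where z = (best - mu) / sigma.
    if sigma <= 0.0:
        return max(best - mu, 0.0)   # deterministic prediction
    z = (best - mu) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # std normal pdf
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))         # std normal cdf
    return (best - mu) * Phi + sigma * phi
```

EI is largest for configurations predicted to be good (low mu) or highly uncertain (large sigma), which is what makes it a useful acquisition criterion for the multi-start search mentioned above.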


SLIDE 37

Which choices matter? Global perspectives

Observation: Some design choices matter more than others, depending on . . .

  • algorithm under consideration
  • given use context

Knowledge of which choices / parameters matter may . . .

  • guide algorithm development
  • facilitate configuration


SLIDE 38

Forward selection based on empirical performance models

Hutter, HH, Leyton-Brown (2013)

Key idea:

  • build regression models of algorithm performance as a function of input parameters (= design choices): empirical performance models (EPMs)
  • consider only a subset of parameters S, ignore all others
  • starting with S = ∅, iteratively add parameters one at a time
  • in each iteration, greedily add the parameter resulting in the maximal improvement in accuracy of the regression model
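The greedy selection loop can be sketched directly. The "model" below is an invented stand-in for an EPM: it predicts a configuration's performance as the mean over training configurations that agree with it on the selected subset S (a crude marginal predictor), which is enough to show how subset accuracy drives the selection order.

```python
import math

def rmse_with_subset(data, S):
    # data: list of (config_dict, performance) pairs.
    # Toy EPM: predict the mean performance of all configurations
    # sharing the same values on the parameter subset S.
    groups = {}
    for cfg, y in data:
        groups.setdefault(tuple(cfg[p] for p in S), []).append(y)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    err = [(means[tuple(cfg[p] for p in S)] - y) ** 2 for cfg, y in data]
    return math.sqrt(sum(err) / len(err))

def forward_select(data, params, k):
    # starting with S = [], greedily add the parameter whose inclusion
    # most improves model accuracy, k times
    S = []
    for _ in range(k):
        best = min((p for p in params if p not in S),
                   key=lambda p: rmse_with_subset(data, S + [p]))
        S.append(best)
    return S
```

On data generated by a function dominated by one parameter, that parameter is selected first, mirroring the "few parameters suffice" findings on the next slides.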


SLIDES 39–42

EPMs work:

Hutter, HH, Leyton-Brown (to appear in AIJ)

[Scatter plots of true vs. predicted running times [log10 CPU sec] for SPEAR on SAT-encoded IBM software verification problems and for CPLEX 12.1 on MIP problems from computational sustainability]

SLIDE 43

Empirical study:

  • high-performance solvers for SAT, MIP, TSP (23–76 parameters), well-known sets of benchmark data (real-world structure)
  • random forest models for performance prediction, trained on 1000 randomly sampled configurations per solver


SLIDES 44–45

Good prediction accuracies for few parameters:

[Plots of RMSE for log10 running times in CPU sec vs. parameter subset size (up to 10), for LK-H on TSPLIB and for SPEAR on SAT-encoded IBM software verification problems]

SLIDES 46–47

How important is each parameter?

Cost of omission = impact on model accuracy from excluding single parameters

[Bar chart for LK-H on TSPLIB, importance normalised to 100 for the most important parameter; parameters shown include RUNS, MAX_BREADTH, PATCHING_C_OPTIONS, STABLE_RUNS, MAX_CANDIDATES_SYMMETRIC, MOVE_TYPE, BACKTRACKING, SUBSEQUENT_MOVE_TYPE, MAX_CANDIDATES, EXCESS]

[Bar chart for SPEAR on SAT-encoded IBM software verification problems, importance normalised to 100 for the most important parameter; parameters shown include sp-restart-inc, sp-rand-phase-dec-freq, sp-update-dec-queue, sp-rand-var-dec-scaling, sp-resolution, sp-var-activity-inc, sp-rand-var-dec-freq, sp-variable-decay, sp-phase-dec-heur, sp-var-dec-heur]

SLIDE 48

Functional ANOVA based on empirical performance models

Hutter, HH, Leyton-Brown (in preparation)

Key idea:

  • build regression model of algorithm performance as a function of all input parameters (= design choices): empirical performance models (EPMs)
  • analyse variance in model output (= predicted performance) due to each parameter and to parameter interactions
  • importance of a parameter: fraction of performance variation over the configuration space explained by it (main effect)
  • analogous for sets of parameters (interaction effects)


SLIDE 49

Decomposition of variance in a nutshell

For parameters p1, . . . , pn and a function (performance model) y:

y(p1, . . . , pn) = µ
    + f1(p1) + f2(p2) + · · · + fn(pn)
    + f1,2(p1, p2) + f1,3(p1, p3) + · · · + fn−1,n(pn−1, pn)
    + f1,2,3(p1, p2, p3) + · · ·
    + · · ·
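For a small discrete space the main-effect terms of this decomposition can be computed exactly by brute force, which makes the definition concrete. This sketch assumes uniform weight over a full factorial grid and computes, for each parameter, the fraction of total variance explained by its main effect f_i(p_i) = E[y | p_i] − µ (real functional ANOVA instead computes these marginals efficiently from a random forest EPM, as the next slide notes):

```python
import itertools
import statistics

def main_effect_fractions(domains, y):
    # domains: {param_name: list_of_values}; y: config_dict -> performance.
    # Exact variance decomposition over the full factorial grid.
    params = list(domains)
    grid = [dict(zip(params, vals))
            for vals in itertools.product(*(domains[p] for p in params))]
    values = [y(cfg) for cfg in grid]
    mu = statistics.fmean(values)
    total_var = statistics.fmean((v - mu) ** 2 for v in values)
    fractions = {}
    for p in params:
        # marginal prediction E[y | p = v], averaged over all other params
        marginals = [statistics.fmean(y(cfg) for cfg in grid if cfg[p] == v)
                     for v in domains[p]]
        # variance of the main effect as a fraction of total variance
        fractions[p] = statistics.fmean((m - mu) ** 2 for m in marginals) / total_var
    return fractions
```

For an additive function the main-effect fractions sum to 1; any shortfall from 1 is exactly the share attributable to interaction effects.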


SLIDE 50

Note:

  • Straightforward computation of main and interaction effects is intractable (integration over combinatorial spaces of configurations).
  • For random forest models, marginal performance predictions and variance decomposition (up to constant-sized interactions) can be computed exactly and efficiently.


SLIDE 51

Empirical study:

  • 8 high-performance solvers for SAT, ASP, MIP, TSP (4–85 parameters)
  • 12 well-known sets of benchmark data (random + real-world structure)
  • random forest models for performance prediction, trained on 10 000 randomly sampled configurations per solver + data from 25+ runs of the SMAC configuration procedure


SLIDE 52

Fraction of variance explained by main effects:

CPLEX on RCW (comp sust): 70.3%
CPLEX on CORLAT (comp sust): 35.0%
Clasp on software verification: 78.9%
Clasp on DB query optimisation: 62.5%
CryptoMiniSAT on bounded model checking: 35.5%
CryptoMiniSAT on software verification: 31.9%


SLIDE 53

Fraction of variance explained by main + 2-interaction effects:

CPLEX on RCW (comp sust): 70.3% + 12.7%
CPLEX on CORLAT (comp sust): 35.0% + 8.3%
Clasp on software verification: 78.9% + 14.3%
Clasp on DB query optimisation: 62.5% + 11.7%
CryptoMiniSAT on bounded model checking: 35.5% + 20.8%
CryptoMiniSAT on software verification: 31.9% + 28.5%


SLIDE 54

Note: the analysis may pick up variation caused by poorly performing configurations.

Simple solution: cap performance at the default (or at a quantile from the distribution of randomly sampled configurations); build the model from the capped data.


SLIDE 55

Which choices matter? A local perspective

Note: We are mostly interested in good configurations (obtained from design optimisation).

Questions:

  • Which differences between two configurations matter (and how much)?
  • How close to the default can good performance be obtained?
  • How sensitive is an optimised configuration to parameter changes?

Answers may . . .

  • guide algorithm development
  • facilitate configuration
  • improve performance of default configurations
  • improve robustness of optimised configurations


SLIDE 56

Ablation analysis

Fawcett, HH (under review)

Key idea:

  • given two configurations, A and B, change one parameter at a time to get from A to B: ablation path
  • in each step, change the parameter that achieves maximal gain (or minimal loss) in performance
  • for computational efficiency, use racing (F-Race) for evaluating the parameter changes considered in each step
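The greedy construction of an ablation path can be sketched as follows. The `performance` argument here is a toy stand-in returning a scalar cost; the actual procedure compares candidate changes by racing them on benchmark instances, which this sketch omits.

```python
def ablation_path(A, B, performance):
    # Greedy ablation from configuration A to configuration B
    # (dicts over the same parameters): in each round, apply the single
    # A-to-B parameter change that yields the best (lowest) cost.
    current = dict(A)
    path = []
    while True:
        differing = [p for p in current if current[p] != B[p]]
        if not differing:
            return path   # arrived at B; path records the change order
        best_p = min(differing,
                     key=lambda p: performance({**current, p: B[p]}))
        current[best_p] = B[best_p]
        path.append((best_p, performance(current)))
```

The order in which parameters appear on the path, and the performance after each step, is exactly what the plots on the following slides show: dominant parameters are flipped first.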


SLIDES 57–58

Prototypical ablation results (note: log scale!):

[Plots of two prototypical outcomes: one where all parameters are equally important, one where a few parameters are most important]

SLIDE 59

Empirical study:

  • high-performance solvers for SAT, MIP, AI Planning (26–76 parameters), well-known sets of benchmark data (real-world structure)
  • optimised configurations obtained from ParamILS (minimisation of penalised average running time; 10 runs per scenario, 48 CPU hours each)


SLIDES 60–61

Ablation between default and optimised configurations:

[Ablation-path plots for LPG on the Depots planning domain and for SPEAR on SAT-encoded IBM software verification problems]

SLIDE 62

Which parameters are important?

SPEAR on SAT-encoded IBM software verification instances:

  • sp-var-dec-heur (99.92% of overall performance gain!)
  • sp-rand-var-dec-scaling
  • sp-res-cutoff-cls
  • sp-first-restart

LPG (AI Planning): importance of parameters varies between planning domains


SLIDES 63–64

Ablation between optimised configurations:

  • CPLEX 12.1 on MIP problems from computational sustainability: large plateau of good configurations
  • SPEAR on SAT-encoded IBM software verification problems: possibility of barriers between good configurations

SLIDE 65

Speculation and open questions

Optimisation at the meta-level:

  • candidate solutions are engineering designs
  • evaluation is (very) noisy (problem instances)
  • evaluation is expensive
  • cost of evaluation often depends on quality of the candidate solution (e.g., for minimisation of running time)

different methods, different types of structure, different ways to exploit structure


SLIDE 66

Some hypotheses

HH (2012), Fawcett & HH (under review), Hutter, HH, Leyton-Brown (2013)

  • parameters interact, but not too much
  • individual parameter responses tend to be well-behaved (uni-modal)
  • (few) key parameters need to have certain settings, depending on use context (cf. backbones): large, shallow basins around optimised configurations
  • for highly parametric algorithms, there are barrier-free paths between optimised configurations (cf. neutral paths in landscapes of RNA secondary structures)


SLIDE 67

Open questions

  • Applicability of / insights from standard landscape analysis techniques (auto-correlation, fitness-distance analysis, ...)?
  • Insights that can be exploited for better design optimisers (configurators)?
  • Principles that can guide algorithm developers using PbO to more effectively optimisable designs?


SLIDE 68

Take-home message:

  • exploiting structure in problem instances permits practical solution of hard problems (instance-level structure)
  • structure in the space of algorithms (+ human creativity) facilitates effective construction of good solvers for hard problems (meta-level structure)
  • meta-level structure may differ substantially from instance-level structure
  • PbO (rich algorithm design space + automated configuration) permits (partial) automation of building effective solvers; efficacy depends on exploitation of meta-level structure
