AI-Augmented Algorithms: How I Learned to Stop Worrying and Love Choice – PowerPoint Presentation



SLIDE 1

AI-Augmented Algorithms

How I Learned to Stop Worrying and Love Choice

Lars Kotthoff
University of Wyoming
larsko@uwyo.edu
Warsaw, 17 April 2019

SLIDE 2

Outline

▷ Big Picture
▷ Motivation
▷ Choosing Algorithms
▷ Tuning Algorithms
▷ Applications
▷ Outlook and Resources

2

SLIDE 3

Big Picture

▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things more intelligently – automatically
▷ invent new things through combinations of existing things

3

SLIDE 4

Motivation – What Difference Does It Make?

4

SLIDE 5

Prominent Application

Fréchette, Alexandre, Neil Newman, Kevin Leyton-Brown. “Solving the Station Packing Problem.” In Association for the Advancement of Artificial Intelligence (AAAI), 2016.

5

SLIDE 6

Performance Differences

[Scatter plot: per-instance runtimes from 0.1 to 1000 s (log scale), Virtual Best SAT vs. Virtual Best CSP]

Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.

6

SLIDE 7

Performance Improvements

[Scatter plot: per-instance runtimes from 10^−2 to 10^4 s (log scale), SPEAR original default (s) vs. SPEAR optimized for SWV (s)]

Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.

7

SLIDE 8

Common Theme

Performance models of black-box processes
▷ also called surrogate models
▷ substitute the expensive underlying process with a cheap approximate model
▷ build the approximate model using machine learning techniques, based on results of evaluations of the underlying process
▷ no knowledge of the underlying process required (but can be helpful)
▷ may facilitate better understanding of the underlying process through interrogation of the model
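This can be sketched in a few lines. Below is a minimal, illustrative surrogate; the toy `expensive_process` and the k-nearest-neighbour averaging are invented stand-ins for a costly algorithm run and for the regression models (e.g. random forests, Gaussian processes) used in practice.

```python
def expensive_process(x):
    # stand-in for a costly black-box evaluation, e.g. a full algorithm run
    return (x - 0.3) ** 2

# evaluate the underlying process at a few points
X = [i / 10 for i in range(11)]
y = [expensive_process(x) for x in X]

def surrogate(x, k=3):
    # cheap approximate model: average the k nearest evaluated points;
    # querying it costs almost nothing compared to expensive_process
    nearest = sorted(range(len(X)), key=lambda i: abs(X[i] - x))[:k]
    return sum(y[i] for i in nearest) / k

print(round(surrogate(0.31), 3))
```

Note that the surrogate needs only (input, output) pairs; nothing about the inner workings of `expensive_process` is used.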

8

SLIDE 9

Choosing Algorithms

9

SLIDE 10

Algorithm Selection

Given a problem, choose the best algorithm to solve it.

Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.

10

SLIDE 11

Algorithm Selection

[Diagram: training instances and a portfolio of algorithms (Algorithm 1–3) feed, via feature extraction, into a performance model; for new instances (Instance 4, 5, 6, …), feature extraction plus the model yield a selection, e.g. Instance 4: Algorithm 2, Instance 5: Algorithm 3, Instance 6: Algorithm 3]

11

SLIDE 12

Algorithm Portfolios

▷ instead of a single algorithm, use several complementary algorithms
▷ idea from economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of an algorithm performing poorly
▷ in practice often constructed from competition winners or other algorithms known to have good performance

Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.

12

SLIDE 13

Algorithms

“algorithm” used in a very loose sense
▷ algorithms
▷ heuristics
▷ machine learning models
▷ software systems
▷ machines
▷ …

13

SLIDE 14

Parallel Portfolios

Why not simply run all algorithms in parallel?
▷ not enough resources may be available / waste of resources
▷ algorithms may be parallelized themselves
▷ memory/cache contention
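The trade-off above can be made concrete with a toy parallel portfolio; the two solvers and their runtimes are invented for illustration. Note the waste the slide mentions: the slower solver keeps occupying a thread even after the answer is in.

```python
import concurrent.futures
import time

def solver_a(instance):
    time.sleep(0.2)   # slow on this instance
    return "a"

def solver_b(instance):
    time.sleep(0.05)  # fast on this instance
    return "b"

def parallel_portfolio(instance, solvers):
    # run all solvers at once and take the first answer that arrives
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(s, instance) for s in solvers]
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()  # losers that have already started still run to completion
        return next(iter(done)).result()

answer = parallel_portfolio("some instance", [solver_a, solver_b])
print(answer)  # prints "b", the faster solver
```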

14

SLIDE 15

Building an Algorithm Selection System

▷ requires algorithms with complementary performance
▷ most approaches rely on machine learning
▷ train with representative data, i.e. performance of all portfolio algorithms on a number of instances
▷ evaluate performance on a separate set of instances
▷ potentially large amount of preparatory work
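A deliberately tiny sketch of this pipeline; the feature vectors, labels, and the 1-nearest-neighbour “model” below are invented stand-ins for real instance features and a properly trained classifier.

```python
# training data gathered up front: instance features -> best algorithm,
# determined by running every portfolio algorithm on every training instance
train = [
    ((10, 0.2), "A1"),
    ((12, 0.3), "A1"),
    ((95, 0.8), "A2"),
    ((90, 0.7), "A2"),
]

def select(features):
    # 1-nearest-neighbour selector: pick the algorithm that was best
    # on the most similar training instance
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(train, key=lambda t: dist(t[0], features))[1]

print(select((11, 0.25)))  # A1
print(select((88, 0.75)))  # A2
```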

15

SLIDE 16

Key Components of an Algorithm Selection System

▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler

Optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)

16

SLIDE 17

Types of Performance Models

[Diagram: four ways of predicting the best algorithm per instance –
▷ regression models: predict each algorithm’s performance (e.g. A1: 1.2, A2: 4.5, A3: 3.9)
▷ classification model: predict the best algorithm directly
▷ pairwise classification models: one classifier per pair of algorithms, majority vote (e.g. A1: 1 vote, A2: 0 votes, A3: 2 votes)
▷ pairwise regression models: predict performance differences, aggregated per algorithm (e.g. A1: −1.3, A2: 0.4, A3: 1.7)
All map instances to a selected algorithm, e.g. Instance 1: Algorithm 2, Instance 2: Algorithm 1, Instance 3: Algorithm 3.]

17
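The pairwise-classification scheme can be sketched as follows; the three “models” are hard-coded stand-ins for classifiers trained on each pair of algorithms.

```python
from collections import Counter

def pairwise_vote(instance, pairwise_models):
    # each pairwise model predicts the winner of one (Ai, Aj) duel;
    # the algorithm collecting the most votes is selected
    votes = Counter()
    for model in pairwise_models.values():
        votes[model(instance)] += 1
    return votes.most_common(1)[0][0]

# toy stand-ins for trained pairwise classifiers
models = {
    ("A1", "A2"): lambda inst: "A1",
    ("A1", "A3"): lambda inst: "A3",
    ("A2", "A3"): lambda inst: "A3",
}
print(pairwise_vote("some instance", models))  # A3 (2 votes)
```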

SLIDE 18

Tuning Algorithms

18

SLIDE 19

Algorithm Configuration

Given a (set of) problem(s), find the best parameter configuration.

19

SLIDE 20

Parameters?

▷ anything you can change that makes sense to change
▷ e.g. search heuristic, optimization level, computational resolution
▷ not random seed, whether to enable debugging, etc.
▷ some will affect performance, others will have no effect at all

20

SLIDE 21

Automated Algorithm Configuration

▷ no background knowledge on parameters or algorithm – black-box process
▷ as little manual intervention as possible
▷ failures are handled appropriately
▷ resources are not wasted
▷ can run unattended on large-scale compute infrastructure

21

SLIDE 22

Algorithm Configuration

Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016

22

SLIDE 23

General Approach

▷ evaluate algorithm as a black-box function
▷ observe the effect of parameters without knowing the inner workings, build a surrogate model based on this data
▷ decide where to evaluate next, based on the surrogate model
▷ repeat

23

SLIDE 24

When are we done?

▷ most approaches incomplete, i.e. do not exhaustively explore the parameter space
▷ cannot prove optimality, not guaranteed to find the optimal solution (in finite time)
▷ performance highly dependent on the configuration space

How do we know when to stop?

24

SLIDE 25

Time Budget

How much time / how many function evaluations?
▷ too much: wasted resources
▷ too little: suboptimal result
▷ use statistical tests
▷ evaluate on parts of the instance set
▷ for runtime: adaptive capping
▷ in general: whatever resources you can reasonably invest
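Adaptive capping, in a simplified sketch: never let a configuration run longer than the best runtime seen so far, since a slower configuration cannot win anyway. The `run` callback and the toy runtimes are assumptions for illustration; `run` is taken to return the measured runtime, or the cap if the run was cut off.

```python
def evaluate_with_capping(configs, run, initial_cap=60.0):
    # adaptive capping: a configuration that exceeds the best runtime
    # seen so far cannot be the winner, so it is cut off early
    best_time, best_cfg = initial_cap, None
    for cfg in configs:
        t = run(cfg, cap=best_time)
        if t < best_time:
            best_time, best_cfg = t, cfg
    return best_cfg, best_time

# toy true runtimes for three configurations (invented)
true_time = {"c1": 10.0, "c2": 25.0, "c3": 4.0}
run = lambda cfg, cap: min(true_time[cfg], cap)

print(evaluate_with_capping(["c1", "c2", "c3"], run))  # ('c3', 4.0)
```

Here "c2" is stopped after 10 s instead of being allowed its full 25 s run.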

25

SLIDE 26

Grid and Random Search

▷ evaluate certain points in parameter space

Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
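Random search is short enough to sketch in full; the two-parameter space and the objective below are invented placeholders for an algorithm’s parameters and its measured performance.

```python
import random

def random_search(objective, space, budget=50, seed=0):
    # sample configurations uniformly at random, keep the best one
    rng = random.Random(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(budget):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        val = objective(cfg)
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

space = {"x": (-1.0, 1.0), "y": (-1.0, 1.0)}
cfg, val = random_search(lambda c: c["x"] ** 2 + c["y"] ** 2, space)
print(round(val, 3))
```

Grid search would instead evaluate a fixed lattice of points; Bergstra and Bengio’s observation is that random sampling tends to cover the important dimensions better for the same budget.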

26

SLIDE 27

Model-Based Search

▷ evaluate a small number of configurations
▷ build a model of the parameter–performance surface based on the results
▷ use the model to predict where to evaluate next
▷ repeat
▷ allows targeted exploration of new configurations
▷ can take instance features into account, like algorithm selection

Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.
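A heavily simplified sketch of this loop. The nearest-neighbour “surrogate” and the distance-based exploration bonus below are crude stand-ins for the random-forest model and the expected-improvement criterion used by SMAC and similar tools.

```python
def smbo(objective, candidates, n_init=3, budget=10):
    # sequential model-based optimization over a finite candidate set
    evaluated = {x: objective(x) for x in candidates[:n_init]}
    for _ in range(budget - n_init):
        def acquisition(x):
            # predicted value at the nearest evaluated point (exploitation)
            # minus the distance to it (exploration bonus)
            nearest = min(evaluated, key=lambda e: abs(e - x))
            return evaluated[nearest] - abs(nearest - x)
        remaining = [c for c in candidates if c not in evaluated]
        if not remaining:
            break
        nxt = min(remaining, key=acquisition)  # most promising next point
        evaluated[nxt] = objective(nxt)
    return min(evaluated, key=evaluated.get)

# minimise (x - 0.6)^2 over a 1-D grid of candidate configurations
candidates = [i / 20 for i in range(21)]
best = smbo(lambda x: (x - 0.6) ** 2, candidates)
print(best)
```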

27

SLIDE 28

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial design points. Iter = 1, Gap = 1.9909e−01]

28

SLIDE 29

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 2, Gap = 1.9909e−01]

29

SLIDE 30

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 3, Gap = 1.9909e−01]

30

SLIDE 31

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 4, Gap = 1.9992e−01]

31

SLIDE 32

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 5, Gap = 1.9992e−01]

32

SLIDE 33

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 6, Gap = 1.9996e−01]

33

SLIDE 34

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 7, Gap = 2.0000e−01]

34

SLIDE 35

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 8, Gap = 2.0000e−01]

35

SLIDE 36

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 9, Gap = 2.0000e−01]

36

SLIDE 37

Model-Based Search Example

[Plot: true function y, surrogate prediction yhat, and expected improvement ei over x ∈ [−1.0, 1.0]; initial and sequentially proposed points. Iter = 10, Gap = 2.0000e−01]

37

SLIDE 38

Selected Applications

38

SLIDE 39

Compiler Parameter Tuning

▷ pre-defined optimization levels offer little flexibility
▷ improvements possible by tuning the full compiler parameter space
▷ tuned compute-intensive AI algorithms
▷ up to 40% runtime improvement over gcc -O2/-O3

Pérez Cáceres, Leslie, Federico Pagnozzi, Alberto Franzin, and Thomas Stützle. “Automatic Configuration of GCC Using Irace.” In Artificial Evolution, edited by Evelyne Lutton, Pierrick Legrand, Pierre Parrend, Nicolas Monmarché, and Marc Schoenauer, 202–16. Cham: Springer International Publishing, 2018.

39

SLIDE 40

Application – Optimizing Graphene Oxide Reduction

▷ reduce graphene oxide to graphene through laser irradiation
▷ makes it possible to create electrically conductive lines in insulating material
▷ laser parameters need to be tuned carefully to achieve good results

40

SLIDE 41

From Graphite/Coal to Carbon Electronics

Overview of the Process

41

SLIDE 42

Evaluation of Irradiated Material

42

SLIDE 43

Morphology of Irradiated Material

43

SLIDE 44

Surrogate-Model-Based Optimization

[Plot: ratio (2 to 6) over iterations 10 to 50]

44

SLIDE 45

Surrogate-Model-Based Optimization

[Plot: ratio (2 to 6) over iterations 2 to 8, during training and after 1st prediction, with predicted vs. actual values; micrographs at 50 µm scale]

▷ predictions work even with a small training dataset (19 points)
▷ AI model achieved IG/ID ratio (>6) after 1st prediction

45

SLIDE 46

Explored Parameter Space

[Scatter plot: 48 numbered evaluated configurations across the parameter space, coloured by ratio (2 to 6)]

46

SLIDE 47

Outlook

47

SLIDE 48

Quo Vadis, Software Engineering?

Run

Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.

48

SLIDE 49

Quo Vadis, Software Engineering?

Run + AI

Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.

48

SLIDE 50

(Much) More Information

https://larskotthoff.github.io/assurvey/

Kotthoff, Lars. “Algorithm Selection for Combinatorial Search Problems: A Survey.” AI Magazine 35, no. 3 (2014): 48–60.

49

SLIDE 51

Tools and Resources

LLAMA https://bitbucket.org/lkotthoff/llama
SATzilla http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/
iRace http://iridia.ulb.ac.be/irace/
mlrMBO https://github.com/mlr-org/mlrMBO
SMAC http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint https://github.com/HIPS/Spearmint
TPE https://jaberg.github.io/hyperopt/
autofolio https://bitbucket.org/mlindauer/autofolio/
Auto-WEKA http://www.cs.ubc.ca/labs/beta/Projects/autoweka/
Auto-sklearn https://github.com/automl/auto-sklearn

50

SLIDE 52

Summary

Algorithm Selection: choose the best algorithm for solving a problem
Algorithm Configuration: choose the best parameter configuration for solving a problem with an algorithm

▷ mature research areas
▷ can combine configuration and selection
▷ effective tools are available
▷ COnfiguration and SElection of ALgorithms group COSEAL http://www.coseal.net

Don’t set parameters prematurely, embrace choice!

51

SLIDE 53

I’m hiring!

Several funded graduate positions available.

52