A Hands-On Introduction to Automatic Machine Learning, Lars Kotthoff (PowerPoint presentation)

SLIDE 1

A Hands-On Introduction to Automatic Machine Learning

Lars Kotthoff

University of Wyoming larsko@uwyo.edu AutoML Workshop, 28 August 2018, Nanjing

SLIDE 2

Machine Learning

Data → Machine Learning → Predictions

SLIDE 3

Automatic Machine Learning

Hyperparameter Tuning
Data → Machine Learning → Predictions

SLIDE 4

Grid and Random Search

▷ evaluate a fixed set of points in the hyperparameter space: every combination of predefined values (grid search) or randomly sampled configurations (random search)

Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
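
The difference can be sketched in a few lines of Python. The quadratic loss below is purely illustrative and stands in for a cross-validated error; the parameter names only mirror the SVM example later in the deck.

```python
import itertools
import random

# Toy objective to minimise; a stand-in for a cross-validated loss.
# The optimum at (0.3, 0.7) is chosen arbitrarily for illustration.
def loss(c, gamma):
    return (c - 0.3) ** 2 + (gamma - 0.7) ** 2

c_values = [0.0, 0.25, 0.5, 0.75, 1.0]
gamma_values = [0.0, 0.25, 0.5, 0.75, 1.0]

# Grid search: evaluate every combination of the predefined values.
grid_best = min(itertools.product(c_values, gamma_values),
                key=lambda p: loss(*p))

# Random search: spend the same budget (25 evaluations) on points
# sampled uniformly at random from the same space.
random.seed(0)
random_points = [(random.uniform(0, 1), random.uniform(0, 1))
                 for _ in range(25)]
random_best = min(random_points, key=lambda p: loss(*p))

print("grid best:", grid_best)
print("random best:", random_best)
```

Note that with the same budget the grid tries only 5 distinct values per parameter, while random search tries 25; this is the central argument of the cited paper for why random search does better when only a few parameters matter.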

SLIDE 5

Local Search

▷ start with a random configuration
▷ change a single parameter (local search step)
▷ if better, keep the change, else revert
▷ repeat; stop when resources are exhausted or the desired solution quality is achieved
▷ restart occasionally with new random configurations
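
The steps above can be sketched as follows; the objective and the parameter values are illustrative, not part of the original slides.

```python
import random

# Illustrative objective to minimise (a stand-in for validation error);
# the optimum is at C = 0.5, gamma = 0.5.
def loss(config):
    return sum((v - 0.5) ** 2 for v in config.values())

params = {'C': [0.01, 0.1, 0.5, 1.0], 'gamma': [0.001, 0.01, 0.1, 0.5]}

random.seed(0)

def random_config():
    return {k: random.choice(v) for k, v in params.items()}

def local_search(budget, restart_every=20):
    best = current = random_config()
    for step in range(budget):
        if step > 0 and step % restart_every == 0:
            current = random_config()          # occasional random restart
        # local search step: change a single parameter
        neighbour = dict(current)
        k = random.choice(list(params))
        neighbour[k] = random.choice(params[k])
        if loss(neighbour) <= loss(current):   # if better, keep; else revert
            current = neighbour
        if loss(current) < loss(best):
            best = current
    return best

best = local_search(100)
print(best, loss(best))
```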

SLIDE 6

Local Search Example

(Initialisation)

graphics by Holger Hoos

SLIDE 7

Local Search Example

(Initialisation)

graphics by Holger Hoos

SLIDE 8

Local Search Example

(Local Search)

graphics by Holger Hoos

SLIDE 9

Local Search Example

(Local Search)

graphics by Holger Hoos

SLIDE 10

Local Search Example

(Perturbation)

graphics by Holger Hoos

SLIDE 11

Local Search Example

(Local Search)

graphics by Holger Hoos

SLIDE 12

Local Search Example

(Local Search)

graphics by Holger Hoos

SLIDE 13

Local Search Example

(Local Search)

graphics by Holger Hoos

SLIDE 14

Local Search Example


Selection (using Acceptance Criterion)

graphics by Holger Hoos

SLIDE 15

Model-Based Search

▷ evaluate a small number of configurations
▷ build a model of the parameter-performance surface based on the results
▷ use the model to predict where to evaluate next
▷ repeat; stop when resources are exhausted or the desired solution quality is achieved
▷ allows targeted exploration of promising configurations
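
A common choice for "where to evaluate next" is the expected improvement (EI) criterion that appears in the plots on the following slides. A minimal sketch, assuming the surrogate model yields a Gaussian prediction (mean mu, standard deviation sigma) at each candidate point and that we are minimising:

```python
import math

# Expected improvement over the best observed value best_y, given a
# Gaussian surrogate prediction (mu, sigma) at a candidate point.
def expected_improvement(mu, sigma, best_y):
    if sigma == 0:
        return max(best_y - mu, 0.0)
    z = (best_y - mu) / sigma
    pdf = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)   # standard normal density
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))          # standard normal CDF
    return (best_y - mu) * cdf + sigma * pdf

# A candidate predicted to be slightly worse than the incumbent but with
# high uncertainty still has substantial expected improvement, while the
# same prediction with low uncertainty has almost none:
print(expected_improvement(mu=1.1, sigma=0.5, best_y=1.0))
print(expected_improvement(mu=1.1, sigma=0.01, best_y=1.0))
```

This trade-off between predicted quality and uncertainty is what makes the exploration targeted rather than greedy.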

SLIDE 16

Model-Based Search Example

[Plot: initial design points (init) and proposed point (prop), objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 1, Gap = 1.9909e−01]

Bischl, Bernd, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, and Michel Lang. “MlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions,” March 9, 2017. http://arxiv.org/abs/1703.03373.

SLIDE 17

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 2, Gap = 1.9909e−01]

SLIDE 18

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 3, Gap = 1.9909e−01]

SLIDE 19

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 4, Gap = 1.9992e−01]

SLIDE 20

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 5, Gap = 1.9992e−01]

SLIDE 21

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 6, Gap = 1.9996e−01]

SLIDE 22

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 7, Gap = 2.0000e−01]

SLIDE 23

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 8, Gap = 2.0000e−01]

SLIDE 24

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 9, Gap = 2.0000e−01]

SLIDE 25

Model-Based Search Example

[Plot: initial (init) and sequentially proposed (prop/seq) points, objective y, surrogate prediction yhat, and expected improvement ei over x; Iter = 10, Gap = 2.0000e−01]

SLIDE 26

Problems

▷ How good are we really?
▷ How much of it is just random chance?
▷ Can we do better?

SLIDE 27

Underlying Issues

▷ true performance landscape unknown
▷ resources allow exploring only a tiny part of the hyperparameter space
▷ results inherently stochastic

SLIDE 28

Potential Solutions

▷ better-understood benchmarks
▷ more comparisons
▷ more runs with different random seeds
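
The last point can be made concrete: repeat the whole experiment under several seeds and report the spread rather than a single, possibly lucky, number. Everything below is illustrative; tune() is a hypothetical stand-in for one complete tuning run returning the best accuracy it found.

```python
import random
import statistics

# Hypothetical stand-in for one complete tuning run under a given seed;
# in practice this would be a full grid/random/model-based search.
def tune(seed):
    rng = random.Random(seed)
    samples = [rng.uniform(0.8, 1.0) for _ in range(20)]
    return max(samples)

# Repeat the experiment across seeds and report mean and spread,
# instead of the result of a single run.
results = [tune(seed) for seed in range(10)]
print("mean:", statistics.mean(results))
print("stdev:", statistics.stdev(results))
```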

SLIDE 29

Two-Slide MBO ML

# http://www.cs.uwyo.edu/~larsko/mbo.py
import random
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()

params = { 'C': np.logspace(-2, 10, 13), 'gamma': np.logspace(-9, 3, 13) }
param_grid = [ { 'C': x, 'gamma': y }
               for x in params['C'] for y in params['gamma'] ]
# [{'C': 0.01, 'gamma': 1e-09}, {'C': 0.01, 'gamma': 1e-08}...]

initial_samples = 3
evals = 10
random.seed(1)

def est_acc(pars):
    clf = svm.SVC(**pars)
    return np.median(cross_val_score(clf, iris.data, iris.target, cv = 10))

data = []
for pars in random.sample(param_grid, initial_samples):
    acc = est_acc(pars)
    data += [ list(pars.values()) + [ acc ] ]
# [[1.0, 0.1, 1.0],
#  [1000000000.0, 1e-07, 1.0],
#  [0.1, 1e-06, 0.9333333333333333]]

SLIDE 30

Two-Slide MBO ML

from sklearn.ensemble import RandomForestRegressor

regr = RandomForestRegressor(random_state = 0)
for it in range(evals):
    df = np.array(data)
    regr.fit(df[:,0:2], df[:,2])
    preds = regr.predict([ list(pars.values()) for pars in param_grid ])
    i = preds.argmax()
    acc = est_acc(param_grid[i])
    data += [ list(param_grid[i].values()) + [ acc ] ]
    print("{}: best predicted {} for {}, actual {}"
          .format(it, round(preds[i], 2), param_grid[i], round(acc, 2)))

i = np.array(data)[:,2].argmax()
print("Best accuracy ({}) for parameters {}".format(data[i][2], data[i][0:2]))

SLIDE 31

Two-Slide MBO ML

0: best predicted 0.99 for {'C': 1.0, 'gamma': 1e-09}, actual 0.93
1: best predicted 0.99 for {'C': 1000000000.0, 'gamma': 1e-09}, actual 0.93
2: best predicted 0.99 for {'C': 1000000000.0, 'gamma': 0.1}, actual 0.93
3: best predicted 0.97 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
4: best predicted 0.99 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
5: best predicted 1.0 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
6: best predicted 1.0 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
7: best predicted 1.0 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
8: best predicted 1.0 for {'C': 0.01, 'gamma': 0.1}, actual 0.93
9: best predicted 1.0 for {'C': 1.0, 'gamma': 0.1}, actual 1.0
Best accuracy (1.0) for parameters [1.0, 0.1]

SLIDE 32

Tools and Resources

iRace        http://iridia.ulb.ac.be/irace/
TPOT         https://github.com/EpistasisLab/tpot
mlrMBO       https://github.com/mlr-org/mlrMBO
SMAC         http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint    https://github.com/HIPS/Spearmint
TPE          https://jaberg.github.io/hyperopt/
Auto-WEKA    http://www.cs.ubc.ca/labs/beta/Projects/autoweka/
Auto-sklearn https://github.com/automl/auto-sklearn

Available soon: edited book on automatic machine learning https://www.automl.org/book/ (Frank Hutter, Lars Kotthoff, Joaquin Vanschoren)

SLIDE 33

I’m hiring!

Several funded graduate/postdoc positions available.
