Bringing the human back in the loop (Joaquin Vanschoren, TU/e)




SLIDE 1

Online collaboration Meta-learning

Optimization

Automate data analysis

(new) data initial workflows improved workflows

Bringing the human back in the loop

Joaquin Vanschoren, TU/e

SLIDE 2
SLIDE 3

Online collaboration Meta-learning

Optimization

Robot assistants

Automating machine learning: a human-robot symbiosis

OpenML

(new) data initial workflows improved workflows

datasets, workflows, meta-data meta-models


SLIDE 4

(new) data

Robot assistants

initial workflows

  • Data type bot: detects/assigns correct data types, removes unique/constant features
  • Missing value bot: detects miscoded missing values, correlation with target
  • Anomaly detection (WTF) bot: spurious strings, data artifacts

Thanks to Rich Caruana

DataDiff bot: detects data changes (data type, statistical deviations, …)
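
The checks listed for the data type and missing value bots can be sketched roughly as follows. This is a minimal illustration, not the bots' actual implementation: the function name `profile_column` and the `MISSING_TOKENS` set are hypothetical, and the choice of sentinel strings is an assumption.

```python
# Hypothetical sentinel strings that often stand in for missing values.
MISSING_TOKENS = {"", "?", "na", "n/a", "nan", "null", "none", "-999"}


def _is_number(text):
    """True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile_column(values):
    """Report type, miscoded missing values, and constant/unique status
    for one column given as a list of raw strings."""
    cleaned = [v.strip() for v in values]
    present = [v for v in cleaned if v.lower() not in MISSING_TOKENS]
    n_missing = len(cleaned) - len(present)

    numeric = bool(present) and all(_is_number(v) for v in present)
    distinct = len(set(present))
    return {
        "dtype": "numeric" if numeric else "categorical",
        "n_missing": n_missing,
        # A constant column carries no information for a model.
        "constant": distinct <= 1,
        # A categorical column with all-distinct values is likely an identifier.
        "unique": (not numeric) and len(present) > 1 and distinct == len(present),
    }
```

For example, `profile_column(["1", "2.5", "?", "3"])` reports a numeric column with one miscoded missing value.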

SLIDE 5

(new) data

Robot assistants

initial workflows

  • Data leak bot: detects if test data leaks into the training set (e.g. by inspecting workflow, withholding data)
  • Encoding bot: converts to numeric data depending on ML algo (SVM, kNN, NN)
  • Label imbalance bot: detects/reduces class imbalance (e.g. SMOTE)
  • Data similarity bot: finds datasets similar to yours (meta-features)
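
Two of these checks are easy to illustrate. The sketch below flags test rows that appear verbatim in the training set and measures class imbalance. It is a deliberately simple stand-in: the slide mentions inspecting the workflow and withholding data, and SMOTE for rebalancing, none of which is implemented here. The function names are hypothetical.

```python
from collections import Counter


def find_leaked_rows(train_rows, test_rows):
    """Return indices of test rows that also occur verbatim in the training rows.

    Exact duplicates are only one symptom of leakage; this does not inspect
    the workflow itself.
    """
    seen = {tuple(row) for row in train_rows}
    return [i for i, row in enumerate(test_rows) if tuple(row) in seen]


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())
```

A ratio well above 1 would be the bot's cue to suggest a rebalancing technique.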

SLIDE 6

(new) data

Robot assistants

initial workflows

  • Feature selection bot: recommends/runs feature selection techniques
  • Runtime prediction bot: predicts how long an ML algorithm will run on your data
  • Imputation bot: recommends/runs missing value imputation techniques
  • Outlier detection bot: recommends/runs outlier detection techniques
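
As one concrete instance of what the imputation and outlier detection bots might run, here is a sketch using median imputation and the interquartile-range rule. Both techniques are assumptions chosen for brevity; the slides do not say which techniques the bots recommend.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```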

SLIDE 7

(new) data

Robot assistants

initial workflows

  • Random Bot: runs random search given a hyperparameter space
  • Greedy Bot: learns key algorithms, hyperparameters, ranges. Tries those first.
  • Optimization bots: run advanced hyperparameter optimization
  • Workflow bot: builds ML workflows, in collaboration with other bots
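
Random search over a hyperparameter space, as the Random Bot does, can be sketched in a few lines. The space format (a list for categorical choices, a `(low, high)` tuple for continuous ranges) and the `objective` callback are assumptions made for this illustration.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials configurations at random and return the best one.

    space maps each hyperparameter name to either a list of choices
    or a (low, high) tuple for a uniform continuous range.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {}
        for name, spec in space.items():
            if isinstance(spec, list):
                config[name] = rng.choice(spec)      # categorical
            else:
                low, high = spec
                config[name] = rng.uniform(low, high)  # continuous
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In practice `objective` would train a model with the given configuration and return a validation score.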

SLIDE 8

Random Bot on OpenML

SLIDE 9
SLIDE 10

Online collaboration Robot assistants

Publish workflows on OpenML for collaboration

OpenML

(new) data initial workflows improved workflows

datasets, workflows, meta-data meta-models


SLIDE 11
SLIDE 12
SLIDE 13
SLIDE 14

Online collaboration Meta-learning

Optimization

Robot assistants

Learn from results of humans and robots, use that to build better bots

OpenML

(new) data initial workflows improved workflows

datasets, workflows, meta-data meta-models

SLIDE 15

Learn from results of humans and robots => build better bots, more insight

  • Data similarity bot: meta-features across workflow
  • Feature selection bot: how do different techniques affect learning performance?
  • Runtime prediction bot: many runtime results
  • Optimization bots: use meta-data to search hyperparameter space more effectively
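
One plausible way for a data similarity bot to compare datasets is to treat each dataset's meta-features as a vector and rank by distance. The sketch below assumes min-max scaling and Euclidean distance; the meta-feature names in the usage note are hypothetical examples.

```python
import math


def rank_similar_datasets(target, known):
    """Rank known datasets by closeness to the target in meta-feature space.

    target: dict of meta-feature name -> value
    known:  dict of dataset name -> dict of the same meta-features
    """
    keys = sorted(target)
    everything = list(known.values()) + [target]
    lo = {k: min(d[k] for d in everything) for k in keys}
    hi = {k: max(d[k] for d in everything) for k in keys}

    def scaled(d):
        # Min-max scale so that no single meta-feature dominates the distance.
        return [
            (d[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in keys
        ]

    t = scaled(target)
    dists = {name: math.dist(t, scaled(d)) for name, d in known.items()}
    return sorted(dists, key=dists.get)
```

Called with a target of `{"n_rows": 150, "n_features": 12}`, a small dataset with similar dimensions would rank ahead of a much larger one.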

SLIDE 16
  • Warm start: initialize search with promising configurations
  • Transfer learning, e.g. surrogate models with prior
  • Acquisition functions based on meta-models

Combine meta-learning and optimization

Bayesian optimization
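
The warm-start idea can be illustrated independently of the optimizer. In the sketch below, configurations that did well on earlier datasets are evaluated first, and only then does the search fall back to sampling. Plain random sampling stands in for the Bayesian optimization loop here, which is a simplification; `warm_configs` and `sample_config` are hypothetical names.

```python
import random


def warm_started_search(objective, warm_configs, sample_config, n_trials, seed=0):
    """Evaluate promising configurations first, then continue with sampling.

    warm_configs:  configurations suggested by meta-learning (tried first)
    sample_config: callable taking a random.Random and returning a new config
    Returns the (score, config) pair with the highest score.
    """
    rng = random.Random(seed)
    history = []
    for i in range(n_trials):
        if i < len(warm_configs):
            config = warm_configs[i]        # warm start
        else:
            config = sample_config(rng)     # stand-in for the optimizer's proposal
        history.append((objective(config), config))
    return max(history, key=lambda item: item[0])
```

With a real Bayesian optimizer, the warm-start evaluations would also seed the surrogate model before the first acquisition step.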

SLIDE 17
  • Acquisition functions based on meta-models

Combine meta-learning and optimization

SLIDE 18

Combine meta-learning and optimization

(Adaptive) multi-armed bandits

  • Use meta-learning to predict which configurations to race first
  • Learn from previous iterations
  • Active testing: choose configurations that outperformed the surviving configurations on similar datasets.
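
The active testing step described above can be sketched as a selection rule: among the untested candidates, pick the one that most outperformed the current survivor on prior datasets, weighting each dataset by its similarity to the new one. This is a simplified illustration; the data layout and the name `next_candidate` are assumptions.

```python
def next_candidate(incumbent, candidates, prior_scores, similarity):
    """Choose the candidate with the largest similarity-weighted gain.

    prior_scores: dataset name -> {configuration name -> score}
    similarity:   dataset name -> weight (higher means more similar)
    """
    def expected_gain(cand):
        # Only count datasets where the candidate beat the incumbent.
        return sum(
            similarity[d] * max(0.0, scores[cand] - scores[incumbent])
            for d, scores in prior_scores.items()
        )

    return max(candidates, key=expected_gain)
```

After the chosen candidate is evaluated on the new dataset, the winner becomes the incumbent and the rule is applied again.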

SLIDE 19

Online collaboration Meta-learning

Optimization

Automate data analysis

(new) data initial workflows improved workflows

Thank you