Better Machine Learning Through Data – Saleema Amershi



slide-1
SLIDE 1

Better Machine Learning Through Data

Saleema Amershi, Machine Teaching Group, Microsoft Research

August 14, 2016

slide-2
SLIDE 2

Making better sense of data. Better data makes better machine learning.

slide-3
SLIDE 3

Data + Algorithm = Model

Data Model Algorithm

slide-4
SLIDE 4

Data + Algorithm = Model

Data Model Algorithm

Machine learning research often takes the data as given.

slide-5
SLIDE 5

When Algorithms Discriminate – The New York Times, 2015

Big Data’s all-too-human failings – Reuters, 2016

Artificial Intelligence’s White Guy Problem – The New York Times, 2016

Mapping Crime – Or Stirring Hate? – Financial Times, 2014

slide-6
SLIDE 6

Making better sense of data. Better data makes better machine learning. Most of the influence practitioners have on machine learning is through data.

slide-7
SLIDE 7

Data + Algorithm = Model

Data Model Algorithm

In research, data is often taken as given.

slide-8
SLIDE 8

Algorithm

Data + Algorithm = Model

Model Data

In practice, the algorithm is often taken as given. In research, data is often taken as given.

slide-9
SLIDE 9

Algorithm

Data + Algorithm = Model

Model Data

“Data scientists, according to interviews and expert estimates, spend 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data.” – New York Times, 2014

In practice, the algorithm is often taken as given.

slide-10
SLIDE 10

Algorithm

Data + Algorithm = Model

Model Data

slide-11
SLIDE 11

[Patel et al., CHI 2008]

slide-12
SLIDE 12

Algorithm

Data + Algorithm = Model

Model Data

slide-13
SLIDE 13

Algorithm

Data + Algorithm = Model

Data Model

Iterations are driven by evaluating models on data.

slide-14
SLIDE 14

Algorithm

Data + Algorithm = Model

Model Data

Iterations are driven by evaluating models on data. In practice, most effort is spent crafting input data.

slide-15
SLIDE 15

Algorithm Model Data

Machine learning in theory

slide-16
SLIDE 16

Algorithm Collect & Label Samples Create Features Evaluate Results

Machine learning in practice

slide-17
SLIDE 17

Algorithm Evaluate Results Collect & Label Samples Create Features

slide-18
SLIDE 18

Algorithm Evaluate Results Collect & Label Samples Create Features Structured Labeling [CHI 2014] Feature Insight [VAST 2015] ModelTracker [CHI 2015, VAST 2016]

slide-19
SLIDE 19

Evaluate Results Create Features Algorithm Feature Insight [VAST 2015] ModelTracker [CHI 2015, VAST 2016] Collect & Label Samples Structured Labeling [CHI 2014]

slide-20 – slide-36
SLIDES 20–36

Traditional Labeling

Pre-defined high-level categories: “Is this a Cat?”, and each item is assigned to either Cat or Not Cat. (Animation frames showing items being sorted into the two categories, with the count in each growing.)

Does not support concept evolution (refining the target concept as data is observed).
slide-37
SLIDE 37

How common is concept evolution?

Nine machine learning experts labeled the same 200 pages in two sessions, 4 weeks apart. Average consistency was 81.7% (SD = 6.8%). 6 out of 9 people’s labels changed significantly (via a chi-square test of symmetry).

(Chart: per-participant consistency, on a 25–100% scale.)
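The consistency figure and the significance test on this slide are straightforward to reproduce. Below is a minimal sketch in Python, assuming each participant's labels are two aligned lists (one per session); it uses Bowker's chi-square test of symmetry, which reduces to McNemar's test when there are only two categories. The example data is invented, not the study's.

```python
# Minimal sketch of the per-participant analysis described on this slide:
# labeling consistency across two sessions, plus a chi-square test of symmetry
# (Bowker's test, which reduces to McNemar's test for two categories).
# The example data below is invented, not the study's.
from collections import Counter
from itertools import combinations

from scipy.stats import chi2


def consistency(session1, session2):
    """Fraction of items given the same label in both sessions."""
    agree = sum(a == b for a, b in zip(session1, session2))
    return agree / len(session1)


def symmetry_test(session1, session2):
    """Bowker's chi-square test of symmetry on the session1 x session2 table."""
    labels = sorted(set(session1) | set(session2))
    table = Counter(zip(session1, session2))  # (label_session1, label_session2) -> count
    stat, df = 0.0, 0
    for a, b in combinations(labels, 2):
        n_ab, n_ba = table[(a, b)], table[(b, a)]
        if n_ab + n_ba > 0:
            stat += (n_ab - n_ba) ** 2 / (n_ab + n_ba)
            df += 1
    p_value = chi2.sf(stat, df) if df else 1.0
    return stat, df, p_value


# Hypothetical participant: labels for the same 200 pages, four weeks apart.
first = ["cat"] * 120 + ["not_cat"] * 80
second = ["cat"] * 100 + ["not_cat"] * 20 + ["cat"] * 10 + ["not_cat"] * 70
print(consistency(first, second))                 # 0.85 for this toy data
stat, df, p = symmetry_test(first, second)
print(f"X2={stat:.2f}, df={df}, p={p:.3f}")
```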

slide-38
SLIDE 38

Proposed Solution – Structured Labeling

Enable people to explicitly organize their concept via grouping and tagging within a traditional labeling scheme.

slide-39
SLIDE 39

Traditional Labeling

Pre-defined high-level categories.

Is this a Cat? Items are labeled Cat or Not Cat (two in each category so far).

slide-40 – slide-46
SLIDES 40–46

Structured Labeling

Grouping within high-level categories: the Cat and Not Cat categories are broken into groups such as “Definitely Cat”, “Definitely Not Cat”, “Cat Poster”, “Blogs”, and “Lions”. User-provided tags on groups aid recall. Groups can be moved, merged, and split as desired. (Animation frames showing the workspace being built up, with item counts per group.)
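A structured-labeling workspace like the one built up above can be captured with a very small data model. The sketch below is illustrative only: the class and method names are invented here, not taken from the CHI 2014 system, but it shows tagged groups nested under the traditional high-level categories, plus the move and merge operations the slides mention.

```python
# Illustrative data model for structured labeling: tagged groups nested under
# the traditional high-level categories (Cat / Not Cat), with the move and
# merge operations mentioned above. Class and method names are hypothetical,
# not taken from the CHI 2014 system.
from dataclasses import dataclass, field


@dataclass
class Group:
    tag: str                                  # user-provided tag, e.g. "Cat Poster", aids recall
    items: list = field(default_factory=list)


@dataclass
class StructuredLabels:
    categories: dict = field(default_factory=dict)   # category -> list of Groups

    def add(self, category, tag, item):
        """Put an item into a (possibly new) tagged group under a category."""
        groups = self.categories.setdefault(category, [])
        group = next((g for g in groups if g.tag == tag), None)
        if group is None:
            group = Group(tag)
            groups.append(group)
        group.items.append(item)

    def merge(self, category, tag_a, tag_b, new_tag):
        """Merge two groups within a category into one."""
        groups = self.categories[category]
        a = next(g for g in groups if g.tag == tag_a)
        b = next(g for g in groups if g.tag == tag_b)
        self.categories[category] = [g for g in groups if g is not a and g is not b]
        self.categories[category].append(Group(new_tag, a.items + b.items))

    def move(self, item, src_category, dst_category, dst_tag):
        """Move an item to another group, e.g. when the concept evolves."""
        for group in self.categories.get(src_category, []):
            if item in group.items:
                group.items.remove(item)
                break
        self.add(dst_category, dst_tag, item)


workspace = StructuredLabels()
workspace.add("Not Cat", "Cat Poster", "page_042")
workspace.add("Not Cat", "Lions", "page_107")
workspace.move("page_042", "Not Cat", "Cat", "Definitely Cat")
```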

slide-47 – slide-48
SLIDES 47–48

Assisted Structured Labeling

Grouping recommendations to improve label consistency. Similar items are shown to help users make decisions. (Same workspace as above, with recommendations displayed.)
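The assistance described here, suggesting a group for a new item and surfacing similar labeled items, can be approximated with plain nearest-neighbor similarity over item feature vectors. The sketch below makes that assumption; it is not the recommender actually used in the CHI 2014 system.

```python
# Minimal sketch of grouping assistance: suggest the existing group whose items
# are most similar to a new item, using cosine similarity over feature vectors.
# Purely illustrative; this is not the recommender used in the CHI 2014 system.
import numpy as np


def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))


def recommend_group(item_vec, groups):
    """groups: dict of group tag -> list of feature vectors already in that group.
    Scores each group by its most similar member and returns the best match."""
    best_tag, best_score = None, -1.0
    for tag, vectors in groups.items():
        score = max(cosine(item_vec, v) for v in vectors)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag, best_score


groups = {
    "Definitely Cat": [np.array([0.9, 0.1, 0.0]), np.array([0.8, 0.2, 0.1])],
    "Cat Poster":     [np.array([0.4, 0.9, 0.0])],
    "Lions":          [np.array([0.1, 0.0, 1.0])],
}
tag, score = recommend_group(np.array([0.85, 0.15, 0.05]), groups)
print(tag, round(score, 3))   # suggests "Definitely Cat" for this toy data
```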

slide-49
SLIDE 49

Findings

People revised labels significantly more with structured labeling. People labeled more consistently. People preferred it over traditional labeling.

(Charts: Label Consistency, Mean # Groups, # Revisions; chi-square tests: X2=6.53, df=2, p<.038; X2=20.19, df=2, p<.001; X2=12, df=2, p<.002.)

slide-50
SLIDE 50

Structured Labeling Summary

Current tools do not support concept evolution.

Structured labeling helps people refine their concepts by surfacing labeling decisions and aiding recall. People used structured labeling when it was available and labeled more consistently. Structure contains additional information (e.g., group-related features, group-related accuracy, decisions made…).

slide-51
SLIDE 51

Evaluate Results Algorithm ModelTracker [CHI 2015, VAST 2016] Collect & Label Samples Create Features Feature Insight [VAST 2015] Structured labeling improves consistency [CHI 2014]

slide-52
SLIDE 52

“At the end of the day, some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used.”

[Domingos, CACM 2012]

…yet, little guidance or best practices exist.

slide-53
SLIDE 53

How do people come up with features?

Look for features used in related domains. Use intuition or domain knowledge. Apply automated techniques. Feature ideation: think of and experiment with custom features (a “black art”).

slide-54
SLIDE 54

Proposed Solution – Feature Insight

Support compare and contrast of data.

slide-55
SLIDE 55

What makes a cat a cat?

slide-56
SLIDE 56

What makes a cat a cat?

slide-57
SLIDE 57

Proposed Solution – Feature Insight

Support compare and contrast of data. Comparing pairs vs sets?

slide-58
SLIDE 58

Comparing Pairs vs Sets

Sets may help people think of generalizable features.

(Figure: a single positive/negative pair vs. sets of positives and negatives.)

slide-59
SLIDE 59

Proposed Solution – Feature Insight

Support compare and contrast of data. Comparing pairs vs sets? Raw data vs visual summaries?

slide-60
SLIDE 60

Looking at Raw Data vs. Visual Summaries

Visual summaries may reveal relevant characteristics and hide irrelevant noise.

(Figure: raw data vs. a visual summary of the same data.)
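For text data, one simple way to build such a visual summary is to aggregate each set and surface the terms that most separate positives from negatives. The sketch below is only one possible realization; the summaries in Feature Insight [VAST 2015] may be computed differently.

```python
# Illustrative "visual summary" for sets of text items: aggregate term counts per
# set and rank terms by how strongly they separate positives from negatives.
# A sketch only; Feature Insight's summaries may be computed differently.
from collections import Counter


def term_counts(documents):
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return counts


def contrast_summary(positives, negatives, top_k=5):
    """Terms over-represented in the positive set relative to the negative set."""
    pos, neg = term_counts(positives), term_counts(negatives)
    pos_total, neg_total = sum(pos.values()), sum(neg.values())

    def score(term):
        # Smoothed ratio of relative frequencies.
        p = (pos[term] + 1) / (pos_total + 1)
        n = (neg[term] + 1) / (neg_total + 1)
        return p / n

    return sorted(pos, key=score, reverse=True)[:top_k]


cat_pages = ["cute cat purring on sofa", "cat whiskers and fur", "my cat purring loudly"]
not_cat_pages = ["lion roaring in the savanna", "cat poster for sale online"]
print(contrast_summary(cat_pages, not_cat_pages))   # "purring" ranks highest for this toy data
```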

slide-61
SLIDE 61

(Study conditions: Raw Data vs. Visual Summaries × Individual Comparison vs. Set Comparison.)

slide-62
SLIDE 62

(Charts: Classifier Performance (p<.01), Feature Count, and Preference Rank, where smaller is better (p=.03), for the four conditions Raw + Individual, Raw + Set, Visual + Individual, Visual + Set.)

Findings

Visual summaries led to better features. Visual summaries were preferred over looking at raw data. Sets were useful only in combination with visual summaries.

slide-63
SLIDE 63

Feature Insight Summary

Featuring is arguably the most important step in machine learning, but there is little guidance on feature ideation. Feature Insight supports error comparison, examination of sets, and visual summaries. Visual summaries help people create better quality features.

slide-64
SLIDE 64

Algorithm Collect & Label Samples Structured Labeling [CHI 2014] Create Features Feature Insight [VAST 2015] Evaluate Results ModelTracker [CHI 2015, VAST 2016]

slide-65
SLIDE 65

Algorithm Evaluate Results ModelTracker [CHI 2015, VAST 2016] Collect & Label Samples Structured Labeling [CHI 2014] Create Features Feature Insight [VAST 2015]

How do people evaluate performance?

slide-66 – slide-68
SLIDES 66–68

Algorithm Evaluate Results Collect & Label Samples Create Features

How do people evaluate performance?

Summary statistics (0.71, 0.67, 0.70, ???) and a confusion matrix:

            Predicted
Actual      Positive   Negative
Positive    143        72
Negative    35         190

Summary statistics hide important information about model behavior. Switching tools to examine data is disruptive and leads to a trial-and-error approach [Patel et al., AAAI 2008].
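To make the point concrete: accuracy, precision, and recall are just a few numbers derived from a confusion matrix like the one above, and two models can share them while misclassifying very different items. A small sketch of the computation follows, using the counts shown on the slide; the 0.71 / 0.67 / 0.70 values on the slide are not claimed to correspond exactly to these metrics.

```python
# Compute common summary statistics from a confusion matrix like the one above.
# Illustrative only: the 0.71 / 0.67 / 0.70 values on the slide are not claimed
# to be exactly these metrics on these counts.
def summary_stats(tp, fn, fp, tn):
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts from the confusion matrix above:
# actual positive: 143 predicted positive, 72 predicted negative
# actual negative: 35 predicted positive, 190 predicted negative
stats = summary_stats(tp=143, fn=72, fp=35, tn=190)
print({name: round(value, 2) for name, value in stats.items()})
# Two models can share these aggregate numbers yet misclassify different items,
# which is exactly what the statistics hide.
```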

slide-69
SLIDE 69

Example: Predicting Income Levels

slide-70 – slide-72
SLIDES 70–72

Decision Tree: 86% accuracy. Support Vector Machine: 85% accuracy.

slide-73
SLIDE 73

ModelTracker Demo

slide-74
SLIDE 74

Significantly faster and more accurate performance analysis.

(Chart: ModelTracker vs. a common confusion matrix interface.)

slide-75
SLIDE 75

ModelTracker Summary

Current tools for performance analysis and debugging hide a lot of important information about model behavior. ModelTracker supports estimating performance at multiple levels of granularity while enabling direct access to data. People are significantly faster and more accurate at performance analysis with ModelTracker.
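The core idea, performance at multiple levels of granularity with direct access to the data, can be approximated in a few lines: keep every example's score and label, bucket the scores, and let each bucket drill down to the items behind it. The sketch below illustrates that idea only; it is not the ModelTracker implementation.

```python
# Illustrative sketch of a ModelTracker-style view: bucket examples by model score
# so aggregate performance and the raw items behind it stay one step apart.
# Not the actual ModelTracker implementation.
from collections import defaultdict


def score_buckets(examples, n_buckets=10):
    """examples: list of (item_id, true_label, score in [0, 1]).
    Returns bucket index -> its examples, so each aggregate can be opened up."""
    buckets = defaultdict(list)
    for item_id, label, score in examples:
        b = min(int(score * n_buckets), n_buckets - 1)
        buckets[b].append((item_id, label, score))
    return buckets


def bucket_report(buckets, threshold=0.5, n_buckets=10):
    for b in range(n_buckets):
        items = buckets.get(b, [])
        if not items:
            continue
        # An error is an item whose thresholded prediction disagrees with its label.
        errors = [i for i in items if (i[1] == 1) != (i[2] >= threshold)]
        print(f"scores [{b / n_buckets:.1f}, {(b + 1) / n_buckets:.1f}): "
              f"{len(items)} items, {len(errors)} errors -> {[i[0] for i in errors]}")


examples = [("doc1", 1, 0.92), ("doc2", 0, 0.88), ("doc3", 0, 0.41),
            ("doc4", 1, 0.35), ("doc5", 0, 0.10)]
bucket_report(score_buckets(examples))
```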

slide-76
SLIDE 76

Algorithm Evaluate Results ModelTracker [CHI 2015, VAST 2016] Collect & Label Samples Structured Labeling [CHI 2014] Create Features Feature Insight [VAST 2015]

slide-77 – slide-79
SLIDES 77–79

(Pipeline stages: Collect, Clean, Label, Feature, Train, Tune, Evaluate, Deploy.)

Many more opportunities to better support machine learning in practice, and in theory.

slide-80
SLIDE 80

Making better sense of data. Better data means better machine learning. Most of the influence practitioners have on machine learning is through data. Many more opportunities!

slide-81
SLIDE 81

Better Machine Learning Through Data

Saleema Amershi, samershi@microsoft.com
Machine Teaching Group, Microsoft Research

August 14, 2016

Thanks! Questions?