SLIDE 1

Foster Provost – 11/17/17

So You’ve Built a Machine Learning Model…

Now What?

Foster Provost

Thanks to Josh Attenberg, Henry Chen, Brian Dalessandro, Sam Fraiberger, Thore Graepel, Panos Ipeirotis, Michal Kosinski, David Martens, Claudia Perlich, David Stillwell

The Data Science Process is a useful framework for thinking through lots of modeling & managerial decisions about solving problems with AI/Machine Learning/Data Science

For more, see Data Science for Business, Provost & Fawcett, O’Reilly Media, 2013

SLIDE 2

Just a few issues:

  • Misalignment of problem formulation
  • Leakage in features
  • Sampling bias
  • Learning bias (ML favors larger subpopulations)
  • Labeling bias
  • Evaluation bias
In Reality…

SLIDE 3

In this talk I’ll focus on two common problems faced when deploying machine learned models:

  • Lack of transparency into why model-driven systems make the decisions that they do
    – important for a whole bunch of reasons: user acceptance, managerial acceptance, debugging/improving
    – of current interest: are your decisions fair?
  • “Unknown Unknowns”
    – do you know what your model is missing? Especially what it’s missing and “thinks” it’s getting right?

Gabrielle Giffords shooting, Tucson, AZ, Jan 2011

SLIDE 4

Why was Mariko shown this Pottery Barn ad?

SLIDE 5

Why was this decision made?

[Diagram: evidence → data-driven model → decision, with “?” arrows from the Customer, the Manager, and the Data Science Team]

Explanations for whom?

SLIDE 6

The Complex World of Models

(Martens & FP, “Explaining Data-driven Document Classification.” MISQ 2014)

A notion of explanation: The Evidence Counterfactual

  • Models can be viewed as evidence-combining systems
  • We are considering cases where individual pieces of evidence are interpretable
  • Thus, for any specific decision* from any model we can ask:

What is a minimal set of evidence such that if it were not present, the decision* would not have been made?

*The “decision” can be a threshold crossing for a probability estimation, scoring, or regression model

see (Martens & FP MISQ 2014); (Chen, Moakler, Fraiberger, FP, Big Data 2017); (Moeyersoms et al.; Chen et al.; ICML’16 Wkshp on Human Interpretability in ML); (cf. Hume 1748)
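For a linear model, the Evidence Counterfactual can be sketched as a greedy search: drop the strongest pieces of positive evidence until the decision score falls back below the threshold. This is a minimal illustration in that spirit, not a faithful reimplementation of the published algorithm; the feature weights and toy instance below are invented.

```python
# Greedy evidence-counterfactual sketch for a linear scoring model.
import numpy as np

def evidence_counterfactual(weights, bias, x, threshold=0.0):
    """Greedily find a small set of active features whose removal
    pushes the decision score back below the threshold."""
    x = x.copy().astype(float)
    removed = []
    # Consider active features with positive evidence, strongest first.
    order = sorted((i for i in range(len(x)) if x[i] != 0 and weights[i] > 0),
                   key=lambda i: -weights[i] * x[i])
    for i in order:
        if weights @ x + bias <= threshold:
            break
        x[i] = 0.0          # "remove" this piece of evidence
        removed.append(i)
    if weights @ x + bias <= threshold:
        return removed      # small (greedy, not guaranteed minimal) set
    return None             # decision cannot be flipped this way

# Toy example: 4 interpretable features (e.g., visits to 4 websites).
w = np.array([2.0, 1.5, 0.5, -1.0])
x = np.array([1.0, 1.0, 1.0, 1.0])   # all evidence present; score = 3.0
print(evidence_counterfactual(w, 0.0, x))   # [0, 1]
```

Removing features 0 and 1 drops the score from 3.0 to −0.5, so those two websites are the (greedy) counterfactual explanation for the positive decision.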

SLIDE 7

Why was Mariko shown this Pottery Barn ad?

Because she visited:

  • www.diningroomtableshowroom.com
  • www.mazeltovfurniture.com
  • www.realtor.com
  • www.recipezaar.com
  • www.americanidol.com
SLIDE 8

Let’s focus on the developers

Explanations aid the data science process:

  • Help to understand false positives – often revealing problems with the training data
  • Can reveal problems with the model

SLIDE 9

With the increasing use of predictive models built from massive fine-grained behavior data, consumers are increasingly concerned about the inferences drawn about them.

Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802–5805.

SLIDE 10

Effect of removing selected Facebook Likes from consideration by the predictive model

Two guys predicted to be gay:

Model: logistic regression on the top 100 latent dimensions from an SVD of the user/Like matrix.

(Chen, Moakler, Fraiberger, … Big Data 2017) (Chen, et al., ICML Wkshp Interpretability 2016)
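The modeling pipeline just described – logistic regression on the top latent dimensions of an SVD of the user/Like matrix – can be sketched as follows. Everything here is synthetic: the matrix is random, the trait labels are random, and k=5 stands in for the talk’s 100 dimensions.

```python
# Sketch: logistic regression on the top-k SVD dimensions of a
# synthetic binary user-by-Like matrix (all data invented).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_likes, k = 200, 50, 5

M = (rng.random((n_users, n_likes)) < 0.1).astype(float)  # user/Like matrix
U, s, Vt = np.linalg.svd(M, full_matrices=False)
Z = U[:, :k] * s[:k]                 # users projected onto top-k latent dims

y = (rng.random(n_users) < 0.3).astype(float)  # synthetic binary trait

# Plain gradient-descent logistic regression on the latent features.
w, b = np.zeros(k), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
    w -= 0.5 * (Z.T @ (p - y)) / n_users
    b -= 0.5 * np.mean(p - y)

p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
print("predicted positives:", int((p > 0.5).sum()))
```

“Removing a Like” in this setup means zeroing an entry of the user’s row of M, re-projecting through the SVD, and re-scoring – which is exactly the knob the cloaking idea on the next slides turns.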

SLIDE 11

Why was this guy predicted to be smart?

Opportunity for offering users control via a “cloaking device”?

Effect of removing selected Likes from consideration by the predictive model – False Positives

(Chen, Moakler, Fraiberger, … Big Data 2017) (Chen, et al., ICML Wkshp Interpretability 2016)

SLIDE 12

But there’s a twist…

A firm could purport to give users transparency and control… but actually make it cumbersome for users to affect the inferences drawn about them:

(Chen, Moakler, Fraiberger, … Big Data 2017) (Chen, et al., ICML Wkshp Interpretability 2016)

SLIDE 13

So: explanations of individual decisions can help with many issues in the process of building and using machine learned models. But we need more help with one very important problem…

The problem of Unknown Unknowns

  • What is your model missing? What is it missing while really “thinking” that it’s correct?
  • Why would it be missing things?
SLIDE 14

We need to think carefully about the data-generating process(es) and the data preparation processes – especially the process of getting labeled training & testing data.

The problem of Unknown Unknowns

  • What is your model missing? What is it missing while really “thinking” that it’s correct?
  • Why would it be missing things?
    – Sampling bias
    – Learning bias (ML favors larger subpopulations)
    – Labeling bias
    – Especially severe for non-self-revealing problems

(Attenberg, Ipeirotis & Provost JDIQ 2015)

SLIDE 15

Harness Humans to Improve Machine Learning

  • With normal labeling, humans are passively labeling the data that we give them
  • Instead, ask humans to search for and find positive instances of a rare class

Searching instead of labeling has intriguing performance

(Attenberg & FP KDD 2010)
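The intuition behind search-instead-of-label can be shown with a toy simulation (not from the paper): under extreme class imbalance, a fixed labeling budget spent on randomly drawn instances yields almost no positives, while “search” – idealized here as an oracle that retrieves positives directly – fills the budget with them.

```python
# Toy comparison of labeling vs. searching under extreme class imbalance.
import numpy as np

rng = np.random.default_rng(1)
population = (rng.random(100_000) < 0.001)  # ~0.1% positive rate
budget = 100                                # human-effort budget

# Strategy 1: spend the budget labeling randomly drawn instances.
random_sample = rng.choice(population, size=budget, replace=False)
positives_by_labeling = int(random_sample.sum())

# Strategy 2: ask humans to search for positives (idealized oracle
# that always finds one per unit of effort, until they run out).
positives_by_search = min(budget, int(population.sum()))

print(positives_by_labeling, positives_by_search)
```

With a 0.1% base rate, random labeling of 100 instances finds roughly zero positives in expectation, while search fills essentially the whole budget – which is why searching is so attractive for building classifiers for rare classes.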

SLIDE 16

Active learning missing disjunctive subconcepts

(Attenberg & FP KDD 2010)

NIPS 2016

SLIDE 17

Better, but…

  • Classifier seems great: cross-validation tests show excellent performance
  • Alas, the classifier fails on “unknown unknowns”

“Unknown unknowns” → classifier fails with high confidence

(Attenberg, Ipeirotis & Provost JDIQ 2015)
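A small synthetic experiment makes the failure mode concrete: when the training sample only ever contains one kind of positive (sampling/labeling bias), in-sample evaluation – standing in here for a cross-validation estimate – looks excellent, yet an unseen positive subconcept is scored as negative with high confidence. All numbers below are invented for illustration.

```python
# Toy demo: biased training data -> great-looking evaluation, but
# confidently wrong predictions on an unseen positive subconcept.
import numpy as np

rng = np.random.default_rng(2)

# Training sample: negatives near x=0, positives only near x=+2.
X = np.concatenate([rng.normal(0.0, 0.5, 500), rng.normal(2.0, 0.5, 500)])
y = np.concatenate([np.zeros(500), np.ones(500)])

# Fit 1-D logistic regression by gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * X + b)))
    w -= 0.1 * np.mean((p - y) * X)
    b -= 0.1 * np.mean(p - y)

# In-sample accuracy (proxy for a cross-validation estimate) is high...
acc = np.mean(((1 / (1 + np.exp(-(w * X + b)))) > 0.5) == y)

# ...but an unseen positive subconcept near x=-2 gets near-zero
# predicted probability: the model "knows" these are negatives.
unknowns = rng.normal(-2.0, 0.5, 100)
p_unknown = 1 / (1 + np.exp(-(w * unknowns + b)))
print(round(float(acc), 3), round(float(p_unknown.mean()), 3))
```

Because the held-out folds come from the same biased sample, no amount of cross-validation reveals the missing subconcept – the evaluation itself shares the blind spot.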

SLIDE 18

Beat the Machine!

Ask humans to find examples that:

  • the classifier will classify incorrectly
  • another human will classify correctly

Example: Find hate speech pages that the machine will classify as benign

Incentive structure:

  • $1 if you “beat the machine”
  • $0.001 if the machine already knows

(Attenberg, Ipeirotis & Provost JDIQ 2015)
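The incentive rule above boils down to a tiny payout function: the big reward is paid only when the machine’s label disagrees with the human’s (i.e., the machine was beaten). The function name and payout handling below are illustrative, not from the paper’s actual system.

```python
# Sketch of the Beat-the-Machine payout rule (illustrative only).
def btm_payout(machine_label, human_label, beat_reward=1.0, consolation=0.001):
    """Pay $1 when the submitted example beats the machine,
    $0.001 when the machine already classifies it correctly."""
    return beat_reward if machine_label != human_label else consolation

# The machine calls a hate-speech page "benign"; the human annotator
# (treated as ground truth) says "hate speech" -> the machine is beaten.
print(btm_payout("benign", "hate speech"))   # 1.0
print(btm_payout("benign", "benign"))        # 0.001
```

The 1000:1 payout ratio is what pushes workers to hunt for the model’s high-confidence mistakes rather than submit easy examples.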

SLIDE 19

AAAI 2017

(Attenberg, Ipeirotis & Provost JDIQ 2015)

SLIDE 20

Summary

  • We can provide transparency into the reasons why AI systems make the decisions that they do
  • We can create mechanisms to help find the “Unknown Unknowns”
  • As a research area, there’s still a lot to do

Some reading

  • Martens, D. & Provost, F. “Explaining Data-driven Document Classification.” MISQ 2014.
  • Moeyersoms et al. ICML’16 Wkshp on Human Interpretability in ML, 2016.
  • Chen et al. ICML’16 Wkshp on Human Interpretability in ML, 2016.
  • Chen, Fraiberger, Moakler, Provost. Big Data 5(3), 2017.
  • Attenberg, J. & Provost, F. “Why Label When You Can Search? Alternatives to Active Learning for Applying Human Resources to Build Classification Models under Extreme Class Imbalance.” KDD 2010.
  • Attenberg, J., Ipeirotis, P. & Provost, F. “Beat the Machine: Challenging Humans to Find a Predictive Model’s ‘Unknown Unknowns’.” JDIQ 6(1), 2015.