Decision-making Bias in Instance Matching Model Selection



SLIDE 1

Decision-making Bias in Instance Matching Model Selection

Mayank Kejriwal, Daniel P. Miranker. Acknowledgements: US National Science Foundation, Microsoft Research

SLIDE 2

Instance Matching

• 50+ year old Artificial Intelligence problem
• When do two records refer to the same underlying entity?
• “Record linkage: making maximum use of the discriminating power of identifying information.” Newcombe and Kennedy (1962)
• Numerous surveys by Winkler (2006), Rahm et al. (2010), etc.

SLIDE 3

Machine learning


Classifier example: feedforward multilayer perceptron (MLP)

“Machine Learning: an artificial intelligence approach.” Michalski, Carbonell and Mitchell (2013)

SLIDE 4

Supervised machine learning


• Requires a (manually) labeled set for both training and validation
• Typically acquired through sampling a ground-truth
• Training: classifier parameters (e.g. edge weights of MLP)
• Validation: classifier hyperparameters (e.g. number of layers, nodes, learning rate...)
• Also requires model selection decisions:
  • Which training algorithm?
  • What sampling technique?
  • How to split the data for training/validation?
• The answers are not obvious

“Machine Learning: an artificial intelligence approach.” Michalski, Carbonell and Mitchell (2013)
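The training/validation workflow above can be sketched concretely. A minimal illustration using scikit-learn's MLP classifier, where the synthetic dataset, the 50/50 split, and the small hyperparameter grid are all assumptions made for demonstration, not the paper's actual setup:

```python
# Sketch of supervised model selection: training fits parameters,
# validation picks hyperparameters. Dataset and grid are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# One model selection decision: how to split the labeled data.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.5, random_state=0)  # a two-fold (50/50) split

best_score, best_config = -1.0, None
# Validation tunes hyperparameters (layer sizes, learning rate, ...).
for hidden in [(8,), (16,)]:
    for lr in [0.01, 0.1]:
        clf = MLPClassifier(hidden_layer_sizes=hidden,
                            learning_rate_init=lr,
                            max_iter=300, random_state=0)
        clf.fit(X_train, y_train)        # training fits edge weights
        score = clf.score(X_val, y_val)  # validation scores the config
        if score > best_score:
            best_score, best_config = score, (hidden, lr)

print(best_config, round(best_score, 3))
```

Changing `test_size` here reproduces exactly the kind of split decision the next slides examine.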

SLIDE 5

Model Selection Exercise

• What percentage of labeled data should I use for training, and what percentage for validation?

“Machine Learning: an artificial intelligence approach.” Michalski, Carbonell and Mitchell (2013)

SLIDE 6

What do other people do?

• The most common approach in the literature is a ten-fold split (and, less often, two-fold)
• What if I care more about one performance metric (say recall, versus precision) within reasonable constraints?
• What if I have sampled and labeled a lot of data (say 90% of the estimated ground-truth)?
• Should the answers to these questions (and others) bias my decision?

“Semi-supervised instance matching using boosted classifiers.” Kejriwal and Miranker (2015)

SLIDE 7

Let’s do an experiment

Split      Labeled data (% of ground-truth)   Precision   Recall
Ten-fold   10%                                54.13%      25.77%
Ten-fold   50%                                61.51%      28.77%
Ten-fold   90%                                73.27%      27.69%
Two-fold   10%                                45.47%      35.64%
Two-fold   50%                                55.50%      34.92%
Two-fold   90%                                66.67%      36.92%

Results for the Amazon-GoogleProducts benchmark, using MLP. Consistent results across two other benchmarks, and several experimental controls...
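The two metrics in the table have standard definitions worth pinning down; the counts below are hypothetical, chosen only to produce a high-precision, low-recall profile like the ten-fold rows:

```python
# Precision and recall from confusion-matrix counts.
def precision(tp, fp):
    # Of the pairs predicted as matches, how many really match?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of the true matches, how many did the matcher find?
    return tp / (tp + fn)

tp, fp, fn = 70, 30, 180  # hypothetical matcher output, not benchmark data
print(f"precision={precision(tp, fp):.2%}, recall={recall(tp, fn):.2%}")
# precision=70.00%, recall=28.00%
```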

SLIDE 8

Concluding the exercise

• What if I care more about recall than precision?
  • I should choose a two-fold split (unlike what the literature would suggest)
• What if I have sampled and labeled a lot of data (say 90% of the estimated ground-truth)?
  • An irrelevant concern, once the metric is specified

Takeaway: Some model selection decisions can bias other model selection decisions, not always in an obvious way

SLIDE 9

How do we make informed model selection decisions?


SLIDE 10

Decision-making and Model Selection

• Cognitive psychology has shown (empirically) that human beings are neither logical nor rational
• Wason Selection Task
• Prospect Theory (awarded the 2002 Nobel Prize in Economics)

“Reasoning about a rule.” Wason (1968)
“The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task.” Cosmides (1989)
“Prospect theory: an analysis of decision under risk.” Kahneman and Tversky (1979)

SLIDE 11

One systematic method is to start by...

• Visualizing decision-making biases through capturing influences between decisions

[Diagram of decision nodes: Labeling budget, Computational resources, Training/Validation split, Performance Metric]

SLIDE 12

Concise approach: bipartite graphs

“Bipartite graphs and their applications.” Asratian et al. (1998)

[Diagram: a bipartite graph linking decision nodes (Labeling budget, Computational resources, Training/Validation split, Performance Metric) to a node of influence]

The interpretation of the nodes and edges is abstract (we don’t impose strict requirements)
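A sketch of how such a graph might be represented in code; the node names come from the slide, but the edge set and the `is_bipartite` helper are illustrative assumptions:

```python
# A bipartite "decision influence" graph: decision nodes on one side,
# nodes of influence on the other. Edge set is illustrative only.
decisions = {"labeling budget", "computational resources",
             "training/validation split", "performance metric"}
influences = {"split decision"}  # hypothetical node of influence

# Edges only run between the two sides (the bipartite property).
edges = {
    ("labeling budget", "split decision"),
    ("performance metric", "split decision"),
}

def is_bipartite(edges, left, right):
    """Check that every edge connects a left node to a right node."""
    return all(u in left and v in right for u, v in edges)

print(is_bipartite(edges, decisions, influences))  # True
```

Adding or removing tuples from `edges` corresponds to the hypothesis-forming step on the next slide.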

SLIDE 13

Hypothesizing about biases

• The art in model selection: are there edges we should consider removing/adding?
• In the paper, we form at least four hypotheses that directly translate to recommendations

[Diagram: the bipartite graph of decisions (Labeling budget, Computational resources, Training/Validation split, Performance Metric) with candidate edges]

SLIDE 14


SLIDE 15

Experimental platform

• Collected over 25 GB of data on the Microsoft Azure ML platform
• Used three publicly available benchmarks

SLIDE 16

Efficiency Recommendation 1

• Validation is usually much faster than training, especially for expressive classifiers
• Run-time reductions of almost 70%, with proportionally less loss in effectiveness
• Recommendation: consider favoring more validation over training if speed is an important concern

SLIDE 17

Efficiency Recommendation 2

• Validation is usually much faster than training, especially for expressive classifiers
• Grid search is no more effective than random search for default hyperparameter values
  • Mean difference less than 0.99% and not statistically significant
• Recommendation: favor random search in your hyperparameter optimization, as it is much faster (over 90% run-time decrease)
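The contrast between the two search strategies can be sketched as follows; the hyperparameter names and the toy `validation_score` objective are illustrative assumptions, standing in for training and validating a real classifier:

```python
# Grid search enumerates the full grid; random search samples a fixed
# budget of configurations from the same space.
import itertools
import random

layers = [1, 2, 3, 4]
learning_rates = [0.001, 0.01, 0.1, 1.0]

def validation_score(n_layers, lr):
    # Toy stand-in for training + validating a real classifier.
    return -abs(n_layers - 2) - abs(lr - 0.01)

# Grid search: evaluates all 16 combinations.
grid_best = max(itertools.product(layers, learning_rates),
                key=lambda cfg: validation_score(*cfg))

# Random search: evaluates a budget of only 6 sampled configurations.
rng = random.Random(0)
samples = [(rng.choice(layers), rng.choice(learning_rates))
           for _ in range(6)]
rand_best = max(samples, key=lambda cfg: validation_score(*cfg))

print(grid_best, rand_best)
```

The run-time saving in the recommendation comes from the smaller budget: 6 evaluations versus 16 here, and far fewer on realistic grids.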

SLIDE 18

Concluding notes

• Hard problems (e.g. instance matching) require an ingenious combination of heuristics, biases and models
• Understanding decision-making biases can help us do better model selection
  • It can also help to identify experimental confounds!
• There are many proposals to visualize decision-making, but not decision-making bias
  • We proposed a bipartite graph as a good candidate
• The visualization is not just a pedantic exercise
  • About 25 GB of data shows that it can also be useful
• Many future directions!

kejriwalresearch.azurewebsites.net
https://sites.google.com/a/utexas.edu/mayank-kejriwal/projects/semantics-and-model-selection

SLIDE 19

What biases go into your model selection process?
