Make sure we can query black box algorithms


SLIDE 1
SLIDE 2

Auditing Black Box Models

Make sure we can query black box algorithms

http://www.bloomberg.com/graphics/2016-amazon-same-day/

SLIDE 3

Training vs Testing

✖ Training data: no access to training data or algorithm
✔ Test data

SLIDE 4

How can we understand a model

• If we use a “simple” model we can interpret it directly.
  • Decision trees
  • Linear classifiers
  • SLIM (Supersparse Linear Integer Models)

SLIDE 5

Simple models are hard

Paul Raccuglia, Katherine C. Elbert, Philip D. F. Adler, Casey Falk, Malia B. Wenny, Aurelio Mollo, Matthias Zeller, Sorelle A. Friedler, Joshua Schrier, and Alexander J. Norquist. Machine-learning-assisted materials discovery using failed experiments. Nature, 533:73-76, May 5, 2016. http://dx.doi.org/10.1038/nature17439

SLIDE 6

Research Question

• Given a black box function Y = f(x_1, ..., x_n)
• Determine the influence each variable has on the outcome
• How do we quantify influence?
• How do we model it (random perturbations?)
• How do we handle indirect and joint influence?

SLIDE 7

Direct vs Indirect Influence Auditing

• Does a feature (or group of features) directly influence the outcome?
  • E.g., a feature used in a decision tree
• Intervention:
  • Replace the feature with random noise and see how much model accuracy degrades.
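The intervention above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_model`, the synthetic data, and the choice to produce "noise" by shuffling the feature column are all assumptions made for the example.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the black-box model returns the given label."""
    return sum(model(row) == y for row, y in zip(rows, labels)) / len(rows)

def direct_influence(model, rows, labels, feature, seed=0):
    """Accuracy drop when one feature column is replaced by noise.

    Assumption: noise is produced by shuffling the column, which keeps its
    marginal distribution but breaks its link to every other column.
    """
    rng = random.Random(seed)
    noise = [row[feature] for row in rows]
    rng.shuffle(noise)
    perturbed = [row[:feature] + [v] + row[feature + 1:]
                 for row, v in zip(rows, noise)]
    return accuracy(model, rows, labels) - accuracy(model, perturbed, labels)

# Hypothetical black box: it only ever looks at feature 0.
def toy_model(row):
    return int(row[0] > 0.5)

data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(1000)]
labels = [toy_model(r) for r in rows]

influence_used = direct_influence(toy_model, rows, labels, feature=0)
influence_unused = direct_influence(toy_model, rows, labels, feature=1)
```

In this toy setup the feature the model reads shows a large accuracy drop, while the feature it ignores shows none.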

SLIDE 8

Direct vs Indirect Influence Auditing

• Does a feature (or group of features) indirectly influence the outcome?
  • E.g., zipcode as a proxy for race?
• Intervention:
  • Direct perturbation no longer works, because more than one variable carries the desired signal.
SLIDE 9

Information content and indirect influence

The information content of a feature can be estimated by trying to predict it from the remaining features.

If the removed feature cannot be predicted from the remaining features, then the information from that feature cannot influence the outcome of the model.
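As a rough illustration of estimating information content by prediction, the sketch below guesses one column from another by majority vote and compares the result to a baseline. The column names, the data, and the majority-vote predictor are hypothetical; accuracy is measured in-sample for brevity, and a real audit would use a stronger predictor and held-out data.

```python
from collections import Counter, defaultdict

def majority_baseline(rows, target):
    """Accuracy of always guessing the most common value of column `target`."""
    counts = Counter(row[target] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)

def predictability(rows, target, predictor):
    """In-sample accuracy of guessing `target` from `predictor`
    by majority vote within each predictor value."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[row[predictor]][row[target]] += 1
    correct = sum(c.most_common(1)[0][1] for c in votes.values())
    return correct / len(rows)

# Hypothetical data: columns are (group, zipcode, shoe).
GROUP, ZIPCODE, SHOE = 0, 1, 2
rows = (
    [("a", "z1", "s1")] * 20 + [("a", "z1", "s2")] * 20
    + [("a", "z2", "s1")] * 5 + [("a", "z2", "s2")] * 5
    + [("b", "z1", "s1")] * 5 + [("b", "z1", "s2")] * 5
    + [("b", "z2", "s1")] * 20 + [("b", "z2", "s2")] * 20
)

base = majority_baseline(rows, GROUP)            # guessing blindly
from_zip = predictability(rows, GROUP, ZIPCODE)  # zipcode leaks group
from_shoe = predictability(rows, GROUP, SHOE)    # shoe carries nothing
```

Here `from_zip` exceeds `base`, so zipcode carries information about group, while `from_shoe` does not improve on the baseline.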

SLIDE 10

Information content and indirect influence

Given variables X, Y that are correlated, find Y’ independent of X such that Y’ is as similar to Y as possible.

SLIDE 11

Gradient Feature Audit

For each feature:

  1. Remove indirect influence of the feature on other features in the data
  2. Run the model on the modified test data
  3. Feature influence = original accuracy - resulting accuracy

Example: auditing the Amazon model
Feature to remove: race
Eliminate (obscure) the influence of race on zipcode
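The three steps can be sketched as a small audit function. Everything here is illustrative: `toy_model` and the data are made up, and `equalize_group_means` is a deliberately crude stand-in for the obscuring step (it only matches group means, not whole distributions).

```python
import random
from statistics import mean

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def equalize_group_means(rows, removed, other):
    """Crude stand-in for obscuring: shift column `other` so that every
    group defined by column `removed` has the same mean."""
    overall = mean(r[other] for r in rows)
    group_means = {
        g: mean(r[other] for r in rows if r[removed] == g)
        for g in {r[removed] for r in rows}
    }
    modified = []
    for r in rows:
        new_row = list(r)
        new_row[other] = r[other] - group_means[r[removed]] + overall
        modified.append(new_row)
    return modified

def audit_feature(model, rows, labels, removed):
    modified = rows
    for other in range(len(rows[0])):          # step 1: obscure other features
        if other != removed:
            modified = equalize_group_means(modified, removed, other)
    original = accuracy(model, rows, labels)
    resulting = accuracy(model, modified, labels)  # step 2: re-run the model
    return original - resulting                    # step 3: influence

# Hypothetical data: the score column is a strong proxy for group.
GROUP, SCORE = 0, 1
rng = random.Random(0)
rows = [[g, 10 * g + rng.gauss(0, 1)] for g in (0, 1) for _ in range(500)]

def toy_model(row):
    return int(row[SCORE] > 5)   # never reads GROUP directly

labels = [toy_model(r) for r in rows]
indirect_influence = audit_feature(toy_model, rows, labels, removed=GROUP)
```

Although `toy_model` never reads the group column, obscuring the group's trace in the score column costs it a large share of its accuracy, which is the indirect influence the audit is meant to surface.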

SLIDE 12

Gradient Feature Audit

All our measures of influence are relative to a fixed model.

SLIDE 13

How do we remove indirect influence?

[Figure: density plot of hypothetical SAT scores; x-axis 200 to 800, y-axis 0.000 to 0.008]

Merge the conditional distributions of the obscured feature, one per value of the eliminated feature, into a single distribution.
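One possible way to merge the conditional distributions, for a numerical obscured feature and a categorical eliminated feature, is to map each value to the median of the values that hold the same relative rank in every group. The sketch below assumes that approach; the function name, the group labels, and the synthetic scores are hypothetical.

```python
import random
from bisect import bisect_left
from statistics import mean, median

def merge_conditionals(values, groups):
    """Replace each value by the median, across groups, of the values that
    sit at the same relative rank within their own group."""
    sorted_by_group = {}
    for v, g in zip(values, groups):
        sorted_by_group.setdefault(g, []).append(v)
    for vals in sorted_by_group.values():
        vals.sort()

    repaired = []
    for v, g in zip(values, groups):
        own = sorted_by_group[g]
        rank = bisect_left(own, v)  # position of v within its own group
        peers = [s[rank * len(s) // len(own)] for s in sorted_by_group.values()]
        repaired.append(median(peers))
    return repaired

# Hypothetical SAT-like scores for two groups with different means.
rng = random.Random(0)
groups = ["a"] * 500 + ["b"] * 500
scores = ([rng.gauss(450, 80) for _ in range(500)]
          + [rng.gauss(550, 80) for _ in range(500)])
repaired = merge_conditionals(scores, groups)

def group_mean(values, g):
    return mean(v for v, gg in zip(values, groups) if gg == g)

gap_before = group_mean(scores, "b") - group_mean(scores, "a")
gap_after = group_mean(repaired, "b") - group_mean(repaired, "a")
```

After the merge, the two groups share the same distribution of scores, so the gap between group means disappears while each individual keeps their rank within their own group.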

SLIDE 14

How do we remove indirect influence?

This ensures that an F-test will fail to tell the merged distributions apart (provably*).

SLIDE 15

How do we remove indirect influence?

Different approaches are needed depending on whether the obscured and eliminated variables are categorical or numerical.

SLIDE 16

Representation matters!

• Should race be categorical or numerical?
• Should it be “white/non-white” or multi-valued?
• These issues matter! For more, see
  • https://arxiv.org/abs/1802.04422
  • https://github.com/algofairness/fairness-comparison