GENERALIZED INVERSE CLASSIFICATION, SDM ’17. Michael T. Lash¹, Qihang Lin², W. Nick Street², Jennifer G. Robinson³, and Jeffrey Ohlmann². ¹Department of Computer Science, ²Department of Management Sciences, ³Department of Epidemiology


slide-1
SLIDE 1

GENERALIZED INVERSE CLASSIFICATION

SDM ’17

Michael T. Lash¹, Qihang Lin², W. Nick Street², Jennifer G. Robinson³, and Jeffrey Ohlmann²

¹Department of Computer Science, ²Department of Management Sciences, ³Department of Epidemiology

www.michaeltlash.com

slide-2
SLIDE 2

What is inverse classification?

The process of making meaningful perturbations to a test instance such that the probability of a desirable outcome is maximized.

1

slide-9
SLIDE 9

What is inverse classification?

The process of making meaningful perturbations to a test instance such that the probability of a desirable outcome is maximized.

What about the meaningful part of the definition?

1

slide-10
SLIDE 10

Meaningful Perturbations

Well... let's visit some past work!

Michael T. Lash, Qihang Lin, W. Nick Street, and Jennifer G. Robinson, “A budget-constrained inverse classification framework for smooth classifiers”, arXiv preprint arXiv:1605.09068, submitted.

2

slide-11
SLIDE 11

Meaningful Perturbations

Begin with a basic formulation.

2

slide-14
SLIDE 14

Meaningful Perturbations

Some regressor

Segment features.

2

slide-15
SLIDE 15

Meaningful Perturbations

Some regressor

Estimate indirectly changeable.

2

slide-16
SLIDE 16

Meaningful Perturbations

Some regressor

Update objective function. Add constraints.

2

slide-17
SLIDE 17

Meaningful Perturbations

Some regressor

Cost-change function

2

slide-18
SLIDE 18

Meaningful Perturbations

Some regressor

Budget

2

slide-19
SLIDE 19

Meaningful Perturbations

Some regressor

Bounds
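The labels built up over the preceding slides (feature segments, a regressor for the indirectly changeable features, a cost-change function, a budget, and bounds) suggest a formulation along the following lines. This is a sketch under assumed notation, not the slide's exact equation: the symbols x_U (unchangeable features), x_D (directly changeable features), \bar{x}_D (their original values), c, B, l_j, and u_j are all assumptions, and f is taken here to return the probability of the desirable outcome.

```latex
\begin{aligned}
\max_{x_D}\quad & f\bigl(x_U,\; H(x_U, x_D),\; x_D\bigr) \\
\text{s.t.}\quad & c\,(x_D - \bar{x}_D) \le B, \\
& l_j \le x_{D,j} \le u_j \quad \text{for each directly changeable feature } j .
\end{aligned}
```

Here H plays the role of "some regressor": it estimates the indirectly changeable features from the others, so a change to x_D also propagates to them before f is evaluated.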

2

slide-22
SLIDE 22

Main Contributions

  • 1. Relax assumptions about f(·).

Some regressor

Bounds

Generalized inverse classification

3

slide-23
SLIDE 23

Main Contributions

  • 1. Relax assumptions about f(·).
  • 2. Quadratic cost-change function.

Some regressor

4

slide-26
SLIDE 26

Main Contributions

  • 1. Relax assumptions about f(·).
  • 2. Quadratic cost-change function.
  • 3. Three real-valued heuristic optimization methods and two sensitivity analysis-based optimization methods.

* Projection operator to maintain feasibility.
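To make the quadratic cost-change function and the feasibility step concrete, here is a minimal Python sketch. It assumes a weighted sum-of-squares cost and a simple "clip, then shrink toward the original point" repair. Both are illustrative choices, not necessarily the projection operator used in the paper, and the names `quadratic_cost` and `make_feasible` are hypothetical.

```python
def quadratic_cost(x, x0, weights):
    """Weighted sum of squared changes (an assumed form of a quadratic cost)."""
    return sum(w * (xi - x0i) ** 2 for xi, x0i, w in zip(x, x0, weights))


def make_feasible(x, x0, weights, budget, lower, upper):
    """Map a candidate x to a point that satisfies the bounds and the budget.

    Assumes x0 lies within [lower, upper], weights are positive,
    and budget is non-negative.
    """
    # Step 1: clip every coordinate into its box bounds.
    clipped = [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]
    cost = quadratic_cost(clipped, x0, weights)
    if cost <= budget:
        return clipped
    # Step 2: shrink the change toward x0. Scaling the change by t scales a
    # quadratic cost by t**2, so t = sqrt(budget / cost) lands on the budget.
    # The box is convex and contains x0, so the shrunken point stays in bounds.
    t = (budget / cost) ** 0.5
    return [x0i + t * (ci - x0i) for ci, x0i in zip(clipped, x0)]
```

This repair is cheap and always returns a feasible point, but it is not the Euclidean projection onto the feasible set; an exact projection would need a small optimization of its own.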

5

slide-27
SLIDE 27

Optimization Methodology

Heuristic

  • Hill Climbing + Local Search (HC+LS)
  • Genetic Algorithm (GA)
  • Genetic Algorithm + Local Search (GA+LS)

Sensitivity Analysis

  • Local Variable Perturbation – First Improvement (LVP-FI)
  • Local Variable Perturbation – Best Improvement (LVP-BI)
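As a rough illustration of how a black-box f can be optimized under a budget and bounds, the following Python sketch runs a generic first-improvement coordinate search. It is not the authors' HC+LS or LVP-FI procedure; the function name `local_search`, the fixed step size, the sum-of-squares cost, and the toy objective are all assumptions made for the example.

```python
import math


def local_search(f, x0, weights, budget, lower, upper, step=0.1, max_iter=1000):
    """Generic first-improvement coordinate search that maximizes f.

    Only candidates within the bounds and the cost budget are accepted,
    so the returned point is feasible whenever x0 is.
    """
    def feasible(x):
        in_box = all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))
        cost = sum(w * (xi - x0i) ** 2 for xi, x0i, w in zip(x, x0, weights))
        return in_box and cost <= budget

    best, best_val = list(x0), f(x0)
    for _ in range(max_iter):
        improved = False
        for j in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[j] += delta
                if not feasible(cand):
                    continue
                val = f(cand)
                if val > best_val:
                    best, best_val = cand, val
                    improved = True
                    break  # first improvement: accept it immediately
            if improved:
                break
        if not improved:
            break  # no feasible neighbor improves f at this step size
    return best, best_val


# Toy usage: a logistic "probability" of the desirable outcome.
def toy_f(x):
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - x[1])))


x_start = [0.0, 0.0]
x_best, p_best = local_search(toy_f, x_start, weights=[1.0, 1.0], budget=0.25,
                              lower=[-1.0, -1.0], upper=[1.0, 1.0])
```

Because the search only queries f for values, it makes no smoothness or differentiability assumptions, which is the property that lets this family of methods work with classifiers such as random forests.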

6

slide-28
SLIDE 28

Experiment Decisions and Data

f(·): Random forest

H(·): Kernel regression

Dataset 1: Student Performance (UCI Machine Learning Repository)

Dataset 2: ARIC

One f for optimization, a separate f for held-out evaluation.
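For readers unfamiliar with kernel regression, the sketch below shows a Nadaraya-Watson estimator with a Gaussian kernel in plain Python. It is one common form of kernel regression and only an assumption about what H(·) might look like; the slides do not specify the kernel or bandwidth, and the name `kernel_regression` is hypothetical.

```python
import math


def kernel_regression(x_query, X_train, y_train, bandwidth=1.0):
    """Nadaraya-Watson estimate: a kernel-weighted average of training targets."""
    weights = []
    for x in X_train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x_query, x))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernel weights underflowed to zero; fall back to the plain mean.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total
```

In the role of H(·), `x_query` would hold an instance's other features and `y_train` the observed values of one indirectly changeable feature, so that each candidate perturbation yields an updated estimate of that feature.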

7

slide-29
SLIDE 29

Results: Student Performance

8

slide-32
SLIDE 32

Results: ARIC

Need sparsity constraints

9

slide-33
SLIDE 33

Conclusions

Generalized Inverse Classification: can use virtually any learned f (as shown by experiments with a Random Forest classifier).

Our proposed methods were successful, although the degree of success varied by dataset.

10

slide-38
SLIDE 38

Causality and Inverse Classification

Yes! ....

What we’re doing:

  • 1. Imposing our own causal structure (DAG).
  • 2. We’re not taking the usual counterfactual approach.

Future work will focus on incorporating causal methodology...

11