DeepIV: A Flexible Approach for Counterfactual Prediction

DeepIV: A Flexible Approach for Counterfactual Prediction. Jason Hartford and Kevin Leyton-Brown (University of British Columbia); Greg Lewis* and Matt Taddy (Microsoft Research).


slide-1
SLIDE 1

DeepIV: A Flexible Approach for Counterfactual Prediction

Jason Hartford and Kevin Leyton-Brown, University of British Columbia

Greg Lewis* and Matt Taddy†, Microsoft Research &

*NBER / † University of Chicago

slide-2
SLIDE 2

SkyHighAir

Jason

I need a model that predicts the effect of price on ticket sales

slide-3
SLIDE 3

Prediction with confounding effects

Jason

We can raise prices and get more sales!

slide-4
SLIDE 4

Prediction with confounding effects

Price ($p$) → Sales ($y$)

$y = g(p)$

slide-5
SLIDE 5

Prediction with confounding effects

Price ($p$) → Sales ($y$)

$y = g(p, e)$

slide-6
SLIDE 6

Prediction with confounding effects

Price ($p$) → Sales ($y$)

$y = g(p, e)$, $p = f(e)$

Automated pricing engine increases prices as the plane fills
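
The confounding story on this slide can be reproduced in a few lines. The sketch below is illustrative only: the function names, the linear forms chosen for the sales and price equations, and every coefficient are assumptions, not taken from the talk. A latent demand shock raises both price and sales, so a naive regression of sales on price finds a positive slope even though the true price effect is negative.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def simulate_confounded(n=20000, true_effect=-1.0, seed=0):
    """Hypothetical world: latent demand e drives both price and sales."""
    rng = random.Random(seed)
    prices, sales = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)                   # latent demand, unobserved
        p = 10.0 + 2.0 * e + rng.gauss(0.0, 0.5)  # price rises with demand
        y = 50.0 + true_effect * p + 5.0 * e + rng.gauss(0.0, 1.0)  # sales
        prices.append(p)
        sales.append(y)
    return prices, sales


prices, sales = simulate_confounded()
naive_slope = ols_slope(prices, sales)
print(f"true effect: -1.0, naive regression slope: {naive_slope:.2f}")
```

In this toy world the naive slope has the wrong sign, which is the "raise prices and get more sales" conclusion the slides warn against.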

slide-7
SLIDE 7

The observational distribution

Sales ($y$), Holidays ($x$), Price ($p$)

$y = g(p, x)$, $p = f(x)$

$x$: “features / observed confounders”; $p$: “policy / treatment”; $y$: “response”

slide-8
SLIDE 8

The interventional distribution

Sales ($y$), Holidays ($x$), Price ($p$)

Set $p = \hat{p}$: $P(y \mid \mathrm{do}(\hat{p}), x)$

$x$: “features / observed confounders”; $p$: “policy / treatment”; $y$: “response”

slide-9
SLIDE 9

Identification of causal effects

Sales ($y$), Holidays ($x$), Price ($p$)

$y = g(p, x)$, $p = f(x)$

$x$: “features / observed confounders”; $p$: “policy / treatment”; $y$: “response”

If $x$, $p$ and $y$ are observed, $P(y \mid \mathrm{do}(p), x)$ is identified.

See e.g. [Athey et al. 2016], [Shalit et al. 2017]
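
When the confounder is observed, conditioning on it is enough. The sketch below illustrates this with a binary holiday indicator; the data-generating process, the added pricing noise (needed so that price varies within each holiday group), and all names are assumptions made for the example. The pooled regression is biased, while the slope within each holiday group is close to the true effect.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def simulate_observed_confounder(n=20000, true_effect=-1.0, seed=0):
    """Hypothetical world: holidays raise both price and demand."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randrange(2)                      # 1 on holidays, 0 otherwise
        p = 10.0 + 5.0 * x + rng.gauss(0.0, 1.0)  # price depends on x (plus assumed noise)
        y = 50.0 + true_effect * p + 20.0 * x + rng.gauss(0.0, 1.0)  # sales
        rows.append((x, p, y))
    return rows


rows = simulate_observed_confounder()
pooled = ols_slope([p for _, p, _ in rows], [y for _, _, y in rows])
within = {
    x: ols_slope(
        [p for xi, p, _ in rows if xi == x],
        [y for xi, _, y in rows if xi == x],
    )
    for x in (0, 1)
}
print(f"pooled slope: {pooled:.2f}, within-group slopes: {within}")
```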

slide-10
SLIDE 10

Identification of causal effects

Sales ($y$), Holidays ($x$), Price ($p$), Conference ($e$)

$y = g(p, x, e)$, $p = f(x, e)$

$x$: “features / observed confounders”; $p$: “policy / treatment”; $y$: “response”; $e$: “latent / unobserved confounders”

Not identified without further assumptions

slide-11
SLIDE 11

Identification of causal effects

Sales ($y$), Holidays ($x$), Price ($p$), Fuel Cost ($z$), Conference ($e$)

$y = g(p, x) + e$ (additive latent effects), $p = f(x, z, e)$

$z$: “instrument”, a variable that only affects the response indirectly via its effect on price

$x$: “features / observed confounders”; $p$: “policy / treatment”; $y$: “response”; $e$: “latent / unobserved confounders”
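
To see why an instrument helps, consider the simplest linear case. The sketch below uses the classic ratio-of-covariances instrumental variable estimator, not the DeepIV method from the talk; the simulated world, its coefficients, and the function names are assumptions for illustration. Fuel cost moves price but is independent of the latent conference effect, so the ratio cov(z, y) / cov(z, p) recovers the price effect that the naive regression gets wrong.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def simulate_with_instrument(n=50000, true_effect=-1.0, seed=1):
    """Hypothetical world with a latent confounder e and an instrument z."""
    rng = random.Random(seed)
    z, p, y = [], [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)    # latent confounder (e.g. a conference)
        zi = rng.gauss(0.0, 1.0)   # instrument (e.g. fuel cost)
        pi = 10.0 + 1.5 * zi + 2.0 * e + rng.gauss(0.0, 0.5)  # price
        yi = 50.0 + true_effect * pi + 5.0 * e + rng.gauss(0.0, 1.0)  # sales
        z.append(zi)
        p.append(pi)
        y.append(yi)
    return z, p, y


z, p, y = simulate_with_instrument()
naive = cov(p, y) / cov(p, p)   # ordinary regression slope, confounded by e
iv = cov(z, y) / cov(z, p)      # linear IV estimate using the instrument
print(f"true effect: -1.0, naive: {naive:.2f}, IV: {iv:.2f}")
```

The linear estimator only works when the price effect is a single constant; the approach in the talk replaces both stages with flexible models.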

slide-12
SLIDE 12

Simulate a world without latent effects on price

[Diagram: Price, Sales, Holidays, Fuel Cost, Conference; label “Estimated”]

slide-13
SLIDE 13

Simulate a world without latent effects on price

[Diagram: Price, Sales, Holidays, Fuel Cost, Conference; label “Estimated”]

slide-14
SLIDE 14

The learning problem

These assumptions imply the following identity [1]:

$$E[y \mid x, z] = E[g(p, x) \mid x, z] = \int g(p, x)\, dF(p \mid x, z)$$

So we can recover $g(p, x)$ by solving the implied inverse problem:

$$\min_{g \in G} \sum_{t=1}^{T} \left( y_t - \int g(p, x_t)\, dF(p \mid x_t, z_t) \right)^2$$

[1] This holds if $E[e \mid x] = 0$. In general we recover $g(p, x)$ up to a constant with respect to $p$; see the paper for details.
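
The identity follows in two short steps from the additive model. The derivation below is a sketch, writing y for sales, p for price, x for the observed features, z for the instrument and e for the latent effect, with y = g(p, x) + e. The step E[e | x, z] = E[e | x] (the instrument carries no information about e once x is known) is stated here as an assumption consistent with the footnote; the paper gives the precise conditions.

```latex
\begin{aligned}
E[y \mid x, z] &= E[g(p, x) + e \mid x, z] \\
               &= E[g(p, x) \mid x, z] + E[e \mid x, z] \\
               &= \int g(p, x)\, dF(p \mid x, z) + E[e \mid x]
\end{aligned}
```

When E[e | x] = 0 the last term vanishes. Otherwise it is an offset that does not depend on p, which is why g is recovered only up to a constant with respect to p.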
slide-15
SLIDE 15

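A two-stage solution

Stage 1: fit $\hat{F}_\phi(p \mid x, z)$ using the model of your choice. We use mixture density networks [Bishop 94].

Stage 2: train network $\hat{g}_\theta$ using stochastic gradient descent with Monte Carlo integration.

$$\min_{g \in G} \sum_{t=1}^{T} \left( y_t - \int g(p, x_t)\, dF(p \mid x_t, z_t) \right)^2$$

[Diagram: at each SGD iteration, sample $\dot{p}$ from $\hat{F}_\phi(p \mid x, z)$ and feed $(\dot{p}, x)$ into $\hat{g}_\theta(p, x)$ to predict $y$]

$$\nabla_\theta \mathcal{L}_t = -2 \left( y_t - \frac{1}{|\dot{p}_1|} \sum_{\dot{p}_1 \sim \hat{F}_\phi(p \mid x_t, z_t)} g_\theta(\dot{p}_1, x_t) \right) \times \frac{1}{|\dot{p}_2|} \sum_{\dot{p}_2 \sim \hat{F}_\phi(p \mid x_t, z_t)} \nabla_\theta g_\theta(\dot{p}_2, x_t)$$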
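
The two stages can be sketched end to end on a toy problem. This is a minimal illustration under strong assumptions, not the authors' implementation: stage 1 is a linear-Gaussian model of price given the instrument (standing in for the mixture density network), the response model is a line a + b * p (standing in for the deep network), there are no observed features, and every name, coefficient and hyperparameter is made up for the example. The stage-2 update multiplies a residual computed from one price sample by a gradient computed from a second, independent sample.

```python
import random


def fit_stage1(z, p):
    """Stage 1 stand-in: p | z ~ Normal(alpha + beta * z, sigma)."""
    n = len(z)
    mz, mp = sum(z) / n, sum(p) / n
    beta = (sum((a - mz) * (b - mp) for a, b in zip(z, p))
            / sum((a - mz) ** 2 for a in z))
    alpha = mp - beta * mz
    var = sum((b - alpha - beta * a) ** 2 for a, b in zip(z, p)) / n
    return alpha, beta, var ** 0.5


def train_stage2(z, y, stage1, rng, steps=3000, batch=50, lr=0.02,
                 share_samples=False):
    """Stage 2: SGD on g(p) = a + b * p with Monte Carlo price samples."""
    alpha, beta, sigma = stage1
    a, b = 0.0, 0.0
    sum_a, sum_b, kept = 0.0, 0.0, 0
    for step in range(steps):
        grad_a = grad_b = 0.0
        for _ in range(batch):
            t = rng.randrange(len(y))
            mu = alpha + beta * z[t]
            p1 = rng.gauss(mu, sigma)  # sample used for the residual
            # Independent second sample for the gradient term (unless disabled).
            p2 = p1 if share_samples else rng.gauss(mu, sigma)
            resid = y[t] - (a + b * p1)
            grad_a += -2.0 * resid       # dg/da = 1
            grad_b += -2.0 * resid * p2  # dg/db = p, evaluated at p2
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
        if step >= 2 * steps // 3:       # average the last third of iterates
            sum_a, sum_b, kept = sum_a + a, sum_b + b, kept + 1
    return sum_a / kept, sum_b / kept


# Hypothetical data: true price effect is -1, e is a latent confounder.
rng = random.Random(0)
z, p, y = [], [], []
for _ in range(20000):
    e = rng.gauss(0.0, 1.0)
    zi = rng.gauss(0.0, 1.0)
    pi = 1.5 * zi + 2.0 * e + rng.gauss(0.0, 0.5)
    z.append(zi)
    p.append(pi)
    y.append(-1.0 * pi + 5.0 * e + rng.gauss(0.0, 1.0))

stage1 = fit_stage1(z, p)
a_hat, b_hat = train_stage2(z, y, stage1, rng)
print(f"estimated price effect: {b_hat:.2f} (true value -1.0)")
```

Passing `share_samples=True` reuses the same draw in both factors, which correlates the residual with the gradient term and pulls the estimate away from the true effect in this toy setting.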

slide-16
SLIDE 16
Causal Validation

  • In general, out-of-sample validation of causal models is challenging or impossible.

  • But both of our losses depend only on observable quantities and reflect the causal loss, so we can simply use standard validation sets.
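
As a rough illustration of validating on observables only, the sketch below scores two candidate response functions with the second-stage loss on held-out (instrument, sales) pairs. The toy world, the assumption that the first-stage price distribution is known exactly, and all names are hypothetical. The held-out loss never uses the latent effect, yet it ranks the causal function ahead of the confounded one.

```python
import random

SIGMA = 4.25 ** 0.5  # assumed sd of price given the instrument in this toy world


def sample_price(z, rng):
    """Assumed first-stage model: p | z ~ Normal(1.5 * z, SIGMA)."""
    return rng.gauss(1.5 * z, SIGMA)


def stage2_validation_loss(g, held_out, rng, n_samples=20):
    """Mean squared gap between y and the Monte Carlo average of g(p)."""
    total = 0.0
    for z, y in held_out:
        g_bar = sum(g(sample_price(z, rng)) for _ in range(n_samples)) / n_samples
        total += (y - g_bar) ** 2
    return total / len(held_out)


# Hypothetical held-out data: true price effect is -1, e is never recorded.
rng = random.Random(0)
held_out = []
for _ in range(2000):
    e = rng.gauss(0.0, 1.0)
    z = rng.gauss(0.0, 1.0)
    p = 1.5 * z + 2.0 * e + rng.gauss(0.0, 0.5)
    held_out.append((z, -1.0 * p + 5.0 * e + rng.gauss(0.0, 1.0)))

loss_causal = stage2_validation_loss(lambda p: -1.0 * p, held_out, rng)
loss_naive = stage2_validation_loss(lambda p: 0.54 * p, held_out, rng)
print(f"held-out loss, causal g: {loss_causal:.2f}, naive g: {loss_naive:.2f}")
```

The slope 0.54 is roughly what an ordinary regression of sales on price would return in this toy world, so the comparison mirrors a choice between a causal and a purely predictive model.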

slide-17
SLIDE 17

Evaluation

Simulation & Bing Ads Experiments

slide-18
SLIDE 18

Simulation Experiments

[Diagram: Holidays, Conference, Customer type, Customer features, Price Sensitivity, Fuel Cost, Ticket Price, Ticket Sales]

Customer type: $s \sim \mathrm{Unif}\{0, 1, \ldots, 6\}$

$\rho$ lets us smoothly vary the correlation between sales and price

slide-19
SLIDE 19

Simulation – low dimensional feature space

slide-20
SLIDE 20

Simulation – low dimensional feature space

slide-21
SLIDE 21

Simulation – low dimensional feature space

slide-22
SLIDE 22

Simulation – low dimensional feature space

slide-23
SLIDE 23

Simulation – low dimensional feature space

slide-24
SLIDE 24

Simulation – low dimensional feature space

slide-25
SLIDE 25

[Darolles et al. 2011]

Simulation – low dimensional feature space

slide-26
SLIDE 26

[Darolles et al. 2011]

Simulation – low dimensional feature space

slide-27
SLIDE 27

Implications and future directions

  • We recover heterogeneous treatment effects in settings with unobserved confounding effects, for both discrete and continuous variables, and SGD scales naturally to very large datasets.

  • Can leverage the flexibility of deep nets for rich data types, e.g. raw text in our Bing Ads application experiments and images in simulation.

Future work:

  • Methods for uncertainty estimates over predictions.

Code and paper available at http://bit.ly/DeepIV

Poster #127