Decomposing racial differences in outcomes from nonlinear models

Decomposing racial differences in outcomes from nonlinear models Paul L. Hebert, PhD Research Associate Professor University of Washington School of Public Health Department of Health Services AcademyHealth Annual Research Meeting 2017


slide-1
SLIDE 1

Decomposing racial differences in outcomes from nonlinear models

Paul L. Hebert, PhD Research Associate Professor University of Washington School of Public Health Department of Health Services AcademyHealth Annual Research Meeting 2017

Supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R01HD078565 and by the National Institute on Minority Health and Health Disparities of the National Institutes of Health under Award Number R01MD007651. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
slide-2
SLIDE 2

Motivation

Neonatal morbidity and mortality among very preterm neonates of White and Black mothers in New York City, 2010-2014

                                                      White mothers   Black mothers
N                                                     1418            2775
Died <28 days or neonatal morbidity (“Events”), n     319             893
Event rate, %                                         22.5%           32.2%
Excess events among black mothers                                     269

  • From 2010-14 there were 2775 very preterm (24-32 gestational weeks) neonates born to black mothers and 1418 to white mothers in NYC.
  • The rate of morbidity and mortality (“events”) was 9.7 percentage points higher for black mothers than for white mothers.
  • This implies there were 269 (9.7% × 2775 = 269) excess events for black mothers.
  • Q: What are the most important modifiable factors that contribute to these 269 excess deaths/morbidity among black neonates?
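The excess-event arithmetic can be reproduced in a few lines. This is only a sketch using the counts from the table; the variable names are illustrative and not part of the original analysis.

```python
# Counts taken from the slide-2 table (NYC, 2010-2014).
n_white, n_black = 1418, 2775
events_white, events_black = 319, 893

rate_white = events_white / n_white
rate_black = events_black / n_black
gap = rate_black - rate_white  # difference in event rates

# Excess events: events among black mothers beyond what the white rate would produce.
excess_events = gap * n_black

print(f"white rate = {rate_white:.1%}, black rate = {rate_black:.1%}")
print(f"gap = {gap * 100:.1f} percentage points")
print(f"excess events = {excess_events:.0f}")
```

Using unrounded rates gives a figure that rounds to the same 269 excess events reported on the slide.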

slide-3
SLIDE 3

There are many candidate causes of the excess events

  • Black mothers differ substantially from white mothers in terms of:
      • A. Baby risk characteristics (e.g., weight, gender, APGAR, gestational age, etc.)
      • B. Mother’s health risk (e.g., age, diabetes, obesity, …)
      • C. Mother’s SES (e.g., insurance, education, …)

                              White (n=1418)   Black (n=2775)
Baby’s risk
  Birthweight, mean (SD)      1274 (485)       1159 (433)
  APGAR<7, % (n)              39% (550)        48% (1344)
Mother’s risk
  Age<20, % (n)               1.6% (23)        5.7% (158)
  Overweight/obese, % (n)     32% (454)        61% (1682)
Mother’s SES
  <HS education, %            7%               20%
  Medicaid insurance, %       26%              68%

slide-4
SLIDE 4

Typical, unsatisfying solution

  • Estimate a series of logistic regressions of the event as a function of race.
  • Sequentially add sets of variables to the model each time.
  • See how the odds ratio on race changes with each set of new variables.

Odds ratio on black race from a sequence of logistic regressions

Model                              Odds ratio   P-value
Unadjusted                         1.64         <0.001
Mom SES                            1.55         <0.001
Mom SES + mom risk                 1.36         0.03
Mom SES + mom risk + baby risk     1.15         0.171

Marked on the chart: 0.21, the drop in the odds ratio (1.36 to 1.15) when baby risk is added last.

slide-5
SLIDE 5

Why this is unsatisfying

  • The order matters.
  • The results are expressed in terms of odds ratios.
  • We need a more useful method for decomposing the number of excess events into the various factors that contribute to those events. E.g.,
      • How many excess black neonate deaths/morbidity would be avoided if the birthweight of neonates of minority and white mothers did not differ?
      • How many would be avoided if insurance status of minority and white mothers did not differ?

Odds ratio on black race from a sequence of logistic regressions

Model                              Odds ratio   P-value
Unadjusted                         1.64         <0.001
Baby risk                          1.35         0.001
Baby risk + mom risk               1.23         0.033
Baby risk + mom risk + mom SES     1.15         0.171

Marked on the chart: 0.29, the drop in the odds ratio (1.64 to 1.35) when baby risk is added first.
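To make the order dependence concrete, the sketch below takes the odds ratios shown on slides 4 and 5 and computes how much the odds ratio falls when each block of variables enters. The helper `or_drops` is a hypothetical name introduced only for this illustration.

```python
# Odds ratios on black race read from the two charts (slides 4 and 5).
order_1 = [("unadjusted", 1.64), ("mom SES", 1.55), ("mom risk", 1.36), ("baby risk", 1.15)]
order_2 = [("unadjusted", 1.64), ("baby risk", 1.35), ("mom risk", 1.23), ("mom SES", 1.15)]

def or_drops(sequence):
    """Drop in the odds ratio when each block of variables enters the model."""
    drops = {}
    for (_, previous), (name, current) in zip(sequence, sequence[1:]):
        drops[name] = round(previous - current, 2)
    return drops

drops_1 = or_drops(order_1)  # SES first, baby risk last
drops_2 = or_drops(order_2)  # baby risk first, SES last

for name in ("baby risk", "mom risk", "mom SES"):
    print(f"{name:10s} slide-4 order: {drops_1[name]:.2f}   slide-5 order: {drops_2[name]:.2f}")
```

The same block of variables appears to explain a different share of the gap depending on when it enters, and the answer is still on the odds-ratio scale rather than in events.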

slide-6
SLIDE 6

Today’s presentation

  • Problems with decomposing outcomes from nonlinear models
  • Solution proposed by Fairlie (2005)
  • Solution proposed through application of the Shapley Value
  • Summary
slide-7
SLIDE 7

Decomposing racial differences in linear models is relatively straight-forward

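[Figure: probability of event (y-axis) plotted against birth weight (x-axis). The black mean and white mean birth weights are marked, with the corresponding probabilities P and P’.]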

slide-8
SLIDE 8

Decomposing racial differences in linear models is relatively straight-forward

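[Figure: probability of event (y-axis) plotted against birth weight (x-axis), with lines labeled Medicaid recipients and Private insurance. The black mean and white mean birth weights are marked, with the corresponding probabilities P|Mcaid and P’|Mcaid.]

  • The effect of birth weight on the event is the same for Medicaid and non-Medicaid recipients, and
  • The effect of Medicaid is the same at every level of birth weight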

slide-9
SLIDE 9

Oaxaca-Blinder decomposition for linear models (1973)

  • You can decompose racial differences estimated from linear models using only the means of the explanatory variables and the coefficients from the model.

$$\bar{Y}_W - \bar{Y}_B = \left(\bar{X}_W - \bar{X}_B\right)\hat{\beta}_W + \left(\hat{\beta}_W - \hat{\beta}_B\right)\bar{X}_B$$
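A small numerical check of the linear decomposition, on simulated data. Everything here (group sizes, coefficients, the `simulate` helper) is an assumption made for illustration; the point is only that, with ordinary least squares fitted separately by group, the gap in mean outcomes splits exactly into a part due to differences in means and a part due to differences in coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, x_mean, beta):
    """Simulated group: intercept plus one covariate, linear outcome with noise."""
    X = np.column_stack([np.ones(n), rng.normal(x_mean, 1.0, n)])
    y = X @ beta + rng.normal(0.0, 0.5, n)
    return X, y

# Hypothetical groups W and B with different covariate means and coefficients.
X_w, y_w = simulate(500, x_mean=1.0, beta=np.array([0.2, 0.5]))
X_b, y_b = simulate(800, x_mean=0.4, beta=np.array([0.3, 0.7]))

# Ordinary least squares fitted separately for each group.
beta_w = np.linalg.lstsq(X_w, y_w, rcond=None)[0]
beta_b = np.linalg.lstsq(X_b, y_b, rcond=None)[0]

xbar_w, xbar_b = X_w.mean(axis=0), X_b.mean(axis=0)

gap = y_w.mean() - y_b.mean()
explained = (xbar_w - xbar_b) @ beta_w    # differences in characteristics
unexplained = (beta_w - beta_b) @ xbar_b  # differences in coefficients

print(f"gap = {gap:.4f}, explained + unexplained = {explained + unexplained:.4f}")
```

The identity holds because an OLS fit with an intercept passes through the point of means, which is exactly what fails for a logit model.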

slide-10
SLIDE 10

Nonlinearity of logit model complicates things

[Figure: probability of event (y-axis) plotted against birth weight (x-axis), with the black mean and white mean birth weights marked. Braces mark the effect of Medicaid at the black mean birth weight and the effect of Medicaid at the white mean birth weight.]

  • The effect of birth weight on the probability of the event depends on the level of birth weight, and on the level of every other variable in the model.
  • Effect of Medicaid depends on birth weight
  • Effect sizes are path dependent
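A minimal sketch of why the Medicaid effect is not a single number in a logit model. The coefficients below are hypothetical values chosen only to illustrate the shape of the problem; only the two mean birth weights come from the slide-3 table.

```python
import math

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: intercept, per-gram birth weight, Medicaid indicator.
B0, B_WEIGHT, B_MEDICAID = 1.5, -0.002, 0.5

def p_event(birth_weight, medicaid):
    return inv_logit(B0 + B_WEIGHT * birth_weight + B_MEDICAID * medicaid)

def medicaid_effect(birth_weight):
    """Change in probability when Medicaid switches from 0 to 1 at a given weight."""
    return p_event(birth_weight, 1) - p_event(birth_weight, 0)

# Mean birth weights from the slide-3 table: 1159 g (black), 1274 g (white).
effect_at_black_mean = medicaid_effect(1159)
effect_at_white_mean = medicaid_effect(1274)

print(f"Medicaid effect at black mean birth weight: {effect_at_black_mean:.3f}")
print(f"Medicaid effect at white mean birth weight: {effect_at_white_mean:.3f}")
```

The coefficient on Medicaid is the same in both calls; the change in probability differs because it is evaluated at different points on the logistic curve.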

slide-11
SLIDE 11

A consequence of this nonlinearity is Jensen’s inequality

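$$E[f(x)] \equiv f(E[x]) \text{ iff } f(\cdot) \text{ is linear}$$

  • For example, the event rate for black mothers in our sample was 0.322.
  • But if we estimate a logistic regression and apply the coefficients to the means of the variables in the model, we get

$$\mathrm{logit}^{-1}\left(\hat{\beta}_0 + \hat{\beta}_1 \overline{\mathrm{Baby}}_B + \hat{\beta}_2 \overline{\mathrm{SES}}_B + \hat{\beta}_3 \overline{\mathrm{Mom}}_B + \hat{\beta}_4\right) = 0.276$$

where $\overline{\mathrm{Baby}}_B$ is the mean (i.e., the expected value) of baby risk factors for black mothers, and $\hat{\beta}_4$ is the coefficient on black race.

  • On the other hand, the mean of the individual predictions for black mothers is identical to the black event rate:

$$\frac{1}{n_b}\sum_{i=1}^{n_b} \mathrm{logit}^{-1}\left(\hat{\beta}_0 + \hat{\beta}_1 \mathrm{Baby}_i + \hat{\beta}_2 \mathrm{SES}_i + \hat{\beta}_3 \mathrm{Mom}_i + \hat{\beta}_4\right) = 0.322$$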

  • This is the inspiration for the decomposition method proposed by Fairlie (2005)
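The gap between "prediction at the means" and "mean of the predictions" is easy to reproduce. The sketch below uses a simulated linear index with made-up mean and spread, so the two numbers it prints are illustrative and are not the 0.276 and 0.322 from the slide.

```python
import numpy as np

def inv_logit(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Hypothetical linear index (intercept plus coefficients times covariates) for 2775 mothers.
index = rng.normal(loc=-1.0, scale=1.5, size=2775)

# The index is linear in the covariates, so evaluating it at the covariate means
# is the same as taking the mean of the index.
prob_at_mean = inv_logit(index.mean())    # f(E[x])
mean_of_probs = inv_logit(index).mean()   # E[f(x)]

print(f"prediction at the means: {prob_at_mean:.3f}")
print(f"mean of the predictions: {mean_of_probs:.3f}")
```

Only the second quantity reproduces an observed event rate, which is why the methods that follow work with individual-level predictions.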
slide-12
SLIDE 12

Today’s presentation

  • Problems with decomposing outcomes from nonlinear models
  • Solution proposed by Fairlie (2005)
  • Solution proposed through application of the Shapley Value
  • Summary
slide-13
SLIDE 13

Fairlie RW (2005), “An Extension of Blinder-Oaxaca Decomposition Technique to Logit and Probit Models” J of Econ and Soc Measure, 30(4), 305-316

  • Don’t use the means of explanatory variables for black and white mothers; swap explanatory variables between white and black mothers at the individual level.
  • To estimate the contribution of groups of variables A, B, and C to an event Y for a sample that contains Nb black and Nw white mothers:

1. Estimate a logistic regression of the probability of the event as a function of A, B, and C for each race group. Calculate the predicted probability of the event for every mother. Call this P0.
2. Draw at random Nb white mothers. Match black mothers 1:1 to sampled white mothers based on predicted probability of the event.
3. Replace each black mother’s data for variables in A with data from her matched white counterpart, and re-calculate the predicted probability of the event for the black mothers. Call this PA.
4. Replace each black mother’s data for variables in B with data from her matched white counterpart, and re-calculate the predicted probability of the event for the black mothers. Call this PAB.
5. Replace each black mother’s data for variables in C with data from her matched white counterpart, and re-calculate the predicted probability of the event for the black mothers. Call this PABC.
6. The contribution of A to the racial difference in the probability of the event is PA - P0, the contribution of B is PAB - PA, and the contribution of C is PABC - PAB.
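The swapping steps can be sketched as follows. This is not Fairlie's code: it skips step 1 and uses hypothetical, fixed coefficients instead of a fitted logistic regression, uses one simulated variable per group A, B, and C, and assumes the white group is the larger one so that Nb white mothers can be drawn without replacement.

```python
import numpy as np

rng = np.random.default_rng(2)

def inv_logit(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients standing in for the fitted model of step 1.
beta0 = -1.0
beta = {"A": 0.8, "B": 0.4, "C": 0.3}

def simulate(n, shift):
    """One standardized variable per group; 'shift' moves the group means."""
    return {k: rng.normal(shift, 1.0, n) for k in beta}

def predict(data):
    return inv_logit(beta0 + sum(beta[k] * data[k] for k in beta))

n_black, n_white = 300, 500
black = simulate(n_black, shift=0.5)
white = simulate(n_white, shift=0.0)

# Step 2: sample n_black white mothers, then pair by rank of predicted probability.
sampled = rng.choice(n_white, size=n_black, replace=False)
white_s = {k: v[sampled] for k, v in white.items()}
black_order = np.argsort(predict(black))
white_order = np.argsort(predict(white_s))
match = np.empty(n_black, dtype=int)
match[black_order] = white_order  # match[i] = matched white mother for black mother i

# Steps 3-5: swap in the matched white values one group at a time (order A, B, C).
current = dict(black)
means = [predict(current).mean()]          # P0
for k in ("A", "B", "C"):
    current[k] = white_s[k][match]
    means.append(predict(current).mean())  # PA, PAB, PABC

# Step 6: contributions are successive differences (PA - P0, PAB - PA, PABC - PAB).
contrib = {k: means[i + 1] - means[i] for i, k in enumerate(("A", "B", "C"))}
print({k: round(float(v), 4) for k, v in contrib.items()})
```

Once every group has been swapped, the black mothers carry exactly the sampled white mothers' data, so the contributions add up to the gap between the black mean prediction and the sampled white mean prediction. Changing the order in the loop changes the individual contributions but not their sum.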

slide-14
SLIDE 14

White Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
1    1386          F    …  12         Medicaid   …  31   Y         …  0.85
2    761           M    …  12         Private    …  17   N         …  0.85
3    1487          F    …  14         Private    …  34   N         …  0.83
4    1197          M    …  10         Private    …  27   Y         …  0.82
5    1287          F    …  9          Medicaid   …  35   N         …  0.78
6    613           F    …  8          Private    …  21   Y         …  0.75
7    1412          F    …  11         Private    …  34   N         …  0.51
8    1261          M    …  10         Private    …  32   Y         …  0.27
9    1447          F    …  10         Private    …  24   N         …  0.21
10   721           F    …  14         Medicaid   …  24   N         …  0.04
…    …             …    …  …          …          …  …    …         …  …
XXX  1492          F    …  16         Private    …  30   N         …  0.12

Black Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
XXX  554           F    …  8          Medicaid   …  20   Y         …  0.97
12   815           F    …  13         Private    …  16   N         …  0.66
13   1122          M    …  15         Medicaid   …  24   N         …  0.54
14   1044          F    …  10         Medicaid   …  19   Y         …  0.51
15   556           M    …  11         Medicaid   …  24   N         …  0.42
…    …             …    …  …          …          …  …    …         …  …
YYY  606           F    …  15         Private    …  34   N         …  0.72

P0 = Mean Prob(Event|Ablk, Bblk, Cblk) = 0.322
PA = Mean Prob(Event|Awht, Bblk, Cblk)
PAB = Mean Prob(Event|Awht, Bwht, Cblk)
PABC = Mean Prob(Event|Awht, Bwht, Cwht)

  • 1. Estimate a logistic regression of the event as a function of A, B, and C. Calculate the predicted probability of the event for every mother.

slide-15
SLIDE 15

White Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
1    1386          F    …  12         Medicaid   …  31   Y         …  0.85
2    761           M    …  12         Private    …  17   N         …  0.85
3    1487          F    …  14         Private    …  34   N         …  0.83
4    1197          M    …  10         Private    …  27   Y         …  0.82
5    1287          M    …  9          Medicaid   …  35   N         …  0.78
6    613           F    …  8          Private    …  21   Y         …  0.75
7    1412          F    …  11         Private    …  34   N         …  0.51
8    1261          M    …  10         Private    …  32   Y         …  0.27
9    1447          F    …  10         Private    …  24   N         …  0.21
10   721           M    …  14         Medicaid   …  24   N         …  0.04
…    …             …    …  …          …          …  …    …         …  …
XXX  1492          F    …  16         Private    …  30   N         …  0.12

Black Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
XXX  554           F    …  8          Medicaid   …  20   Y         …  0.97
12   815           F    …  13         Private    …  16   N         …  0.66
13   1122          M    …  15         Medicaid   …  24   N         …  0.54
14   1044          F    …  10         Medicaid   …  19   Y         …  0.51
15   556           M    …  11         Medicaid   …  24   N         …  0.42
…    …             …    …  …          …          …  …    …         …  …
YYY  606           F    …  15         Private    …  34   N         …  0.72

  • 2. Draw at random Nb white mothers. Match black mothers 1:1 to sampled white mothers based on predicted probability of the event.

P0 = Mean Prob(Event|Ablk, Bblk, Cblk) = 0.322
PA = Mean Prob(Event|Awht, Bblk, Cblk)
PAB = Mean Prob(Event|Awht, Bwht, Cblk)
PABC = Mean Prob(Event|Awht, Bwht, Cwht)

slide-16
SLIDE 16

White Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
1    1386          F    …  12         Medicaid   …  31   Y         …  0.85
2    761           M    …  12         Private    …  17   N         …  0.85
3    1487          F    …  14         Private    …  34   N         …  0.83
4    1197          M    …  10         Private    …  27   Y         …  0.82
5    1287          M    …  9          Medicaid   …  35   N         …  0.78
6    613           F    …  8          Private    …  21   Y         …  0.75
7    1412          F    …  11         Private    …  34   N         …  0.51
8    1261          M    …  10         Private    …  32   Y         …  0.27
9    1447          F    …  10         Private    …  24   N         …  0.21
10   721           M    …  14         Medicaid   …  24   N         …  0.04
…    …             …    …  …          …          …  …    …         …  …
XXX  1492          F    …  16         Private    …  30   N         …  0.12

Black Mothers (variables in A replaced)

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
XXX  761           M    …  8          Medicaid   …  20   Y         …  0.67
12   1197          M    …  13         Private    …  16   N         …  0.56
13   1287          M    …  15         Medicaid   …  24   N         …  0.74
14   1412          F    …  10         Medicaid   …  19   Y         …  0.61
15   721           M    …  11         Medicaid   …  24   N         …  0.74
…    …             …    …  …          …          …  …    …         …  …
YYY  1276          F    …  15         Private    …  34   N         …  0.72

  • 3. Replace each black mother’s data for variables in A with data from her matched white counterpart, and re-calculate the mean of the predicted probability of the event for the black mothers. Call this PA.

P0 = Mean Prob(Event|Ablk, Bblk, Cblk) = 0.322
PA = Mean Prob(Event|Awht, Bblk, Cblk) = 0.255
PAB = Mean Prob(Event|Awht, Bwht, Cblk)
PABC = Mean Prob(Event|Awht, Bwht, Cwht)

slide-17
SLIDE 17

White Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
1    1386          F    …  12         Medicaid   …  31   Y         …  0.85
2    761           M    …  12         Private    …  17   N         …  0.85
3    1487          F    …  14         Private    …  34   N         …  0.83
4    1197          M    …  10         Private    …  27   Y         …  0.82
5    1287          M    …  9          Medicaid   …  35   N         …  0.78
6    613           F    …  8          Private    …  21   Y         …  0.75
7    1412          F    …  11         Private    …  34   N         …  0.51
8    1261          M    …  10         Private    …  32   Y         …  0.27
9    1447          F    …  10         Private    …  24   N         …  0.21
10   721           M    …  14         Medicaid   …  24   N         …  0.04
…    …             …    …  …          …          …  …    …         …  …
XXX  1492          F    …  16         Private    …  30   N         …  0.12

Black Mothers (variables in A and B replaced)

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
XXX  761           M    …  12         Private    …  20   Y         …  0.97
12   1197          M    …  10         Private    …  16   N         …  0.66
13   1287          M    …  9          Medicaid   …  24   N         …  0.54
14   1412          F    …  11         Private    …  19   Y         …  0.51
15   721           M    …  14         Medicaid   …  24   N         …  0.42
…    …             …    …  …          …          …  …    …         …  …
YYY  1276          F    …  15         Private    …  34   N         …  0.72

  • 4. Replace each black mother’s data for variables in B with data from her matched white counterpart, and re-calculate the mean of the predicted probability of the event for the black mothers. Call this PAB.

P0 = Mean Prob(Event|Ablk, Bblk, Cblk) = 0.322
PA = Mean Prob(Event|Awht, Bblk, Cblk) = 0.275
PAB = Mean Prob(Event|Awht, Bwht, Cblk) = 0.255
PABC = Mean Prob(Event|Awht, Bwht, Cwht)

slide-18
SLIDE 18

White Mothers

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
1    1386          F    …  12         Medicaid   …  31   Y         …  0.85
2    761           M    …  12         Private    …  17   N         …  0.85
3    1487          F    …  14         Private    …  34   N         …  0.83
4    1197          M    …  10         Private    …  27   Y         …  0.82
5    1287          M    …  9          Medicaid   …  35   N         …  0.78
6    613           F    …  8          Private    …  21   Y         …  0.75
7    1412          F    …  11         Private    …  34   N         …  0.51
8    1261          M    …  10         Private    …  32   Y         …  0.27
9    1447          F    …  10         Private    …  24   N         …  0.21
10   721           M    …  14         Medicaid   …  24   N         …  0.04
…    …             …    …  …          …          …  …    …         …  …
XXX  1492          F    …  16         Private    …  30   N         …  0.12

Black Mothers (variables in A, B, and C replaced)

     Baby risk factors (A) Mother's SES (B)         Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  …  Age  Diabetes  …  Estimated prob(event)
XXX  761           M    …  12         Private    …  17   N         …  0.76
12   1197          M    …  10         Private    …  27   Y         …  0.86
13   1287          M    …  9          Medicaid   …  35   N         …  0.77
14   1412          F    …  11         Private    …  34   N         …  0.80
15   721           M    …  14         Medicaid   …  24   N         …  0.72
…    …             …    …  …          …          …  …    …         …  …
YYY  1276          F    …  15         Private    …  31   Y         …  0.79

  • 5. Replace each black mother’s data for variables in C with data from her matched white counterpart, and re-calculate the predicted probability of the event for the black mothers. Call this PABC.

P0 = Mean Prob(Event|Ablk, Bblk, Cblk) = 0.322
PA = Mean Prob(Event|Awht, Bblk, Cblk) = 0.275
PAB = Mean Prob(Event|Awht, Bwht, Cblk) = 0.255
PABC = Mean Prob(Event|Awht, Bwht, Cwht) = 0.245

slide-19
SLIDE 19

Example, Fairlie method applied to very preterm birth data in NYC 2010-14

                                      Probability of the event
                                      Percent    P-value    Events
White morbidity-mortality             22.5                  319
Black morbidity-mortality             32.2                  893
Difference                            9.7        <0.001     269
Contributing factors
  A. Baby’s health risk               5.7        <0.001     158
  B. Mother's socioeconomic status    1.1        0.140      38
  C. Mother's health risk             0.9        0.495      31
Total explained difference            7.7                   228

Percent of total attributed to baby health risk: 74%

slide-20
SLIDE 20

Benefits of Fairlie (2005)

  • Addresses the problem of Jensen’s inequality
  • Simple and intuitive
  • Software is available
  • Stata code for Fairlie:
  • http://fmwww.bc.edu/RePEc/bocode/f/
  • net install fairlie.pkg
  • SAS macro for Fairlie:

https://people.ucsc.edu/~rfairlie/decomposition/decompexample_v6.sas

slide-21
SLIDE 21

Potential problems with Fairlie

  • The results still depend on the order in which you swap white/black variables.
      • Fairlie suggests repeating estimates multiple times with random ordering of swapping variables.
  • Matches smaller group to a sample from the larger group.
      • Fairlie suggests repeating estimates multiple times, drawing different samples each time, but if the minority group is the larger group (as in the present study) your results potentially apply to only a sample of the minority patients in your study.
  • Matching based on predicted probability of event is potentially problematic.
      • If the minority and majority groups are the same size then the same patients are matched in each iteration.
  • Matching on a sample makes the use of survey weights problematic.
      • When you average the predicted probability for black mothers with white mothers’ characteristics, do you use the black mothers’ survey weights or the white mothers’?

slide-22
SLIDE 22

Today’s presentation

  • Problems with decomposing outcomes from nonlinear models
  • Solution proposed by Fairlie (2005)
  • Solution proposed through application of the Shapley Value
  • Summary
slide-23
SLIDE 23

Shapley overview

  • Lloyd Shapley: Nobel Prize in economics in 2012
  • Shapley value: How to assign values to individual players in a cooperative game. A cooperative game consists of:
      • Players: e.g., a game could have four players: Players A, B, C and D.
      • Coalitions: players can play by themselves or form coalitions with other players, e.g., {A} is player A by himself, {A,C} is players A and C forming a team, etc.
      • Worth of coalitions: the payoff or cost or some other metric of the “worth” of each coalition can be defined.
  • Shapley Value defines how each of the players in a game should be valued.
  • Shapley Value is based on how much each player contributes to the worth of coalitions they join.
  • Useful for dividing up a single pot of money across several “players”
  • Example: Dividing a single insurance settlement for a car accident across several people injured in the accident

  • Google Analytics uses it to measure the value of different forms of internet advertising
slide-24
SLIDE 24

Shapley Example: Splitting the cost of commuting with Steve Zeliadt and Vince Fan

Coalition                 Miles driven
I commute alone           5.4 miles
Steve alone
Vince alone
Steve and me
Vince and me
Steve and Vince
Steve and Vince and me

[Map labels: Paul, VA HSR&D, VA Hospital]

slide-25
SLIDE 25

Shapley Example: Splitting the cost of commuting with Steve and Vince

Coalition                 Miles driven
I commute alone           5.4 miles
Steve alone               5.2 miles
Vince alone
Steve and me
Vince and me
Steve and Vince
Steve and Vince and me

[Map labels: Paul, Steve, VA HSR&D, VA Hospital]

slide-26
SLIDE 26

Shapley Example: Splitting the cost of commuting with Steve and Vince

Coalition                 Miles driven
I commute alone           5.4 miles
Steve alone               5.2 miles
Vince alone               5.9 miles
Steve and me
Vince and me
Steve and Vince
Steve and Vince and me

[Map labels: Vince, Paul, Steve, VA HSR&D, VA Hospital]

slide-27
SLIDE 27

Shapley Example: Splitting the cost of commuting with Steve and Vince

Coalition                 Miles driven
I commute alone           5.4 miles
Steve alone               5.2 miles
Vince alone               5.9 miles
Steve and me              5.6 miles
Vince and me
Steve and Vince
Steve and Vince and me

[Map labels: Vince, Paul, Steve, VA HSR&D, VA Hospital]

slide-28
SLIDE 28

Shapley Example: Splitting the cost of commuting with Steve and Vince

Coalition                 Miles driven
I commute alone           5.4 miles
Steve alone               5.2 miles
Vince alone               5.9 miles
Steve and me              5.6 miles
Vince and me              8.8 miles
Steve and Vince           8.9 miles
Steve and Vince and me    9.2 miles

[Map labels: Vince, Paul, Steve, VA HSR&D, VA Hospital]

slide-29
SLIDE 29

Incremental costs of coalition members

                                                                 Incremental costs
Coalition              Miles   Total coalition annual gas costs  Paul    Steve   Vince
Null coalition                 $0
Paul                   5.4     $1,688                            1,688
Steve                  5.1     $1,622                                    1,622
Vince                  5.9     $1,798                                            1,798
Paul + Steve           5.6     $1,732                            110     44
Paul + Vince           8.8     $2,436                            638             748
Steve + Vince          8.9     $2,458                                    660     836
Paul + Vince + Steve   9.2     $2,524                            66      88      792

Q: At the end of the year, how do we divide this $2,524 bill across the three players?

slide-30
SLIDE 30

Calculating the Shapley value ($\varphi_i$) from the worth of each of the $2^N - 1$ coalitions

  • The Shapley value for a player is a weighted average of the contribution that player makes to each coalition the player joins:

$$\varphi_i(N, v) = \frac{1}{|N|!} \sum_{S \subseteq N \setminus \{i\}} |S|!\,\left(|N| - |S| - 1\right)!\,\left[v(S \cup \{i\}) - v(S)\right]$$

  • This is not as complicated as it looks.
  • N! is the number of permutations of me, Steve and Vince, i.e.,
      • Paul, Steve, Vince
      • Paul, Vince, Steve
      • Steve, Paul, Vince
      • Vince, Paul, Steve
      • Steve, Vince, Paul
      • Vince, Steve, Paul
  • For a given coalition, the weight is the proportion of these permutations in which I enter the car to find the coalition I am joining.

Coalition             Weight   Incremental cost of me joining the coalition
Me alone              2/6      $1,688
Steve + me            1/6      $110
Vince + me            1/6      $638
Steve + Vince + me    2/6      $66
My Shapley Value               $709

slide-31
SLIDE 31

Properties of the Shapley Value

                              Shapley value
Paul                          $709
Steve                         $687
Vince                         $1,127
Sum                           $2,524
Worth {Paul, Steve, Vince}    $2,524

  • The sum of the Shapley Values of all the players is equal to the worth of the coalition formed by all of the players.
  • This means the total cost or payout or settlement is exactly apportioned to each of the players.
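The commuting example can be checked by brute force. The sketch below follows the "order of entering the car" description: it averages each player's incremental cost over all 3! = 6 orderings, using the annual gas costs from the incremental-cost table on slide 29. The function name `shapley` and the data layout are illustrative choices.

```python
from itertools import permutations

# Annual gas cost of each coalition, from the incremental-cost table (slide 29).
cost = {
    frozenset(): 0,
    frozenset({"Paul"}): 1688,
    frozenset({"Steve"}): 1622,
    frozenset({"Vince"}): 1798,
    frozenset({"Paul", "Steve"}): 1732,
    frozenset({"Paul", "Vince"}): 2436,
    frozenset({"Steve", "Vince"}): 2458,
    frozenset({"Paul", "Steve", "Vince"}): 2524,
}
players = ["Paul", "Steve", "Vince"]

def shapley(players, worth):
    """Average each player's incremental cost over all orders of entering the car."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        in_car = frozenset()
        for p in order:
            totals[p] += worth[in_car | {p}] - worth[in_car]
            in_car = in_car | {p}
    return {p: totals[p] / len(orders) for p in players}

values = shapley(players, cost)
for p in players:
    print(f"{p}: ${values[p]:,.0f}")
print(f"Sum: ${sum(values.values()):,.0f}")
```

The three values round to $709, $687, and $1,127, and their unrounded sum is exactly the $2,524 cost of the full coalition.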

slide-32
SLIDE 32

How does this work with decomposing racial differences in outcomes?

  • Racial differences in outcomes are the result of a cooperative game.
  • The players are factors that contribute to bad outcomes, e.g.,
      • A. Baby’s risk factors
      • B. Mother’s SES
      • C. Mother’s risk factors
      • D. Mother’s race
  • Players form coalitions to produce bad outcomes for black mothers.
  • The worth of a coalition is the number of excess outcomes for black mothers associated with that coalition of players.
  • Like Fairlie, the worth is estimated by swapping data between black and white mothers.
  • E.g., the worth of coalition {A,C} is the number of excess events that would occur if each black mother had her own baby’s risk factors {A} and mother’s risk factors {C}, but swapped SES {B} and race {D} with every white mother in the sample.

slide-33
SLIDE 33

How does this work with decomposing racial differences in outcomes? (2)

  • 1. Estimate a logistic regression of the probability of the event as a function of A, B, C, and a race dummy.
  • 2. Replace each white mother’s data for variables in A with data from the first black mother in the sample. Calculate the predicted probability of the event for each white mother and average them. This is the counterfactual probability of the event for coalition {A} for the first black mother.
  • 3. Repeat (2) for every other black mother in the sample.
  • 4. Repeat (2)-(3) for every possible coalition of the four players. This provides the worth of every possible coalition of players.
  • 5. Calculate the Shapley Value for each player from the results of (4).
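The five steps can be sketched end to end on simulated data. This is an illustration under stated assumptions, not the author's Stata code: step 1 is replaced by hypothetical fixed coefficients, each player A, B, C is a single simulated variable, D is the race dummy, and the sample sizes are tiny. Because the logit index is additive, "every black mother paired with every white mother" is computed with one broadcasted sum rather than an explicit double loop.

```python
from itertools import combinations, permutations
import numpy as np

rng = np.random.default_rng(3)

def inv_logit(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients standing in for the fitted model; D = 1 for black mothers.
beta0 = -1.2
beta = {"A": 0.8, "B": 0.4, "C": 0.3, "D": 0.25}
players = list(beta)

n_white, n_black = 40, 60
white = {k: rng.normal(0.0, 1.0, n_white) for k in "ABC"}
black = {k: rng.normal(0.5, 1.0, n_black) for k in "ABC"}
white["D"] = np.zeros(n_white)
black["D"] = np.ones(n_black)

def worth(coalition):
    """Expected events when each black mother keeps the coalition's variables and
    takes every other variable from each white mother in turn (steps 2-3)."""
    own = np.zeros(n_black)       # part of the index taken from the black mother
    borrowed = np.zeros(n_white)  # part taken from each white mother
    for k in players:
        if k in coalition:
            own += beta[k] * black[k]
        else:
            borrowed += beta[k] * white[k]
    eta = beta0 + own[:, None] + borrowed[None, :]  # shape (n_black, n_white)
    return inv_logit(eta).mean(axis=1).sum()        # average over white, sum over black

# Step 4: worth of all 2^4 coalitions, including the empty one.
v = {frozenset(S): worth(S)
     for r in range(len(players) + 1) for S in combinations(players, r)}

# Step 5: Shapley value as the average incremental worth over all 4! orderings.
orders = list(permutations(players))
phi = {p: 0.0 for p in players}
for order in orders:
    joined = frozenset()
    for p in order:
        phi[p] += (v[joined | {p}] - v[joined]) / len(orders)
        joined = joined | {p}

print({p: round(float(x), 2) for p, x in phi.items()})
```

In this sketch the full coalition reproduces the black mothers' own predicted events, the empty coalition gives the number expected at white mothers' values, and the four Shapley values sum to the difference between the two.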
slide-34
SLIDE 34

White Mothers

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
1    1386          F    …  12         Medicaid   31   Y         …  W
2    761           M    …  12         Private    17   N         …  W
3    1487          F    …  14         Private    34   N         …  W
4    1197          M    …  10         Private    27   Y         …  W
5    1287          M    …  9          Medicaid   35   N         …  W
6    613           F    …  8          Private    21   Y         …  W
7    1412          F    …  11         Private    34   N         …  W
8    1261          M    …  10         Private    32   Y         …  W
9    1447          F    …  10         Private    24   N         …  W
10   721           M    …  14         Medicaid   24   N         …  W

Black Mothers

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
11   554           F    …  8          Medicaid   20   Y         …  B
12   815           F    …  13         Private    16   N         …  B
13   1122          M    …  15         Medicaid   24   N         …  B
14   1044          F    …  10         Medicaid   23   Y         …  B
15   556           M    …  11         Medicaid   24   N         …  B

slide-35
SLIDE 35

White Mothers (variables in A replaced with those of black mother id=11)

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
1    554           F    …  12         Medicaid   31   Y         …  W     0.623
2    554           F    …  12         Private    17   N         …  W     0.038
3    554           F    …  14         Private    34   N         …  W     0.788
4    554           F    …  10         Private    27   Y         …  W     0.322
5    554           F    …  9          Medicaid   35   N         …  W     0.594
6    554           F    …  8          Private    21   Y         …  W     0.995
7    554           F    …  11         Private    34   N         …  W     0.611
8    554           F    …  10         Private    32   Y         …  W     0.452
9    554           F    …  10         Private    24   N         …  W     0.255
10   554           F    …  14         Medicaid   24   N         …  W     0.798

Black Mothers

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
11   554           F    …  8          Medicaid   20   Y         …  B     0.547
12   815           F    …  13         Private    16   N         …  B
13   1122          M    …  15         Medicaid   24   N         …  B
14   1044          F    …  10         Medicaid   23   Y         …  B
15   556           M    …  11         Medicaid   24   N         …  B

$$p_{11}^{A} = \frac{1}{10}\sum_{j=1}^{10} \mathrm{logit}^{-1}\left(\hat{\beta}_0 + \hat{\beta}_1 A_{11} + \hat{\beta}_2 B_j + \hat{\beta}_3 C_j + \hat{\beta}_4 D_j\right) = 0.547$$

  • 1. Get the counterfactual probability of the event if black mother id=11 had her own baby’s risk factors, but the SES and mother’s risk factors of all white mothers.

Note:

  • There is no sampling or matching.
  • If survey weights are necessary, use the white mothers’ weights.

slide-36
SLIDE 36

White Mothers (variables in A replaced with those of black mother id=12)

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
1    815           F    …  12         Medicaid   31   Y         …  W     0.494
2    815           F    …  12         Private    17   N         …  W     0.822
3    815           F    …  14         Private    34   N         …  W     0.029
4    815           F    …  10         Private    27   Y         …  W     0.426
5    815           F    …  9          Medicaid   35   N         …  W     0.831
6    815           F    …  8          Private    21   Y         …  W     0.279
7    815           F    …  11         Private    34   N         …  W     0.734
8    815           F    …  10         Private    32   Y         …  W     0.850
9    815           F    …  10         Private    24   N         …  W     0.094
10   815           F    …  14         Medicaid   24   N         …  W     0.226

Black Mothers

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
11   554           F    …  8          Medicaid   20   Y         …  B     0.547
12   815           F    …  13         Private    16   N         …  B     0.478
13   1122          M    …  15         Medicaid   24   N         …  B
14   1044          F    …  10         Medicaid   23   Y         …  B
15   556           M    …  11         Medicaid   24   N         …  B

$$p_{12}^{A} = \frac{1}{10}\sum_{j=1}^{10} \mathrm{logit}^{-1}\left(\hat{\beta}_0 + \hat{\beta}_1 A_{12} + \hat{\beta}_2 B_j + \hat{\beta}_3 C_j + \hat{\beta}_4 D_j\right) = 0.478$$

  • 2. Repeat for the next black mother in the sample.
slide-37
SLIDE 37

White Mothers (variables in A replaced with those of black mother id=15)

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
1    556           M    …  12         Medicaid   31   Y         …  W     0.494
2    556           M    …  12         Private    17   N         …  W     0.822
3    556           M    …  14         Private    34   N         …  W     0.029
4    556           M    …  10         Private    27   Y         …  W     0.426
5    556           M    …  9          Medicaid   35   N         …  W     0.831
6    556           M    …  8          Private    21   Y         …  W     0.279
7    556           M    …  11         Private    34   N         …  W     0.734
8    556           M    …  10         Private    32   Y         …  W     0.850
9    556           M    …  10         Private    24   N         …  W     0.094
10   556           M    …  14         Medicaid   24   N         …  W     0.226

Black Mothers

     Baby risk factors (A) Mother's SES (B)      Mother's risk (C) (D)
id   Birth weight  Sex  …  Education  Insurance  Age  Diabetes  …  Race  Estimated prob(event)
11   554           F    …  8          Medicaid   20   Y         …  B     0.547
12   815           F    …  13         Private    16   N         …  B     0.478
13   1122          M    …  15         Medicaid   24   N         …  B     0.693
14   1044          F    …  10         Medicaid   23   Y         …  B     0.550
15   556           M    …  11         Medicaid   24   N         …  B     0.110

$$v(A) = \sum_{i=11}^{n_{blk}} p_i^{A} = 769.8$$

  • 3. Sum the predicted probabilities for black mothers. This yields the expected number of events, or worth, for the coalition that includes Baby risk factors alone.
  • Repeat this process to get the worth of every possible coalition of players A, B, C, and D. There are 2^4 - 1 = 15 of them.
  • Calculate the Shapley Value for each player based on the coalitions’ worths.
  • The Shapley Value for a player is the number of excess events that player is responsible for.

slide-38
SLIDE 38

Shapley methods applied to very preterm birth data

                    Incremental worth of player to each coalition
Coalition   Worth   A. Baby   B. SES   C. Mom risk   D. Race
null        622
A           770     148
B           659               36
C           632                        10
D           683                                      61
AB          810     151       40
ABC         820     152       40       10
ABD         881     159       42                     71
AC          780     148                10
ACD         849     156                10            69
AD          839     156                              69
BC          668               36       9
BCD         731               38       9             63
BD          722               38                     63
CD          693                        10            61
ABCD        893     162       44       12            73
Shapley Value       153.8     39.2     9.9           65.9
p-value             <0.001    0.039    0.54          0.055
Percent of excess   57%       15%      4%            25%

Notes:

  • Very little path dependence. A player adds nearly the same to each coalition it joins.
  • Similar results to Fairlie:
      • A. 158 events (p<0.001)
      • B. 38 events (p>0.05)
      • C. 31 events (p>0.05)
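As a cross-check, the Shapley values can be recomputed from the coalition worths in the slide-38 table using the subset-weight formula. The worths below are the rounded integers shown on the slide, so the results are expected to differ slightly from the reported 153.8, 39.2, 9.9, and 65.9, which presumably come from unrounded worths. The string-keyed dictionary and the function name are illustrative choices.

```python
from itertools import combinations
from math import factorial

# Coalition worths (expected events among black mothers), rounded as on the slide.
worth = {
    "": 622, "A": 770, "B": 659, "C": 632, "D": 683,
    "AB": 810, "AC": 780, "AD": 839, "BC": 668, "BD": 722, "CD": 693,
    "ABC": 820, "ABD": 881, "ACD": 849, "BCD": 731, "ABCD": 893,
}
players = "ABCD"

def shapley_value(player):
    """Weighted average of the player's incremental worth over all coalitions of the others."""
    n = len(players)
    others = [p for p in players if p != player]
    value = 0.0
    for size in range(n):
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            without = "".join(subset)
            with_player = "".join(sorted(subset + (player,)))
            value += weight * (worth[with_player] - worth[without])
    return value

values = {p: shapley_value(p) for p in players}
excess = worth["ABCD"] - worth[""]
for p in players:
    print(f"{p}: {values[p]:6.1f} events, {values[p] / excess:.0%} of excess")
```

With the rounded inputs each value lands within about one event of the figure on the slide, and the four values sum exactly to the difference between the full and null coalitions.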
slide-39
SLIDE 39

Advantages/disadvantages of method based on Shapley Value

  • Advantages
      • Every minority subject is matched with every white subject: there is no sampling from the larger population, which could be the population of interest.
      • The relative sizes of the populations are not relevant.
      • Easy to incorporate survey weights.
      • Results sum to observed excess events for minority group.
  • Disadvantage
      • Computationally intensive
slide-40
SLIDE 40

Summary

  • Problems with decomposing outcomes from nonlinear models
      • Jensen’s inequality means you cannot rely only on means and coefficients.
  • Solution proposed by Fairlie (2005)
      • Solves the problem of Jensen’s inequality by swapping data between black and white mothers.
      • Limitations in some circumstances:
          • Order dependent
          • Minority sample larger than white sample is problematic
          • Survey weights difficult to apply
  • Solution proposed through application of the Shapley Value
slide-41
SLIDE 41

References

  • Fairlie RW (2005), “An Extension of Blinder-Oaxaca Decomposition Technique to Logit and Probit Models,” Journal of Economic and Social Measurement, 30(4), 305-316
  • Stata code for Fairlie:
      • http://fmwww.bc.edu/RePEc/bocode/f/
      • net install fairlie.pkg
  • SAS code for Fairlie: https://people.ucsc.edu/~rfairlie/decomposition/decompexample_v6.sas
  • Stata code for Shapley Value: Paul.Hebert2@va.gov