Potential outcomes & threats to validity February 19, 2020 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Potential outcomes & threats to validity

February 19, 2020

PMAP 8521: Program Evaluation for Public Service
Andrew Young School of Policy Studies
Spring 2020

Fill out your reading report on iCollege!
slide-2
SLIDE 2

Plan for today

Potential outcomes
The Four Horsemen of Validity

slide-3
SLIDE 3

Potential outcomes

slide-4
SLIDE 4

Program effect

[Figure: outcome variable over time (before, during, and after the program), marking the pre-program outcome level, the post-program outcome level, the outcome change, the outcome with program, the outcome without program, and the program effect. Symbols shown: δ, Y, X.]

slide-5
SLIDE 5

Some equation translations

$$\delta = P(Y \mid do(X))$$

$$\delta = (Y \mid X = 1) - (Y \mid X = 0)$$

$$\delta = Y_1 - Y_0$$

E = expected value, or average

P = probability distribution

$$\delta = E(Y \mid do(X)) - E(Y \mid !do(X))$$
slide-6
SLIDE 6

Fundamental problem of causal inference

$$\delta_i = Y_i^1 - Y_i^0$$

Individual-level effects are impossible to observe! No individual counterfactuals!

slide-7
SLIDE 7

Average treatment effect (ATE)

Solution: Use averages instead

$$\text{ATE} = E(Y_1 - Y_0) = E(Y_1) - E(Y_0)$$

Difference between average/expected value when program is on vs. expected value when program is off

$$\delta = (\bar{Y} \mid P = 1) - (\bar{Y} \mid P = 0)$$
slide-8
SLIDE 8

Person  Sex  Treated?  Outcome with program  Outcome without program  Effect
1       M    TRUE      80                    60                       20
2       M    TRUE      75                    70                       5
3       M    TRUE      85                    80                       5
4       M    FALSE     70                    60                       10
5       F    TRUE      75                    70                       5
6       F    FALSE     80                    80                       0
7       F    FALSE     90                    100                      −10
8       F    FALSE     85                    80                       5

slide-9
SLIDE 9

Person  Sex  Treated?  Outcome with program  Outcome without program  Effect
1       M    TRUE      80                    60                       20
2       M    TRUE      75                    70                       5
3       M    TRUE      85                    80                       5
4       M    FALSE     70                    60                       10
5       F    TRUE      75                    70                       5
6       F    FALSE     80                    80                       0
7       F    FALSE     90                    100                      −10
8       F    FALSE     85                    80                       5

$$\delta = (\bar{Y} \mid P = 1) - (\bar{Y} \mid P = 0)$$

ATE = 5
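The ATE arithmetic can be checked in a few lines of R. This is a minimal sketch that retypes the table by hand; the data frame and column names (`outcomes`, `y_with`, `y_without`, `effect`) are illustrative and not from the slides, and the effect for person 6 is taken as 80 − 80 = 0.

```r
# Potential outcomes for the eight people in the table (typed in by hand)
outcomes <- data.frame(
  person    = 1:8,
  sex       = c("M", "M", "M", "M", "F", "F", "F", "F"),
  treated   = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  y_with    = c(80, 75, 85, 70, 75, 80, 90, 85),
  y_without = c(60, 70, 80, 60, 70, 80, 100, 80)
)

# Individual effect: outcome with the program minus outcome without it
outcomes$effect <- outcomes$y_with - outcomes$y_without

# ATE as the average of individual effects
ate <- mean(outcomes$effect)

# Equivalent form: difference between the two average outcomes
ate_from_means <- mean(outcomes$y_with) - mean(outcomes$y_without)

print(c(ate = ate, ate_from_means = ate_from_means))
```

Both versions give 5, which is E(Y1 − Y0) = E(Y1) − E(Y0) applied to these eight rows. This only works because the table shows both potential outcomes for every person, which real data never does.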

slide-10
SLIDE 10

Conditional ATE (CATE)

ATE in subgroups

Is the program more effective for specific sexes?

slide-11
SLIDE 11

$$\delta = (\bar{Y}_{\text{Male}} \mid P = 1) - (\bar{Y}_{\text{Male}} \mid P = 0)$$

CATE_Male = 10

$$\delta = (\bar{Y}_{\text{Female}} \mid P = 1) - (\bar{Y}_{\text{Female}} \mid P = 0)$$

CATE_Female = 0

Person  Sex  Treated?  Outcome with program  Outcome without program  Effect
1       M    TRUE      80                    60                       20
2       M    TRUE      75                    70                       5
3       M    TRUE      85                    80                       5
4       M    FALSE     70                    60                       10
5       F    TRUE      75                    70                       5
6       F    FALSE     80                    80                       0
7       F    FALSE     90                    100                      −10
8       F    FALSE     85                    80                       5
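The subgroup averages can be sketched the same way. The code below assumes the same hand-typed potential outcomes (names such as `outcomes` and `cate_by_sex` are illustrative) and averages the individual effects within each sex.

```r
# Potential outcomes from the table, typed in by hand
outcomes <- data.frame(
  sex       = c("M", "M", "M", "M", "F", "F", "F", "F"),
  y_with    = c(80, 75, 85, 70, 75, 80, 90, 85),
  y_without = c(60, 70, 80, 60, 70, 80, 100, 80)
)
outcomes$effect <- outcomes$y_with - outcomes$y_without

# CATE: average individual effect within each subgroup
cate_by_sex <- tapply(outcomes$effect, outcomes$sex, mean)

print(cate_by_sex)
```

With these numbers the male CATE is 10 and the female CATE is 0, so in this toy example the program effect is concentrated among men.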

slide-12
SLIDE 12

ATT & ATU

Average treatment on the treated

ATT / TOT: effect for those with treatment

Average treatment on the untreated

ATU / TUT: effect for those without treatment

slide-13
SLIDE 13

$$\delta = (\bar{Y}_{\text{Treated}} \mid P = 1) - (\bar{Y}_{\text{Treated}} \mid P = 0)$$

ATT = 8.75

$$\delta = (\bar{Y}_{\text{Untreated}} \mid P = 1) - (\bar{Y}_{\text{Untreated}} \mid P = 0)$$

ATU = 1.25

Person  Sex  Treated?  Outcome with program  Outcome without program  Effect
1       M    TRUE      80                    60                       20
2       M    TRUE      75                    70                       5
3       M    TRUE      85                    80                       5
4       M    FALSE     70                    60                       10
5       F    TRUE      75                    70                       5
6       F    FALSE     80                    80                       0
7       F    FALSE     90                    100                      −10
8       F    FALSE     85                    80                       5

slide-14
SLIDE 14

ATE, ATT, & ATU

The ATE is the weighted average of ATT and ATU

(8.75 × 4/8) + (1.25 × 4/8) = 4.375 + 0.625 = 5
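A short R sketch of the same decomposition, assuming the hand-typed potential outcomes used in the earlier tables (all object names are illustrative): average the effects separately for treated and untreated people, then recombine them using each group's share of the sample.

```r
# Potential outcomes and treatment status, typed in by hand
outcomes <- data.frame(
  treated   = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  y_with    = c(80, 75, 85, 70, 75, 80, 90, 85),
  y_without = c(60, 70, 80, 60, 70, 80, 100, 80)
)
outcomes$effect <- outcomes$y_with - outcomes$y_without

# ATT: average effect among the treated; ATU: among the untreated
att <- mean(outcomes$effect[outcomes$treated])
atu <- mean(outcomes$effect[!outcomes$treated])

# Weights are the shares of treated and untreated people (4/8 each here)
share_treated <- mean(outcomes$treated)
ate_weighted <- att * share_treated + atu * (1 - share_treated)

print(c(att = att, atu = atu, ate = ate_weighted))
```

The weights happen to be equal here because four of the eight people are treated. With unequal group sizes the same formula still applies, but the ATE sits closer to the larger group's effect.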

slide-15
SLIDE 15

Selection bias

ATE and ATT aren’t always the same

ATE = ATT + Selection bias

5 = 8.75 + x
x = −3.75

Randomization fixes this, makes x = 0

slide-16
SLIDE 16

Treatment not randomly assigned

Actual data

Person  Sex  Treated?  Actual outcome
1       M    TRUE      80
2       M    TRUE      75
3       M    TRUE      85
4       M    FALSE     60
5       F    TRUE      75
6       F    FALSE     80
7       F    FALSE     100
8       F    FALSE     80

We can’t see unit-level causal effects
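One way to see why unit-level effects are hidden: each person reveals only the potential outcome that matches their treatment status. The sketch below assumes the potential outcomes from the earlier slides (object names are illustrative) and masks the outcome that would never be recorded.

```r
# Both potential outcomes, which only exist in the toy example
treated   <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
y_with    <- c(80, 75, 85, 70, 75, 80, 90, 85)
y_without <- c(60, 70, 80, 60, 70, 80, 100, 80)

# The actual outcome is whichever potential outcome was realized
actual <- ifelse(treated, y_with, y_without)

# What a researcher sees: the other potential outcome is missing
seen_with    <- ifelse(treated, y_with, NA)
seen_without <- ifelse(treated, NA, y_without)

# Every individual effect involves one missing value, so all are NA
seen_effect <- seen_with - seen_without

print(data.frame(treated, actual, seen_with, seen_without, seen_effect))
```

The `actual` column reproduces the actual-outcome table, and `seen_effect` is `NA` for all eight people. That is the fundamental problem of causal inference in data form.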

slide-17
SLIDE 17

Person  Sex  Treated?  Actual outcome
1       M    TRUE      80
2       M    TRUE      75
3       M    TRUE      85
4       M    FALSE     60
5       F    TRUE      75
6       F    FALSE     80
7       F    FALSE     100
8       F    FALSE     80

Actual data

Treatment seems to be correlated with sex

slide-18
SLIDE 18

Actual data

Person  Sex  Treated?  Actual outcome
1       M    TRUE      80
2       M    TRUE      75
3       M    TRUE      85
4       M    FALSE     60
5       F    TRUE      75
6       F    FALSE     80
7       F    FALSE     100
8       F    FALSE     80

We can estimate ATE by finding the weighted average of sex-based CATEs

$$\widehat{\text{ATE}} = \pi_{\text{Male}} \widehat{\text{CATE}}_{\text{Male}} + \pi_{\text{Female}} \widehat{\text{CATE}}_{\text{Female}}$$

As long as we assume/pretend treatment was randomly assigned within each sex = unconfoundedness

slide-19
SLIDE 19

Actual data

Person  Sex  Treated?  Actual outcome
1       M    TRUE      80
2       M    TRUE      75
3       M    TRUE      85
4       M    FALSE     60
5       F    TRUE      75
6       F    FALSE     80
7       F    FALSE     100
8       F    FALSE     80

$$\widehat{\text{ATE}} = \pi_{\text{Male}} \widehat{\text{CATE}}_{\text{Male}} + \pi_{\text{Female}} \widehat{\text{CATE}}_{\text{Female}}$$

CATE_Male = 20
CATE_Female = −11.67
ATE ≈ 4.17
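The stratified estimate can be sketched in R from the actual-outcome table alone. The helper `cate_for()` and the other object names are hypothetical, introduced only for this illustration; the weights π are each sex's share of the sample.

```r
# Actual (observed) data, typed in by hand
actual <- data.frame(
  sex     = c("M", "M", "M", "M", "F", "F", "F", "F"),
  treated = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  outcome = c(80, 75, 85, 60, 75, 80, 100, 80)
)

# Hypothetical helper: treated mean minus untreated mean within one sex
cate_for <- function(group) {
  rows <- actual[actual$sex == group, ]
  mean(rows$outcome[rows$treated]) - mean(rows$outcome[!rows$treated])
}

cate_male   <- cate_for("M")
cate_female <- cate_for("F")

# Weights: share of the sample in each sex
pi_male   <- mean(actual$sex == "M")
pi_female <- mean(actual$sex == "F")

ate_hat <- pi_male * cate_male + pi_female * cate_female

print(round(c(cate_male, cate_female, ate_hat), 2))
```

The male CATE is 80 − 60 = 20, the female CATE is 75 − 86.67 ≈ −11.67, and the weighted estimate is about 4.17. The estimate is only credible if treatment is as good as random within each sex.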

slide-20
SLIDE 20

Person  Sex  Treated?  Actual outcome
1       M    TRUE      80
2       M    TRUE      75
3       M    TRUE      85
4       M    FALSE     60
5       F    TRUE      75
6       F    FALSE     80
7       F    FALSE     100
8       F    FALSE     80

$$\widehat{\text{ATE}} = \widehat{\text{CATE}}_{\text{Treated}} - \widehat{\text{CATE}}_{\text{Untreated}}$$

CATE_Treated = 78.75
CATE_Untreated = 80
ATE = −1.25

Only do this if treatment is random!
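For comparison, here is the naive difference in means as a small R sketch (object names are illustrative). It ignores sex entirely and just compares the average actual outcome of treated and untreated people.

```r
# Actual (observed) data, typed in by hand
treated <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
outcome <- c(80, 75, 85, 60, 75, 80, 100, 80)

mean_treated   <- mean(outcome[treated])
mean_untreated <- mean(outcome[!treated])

# Naive estimate: only valid when treatment is randomly assigned
naive_difference <- mean_treated - mean_untreated

print(c(mean_treated, mean_untreated, naive_difference))
```

The naive estimate is 78.75 − 80 = −1.25, which has the opposite sign from the true ATE of 5 in the potential outcomes table. Treatment is correlated with sex in these data, so the two groups are not comparable.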

slide-21
SLIDE 21

Matching and ATEs

We chose sex here because it correlates with (and confounds) the outcome

$$\widehat{\text{ATE}} = \pi_{\text{Male}} \widehat{\text{CATE}}_{\text{Male}} + \pi_{\text{Female}} \widehat{\text{CATE}}_{\text{Female}}$$

And we assumed unconfoundedness: that treatment is randomly assigned within the groups

slide-22
SLIDE 22

Does attending a private university cause an increase in earnings?

slide-23
SLIDE 23

Average private − Average public

Private: (110,000 + 100,000 + 60,000 + 115,000 + 75,000) / 5 = $92,000
Public: (110,000 + 30,000 + 90,000 + 60,000) / 4 = $72,500

$$\widehat{\text{ATE}} = \pi_{\text{Private}} \widehat{\text{CATE}}_{\text{Private}} - \pi_{\text{Public}} \widehat{\text{CATE}}_{\text{Public}}$$

($92,000 × 5/9) − ($72,500 × 4/9) = $18,888.89

This is wrong!
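The two group averages are easy to verify in R. This sketch uses only the nine earnings figures listed on the slide (vector names are illustrative) and computes the simple private-minus-public gap.

```r
# Earnings listed on the slide for private and public university students
private_earnings <- c(110000, 100000, 60000, 115000, 75000)
public_earnings  <- c(110000, 30000, 90000, 60000)

mean_private <- mean(private_earnings)
mean_public  <- mean(public_earnings)

# Raw gap between the two groups, with no adjustment for who attends where
raw_gap <- mean_private - mean_public

print(c(mean_private, mean_public, raw_gap))
```

The averages are $92,000 and $72,500, a raw gap of $19,500. Any comparison built on these two unadjusted averages treats students as interchangeable, which is the problem the slide is flagging.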
slide-24
SLIDE 24

Grouping and matching

These groups look like they have similar characteristics

(Unconfoundedness?)

slide-25
SLIDE 25

−$5,000    $30,000    ???    ???

$$\widehat{\text{ATE}} = \pi_{\text{Group A}} \widehat{\text{CATE}}_{\text{Group A}} + \pi_{\text{Group B}} \widehat{\text{CATE}}_{\text{Group B}}$$

(−$5,000 × 3/5) + ($30,000 × 2/5) = $9,000

This is less wrong!
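The grouped estimate can be sketched in R. The slide does not list which student belongs to which group, so the assignment below is an assumption chosen to reproduce the slide's −$5,000 and $30,000 gaps and its 3/5 and 2/5 weights. The helper `gap_for()` and all object names are hypothetical.

```r
# ASSUMED group membership, picked to match the gaps and weights on the slide
matched <- data.frame(
  group    = c("A", "A", "A", "B", "B"),
  private  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  earnings = c(110000, 100000, 110000, 60000, 30000)
)

# Hypothetical helper: private mean minus public mean within one group
gap_for <- function(g) {
  rows <- matched[matched$group == g, ]
  mean(rows$earnings[rows$private]) - mean(rows$earnings[!rows$private])
}

cate_a <- gap_for("A")
cate_b <- gap_for("B")

# Weights: each group's share of the five matched students
pi_a <- mean(matched$group == "A")
pi_b <- mean(matched$group == "B")

ate_hat <- pi_a * cate_a + pi_b * cate_b

print(c(cate_a = cate_a, cate_b = cate_b, ate_hat = ate_hat))
```

Under that assumed assignment the within-group gaps are −$5,000 and $30,000 and the weighted estimate is $9,000. The two groups marked ??? drop out because they contain no private/public contrast to compare.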
slide-26
SLIDE 26

Matching with regression

$$\text{earnings} = \alpha + \beta_1 \text{Private} + \beta_2 \text{Group A} + \epsilon$$

model_earnings <- lm(Earnings ~ Private + `Group A`, data = schools)

term       estimate  std_error  statistic  p_value
Intercept  40000     11952.29   3.3467     0.08
Private    10000     13093.07   0.7638     0.52
Group A    60000     13093.07   4.5826     0.04

$\beta_1$ = $10,000

This is less wrong!

Significance details!
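A self-contained version of this regression might look like the following. The contents of `schools` are not shown on the slide, so the five rows below are an assumption (the same assumed group assignment that reproduces the −$5,000 and $30,000 gaps). The column name `Group A` contains a space and therefore needs backticks in the formula.

```r
# ASSUMED contents of `schools`: five matched students, 1 = yes, 0 = no
schools <- data.frame(
  Earnings  = c(110000, 100000, 110000, 60000, 30000),
  Private   = c(1, 1, 0, 1, 0),
  `Group A` = c(1, 1, 1, 0, 0),
  check.names = FALSE  # keep the space in the column name
)

# Backticks let the formula refer to a column name that contains a space
model_earnings <- lm(Earnings ~ Private + `Group A`, data = schools)

# Estimates, standard errors, t statistics, and p-values
print(coef(summary(model_earnings)))
```

With these rows the coefficients are 40,000 (intercept), 10,000 (Private), and 60,000 (Group A). With only five observations the Private coefficient is far from statistically significant, so the point estimate should be read with caution.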

slide-27
SLIDE 27

The Four Horsemen of Validity
slide-28
SLIDE 28

Threats to validity

Internal validity, External validity, Construct validity, Statistical conclusion validity

slide-29
SLIDE 29

Internal validity

Omitted variable bias, Trends, Study calibration, Contamination

Selection, Attrition, Maturation, Secular trends, Testing, Regression, Measurement error, Time frame of study, Seasonality, Hawthorne, John Henry, Spillovers, Intervening events

slide-30
SLIDE 30

Selection

If people can choose to enroll in a program, those who enroll will be different from those who do not

How to fix: Randomization into treatment and control groups

slide-31
SLIDE 31

Selection

If people can choose when to enroll in a program, time might influence the result

How to fix: Shift time around

slide-32
SLIDE 32
slide-33
SLIDE 33

Married young Married later Never married

slide-34
SLIDE 34

Is this gap the happiness bump?

slide-35
SLIDE 35
slide-36
SLIDE 36

https://vimeo.com/83228781

slide-37
SLIDE 37

Attrition

If the people who leave a program or study are different from those who stay, the effects will be biased

How to fix: Check characteristics of those who stay and those who leave

slide-38
SLIDE 38

Fake microfinance program results

ID  Increase in income  Remained in program
1   $3.00               Yes
2   $3.50               Yes
3   $2.00               Yes
4   $1.50               No
5   $1.00               No

ATE with attriters = $2.20
ATE without attriters = $2.83
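The two averages can be reproduced with a short R sketch; the vector names are illustrative and the numbers are the fake program results from the table.

```r
# Fake microfinance results, typed in by hand
income_increase <- c(3.00, 3.50, 2.00, 1.50, 1.00)
remained        <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

# Average over everyone who started the program
ate_with_attriters <- mean(income_increase)

# Average over only the people who stayed
ate_without_attriters <- mean(income_increase[remained])

print(round(c(ate_with_attriters, ate_without_attriters), 2))
```

Dropping the two people who left raises the estimate from $2.20 to about $2.83. The people who left had the smallest gains, so analyzing only the stayers overstates the effect.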

slide-39
SLIDE 39

Maturation

Growth is expected naturally, like checking if a program helps child cognitive ability (Sesame Street)

How to fix: Use a comparison group to remove the trend

slide-40
SLIDE 40
slide-41
SLIDE 41

Secular trends

Trends in data are happening because of larger global processes

Examples: Recessions, Cultural shifts, Marriage equality

How to fix: Use a comparison group to remove the trend

slide-42
SLIDE 42

Seasonal trends

Trends in data are happening because of regular time-based trends

How to fix: Compare observations from same time period or use yearly/monthly averages

slide-43
SLIDE 43

[Figure: Charitable giving by month, 2017. Months January through December on one axis; percentages from 0% to 20% on the other.]

slide-44
SLIDE 44

Testing

Repeated exposure to questions or tasks will make people improve

How to fix: Change tests, don’t offer pre-tests maybe, use a control group that receives the test

slide-45
SLIDE 45

Regression to the mean

People in the extreme have a tendency to become less extreme over time

Examples: Luck, Crime and terrorism, Hot hand effect

How to fix: Don’t select super high or super low performers
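Regression to the mean is easy to see in a simulation. The sketch below is entirely hypothetical: the sample size, means, standard deviations, and seed are arbitrary assumptions. Each score is stable skill plus independent luck, and people are selected for being extreme on the first score only.

```r
set.seed(8521)  # arbitrary seed so the simulation is repeatable

n <- 10000
ability <- rnorm(n, mean = 70, sd = 10)  # stable underlying skill
score_1 <- ability + rnorm(n, sd = 10)   # first measurement: skill + luck
score_2 <- ability + rnorm(n, sd = 10)   # second measurement: fresh luck

# Select the top 10% of performers on the first measurement only
top <- score_1 >= quantile(score_1, 0.9)

mean_top_first  <- mean(score_1[top])
mean_top_second <- mean(score_2[top])
mean_everyone   <- mean(score_2)

print(c(mean_top_first, mean_top_second, mean_everyone))
```

The selected group still scores above average the second time, but noticeably lower than on the first measurement, even though nothing was done to them. A program that enrolls only extreme performers would pick up this drift as an apparent effect.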

slide-46
SLIDE 46

Measurement error

Measuring the outcome incorrectly will distort the estimated effect

How to fix: Measure the outcome well

slide-47
SLIDE 47

Time frame

If the study is too short, the effect might not be detectable yet; if the study is too long, attrition becomes a problem

How to fix: Use prior knowledge about the thing you’re studying to choose the right length

slide-48
SLIDE 48

Hawthorne effect

Observing people makes them behave differently

How to fix: Hide? Use completely unobserved control groups

slide-49
SLIDE 49

John Henry effect

Control group works hard to prove they’re as good as the treatment group

How to fix: Keep two groups separate

slide-50
SLIDE 50

Spillover effect

Control groups naturally pick up what the treatment group is getting

Externalities, Social interaction, Equilibrium effects

How to fix: Keep two groups separate, use distant control groups

slide-51
SLIDE 51
slide-52
SLIDE 52

Intervening events

Something happens that affects one of the groups and not the other

How to fix: ¯\_(ツ)_/¯

slide-53
SLIDE 53

Internal validity

Omitted variable bias, Trends, Study calibration, Contamination

Selection, Attrition, Maturation, Secular trends, Testing, Regression, Measurement error, Time frame of study, Seasonality, Hawthorne, John Henry, Spillovers, Intervening events

slide-54
SLIDE 54

Fixing internal validity

Randomization fixes a host of big issues: Selection, Maturation, Regression to the mean

Randomization doesn’t fix everything! Attrition, Contamination, Measurement

slide-55
SLIDE 55

Findings are generalizable to the entire universe or population

External validity

slide-56
SLIDE 56

External validity

Laboratory conditions vs. real world

Study volunteers are WEIRD (Western, educated, and from industrialized, rich, and democratic countries)

Not everyone takes surveys: Amazon Mechanical Turk, Online surveys, Random digit dialing

slide-57
SLIDE 57

External validity

Different circumstances in general

Does a study in one state apply to other states?
Does a mosquito net trial in Eritrea transfer to Bolivia?

slide-58
SLIDE 58

The Streetlight Effect

Construct validity

slide-59
SLIDE 59

Construct validity

You’re measuring the thing you want to measure

Test scores measure how good kids are at taking tests
Do test scores work for school evaluation?

This is why we spent so much time on outcome measurement construction

slide-60
SLIDE 60

Statistical conclusion validity

Are your stats correct?

Statistical power

Violated assumptions of statistical tests

Fishing, p-hacking, and the error rate problem: if the significance threshold is p = 0.05 and you measure 20 outcomes, about 1 of those will likely show a correlation by chance alone
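The error rate arithmetic can be sketched in R. The calculation assumes the 20 tests are independent and that no outcome has a real effect, which are simplifying assumptions not stated on the slide; the variable names are illustrative.

```r
alpha      <- 0.05  # significance threshold
n_outcomes <- 20    # number of outcomes tested

# Expected number of false positives when every null hypothesis is true
expected_false_positives <- alpha * n_outcomes

# Chance that at least one test is "significant" purely by luck,
# assuming the tests are independent
p_at_least_one <- 1 - (1 - alpha)^n_outcomes

print(c(expected_false_positives, round(p_at_least_one, 3)))
```

Under those assumptions you expect 1 false positive on average, and there is roughly a 64% chance of seeing at least one. Reporting only the outcome that happened to cross the threshold is what fishing and p-hacking amount to.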

slide-61
SLIDE 61

Threats to validity

Internal validity, External validity, Construct validity, Statistical conclusion validity

Omitted variable bias, Trends, Study calibration, Contamination

slide-62
SLIDE 62

Internal validity

Omitted variable bias, Trends, Study calibration, Contamination

Selection, Attrition, Maturation, Secular trends, Testing, Regression, Measurement error, Time frame of study, Seasonality, Hawthorne, John Henry, Spillovers, Intervening events