Gov 2002 - Causal Inference II: Instrumental Variables

Matthew Blackwell, Arthur Spirling
October 2nd, 2014


Instrumental Variables

◮ Last week we talked about how to make progress when you have randomization or selection on the observables.
◮ But what if you have neither of those two for your treatment variable? Are you doomed?
◮ Maybe.
◮ But if you can identify some exogenous source of variation that drives the treatment, even if the treatment was not randomly assigned, you may be able to make headway.
◮ The basic idea behind instrumental variables is that we have a treatment with unmeasured confounding, but we have another variable, called the instrument, that affects the treatment but not the outcome, and thus gives us that exogenous variation.


Basic IV setup with DAGs

[DAG: Z → A → Y, with unmeasured U → A and U → Y; the absent Z → Y arrow is the exclusion restriction]

◮ Z is the instrument, A is the treatment, and U is the unmeasured confounder.
◮ Exclusion restriction:
  ◮ no common causes of the instrument and the outcome;
  ◮ no direct or indirect effect of the instrument on the outcome except through the treatment.
◮ First-stage relationship: Z affects A.


An IV is only as good as its assumptions

[DAG as above: Z → A → Y with unmeasured confounder U; no Z → Y arrow (exclusion restriction)]

◮ Finding a believable instrument is incredibly difficult, and some people never believe any IV setups.
◮ We will see that even if all of the untestable assumptions are met, the IV approach estimates a "local" ATE, that is, local to this particular case/instrument.


IVs in the field

◮ Angrist (1990): draft lottery as an IV for military service (income as outcome)
◮ Acemoglu et al. (2001): settler mortality as an IV for institutional quality (GDP per capita as outcome)
◮ Levitt (1997): being an election year as an IV for police force size (crime as outcome)
◮ Kern & Hainmueller (2009): having West German TV reception in East Berlin as an instrument for West German TV watching (outcome is support for the East German regime)
◮ Nunn & Wantchekon (2011): historical distance of an ethnic group to the coast as an instrument for the slave raiding of that ethnic group (outcome is trust attitudes today)
◮ Acharya, Blackwell, Sen (2014): cotton suitability as an IV for proportion slave in 1860 (outcome is white attitudes today)


IV with constant effects

◮ Let's write down a causal model for Yi with constant effects and an unmeasured confounder, Ui:

  Yi(a, u) = α + τa + γu + ηi

◮ If we combine this with a consistency assumption, we get this regression form:

  Yi = α + τAi + γUi + ηi

◮ Here we assume that E[Aiηi] = 0, so if we measured Ui, then we would be able to estimate τ.
◮ But cov(γUi + ηi, Ai) ≠ 0 because U is a common cause of A and Y.


The role of the instrument

◮ If we have an instrument, Zi, that satisfies the exclusion restriction, then cov(γUi + ηi, Zi) = 0.
◮ It must be independent of Ui, and it has no correlation with ηi because neither does the treatment.

  cov(Yi, Zi) = cov(α + τAi + γUi + ηi, Zi)
              = cov(α, Zi) + cov(τAi, Zi) + cov(γUi + ηi, Zi)
              = 0 + τ·cov(Ai, Zi) + 0


IV estimator with constant effects

Yi = α + τAi + γUi + ηi

◮ With this in hand, we can formulate an expression for the average treatment effect here:

  τ = cov(Yi, Zi) / cov(Ai, Zi) = [cov(Yi, Zi)/V[Zi]] / [cov(Ai, Zi)/V[Zi]]

◮ Reduced form coefficient: cov(Yi, Zi)/V[Zi]
◮ First stage coefficient: cov(Ai, Zi)/V[Zi]
◮ What happens with a weak first stage?
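As a sanity check on this identification result, here is a small simulated sketch (the data-generating process, coefficients, and seed are all hypothetical illustrations, not an example from the slides) showing that the OLS slope of Yi on Ai is biased by the unmeasured confounder, while the ratio cov(Yi, Zi)/cov(Ai, Zi) recovers τ:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical constant-effects model: Yi = alpha + tau*Ai + gamma*Ui + eta_i,
# with U confounding both the treatment and the outcome. tau = 2 is the truth.
tau, gamma = 2.0, 3.0
U = rng.normal(size=n)                        # unmeasured confounder
Z = rng.binomial(1, 0.5, size=n)              # instrument, independent of U
A = 0.5 * Z + 0.8 * U + rng.normal(size=n)    # first stage: Z shifts A
Y = 1.0 + tau * A + gamma * U + rng.normal(size=n)

# Naive OLS slope of Y on A picks up the confounding through U ...
ols = np.cov(Y, A)[0, 1] / np.var(A, ddof=1)
# ... while the IV ratio cov(Y, Z)/cov(A, Z) recovers tau.
iv = np.cov(Y, Z)[0, 1] / np.cov(A, Z)[0, 1]
```

In this design the OLS slope is pulled well above 2 by γ·cov(U, A), whereas the IV ratio stays near the true τ.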


Wald Estimator

◮ With a binary instrument, there is a simple estimator based on this formulation called the Wald estimator. It is easy to show that:

  τ = cov(Yi, Zi) / cov(Ai, Zi) = (E[Yi | Zi = 1] − E[Yi | Zi = 0]) / (E[Ai | Zi = 1] − E[Ai | Zi = 0])

◮ Intuitively, the effect of Zi on Yi divided by the effect of Zi on Ai.
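The Wald estimator is just two differences in means. A minimal sketch on simulated data (the DGP, coefficients, and seed are hypothetical, chosen so the true constant effect is 2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical DGP: binary instrument Z, confounded binary treatment A,
# constant treatment effect tau = 2 in the outcome equation.
U = rng.normal(size=n)                       # unmeasured confounder
Z = rng.binomial(1, 0.5, size=n)             # randomized binary instrument
A = (0.8 * Z + U + rng.normal(size=n) > 0).astype(float)
Y = 1.0 + 2.0 * A + 1.5 * U + rng.normal(size=n)

# Wald estimator: effect of Z on Y divided by effect of Z on A.
itt_y = Y[Z == 1].mean() - Y[Z == 0].mean()
itt_a = A[Z == 1].mean() - A[Z == 0].mean()
wald = itt_y / itt_a
```

The numerator is the intent-to-treat effect of the instrument; dividing by the first-stage difference rescales it to a per-unit-of-treatment effect.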


What about covariates?

◮ No covariates up until now. What if we have a set of covariates Xi that we are also conditioning on?
◮ Let's start with linear models for both the outcome and the treatment:

  Yi = Xi′β + τAi + εi
  Ai = Xi′α + γZi + νi

◮ Now, we assume that Xi is exogenous along with Zi:

  E[Ziνi] = 0   E[Ziεi] = 0   E[Xiνi] = 0   E[Xiεi] = 0

◮ . . . but Ai is endogenous: E[Aiεi] ≠ 0


Getting the reduced form

◮ We can plug the treatment equation into the outcome equation:

  Yi = Xi′β + τ[Xi′α + γZi + νi] + εi
     = Xi′β + τ[Xi′α + γZi] + [τνi + εi]
     = Xi′β + τ[Xi′α + γZi] + εi*

◮ The bracketed term [Xi′α + γZi] is the population fitted value of the treatment, E[Ai | Xi, Zi].
◮ Because Zi and Xi are uncorrelated with νi and εi, this fitted value is also independent of εi*.
◮ Thus, the population regression coefficient of Yi on [Xi′α + γZi] is the average treatment effect, τ.


Two-stage least squares

◮ In practice, we estimate the first stage from a sample and calculate OLS fitted values: Âi = Xi′α̂ + γ̂Zi.
◮ Here, α̂ and γ̂ are estimates from OLS. Then, we estimate a regression of Yi on Xi and Âi. We plug this into our equation for Yi and note that the error for Ai is now a residual:

  Yi = Xi′β + τÂi + [εi + τ(Ai − Âi)]

◮ Key question: is Âi uncorrelated with the error?
◮ Âi is just a function of Xi and Zi, so it is uncorrelated with εi.
◮ We also know that Âi is uncorrelated with the OLS residual (Ai − Âi).


Two-stage least squares

◮ Heuristic procedure:
  1. Run a regression of the treatment on the covariates and the instrument.
  2. Construct fitted values of the treatment.
  3. Run a regression of the outcome on the covariates and the fitted values.
◮ Note that this isn't how we actually estimate 2SLS, because the standard errors are all wrong.
◮ The computer wants to calculate the standard errors based on εi*, but what we really want is the standard errors based on εi.
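The three-step heuristic can be sketched in a few lines of numpy (all names, coefficients, and the data-generating process below are hypothetical illustrations, not the slides' example):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Hypothetical DGP: one covariate, one instrument, confounded treatment,
# true treatment effect tau = 2.
U = rng.normal(size=n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + covariate
Z = rng.normal(size=n)
A = X @ np.array([0.2, 0.5]) + 1.0 * Z + U + rng.normal(size=n)
Y = X @ np.array([1.0, -0.3]) + 2.0 * A + 2.0 * U + rng.normal(size=n)

# 1. Regress the treatment on the covariates and the instrument.
W = np.column_stack([X, Z])
first_stage = np.linalg.lstsq(W, A, rcond=None)[0]
# 2. Construct fitted values of the treatment.
A_hat = W @ first_stage
# 3. Regress the outcome on the covariates and the fitted values.
second_stage = np.linalg.lstsq(np.column_stack([X, A_hat]), Y, rcond=None)[0]
tau_hat = second_stage[-1]
# As the slides warn, the second regression's default standard errors are
# wrong: they use the residual Yi - Xi'b - tau*A_hat_i rather than epsilon_i.
```

The point estimate from step 3 is the 2SLS estimate; only the naive standard errors from this manual procedure need correcting.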


Nunn & Wantchekon IV example


General 2SLS

◮ To save on notation, we'll roll all the variables in the structural model into one vector, Xi, of size k, some of which may be endogenous.
◮ The structural model, then, is:

  Yi = Xi′β + εi

◮ Zi will be a vector of l exogenous variables that includes any exogenous variables in Xi plus any instruments. Key assumption: E[Ziεi] = 0.


Nasty Matrix Algebra

◮ Useful quantities:

  Π = (E[ZiZi′])⁻¹E[ZiXi′]   (projection matrix)
  Vi = Π′Zi                  (fitted values)

◮ To derive the 2SLS estimator, take the fitted values, Π′Zi, and multiply both sides of the outcome equation by them:

  Yi = Xi′β + εi
  Π′ZiYi = Π′ZiXi′β + Π′Ziεi
  Π′E[ZiYi] = Π′E[ZiXi′]β + Π′E[Ziεi]
  Π′E[ZiYi] = Π′E[ZiXi′]β
  β = (Π′E[ZiXi′])⁻¹Π′E[ZiYi]
  β = (E[XiZi′](E[ZiZi′])⁻¹E[ZiXi′])⁻¹E[XiZi′](E[ZiZi′])⁻¹E[ZiYi]


How to estimate the parameters

◮ Collect the Xi into an n × k matrix X = (X1′, . . . , Xn′)′
◮ Collect the Zi into an n × l matrix Z = (Z1′, . . . , Zn′)′
◮ Matrix party trick: X′Z/n = (1/n) Σᵢ XiZi′ →p E[XiZi′].
◮ Take the population formula for the parameters:

  β = (E[XiZi′](E[ZiZi′])⁻¹E[ZiXi′])⁻¹E[XiZi′](E[ZiZi′])⁻¹E[ZiYi]

◮ And plug in the sample values (the n cancels out):

  β̂ = [(X′Z)(Z′Z)⁻¹(Z′X)]⁻¹(X′Z)(Z′Z)⁻¹(Z′Y)

◮ This is how R/Stata estimates the 2SLS parameters.
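The sample formula translates line for line into numpy. A minimal sketch on a simulated just-identified example (the DGP, coefficients, and seed are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# Hypothetical DGP: X = [intercept, endogenous A], Z = [intercept, instrument].
U = rng.normal(size=n)
z = rng.normal(size=n)
A = 0.7 * z + U + rng.normal(size=n)
Y = 1.0 + 2.0 * A + 1.5 * U + rng.normal(size=n)
X = np.column_stack([np.ones(n), A])
Z = np.column_stack([np.ones(n), z])

# beta_hat = [(X'Z)(Z'Z)^{-1}(Z'X)]^{-1} (X'Z)(Z'Z)^{-1}(Z'Y)
XtZ = X.T @ Z
ZtZ_inv = np.linalg.inv(Z.T @ Z)
beta_hat = np.linalg.solve(XtZ @ ZtZ_inv @ (Z.T @ X), XtZ @ ZtZ_inv @ (Z.T @ Y))
# beta_hat[1] estimates the coefficient on A (truth here: 2.0).
```

In the just-identified case (l = k) this collapses to the simpler formula (Z′X)⁻¹Z′Y; the sandwich form above is what handles l > k.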


Asymptotics for 2SLS

◮ Let V = Z(Z′Z)⁻¹Z′X be the matrix of fitted values for X; then we have β̂ = (V′V)⁻¹V′Y.
◮ We can insert the true model for Y:

  β̂ = (V′V)⁻¹V′(Xβ + ε)

◮ Using the matrix party trick and the fact that V′X = V′V, we have:

  β̂ = (V′V)⁻¹V′Xβ + (V′V)⁻¹V′ε = β + (n⁻¹ Σᵢ ViVi′)⁻¹ (n⁻¹ Σᵢ Viεi)

◮ Consistent because n⁻¹ Σᵢ Viεi →p E[Viεi] = 0.


Asymptotic variance for 2SLS

  √n(β̂ − β) = (n⁻¹ Σᵢ ViVi′)⁻¹ (n^(−1/2) Σᵢ Viεi)

◮ By the CLT, n^(−1/2) Σᵢ Viεi converges in distribution to N(0, B), where B = E[εi² ViVi′].
◮ By the LLN, n⁻¹ Σᵢ ViVi′ →p E[ViVi′].
◮ Thus, √n(β̂ − β) has asymptotic variance:

  (E[ViVi′])⁻¹ E[εi² ViVi′] (E[ViVi′])⁻¹

◮ Replace with the sample quantities to get estimates:

  var̂(β̂) = (V′V)⁻¹ (Σᵢ ûi² ViVi′) (V′V)⁻¹,  where ûi = Yi − Xi′β̂
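The sandwich estimator is short to code. A sketch on a hypothetical just-identified DGP (all coefficients and the seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Hypothetical just-identified DGP with true coefficients (1.0, 2.0).
U = rng.normal(size=n)
z = rng.normal(size=n)
A = 0.7 * z + U + rng.normal(size=n)
Y = 1.0 + 2.0 * A + 1.5 * U + rng.normal(size=n)
X = np.column_stack([np.ones(n), A])
Z = np.column_stack([np.ones(n), z])

# Fitted values V = Z(Z'Z)^{-1}Z'X and 2SLS estimate beta_hat = (V'V)^{-1}V'Y.
V = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
beta_hat = np.linalg.solve(V.T @ V, V.T @ Y)

# Sandwich variance (V'V)^{-1} [sum_i u_i^2 V_i V_i'] (V'V)^{-1}, where the
# residual u_i = Y_i - X_i'beta_hat uses the actual X_i, not the fitted values.
u = Y - X @ beta_hat
meat = (V * (u ** 2)[:, None]).T @ V
bread = np.linalg.inv(V.T @ V)
se = np.sqrt(np.diag(bread @ meat @ bread))
```

Note the residual is computed with X, not V; using the fitted values there is exactly the mistake the naive two-step procedure makes.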


Overidentification

◮ What if we have more instruments than endogenous variables?
◮ When there are more instruments than causal parameters (l > k), the model is overidentified.
◮ When there are as many instruments as causal parameters (l = k), the model is just identified.
◮ With more than one instrument and constant effects, we can test the plausibility of the exclusion restriction(s) using an overidentification test.
◮ Is it plausible to find more than one instrument?


Overidentification tests

◮ Sargan test, Hansen test, J-test, etc.
◮ Basic idea: under the null that all instruments are valid, estimates based on different subsets of the instruments should differ only due to sampling noise.
◮ Identify the distribution of that noise under the null to develop a test.
◮ If we reject the null hypothesis in these overidentification tests, it means that the exclusion restrictions for our instruments are probably incorrect. Note that the test won't tell us which of them are incorrect, just that at least one is.
◮ These overidentification tests depend heavily on the constant effects assumption.
◮ Once we move away from constant effects, we generally can no longer pool multiple instruments together in this way.
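One common version, the Sargan test, is easy to sketch: regress the 2SLS residuals on all the instruments and compare n·R² to a χ²(l − k) distribution. Everything below (DGP, coefficients, seed) is a hypothetical illustration with two valid instruments:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000

# Hypothetical overidentified DGP: two valid instruments, one endogenous A.
U = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
A = 0.6 * z1 + 0.4 * z2 + U + rng.normal(size=n)
Y = 1.0 + 2.0 * A + 1.5 * U + rng.normal(size=n)
X = np.column_stack([np.ones(n), A])           # k = 2
Z = np.column_stack([np.ones(n), z1, z2])      # l = 3, so l - k = 1

# 2SLS estimate and residuals.
V = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
beta_hat = np.linalg.solve(V.T @ V, V.T @ Y)
u = Y - X @ beta_hat

# Sargan statistic: n * R^2 from regressing the residuals on the instruments.
# Under the null that all instruments satisfy the exclusion restriction, it is
# approximately chi2(l - k); the 5% critical value for 1 df is about 3.84.
fitted = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
sargan = n * fitted.var() / u.var()
```

Because both instruments are valid in this simulation, the statistic should usually fall well below the critical value; invalid instruments would inflate it.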

Reading


Instrumental Variables and Potential Outcomes

◮ The basic idea behind instrumental variable approaches is that we do not have ignorability for Ai, but we do have a variable, Zi, that affects Ai but only affects the outcome through Ai.
◮ Note that we allow the instrument, Zi, to have an effect on Ai, so the treatment must have potential outcomes, Ai(1) and Ai(0), with the usual consistency assumption: Ai = ZiAi(1) + (1 − Zi)Ai(0).
◮ The outcome can depend on both the treatment and the instrument: Yi(a, z) is the outcome if unit i had received treatment Ai = a and instrument value Zi = z.
◮ The effect of the treatment given the value of the instrument is Yi(1, Zi) − Yi(0, Zi).

slide-99
SLIDE 99

Key assumptions

  • 1. Randomization
slide-100
SLIDE 100

Key assumptions

  • 1. Randomization
  • 2. Exclusion Restriction
slide-101
SLIDE 101

Key assumptions

  • 1. Randomization
  • 2. Exclusion Restriction
  • 3. First-stage relationship
slide-102
SLIDE 102

Key assumptions

  • 1. Randomization
  • 2. Exclusion Restriction
  • 3. First-stage relationship
  • 4. Monotonicity
slide-103
SLIDE 103

Randomization

◮ Need the instrument to be randomized:

[{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi

slide-104
SLIDE 104

Randomization

◮ Need the instrument to be randomized:

[{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi

◮ We can weaken this to conditional ignorability

slide-105
SLIDE 105

Randomization

◮ Need the instrument to be randomized:

[{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi

◮ We can weaken this to conditional ignorability
◮ But why believe conditional ignorability for the instrument but

not the treatment?

slide-106
SLIDE 106

Randomization

◮ Need the instrument to be randomized:

[{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi

◮ We can weaken this to conditional ignorability
◮ But why believe conditional ignorability for the instrument but

not the treatment?

◮ Best instruments are truly randomized.

slide-107
SLIDE 107

Randomization

◮ Need the instrument to be randomized:

[{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi

◮ We can weaken this to conditional ignorability
◮ But why believe conditional ignorability for the instrument but

not the treatment?

◮ Best instruments are truly randomized.
◮ Identifies the intent-to-treat (ITT) effect:

E[Yi|Zi = 1] − E[Yi|Zi = 0] = E[Yi(Ai(1), 1) − Yi(Ai(0), 0)]

slide-108
SLIDE 108

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

slide-109
SLIDE 109

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

slide-110
SLIDE 110

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

◮ NOT

slide-111
SLIDE 111

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

◮ NOT

slide-112
SLIDE 112

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

◮ NOT A

slide-113
SLIDE 113

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

◮ NOT A TESTABLE

slide-114
SLIDE 114

Exclusion Restriction

◮ The instrument has no direct effect on the outcome, once we

fix the value of the treatment. Yi(a, 1) = Yi(a, 0) for a = 0, 1

◮ Given this exclusion restriction, we know that the potential

outcomes for each treatment status only depend on the

treatment, not the instrument: Yi(1) ≡ Yi(1, 1) = Yi(1, 0) Yi(0) ≡ Yi(0, 1) = Yi(0, 0)

◮ NOT A TESTABLE ASSUMPTION

slide-115
SLIDE 115

The linear model with heterogeneous effects

◮ Rewriting the usual consistency assumption gives us a linear

model with heterogeneous effects (we have seen this before in randomized experiments): Yi = Yi(0) + (Yi(1) − Yi(0))Ai = α0 + τiAi + ηi

slide-116
SLIDE 116

The linear model with heterogeneous effects

◮ Rewriting the usual consistency assumption gives us a linear

model with heterogeneous effects (we have seen this before in randomized experiments): Yi = Yi(0) + (Yi(1) − Yi(0))Ai = α0 + τiAi + ηi

◮ Here, we have α0 = E[Yi(0)] and τi = Yi(1) − Yi(0).
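The rewrite is just consistency plus relabeling; a tiny check with hypothetical potential outcomes (the numbers and the helper are illustrative, not from the slides) confirms that Yi = Yi(0) + (Yi(1) − Yi(0))Ai reproduces the observed outcome in both arms:

```python
# For hypothetical potential outcomes (Y1, Y0), verify that
# Y = Y0 + (Y1 - Y0) * A equals Y1 when A = 1 and Y0 when A = 0.
def observed(Y1, Y0, A):
    return Y0 + (Y1 - Y0) * A

assert observed(Y1=5.0, Y0=2.0, A=1) == 5.0
assert observed(Y1=5.0, Y0=2.0, A=0) == 2.0
print("consistency holds in both arms")
```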

slide-117
SLIDE 117

First Stage

◮ This next assumption is a little mundane, but turns out to be

very important: the instrument must have an effect on the treatment: E[Ai(1) − Ai(0)] ≠ 0

slide-118
SLIDE 118

First Stage

◮ This next assumption is a little mundane, but turns out to be

very important: the instrument must have an effect on the treatment: E[Ai(1) − Ai(0)] ≠ 0

◮ Otherwise, what would we be doing? The instrument wouldn’t

affect anything.

slide-119
SLIDE 119

Monotonicity

◮ Lastly, we need to make another assumption about the

relationship between the instrument and the treatment.

slide-120
SLIDE 120

Monotonicity

◮ Lastly, we need to make another assumption about the

relationship between the instrument and the treatment.

◮ Monotonicity says that the presence of the instrument never

dissuades someone from taking the treatment: Ai(1) − Ai(0) ≥ 0

slide-121
SLIDE 121

Monotonicity

◮ Lastly, we need to make another assumption about the

relationship between the instrument and the treatment.

◮ Monotonicity says that the presence of the instrument never

dissuades someone from taking the treatment: Ai(1) − Ai(0) ≥ 0

◮ Note if this holds in the opposite direction Ai(1) − Ai(0) ≤ 0,

we can always rescale Ai to make the assumption hold.

slide-122
SLIDE 122

Monotonicity means no defiers

◮ This is sometimes called “no defiers”. It turns out that with a

binary treatment and a binary instrument, we can group units into four categories:

Name           Ai(1)   Ai(0)
Always-takers    1       1
Never-takers     0       0
Compliers        1       0
Defiers          0       1

slide-123
SLIDE 123

Monotonicity means no defiers

◮ This is sometimes called “no defiers”. It turns out that with a

binary treatment and a binary instrument, we can group units into four categories:

Name           Ai(1)   Ai(0)
Always-takers    1       1
Never-takers     0       0
Compliers        1       0
Defiers          0       1

◮ These compliance groups are sometimes called “principal

strata.”

slide-124
SLIDE 124

Monotonicity means no defiers

◮ This is sometimes called “no defiers”. It turns out that with a

binary treatment and a binary instrument, we can group units into four categories:

Name           Ai(1)   Ai(0)
Always-takers    1       1
Never-takers     0       0
Compliers        1       0
Defiers          0       1

◮ These compliance groups are sometimes called “principal

strata.”

◮ The monotonicity assumption removes the possibility of there

being defiers in the population.

slide-125
SLIDE 125

Monotonicity means no defiers

◮ This is sometimes called “no defiers”. It turns out that with a

binary treatment and a binary instrument, we can group units into four categories:

Name           Ai(1)   Ai(0)
Always-takers    1       1
Never-takers     0       0
Compliers        1       0
Defiers          0       1

◮ These compliance groups are sometimes called “principal

strata.”

◮ The monotonicity assumption removes the possibility of there

being defiers in the population.

◮ Anyone with Ai = 1 when Zi = 0 must be an always-taker and

anyone with Ai = 0 when Zi = 1 must be a never-taker.
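The classification logic above can be sketched as a small helper (a hypothetical function, not from the lecture materials):

```python
# Classify a unit's principal stratum from its potential treatment values.
# A1 = A_i(1): treatment taken if encouraged; A0 = A_i(0): if not encouraged.
def stratum(A1, A0):
    if A1 == 1 and A0 == 1:
        return "always-taker"
    if A1 == 0 and A0 == 0:
        return "never-taker"
    if A1 == 1 and A0 == 0:
        return "complier"
    return "defier"  # A1 = 0, A0 = 1; ruled out by monotonicity

# Under monotonicity the observed (Zi, Ai) cell only partially reveals the
# stratum: Ai = 1 with Zi = 0 implies always-taker, Ai = 0 with Zi = 1
# implies never-taker, and the remaining two cells are mixtures.
print(stratum(1, 0))  # complier
print(stratum(0, 1))  # defier
```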

slide-126
SLIDE 126

Local Average Treatment Effect (LATE)

◮ Under these four assumptions, the Wald estimator is equal

to what we call the local average treatment effect (LATE) or the complier average treatment effect (CATE).

slide-127
SLIDE 127

Local Average Treatment Effect (LATE)

◮ Under these four assumptions, the Wald estimator is equal

to what we call the local average treatment effect (LATE) or the complier average treatment effect (CATE).

◮ This is the ATE among the compliers: those who take the

treatment when encouraged to do so.

slide-128
SLIDE 128

Local Average Treatment Effect (LATE)

◮ Under these four assumptions, the Wald estimator is equal

to what we call the local average treatment effect (LATE) or the complier average treatment effect (CATE).

◮ This is the ATE among the compliers: those who take the

treatment when encouraged to do so.

◮ That is, the LATE theorem states that:

(E[Yi|Zi = 1] − E[Yi|Zi = 0]) / (E[Ai|Zi = 1] − E[Ai|Zi = 0]) = E[Yi(1) − Yi(0)|Ai(1) > Ai(0)]

slide-129
SLIDE 129

Local Average Treatment Effect (LATE)

◮ Under these four assumptions, the Wald estimator is equal

to what we call the local average treatment effect (LATE) or the complier average treatment effect (CATE).

◮ This is the ATE among the compliers: those who take the

treatment when encouraged to do so.

◮ That is, the LATE theorem states that:

(E[Yi|Zi = 1] − E[Yi|Zi = 0]) / (E[Ai|Zi = 1] − E[Ai|Zi = 0]) = E[Yi(1) − Yi(0)|Ai(1) > Ai(0)]

◮ This fact was a massive intellectual jump in our understanding

of IV.
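A quick simulation illustrates the theorem (all shares and effect sizes below are hypothetical): with always-takers, never-takers, and compliers whose treatment effects differ, the Wald ratio recovers the complier effect rather than any population-wide average.

```python
import random
random.seed(0)

n = 200_000
Z, A, Y = [], [], []
for _ in range(n):
    z = random.randint(0, 1)           # randomized binary instrument
    u = random.random()
    if u < 0.2:                        # always-taker, effect 5
        a, tau = 1, 5.0
    elif u < 0.5:                      # never-taker (its effect is never seen)
        a, tau = 0, 1.0
    else:                              # complier, effect 2
        a, tau = z, 2.0
    y = random.gauss(0, 1) + tau * a   # exclusion: Z enters only through A
    Z.append(z); A.append(a); Y.append(y)

def diff_in_means(x, z):
    m1 = sum(xi for xi, zi in zip(x, z) if zi == 1) / z.count(1)
    m0 = sum(xi for xi, zi in zip(x, z) if zi == 0) / z.count(0)
    return m1 - m0

itt = diff_in_means(Y, Z)          # reduced form (ITT)
first_stage = diff_in_means(A, Z)  # close to 0.5, the complier share
wald = itt / first_stage
print(round(first_stage, 2), round(wald, 1))  # close to 0.5 and 2.0
```

The Wald ratio lands near 2 (the complier effect), not near the always-takers' effect of 5 or any weighted mix of the two.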
slide-130
SLIDE 130

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

slide-131
SLIDE 131

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

slide-132
SLIDE 132

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

◮ Thus, E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[(Yi(1) − Yi(0))(Ai(1) − Ai(0))]

slide-133
SLIDE 133

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

◮ Thus, E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[(Yi(1) − Yi(0))(Ai(1) − Ai(0))]

slide-134
SLIDE 134

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

◮ Thus, E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[(Yi(1) − Yi(0))(Ai(1) − Ai(0))]
= E[(Yi(1) − Yi(0))(1)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]
+ E[(Yi(1) − Yi(0))(−1)|Ai(1) < Ai(0)] Pr[Ai(1) < Ai(0)]

slide-135
SLIDE 135

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

◮ Thus, E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[(Yi(1) − Yi(0))(Ai(1) − Ai(0))]
= E[(Yi(1) − Yi(0))(1)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]
+ E[(Yi(1) − Yi(0))(−1)|Ai(1) < Ai(0)] Pr[Ai(1) < Ai(0)]
= E[Yi(1) − Yi(0)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]

slide-136
SLIDE 136

Proof of the LATE theorem

◮ Under the exclusion restriction and randomization,

E[Yi|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] = E[Yi(0) + (Yi(1) − Yi(0))Ai(1)] (randomization)

◮ The same applies to when Zi = 0, so we have

E[Yi|Zi = 0] = E[Yi(0) + (Yi(1) − Yi(0))Ai(0)]

◮ Thus, E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[(Yi(1) − Yi(0))(Ai(1) − Ai(0))]
= E[(Yi(1) − Yi(0))(1)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]
+ E[(Yi(1) − Yi(0))(−1)|Ai(1) < Ai(0)] Pr[Ai(1) < Ai(0)]
= E[Yi(1) − Yi(0)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]

◮ The third equality comes from monotonicity: with this

assumption, Ai(1) < Ai(0) never occurs.

slide-137
SLIDE 137

Proof (continued)

E[Yi|Zi = 1]−E[Yi|Zi = 0] = E[Yi(1)−Yi(0)|Ai(1) > Ai(0)] Pr[Ai(1) > Ai(0)]

  • We can use the same argument for the denominator:

E[Ai|Zi = 1] − E[Ai|Zi = 0] = E[Ai(1) − Ai(0)] = Pr[Ai(1) > Ai(0)]

  • Dividing these two expressions through gives the LATE.
slide-138
SLIDE 138

Reading

slide-139
SLIDE 139

Reading

slide-140
SLIDE 140

Is the LATE useful?

◮ Once we allow for heterogeneous effects, all we can estimate

with IV is the effect of treatment among compliers.

slide-141
SLIDE 141

Is the LATE useful?

◮ Once we allow for heterogeneous effects, all we can estimate

with IV is the effect of treatment among compliers.

◮ This is an unknown subset of the data. Among treated units

with Zi = 1, we cannot distinguish them from the always-takers and similarly for the control units with Zi = 0.

slide-142
SLIDE 142

Is the LATE useful?

◮ Once we allow for heterogeneous effects, all we can estimate

with IV is the effect of treatment among compliers.

◮ This is an unknown subset of the data. Among treated units

with Zi = 1, we cannot distinguish them from the always-takers and similarly for the control units with Zi = 0.

◮ Without further assumptions, this estimand is not equal to

the overall treatment effect or the treatment effect on the treated.
slide-143
SLIDE 143

Is the LATE useful?

◮ Once we allow for heterogeneous effects, all we can estimate

with IV is the effect of treatment among compliers.

◮ This is an unknown subset of the data. Among treated units

with Zi = 1, we cannot distinguish them from the always-takers and similarly for the control units with Zi = 0.

◮ Without further assumptions, this estimand is not equal to

the overall treatment effect or the treatment effect on the treated.

◮ Furthermore, since the complier group depends on the

instrument, an IV estimate with one instrument will generally estimate a different quantity than an IV estimate of the same effect with a different instrument.

slide-144
SLIDE 144

Is the LATE useful?

◮ Once we allow for heterogeneous effects, all we can estimate

with IV is the effect of treatment among compliers.

◮ This is an unknown subset of the data. Among treated units

with Zi = 1, we cannot distinguish them from the always-takers and similarly for the control units with Zi = 0.

◮ Without further assumptions, this estimand is not equal to

the overall treatment effect or the treatment effect on the treated.

◮ Furthermore, since the complier group depends on the

instrument, an IV estimate with one instrument will generally estimate a different quantity than an IV estimate of the same effect with a different instrument.

◮ 2SLS “cheats” by assuming that the effect is constant, so it is

the same for compliers and non-compliers.

slide-145
SLIDE 145

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?

slide-146
SLIDE 146

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

slide-147
SLIDE 147

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

◮ Think of a randomized experiment:

slide-148
SLIDE 148

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

◮ Think of a randomized experiment:

◮ Randomized treatment assignment = instrument (Zi)

slide-149
SLIDE 149

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

◮ Think of a randomized experiment:

◮ Randomized treatment assignment = instrument (Zi)
◮ Non-randomized actual treatment taken = treatment (Ai)

slide-150
SLIDE 150

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

◮ Think of a randomized experiment:

◮ Randomized treatment assignment = instrument (Zi)
◮ Non-randomized actual treatment taken = treatment (Ai)

◮ One-sided noncompliance: only those assigned to treatment

(control) can actually take the treatment (control). Or Pr[Ai = 1|Zi = 0] = 0

slide-151
SLIDE 151

Randomized trials with one-sided noncompliance

◮ Will the LATE ever be equal to a usual causal quantity?
◮ When non-compliance is one-sided, then the LATE is equal to

the ATT.

◮ Think of a randomized experiment:

◮ Randomized treatment assignment = instrument (Zi)
◮ Non-randomized actual treatment taken = treatment (Ai)

◮ One-sided noncompliance: only those assigned to treatment

(control) can actually take the treatment (control). Or Pr[Ai = 1|Zi = 0] = 0

◮ Maybe this is because only those treated actually get pills or

only they are invited to the job training location.
slide-152
SLIDE 152

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

slide-153
SLIDE 153

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)

slide-154
SLIDE 154

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)

slide-155
SLIDE 155

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]

slide-156
SLIDE 156

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]
= E[Yi(0)] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)] (randomization)

slide-157
SLIDE 157

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]
= E[Yi(0)] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)] (randomization)
= E[(Yi(1) − Yi(0))Ai|Zi = 1]

slide-158
SLIDE 158

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]
= E[Yi(0)] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)] (randomization)
= E[(Yi(1) − Yi(0))Ai|Zi = 1]
= E[Yi(1) − Yi(0)|Ai = 1, Zi = 1] Pr[Ai = 1|Zi = 1] (law of iterated expectations + binary treatment)

slide-159
SLIDE 159

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]
= E[Yi(0)] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)] (randomization)
= E[(Yi(1) − Yi(0))Ai|Zi = 1]
= E[Yi(1) − Yi(0)|Ai = 1, Zi = 1] Pr[Ai = 1|Zi = 1] (law of iterated expectations + binary treatment)
= E[Yi(1) − Yi(0)|Ai = 1] Pr[Ai = 1|Zi = 1] (one-sided noncompliance)

slide-160
SLIDE 160

Benefits of one-sided noncompliance

◮ With this assumption, we know that there are no

“always-takers,” and since there are no defiers, anyone assigned to treatment (Zi = 1) who takes the treatment (Ai = 1) is a complier.

◮ Thus, we know that: E[Yi|Zi = 1] − E[Yi|Zi = 0] =

E[Yi(0) + (Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0] (exclusion restriction + one-sided noncompliance)
= E[Yi(0)|Zi = 1] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)|Zi = 0]
= E[Yi(0)] + E[(Yi(1) − Yi(0))Ai|Zi = 1] − E[Yi(0)] (randomization)
= E[(Yi(1) − Yi(0))Ai|Zi = 1]
= E[Yi(1) − Yi(0)|Ai = 1, Zi = 1] Pr[Ai = 1|Zi = 1] (law of iterated expectations + binary treatment)
= E[Yi(1) − Yi(0)|Ai = 1] Pr[Ai = 1|Zi = 1] (one-sided noncompliance)

slide-161
SLIDE 161

◮ Noting that Pr[Ai = 1|Zi = 0] = 0, then the Wald estimator is

just the ATT: (E[Yi|Zi = 1] − E[Yi|Zi = 0]) / Pr[Ai = 1|Zi = 1] = E[Yi(1) − Yi(0)|Ai = 1]

slide-162
SLIDE 162

◮ Noting that Pr[Ai = 1|Zi = 0] = 0, then the Wald estimator is

just the ATT: (E[Yi|Zi = 1] − E[Yi|Zi = 0]) / Pr[Ai = 1|Zi = 1] = E[Yi(1) − Yi(0)|Ai = 1]

◮ Thus, under the additional assumption of one-sided noncompliance,

we can estimate the ATT using the usual IV approach.

slide-163
SLIDE 163

◮ Noting that Pr[Ai = 1|Zi = 0] = 0, then the Wald estimator is

just the ATT: (E[Yi|Zi = 1] − E[Yi|Zi = 0]) / Pr[Ai = 1|Zi = 1] = E[Yi(1) − Yi(0)|Ai = 1]

◮ Thus, under the additional assumption of one-sided noncompliance,

we can estimate the ATT using the usual IV approach.

◮ The ATT is a combination of the LATE and the ATE for the

always-takers. If we remove the possibility of the always takers, then anyone who actually takes the treatment is a complier.

slide-164
SLIDE 164

◮ Noting that Pr[Ai = 1|Zi = 0] = 0, then the Wald estimator is

just the ATT: (E[Yi|Zi = 1] − E[Yi|Zi = 0]) / Pr[Ai = 1|Zi = 1] = E[Yi(1) − Yi(0)|Ai = 1]

◮ Thus, under the additional assumption of one-sided noncompliance,

we can estimate the ATT using the usual IV approach.

◮ The ATT is a combination of the LATE and the ATE for the

always-takers. If we remove the possibility of the always takers, then anyone who actually takes the treatment is a complier.

◮ It’s also easy to see that if we switch the direction of one-sided

noncompliance, then we can estimate the average treatment effect for the controls.
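The ATT claim can be checked in a simulation (hypothetical shares and effect sizes) in which no one assigned to control can take the treatment, so every treated unit is a complier:

```python
import random
random.seed(1)

n = 100_000
Z, A, Y = [], [], []
for _ in range(n):
    z = random.randint(0, 1)
    complier = random.random() < 0.6       # 60% compliers, rest never-takers
    a = 1 if (z == 1 and complier) else 0  # one-sided: Pr[A = 1 | Z = 0] = 0
    tau = 3.0 if complier else 0.0         # hypothetical treatment effect
    Z.append(z); A.append(a)
    Y.append(random.gauss(0, 1) + tau * a)

def cmean(x, z, v):
    sub = [xi for xi, zi in zip(x, z) if zi == v]
    return sum(sub) / len(sub)

wald = (cmean(Y, Z, 1) - cmean(Y, Z, 0)) / (cmean(A, Z, 1) - cmean(A, Z, 0))
# Every unit with A = 1 is a complier here, so the ATT equals the complier
# effect of 3, and the Wald ratio recovers it.
print(round(wald, 1))  # close to 3.0
```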

slide-165
SLIDE 165

Falsification tests

◮ The exclusion restriction cannot be tested directly, but it can

be falsified.

slide-166
SLIDE 166

Falsification tests

◮ The exclusion restriction cannot be tested directly, but it can

be falsified.

◮ Under the exclusion restriction, Zi only has an effect on Yi

because it has an effect on Ai.

slide-167
SLIDE 167

Falsification tests

◮ The exclusion restriction cannot be tested directly, but it can

be falsified.

◮ Under the exclusion restriction, Zi only has an effect on Yi

because it has an effect on Ai.

◮ Falsification test: test the reduced-form effect of Zi on Yi in

situations where it is impossible or extremely unlikely that Zi could affect Ai.

slide-168
SLIDE 168

Falsification tests

◮ The exclusion restriction cannot be tested directly, but it can

be falsified.

◮ Under the exclusion restriction, Zi only has an effect on Yi

because it has an effect on Ai.

◮ Falsification test: test the reduced-form effect of Zi on Yi in

situations where it is impossible or extremely unlikely that Zi could affect Ai.

◮ Because Zi can’t affect Ai, then the exclusion restriction

implies that this falsification test should have 0 effect. If we find an effect, instrument is suspicious.

slide-169
SLIDE 169

Falsification tests

◮ The exclusion restriction cannot be tested directly, but it can

be falsified.

◮ Under the exclusion restriction, Zi only has an effect on Yi

because it has an effect on Ai.

◮ Falsification test: test the reduced-form effect of Zi on Yi in

situations where it is impossible or extremely unlikely that Zi could affect Ai.

◮ Because Zi can’t affect Ai, then the exclusion restriction

implies that this falsification test should have 0 effect. If we find an effect, instrument is suspicious.

◮ Nunn & Wantchekon (2011): use distance to coast as an

instrument for Africans, and use distance to the coast in an Asian sample as a falsification test.
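A falsification check is just the reduced form estimated where the first stage is known to be absent; the sketch below uses synthetic data standing in for such a placebo sample:

```python
import random
random.seed(2)

# Placebo sample in which Z cannot affect A (as with distance to the coast
# in the Asian sample): under the exclusion restriction, the reduced-form
# effect of Z on Y should be indistinguishable from zero.
n = 50_000
Z = [random.randint(0, 1) for _ in range(n)]
Y = [random.gauss(0, 1) for _ in range(n)]  # no Z -> Y channel by construction

m1 = sum(y for y, z in zip(Y, Z) if z == 1) / Z.count(1)
m0 = sum(y for y, z in zip(Y, Z) if z == 0) / Z.count(0)
reduced_form = m1 - m0
print(abs(reduced_form) < 0.05)  # a sizable estimate would cast doubt on Z
```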

slide-170
SLIDE 170

Nunn & Wantchekon falsification test

slide-171
SLIDE 171

Size, characteristics of the compliers

◮ While we cannot identify who is a complier and who is not a

complier in general, we can estimate the size of the complier group: Pr[Ai(1) > Ai(0)] = E[Ai(1)−Ai(0)] = E[Ai|Zi = 1]−E[Ai|Zi = 0]

slide-172
SLIDE 172

Size, characteristics of the compliers

◮ While we cannot identify who is a complier and who is not a

complier in general, we can estimate the size of the complier group: Pr[Ai(1) > Ai(0)] = E[Ai(1)−Ai(0)] = E[Ai|Zi = 1]−E[Ai|Zi = 0]

◮ Angrist and Pischke describe ways to calculate the difference

between the compliers and overall population in terms of binary covariates.

slide-173
SLIDE 173

Size, characteristics of the compliers

◮ While we cannot identify who is a complier and who is not a

complier in general, we can estimate the size of the complier group: Pr[Ai(1) > Ai(0)] = E[Ai(1)−Ai(0)] = E[Ai|Zi = 1]−E[Ai|Zi = 0]

◮ Angrist and Pischke describe ways to calculate the difference

between the compliers and overall population in terms of binary covariates.

◮ Abadie (2003) shows how to calculate the mean of any

covariate in the complier group.
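The complier share is just the first-stage difference in treatment rates; with a toy dataset of hypothetical (Z, A) pairs:

```python
# Hypothetical (Z, A) pairs: instrument assignment and observed treatment.
data = [(1, 1), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 0)]

def p_treated(z):
    taken = [a for zi, a in data if zi == z]
    return sum(taken) / len(taken)

# Pr[complier] = E[A | Z = 1] - E[A | Z = 0] under monotonicity.
complier_share = p_treated(1) - p_treated(0)
print(complier_share)  # 0.75 - 0.25 = 0.5
```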

slide-174
SLIDE 174

Multiple instruments

◮ Since each instrument implies a different complier group, each

instrument estimates a causal effect for a different subset of the population.

slide-175
SLIDE 175

Multiple instruments

◮ Since each instrument implies a different complier group, each

instrument estimates a causal effect for a different subset of the population.

◮ Thus, if we had two instruments, then there would be two

different LATEs, ρ1 and ρ2, for instruments Z1i and Z2i. We might try to use 2SLS to estimate an overall effect with these instruments with the following first stage: Âi = π1Z1i + π2Z2i.

slide-176
SLIDE 176

2SLS as weighted average

◮ Angrist and Pischke show that the 2SLS estimator

using these two instruments is a weighted sum of the two component LATEs: ρ2SLS = ψρ1 + (1 − ψ)ρ2, where the weight is: ψ = π1Cov(Ai, Z1i) / (π1Cov(Ai, Z1i) + π2Cov(Ai, Z2i))

slide-177
SLIDE 177

2SLS as weighted average

◮ Angrist and Pischke show that the 2SLS estimator

using these two instruments is a weighted sum of the two component LATEs: ρ2SLS = ψρ1 + (1 − ψ)ρ2, where the weight is: ψ = π1Cov(Ai, Z1i) / (π1Cov(Ai, Z1i) + π2Cov(Ai, Z2i))

◮ Thus, the 2SLS estimate is a weighted average of causal effects

for each instrument, where the weights reflect the strength of each instrument's first-stage effect.
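The weighting identity can be verified numerically. The sketch below uses a hypothetical compliance structure with independent binary instruments, so the bivariate slopes stand in for the first-stage coefficients, and compares ψρ1 + (1 − ψ)ρ2 with the 2SLS ratio built from the fitted first stage:

```python
import random
random.seed(3)

n = 100_000
Y, A, Z1, Z2 = [], [], [], []
for _ in range(n):
    z1, z2 = random.randint(0, 1), random.randint(0, 1)
    u = random.random()
    if u < 0.3:    a, tau = z1, 1.0   # responds only to Z1, effect 1
    elif u < 0.6:  a, tau = z2, 4.0   # responds only to Z2, effect 4
    else:          a, tau = 0, 0.0    # never-taker
    Y.append(random.gauss(0, 1) + tau * a)
    A.append(a); Z1.append(z1); Z2.append(z2)

def cov(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

rho1 = cov(Y, Z1) / cov(A, Z1)   # IV (Wald) estimate using Z1 alone
rho2 = cov(Y, Z2) / cov(A, Z2)   # IV (Wald) estimate using Z2 alone
pi1 = cov(A, Z1) / cov(Z1, Z1)   # first-stage slopes (instruments ~independent)
pi2 = cov(A, Z2) / cov(Z2, Z2)
psi = pi1 * cov(A, Z1) / (pi1 * cov(A, Z1) + pi2 * cov(A, Z2))
blended = psi * rho1 + (1 - psi) * rho2

Ahat = [pi1 * a + pi2 * b for a, b in zip(Z1, Z2)]   # fitted first stage
tsls = cov(Y, Ahat) / cov(A, Ahat)                   # 2SLS-style IV ratio
print(round(blended, 2), round(tsls, 2))  # the two agree; both near 2.5
```

The agreement is algebraic, not a coincidence of this draw: Cov(Y, Âi) = π1Cov(Y, Z1i) + π2Cov(Y, Z2i), so the ratio using Âi collapses exactly to the ψ-weighted average of the two instrument-specific estimates.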

slide-178
SLIDE 178

Covariates and heterogeneous effects

◮ It might be the case that the above assumptions only hold

conditional on some covariates, Xi. That is, instead of randomization, we might have conditional ignorability: [{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi|Xi

slide-179
SLIDE 179

Covariates and heterogeneous effects

◮ It might be the case that the above assumptions only hold

conditional on some covariates, Xi. That is, instead of randomization, we might have conditional ignorability: [{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi|Xi

◮ We would also have exclusion conditional on the covariates:

Pr[Yi(a, 0) = Yi(a, 1)|Xi] = 1 for a = 1, 0

slide-180
SLIDE 180

Covariates and heterogeneous effects

◮ It might be the case that the above assumptions only hold

conditional on some covariates, Xi. That is, instead of randomization, we might have conditional ignorability: [{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi|Xi

◮ We would also have exclusion conditional on the covariates:

Pr[Yi(a, 0) = Yi(a, 1)|Xi] = 1 for a = 1, 0

◮ Under these assumptions, Angrist and Pischke show that if you

fully saturate the first stage and the second stage in the covariates, then 2SLS estimates a weighted average of the covariate-specific LATEs (very similar to regression).

slide-181
SLIDE 181

Covariates and heterogeneous effects

◮ It might be the case that the above assumptions only hold

conditional on some covariates, Xi. That is, instead of randomization, we might have conditional ignorability: [{Yi(a, z), ∀a, z}, Ai(1), Ai(0)] ⊥ ⊥ Zi|Xi

◮ We would also have exclusion conditional on the covariates:

Pr[Yi(a, 0) = Yi(a, 1)|Xi] = 1 for a = 1, 0

◮ Under these assumptions, Angrist and Pischke show that if you

fully saturate the first stage and the second stage in the covariates, then 2SLS estimates a weighted average of the covariate-specific LATEs (very similar to regression).

◮ Abadie (2003) shows how to estimate the overall LATE using a

weighting approach based on a “propensity score” for the instrument.