Projection Inference for Set-Identified SVARs. Bulat Gafarov (PSU), Matthias Meier (University of Bonn), and José-Luis Montiel-Olea (Columbia).



SLIDE 1

Introduction Notation Results Implementation Conclusion

Projection Inference for Set-Identified Svars

Bulat Gafarov (PSU), Matthias Meier (University of Bonn), and José-Luis Montiel-Olea (Columbia). September 21, 2016

1 / 38

SLIDE 2


Introduction: Set-Identified SVARs

⋆ SVAR: theoretical restrictions R imposed on a VAR (Sims [1980, 1986]):

Yt = A1 Yt−1 + … + Ap Yt−p + ηt,  Σ = E[ηt ηt′]

⋆ Goal of the restrictions: (A1, …, Ap, Σ) →R IRFk,i,j (the response of variable i to the j-th 'structural shock' at horizon k).

⋆ The map '→R' can be one-to-one (point identification) or one-to-many (set identification). Set-identified SVARs have become popular in applied macro work.

⋆ Common practice: set-identify SVARs with ≥ / = restrictions. (Faust [1998]; Canova and De Nicoló [2002]; Uhlig [2005])

SLIDE 3


Motivation

⋆ Most empirical studies report Bayesian credible sets for IRFk,i,j. (Bayesian inference depends on the specification of prior beliefs.)

⋆ Practical concern: prior beliefs are not 'dominated' by the data (results are sensitive to the choice of priors even as T → ∞).

⋆ Theoretical critique: coverage and 'robust' credibility → 0 as T → ∞. (Moon and Schorfheide [2012]; Kitagawa [2012])

⋆ There is recent work on non-Bayesian inference for set-identified SVARs. (MSG [2013], frequentist inference; GK [2014], robust Bayes)

⋆ Is there a simple way to conduct inference in set-identified SVARs that pleases both a frequentist and a robust Bayesian, and that is general and computationally feasible?

SLIDE 4


Description of the Inference Problem

IRFk,i,j ∈ IRk,i,j(µ) ⊆ [ v̲k,i,j(µ) , v̄k,i,j(µ) ],  µ ≡ (A, Σ).

SLIDE 5


This Paper

⋆ Studies the properties of 'projection inference' for set-identified SVARs. (Scheffé [1953]; Dufour [1990]; Dufour and Taamouti [2005])

⋆ We collect the IRFk,i,j's in a 1 − α Wald ellipsoid for µ ≡ (A, Σ); that is, we 'project' a nominal 1 − α Wald ellipsoid.

⋆ Strategy: focus on the endpoints of the identified set for IRFk,i,j (the minimum and maximum response, v̲k,i,j(µ), v̄k,i,j(µ)):

[ inf over µ ∈ CST(1−α) of v̲k,i,j(µ) , sup over µ ∈ CST(1−α) of v̄k,i,j(µ) ]

⋆ Our projection region has coverage and robust Bayesian credibility ≥ 1 − α for any vector of IRFs, thus providing simultaneous inference.

SLIDE 6


Pros & Cons

Pros:
⋆ Generality: can handle the typical application in applied work (+/0 restrictions on IRFs, long-run restrictions, elasticity bounds).
⋆ Feasibility: solve two nonlinear optimization problems per IRFk,i,j (we use state-of-the-art solution algorithms for these problems).

Cons:
⋆ Projection is conservative for a frequentist and for a robust Bayesian (coverage and robust credibility are strictly above 1 − α).
⋆ We 'calibrate' projection to remove the excess of robust credibility (= 1 − α rather than > 1 − α; calibration based on KMS [2016]).

SLIDE 7


Outline

  • 1. Model and Main Definitions
  • 2. Assumptions and Results
  • 3. Implementation and Illustrative Example
  • 4. Conclusion

SLIDE 8


  • 1. Model and Main Definitions

SLIDE 9


SVAR(p)

⋆ Structural VAR for the n-dimensional vector Yt:

Yt = A1 Yt−1 + … + Ap Yt−p + B εt,  Σ ≡ BB′

⋆ The vector of reduced-form parameters is:

µ = (vec(A1, A2, …, Ap)′, vech(Σ)′)′ ∈ R^d

⋆ Coefficients of the structural impulse response function:

IRF_H = {IRFkh,ih,jh(A, B)}, h = 1, …, H,  with IRFkh,ih,jh(A, B) = e′ih Ckh(A) Bjh,

where e′ih Ckh(A) is 1 × n and Bjh is the jh-th column of B.

⋆ We are interested in simultaneous inference about λH ≡ IRF_H. (Inoue and Kilian [2013, 2016]; Lütkepohl et al. [2016])
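For a fixed (A, B), the coefficient e′ih Ckh(A) Bjh can be computed by reading Ck(A) off powers of the VAR companion matrix. A minimal sketch (the helper name and interface are ours, not the paper's):

```python
import numpy as np

def structural_irf(A_list, B, k, i, j):
    """IRF_{k,i,j} = e_i' C_k(A) B e_j: response of variable i to the
    j-th structural shock at horizon k. C_k(A) is the top-left n x n
    block of the k-th power of the VAR companion matrix.
    A_list: list of p (n x n) lag matrices; B: (n x n) impact matrix."""
    n, p = B.shape[0], len(A_list)
    F = np.zeros((n * p, n * p))          # companion matrix of the VAR(p)
    F[:n, :] = np.hstack(A_list)
    if p > 1:
        F[n:, :-n] = np.eye(n * (p - 1))
    Ck = np.linalg.matrix_power(F, k)[:n, :n]   # C_k(A)
    return (Ck @ B)[i, j]
```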

SLIDE 10


Restrictions R(µ) on B

⋆ Identified set for λH:

IR_H(µ) ≡ { λ ∈ R^H : λh = IRFkh,ih,jh(A, B) s.t. BB′ = Σ, B ∈ R(µ), ∀h }

⋆ ±/0 restrictions on IRFs: e′i′ Ck′(A) Bj′ ≥ 0. (e.g., Sims [1980]; Uhlig [2005])

⋆ ±/0 long-run restrictions: e′i′ (In − A(1))^{−1} Bj′ ≥ 0. (e.g., Blanchard and Quah [1989]; Galí [1999])

⋆ Elasticity bounds: (e′i′ Bj′) / (e′i Bj′) ∈ [c, d]. (e.g., Kilian and Murphy [2012]; Baumeister and Hamilton [2015])
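Membership in R(µ) is easy to check draw by draw: write B = chol(Σ)Q with Q orthogonal, so BB′ = Σ holds by construction, and keep the draws that satisfy the restrictions. A sketch under our own naming (sign_check is a hypothetical stand-in for the restriction set R):

```python
import numpy as np

def draw_admissible_B(Sigma, sign_check, n_draws=1000, rng=None):
    """Draw impact matrices B = chol(Sigma) @ Q with Q orthogonal
    (Haar-distributed), keeping those satisfying the restrictions.
    sign_check: callable B -> bool encoding the restriction set."""
    rng = rng if isinstance(rng, np.random.Generator) else np.random.default_rng(rng)
    L = np.linalg.cholesky(Sigma)
    n = Sigma.shape[0]
    kept = []
    for _ in range(n_draws):
        Q, R = np.linalg.qr(rng.standard_normal((n, n)))
        Q = Q @ np.diag(np.sign(np.diag(R)))  # fix signs -> Haar measure
        B = L @ Q                             # BB' = Sigma by construction
        if sign_check(B):
            kept.append(B)
    return kept
```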

SLIDE 11


Bounds on the Identified Set: Max and Min Response

⋆ The endpoints of the identified set for each IRFk,i,j:

v̄k,i,j(µ) ≡ sup over B of IRFk,i,j(A, B) s.t. BB′ = Σ, B ∈ R(µ)
v̲k,i,j(µ) ≡ inf over B of IRFk,i,j(A, B) s.t. BB′ = Σ, B ∈ R(µ)

⋆ These are nonlinear, possibly nondifferentiable, transformations of µ.

⋆ Obviously,

IR_H(µ) ⊆ ×_{h=1,…,H} [ v̲kh,ih,jh(µ) , v̄kh,ih,jh(µ) ].

⋆ No need to assume the identified set is connected.

SLIDE 12


Projection Region for λH

⋆ Let CST(1 − α; µ) be the (typical) Wald ellipsoid for µ.

⋆ Let CST(1 − α; IRFk,i,j) be the interval

[ inf over µ ∈ CST(1−α;µ) of v̲k,i,j(µ) , sup over µ ∈ CST(1−α;µ) of v̄k,i,j(µ) ].

⋆ The projection region for λH = {IRFkh,ih,jh(A, B)}, h = 1, …, H, is:

CST(1 − α; λH) ≡ CST(1 − α; IRFk1,i1,j1) × … × CST(1 − α; IRFkH,iH,jH)

⋆ We now present the properties of CST(1 − α; λH) as T → ∞.

SLIDE 13


  • 2. Assumptions and Results 1 to 4

SLIDE 14


Result 1: Frequentist Coverage

⋆ Let P be a DGP for the data, parameterized by (A, B, F).

⋆ We want projection to be valid over a class P of DGPs.

⋆ A1: Suppose the class of DGPs P is such that

lim inf as T→∞ of inf over P ∈ P of P( µ(P) ∈ CST(1 − α; µ) ) ≥ 1 − α.

⋆ R1: Under Assumption A1:

lim inf as T→∞ of inf over P ∈ P of inf over λH ∈ IR_H(µ(P)) of P( λH ∈ CST(1 − α; λH) ) ≥ 1 − α.

SLIDE 15


Proof: Straightforward Projection Argument

Suppose that H = 1. For any λ ∈ IRk,i,j(µ(P)):

P( λ ∈ [ inf over µ ∈ CST(1−α) of v̲k,i,j(µ) , sup over µ ∈ CST(1−α) of v̄k,i,j(µ) ] )

≥ P( [ v̲k,i,j(µ(P)) , v̄k,i,j(µ(P)) ] ⊆ [ inf over µ ∈ CST(1−α) of v̲k,i,j(µ) , sup over µ ∈ CST(1−α) of v̄k,i,j(µ) ] )

(as IRk,i,j(µ(P)) ⊆ [ v̲k,i,j(µ(P)) , v̄k,i,j(µ(P)) ])

≥ P( µ(P) ∈ CST(1 − α) ).

SLIDE 16


Robust Bayes Framework

⋆ Let P∗ be a prior for the structural parameters (A, B). (F is now a fixed known distribution; we use N(0, In).)

⋆ Represent the prior P∗ in terms of (P∗µ, P∗Q|µ), Q ≡ Σ^{−1/2} B. (Orthogonal reduced-form parameterization; Arias et al. [2014].)

⋆ Let P(P∗µ) denote the class of priors such that µ ∼ P∗µ.

⋆ The robust credibility of CST(1 − α; λH) is defined as:

inf over P∗ ∈ P(P∗µ) of P∗( λH(A, B) ∈ CST(1 − α; λH) | YT )
SLIDE 17


Result 2: Robust Bayesian Credibility

⋆ We can view robust credibility as a random variable (as it depends on the data YT).

⋆ A2: Suppose that P∗ is such that whenever YT ∼ f(YT|µ0):

P∗( µ(A, B) ∈ CST(1 − α; µ) | YT ) = 1 − α + op(YT|µ0).

⋆ This is implied by the Bernstein-von Mises Theorem for µ.

⋆ R2: Under Assumption A2:

inf over P∗ ∈ P(P∗µ) of P∗( λH ∈ CST(1 − α; λH) | YT ) ≥ 1 − α + op(YT|µ0).

⋆ Proof: another embarrassingly simple projection argument!

SLIDE 18


Calibrated Projection

⋆ Yes: we know that projection inference is conservative (both in terms of frequentist coverage and robust credibility).

⋆ In theory, it is conceptually simple to remove the 'projection bias': project a smaller Wald ellipsoid, as suggested by KMS [2016].

⋆ In practice, removing the excess of robust Bayesian credibility is much easier than removing the excess of frequentist coverage.

⋆ We suggest an algorithm to 'calibrate' robust credibility.

⋆ The algorithm also removes the excess of frequentist coverage, provided the bounds of the identified set are differentiable.

SLIDE 19


Result 3: Calibrated Robust Credibility

⋆ Our calibration algorithm is based on the following result.

⋆ Suppose there is a nominal level 1 − α∗(YT) such that:

P∗µ( ×_{h=1,…,H} [ v̲kh,ih,jh(µ) , v̄kh,ih,jh(µ) ] ⊆ CST(1 − α∗(YT); λH) | YT ) = 1 − α.

⋆ R3: Then, for every data realization:

inf over P∗ ∈ P(P∗µ) of P∗( λH(A, B) ∈ CST(1 − α∗(YT); λH) | YT ) = 1 − α.

⋆ Proof: slightly more involved; see Appendix A.2.

SLIDE 20


Calibration Algorithm

⋆ Take M draws (µ∗m) from the posterior distribution of µ (or from its asymptotic approximation based on BvM).

⋆ For each h = 1, …, H and each m = 1, …, M, evaluate

[ v̲kh,ih,jh(µ∗m) , v̄kh,ih,jh(µ∗m) ]

(we use nonlinear numerical solvers to evaluate the bounds).

⋆ Fix a confidence level 1 − αs < 1 − α. Count how often

[ v̲kh,ih,jh(µ∗m) , v̄kh,ih,jh(µ∗m) ] ⊆ CST(1 − αs; λkh,ih,jh) for all h = 1, …, H.

⋆ If the robust Bayesian credibility is not within [1 − α − η, 1 − α + η], adjust αs. (η is a tolerance level for the excess of robust credibility.)
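The search over αs can be organized as a bisection. A sketch, assuming the per-draw bounds have already been computed and that proj_interval (our hypothetical placeholder) returns the candidate projection interval for IRF_h at a given nominal level:

```python
import numpy as np

def calibrate_alpha(lo, hi, proj_interval, alpha=0.32, eta=0.005, iters=30):
    """Bisection over alpha_s. lo, hi: (M, H) arrays of identified-set
    bounds at M posterior draws of mu. proj_interval(alpha_s, h) -> (L, U).
    Assumes robust credibility at alpha_s = alpha is at least 1 - alpha
    (projection is conservative) and decreases as alpha_s grows."""
    M, H = lo.shape

    def rbc(alpha_s):
        # fraction of draws whose bound-box lies inside the projection region
        inside = np.ones(M, dtype=bool)
        for h in range(H):
            L, U = proj_interval(alpha_s, h)
            inside &= (lo[:, h] >= L) & (hi[:, h] <= U)
        return inside.mean()

    a_lo, a_hi = alpha, 1.0 - 1e-6
    for _ in range(iters):
        a_mid = 0.5 * (a_lo + a_hi)
        c = rbc(a_mid)
        if abs(c - (1 - alpha)) <= eta:
            return a_mid
        if c > 1 - alpha:      # too much credibility: shrink the region
            a_lo = a_mid
        else:                  # too little: widen the region
            a_hi = a_mid
    return a_mid
```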

SLIDE 21


Result 4: Coverage of Calibrated Projection

⋆ Question: suppose we have found α∗(YT). Is it true that

Pµ0( [ v̲h(µ0) , v̄h(µ0) ] ⊆ CST(1 − α∗(YT); λh), ∀h = 1, …, H ) → 1 − α ?

⋆ Answer: yes. The following regularity conditions suffice:

⋆ v̲h(µ0), v̄h(µ0) are differentiable at µ0 for each h = 1, …, H.

⋆ √T (µ̂ − µ0) →d N(0, Ω), Ω̂T →p Ω, and BvM holds.

⋆ Proof: under differentiability, CST(1 − α∗(YT); λh) is approximately

[ v̲h(µ̂T) − r∗T σh(µ0)/√T , v̄h(µ̂T) + r∗T σh(µ0)/√T ].
SLIDE 22


  • 3. Implementation and illustrative example

SLIDE 23


Projection as an Optimization Problem

⋆ To implement projection we need to solve the program:

sup over µ ∈ CST(1−α;µ) of v̄k,i,j(µ)

⋆ The confidence set for the reduced-form parameters is taken as:

CST(1 − α) ≡ { µ ∈ R^d : T (µ̂T − µ)′ Ω̂T^{−1} (µ̂T − µ) ≤ χ²_{1−α,d} }

⋆ Thus, the program of interest becomes:

sup over µ ∈ CST(1−α) of [ sup over B with BB′ = Σ, B ∈ R(µ) of e′i Ck(A) Bj ]

⋆ This is a twice differentiable, non-convex, nonlinear program:

sup over (µ, B) of e′i Ck(A) Bj s.t. BB′ = Σ, B ∈ R(µ), and µ ∈ CST(1 − α)
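The joint program can be handed to a local SQP solver directly. A sketch of one side of it, where irf_of and sigma_of are hypothetical user-supplied callables (standing in for e′i Ck(A) Bj and the Σ block of µ); this is a local solve in SciPy, not the paper's two-phase MATLAB implementation:

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint
from scipy.stats import chi2

def projection_sup(mu_hat, Omega, T_obs, irf_of, sigma_of, alpha=0.32,
                   sign=1.0):
    """Optimize sign * irf_of(mu, B) jointly over mu in the Wald
    ellipsoid and B with BB' = sigma_of(mu). sign=+1 gives the upper
    projection endpoint, sign=-1 the lower one."""
    d = mu_hat.size
    n = sigma_of(mu_hat).shape[0]
    crit = chi2.ppf(1 - alpha, df=d)
    Oinv = np.linalg.inv(Omega)

    def unpack(x):
        return x[:d], x[d:].reshape(n, n)

    def obj(x):
        mu, B = unpack(x)
        return -sign * irf_of(mu, B)

    # Wald ellipsoid constraint on mu
    wald = NonlinearConstraint(
        lambda x: T_obs * (mu_hat - x[:d]) @ Oinv @ (mu_hat - x[:d]),
        -np.inf, crit)
    # factorization constraint BB' = Sigma(mu)
    factor = NonlinearConstraint(
        lambda x: (unpack(x)[1] @ unpack(x)[1].T
                   - sigma_of(unpack(x)[0])).ravel(),
        0.0, 0.0)
    x0 = np.concatenate([mu_hat,
                         np.linalg.cholesky(sigma_of(mu_hat)).ravel()])
    res = minimize(obj, x0, method='SLSQP', constraints=[wald, factor])
    return sign * (-res.fun)
```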

SLIDE 24


Solution Algorithm

⋆ How to solve state-of-the-art nonlinear, non-convex, large-scale problems?

⋆ There is a large literature on local and global optimization algorithms. (Local: AL, SQP, IP; Global: Multistart, GlobalSearch, GAs)

⋆ Since the problem is non-convex, local solvers are not enough. (We use a two-phase solution algorithm: local + global.)

⋆ SQP/IP: 'most powerful local algorithm for nonlinear programming'. (Nocedal and Wright [2006], p. 253; implemented in fmincon)

⋆ For the global stage we use Multistart, GlobalSearch, and ga (taking the local solution as an input for the global algorithm).
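The two-phase (local + global) idea can be mimicked with a simple multistart wrapper around a local solver; MATLAB's Multistart/GlobalSearch/ga are more sophisticated, but the mechanism is the same. A sketch (our own helper, not the paper's code):

```python
import numpy as np
from scipy.optimize import minimize

def multistart(obj, x0, n_starts=20, scale=0.5, seed=0, **kw):
    """Run a local SQP solve from x0 and from randomized perturbations
    of it, keeping the best successful local solution found."""
    rng = np.random.default_rng(seed)
    best = minimize(obj, x0, method='SLSQP', **kw)
    for _ in range(n_starts):
        res = minimize(obj, x0 + scale * rng.standard_normal(x0.shape),
                       method='SLSQP', **kw)
        if res.success and res.fun < best.fun:
            best = res
    return best
```

On a non-convex objective, the randomized restarts let the wrapper escape the basin that the first local solve lands in.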

SLIDE 25


Example: Demand-Supply SVAR, BH (2015), ECMA

⋆ Effect of a structural shock to labor demand on wages/employment? (2-variable SVAR, 6 lags (1 by AIC, 1 by BIC), 1970Q1 to 2014Q2: ∆wt, ∆ηt.)

[∆wt ; ∆ηt] = A1 [∆wt−1 ; ∆ηt−1] + … + A6 [∆wt−6 ; ∆ηt−6] + B [εd,t ; εs,t].

⋆ Sign restrictions set-identify the structural demand and supply shocks:

B ≡ [ b1 b3 ; b2 b4 ] satisfies the sign pattern [ + − ; + + ].

⋆ Elasticities of supply and demand: α ≡ b2/b1, β ≡ b4/b3.

SLIDE 26


Table: Additional Identifying Restrictions

Motivation                                | BH (2015)               | This paper
Empirical studies report α ∈ [.27, 2]     | α ∼ max{.6 + .6 t3, 0}  | .27 ≤ α ≤ 2
Empirical studies report β ∈ [−2.5, −.15] | β ∼ min{−.6 + .6 t3, 0} | −2.5 ≤ β ≤ −.15
γ = 0 is too strong                       | γ ∼ N(0, V)             | −2V ≤ γ ≤ 2V

where γ ≡ e′2 (In − A1 − A2 − … − A6)^{−1} B1.

SLIDE 27

68% Projection Region and 68% Credible Set

[Figure: Expansionary Demand Shock (BH priors). Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock.]

SLIDE 28

68% Projection Region and 68% Credible Set

[Figure: Expansionary Demand Shock (Uhlig's priors). Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock.]

SLIDE 29

68% Projection Region and 68% Credible Set

[Figure: Expansionary Supply Shock (BH priors). Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock.]

SLIDE 30

68% Projection Region and 68% Credible Set

[Figure: Expansionary Supply Shock (Uhlig's priors). Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock.]

SLIDE 31


Robustness or Conservativeness?

⋆ Projection is informative about the effects of εd on ηt ...

⋆ ... but not very informative about the rest of the dynamic effects.

⋆ This could be a consequence of the 'robustness' of projection,

⋆ or a consequence of its conservativeness (coverage > 1 − α).

⋆ To separate these effects, we report the calibrated projection (represented as a dotted line in the following figures).

SLIDE 32

68% Projection Region and 68% Calibrated Projection

[Figure: Expansionary Demand Shock. Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock. Boxes: horizon-by-horizon 68% robust Bayesian credible set. Whiskers: minimum/maximum IRF (100,000 draws).]

SLIDE 33

68% Projection Region and 68% Calibrated Projection

[Figure: Expansionary Supply Shock. Panels: cumulative % change in wage and cumulative % change in employment, over 20 quarters after the shock. Boxes: horizon-by-horizon 68% robust Bayesian credible set. Whiskers: minimum/maximum IRF (100,000 draws).]

SLIDE 34


Comments

⋆ Credible sets differ substantially depending on the prior beliefs (compare BH with Uhlig priors).

⋆ The prior-free, 'projected' region gives qualitatively different inference (only the employment response to the demand shock is significant).

⋆ Run times: SQP/IP: 12 min; Uhlig: 38 min; BH: 66 min; Global: 9 hrs. (See Table III, p. 20 in the paper.)

⋆ The global algorithms do not improve on the local solution. (See Appendix B, p. 48 in the paper.)

⋆ Calibration of the robust credible set takes around 3 minutes for M = 1,000 (and around 5 hours with M = 100,000 and 50 parallel workers).

SLIDE 35


  • 4. Conclusion

SLIDE 36


Main Messages from our Paper

⋆ We studied the properties of projection inference for SVARs (it delivers both a frequentist and a robust Bayes interpretation).

⋆ We emphasized the generality of projection inference (it can handle typical applications in applied macro work).

⋆ We thought seriously about computational feasibility (implementation requires solving two mathematical programs).

⋆ We showed how to calibrate the robust Bayesian credibility of projection (which also calibrates coverage under some regularity conditions).

SLIDE 37


Bottom Line

We think that projection is a simple way to conduct inference in set-identified SVARs: it has both frequentist coverage and robust Bayes credibility, and it is general, feasible, and delivers simultaneous inference.

SLIDE 38


Thanks very much for listening!
