DPM in Applications Yong Song University of Melbourne Department - - PowerPoint PPT Presentation

SLIDE 1

Dirichlet Process DPM MCMC Application DPM Application Research Potential

DPM in Applications

Yong Song

University of Melbourne Department of Economics BAM

Yong Song University of Melbourne Department of Economics BAM DPM in Applications

SLIDE 2

Today

Extract from the forthcoming 3-day BAM short course in Nov 2016.

SLIDE 3

Definition of DP (Ferguson 1973)

Definition (Dirichlet Process) The Dirichlet process over a set Ω is a stochastic process whose sample path is a probability distribution over Ω. For a random distribution F distributed according to a Dirichlet process DP(α, G0), given any finite measurable partition A1, A2, · · · , AK of the sample space Ω, the random vector (F(A1), · · · , F(AK)) is distributed as a Dirichlet distribution with parameters (αG0(A1), · · · , αG0(AK))

SLIDE 4

Partition Example

SLIDE 5

Interpretation

A random draw from DP(α, G0) is a distribution F over Ω (the real line in the above figure).

For the above partition, the probability measure is random:

$$(F(A_1), F(A_2), \dots, F(A_7)) \sim \mathrm{Dir}\big(\alpha G_0(A_1), \alpha G_0(A_2), \dots, \alpha G_0(A_7)\big)$$
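As a rough illustration of the finite-partition property, the sketch below draws the random vector (F(A1), ..., F(A7)) for one partition of the real line. The base measure G0 = N(0, 1), the value α = 2 and the cut points are assumptions made only for this example.

```python
import math
import random

def normal_cdf(x):
    """CDF of N(0, 1), used here as the assumed base measure G0."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def dirichlet(params, rng):
    """Draw from Dir(params) by normalising independent Gamma(a, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(g)
    return [x / total for x in g]

alpha = 2.0                                  # assumed concentration parameter
cuts = [-2.0, -1.0, -0.3, 0.3, 1.0, 2.0]     # 6 cut points give 7 intervals A1..A7
edges = [-math.inf] + cuts + [math.inf]
g0_mass = [normal_cdf(b) - normal_cdf(a) for a, b in zip(edges[:-1], edges[1:])]

rng = random.Random(0)
# One random probability vector (F(A1), ..., F(A7)) ~ Dir(alpha * G0(A_k)).
F_partition = dirichlet([alpha * m for m in g0_mass], rng)
```

Repeating the last line gives a different probability vector each time, which is the sense in which F is a random distribution.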

SLIDE 6

Conjugacy

Suppose that we observe a random draw θ from a distribution F, where F ∼ DP(α, G0). What is the posterior distribution of F? For any finite partition A1, ..., AK, what is the posterior of [F(A1), ..., F(AK)] conditional on θ?

SLIDE 7

Conjugacy

Prior:

$$[F(A_1), \dots, F(A_K)] \sim \mathrm{Dir}\big(\alpha G_0(A_1), \alpha G_0(A_2), \dots, \alpha G_0(A_K)\big)$$

Likelihood:

$$p(\theta \mid [F(A_1), \dots, F(A_K)]) = F(A_1)^{\delta_\theta(A_1)} F(A_2)^{\delta_\theta(A_2)} \cdots F(A_K)^{\delta_\theta(A_K)},$$

where δθ(A) = 1 if θ ∈ A and 0 otherwise.

SLIDE 8

Conjugacy

Posterior kernel:

$$
\begin{aligned}
p([F(A_1), \dots, F(A_K)] \mid \theta)
&\propto p(\theta \mid [F(A_1), \dots, F(A_K)]) \, p([F(A_1), \dots, F(A_K)]) \\
&\propto F(A_1)^{\delta_\theta(A_1)} F(A_2)^{\delta_\theta(A_2)} \cdots F(A_K)^{\delta_\theta(A_K)} \\
&\quad \times F(A_1)^{\alpha G_0(A_1) - 1} F(A_2)^{\alpha G_0(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) - 1} \\
&\propto F(A_1)^{\alpha G_0(A_1) + \delta_\theta(A_1) - 1} F(A_2)^{\alpha G_0(A_2) + \delta_\theta(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) + \delta_\theta(A_K) - 1}
\end{aligned}
$$

SLIDE 9

Conjugacy

$$[F(A_1), \dots, F(A_K)] \mid \theta \sim \mathrm{Dir}\big(\alpha G_0(A_1) + \delta_\theta(A_1), \alpha G_0(A_2) + \delta_\theta(A_2), \dots, \alpha G_0(A_K) + \delta_\theta(A_K)\big)$$

This formula applies to ANY finite partition. We can write a new concentration parameter $\tilde{\alpha} = \alpha + 1$ and a new shape parameter

$$\tilde{G}_0 = \frac{\alpha}{\alpha + 1} G_0 + \frac{1}{\alpha + 1} \delta_\theta.$$

SLIDE 10

Conjugacy

$$F \mid \theta \sim DP(\tilde{\alpha}, \tilde{G}_0) \quad \text{with} \quad \tilde{\alpha} = \alpha + 1 \quad \text{and} \quad \tilde{G}_0 = \frac{\alpha}{\alpha + 1} G_0 + \frac{1}{\alpha + 1} \delta_\theta.$$

SLIDE 11

Conjugacy

Suppose that you observe n observations, denoted by Θ = (θ1, ..., θn), drawn from F. For any finite partition A1, ..., AK, what is the posterior of [F(A1), ..., F(AK)]?

Prior (same as before):

$$[F(A_1), \dots, F(A_K)] \sim \mathrm{Dir}\big(\alpha G_0(A_1), \dots, \alpha G_0(A_K)\big)$$

Likelihood:

$$p(\Theta \mid [F(A_1), \dots, F(A_K)]) = F(A_1)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_1)} F(A_2)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_2)} \cdots F(A_K)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_K)},$$

where $\sum_{i=1}^{n} \delta_{\theta_i}(A_j)$ counts the number of θi's that fall in the set Aj.

SLIDE 12

Conjugacy

Posterior kernel:

$$
\begin{aligned}
p([F(A_1), \dots, F(A_K)] \mid \Theta)
&\propto p(\Theta \mid [F(A_1), \dots, F(A_K)]) \, p([F(A_1), \dots, F(A_K)]) \\
&\propto F(A_1)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_1)} F(A_2)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_2)} \cdots F(A_K)^{\sum_{i=1}^{n} \delta_{\theta_i}(A_K)} \\
&\quad \times F(A_1)^{\alpha G_0(A_1) - 1} F(A_2)^{\alpha G_0(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) - 1} \\
&\propto F(A_1)^{\alpha G_0(A_1) + \sum_{i=1}^{n} \delta_{\theta_i}(A_1) - 1} F(A_2)^{\alpha G_0(A_2) + \sum_{i=1}^{n} \delta_{\theta_i}(A_2) - 1} \cdots F(A_K)^{\alpha G_0(A_K) + \sum_{i=1}^{n} \delta_{\theta_i}(A_K) - 1}
\end{aligned}
$$

SLIDE 13

Conjugacy

$$[F(A_1), \dots, F(A_K)] \mid \Theta \sim \mathrm{Dir}\left(\alpha G_0(A_1) + \sum_{i=1}^{n} \delta_{\theta_i}(A_1), \dots, \alpha G_0(A_K) + \sum_{i=1}^{n} \delta_{\theta_i}(A_K)\right)$$

This formula applies to ANY finite partition. We can write a new concentration parameter $\tilde{\alpha} = \alpha + n$ and a new shape parameter

$$\tilde{G}_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i}.$$

SLIDE 14

Conjugacy

$$F \mid \Theta \sim DP(\tilde{\alpha}, \tilde{G}_0) \quad \text{with} \quad \tilde{\alpha} = \alpha + n \quad \text{and} \quad \tilde{G}_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i}.$$

SLIDE 15

Conjugacy

Shape parameter:

$$\tilde{G}_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{1}{\alpha + n} \sum_{i=1}^{n} \delta_{\theta_i} = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \times \frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}$$

Notice that $\frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}$ is the empirical distribution of the observations. What if n → ∞?

SLIDE 16

Conjugacy

$$F \mid \Theta \sim DP(\tilde{\alpha}, \tilde{G}_0) \quad \text{with} \quad \tilde{\alpha} = \alpha + n \quad \text{and} \quad \tilde{G}_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \frac{1}{n} \sum_{i=1}^{n} \delta_{\theta_i}.$$
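The conjugate update can be checked numerically on any finite partition. The sketch below assumes G0 = N(0, 1), α = 1 and a handful of made-up observations; it computes the posterior Dirichlet parameters αG0(Ak) + nk, whose normalised values equal the posterior shape parameter evaluated on each set.

```python
import math

def normal_cdf(x):
    """CDF of N(0, 1), the assumed base measure G0."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def posterior_dirichlet_params(alpha, cuts, thetas):
    """Return alpha * G0(A_k) + n_k for the intervals defined by the cut points."""
    edges = [-math.inf] + list(cuts) + [math.inf]
    params = []
    for a, b in zip(edges[:-1], edges[1:]):
        prior_part = alpha * (normal_cdf(b) - normal_cdf(a))
        count = sum(1 for t in thetas if a < t <= b)   # n_k: observations in A_k
        params.append(prior_part + count)
    return params

thetas = [-1.2, 0.1, 0.4, 0.4, 2.5]        # illustrative observations
params = posterior_dirichlet_params(1.0, [-1.0, 0.0, 1.0], thetas)
# Posterior mean of F(A_k): mixes G0(A_k) and the empirical frequency n_k / n.
post_mean = [p / sum(params) for p in params]
```

The parameters sum to α + n, so as n grows the posterior mean is dominated by the empirical frequencies.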

SLIDE 17

Conjugacy

1. DP is conjugate to discrete distributions (finite or infinite).
2. Each draw F from DP is a distribution.
3. Each draw F from DP is a discrete distribution.

These properties make DP suitable as a prior for the parameters in an infinite mixture model.

SLIDE 18

Distinct values

We can use $\theta^*_j$ to represent the distinct values of θi for i = 1, ..., n. For instance, if there are M distinct values of θi, we have $\Theta^* = (\theta^*_1, \dots, \theta^*_M)$.

To link Θ to Θ∗, we can follow West et al. [1994] and Escobar and West [1995] and use a link function ci (c means choice). If ci = j, then $\theta_i = \theta^*_j$. More concisely,

$$\theta_i = \theta^*_{c_i}$$

SLIDE 19

Distinct values

We can write the conditional posterior of F | Θ in terms of distinct values as

$$F \mid \Theta \sim DP(\tilde{\alpha}, \tilde{G}_0) \quad \text{with} \quad \tilde{\alpha} = \alpha + n \quad \text{and} \quad \tilde{G}_0 = \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \sum_{j=1}^{M} \frac{n_j}{n} \delta_{\theta^*_j},$$

where nj is the number of θi's that take the value $\theta^*_j$.

SLIDE 20

Stick Breaking Representation (Sethuraman 1994)

$F \equiv (w_k, \theta_k)_{k=1}^{\infty} \sim DP(\alpha, G_0)$ can be generated by

$$
\begin{aligned}
\theta_k &\sim G_0 && \text{for } k = 1, 2, \dots \\
V_k &\sim \mathrm{Beta}(1, \alpha) && \text{for } k = 1, 2, \dots \\
w_k &= V_k \prod_{j=1}^{k-1} (1 - V_j) && \text{for } k = 1, 2, \dots
\end{aligned}
$$

The generation of {wk} is called the stick breaking process, or simply w ∼ SBP(α).
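A minimal sketch of the stick breaking construction, truncated after a fixed number of sticks. The truncation level, the value α = 1 and the base measure G0 = N(0, 1) are assumptions for illustration only.

```python
import random

def stick_breaking(alpha, n_sticks, rng):
    """Return the first n_sticks weights of SBP(alpha) and the unbroken remainder."""
    weights = []
    remaining = 1.0                       # length of stick not yet broken off
    for _ in range(n_sticks):
        v = rng.betavariate(1.0, alpha)   # V_k ~ Beta(1, alpha)
        weights.append(remaining * v)     # w_k = V_k * prod_{j<k} (1 - V_j)
        remaining *= 1.0 - v
    return weights, remaining

rng = random.Random(1)
w, residual = stick_breaking(alpha=1.0, n_sticks=20, rng=rng)
theta = [rng.gauss(0.0, 1.0) for _ in w]  # atoms theta_k from the assumed G0
```

The weights plus the residual always sum to one, so the residual measures how much probability mass the truncation leaves out.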

SLIDE 21

Stick Breaking Process

[Figure: a stick broken into segments π1, π2, π3, π4, ...]

SLIDE 22

Dirichlet Process Mixture (DPM)

West et al. [1994] and Escobar and West [1995] proposed the Dirichlet process mixture (DPM) model:

$$
\begin{aligned}
w &\sim SBP(\alpha) \\
\theta_k &\sim G_0 \quad \text{for } k = 1, 2, \dots \\
y &\sim \sum_{k=1}^{\infty} w_k f(y \mid \theta_k)
\end{aligned}
$$
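To make the generative model concrete, the sketch below simulates data from a DPM of normals, breaking sticks lazily so that no fixed truncation is needed. The choices f(y | θ) = N(θ, 1) and G0 = N(0, 3²), and the function name `sample_dpm`, are assumptions for this example.

```python
import random

def sample_dpm(n, alpha, rng, g0_sd=3.0, noise_sd=1.0):
    """Draw n observations from one realisation of a DPM of normals."""
    w, theta = [], []          # sticks broken so far and their atoms
    remaining = 1.0
    ys, labels = [], []
    for _ in range(n):
        r = rng.random()       # pick the component whose cumulative weight covers r
        k, cum = 0, 0.0
        while True:
            if k == len(w):    # need a component that does not exist yet
                v = rng.betavariate(1.0, alpha)
                w.append(remaining * v)
                remaining *= 1.0 - v
                theta.append(rng.gauss(0.0, g0_sd))
            cum += w[k]
            if r < cum or remaining < 1e-12:   # second test is a numerical guard
                break
            k += 1
        labels.append(k)
        ys.append(rng.gauss(theta[k], noise_sd))
    return ys, labels, w, theta

ys, labels, w, theta = sample_dpm(200, alpha=1.0, rng=random.Random(2))
```

Only the components actually visited are ever created, which is the same idea the slice sampler exploits during estimation.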

SLIDE 23

Slice Sampler

We cannot simulate a state indicator si from an infinite number of states as we do in the finite case, so we need a slice sampler as in Walker (2007).

SLIDE 24

Slice Sampler

[Figure: slice sampler illustration]

SLIDE 25

Slice Sampler

Draw an auxiliary variable u | y from a uniform distribution U(0, p(y)). Then you have a super simple kernel:

$$p(y, u) = p(y) \, p(u \mid y) = p(y) \frac{1}{p(y)} \mathbf{1}(u < p(y)) = \mathbf{1}(u < p(y))$$

You can simulate y | u from the region where p(y) > u, which has finite support instead of infinite!

Model consistency: u is drawn conditional on y, so the marginal distribution of y is NOT affected.

SLIDE 26

Slice Sampler

Simulate uniformly from that region.
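A small sketch of the two-step slice sampler for a case where the region {y : p(y) > u} is available in closed form. The target is assumed to be the standard normal with unnormalised density exp(−y²/2), so the region is the interval (−√(−2 log u), √(−2 log u)).

```python
import math
import random

def slice_sample_std_normal(n_iter, rng, y0=0.0):
    """Alternate u | y ~ U(0, p(y)) and y | u ~ Uniform{y : p(y) > u}."""
    y = y0
    draws = []
    for _ in range(n_iter):
        p_y = math.exp(-0.5 * y * y)             # unnormalised density at current y
        u = (1.0 - rng.random()) * p_y           # uniform on (0, p(y)], never zero
        half_width = math.sqrt(-2.0 * math.log(u))
        y = rng.uniform(-half_width, half_width) # uniform on the slice
        draws.append(y)
    return draws

draws = slice_sample_std_normal(5000, random.Random(3))
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The sample mean and variance should be close to 0 and 1, although successive draws are correlated.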

SLIDE 27

Slice Sampler in DPM

We want a slice sampler to shrink the infinite number of regimes to a finite one, so it should work on w.

SLIDE 28

Slice Sampler in DPM

First, augment the model by the state indicator S = (s1, ..., sn):

$$
\begin{aligned}
w &\sim SBP(\alpha) \\
\theta_k &\sim G_0 \quad \text{for } k = 1, 2, \dots \\
y_i \mid s_i = k &\sim f(y \mid \theta_k) \\
P(s_i = k) &= w_k
\end{aligned}
$$

SLIDE 29

Slice Sampler in DPM

Now, augment the model by the slice variable U = (u1, ..., un):

$$
\begin{aligned}
w &\sim SBP(\alpha) \\
\theta_k &\sim G_0 \quad \text{for } k = 1, 2, \dots \\
y_i \mid s_i = k &\sim f(y \mid \theta_k) \\
P(s_i = k) &= w_k \\
u_i \mid s_i = k &\sim U(0, w_k)
\end{aligned}
$$

The marginal distribution is NOT affected. But conditional on ui, si can only choose from a finite set of indicators.

SLIDE 30

Slice Sampler in DPM

[Figure: plot with horizontal axis from 1 to 8 and vertical axis from 0.05 to 0.3]

SLIDE 31

Parameter Space

1. w = (w1, w2, ...)
2. Θ = (θ1, θ2, ...)
3. S = (s1, ..., sn)
4. U = (u1, ..., un)

I do not discuss α today because of the time limit. More discussion is prepared for the November 2016 short course.

SLIDE 32

Posterior Kernel

$$
\begin{aligned}
p(w, \Theta, S, U \mid y)
&\propto p(w) \, p(\Theta) \, p(S \mid w) \, p(U \mid w, S) \, p(y \mid \Theta, S) \\
&\propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ w_{s_i} \frac{1}{w_{s_i}} \mathbf{1}(u_i < w_{s_i}) \, p(y_i \mid \theta^*_{s_i}) \right] \\
&\propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ \mathbf{1}(u_i < w_{s_i}) \, p(y_i \mid \theta^*_{s_i}) \right]
\end{aligned}
$$

SLIDE 33

Slice Sampler in DPM

1. Find $u_{\min} = \min(u_1, \dots, u_n)$.
2. Write $w = (w_1, w_2, \dots, w_K, w_{K'})$, where $w_{K'} = 1 - \sum_{k=1}^{K} w_k$ is the residual weight.
3. If $u_{\min} > w_{K'}$, then it is impossible for any si to take a value larger than K, because that requires $u_i < w_J$ for some J > K. If that were true, then, since wJ is part of the residual weight,

$$u_{\min} \le u_i < w_J \le w_{K'} \quad \Rightarrow \quad u_{\min} < w_{K'},$$

which is a contradiction.

SLIDE 34

Slice Sampler in DPM

4. If $u_{\min} < w_{K'}$, then expand the state space until $u_{\min} > w_{K'}$ is satisfied.
5. To expand, use the stick breaking representation. Why? We can verify that $(w_{K+1}, w_{K+2}, \dots) \sim w_{K'} \, SBP(\alpha)$. So simply keep breaking the stick and increase K until $u_{\min} > w_{K'}$ is satisfied.
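The expansion rule in steps 1 to 5 can be sketched as follows. The starting weights, the state indicators and α = 1 are made-up inputs, and the helper names `draw_slices` and `expand_sticks` are hypothetical.

```python
import random

def draw_slices(w, s, rng):
    """u_i | s_i ~ U(0, w_{s_i}) for each observation."""
    return [rng.random() * w[k] for k in s]

def expand_sticks(w, residual, u_min, alpha, rng):
    """Break the residual stick until the residual weight is no larger than u_min."""
    w = list(w)
    while residual > u_min:
        v = rng.betavariate(1.0, alpha)   # new stick is a Beta(1, alpha) share
        w.append(residual * v)            # of the residual weight
        residual *= 1.0 - v
    return w, residual

rng = random.Random(4)
w0, residual0 = [0.5, 0.3], 0.2           # K = 2 regimes plus residual weight
s = [0, 0, 1, 0, 1]                       # current state indicators (0-based)
u = draw_slices(w0, s, rng)
w_new, residual_new = expand_sticks(w0, residual0, min(u), alpha=1.0, rng=rng)
```

After the loop no unrepresented regime can have weight above the smallest slice variable, so every si only needs to choose among the regimes in `w_new`.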

SLIDE 35

w | ·

$$p(w \mid \cdot) \propto p(w \mid \alpha) \prod_{i=1}^{n} \mathbf{1}(u_i < w_{s_i})$$

There is no convenient way to sample from this! What can we do? Change the kernel!

SLIDE 36

New kernel

$$p(w, \Theta, S \mid \cdot) \propto p(w) \, p(\Theta) \, p(S \mid w) \prod_{i=1}^{n} p(y_i \mid \theta^*_{s_i})$$

and

$$p(w \mid \cdot) \propto p(w \mid \alpha) \, p(S \mid w)$$

We discard U.

SLIDE 37

w

We know the prior (w, Θ) ∼ DP(α, G0). With observations $\theta^*_m$ for m = 1, ..., M, where M is the number of active regimes that have data, the posterior (w, Θ) | S, Θ∗ is also a DP:

$$(w, \Theta) \mid S, \Theta^* \sim DP\left(\alpha + n, \; \frac{\alpha}{\alpha + n} G_0 + \frac{n}{\alpha + n} \sum_{j=1}^{M} \frac{n_j}{n} \delta_{\theta^*_j}\right)$$

SLIDE 38

w

For the points $\theta^*_j$, j = 1, ..., M, the posterior is a Dirichlet distribution (not a DP, but inferred from the DP's definition):

$$(P(\theta^*_1), \dots, P(\theta^*_M), P(\text{else})) \sim \mathrm{Dir}(n_1, \dots, n_M, \alpha)$$

This is equivalent to

$$(w_1, \dots, w_M, w_{M'}) \sim \mathrm{Dir}(n_1, \dots, n_M, \alpha),$$

where wj is the probability associated with the distinct value $\theta^*_j$ and $w_{M'} = 1 - \sum_{j=1}^{M} w_j$.

SLIDE 39

w

We simulate w | Θ, S as follows:

1. Get rid of all regimes that have no data and relabel w1, ..., wM as the weights associated with the distinct values $\theta^*_1, \dots, \theta^*_M$.
2. Simulate $(w_1, \dots, w_M, w_{M'}) \sim \mathrm{Dir}(n_1, \dots, n_M, \alpha)$.
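The two steps above can be sketched as below. The Dirichlet draw is built from normalised Gamma variates, and the function name `draw_weights` and the example indicator vector are hypothetical.

```python
import random
from collections import Counter

def draw_weights(s, alpha, rng):
    """Drop empty regimes, relabel, and draw (w_1..w_M, w_M') ~ Dir(n_1..n_M, alpha)."""
    counts = Counter(s)
    active = sorted(counts)                          # regimes that have data
    relabel = {old: new for new, old in enumerate(active)}
    s_new = [relabel[k] for k in s]                  # indicators after relabelling
    shapes = [counts[k] for k in active] + [alpha]   # (n_1, ..., n_M, alpha)
    g = [rng.gammavariate(a, 1.0) for a in shapes]
    total = sum(g)
    w = [x / total for x in g[:-1]]
    residual = g[-1] / total                         # w_M', mass of regimes without data
    return w, residual, s_new

s = [0, 3, 3, 7, 0, 3]                               # regimes 1, 2, 4, 5, 6 are empty
w, residual, s_new = draw_weights(s, alpha=1.0, rng=random.Random(5))
```

Here M = 3 after relabelling, and the residual weight is the starting point for any later stick breaking.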

SLIDE 40

Θ | ·

This is the same as in the finite model, conditional on S.

1. Collect the data for each regime j = 1, ..., M. Notice that M may change in each iteration.
2. Simulate $\theta^*_j$ for each distinct value conditional on the data in that regime.

SLIDE 41

S | ·

From the kernel

$$p(w, \Theta, S, U \mid \cdot) \propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ \mathbf{1}(u_i < w_{s_i}) \, p(y_i \mid \theta^*_{s_i}) \right]$$

we have

$$p(S \mid \cdot) \propto \prod_{i=1}^{n} \left[ \mathbf{1}(u_i < w_{s_i}) \, p(y_i \mid \theta^*_{s_i}) \right]$$

For each si, choose its value k with probability proportional to $p(y_i \mid \theta^*_k)$ as long as ui < wk. Hence we only need to choose from a finite set. This step is parallelizable.
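A sketch of this step for a normal kernel with unit variance, which is an assumption made for the example, as are the numbers used. Each si is drawn only among the regimes whose weight exceeds ui.

```python
import math
import random

def normal_pdf(y, mean, sd=1.0):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def draw_indicators(y, u, w, theta, rng):
    """Draw each s_i with probability proportional to p(y_i | theta_k) over {k : w_k > u_i}."""
    s = []
    for y_i, u_i in zip(y, u):
        candidates = [k for k, w_k in enumerate(w) if w_k > u_i]   # finite set
        probs = [normal_pdf(y_i, theta[k]) for k in candidates]
        s.append(rng.choices(candidates, weights=probs, k=1)[0])
    return s

y = [-2.1, 0.2, 1.9]
w = [0.5, 0.3, 0.15]
theta = [-2.0, 0.0, 2.0]
u = [0.2, 0.1, 0.4]          # illustrative slice variables
s = draw_indicators(y, u, w, theta, random.Random(6))
```

With u = 0.4 only the first regime is admissible for the third observation, even though its data point lies closest to the third regime. The loop over i has no cross-dependence, which is why the step can run in parallel.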

SLIDE 42

U | ·

From the kernel

$$p(w, \Theta, S, U \mid \cdot) \propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ \mathbf{1}(u_i < w_{s_i}) \, p(y_i \mid \theta^*_{s_i}) \right]$$

we find

$$p(U \mid \cdot) \propto \prod_{i=1}^{n} \mathbf{1}(u_i < w_{s_i})$$

Simply simulate ui from a uniform distribution $U(0, w_{s_i})$.

SLIDE 43

Multiple Kernel

We have seen

1. p(w, Θ, S, U | ·) for S, U, Θ
2. p(w, Θ, S | ·) for w

Confused?

SLIDE 44

Simple Example

We want to sample from p(a, b, c). Ideally, we cycle through

1. a | b, c
2. b | a, c
3. c | a, b

However, we know b | a instead of b | a, c.

SLIDE 45

Simple Example

If we know b | a instead of b | a, c, we gain efficiency by simulating b | a first and then c | a, b. This is equivalent to simulating (b, c) | a jointly. Rao-Blackwellization.

SLIDE 46

Back to DPM

1. Simulate (w, U) | Θ, S by
   1. w | Θ, S, α: use DP conjugacy
   2. U | w, Θ, S, α: slice
2. Simulate Θ | w, U, S: traditional
3. Simulate S | w, U, Θ: slice sampler
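Putting the three blocks together, one sweep of the sampler might look like the sketch below. It assumes a univariate model with f(y | θ) = N(θ, 1), G0 = N(0, 3²) and a fixed α, none of which come from the slides, and the function name `dpm_sweep` is hypothetical.

```python
import math
import random
from collections import Counter

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def dpm_sweep(y, s, alpha, rng, g0_sd=3.0, noise_sd=1.0):
    """One sweep: (w, U) | S, then Theta | S, then S | w, U, Theta."""
    # 1a. w | S: drop empty regimes, relabel, draw Dir(n_1, ..., n_M, alpha).
    active = sorted(set(s))
    relabel = {old: new for new, old in enumerate(active)}
    s = [relabel[k] for k in s]
    counts = Counter(s)
    g = [rng.gammavariate(counts[j], 1.0) for j in range(len(active))]
    g.append(rng.gammavariate(alpha, 1.0))
    total = sum(g)
    w = [x / total for x in g[:-1]]
    residual = g[-1] / total
    # 1b. U | w, S: u_i ~ U(0, w_{s_i}); break sticks until residual <= u_min.
    u = [rng.random() * w[k] for k in s]
    u_min = min(u)
    while residual > u_min:
        v = rng.betavariate(1.0, alpha)
        w.append(residual * v)
        residual *= 1.0 - v
    # 2. Theta | S: conjugate normal update; empty regimes reduce to a draw from G0.
    prior_prec, like_prec = 1.0 / g0_sd ** 2, 1.0 / noise_sd ** 2
    sums = [0.0] * len(w)
    for y_i, k in zip(y, s):
        sums[k] += y_i
    theta = []
    for k in range(len(w)):
        post_var = 1.0 / (prior_prec + counts[k] * like_prec)
        post_mean = post_var * like_prec * sums[k]
        theta.append(rng.gauss(post_mean, math.sqrt(post_var)))
    # 3. S | w, U, Theta: choose among the finite set {k : w_k > u_i}.
    new_s = []
    for y_i, u_i in zip(y, u):
        cand = [k for k in range(len(w)) if w[k] > u_i]
        probs = [normal_pdf(y_i, theta[k], noise_sd) for k in cand]
        new_s.append(rng.choices(cand, weights=probs)[0])
    return new_s, w, theta

rng = random.Random(7)
y = [rng.gauss(-2.0, 1.0) for _ in range(30)] + [rng.gauss(2.0, 1.0) for _ in range(30)]
s = [0] * len(y)
for _ in range(200):
    s, w, theta = dpm_sweep(y, s, alpha=1.0, rng=rng)
```

The number of distinct labels in `s` after each sweep is the number of active regimes, which is the quantity tracked in the trace plots.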

SLIDE 47

Density plot: α ∼ G(10, 10)

SLIDE 48

Number of Regimes: α ∼ G(10, 10)

[Figure: number of regimes (vertical axis, 2 to 16) across 5000 iterations]

SLIDE 49

Density plot: α ∼ G(10, 100)

SLIDE 50

Number of Regimes: α ∼ G(10, 100)

[Figure: number of regimes (vertical axis, 3 to 8) across 5000 iterations]

SLIDE 51

Density plot: α ∼ G(100, 10)

SLIDE 52

Number of Regimes: α ∼ G(100, 10)

[Figure: number of regimes (vertical axis, 5 to 45) across 5000 iterations]

SLIDE 53

Numeric Pitfall: α ∼ G(10, 1000)

[Figure: two panels across 6000 iterations; vertical axes from 0.005 to 0.025 and from 1 to 5 (×10^-4)]

SLIDE 54

Numeric Pitfall: α ∼ G(1, 100)

[Figure: two panels across 6000 iterations; vertical axes from 0.02 to 0.08 and from 1 to 5 (×10^-4)]

SLIDE 55

Notes on α

1. It is the hyperparameter of the prior on the mixture probabilities w.
2. Always monitor its value in the posterior if you estimate it.
3. A large value of α is NOT parsimonious.

SLIDE 56

Density plot: α = 1

SLIDE 57

Number of Regimes: α = 1

[Figure: number of regimes (vertical axis, 2 to 14) across 5000 iterations]

SLIDE 58

Density plot: α = 10

SLIDE 59

Number of Regimes: α = 10

[Figure: number of regimes (vertical axis, 10 to 50) across 5000 iterations]

SLIDE 60

Density plot: α = 0.1

SLIDE 61

Number of Regimes: α = 0.1

[Figure: number of regimes (vertical axis, 3 to 8) across 5000 iterations]

SLIDE 62

Suggestions for α

1. Give it a small value without estimation.
2. A robustness check is needed.

SLIDE 63

Hepatitis C

A 12-week Harvoni treatment costs:

  • the US (94,500 USD)
  • Canada (80,000 USD)
  • the UK (39,000 GBP)
  • Germany (48,000 euro)
  • Egypt (1200 USD)
  • India (900 USD)
  • Australia (38.3 AUD)

SLIDE 64

X variables

  • Coffee intake
  • Age
  • Alcohol intake
  • Smoking
  • Body weight

SLIDE 65

LS Estimator

            Estimate    SE          tStat      pValue
Coffee      -1.9917     0.87174     -2.2848    0.022883
Intercept   -5.2584     2.8781      -1.8271    0.068481
Age          0.13293    0.039702     3.3482    0.0008957
Alcohol      0.019548   0.021608     0.90469   0.36621
Smoking      2.0362     0.91397      2.2279    0.026481
Weight       0.11617    0.026996     4.3034    2.1478e-05

SLIDE 66

A DPM (Neal and Shahbaba 2009)

$$
\begin{aligned}
w &\sim SBP(\alpha) \\
\theta_j = (\mu_j, \Sigma_j) &\sim G_0 \quad \text{for } j = 1, 2, \dots \\
P(s_i = j \mid w) &= w_j \\
(y_i, x_i) \mid s_i = k &\sim N(\mu_k, \Sigma_k) \quad \text{for } i = 1, 2, \dots, n
\end{aligned}
$$

This is a nonparametric model for the data (y, x).

SLIDE 67

Parameter Space

1. w = (w1, w2, ...)
2. Θ = (θ1, θ2, ...)
3. S = (s1, ..., sn)

The auxiliary variable U for the slice sampler is also needed. We set α as a constant this time for simplicity.

SLIDE 68

θ

We consider a conjugate prior for θ, namely

$$\theta_j = (\mu_j, \Sigma_j) \sim N\text{-}IW(m_0, h_0, A_0, a_0)$$

or

$$\Sigma_j \sim IW(A_0, a_0), \qquad \mu_j \mid \Sigma_j \sim N(m_0, h_0^{-1} \Sigma_j)$$

SLIDE 69

Conditional Normal

If

$$(y, x) \sim N\left( \begin{pmatrix} \mu_y \\ \mu_x \end{pmatrix}, \begin{pmatrix} \Sigma_{yy} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{xx} \end{pmatrix} \right)$$

then y | x ∼ N(µ(x), Σ) with

$$\mu(x) = \mu_y + \Sigma_{yx} \Sigma_{xx}^{-1} (x - \mu_x), \qquad \Sigma = \Sigma_{yy} - \Sigma_{yx} \Sigma_{xx}^{-1} \Sigma_{xy}$$

The conditional mean is still linear in x, but in a DPM it is NOT.
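For scalar y and scalar x the conditional normal formulas reduce to simple arithmetic. The sketch below uses made-up moments to show that the conditional mean is a straight line in x with slope Σyx / Σxx.

```python
def conditional_normal(mu_y, mu_x, s_yy, s_yx, s_xx, x):
    """Mean and variance of y | x for a bivariate normal with scalar y and x."""
    slope = s_yx / s_xx                   # Sigma_yx * Sigma_xx^{-1}
    mean = mu_y + slope * (x - mu_x)      # mu(x), linear in x
    var = s_yy - slope * s_yx             # Sigma_yy - Sigma_yx Sigma_xx^{-1} Sigma_xy
    return mean, var

# Illustrative moments: mu_y = 1, mu_x = 2, Sigma_yy = 4, Sigma_yx = 1.5, Sigma_xx = 3.
mean, var = conditional_normal(1.0, 2.0, 4.0, 1.5, 3.0, x=3.0)
```

With these inputs the slope is 0.5, so moving x one unit above its mean raises the conditional mean of y by 0.5, while the conditional variance does not depend on x.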

SLIDE 70

DPM is nonlinear

If

$$(y, x) \sim \sum_{k=1}^{\infty} w_k N(\mu_k, \Sigma_k)$$

then

$$p(y \mid x) = \frac{p(y, x)}{p(x)}$$

The nonlinearity is not easy to see from this expression, but we can use the regime indicator s.

SLIDE 71

DPM is nonlinear

If

$$(y, x) \mid s = k \sim N(\mu_k, \Sigma_k), \qquad P(s = k) = w_k$$

then

$$
\begin{aligned}
p(y \mid x) &= \int p(y, s \mid x) \, ds \\
&= \int p(y \mid s, x) \, p(s \mid x) \, ds \\
&= \sum_{k=1}^{\infty} p(y \mid s = k, x) \, p(s = k \mid x)
\end{aligned}
$$

SLIDE 72

DPM is nonlinear

$$p(y \mid s = k, x) = f_N(y \mid \mu_k(x), \sigma_k^2)$$

The first term is simply the conditional normal (do not worry about the notation).

SLIDE 73

DPM is nonlinear

Second term:

$$
\begin{aligned}
p(s = k \mid x) &\propto p(s = k, x) \\
&\propto p(x \mid s = k) \, p(s = k) \\
&\propto f_N(x \mid \mu_{k,x}, \Sigma_{k,xx}) \, w_k
\end{aligned}
$$

Hence we have a weight function of x, $w_k(x) \propto f_N(x \mid \mu_{k,x}, \Sigma_{k,xx}) \, w_k$. If x is likely to come from component k, then that component receives a larger weight in forecasting y.

SLIDE 74

DPM is nonlinear

Overall, in a DPM,

$$p(y \mid x) = \sum_{k=1}^{\infty} w_k(x) \, N(\mu_k(x), \sigma_k^2)$$

The conditional distribution is highly nonlinear.
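A two-component sketch of the weight function and the resulting conditional mean, for scalar y and x. The component parameters below are invented for illustration, and a finite mixture stands in for the infinite sum.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_conditional_mean(x, components):
    """Return E[y | x] and the weights w_k(x) for a mixture of bivariate normals."""
    raw = [c["w"] * normal_pdf(x, c["mu_x"], c["s_xx"]) for c in components]
    total = sum(raw)
    weights = [r / total for r in raw]                    # w_k(x)
    means = [c["mu_y"] + c["s_yx"] / c["s_xx"] * (x - c["mu_x"])
             for c in components]                         # mu_k(x), linear within k
    return sum(wk * m for wk, m in zip(weights, means)), weights

components = [
    {"w": 0.5, "mu_y": 0.0, "mu_x": -2.0, "s_yx": 0.8, "s_xx": 1.0},
    {"w": 0.5, "mu_y": 1.0, "mu_x": 2.0, "s_yx": -0.5, "s_xx": 1.0},
]
mean_left, weights_left = mixture_conditional_mean(-2.0, components)
mean_mid, weights_mid = mixture_conditional_mean(0.0, components)
```

Near x = −2 almost all the weight sits on the first component and the slope is about 0.8, while near x = 2 the second component dominates and the slope is about −0.5, so the overall regression curve bends even though each component is linear.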

SLIDE 75

Parameter Space

1. w = (w1, w2, ...)
2. Θ = (θ1, θ2, ...)
3. S = (s1, ..., sn)
4. U = (u1, ..., un): slice sampler

SLIDE 76

Posterior Kernel

$$
\begin{aligned}
p(w, S, U, \Theta \mid y, x)
&\propto p(w) \, p(S \mid w) \, p(U \mid w, S) \, p(\Theta) \, p(y, x \mid \Theta, S) \\
&\propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ w_{s_i} \frac{1}{w_{s_i}} \mathbf{1}(u_i < w_{s_i}) \, p((y_i, x_i) \mid \theta_{s_i}) \right] \\
&\propto p(w) \, p(\Theta) \prod_{i=1}^{n} \left[ \mathbf{1}(u_i < w_{s_i}) \, p((y_i, x_i) \mid \theta_{s_i}) \right]
\end{aligned}
$$

SLIDE 77

MCMC

1. Simulate (w, U) | Θ, S by
   1. w | Θ, S: use DP conjugacy
   2. U | w, Θ, S: slice
2. Simulate Θ | w, U, S: traditional (different but simple: instead of NIG, use NIW)
3. Simulate S | w, U, Θ: slice sampler

SLIDE 78

Trace plot

[Figure: trace plot across 5000 iterations, vertical axis from 1 to 5, α = 1]

SLIDE 79

Marginal Effect of Coffee

[Figure: plot of the marginal effect of coffee]

SLIDE 80

Marginal Effect of Coffee with 90% Density Interval

[Figure: plot of the marginal effect of coffee with 90% density interval]

SLIDE 81

Heat Map

[Figure: heat map, both axes from 50 to 350, color scale from 0.5 to 1]

SLIDE 82

LS Estimator after Removing Outliers

            Estimate      SE          tStat       pValue
Coffee      -1.4263       0.81913     -1.7412     0.082479
Intercept   -5.3347       2.3868      -2.2351     0.026008
Age          0.14562      0.033026     4.4092     1.3634e-05
Alcohol      3.8176e-05   0.019467     0.0019611  0.99844
Smoking      1.4899       0.77155      1.9311     0.054238
Weight       0.10213      0.022423     4.5546     7.1459e-06

SLIDE 83

Applications

  • Jensen and Maheu (2010, 2013, 2014): financial econometrics
  • Kleibergen and Zivot (2003); Conley, Hansen, McCulloch, and Rossi (2008): instrumental variables
  • Song (2014): infinite hidden Markov model

Suitable topics:

  • small dimension
  • large data set
  • natural sorting and clustering
  • hate overparametrization
  • structural estimation
  • mixture of discrete and continuous variables