

slide-1
SLIDE 1

On a Class of Nonparametric Bayesian Autoregressive Models

Maria Anna Di Lucca¹, Alessandra Guglielmi², Peter Müller³, Fernando A. Quintana⁴

Karolinska Institutet¹, Politecnico di Milano², University of Texas, Austin³, Pontificia Universidad Católica de Chile⁴

ICERM Workshop, Providence, RI, USA, September 17–21, 2012

slide 1 of 37

slide-2
SLIDE 2

Outline

1. Motivation
2. DDP Models
3. The Model
   Some Previous Work
   The Model: Continuous Case
   The Model: Binary Case
4. Data Illustrations
   Old Faithful Geyser
   Data from Multiple Binary Sequences
5. Final Comments

slide 2 of 37


slide-4
SLIDE 4

Motivation

Autoregressive models are very popular, but the parametric case limits the scope and extent of inference. We want to generalize the usual assumptions and define a notion of “flexible autoregressive model”. For instance, for order-1 dependence, we would like to replace Yt = β + αYt−1 + εt by Yt | Yt−1 = y ∼ Fy. The proposal is based on dependent Dirichlet processes (DDPs), but the method can be extended to other types of random probability measures.

Motivation: slide 4 of 37
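To make the idea of a flexible transition law concrete, here is a minimal, purely illustrative sketch (not the model of the talk): a transition family Fy realized as a two-component normal mixture whose weight and component means depend on y = Yt−1, so that the whole conditional distribution, not just its mean, changes with the past. All numeric choices below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next(y, rng):
    """Draw Y_t ~ F_y, where F_y is an illustrative two-component
    normal mixture whose weight and means depend on y = Y_{t-1}."""
    w = 1.0 / (1.0 + np.exp(-(y - 70.0) / 5.0))   # mixing weight drifts with y
    if rng.random() < w:
        return rng.normal(0.5 * y + 10.0, 4.0)    # "high" regime
    return rng.normal(0.3 * y + 40.0, 4.0)        # "low" regime

# simulate a short series with order-1 dependence
y = [55.0]
for _ in range(200):
    y.append(sample_next(y[-1], rng))
y = np.array(y)
```

Contrast with the parametric AR(1): there, Fy is always a normal with mean β + αy; here its shape itself varies with y.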


slide-13
SLIDE 13

Dependent Dirichlet Processes (DDP)

Given a set of indices {x : x ∈ X}, MacEachern (1999, 2000) proposed to consider

Gx(·) = Σ_{j=1}^∞ wj(x) δ_{θj(x)}(·),  x ∈ X.

Barrientos et al. (2012) studied the case wj(x) = Vj(x) ∏_{i=1}^{j−1} (1 − Vi(x)), where

  • {Vj(x)}x∈X are i.i.d. stochastic processes (s.p.) such that Vj(x) ∼ Beta(1, Mx) for every x ∈ X (using copulas!);
  • {θj(x)}x∈X are i.i.d. s.p. with θj(x) ∼ G0 (using copulas too!);
  • {Vj(x)} and {θj(x)} vary smoothly with x.

DDP Models: slide 6 of 37
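A minimal numerical sketch of the stick-breaking weights above, at a single fixed index x (so the dependence on x is suppressed): truncate at J terms and check the identity Σ_{j≤J} wj = 1 − ∏_{j≤J}(1 − Vj). The constant mass M and the N(0, 1) baseline G0 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 2.0   # DP mass parameter (taken constant over x here, an assumption)
J = 500   # truncation level for the illustration

# independent Beta(1, M) sticks; in a DDP each V_j(x) would be a
# stochastic process in x with these marginals
V = rng.beta(1.0, M, size=J)
# w_j = V_j * prod_{i<j} (1 - V_i)
w = V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
theta = rng.normal(0.0, 1.0, size=J)   # atoms theta_j ~ G0 = N(0, 1)

# (w, theta) now define a truncated draw G = sum_j w_j delta_{theta_j}
```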


slide-19
SLIDE 19

DDPs (Cont.)

Generic form to construct DDPs: use real-valued i.i.d. Gaussian processes {Zj(x)} and {Uj(x)}, j ≥ 1, with N(0, 1) marginals, say; for instance, a continuous AR(1) process when X = R. Then:

  • define Vj(x) = Bx⁻¹(Φ(Zj(x))), where Bx is the CDF of the Beta(1, Mx) distribution and Φ is the N(0, 1) CDF;
  • define θj(x) = G0⁻¹(Φ(Uj(x)));
  • define Gx(·) = Σ_{j=1}^∞ Vj(x) ∏_{i=1}^{j−1} (1 − Vi(x)) δ_{θj(x)}(·).

Then Gx ∼ DP(Mx, G0) for every x ∈ X.

DDP Models: slide 7 of 37
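The copula-style transform that gives each stick its Beta(1, Mx) marginal can be sketched directly, using the closed-form Beta(1, M) inverse CDF B⁻¹(u) = 1 − (1 − u)^{1/M} and a stationary Gaussian AR(1) path standing in for one process Zj(·) over a grid of x values (the AR(1) correlation ρ and constant M are assumptions):

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def beta1M_inv_cdf(u, M):
    """Inverse CDF of Beta(1, M), whose CDF is B(v) = 1 - (1 - v)^M."""
    return 1.0 - (1.0 - u) ** (1.0 / M)

# stationary Gaussian AR(1) with N(0, 1) marginals over a grid of x values
rho, n_x = 0.9, 50
Z = np.empty(n_x)
Z[0] = rng.normal()
for t in range(1, n_x):
    Z[t] = rho * Z[t - 1] + math.sqrt(1.0 - rho ** 2) * rng.normal()

M = 2.0
# V(x) = B^{-1}(Phi(Z(x))) has Beta(1, M) marginals and varies smoothly in x
V = np.array([beta1M_inv_cdf(phi(z), M) for z in Z])
```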


slide-25
SLIDE 25

DDPs (Cont.)

Particular cases:

  • “single weights”: Vj(x) ≡ Vj for all x ∈ X;
  • “single atoms”: θj(x) ≡ θj for all x ∈ X;
  • “single everything”: Vj(x) ≡ Vj and θj(x) ≡ θj for all x ∈ X ⇒ the usual DP.

Notation: let Θ be the support of the baseline measure, P(Θ) the set of all probability measures supported on Θ, and P(Θ)^X the set of all P(Θ)-valued functions defined on X.

Result

An adequate construction of DDPs implies good properties (Barrientos et al., 2012), in particular full weak support in P(Θ)^X. This is true also for the single-weights and single-atoms models.

DDP Models: slide 8 of 37


slide-31
SLIDE 31

DDPs (Cont.)

We typically want to use the mixture model

fx(· | Gx) = ∫ k(· | θ) dGx(θ)

for some convenient kernel density function k(· | θ) (e.g. a location-scale family).

Result

Under adequate assumptions on k(· | θ), the Hellinger support of {fx : x ∈ X} is ∏_{x∈X} { ∫_Θ k(· | θ) dPx(θ) : Px ∈ P(Θ) }, valid for DDPs and for the single-atoms or single-weights models. It is even possible to obtain large Kullback–Leibler support under further conditions on k(· | θ) (similar to Wu and Ghosal, 2008).

DDP Models: slide 9 of 37
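A sketch of the resulting mixture density for a Gaussian kernel k(· | θ) = N(· | θ, σ²) and a truncated stick-breaking draw (the truncation level, mass M, kernel scale, and G0 = N(0, 4) are all illustrative assumptions); a trapezoidal sum checks that the density integrates to roughly 1:

```python
import numpy as np

rng = np.random.default_rng(3)

# truncated stick-breaking draw G = sum_j w_j delta_{theta_j}
J, M, sigma = 200, 1.0, 0.5
V = rng.beta(1.0, M, size=J)
w = V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
w = w / w.sum()                        # renormalize the truncated weights
theta = rng.normal(0.0, 2.0, size=J)   # atoms theta_j ~ G0 = N(0, 4)

def f(y):
    """Mixture density f(y | G) = sum_j w_j N(y | theta_j, sigma^2)."""
    kern = np.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return float(w @ kern)

# trapezoidal check that f integrates to ~1 over a wide grid
grid = np.linspace(-12.0, 12.0, 2001)
vals = np.array([f(g) for g in grid])
area = float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2.0)
```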


slide-36
SLIDE 36

Some recent references

  • Caron et al. (2008a): linear dynamic models with Dirichlet process mixtures for hidden states and observations.
  • Caron et al. (2008b): propose a stationary sequence of urn models, each marginally following a DPM.
  • Rodríguez and ter Horst (2008): propose time-dependent stick-breaking weights (but focus on the single-weights case) and Markovian dependence in the atoms using a dynamic linear model.
  • Lau and So (2008): propose an infinite mixture of autoregressive models.
  • Fox et al. (2011): propose a modified version of the HDP-HMM of Teh et al. (2006), applied to speaker diarization data, to allow persistence of states in time (i.e., sticky states).
  • Rodríguez and Dunson (2011): propose a probit stick-breaking approach, with atoms defined in terms of a latent Markov random field.
  • Nieto-Barajas et al. (2012): time dependence is introduced in the weights of the stick-breaking representation.

The Model: Some Previous Work slide 11 of 37

slide-37
SLIDE 37

The Model: Continuous Case

Given p ≥ 1, we want a flexible model for Yt | (Yt−1, . . . , Yt−p) = y. We propose, in general,

Yt | (Yt−1, . . . , Yt−p) = y, mt ∼ N(Yt | mt, σ²),  mt ∼ Gy,

where Gy(·) = Σ_{h=1}^∞ wh(y) δ_{θh(y)}(·). Equivalent representation:

Yt | (Yt−1, . . . , Yt−p) = y ∼ Σ_{h≥1} wh(y) N(Yt | θh(y), σ²).

Similar to Müller, West and MacEachern (1997). Different from Mena and Walker (2004), who focus on stationary models with a given stationary distribution.

The Model: The Model: Continuous Case slide 12 of 37
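One transition step of the equivalent representation can be sketched with single weights wh (free of y) and linear atoms θh(y) = βh + αh y; the truncation at H sticks with VH = 1 and the priors on (βh, αh) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# truncated single-weights model: w_h free of y, atoms theta_h(y) = beta_h + alpha_h * y
H, M, sigma = 20, 1.0, 3.0
V = rng.beta(1.0, M, size=H)
V[-1] = 1.0                            # last stick = 1, so the weights sum to 1
w = V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
beta = rng.normal(0.0, 10.0, size=H)   # (beta_h, alpha_h) ~ G0 (assumed priors)
alpha = rng.normal(0.0, 0.5, size=H)

def sample_Yt(y, rng):
    """Draw Y_t | Y_{t-1} = y ~ sum_h w_h N(theta_h(y), sigma^2)."""
    h = rng.choice(len(w), p=w)        # pick a mixture component
    return rng.normal(beta[h] + alpha[h] * y, sigma)

ys = [50.0]
for _ in range(100):
    ys.append(sample_Yt(ys[-1], rng))
```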


slide-43
SLIDE 43

The Model: Continuous Case (cont.)

Example: if p = 1, wh(y) = wh, and θh(y) = βh + αh y, the model can be represented as

p(Yt | Yt−1 = y, (βt, αt), σ²) = N(Yt | βt + αt y, σ²),
(βt, αt) | G ∼ G i.i.d.,  G ∼ DP(M, G0)

(a DP mixture model where the atoms are given by linear trajectories, similar to Lau and So, 2008).

The Model: The Model: Continuous Case slide 13 of 37
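The representation above can be sketched by drawing a truncated G ∼ DP(M, G0), sampling (βt, αt) i.i.d. from G for each t, and simulating the series; because G is discrete, many time points share the same (β, α) atom, i.e. the same linear regime. G0 and the stability bound on α are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# truncated draw G ~ DP(M, G0); atoms are (beta, alpha) pairs
H, M = 50, 1.0
V = rng.beta(1.0, M, size=H)
V[-1] = 1.0
w = V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
atoms = np.column_stack([rng.normal(0.0, 5.0, H),
                         rng.uniform(-0.9, 0.9, H)])   # assumed G0

# (beta_t, alpha_t) | G ~ G i.i.d. induces ties across time points
T, sigma = 300, 1.0
labels = rng.choice(H, size=T, p=w)
y = np.empty(T + 1)
y[0] = 0.0
for t in range(T):
    b, a = atoms[labels[t]]
    y[t + 1] = rng.normal(b + a * y[t], sigma)

n_clusters = len(np.unique(labels))    # typically far fewer than T regimes
```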


slide-45
SLIDE 45

The Model: Continuous Case (cont.)

It may be computationally convenient to consider a truncated version of the model. Redefine the weights as

wh(y) = ∏_{i<h} (1 − Vi(y)) Vh(y),  h = 1, . . . , H,

with Vh(y) as before and VH(y) ≡ 1, which guarantees P(Σ_{h=1}^H wh(y) = 1) = 1 for all y ∈ Y (Ishwaran and James, 2001).

Hierarchical version of the former (linear-atoms case):

Yt | Yt−1 = y, rt = h, {(βj, αj)}, σ² ∼ N(βh + αh y, σ²),
P(rt = h) = wh(y),  (βh, αh) ∼ G0 i.i.d.,  h = 1, . . . , H.

General thought

Despite the great generality of the proposed construction, it is in practice useful to resort to simple and manageable specifications.

The Model: The Model: Continuous Case slide 14 of 37
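The truncation device can be sketched as a small function: the sticks Vh(y) depend on y (here through an assumed logistic form, one of many choices), the last stick is pinned at 1, and the weights then sum exactly to one for every y:

```python
import numpy as np

rng = np.random.default_rng(6)

def truncated_weights(y, eta, H):
    """w_h(y) = V_h(y) * prod_{i<h} (1 - V_i(y)), with V_H(y) = 1.
    Sticks V_h(y) = logistic(eta[h,0] + eta[h,1] * y): an assumed form."""
    V = 1.0 / (1.0 + np.exp(-(eta[:, 0] + eta[:, 1] * y)))
    V[H - 1] = 1.0                     # final stick fixed at 1
    return V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))

H = 10
# small slope scale keeps the logits moderate over the range of y used below
eta = np.column_stack([rng.normal(0.0, 1.0, size=H),
                       rng.normal(0.0, 0.05, size=H)])
w50 = truncated_weights(50.0, eta, H)
w80 = truncated_weights(80.0, eta, H)
```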


slide-49
SLIDE 49

Model for Binary Outcomes

Purpose: to extend the previous constructions to time series of binary outcomes.

Idea: use the previous model on a latent scale. Albert and Chib (1993): introduce a continuous Zt such that Yt = 1 ⟺ Zt > 0 (so that P(Yt = 1) = P(Zt > 0)). The latent sequence {Zt} defines the binary sequence {Yt}. Two options:

1. Consider Zt | (Yt−1, . . . , Yt−p) = y (Markovian of order p!); or
2. Consider Zt | (Zt−1, . . . , Zt−p) = z (can be easily extended to ordinal outcomes).

The Model: The Model: Binary Case slide 15 of 37
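Option 2 can be sketched in its simplest form, with a single linear regime on the latent scale (the full model would mix over regimes h with weights wh(z)); the thresholding Yt = 1{Zt > 0} is exactly the Albert and Chib device. The parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# latent Z_t depends on Z_{t-1}; the observed binary Y_t = 1{Z_t > 0}
T, beta, alpha, sigma = 200, -0.2, 0.7, 1.0
Z = np.empty(T)
Y = np.empty(T, dtype=int)
Z[0] = rng.normal()
Y[0] = int(Z[0] > 0)
for t in range(1, T):
    Z[t] = rng.normal(beta + alpha * Z[t - 1], sigma)   # latent AR(1) step
    Y[t] = int(Z[t] > 0)                                # threshold to binary
```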


slide-55
SLIDE 55

Model for Binary Outcomes (cont.)

“Completely latent” definition: Yt = I{Zt > 0}, with

Zt | (Zt−1, . . . , Zt−p) = z, mt ∼ N(Zt | mt, σ²),  mt ∼ Gz,

where Gz(·) = Σ_{h=1}^∞ wh(z) δ_{θh(z)}(·). The other case is similar. We can adopt the same simplifications as before, i.e. truncation, single weights or atoms, etc.

The Model: The Model: Binary Case slide 16 of 37


slide-59
SLIDE 59

Old Faithful Geyser

Data discussed in Härdle (1991). Available on-line in R. Consider {yt, t = 1, . . . , 272}, where yt is the waiting time until the tth eruption of the geyser.

Data Illustrations: Old Faithful Geyser slide 18 of 37

slide-60
SLIDE 60

Old Faithful Geyser (cont.): yt vs. yt−1

Data Illustrations: Old Faithful Geyser slide 19 of 37

slide-61
SLIDE 61

Old Faithful Geyser (cont.): F̄y = E(Fy | data), AR(1) model, single weights, linear atoms

Density of the posterior mean f̄yt−1(yt) for yt−1 = 50 (left), 65 (center) and 80 (right). Black line: prior σ⁻² ∼ Ga(2, 2); red line: σ² = 25; blue line: kernel estimator.

Data Illustrations: Old Faithful Geyser slide 20 of 37

slide-62
SLIDE 62

Old Faithful Geyser (cont.): F̄y = E(Fy | data), AR(1) model, single weights, linear atoms

Density of the posterior mean f̄yt−1(·) for yt−1 = 85 (blue), with pointwise 95% credibility bands (red) and median (black).

Data Illustrations: Old Faithful Geyser slide 21 of 37

slide-63
SLIDE 63

Old Faithful Geyser (cont.)

Density of the posterior mean f̄yt−1(·) for yt−1 = 85, with M = 1, H = 20 (red); M = 10, H = 20 (orange); M = 1, H = 50 (green); and M = 10, H = 50 (blue).

Data Illustrations: Old Faithful Geyser slide 22 of 37

slide-64
SLIDE 64

Old Faithful Geyser (cont.)

Panels: yt−1 = 50 (left), yt−1 = 65 (center), yt−1 = 80 (right).

Posterior means f̄yt−1(·) under the AR(1)-DDP model with H = ∞ and varying weights wh(y) = Vh(y) ∏_{i<h} (1 − Vi(y)), with Vh(y) = logit⁻¹(ηh1 + ηh2 y).

Data Illustrations: Old Faithful Geyser slide 23 of 37

slide-65
SLIDE 65

Old Faithful Geyser (cont.)

One draw of all the atoms θh, h = 1, . . . , H in the linear case θh(y) = βh + αhy (left) and the quadratic case θh(y) = βh + αhy + γhy2 (right). Colors identify points in the same cluster.

Data Illustrations: Old Faithful Geyser slide 24 of 37

slide-66
SLIDE 66

Bladder Cancer Data

Data from a bladder cancer study carried out by the Veterans Administration Cooperative Urological Research Group, VACURG (Byar et al., 1977; Davis and Wei, 1988; Giardina et al., 2011). Target: compare the efficacy of two treatments (placebo and thiotepa) in preventing bladder cancer recurrence. m = 81 patients with ≤ 12 observations each (3-month periodicity). Two groups: T (thiotepa, 36 patients) and P (placebo, 45 patients). We record an indicator of cancerous tumor recurrence.

yit = 1 if the number of detected tumors at time t increased for patient i, and yit = 0 otherwise, t = 1, . . . , ni, i = 1, 2, . . . , m.

xi = 0 if patient i ∈ group P, and xi = 1 otherwise.

Data Illustrations: Data from Multiple Binary Sequences slide 25 of 37


slide-72
SLIDE 72

Data

Recurrent tumors are removed at each visit, then treatment continues.

[Table: binary recurrence indicators yit, rows = patients 1–45 (group P) and 46–81 (group T), columns = visit times 1–12; most entries are 0 (blank), e.g., patient 45 (P) shows seven recurrences and patient 46 (T) one.]

Data Illustrations: Data from Multiple Binary Sequences slide 26 of 37

slide-73
SLIDE 73

Model: Multiple Binary Sequences with covariates

Yi = (Yi1, . . . , Yini), Zi = (Zi1, . . . , Zini): sequences of responses and latent variables for patient i = 1, . . . , m, with Yit = 1 ⇔ Zit > 0. Latent AR(1) model: the {Zi} are conditionally independent:

Zit | Zi,t−1 = zi,t−1, xi, β0, β1 ∼ ∫R2 N(β0 + β1xi + α1zi,t−1 + α2xizi,t−1, σ2) dG(α1, α2),

G ∼ DP(M, G0)

Latent-Y AR(1) model (Markovian):

Zit | Yi,t−1 = yi,t−1, xi, β0, β1 ∼ ∫R2 N(β0 + β1xi + α1yi,t−1 + α2xiyi,t−1, σ2) dG(α1, α2),

G ∼ DP(M, G0)

σ2 is fixed for identifiability reasons.

Data Illustrations: Data from Multiple Binary Sequences slide 27 of 37
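The latent-Y AR(1) model can be simulated forward to see what it implies for a binary sequence. A minimal Python sketch, using a truncated stick-breaking draw in place of the exact DP, with hypothetical settings for M, G0, β0, β1, µx and σ (the slides fix σ2 for identifiability):

```python
import numpy as np

rng = np.random.default_rng(2)

# Truncated stick-breaking draw of G ~ DP(M, G0); G0 = N2(alpha0, V_alpha).
# M, H, alpha0 = (0, 0) and V_alpha = I are hypothetical settings.
M, H = 1.0, 30
V = rng.beta(1.0, M, size=H)
V[-1] = 1.0                                   # close the truncation
w = V * np.concatenate([[1.0], np.cumprod(1.0 - V[:-1])])
atoms = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=H)

def simulate_sequence(n, x, beta0=-0.2, beta1=-0.1, mu_x=-0.4, sigma=0.5):
    """One binary sequence from the latent-Y AR(1) model: Y_it = 1{Z_it > 0}."""
    z = rng.normal(mu_x, 1.0)                 # initial latent value Z_i1
    y = [int(z > 0)]
    for _ in range(1, n):
        a1, a2 = atoms[rng.choice(H, p=w)]    # (alpha1, alpha2) ~ G
        mean = beta0 + beta1 * x + a1 * y[-1] + a2 * x * y[-1]
        z = rng.normal(mean, sigma)
        y.append(int(z > 0))
    return np.array(y)

y_placebo = simulate_sequence(12, x=0)        # x = 0: group P
y_treated = simulate_sequence(12, x=1)        # x = 1: group T
```

Conditioning on the previous response Yi,t−1 (rather than the latent Zi,t−1) is what makes this version Markovian in the observed binary sequence.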


slide-77
SLIDE 77

Model (cont.)

Models are completed by defining

G0(α) ≡ N2(α; α0, Vα) and α0 ∼ N2(α00, V).

(β0, β1) ∼ N(β0, Vβ).

Initial value for each sequence: Zi1 | xi, µxi ∼ N(µxi, σ1²), i = 1, . . . , m, xi = 0, 1, with a prior such that µ0 = µ1 + D and P(D > 0) = 1.

We also consider a simplified version without the interaction term (the 3P model).

Data Illustrations: Data from Multiple Binary Sequences slide 28 of 37
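The constraint µ0 = µ1 + D with P(D > 0) = 1 can be enforced constructively when sampling from the prior. A minimal Python sketch; the half-normal choice for D and all numeric settings are hypothetical, since the slides only state the positivity constraint:

```python
import numpy as np

rng = np.random.default_rng(3)

def draw_initial_means(mu1_mean=-0.4, mu1_sd=0.5, d_scale=0.2, size=1000):
    """Sample (mu0, mu1) with mu0 = mu1 + D and D > 0 almost surely.
    D gets a half-normal prior here; the model only requires P(D > 0) = 1."""
    mu1 = rng.normal(mu1_mean, mu1_sd, size)
    D = np.abs(rng.normal(0.0, d_scale, size))   # half-normal, so D >= 0
    return mu1 + D, mu1

mu0, mu1 = draw_initial_means()
```

Parameterizing (µ1, D) instead of (µ0, µ1) builds the ordering µ0 ≥ µ1 into every prior draw, so placebo sequences start with a stochastically larger latent mean.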


slide-79
SLIDE 79

Results - Latent-Y AR(1) Model

Posterior summaries, mean (sd):

Parameter | M = 1, 3P | M = 1, 4P | M ∼ U(0.5, 10), 4P | M ∼ trunc-IG(2, 2), 4P
β0 | −0.2171 (0.0410) | −0.2221 (0.0439) | −0.2206 (0.0433) | −0.2207 (0.0429)
β1 | −0.1348 (0.0749) | −0.1547 (0.1299) | −0.1301 (0.1038) | −0.1286 (0.0995)
α01 | 0.0798 (3.1894) | 0.3576 (0.9326) | 0.4703 (0.9552) | 0.4128 (0.9386)
α02 | – | −0.2642 (0.9937) | −0.1596 (0.9635) | −0.1969 (0.9562)
µ1 | −0.4275 (0.0890) | −0.4240 (0.0876) | −0.4252 (0.0883) | −0.4249 (0.0882)
D | 0.1475 (0.0811) | 0.1483 (0.0816) | 0.1482 (0.0815) | 0.1465 (0.0809)
K | 4.0524 (1.5484) | 4.2164 (1.6007) | 3.7666 (1.6754) | 4.2758 (1.6719)
M | – | – | 0.8411 (0.3331) | 1.1115 (0.2748)

3P and 4P models; σ2 = 0.25, H = 30.

Data Illustrations: Data from Multiple Binary Sequences slide 29 of 37

slide-80
SLIDE 80

Results - Latent-Y AR(1) Model (cont.)

H = 30 and M = 1, for models 4P (continuous) and 3P (segments).

Data Illustrations: Data from Multiple Binary Sequences slide 30 of 37

slide-81
SLIDE 81

Results - Latent AR(1) Model

Posterior summaries, mean (sd):

Parameter | M = 1 | M ∼ U(0.5, 10) | M ∼ trunc-IG(2, 2)
β0 | −1.0797 (0.0881) | −1.0818 (0.0891) | −1.0816 (0.0891)
β1 | −0.4039 (0.1483) | −0.4009 (0.1532) | −0.4007 (0.1497)
α01 | 0.8921 (0.9371) | 0.8870 (0.9370) | 0.8851 (0.9219)
α02 | 0.2114 (0.9766) | 0.2234 (0.9521) | 0.2136 (0.9411)
µ1 | −0.7454 (0.1656) | −0.7479 (0.1675) | −0.7465 (0.1667)
D | 0.2143 (0.1361) | 0.2173 (0.1376) | 0.2157 (0.1373)
K | 4.3454 (1.6996) | 3.9334 (1.8607) | 4.8270 (2.0100)
M | – | 0.8615 (0.3582) | 1.1450 (0.3103)

Case H = 30 and σ2 = 1.

Data Illustrations: Data from Multiple Binary Sequences slide 31 of 37

slide-82
SLIDE 82

Results - Latent AR(1) Model

Case H = 30 and M = 1, for σ2 = 1.

Data Illustrations: Data from Multiple Binary Sequences slide 32 of 37

slide-83
SLIDE 83

Comparison of predictions for both models (4P case)

Prediction for a new P and T patient.

Data Illustrations: Data from Multiple Binary Sequences slide 33 of 37

slide-84
SLIDE 84

Comparison of predictions (cont.)

Data Illustrations: Data from Multiple Binary Sequences slide 34 of 37

slide-85
SLIDE 85

Outline

1

Motivation

2

DDP Models

3

The Model: Some Previous Work; The Model: Continuous Case; The Model: Binary Case

4

Data Illustrations: Old Faithful Geyser; Data from Multiple Binary Sequences

5

Final Comments

Final Comments: slide 35 of 37

slide-86
SLIDE 86

Final Comments

We presented a flexible autoregressive model for both continuous and binary/ordinal data. The model is characterized as an infinite/finite mixture of autoregressive terms with a fixed number of lags. Some possible extensions (future research):

multivariate model formulation; estimating the number of lags (that is, making it random!); studying further properties of these autoregressive models.

Final Comments: slide 36 of 37


slide-89
SLIDE 89

¡MUCHAS GRACIAS! THANKS!

More at http://www.mat.puc.cl/~quintana.

Final Comments: slide 37 of 37