SLIDE 1 Multivariate and Partially Observed Models
Erik Lindström
SLIDE 4
Briefly on multivariate models
Consider the Vector-AR(1) (VAR) process:
X_{n+1} = A X_n + \varepsilon_{n+1}, \quad \varepsilon \sim MVN(0, \Sigma) (1)
Can we estimate the parameters? Yes, with a bit of matrix algebra tricks. Writing down the log-likelihood... [checking the Matrix Cookbook] leads to
\hat{A} = \Big( \sum_{n=1}^{N-1} X_{n+1} X_n^T \Big) \Big( \sum_{n=1}^{N-1} X_n X_n^T \Big)^{-1} (2)
\hat{\Sigma} = \frac{1}{N-1} \sum_{n=1}^{N-1} \hat{\varepsilon}_n \hat{\varepsilon}_n^T (3)
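To make the matrix algebra concrete, here is a minimal numerical sketch (Python/NumPy is an assumption here, not the course's tooling) that simulates a VAR(1) process and recovers A and Σ via eqs. (2)-(3):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a 2-dimensional VAR(1) process: X_{n+1} = A X_n + eps_{n+1}
A_true = np.array([[0.8, 0.1],
                   [0.0, 0.5]])
Sigma_true = np.array([[0.5, 0.1],
                       [0.1, 0.3]])
N = 10_000
X = np.zeros((N, 2))
L = np.linalg.cholesky(Sigma_true)
for n in range(N - 1):
    X[n + 1] = A_true @ X[n] + L @ rng.standard_normal(2)

# Closed-form ML estimates, eqs. (2)-(3)
X0, X1 = X[:-1], X[1:]                     # X_n and X_{n+1}
A_hat = (X1.T @ X0) @ np.linalg.inv(X0.T @ X0)
resid = X1 - X0 @ A_hat.T                  # eps_hat_n = X_{n+1} - A_hat X_n
Sigma_hat = resid.T @ resid / (N - 1)
print(A_hat, Sigma_hat, sep="\n")
```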
SLIDE 9
Motivation, partially observed models
◮ Used when regressors are unobservable
◮ E.g. when the regressor dimension is larger than the observable state dimension (think stoch. vol.)
◮ or interest rate models
◮ or credit models (hidden jump intensity process)
◮ Missing observations can be treated in this framework.
SLIDE 11
Examples
◮ Stochastic volatility
y_t = \sigma_t \eta_t (4)
\log \sigma_t^2 = a_0 + a_1 \log \sigma_{t-1}^2 + e_t (5)
◮ Short rate models
dr_t = \alpha(\beta - r_t)\,dt + \sqrt{\gamma + \delta r_t}\,dW_t (6)
P(t, T) = A(t, T)\, e^{-B(t,T) r_t} (7)
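The bond price formula (7) is explicit once the affine coefficients are known. As an illustration, a sketch for the Vasicek special case of (6) (δ = 0, so the diffusion is a constant σ = √γ), where A and B have classical closed forms:

```python
import numpy as np

def vasicek_bond_price(r_t, tau, alpha, beta, sigma):
    """Zero-coupon bond price P(t,T) = A(t,T) exp(-B(t,T) r_t) in the
    Vasicek model dr = alpha*(beta - r) dt + sigma dW, with tau = T - t."""
    B = (1.0 - np.exp(-alpha * tau)) / alpha
    A = np.exp((B - tau) * (alpha**2 * beta - sigma**2 / 2) / alpha**2
               - sigma**2 * B**2 / (4 * alpha))
    return A * np.exp(-B * r_t)

# Illustrative parameter values, not from the slides
print(vasicek_bond_price(r_t=0.03, tau=5.0, alpha=0.5, beta=0.04, sigma=0.01))
```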
SLIDE 15
Stoch vol.
Let us start with the stoch vol. model:
y_t = \sigma_t \eta_t (8)
\log \sigma_t^2 = a_0 + a_1 \log \sigma_{t-1}^2 + e_t (9)
◮ \sigma_t^2 is not directly observable
◮ but can be estimated.
◮ Likelihood p(y_1, \ldots, y_T) = \int p(\sigma_1, y_1, \ldots, \sigma_T, y_T) \, d\sigma_{1:T}?
◮ Dependence structure?
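A short simulation sketch of (8)-(9) makes the structure visible (parameter values are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate the log-variance AR(1) and the observations, eqs. (8)-(9)
a0, a1, sigma_e = -0.5, 0.95, 0.2   # illustrative values
T = 1_000
log_s2 = np.zeros(T)
log_s2[0] = a0 / (1 - a1)           # start in the stationary mean
for t in range(1, T):
    log_s2[t] = a0 + a1 * log_s2[t - 1] + sigma_e * rng.standard_normal()
y = np.exp(log_s2 / 2) * rng.standard_normal(T)   # y_t = sigma_t * eta_t
```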
SLIDE 16
General state space models
All models we use can be written in general state space form:
y_t = h(x_t, \eta_t) (10)
x_t = f(x_{t-1}, e_t) (11)
◮ x is a hidden (unobservable) Markov process (cf. HMM)
◮ y is observed.
◮ y_t|x_t is independent of y_s, s = 1..t-1, t+1..T.
◮ These rather simple structures can generate complex models!
SLIDE 17
Structure
All models we use can be written in state space form:
y_t = h(x_t, \eta_t) (12)
x_t = f(x_{t-1}, e_t) (13)
◮ These equations imply transition probabilities, i.e. we can derive p(x_t|x_{t-1}) and p(y_t|x_t) from the model setup.
◮ We also need p(x_0), i.e. initial conditions.
Likelihood
The likelihood can be written as p(y1, . . . , yT) = p(y1)
T
∏
t=2
p(yt|y1:t−1), where y1:t−1 is shorthand notation for {y1, . . . , yt−1}. We can write p(yt|y1:t−1) = ∫ p(yt|xt)p(xt|y1:t−1)dxt and p(xt|y1:t−1) = ∫ p(xt|xt−1)p(xt−1|y1:t−1)dxt−1
SLIDE 21 Filter density
◮ The density for the hidden state x_t, using the information y_{1:t}, is called the filter density, p(x_t|y_{1:t}).
◮ We can derive the filter density from
p(x_t|y_{1:t}) = \frac{p(y_t|x_t) \, p(x_t|y_{1:t-1})}{p(y_t|y_{1:t-1})} = \frac{p(y_t|x_t) \, p(x_t|y_{1:t-1})}{\int p(y_t|x_t) \, p(x_t|y_{1:t-1}) \, dx_t}.
SLIDE 23
Predictive density
◮ The density for the hidden state x_t, using the information y_{1:t-1}, is called the predictive density, p(x_t|y_{1:t-1}).
◮ We can derive the predictive density from
p(x_t|y_{1:t-1}) = \int p(x_t|x_{t-1}) \, p(x_{t-1}|y_{1:t-1}) \, dx_{t-1}.
SLIDE 24 Recursion
1. We have the filter density p(x_0) at time 0.
2. At time t, generate the predictive density p(x_{t+1}|y_{1:t}).
3. At time t + 1, calculate p(y_{t+1}|y_{1:t}) and update the filter density p(x_{t+1}|y_{1:t+1}). Repeat from step 2.
SLIDE 25
Working recursions
Essentially there are only two recursions known in closed form:
◮ HMM (finite state space)
◮ Kalman filter (linear, Gaussian models)
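For the finite state space case the recursion reduces to matrix-vector arithmetic. A minimal sketch (the two-state example at the end is purely illustrative):

```python
import numpy as np

def hmm_filter(y, P_trans, emission, p0):
    """Exact filter recursion on a finite state space.
    P_trans[i, j] = p(x_t = j | x_{t-1} = i),
    emission(y_t)[j] = p(y_t | x_t = j)."""
    alpha = p0.copy()
    loglik = 0.0
    for y_t in y:
        alpha = alpha @ P_trans            # predictive density p(x_t | y_{1:t-1})
        alpha *= emission(y_t)             # times likelihood p(y_t | x_t)
        c = alpha.sum()                    # normalization = p(y_t | y_{1:t-1})
        loglik += np.log(c)
        alpha /= c                         # filter density p(x_t | y_{1:t})
    return alpha, loglik

# Two-state example with Gaussian emissions of different means
means = np.array([-1.0, 1.0])
emission = lambda y_t: np.exp(-0.5 * (y_t - means)**2) / np.sqrt(2 * np.pi)
P = np.array([[0.95, 0.05], [0.10, 0.90]])
alpha, ll = hmm_filter(np.array([0.3, 1.2, -0.8]), P, emission, np.array([0.5, 0.5]))
```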
SLIDE 27
Kalman filter
Why does it give closed form recursions? Short answer: the Gaussian density is the exponential of a second order polynomial.
Model:
Y_t = C X_t + \eta_t, \quad \eta_t \in N(0, \Gamma)
X_t = A X_{t-1} + e_t, \quad e_t \in N(0, \Sigma)
[1] Assume initial distribution p(x_0|\mathcal{F}_0) = \phi(x_0; m_0, P_0).
SLIDE 29 Kalman filter
[2] Calculate the predictive density
p(x_1|\mathcal{F}_0) = \int p(x_1|x_0) \, p(x_0|\mathcal{F}_0) \, dx_0.
Here p(x_1|x_0) = \phi(x_1; A x_0, \Sigma), thus giving
\propto \int e^{-\frac{1}{2}(x_1 - A x_0)^T \Sigma^{-1} (x_1 - A x_0)} \, e^{-\frac{1}{2}(x_0 - m_0)^T P_0^{-1} (x_0 - m_0)} \, dx_0.
Some calculations give
= \phi(x_1; A m_0, A P_0 A^T + \Sigma).
SLIDE 31
Kalman filter
[3] The filter density is more complicated. We have
p(x_t|y_{1:t}) = \frac{p(y_t|x_t) \, p(x_t|y_{1:t-1})}{\int p(y_t|x_t) \, p(x_t|y_{1:t-1}) \, dx_t}.
Thus
p(x_1|y_1) = \frac{p(y_1|x_1) \, p(x_1|\mathcal{F}_0)}{p(y_1|\mathcal{F}_0)} \propto p(y_1|x_1) \, p(x_1|\mathcal{F}_0).
Note that the likelihood is a normalization, and independent of x_1.
SLIDE 32
Kalman filter
Tedious calculations give
p(x_1|y_1) = \phi(x_1;\ m_{1|0} + K_1(y_1 - C m_{1|0}),\ P_{1|0} - K_1 C P_{1|0}),
where, with var(\eta) = \Gamma,
m_{1|0} = A m_0
P_{1|0} = A P_0 A^T + \Sigma
\Omega = C P_{1|0} C^T + \Gamma
K_1 = P_{1|0} C^T \Omega^{-1}
Still Gaussian!
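The same recursion written as code: one predict/update cycle of the Kalman filter, returning the innovation log-likelihood contribution as a by-product (a minimal sketch following the notation above):

```python
import numpy as np

def kalman_step(m, P, y, A, C, Sigma, Gamma):
    """One predict/update cycle of the Kalman filter.
    Returns the filter mean/covariance and log p(y_t | y_{1:t-1})."""
    # Prediction: p(x_t | y_{1:t-1}) = N(m_pred, P_pred)
    m_pred = A @ m
    P_pred = A @ P @ A.T + Sigma
    # Update: innovation v, innovation covariance Omega, Kalman gain K
    v = y - C @ m_pred
    Omega = C @ P_pred @ C.T + Gamma
    K = P_pred @ C.T @ np.linalg.inv(Omega)
    m_new = m_pred + K @ v
    P_new = P_pred - K @ C @ P_pred
    # Innovation likelihood contribution, log N(v; 0, Omega)
    d = len(v)
    loglik = -0.5 * (d * np.log(2 * np.pi)
                     + np.log(np.linalg.det(Omega))
                     + v @ np.linalg.solve(Omega, v))
    return m_new, P_new, loglik
```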
SLIDE 33
Non-linear models
Approximate non-linear models with a local linear model:
◮ EKF
◮ Sigma point filters / Unscented Kalman Filter (UKF)
◮ Ensemble filters (EnKF)
Or use Monte Carlo methods (we will cover these later):
◮ Particle filters
◮ Hybrid particle filters
SLIDE 34
Extended Kalman filter (EKF)
Use Taylor expansions to approximate the non-linear model with a linear model.
y_{t+1} = h(x_{t+1}, \eta_{t+1}) (14)
x_{t+1} = f(x_t, e_{t+1}) (15)
◮ Prediction
m_{t+1|t} = f(m_{t|t}, 0),
P_{t+1|t} = F_t P_{t|t} F_t^T + \Sigma,
F_t = f'_x(m_{t|t}, 0).
SLIDE 35
Extended Kalman filter (EKF)
While the filtering is given by:
◮ Filtering
m_{t+1|t+1} = m_{t+1|t} + K_t (y_{t+1} - h(m_{t+1|t}, 0)),
P_{t+1|t+1} = P_{t+1|t} - K_t H_t P_{t+1|t},
where
\Omega_t = H_t P_{t+1|t} H_t^T + \Gamma,
K_t = P_{t+1|t} H_t^T \Omega_t^{-1},
H_t = h'_x(m_{t+1|t}, 0).
Even better are Iterated EKFs, obtained by reinterpreting the problem as an optimization of the log-posterior.
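As a sketch, one EKF cycle under the simplifying assumption of additive noise (the Jacobians F_t and H_t are supplied by the caller; this mirrors the equations above but is not the slides' own code):

```python
import numpy as np

def ekf_step(m, P, y, f, h, F_jac, H_jac, Sigma, Gamma):
    """One EKF cycle for x_t = f(x_{t-1}) + e_t, y_t = h(x_t) + eta_t
    (additive noise for simplicity)."""
    # Prediction, linearizing f around the current filter mean
    F = F_jac(m)
    m_pred = f(m)
    P_pred = F @ P @ F.T + Sigma
    # Update, linearizing h around the predicted mean
    H = H_jac(m_pred)
    Omega = H @ P_pred @ H.T + Gamma
    K = P_pred @ H.T @ np.linalg.inv(Omega)
    m_new = m_pred + K @ (y - h(m_pred))
    P_new = P_pred - K @ H @ P_pred
    return m_new, P_new
```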
SLIDE 39
Other filters
Alternatives include:
◮ Iterated Kalman filters improve the quality of the linearizations.
◮ Sigma point filters (UKF) use cleverly sampled points to derive the first and second moment.
◮ Ensemble filters (EnKF) use Monte Carlo methods to approximate the first and second central moments.
◮ These are in general more accurate than the EKF.
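A sketch of the sigma-point construction at the heart of the UKF (the scaling parameters and the test non-linearity are illustrative choices; the defaults follow the convention in Särkkä, 2013):

```python
import numpy as np

def sigma_points(m, P, alpha=1.0, beta=0.0, kappa=None):
    """Generate the 2n+1 unscented sigma points and their weights."""
    n = len(m)
    if kappa is None:
        kappa = 3.0 - n                        # classic choice
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)      # matrix square root
    pts = np.vstack([m, m + S.T, m - S.T])     # shape (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    return pts, wm, wc

# Propagate through a non-linearity g and re-estimate the moments
m, P = np.array([0.0, 1.0]), 0.1 * np.eye(2)
pts, wm, wc = sigma_points(m, P)
g = lambda x: np.sin(x)                        # illustrative non-linearity
Y = g(pts)
mean_y = wm @ Y
cov_y = (wc[:, None] * (Y - mean_y)).T @ (Y - mean_y)
```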
SLIDE 41
Case: Calibration of options
The most common method for calibrating options to market data today is some non-linear weighted least squares estimator
\theta = \arg\min_\theta \sum_i \lambda_i \left( c^{Market}_t(S_i, K_i, r_i, \tau_i) - c^{Model}_t(S_i, K_i, r_i, \tau_i; \theta) \right)^2 (16)
where c^{Market}(S_i, K_i, r_i, \tau_i) are the market prices that depend on the underlying asset S_i, strike level K_i, interest rate r_i and time to maturity \tau_i, and \lambda_i are weights. There are two main (implicitly related) problems with this approach:
◮ The parameter estimates are noisy,
◮ Old data is typically discarded, as only the most recent data is used.
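In code, eq. (16) is a standard weighted least squares fit. A sketch using scipy.optimize.least_squares with a deliberately toy model_price stand-in (a real application would plug in e.g. a Heston or Bates pricer; the weights enter via their square roots):

```python
import numpy as np
from scipy.optimize import least_squares

def model_price(S, K, r, tau, theta):
    # Hypothetical one-parameter stand-in for c^Model(S, K, r, tau; theta)
    sigma = theta[0]
    return np.maximum(S - K * np.exp(-r * tau), 0) + sigma * np.sqrt(tau)

def wls_residuals(theta, quotes, lam):
    # One weighted residual per quote; least_squares minimizes
    # the sum of their squares, i.e. eq. (16)
    return [np.sqrt(l) * (c_mkt - model_price(S, K, r, tau, theta))
            for (S, K, r, tau, c_mkt), l in zip(quotes, lam)]

quotes = [(100, 95, 0.02, 0.5, 8.1), (100, 105, 0.02, 0.5, 2.3)]
lam = [1.0, 1.0]
fit = least_squares(wls_residuals, x0=[0.2], args=(quotes, lam))
print(fit.x)
```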
SLIDE 42 Simulated data from the Bates (1996) model
[Figure: eight panels showing parameter trajectories over time for κ, ξ, σV, ρ, Vt, λ, µ and δ.]
Figure: WLS estimates (red circles) and the true parameters (solid blue line). The WLS works most of the time...
SLIDE 43
Calibration through filtering
(Lindström et al., 2008) rewrite the calibration problem as a filtering problem, augmenting the latent states with the parameter vector:
c^{Market}_t(S_n, K_i, r_i, \tau_i) = c^{Model}_t(S_n, K_i, r_i, \tau_i; \theta_n) + \eta_n, (17)
\theta_n = \theta_{n-1} + e_n. (18)
This decomposes the change of the option prices into changes in the underlying state variables (i.e. the index level), changes in the parameters (which are captured by the random walk dynamics) and pure noise due to the bid-ask spread.
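A sketch of the resulting filter on the parameter dynamics (17)-(18): the dynamics are the identity, so prediction only inflates the covariance, and the measurement update uses a numerically differentiated pricer (price_fn is a hypothetical stand-in mapping θ to the day's vector of model prices; Q, R are the noise covariances of e_n and η_n):

```python
import numpy as np

def track_parameters(theta0, P0, observations, price_fn, Q, R, eps=1e-5):
    """EKF on theta_n = theta_{n-1} + e_n, c_n = price_fn(theta_n) + eta_n."""
    theta, P = theta0.copy(), P0.copy()
    for c_mkt in observations:
        # Prediction: identity dynamics, only the covariance grows
        P = P + Q
        # Numerical Jacobian of the pricer w.r.t. the parameters
        base = price_fn(theta)
        H = np.column_stack([
            (price_fn(theta + eps * np.eye(len(theta))[j]) - base) / eps
            for j in range(len(theta))])
        # Standard EKF measurement update
        Omega = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(Omega)
        theta = theta + K @ (c_mkt - base)
        P = P - K @ H @ P
        yield theta.copy()
```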
SLIDE 44 Same graph with data from the Bates (1996) model
[Figure: eight panels showing parameter trajectories over time for κ, ξ, σV, ρ, Vt, λ, µ and δ.]
Figure: WLS estimates (red circles), the true parameters (solid blue line) and filter estimates (black dots).
SLIDE 46
Filters
We have successfully used
◮ The EKF (not optimal though...)
◮ The Iterated EKF (Lindström et al., 2008)
◮ The UKF (Wiktorsson & Lindström, 2014)
◮ The EnKF (Lindström & Guo, 2013)
◮ Tuned UKF (Lindström & Åkerlindh, 2016)
However, (simple) Monte Carlo filters do not work very well for this problem.
SLIDE 49
Quadratic Hedging for free!
◮ The quadratic hedging problem was given by
(\hat{\alpha}, \hat{\beta}) = \arg\min E\left[ \left( \pi(S(T)) - \alpha S(T) - \beta B(T) \right)^2 \,\middle|\, \mathcal{F}(0) \right]. (19)
◮ The solution to this problem is given by the means, variances and covariances between these assets.
◮ Most/all of these are computed in the filter (Lindström & Guo, 2013)!
SLIDE 52
Proof
Check the derivations! Example (EKF):
h(m_{t+1|t}, 0): predicted mean (20)
\Omega_t: covariance of the assets (21)
This works if the model is modified by also including the underlying asset S in the model:
c^{Market}_t(S_n, K_i, r_i, \tau_i) = c^{Model}_t(S_n, K_i, r_i, \tau_i; \theta_n) + \eta^{(c)}_n, (22)
S^{Market}_t = S_n + \eta^{(S)}_n, (23)
\theta_n = \theta_{n-1} + e_n. (24)
SLIDE 56
Parameter estimation
Essentially two options (there are more...)
◮ Direct maximization of the log-likelihood function
◮ The EM-algorithm
The EM-algorithm consists of two steps:
◮ Compute the expectation of the intermediate quantity
Q(\theta|\theta_m) = E\left[ \log p_\theta(X, Y) \,|\, Y, \theta_m \right] (25)
◮ Maximize Q(\theta|\theta_m) according to
\theta_{m+1} = \arg\max_\theta Q(\theta|\theta_m) (26)
SLIDE 57
EM cont.
Note that the intermediate quantity can be written as
Q(\theta|\theta_m) = E\left[ \log p_\theta(X, Y) \,|\, Y, \theta_m \right] (27)
= E\Big[ \sum_{n=1}^{N} \log p_\theta(X_n|X_{n-1}) + \log p_\theta(Y_n|X_n) \,\Big|\, Y, \theta_m \Big] (28)
The solution is then almost given by the VAR(1) estimates.
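To see why: for a linear Gaussian state equation X_n = A X_{n-1} + e_n (an assumption matching the VAR slide, not spelled out here), maximizing (28) over A gives the same normal equations as eq. (2), only with the sample products replaced by smoothed expectations delivered by a (Kalman) smoother:

```latex
\hat{A}_{m+1} =
  \Big( \sum_{n=1}^{N} \mathbb{E}\big[ X_n X_{n-1}^T \mid Y, \theta_m \big] \Big)
  \Big( \sum_{n=1}^{N} \mathbb{E}\big[ X_{n-1} X_{n-1}^T \mid Y, \theta_m \big] \Big)^{-1}
```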
SLIDE 58
Some references
◮ Särkkä, S. (2013). Bayesian Filtering and Smoothing (Vol. 3). Cambridge University Press. https://users.aalto.fi/~ssarkka/
◮ Lindström, E., Ströjby, J., Brodén, M., Wiktorsson, M. & Holst, J. (2008). Sequential Calibration of Options. Computational Statistics and Data Analysis, 52, 2877-2891.
◮ Lindström, E. & Guo, J. (2013). Simultaneous Calibration and Hedging. Quantitative and Qualitative Analysis in Social Sciences, 7(1).
◮ Lindström, E. & Åkerlindh, C. (2016). Optimal Adaptive Sequential Calibration of Option Models. Forthcoming in Springer Handbook of Recent Advances in Commodity and Financial Modelling: Quantitative Methods in Banking, Finance, Insurance, Energy and Commodity Markets.