SLIDE 1 Lecture on Parameter Estimation for Stochastic Differential Equations
Erik Lindström
SLIDE 2
Recap
◮ We are interested in the parameters θ in the Stochastic Integral Equation
X(t) = X(0) + ∫₀ᵗ µθ(s, X(s)) ds + ∫₀ᵗ σθ(s, X(s)) dW(s)   (1)
Why?
◮ Model validation
◮ Risk management
◮ Advanced hedging (Greeks 9.2.2 and quadratic hedging 9.2.2.1 (P/Q))
SLIDE 6
Some asymptotics
Consider the arithmetic Brownian motion
dX(t) = µ dt + σ dW(t)   (2)
The drift is estimated by computing the mean and compensating for the sampling interval δ = tn+1 − tn:
ˆµ = (1/(δN)) ∑_{n=0}^{N−1} (X(tn+1) − X(tn)).   (3)
Expanding this expression reveals that the MLE is given by
ˆµ = (X(tN) − X(t0))/(tN − t0) = µ + σ (W(tN) − W(t0))/(tN − t0).   (4)
The MLE for the diffusion (σ) parameter is given by
ˆσ² = (1/(δ(N − 1))) ∑_{n=0}^{N−1} (X(tn+1) − X(tn) − ˆµδ)²  →d  σ² χ²(N − 1)/(N − 1).   (5)
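As a numerical check of the estimators (3)-(5), they can be evaluated on a simulated path of the arithmetic Brownian motion; a minimal sketch (the parameter values are our own choice, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate arithmetic Brownian motion dX = mu dt + sigma dW on a regular grid.
mu, sigma = 0.5, 2.0
delta, N = 0.01, 100_000
increments = mu * delta + sigma * rng.normal(0.0, np.sqrt(delta), N)
X = np.concatenate(([0.0], np.cumsum(increments)))   # X(t_0), ..., X(t_N)

# Drift MLE, eq. (3): mean of the increments, compensated for the sampling.
mu_hat = np.sum(increments) / (delta * N)

# Eq. (4): the drift MLE depends on the data only through the endpoints.
assert np.isclose(mu_hat, (X[-1] - X[0]) / (delta * N))

# Diffusion MLE, eq. (5): approximately sigma^2 * chi2(N-1)/(N-1).
sigma2_hat = np.sum((increments - mu_hat * delta) ** 2) / (delta * (N - 1))
```

Note that, consistent with eq. (4), taking more samples on a fixed horizon does not improve the drift estimate; only the diffusion estimate benefits from high-frequency sampling.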
SLIDE 10
A simple method
Many data sets are sampled at high frequency, making the bias due to discretization of the SDE by some of the schemes in Chapter 12 acceptable. The simplest discretization, the explicit Euler method, would for the stochastic differential equation
dX(t) = µ(t, X(t)) dt + σ(t, X(t)) dW(t)   (6)
correspond to the Discretized Maximum Likelihood (DML) estimator given by
ˆθDML = arg max_{θ∈Θ} ∑_{n=1}^{N−1} log φ(X(tn+1), X(tn) + µ(tn, X(tn))∆, Σ(tn, X(tn))∆)   (7)
where φ(x, m, P) is the density of a multivariate Normal distribution with argument x, mean m and covariance P, and
Σ(t, X(t)) = σ(t, X(t)) σ(t, X(t))ᵀ.   (8)
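A minimal sketch of the DML estimator (7). The model used here, an Ornstein-Uhlenbeck process, and all parameter values are our own illustrative choices, not part of the slides:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Illustrative model (our choice): Ornstein-Uhlenbeck, dX = -theta X dt + sigma dW,
# i.e. mu(t, x) = -theta x and Sigma(t, x) = sigma^2 in eqs. (6)-(8).
theta_true, sigma_true, dt, N = 1.0, 0.5, 0.01, 20_000
X = np.empty(N + 1)
X[0] = 1.0
m_fac = np.exp(-theta_true * dt)                    # exact OU transition moments
v_ou = sigma_true**2 * (1 - np.exp(-2 * theta_true * dt)) / (2 * theta_true)
for n in range(N):
    X[n + 1] = rng.normal(X[n] * m_fac, np.sqrt(v_ou))

def neg_dml(params):
    """Negative Euler (DML) log likelihood of eq. (7): Gaussian density with
    mean X_n + mu(t_n, X_n) dt and variance Sigma(t_n, X_n) dt."""
    theta, sigma = params
    if sigma <= 0.0:
        return np.inf
    resid = X[1:] - (X[:-1] - theta * X[:-1] * dt)
    var = sigma**2 * dt
    return 0.5 * np.sum(np.log(2 * np.pi * var) + resid**2 / var)

theta_hat, sigma_hat = minimize(neg_dml, x0=[0.5, 1.0], method="Nelder-Mead").x
```

With ∆ = 0.01 the discretization bias is modest here, but the next slide explains why this cannot be relied upon in general.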
SLIDE 12
Consistency
◮ The DMLE is generally NOT consistent.
◮ Approximate ML estimators (13.5) are, provided enough computational resources are allocated:
◮ Simulation based estimators
◮ Fokker-Planck based estimators
◮ Series expansions.
◮ GMM-type estimators (13.6) are consistent if the moments are correctly specified (which is a non-trivial problem!)
SLIDE 15
Simulation based estimators
◮ Discretely observed SDEs are Markov processes
◮ Then it follows that
pθ(xt|xs) = Eθ[pθ(xt|xτ)|F(s)],  t > τ > s   (9)
This is the Pedersen algorithm.
◮ Improved by Durham & Gallant (2002), Lindström (2012) and Whitaker et al. (2016), among others.
◮ Works very well for multivariate models!
◮ and is easily (...) extended to Lévy driven SDEs.
SLIDE 19
Some key points
◮ A naive implementation only provides a pointwise estimate; use CRNs or importance sampling
◮ Variance reduction helps (antithetic variates, control variates)
◮ The near optimal importance sampler is a bridge process, as it reduces variance AND improves the asymptotics.
◮ There is a version that is completely bias free, albeit somewhat restrictive in terms of the class of models.
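A naive (pointwise) Monte Carlo evaluation of eq. (9) can be sketched as follows. Arithmetic Brownian motion is used so the estimate can be compared with the exact transition density; the model and all tuning constants are our own choices:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

# Pedersen's estimator of p(x_t | x_s), eq. (9): simulate M Euler paths from
# x_s up to an intermediate time tau = t - h, then average the one-step Euler
# (Gaussian) density from X(tau) to x_t.
mu, sigma = 0.2, 1.0
x_s, x_t, t = 0.0, 0.5, 1.0
K, M = 8, 100_000                 # K Euler sub-steps, M simulated paths
h = t / K

x = np.full(M, x_s)
for _ in range(K - 1):            # Euler steps up to tau = t - h
    x += mu * h + sigma * np.sqrt(h) * rng.normal(size=M)

p_hat = norm.pdf(x_t, loc=x + mu * h, scale=sigma * np.sqrt(h)).mean()
p_exact = norm.pdf(x_t, loc=x_s + mu * t, scale=sigma * np.sqrt(t))
```

Rerunning this for a different θ with fresh random numbers gives a noisy, non-smooth likelihood surface, which is exactly why common random numbers or importance sampling are needed in practice.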
SLIDE 20
Fokker-Planck
Consider the expectation
E[h(X(t))|F(0)] = ∫ h(x(t)) p(x(t)|x(0)) dx(t)   (10)
and then
∂/∂t E[h(X(t))|F(0)]   (11)
There are two possible ways to compute this: directly, and using the Itô formula. Equating these yields
∂p/∂t (x(t)|x(0)) = A⋆ p(x(t)|x(0))   (12)
where
A⋆ p(x(t)) = −∂/∂x(t) (µ(·) p(x(t))) + (1/2) ∂²/∂x²(t) (σ²(·) p(x(t))).   (13)
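For constant µ and σ (our choice of example), eqs. (12)-(13) can be solved with a basic explicit finite-difference scheme and compared against the known Gaussian solution; the grid, step sizes, and the narrow-Gaussian stand-in for the point mass at x(0) are all our own choices:

```python
import numpy as np

# Explicit finite-difference solve of the Fokker-Planck equation (12)-(13)
# for constant mu and sigma, checked against the exact Gaussian solution.
mu, sigma = 0.1, 0.5
x = np.linspace(-3.0, 3.0, 601)
dx = x[1] - x[0]
t0, T = 0.05, 0.5                      # start at t0 > 0 with a resolved density

def exact(t):
    return np.exp(-(x - mu * t) ** 2 / (2 * sigma**2 * t)) / np.sqrt(2 * np.pi * sigma**2 * t)

p = exact(t0)
dt = 0.2 * dx**2 / sigma**2            # well inside the explicit stability limit
steps = int((T - t0) / dt)
for _ in range(steps):
    d1 = np.zeros_like(p)
    d2 = np.zeros_like(p)
    d1[1:-1] = (p[2:] - p[:-2]) / (2 * dx)               # central first derivative
    d2[1:-1] = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx**2    # central second derivative
    p = p + dt * (-mu * d1 + 0.5 * sigma**2 * d2)        # eq. (13) with constant coeffs

p_exact = exact(t0 + steps * dt)
```

Higher-order schemes (e.g. the Padé approximations compared later) reduce the number of grid points and time steps needed for a given accuracy.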
SLIDE 23 Example of the Fokker-Planck equation
From (Lindström, 2007)
(Surface plot of p(s, xs; t, xt) over the time and state grid.)
Figure: Fokker-Planck equation computed for the CKLS process
SLIDE 24 Comments on the PDE approach
Generally better than the Monte Carlo method in low dimensional problems.
(Log-log plot of MAE versus computation time for Durham-Gallant, Poulsen, 2nd order Padé(1,1) and 4th order Padé(2,2).)
Figure: Comparing Monte Carlo, 2nd order and 4th order numerical approximations of the Fokker-Planck equation
SLIDE 25
Discussion
◮ Fokker-Planck is the preferred method if the state space is non-trivial (see Pedersen et al., 2011)
◮ Successfully used in 1-d and 2-d problems
◮ but the "curse of dimensionality" will eventually make the method infeasible
SLIDE 28
Series expansion
◮ The solution to the Fokker-Planck equation when
dX(t) = µ dt + σ dW(t)   (14)
is p(x(t)|x(0)) = N(x(t); x(0) + µt, σ²t).
◮ Hermite polynomials are the orthogonal polynomial basis when using a Gaussian as weight function.
◮ This is used in the 'series expansion approach', see e.g. (Aït-Sahalia, 2002, 2008)
SLIDE 31
Key ideas
Transform from X → Y → Z where Z is approximately standard Gaussian. We assume that
dX(t) = µ(X(t)) dt + σ(X(t)) dW(t)   (15)
First step:
Y(t) = ∫^{X(t)} du/σ(u)   (16)
It then follows that
dY(t) = µY(Y(t)) dt + dW(t)   (17)
Second step: transform
Z(tk) = (Y(tk) − Y(tk−1)) / √(tk − tk−1).   (18)
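As a worked instance of the first transformation step (16)-(17), take geometric Brownian motion (our choice of example):

```latex
% For dX(t) = \mu X(t)\,dt + \sigma X(t)\,dW(t), i.e. \sigma(x) = \sigma x,
% the first step (16) gives
Y(t) = \int^{X(t)} \frac{du}{\sigma u} = \frac{\log X(t)}{\sigma},
% and Ito's formula yields the unit-diffusion dynamics of eq. (17):
dY(t) = \Big(\frac{\mu}{\sigma} - \frac{\sigma}{2}\Big)\,dt + dW(t),
\qquad \mu_Y(y) = \frac{\mu}{\sigma} - \frac{\sigma}{2}.
```

The diffusion coefficient of Y is identically one, so the increments of Y scaled by √(tk − tk−1) as in (18) are close to standard Gaussian for small sampling intervals.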
SLIDE 35
Expansion
A Hermite expansion for the density pZ at order J is given by
p^J_Z(z|y(0), tk − tk−1) = φ(z) ∑_{j=0}^{J} ηj(tk − tk−1, y0) Hj(z)   (19)
where
Hj(z) = e^{z²/2} (d^j/dz^j) e^{−z²/2}.   (20)
The coefficients are computed by projecting the density onto the basis functions Hj(z) (recall Hilbert space theory):
ηj(t, y0) = (1/j!) ∫ Hj(z) p^J_Z(z|y(0), tk − tk−1) dz.   (21)
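The basis functions (20) and the orthogonality relation that makes the projection (21) work can be verified symbolically; a sketch using sympy:

```python
import sympy as sp

z = sp.symbols('z')

def H(j):
    """Hermite basis function of eq. (20): H_j(z) = e^{z^2/2} d^j/dz^j e^{-z^2/2}."""
    return sp.simplify(sp.exp(z**2 / 2) * sp.diff(sp.exp(-z**2 / 2), z, j))

phi = sp.exp(-z**2 / 2) / sp.sqrt(2 * sp.pi)      # standard Gaussian weight

# Orthogonality behind eq. (21): the integral of H_j H_k phi equals j! when
# j == k and 0 otherwise, so integrating H_j against the expansion (19)
# isolates the single coefficient eta_j (up to the factor 1/j!).
gram = [[sp.integrate(H(j) * H(k) * phi, (z, -sp.oo, sp.oo))
         for k in range(4)] for j in range(4)]
```

For instance H(1) = −z and H(2) = z² − 1, and the Gram matrix is diagonal with entries j!.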
SLIDE 37
Practical concerns
◮ The series expansion can be extremely accurate.
◮ The standard approach is to compute ηj by Taylor expansion up to order (tk − tk−1)^K = (∆t)^K
◮ Some restrictions (so-called 'reducible diffusions') apply when using the method for multivariate diffusions.
SLIDE 38
Other alternatives - GMM/EF
What about non-likelihood methods?
◮ The model is governed by some p-dimensional parameter.
◮ Suppose some set of features are important, hl(x), l = 1, . . . , q ≥ p
◮ Compute
f(x(t); θ) = (h1(x(t)) − Eθ[h1(Xt)], . . . , hq(x(t)) − Eθ[hq(Xt)])ᵀ   (22)
◮ and form
JN(θ) = ((1/N) ∑_{n=1}^{N} f(x(n); θ))ᵀ W ((1/N) ∑_{n=1}^{N} f(x(n); θ))   (23)
SLIDE 41
GMM
The Generalized Method of Moments (GMM) estimator is then given by
ˆθ = arg min JN(θ)   (24)
It can be shown that
√N (ˆθN − θ0) → N(0, Σ)   (25)
where
Σ = (ΓNᵀ ΩN⁻¹ ΓN)⁻¹   (26)
and ΓN and ΩN are estimates of
Γ = E[∂f(x, θ)/∂θᵀ],   Ω = Var[f(x, θ)].   (27)
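A minimal sketch of the criterion (23)-(24) for arithmetic Brownian motion with q = p = 2 moment conditions and W = I; the feature choice and all parameter values are our own:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# GMM sketch for increments y_n = X(t_{n+1}) - X(t_n) of arithmetic Brownian
# motion: features h1(y) = y and h2(y) = y^2, with model-implied moments
# E[h1] = mu*delta and E[h2] = sigma^2*delta + (mu*delta)^2, so q = p = 2.
mu, sigma, delta, N = 0.5, 2.0, 0.1, 50_000
y = rng.normal(mu * delta, sigma * np.sqrt(delta), N)

W = np.eye(2)                               # weighting matrix in eq. (23)

def J(params):
    """GMM criterion (23): quadratic form in the averaged moment conditions."""
    m, s = params
    f = np.array([np.mean(y) - m * delta,
                  np.mean(y**2) - (s**2 * delta + (m * delta) ** 2)])
    return f @ W @ f

mu_hat, sigma_hat = minimize(J, x0=[0.0, 1.0], method="Nelder-Mead").x
```

With q = p the system is exactly identified and the choice of W is irrelevant; for q > p an estimate of Ω⁻¹ as weighting matrix gives the efficient GMM estimator of eqs. (25)-(27).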
SLIDE 43
Quasi Likelihood
What if we treat the distribution as Gaussian, but with correctly specified mean and covariance?
◮ This is called Quasi Likelihood.
◮ Asymptotics can be derived from the GMM asymptotics.
◮ (Höök & Lindström, 2016) showed that this can be extremely efficient from a computational point of view, essentially O(1) for N.
SLIDE 44 References
◮ Pedersen, A. R. (1995). A new approach to maximum likelihood estimation for stochastic differential equations based on discrete observations. Scandinavian Journal of Statistics, 55-71.
◮ Aït-Sahalia, Y. (2002). Maximum Likelihood Estimation of Discretely Sampled Diffusions: A Closed-form Approximation Approach. Econometrica, 70(1), 223-262.
◮ Lindström, E. (2007). Estimating parameters in
diffusion processes using an approximate maximum likelihood approach. Annals of Operations Research, 151(1), 269-288.
SLIDE 45 References
◮ Durham, G. B. & Gallant, A. R. (2002). Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. Journal of Business & Economic Statistics, 20(3), 297-338.
◮ Lindström, E. (2012). A regularized bridge
sampler for sparsely sampled diffusions. Statistics and Computing, 22(2), 615-623.
◮ Whitaker, G. A., Golightly, A., Boys, R. J., &
Sherlock, C. (2016). Improved bridge constructs for stochastic differential equations. Statistics and Computing, 1-16.
◮ Höök, L. J., & Lindström, E. (2016). Efficient
computation of the quasi likelihood function for discretely observed diffusion processes. Computational Statistics & Data Analysis.
SLIDE 46