
Supercanonical convergence rates in the simulation of Markov chains (presentation transcript)



  1. Supercanonical convergence rates in the simulation of Markov chains. Pierre L'Ecuyer. Joint work with Christian Lécot, David Munger, and Bruno Tuffin. Valuetools, Taormina, Italy, October 2016.

  2. Markov Chain Setting

  A Markov chain with state space 𝒳 evolves as

      X_0 = x_0,   X_j = φ_j(X_{j−1}, U_j),   j ≥ 1,

  where the U_j are i.i.d. uniform r.v.'s over (0, 1)^d.

  Payoff (or cost) function:

      Y = ∑_{j=1}^{τ} g_j(X_j)

  for some fixed time horizon τ. We may want to estimate μ = E[Y], some other functional of Y, or perhaps the entire distribution of Y.
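  As a concrete illustration of this setting, here is a minimal Python sketch. The helper name simulate_chain and the signatures of phi (the transition φ_j) and g (the per-step payoff g_j) are assumptions for illustration, not part of the talk.

```python
import numpy as np

def simulate_chain(phi, g, tau, x0, d, rng):
    """Simulate one path of the chain and return Y = sum_{j=1}^{tau} g(j, X_j).

    phi(j, x, u): transition function phi_j; u is a uniform point in (0,1)^d.
    g(j, x): per-step payoff g_j(x).
    """
    x = x0
    y = 0.0
    for j in range(1, tau + 1):
        u = rng.random(d)       # U_j ~ U(0,1)^d, i.i.d. across steps
        x = phi(j, x, u)        # X_j = phi_j(X_{j-1}, U_j)
        y += g(j, x)
    return y
```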

  3. Ordinary Monte Carlo simulation

  For i = 0, …, n−1, generate X_{i,j} = φ_j(X_{i,j−1}, U_{i,j}), j = 1, …, τ, where the U_{i,j}'s are i.i.d. U(0, 1)^d. Estimate μ by

      μ̂_n = (1/n) ∑_{i=0}^{n−1} ∑_{j=1}^{τ} g_j(X_{i,j}) = (1/n) ∑_{i=0}^{n−1} Y_i.

  Then E[μ̂_n] = μ and Var[μ̂_n] = Var[Y_i]/n = O(n^{−1}). The width of a confidence interval on μ converges as O(n^{−1/2}); that is, for each additional digit of accuracy, one must multiply n by 100.

  One can also estimate the distribution (density) of Y by the empirical distribution of Y_0, …, Y_{n−1}, by a histogram (perhaps smoothed), or by a kernel density estimator. The mean integrated squared error (MISE) for the density typically converges as O(n^{−2/3}) for a histogram and O(n^{−4/5}) for the best density estimators.

  Can we do better than those rates?
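  A crude-MC sketch of this estimator, reusing the hypothetical simulate_chain helper above; crude_mc and its signature are illustrative names, not from the slides.

```python
import numpy as np

def crude_mc(phi, g, tau, x0, d, n, seed=0):
    """Crude Monte Carlo: average n i.i.d. payoffs.
    The ~95% CI half-width shrinks as O(n^{-1/2})."""
    rng = np.random.default_rng(seed)
    ys = np.array([simulate_chain(phi, g, tau, x0, d, rng) for _ in range(n)])
    mu_hat = ys.mean()                                  # unbiased estimate of mu
    half_width = 1.96 * ys.std(ddof=1) / np.sqrt(n)     # CI half-width, O(n^{-1/2})
    return mu_hat, half_width
```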

  4. Plenty of applications fit this setting:
     - Finance
     - Queueing systems
     - Inventory, distribution, and logistics systems
     - Reliability models
     - MCMC in Bayesian statistics
     - Many, many more...

  5. Example: An Asian Call Option (two-dimensional state)

  Given observation times t_1, t_2, …, t_τ, s_0 > 0, and X(t_0) = 0, let

      X(t_j) = X(t_{j−1}) + (r − σ²/2)(t_j − t_{j−1}) + σ (t_j − t_{j−1})^{1/2} Z_j,
      S(t_j) = s_0 exp[X(t_j)]   (geometric Brownian motion),

  where U_j ~ U[0, 1) and Z_j = Φ^{−1}(U_j) ~ N(0, 1).

  Running average: S̄_j = (1/j) ∑_{i=1}^{j} S(t_i).

  Payoff at step j = τ is Y = g_τ(X_τ) = max(0, S̄_τ − K).

  MC state: X_j = (S(t_j), S̄_j). Transition:

      X_j = (S(t_j), S̄_j) = φ_j(S(t_{j−1}), S̄_{j−1}, U_j) = (S(t_j), ((j−1) S̄_{j−1} + S(t_j)) / j).

  Want to estimate E[Y], the distribution of Y, etc.
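  A sketch of this particular chain in the same style. The names asian_phi and asian_payoff are hypothetical; updating S multiplicatively is equivalent to the random walk on X(t_j) = log(S(t_j)/s_0) defined above.

```python
import numpy as np
from scipy.stats import norm

def asian_phi(j, state, u, r, sigma, dt):
    """Transition phi_j for the Asian-option chain; state = (S(t_{j-1}), Sbar_{j-1}).
    Equal time spacing dt = t_j - t_{j-1} is assumed here."""
    s_prev, avg_prev = state
    z = norm.ppf(u)                                   # Z_j = Phi^{-1}(U_j) ~ N(0,1)
    s = s_prev * np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    avg = ((j - 1) * avg_prev + s) / j                # running average Sbar_j
    return (s, avg)

def asian_payoff(state, K):
    """Terminal payoff g_tau(X_tau) = max(0, Sbar_tau - K)."""
    _, avg = state
    return max(0.0, avg - K)
```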

  6. Take τ = 12, T = 1 (one year), t_j = j/τ for j = 0, …, τ, K = 100, s_0 = 100, r = 0.05, σ = 0.5. We make n = 10^6 independent runs. Mean: 13.1. Max: 390.8. In 53.47% of the cases, the payoff is 0.

  [Histogram of the positive payoff values: frequency (×10³) vs. payoff over 0 to 150; average = 13.1.]
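  A hypothetical usage sketch reproducing this experiment with the helpers above (all names assumed); with n = 10⁶ the estimate should land near the reported mean of 13.1.

```python
# Slide's parameters: tau = 12 monthly steps over one year, t_j = j/tau.
tau, K, s0, r, sigma = 12, 100.0, 100.0, 0.05, 0.5
dt = 1.0 / tau
phi = lambda j, x, u: asian_phi(j, x, u[0], r, sigma, dt)
g = lambda j, x: asian_payoff(x, K) if j == tau else 0.0   # payoff only at j = tau
mu_hat, hw = crude_mc(phi, g, tau, (s0, 0.0), d=1, n=10**6)
print(mu_hat, hw)   # point estimate and ~95% CI half-width
```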

  7. Now try n = 4096 runs instead.

  [Histogram of the payoffs: frequency vs. payoff over 0 to 150.]

  For the histogram: MISE = O(n^{−2/3}). For polygonal interpolation: MISE = O(n^{−4/5}); same with a KDE. Can we do better?
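  A sketch of the kernel density estimate mentioned here, reusing the hypothetical names from the previous sketches; gaussian_kde is SciPy's standard Gaussian KDE.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
ys = np.array([simulate_chain(phi, g, tau, (s0, 0.0), 1, rng) for _ in range(4096)])
kde = gaussian_kde(ys[ys > 0])            # KDE of the positive payoffs only
grid = np.linspace(0.0, 150.0, 301)
density = kde(grid)                       # MISE ~ O(n^{-4/5}) for smooth densities
```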

  8. Example: Asian Call Option. S(0) = 100, K = 100, r = 0.05, σ = 0.15, t_j = j/52, j = 0, …, τ = 13.

  [Plot of log₂ Var[μ̂_{RQMC,n}] vs. log₂ n, for log₂ n from 8 to 20: crude MC follows the n^{−1} reference line, sequential RQMC lies below it, and array-RQMC with split sort follows roughly the n^{−2} line.]

  9. Example: Asian Call Option. S(0) = 100, K = 100, r = ln(1.09), σ = 0.2, t_j = (230 + j)/365 for j = 1, …, τ = 10.

  Slope of log₂ Var[Ȳ_{n,j}] vs. log₂ n, variance reduction factor (VRF), and CPU time:

      Sort                          RQMC points      slope    VRF        CPU (sec)
      Batch sort (n₁ = n₂)          SS               −1.38    2.0×10²    744
                                    Sobol            −2.03    4.2×10⁶    532
                                    Sobol+NUS        −2.03    2.8×10⁶    1035
                                    Korobov+baker    −2.04    4.4×10⁶    482
      Hilbert sort (logistic map)   SS               −1.55    2.4×10³    840
                                    Sobol            −2.03    2.6×10⁶    534
                                    Sobol+NUS        −2.02    2.8×10⁶    724
                                    Korobov+baker    −2.01    3.3×10⁶    567

  VRF for n = 2²⁰. CPU time for m = 100 replications.

  10. Classical Randomized Quasi-Monte Carlo (RQMC) for Markov Chains

  One RQMC point for each sample path. Put V_i = (U_{i,1}, …, U_{i,τ}) ∈ (0, 1)^s = (0, 1)^{dτ}. Estimate μ by

      μ̂_{rqmc,n} = (1/n) ∑_{i=0}^{n−1} ∑_{j=1}^{τ} g_j(X_{i,j}),

  where P_n = {V_0, …, V_{n−1}} ⊂ (0, 1)^s satisfies: (a) each point V_i has the uniform distribution over (0, 1)^s; (b) P_n covers (0, 1)^s very evenly (i.e., has low discrepancy). The dimension s is often very large!
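  A minimal sketch of classical RQMC using one scrambled Sobol' point of dimension s = dτ per path, via SciPy's scipy.stats.qmc module. The name classical_rqmc is illustrative, and scrambled Sobol' is just one valid RQMC construction (n should be a power of 2 for Sobol' points).

```python
import numpy as np
from scipy.stats import qmc

def classical_rqmc(phi, g, tau, x0, d, n, seed=0):
    """Classical RQMC: one randomized low-discrepancy point V_i in (0,1)^{d*tau}
    drives the entire i-th sample path."""
    sobol = qmc.Sobol(d=d * tau, scramble=True, seed=seed)
    points = sobol.random(n)              # each row V_i ~ U(0,1)^{d*tau}, jointly LD
    total = 0.0
    for v in points:
        x = x0
        for j in range(1, tau + 1):
            u = v[(j - 1) * d : j * d]    # coordinates of V_i that drive step j
            x = phi(j, x, u)
            total += g(j, x)
    return total / n
```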

  11. Array-RQMC for Markov Chains

  L'Ecuyer, Lécot, Tuffin, et al. [2004, 2006, 2008, etc.]. Earlier deterministic versions: Lécot et al.

  Simulate an "array" of n chains in "parallel." At each step, use an RQMC point set P_n to advance all the chains by one step. Seek global negative dependence across the chains.

  Goal: small discrepancy (or "distance") between the empirical distribution of S_{n,j} = {X_{0,j}, …, X_{n−1,j}} and the theoretical distribution of X_j. If we succeed, these (unbiased) estimators will have small variance:

      μ_j = E[g_j(X_j)] ≈ μ̂_{arqmc,j,n} = (1/n) ∑_{i=0}^{n−1} g_j(X_{i,j}),

      Var[μ̂_{arqmc,j,n}] = Var[g_j(X_{i,j})]/n + (2/n²) ∑_{i=0}^{n−1} ∑_{k=i+1}^{n−1} Cov[g_j(X_{i,j}), g_j(X_{k,j})].
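  To see the two terms of this decomposition empirically, one can estimate them from m independent replications. This sketch and its callback run_array_step are assumptions for illustration, not part of the talk.

```python
import numpy as np

def variance_decomposition(run_array_step, n, m):
    """Split the empirical Var[mu_hat] into the per-chain variance term and
    the cross-chain covariance term, using m independent replications.
    run_array_step(rep) -> array of the n values g_j(X_{i,j}) for replication rep."""
    vals = np.stack([run_array_step(rep) for rep in range(m)])   # shape (m, n)
    mu_hat = vals.mean(axis=1)                       # per-replication estimator
    var_total = mu_hat.var(ddof=1)                   # Var[mu_hat_{arqmc,j,n}]
    var_term = vals.var(ddof=1, axis=0).mean() / n   # (1/n^2) sum_i Var[g_j(X_{i,j})]
    cov_term = var_total - var_term                  # (2/n^2) sum_{i<k} Cov[...]
    return var_total, var_term, cov_term
```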

  12. Some RQMC insight

  To simplify the discussion, suppose X_j ~ U(0, 1)^ℓ. This can be achieved (in principle) by a change of variable. We estimate

      μ_j = E[g_j(X_j)] = E[g_j(φ_j(X_{j−1}, U))] = ∫_{[0,1)^{ℓ+d}} g_j(φ_j(x, u)) dx du

  (we take a single j here) by

      μ̂_{arqmc,j,n} = (1/n) ∑_{i=0}^{n−1} g_j(X_{i,j}) = (1/n) ∑_{i=0}^{n−1} g_j(φ_j(X_{i,j−1}, U_{i,j})).

  This is (roughly) RQMC with the point set Q_n = {(X_{i,j−1}, U_{i,j}), 0 ≤ i < n}. We want Q_n to have low discrepancy (LD), i.e., be highly uniform, over [0, 1)^{ℓ+d}.

  We do not choose the X_{i,j−1}'s in Q_n: they come from the simulation. To construct the (randomized) U_{i,j}, select a LD point set Q̃_n = {(w_0, U_{0,j}), …, (w_{n−1}, U_{n−1,j})}, where the w_i ∈ [0, 1)^ℓ are fixed and each U_{i,j} ~ U(0, 1)^d. Permute the states so that X_{π_j(i),j−1} is "close" to w_i for each i (LD between the two sets), and compute X_{i,j} = φ_j(X_{π_j(i),j−1}, U_{i,j}) for each i.

  Example: if ℓ = 1, we can take w_i = (i + 0.5)/n and just sort the states. For ℓ > 1, there are various ways to define the matching (multivariate sort).
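  For ℓ = 1 the matching is literally a sort, as in this small sketch (match_permutation is an illustrative name):

```python
import numpy as np

def match_permutation(states):
    """pi_j for a one-dimensional state: states[pi[i]] is paired with
    w_i = (i + 0.5)/n, i.e., the i-th smallest state gets the i-th point."""
    return np.argsort(states)
```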

  13. Array-RQMC algorithm

      X_{i,0} ← x_0 (or X_{i,0} ← x_{i,0}) for i = 0, …, n−1;
      for j = 1, 2, …, τ do
          compute the permutation π_j of the states (for matching);
          randomize afresh {U_{0,j}, …, U_{n−1,j}} in Q̃_n;
          X_{i,j} = φ_j(X_{π_j(i),j−1}, U_{i,j}) for i = 0, …, n−1;
          μ̂_{arqmc,j,n} = Ȳ_{n,j} = (1/n) ∑_{i=0}^{n−1} g_j(X_{i,j});
      end for
      Estimate μ by Ȳ_n = μ̂_{arqmc,n} = ∑_{j=1}^{τ} μ̂_{arqmc,j,n}.
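  A compact Python sketch of the whole loop for a chain with a one-dimensional state (ℓ = 1), matching sorted states to Sobol' points ordered by their first coordinate. All names are illustrative; fresh scrambled Sobol' points at each step are one of several valid randomization choices, and n should be a power of 2.

```python
import numpy as np
from scipy.stats import qmc

def array_rqmc(phi, g, tau, x0, d, n, seed=0):
    """Array-RQMC sketch for a scalar state: sort-based matching at each step."""
    states = np.full(n, x0, dtype=float)
    total = 0.0
    for j in range(1, tau + 1):
        # (1+d)-dim scrambled Sobol' points; the first coordinate plays the role of w_i.
        pts = qmc.Sobol(d=1 + d, scramble=True, seed=seed + j).random(n)
        pts = pts[np.argsort(pts[:, 0])]      # order the points by w
        order = np.argsort(states)            # pi_j: i-th smallest state <-> i-th point
        states = np.array([phi(j, states[order[i]], pts[i, 1:]) for i in range(n)])
        total += np.mean([g(j, x) for x in states])   # mu_hat_{arqmc,j,n}
    return total                              # sum over j, estimates mu = E[Y]
```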
