Draft 1 Density estimation by Monte Carlo and randomized - PowerPoint PPT Presentation

Draft 1 Density estimation by Monte Carlo and randomized quasi-Monte Carlo (RQMC) Pierre L’Ecuyer Joint work with Amal Ben Abdellah, Art B. Owen, and Florian Puchhammer NUS, Singapore, October 2019

Draft 2 What this talk is about Monte Carlo (MC) simulation is widely used to estimate the expectation E [ X ] of a random variable X and compute a confidence interval on E [ X ]. MSE = Var [ ¯ X n ] = O ( n − 1 ).

Draft 2 What this talk is about Monte Carlo (MC) simulation is widely used to estimate the expectation E [ X ] of a random variable X and compute a confidence interval on E [ X ]. MSE = Var [ ¯ X n ] = O ( n − 1 ). But simulation usually provides information to do much more! The output data can be used to estimate the entire distribution of X , e.g., the cumulative distribution function (cdf) F of X , defined by F ( x ) = P [ X ≤ x ], or its density f defined by f ( x ) = F ′ ( x ).

Draft 2 What this talk is about Monte Carlo (MC) simulation is widely used to estimate the expectation E [ X ] of a random variable X and compute a confidence interval on E [ X ]. MSE = Var [ ¯ X n ] = O ( n − 1 ). But simulation usually provides information to do much more! The output data can be used to estimate the entire distribution of X , e.g., the cumulative distribution function (cdf) F of X , defined by F ( x ) = P [ X ≤ x ], or its density f defined by f ( x ) = F ′ ( x ). If X 1 , . . . , X n are n indep. realizations of X , the empirical cdf n F n ( x ) = 1 ˆ � I [ X i ≤ x ] n i =1 is unbiased for F ( x ) at all x , and Var [ ˆ F n ( x )] = O ( n − 1 ).

Draft 2 What this talk is about Monte Carlo (MC) simulation is widely used to estimate the expectation E [ X ] of a random variable X and compute a confidence interval on E [ X ]. MSE = Var [ ¯ X n ] = O ( n − 1 ). But simulation usually provides information to do much more! The output data can be used to estimate the entire distribution of X , e.g., the cumulative distribution function (cdf) F of X , defined by F ( x ) = P [ X ≤ x ], or its density f defined by f ( x ) = F ′ ( x ). If X 1 , . . . , X n are n indep. realizations of X , the empirical cdf n F n ( x ) = 1 ˆ � I [ X i ≤ x ] n i =1 is unbiased for F ( x ) at all x , and Var [ ˆ F n ( x )] = O ( n − 1 ). For a continuous r.v. X , the density f provides a better visual idea of the distribution. Here we focus on estimating f over [ a , b ] ⊂ R .

Draft 3 Setting Classical density estimation was developed in the context where independent observations X 1 , . . . , X n of X are given and one wishes to estimate the density f of X from that. Best rate: Var [ˆ f n ( x )] = O ( n − 4 / 5 ). Here we assume that X 1 , . . . , X n are generated by simulation from a stochastic model. We can choose n and we have some freedom on how the simulation is performed. The X i ’s are realizations of a random variable X = g ( U ) ∈ R with density f , where U = ( U 1 , . . . , U s ) ∼ U (0 , 1) s and g ( u ) can be computed easily for any u ∈ (0 , 1) s .

Draft 3 Setting Classical density estimation was developed in the context where independent observations X 1 , . . . , X n of X are given and one wishes to estimate the density f of X from that. Best rate: Var [ˆ f n ( x )] = O ( n − 4 / 5 ). Here we assume that X 1 , . . . , X n are generated by simulation from a stochastic model. We can choose n and we have some freedom on how the simulation is performed. The X i ’s are realizations of a random variable X = g ( U ) ∈ R with density f , where U = ( U 1 , . . . , U s ) ∼ U (0 , 1) s and g ( u ) can be computed easily for any u ∈ (0 , 1) s . 1. Can we obtain a better estimate of f with RQMC instead of MC? How much better? What about taking a stratified sample over [0 , 1) s ? 2. Is it possible to obtain unbiased density estimators that converge as O ( n − 1 ) or faster, using clever sampling strategies?

Draft 4 Small example: A stochastic activity network Gives precedence relations between activities. Activity k has random duration Y k (also length of arc k ) with known cdf F k ( y ) := P [ Y k ≤ y ] . Project duration X = (random) length of longest path from source to sink. Can look at deterministic equivalent of X , E [ X ], cdf, density, ... sink 5 8 Y 11 Y 6 Y 10 Y 13 Want to estimate the density of X , 2 4 7 Y 5 Y 9 f ( x ) = F ′ ( x ) = d d x P [ X ≤ x ]. Y 2 Y 3 Y 8 Y 12 0 1 3 6 Y 1 Y 5 Y 7 source

Draft 4 Small example: A stochastic activity network Gives precedence relations between activities. Activity k has random duration Y k (also length of arc k ) with known cdf F k ( y ) := P [ Y k ≤ y ] . Project duration X = (random) length of longest path from source to sink. Can look at deterministic equivalent of X , E [ X ], cdf, density, ... The sample cdf sink F n ( x ) = 1 ˆ � n i =1 I [ X i ≤ x ] 5 8 n Y 11 is an unbiased estimator of the cdf Y 6 F ( x ) = P [ X ≤ x ] . Y 10 Y 13 Want to estimate the density of X , 2 4 7 Y 5 Y 9 f ( x ) = F ′ ( x ) = d d x P [ X ≤ x ]. Y 2 Y 3 Y 8 Y 12 0 1 3 6 Y 1 Y 5 Y 7 source

Draft 4 Small example: A stochastic activity network Gives precedence relations between activities. Activity k has random duration Y k (also length of arc k ) with known cdf F k ( y ) := P [ Y k ≤ y ] . Project duration X = (random) length of longest path from source to sink. Can look at deterministic equivalent of X , E [ X ], cdf, density, ... The sample cdf sink F n ( x ) = 1 ˆ � n i =1 I [ X i ≤ x ] 5 8 n Y 11 is an unbiased estimator of the cdf Y 6 F ( x ) = P [ X ≤ x ] . Y 10 Y 13 Want to estimate the density of X , 2 4 7 Y 5 Y 9 f ( x ) = F ′ ( x ) = d d x P [ X ≤ x ]. Y 2 Y 3 Y 8 Y 12 The sample derivative ˆ F ′ n ( x ) is useless 0 1 3 6 fo estimate f ( x ), because it is 0 almost Y 1 Y 5 Y 7 everywhere. source

Draft 5 Numerical illustration from Elmaghraby (1977): Y k ∼ N ( µ k , σ 2 k ) for k = 1 , 2 , 4 , 11 , 12, and Y k ∼ Expon(1 /µ k ) otherwise. µ 1 , . . . , µ 13 : 13.0, 5.5, 7.0, 5.2, 16.5, 14.7, 10.3, 6.0, 4.0, 20.0, 3.2, 3.2, 16.5. Results of an experiment with n = 100 000. Note: X is not normal! Frequency X det = 48 . 2 10000 5000 0 X 0 25 50 75 100 125 150 175 200

Draft 5 Numerical illustration from Elmaghraby (1977): Y k ∼ N ( µ k , σ 2 k ) for k = 1 , 2 , 4 , 11 , 12, and Y k ∼ Expon(1 /µ k ) otherwise. µ 1 , . . . , µ 13 : 13.0, 5.5, 7.0, 5.2, 16.5, 14.7, 10.3, 6.0, 4.0, 20.0, 3.2, 3.2, 16.5. Results of an experiment with n = 100 000. Note: X is not normal! Frequency mean = 64 . 2 X det = 48 . 2 10000 5000 ˆ ξ 0 . 99 = 131 . 8 0 X 0 25 50 75 100 125 150 175 200

Draft 6 Density Estimation Suppose we estimate the density f over a finite interval [ a , b ]. Let ˆ f n ( x ) denote the density estimator at x , with sample size n . We use the following measures of error: � b E [(ˆ f n ( x ) − f ( x )) 2 ] d x = mean integrated squared error = MISE a = IV + ISB � b Var [ˆ IV = integrated variance = f n ( x )] d x a � b ( E [ˆ f n ( x )] − f ( x )) 2 d x ISB = integrated squared bias = a To minimize the MISE, we need to balance the IV and ISB.

Draft 7 Density Estimation Simple histogram: Partition [ a , b ] in m intervals of size h = ( b − a ) / m and define f n ( x ) = n j ˆ nh for x ∈ I j = [ a + ( j − 1) h , a + jh ) , j = 1 , ..., m where n j is the number of observations X i that fall in interval j . Kernel Density Estimator (KDE) : Select kernel k (unimodal symmetric density centered at 0) and bandwidth h > 0 (horizontal stretching factor for the kernel). The KDE is n n f n ( x ) = 1 � x − X i � = 1 � x − g ( U i ) � ˆ � � k k . nh h nh h i =1 i =1

Draft 8 Example of a KDE in s = 1 dimension KDE (blue) vs true density (red) with n = 2 19 : Here we take U 1 , . . . , U n in (0 , 1) and put X i = F − 1 ( U i ). midpoint rule (left) stratified sample of U = F ( X ) (right). (10 − 5 ) (10 − 5 ) 80 80 70 70 60 60 50 50 40 40 30 30 20 20 10 10 0 0 -5.1 -4.6 -4.1 -3.6 -5.1 -4.6 -4.1 -3.6

Draft 9 Asymptotic convergence with Monte Carlo for smooth f Here we assume independent random samples (Monte Carlo or given data). For histograms and KDEs, when n → ∞ and h → 0: C nh + Bh α . AMISE = AIV + AISB ∼ For any g : R → R , define � b � ∞ ( g ( x )) 2 d x , −∞ x r g ( x ) d x , R ( g ) = µ r ( g ) = for r = 0 , 1 , 2 , . . . a α C B R ( f ′ ) / 12 Histogram 1 2 ( µ 2 ( k )) 2 R ( f ′′ ) / 4 µ 0 ( k 2 ) KDE 4

Draft 10 The asymptotically optimal h is � C � 1 / ( α +1) h ∗ = B α n and it gives AMISE = Kn − α/ (1+ α ) . h ∗ C B α AMISE R ( f ′ ) ( nR ( f ′ ) / 6) − 1 / 3 O ( n − 2 / 3 ) Histogram 1 2 12 ( µ 2 ( k )) 2 R ( f ′′ ) � 1 / 5 µ 0 ( k 2 ) � µ 0 ( k 2 ) O ( n − 4 / 5 ) KDE 4 ( µ 2 ( k )) 2 R ( f ′′ ) n 4 To estimate h ∗ , one can estimate R ( f ′ ) and R ( f ′′ ) via KDE (plugin). This is true under the simplifying assumption that h must be the same all over [ a , b ].

Draft 11 Can we take the stochastic derivative of an estimator of F ? Can we estimate the density f ( x ) = F ′ ( x ) by the derivative of an estimator of F ( x ). A simple candidate cdf estimator is the empirical cdf n F n ( x ) = 1 ˆ � I [ X i ≤ x ] . n i =1 However d ˆ F n ( x ) / d x = 0 almost everywhere, so this cannot be a useful density estimator! We need a smoother estimator of F .

Draft 1 Density estimation by Monte Carlo and randomized - PowerPoint PPT Presentation

Draft 1 Density estimation by Monte Carlo and randomized quasi-Monte Carlo (RQMC) Pierre LEcuyer Joint work with Amal Ben Abdellah, Art B. Owen, and Florian Puchhammer NUS, Singapore, October 2019 Draft 2 What this talk is about Monte

DRAFT DRAFT DRAFT DRAFT DRAFT

DRAFT DRAFT A service of Maryland Health Benefit Exchange Broker Information DRAFT DRAFT 2

Agroforestry at Dartington Broadlears field Draft 1 - 22 August 2016 Draft 1 - 22 August 2016

DRAFT - Do not cite or quote DRAFT - Do not cite or quote DRAFT - Do not cite or quote 2017

2019 DRAFT RATES AND MONETARY AMOUNTS AND REVENUE LAWS AMENDMENT BILL, 2019 DRAFT TAXATION LAWS

DRAFT RESPONSE DOCUMENT 2017 DRAFT TAXATION LAWS AMENDMENT BILL (TLAB) AND DRAFT TAX

2017 STRATEGIC PLAN THE MISSISSIPPI FLOWS THROUGH US DRAFT CAPITAL PROJECTS DRAFT CAPITAL

2014 Draft Consolidated Financial Statements 2014 Draft Consolidated Financial Statements April

Master Plan Draft Document Master Plan Draft Document Accomplishments Master Plan Draft Document

DRAFT This is a draft paper and numbers and impacts are subject to change as further work is

Water Budgets and Modeling KGA Coordination Committee Meeting CA Water Foundation DRAFT DRAFT

SAVI IP Source Guard draft-baker-sava- draft-baker-sava- implementation Fred Baker Cases

Network Slicing Terms and Systems draft-galis-netslices-revised-problem-statement-01

The draft ICNIRP radiofrequency guidelines Eric van Rongen Chairman, ICNIRP DRAFT DO NOT

2015 - Another year of growth DRAFT IPO Presentation A truly connected business 1 DRAFT IPO

Draft ST3 Financial Plan (May 26, 2016) Expert Review Panel June 6, 2016 5-26-16 Draft ST3 Plan

LU -factorization and probabilities Vincent Vigon 6 septembre 2007 Vincent Vigon () LU

Steins method and Malliavin calculus for independent random variables H el` ene Halconruy

Semilattices and Domains Dana S. Scott University Professor, Emeritus Carnegie Mellon University

Evidence for General Relativity rotational frame-dragging in the light from the Sgr A*

Type and type transition for random walks on randomly directed lattices To Iain MacPhee, in

Static Analysis by Abstract Interpretation of communicating imperfectly-clocked Synchronous

Leon Henkin on Completeness Henkins renowned proofs of PhDs in Logic 2017 completeness The

Introduction to Computer Science Page 1 of 184 Sanjiva Prasad Go Back