
Density estimation by Monte Carlo and randomized quasi-Monte Carlo

Pierre L’Ecuyer

Joint work with Amal Ben Abdellah, Art B. Owen, and Florian Puchhammer

Seminar at GERAD, Montréal, March 27, 2019


What this talk is about

Quasi-Monte Carlo (QMC) and randomized QMC (RQMC) methods have been studied extensively for estimating an integral, say E[X], and for approximating a function from its evaluations at a finite number of points. How can we use them to estimate the entire distribution of X? Here we focus on estimating the density of X over an interval [a, b] ⊂ R.

People often look at empirical distributions via histograms, for example. A more refined method is the kernel density estimator (KDE). Can RQMC improve such density estimators, and by how much? Are there other types of density estimators than KDEs that work better with RQMC?

We will discuss an alternative that takes the sample derivative of a smoothed estimator of the cumulative distribution function (cdf). The smoothing can be achieved via conditional Monte Carlo, for example.


Small example: A stochastic activity network

Gives precedence relations between activities. Activity k has random duration Y_k (also the length of arc k) with known cdf F_k(y) := P[Y_k ≤ y].

Project duration X = (random) length of the longest path from source to sink. We want to estimate the density of X,

f(x) = F'(x) = \frac{d}{dx} P[X \le x].

The sample cdf

\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}[X_i \le x]

is an unbiased estimator of the cdf F(x) = P[X ≤ x]. But its derivative \hat{F}_n'(x) is not a meaningful estimator of f(x), because it is 0 almost everywhere.

[Figure: activity network with nodes 1 (source) through 8 (sink) and arcs with random durations Y_0, ..., Y_12.]
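As a concrete illustration, here is a minimal sketch (not the authors' code) of how X can be simulated: draw each Y_k by inversion and take the longest path in the DAG by dynamic programming. The edge list below is a hypothetical placeholder, since the exact topology lives in the figure; the talk's network has 13 arcs Y_0, ..., Y_12 between nodes 1 and 8.

```python
import numpy as np

# Hypothetical DAG topology (placeholder): arc k goes from ARCS[k][0] to ARCS[k][1].
ARCS = [(1, 2), (1, 3), (1, 4), (2, 5), (3, 5), (3, 6), (4, 6),
        (2, 7), (5, 7), (5, 8), (6, 8), (7, 8), (4, 8)]

def project_duration(Y, arcs=ARCS, source=1, sink=8):
    """Longest source-to-sink path length, by dynamic programming.
    Assumes arcs are listed so that every arc into a node precedes
    every arc out of it (true for the placeholder list above)."""
    dist = {v: -np.inf for v in range(source, sink + 1)}
    dist[source] = 0.0
    for k, (a, b) in enumerate(arcs):
        dist[b] = max(dist[b], dist[a] + Y[k])
    return dist[sink]
```

Each Y_k would be drawn as Y_k = F_k^{-1}(U_k), so that X = g(U) with U uniform over (0, 1)^13 here.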

Numerical illustration from Elmaghraby (1977):

Y_k ∼ N(µ_k, σ_k²) for k = 0, 1, 3, 10, 11, and Y_k ∼ Expon(1/µ_k) otherwise.
µ_0, ..., µ_12: 13.0, 5.5, 7.0, 5.2, 16.5, 14.7, 10.3, 6.0, 4.0, 20.0, 3.2, 3.2, 16.5.
Results of an experiment with n = 100 000.

[Figure: histogram of the n values of X, with X_det = 48.2, mean = 64.2, and estimated 0.99 quantile ξ̂_0.99 = 131.8.]

Setting

Classical density estimation was developed in the context where independent observations X_1, ..., X_n of X are given and one wishes to estimate the density f of X from them.

Here we assume that X_1, ..., X_n are generated by simulation from a stochastic model. We can choose n and we have some freedom in how the simulation is performed. The X_i's are realizations of a random variable X = g(U) ∈ R with density f, where U = (U_1, ..., U_s) ∼ U(0, 1)^s and g(u) can be computed easily for any u ∈ (0, 1)^s.

Can we obtain a better estimate of f with RQMC instead of MC? How much better? What about taking a stratified sample over [0, 1)^s?


Density Estimation

Suppose we estimate the density f over a finite interval [a, b]. Let \hat{f}_n(x) denote the density estimator at x, with sample size n. We use the following measures of error:

MISE = mean integrated squared error = \int_a^b E[(\hat{f}_n(x) - f(x))^2]\, dx = IV + ISB,

IV = integrated variance = \int_a^b \mathrm{Var}[\hat{f}_n(x)]\, dx,

ISB = integrated squared bias = \int_a^b (E[\hat{f}_n(x)] - f(x))^2\, dx.


Simple histogram: Partition [a, b] into m intervals of size h = (b − a)/m and define

\hat{f}_n(x) = \frac{n_j}{nh} \quad \text{for } x \in I_j = [a + (j-1)h,\; a + jh), \quad j = 1, \dots, m,

where n_j is the number of observations X_i that fall in interval j.

Kernel density estimator (KDE): Select a kernel k (a unimodal symmetric density centered at 0) and a bandwidth h > 0 (horizontal stretching factor for the kernel). The KDE is

\hat{f}_n(x) = \frac{1}{nh} \sum_{i=1}^{n} k\!\left(\frac{x - X_i}{h}\right).
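To make both estimators concrete, here is a minimal sketch in Python/NumPy (not the authors' code), using the Gaussian kernel for the KDE:

```python
import numpy as np

def histogram_density(X, a, b, m):
    """Histogram density estimator on [a, b) with m bins of width h = (b-a)/m.
    Returns the bin left edges and the estimate n_j / (n h) in each bin."""
    h = (b - a) / m
    counts, edges = np.histogram(X, bins=m, range=(a, b))
    return edges[:-1], counts / (len(X) * h)

def kde_gaussian(X, x, h):
    """Gaussian-kernel KDE evaluated at the points x (vectorized over x)."""
    x = np.atleast_1d(x)
    z = (x[:, None] - X[None, :]) / h             # standardized gaps (n_eval, n)
    k = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)  # Gaussian kernel values
    return k.mean(axis=1) / h                     # (1/(n h)) * sum_i k(...)
```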

KDE bandwidth selection: an illustration in s = 1 dimension

KDE (blue) vs. true density (red) with RQMC point sets, n = 2^19: lattice + shift (left), stratified sample of U = F(X) (right).

[Figure: two panels comparing the KDE with the true density; vertical scale of order 10^{−5}.]

Asymptotic convergence with Monte Carlo for smooth f

For any g: R → R, define

R(g) = \int_a^b (g(x))^2\, dx, \qquad \mu_r(g) = \int_{-\infty}^{\infty} x^r g(x)\, dx, \quad r = 0, 1, 2, \dots

For histograms and KDEs, when n → ∞ and h → 0:

AMISE = AIV + AISB \sim \frac{C}{nh} + B h^{\alpha},

with

            C         B                    α
Histogram   1         R(f')/12             2
KDE         µ0(k²)    (µ2(k))² R(f'')/4    4


The asymptotically optimal h is

h^* = \left( \frac{C}{B \alpha n} \right)^{1/(\alpha+1)},

and it gives AMISE = K n^{-\alpha/(1+\alpha)}.

            C         B                    α    h*                                     AMISE
Histogram   1         R(f')/12             2    (n R(f')/6)^{-1/3}                     O(n^{-2/3})
KDE         µ0(k²)    (µ2(k))² R(f'')/4    4    (µ0(k²)/((µ2(k))² R(f'') n))^{1/5}     O(n^{-4/5})

To estimate h*, one can estimate R(f') and R(f'') via a KDE (plug-in). This is under the simplifying assumption that h must be the same all over [a, b].
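As a small sketch of the plug-in formula for the Gaussian kernel, where µ0(k²) = 1/(2√π) and µ2(k) = 1 (R(f'') must be supplied, e.g., estimated by a pilot KDE):

```python
import numpy as np

def h_star_kde_gaussian(n, R_f2):
    """Asymptotically optimal KDE bandwidth h* = (C/(B alpha n))^(1/(alpha+1))
    for the Gaussian kernel: mu0(k^2) = 1/(2 sqrt(pi)), mu2(k) = 1, alpha = 4.
    R_f2 is (an estimate of) R(f'') = int (f''(x))^2 dx over [a, b]."""
    C = 1.0 / (2.0 * np.sqrt(np.pi))  # mu0(k^2) for the Gaussian kernel
    B = R_f2 / 4.0                    # (mu2(k))^2 R(f'') / 4, with mu2(k) = 1
    alpha = 4
    return (C / (B * alpha * n)) ** (1.0 / (alpha + 1))
```

For the standard normal density over the whole real line, R(f'') = 3/(8√π), and this recovers the familiar h* = (4/(3n))^{1/5} ≈ 1.06 n^{-1/5}.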


Elementary quasi-Monte Carlo (QMC) Bounds (Recall)

Integration error for g: [0, 1)^s → R with point set P_n = {u_0, ..., u_{n−1}} ⊂ [0, 1)^s:

E_n = \frac{1}{n} \sum_{i=0}^{n-1} g(u_i) - \int_{[0,1)^s} g(u)\, du.

Koksma–Hlawka inequality: |E_n| ≤ V_{HK}(g) D^*(P_n), where

V_{HK}(g) = \sum_{\emptyset \ne v \subseteq S} \int_{[0,1)^s} \left| \frac{\partial^{|v|} g}{\partial u_v}(u) \right| du \quad \text{(Hardy–Krause (HK) variation)},

D^*(P_n) = \sup_{u \in [0,1)^s} \left| \mathrm{vol}[0, u) - \frac{|P_n \cap [0, u)|}{n} \right| \quad \text{(star discrepancy)}.

There are explicit point sets for which D^*(P_n) = O((\log n)^{s-1}/n) = O(n^{-1+ε}) for all ε > 0, and explicit RQMC constructions for which E[E_n] = 0 and Var[E_n] = O(n^{-2+ε}) for all ε > 0. With ordinary Monte Carlo (MC), one has Var[E_n] = O(n^{-1}).
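These rates are easy to observe empirically. A minimal sketch (illustrative integrand and parameters, not from the talk) using SciPy's scrambled Sobol' generator:

```python
import numpy as np
from scipy.stats import qmc

def g(u):                        # smooth test integrand on [0,1)^s, integral = 1
    return np.prod(1.0 + 0.5 * (u - 0.5), axis=1)

s, m, reps = 5, 12, 50           # dimension, n = 2^m points, replications
n = 2 ** m

mc_est, rqmc_est = [], []
rng = np.random.default_rng(42)
for _ in range(reps):
    mc_est.append(g(rng.random((n, s))).mean())
    sob = qmc.Sobol(d=s, scramble=True)       # fresh scrambling each replication
    rqmc_est.append(g(sob.random_base2(m=m)).mean())

print("Var MC  :", np.var(mc_est))
print("Var RQMC:", np.var(rqmc_est))          # typically far smaller than MC
```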


Asymptotic convergence of KDE with RQMC

Idea: Replace U_1, ..., U_n by RQMC points. RQMC does not change the bias. For a KDE with smooth k, one could hope (perhaps) to get AIV = C' n^{-β} h^{-1} for some β > 1, instead of C n^{-1} h^{-1}. If the IV is reduced, the optimal h can be taken smaller to reduce the ISB as well (re-balance), and then reduce the MISE.

Unfortunately, things are not so simple. Roughly, decreasing h increases the variation of the function inside the KDE estimator. So we rather have something like

IV ≈ C' n^{-β} h^{-δ}

in some bounded region, for β > 1 and δ > 1.


Bounding the AIV under RQMC for a KDE

KDE density estimator at a single point x:

\hat{f}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h}\, k\!\left(\frac{x - g(U_i)}{h}\right) = \frac{1}{n} \sum_{i=1}^{n} \tilde{g}(U_i).

With RQMC points U_i, this is an RQMC estimator of

E[\tilde{g}(U)] = \int_{[0,1)^s} \tilde{g}(u)\, du = E[\hat{f}_n(x)].

RQMC does not change the bias, but it may reduce Var[\hat{f}_n(x)], and then the IV. To get RQMC variance bounds, we need bounds on the variation of \tilde{g}. The partial derivatives are

\frac{\partial^{|v|}}{\partial u_v}\, \tilde{g}(u) = \frac{1}{h}\, \frac{\partial^{|v|}}{\partial u_v}\, k\!\left(\frac{x - g(u)}{h}\right).

We assume they exist and are uniformly bounded (e.g., for the Gaussian kernel k). Expanding via the chain rule gives terms in h^{-j} for j = 2, ..., |v| + 1. One of the terms for v = S grows as

h^{-s-1}\, k^{(s)}\!\left(\frac{g(u) - x}{h}\right) \prod_{j=1}^{s} g_j(u) = O(h^{-s-1})

when h → 0, so this AIV bound grows in h as h^{-2s-2}. Not so good!


Improvement by a Change of Variable, in One Dimension

Suppose g: [0, 1] → R is monotone. Change of variable w = (x − g(u))/h. In one dimension (s = 1), we have dw/du = −g'(u)/h, so

V_{HK}(\tilde{g}) = \frac{1}{h} \int_0^1 \left| k'\!\left(\frac{x - g(u)}{h}\right) \right| \frac{|g'(u)|}{h}\, du = \frac{1}{h} \int_{-\infty}^{\infty} |k'(w)|\, dw = O(h^{-1}).

Then, if k and g are continuously differentiable, with RQMC points having D^*(P_n) = O(n^{-1+ε}), we obtain AIV = O(n^{-2+ε} h^{-2}). With h = Θ(n^{-1/3}), this gives AMISE = O(n^{-4/3+ε}).

Higher Dimensions

Assumptions. Let g: [0, 1]^s → R be piecewise monotone in each coordinate u_j when the other coordinates are fixed, with at most M_j pieces. Assume that all first-order partial derivatives of g are continuous and that \| g_{w_1} g_{w_2} \cdots g_{w_\ell} \|_1 < \infty for all selections of non-empty, mutually disjoint index sets w_1, ..., w_ℓ ⊆ S = {1, ..., s}.

For each j ∈ S, let

G_j = \Big\| \prod_{\ell \in S \setminus \{j\}} g_{\{\ell\}} \Big\|_1 \quad \text{and} \quad c_j = M_j\, \|k^{(s)}\|_\infty \left( G_j + \mathbb{I}(s = 2)\, \|g_{\{1,2\}}\|_1 \right).

Proposition. The Hardy–Krause variation of \tilde{g} then satisfies V_{HK}(\tilde{g}) ≤ c_j h^{-s} + O(h^{-s+1}) for each j.

Corollary. With RQMC point sets having D^*(P_n) = O(n^{-1+ε}) for all ε > 0 when n → ∞, using Koksma–Hlawka and squaring gives the bound AIV = O(n^{-2+ε} h^{-2s}) for all ε > 0. By picking h to minimize the AMISE bound, we obtain AMISE = O(n^{-4/(2+s)+ε}). Worse than MC when s > 3. The factor h^{-2s} hurts! But this is only an upper bound.


Stratification of the unit cube

Partition [0, 1)^s into n = b^s congruent cubic cells

S_i := \prod_{j=1}^{s} [i_j/b,\, (i_j + 1)/b), \qquad i \in I = \{ (i_1, i_2, \dots, i_s) : 0 \le i_j < b \text{ for each } j \},

for some b ≥ 2. Construct P_n = {U_1, ..., U_n} by sampling one point uniformly in each subcube S_i, independently, and put X_i = g(U_i) for i = 1, ..., n.

Proposition. Suppose g is monotone. Then the KDE obtained from those points has

IV \le (b - a)\, s\, k^2(0)\, h^{-2}\, n^{-(s+1)/s}.

Corollary. By taking h = κ n^{-(s+1)/(6s)}, one has AMISE = O(n^{-(2/3)(s+1)/s}).

This gives a better rate than MC for s ≤ 4 (the exponent (2/3)(s+1)/s exceeds 4/5 exactly when s < 5). The factor h^{-2} hurts, but much less than h^{-2s}.
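A minimal sketch of this stratified sampler (illustrative, not the authors' code):

```python
import numpy as np
from itertools import product

def stratified_unit_cube(b, s, rng=None):
    """One uniform point in each of the n = b^s congruent subcubes
    S_i = prod_j [i_j/b, (i_j+1)/b), independently across subcubes."""
    rng = np.random.default_rng(rng)
    corners = np.array(list(product(range(b), repeat=s))) / b  # lower corners i/b
    return corners + rng.random(corners.shape) / b             # uniform shift in each cell

# Example: n = 8^2 = 64 stratified points in [0, 1)^2
U = stratified_unit_cube(b=8, s=2, rng=123)
```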


Empirical Evaluation with Linear Model in a limited region

Regardless of the asymptotic bounds, the true IV and MISE may behave better than for MC for pairs (n, h) of interest. In a region of interest, we consider the model

MISE = IV + ISB ≈ C n^{-β} h^{-δ} + B h^{α}.

The optimal h for this model satisfies

h^{α+δ} = \frac{C δ}{B α}\, n^{-β},

and it gives MISE ≈ K n^{-αβ/(α+δ)}. We can take the asymptotic α (known) and B (estimated as for MC). To estimate C, β, and δ, estimate the IV over a grid of values of (n, h) and fit the linear regression model

log IV ≈ log C − β log n − δ log h.


Model estimation. For each (n, h), we estimate the IV by making n_r = 100 independent replications of the RQMC density estimator, computing the empirical variance at n_e = 1024 (stratified) evaluation points over [a, b], and multiplying their average by (b − a). We use logs in base 2, since n is a power of 2.

Validation. After estimating the model parameters, we can test out of sample with independent simulation experiments at pairs (n, h) with h = ĥ*(n). For test cases in which the density is known, to assess what RQMC can achieve, we can compute a MISE estimate at those pairs (n, h) and obtain new parameter estimates K̃ and ν̃ for the model MISE ≈ K n^{-ν}.
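A sketch of this procedure, assuming a hypothetical user-supplied callable `density_estimator(n, h, x_eval)` that returns one independent realization of the density estimator at the evaluation points:

```python
import numpy as np

def estimate_iv(density_estimator, n, h, a, b, n_rep=100, n_eval=1024):
    """Estimate IV = int_a^b Var[f_hat(x)] dx from n_rep independent replicates,
    evaluated on a stratified grid over [a, b]."""
    x_eval = a + (b - a) * (np.arange(n_eval) + 0.5) / n_eval
    reps = np.array([density_estimator(n, h, x_eval) for _ in range(n_rep)])
    return (b - a) * reps.var(axis=0, ddof=1).mean()

def fit_iv_model(ns, hs, ivs):
    """Least-squares fit of log2 IV ~ log2 C - beta*log2 n - delta*log2 h.
    ns, hs, ivs: 1-d arrays over the (n, h) grid. Returns (C, beta, delta)."""
    A = np.column_stack([np.ones(len(ns)), -np.log2(ns), -np.log2(hs)])
    coef, *_ = np.linalg.lstsq(A, np.log2(ivs), rcond=None)
    log2C, beta, delta = coef
    return 2.0 ** log2C, beta, delta
```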


Numerical illustrations

RQMC point sets:

◮ MC: independent points (crude Monte Carlo);
◮ Stratification: stratified unit cube;
◮ LMS: Sobol' points with left matrix scrambling (LMS) + digital random shift;
◮ NUS: Sobol' points with nested uniform scrambling.

Simple test example with standard normal density

Let Z_1, ..., Z_s be i.i.d. standard normal, generated by inversion, and

X = \frac{Z_1 + \cdots + Z_s}{\sqrt{s}}.

Then X ∼ N(0, 1). Here we can estimate the IV, ISB, and MISE accurately, and we can compute R(f'') = \int_a^b (f''(x))^2\, dx exactly. We take [a, b] = [−2, 2].
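A minimal sketch of this test case (illustrative parameters), comparing one MC and one scrambled-Sobol' realization of the KDE against the exact N(0, 1) density:

```python
import numpy as np
from scipy.stats import norm, qmc

s, m, h = 3, 14, 0.05
n = 2 ** m
a, b = -2.0, 2.0
x = np.linspace(a, b, 512)

def sample_X(U):
    """X = (Z_1 + ... + Z_s)/sqrt(s), with Z_j = Phi^{-1}(U_j) (inversion)."""
    return norm.ppf(U).sum(axis=1) / np.sqrt(s)

def kde(X, x, h):
    return norm.pdf((x[:, None] - X[None, :]) / h).mean(axis=1) / h

X_mc = sample_X(np.random.default_rng(0).random((n, s)))
X_rq = sample_X(qmc.Sobol(d=s, scramble=True).random_base2(m=m))
ise = lambda X: ((kde(X, x, h) - norm.pdf(x)) ** 2).mean() * (b - a)
print(ise(X_mc), ise(X_rq))   # integrated squared error vs. the true density
```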


Estimates of model parameters for the KDE: ISB = B h^α, IV ≈ C n^{-β} h^{-δ}, MISE ≈ K n^{-ν}. We have α = 4 and B = 0.04754.

method   MC       Sobol + NUS
s        1        1        2        3        5        20
R²       0.999    0.999    1.000    0.995    0.979    0.993
β        1.017    2.787    2.110    1.788    1.288    1.026
δ        1.144    2.997    3.195    3.356    2.293    1.450
α        3.758    3.798    3.846    3.860    3.782    3.870
ν̃        0.770    1.600    1.172    0.981    0.827    0.730
e19      16.96    34.05    24.37    20.80    17.91    17.07

For n = 2^19, we have MISE ≈ 2^{-e19}.


Convergence of the MISE, for s = 2, for histograms (left) and KDE (right).

[Figure: log2(MISE) vs. log2(n) for independent points, stratification, Sobol+LMS, and Sobol+NUS; histograms (left) and KDE (right).]

Displacement of a cantilever beam (Bingham 2017)

Displacement D of a cantilever beam with horizontal load X and vertical load Y:

D = \frac{4L^3}{Ewt} \sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}},

where L = 100, w = 4, t = 2 (in inches), and X, Y, and E are independent and normally distributed with means and standard deviations:

Description       Symbol   Mean       St. dev.
Young's modulus   E        2.9 × 10^7   1.45 × 10^6
Horizontal load   X        500          100
Vertical load     Y        1000         100

We want to estimate the density of X̃ = D/2.2535 − 1 over [a, b] = [0.336, 1.561] (about 99.5% of the density).
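A minimal sketch of the simulation (the response and constants are from the slide; the point set is illustrative):

```python
import numpy as np
from scipy.stats import norm, qmc

L, w, t = 100.0, 4.0, 2.0
MEANS = np.array([2.9e7, 500.0, 1000.0])   # E, X, Y
SDS   = np.array([1.45e6, 100.0, 100.0])

def displacement(U):
    """D = (4 L^3 / (E w t)) sqrt(Y^2/t^4 + X^2/w^4), with (E, X, Y)
    generated by inversion from the three uniforms in each row of U."""
    E, X, Y = (MEANS + SDS * norm.ppf(U)).T
    return (4 * L**3 / (E * w * t)) * np.sqrt(Y**2 / t**4 + X**2 / w**4)

# Normalized response whose density is estimated over [0.336, 1.561]
U = qmc.Sobol(d=3, scramble=True).random_base2(m=16)
X_tilde = displacement(U) / 2.2535 - 1.0
```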

Parameter estimates of the linear regression model for the IV and MISE: IV ≈ C n^{-β} h^{-δ}, MISE ≈ K n^{-ν}.

KDE with Gaussian kernel, α = 4:

Point set     Ĉ         β̂       δ̂       ν̂
Independent   0.210     0.993   1.037   0.789
Sobol+LMS     5.28e-4   1.619   2.949   0.932
Sobol+NUS     5.24e-4   1.621   2.955   0.932


log2(IV) vs log2 n for cantilever with KDE, for fixed h.

[Figure: log2(IV) vs. log2(n) for independent points vs. Sobol+NUS, at h = 2^{−9.5}, 2^{−7.5}, and 2^{−5.5}.]

A weighted sum of lognormals, in s = 12 dimensions

X = \sum_{j=1}^{s} w_j \exp(Y_j), \qquad \text{where } Y = (Y_1, \dots, Y_s)^{\mathsf{t}} \sim N(\mu, C).

Let C = A A^t. To generate Y, generate Z ∼ N(0, I) and put Y = µ + AZ. We will use the principal component (PCA) decomposition for A.

This has several applications. In one of them, with w_j = s_0 (s − j + 1)/s, the quantity e^{−ρ} max(X − K, 0) is the payoff of a financial option based on an average price at s observation times, under a GBM process. We want to estimate the density of the positive payoffs.

Numerical experiment: Take s = 12, ρ = 0.037, s_0 = 100, K = 101, and C defined indirectly via

Y_j = Y_{j-1} + (\mu - \sigma^2/2)/s + \sigma\,[B(j/s) - B((j-1)/s)], \qquad Y_0 = 0,

where σ = 0.12136, µ = 0.1, and B(·) is a standard Brownian motion. We estimate the density of e^{−ρ}(X − K) over [a, b] = [0, 50].
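A sketch of the PCA-based generation for this example. The mean and covariance follow from the Brownian-motion construction above (Cov[Y_j, Y_k] = σ² min(j, k)/s); the weights are taken as written on the slide:

```python
import numpy as np
from scipy.stats import norm, qmc

s, rho, s0, K = 12, 0.037, 100.0, 101.0
sigma, mu_drift = 0.12136, 0.1
tj = np.arange(1, s + 1) / s
mu = (mu_drift - sigma**2 / 2) * tj           # mean of Y
C = sigma**2 * np.minimum.outer(tj, tj)       # Cov[Y_j, Y_k] = sigma^2 min(t_j, t_k)
w = s0 * (s - np.arange(1, s + 1) + 1) / s    # weights w_j as on the slide

# PCA decomposition C = A A^t, columns ordered by decreasing eigenvalue
lam, V = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]
A = V[:, order] * np.sqrt(lam[order])

def payoff(U):
    """Discounted payoff e^{-rho} max(X - K, 0), with Y = mu + A Z, Z = Phi^{-1}(U)."""
    Y = mu + norm.ppf(U) @ A.T
    X = (w * np.exp(Y)).sum(axis=1)
    return np.exp(-rho) * np.maximum(X - K, 0.0)

P = payoff(qmc.Sobol(d=s, scramble=True).random_base2(m=16))
```

Ordering the columns of A by decreasing eigenvalue is what makes RQMC effective here: it concentrates most of the variance in the first few coordinates.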

Histogram of positive payoff values from n = 10^6 independent simulation runs:

[Figure: histogram of the payoff, over roughly [0, 150]; frequencies in units of 10^3.]

IV as a function of n for fixed h, for KDE.

[Figure: log2(IV) vs. log2(n) for independent points vs. Sobol+NUS, at h = 2^{−3.5}, 2^{−1.5}, and 2^{0.5}.]

The RQMC bound in s = 12 dimensions gives a worse rate than MC, but we observe a better actual IV and MISE.


Alternative approach: Can we take the stochastic derivative of an estimator of F?

Can we estimate the density f(x) = F'(x) by the derivative with respect to x of an estimator of F(x)? A simple candidate cdf estimator is the empirical cdf

\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}[X_i \le x].

However, d\hat{F}_n(x)/dx = 0 almost everywhere, so this cannot be a useful density estimator! We need a smoother estimator of F.


Conditional Monte Carlo (CMC) for Derivative Estimation

Idea: Replace the indicator I[X_i ≤ x] by its conditional cdf given filtered (reduced) information G:

F(x | G) := P[X ≤ x | G],

where the sigma-field G contains not enough information to reveal X, but enough to compute F(x | G), and is chosen so that the following holds:

Assumption 1. For all realizations of G, F(x | G) is a continuous function of x, differentiable except perhaps over a denumerable set of points, and for which the derivative F'(x | G) = dF(x | G)/dx (when it exists), itself a random variable for each x, is bounded uniformly in x by some random variable Γ having finite variance.

Theorem. F'(x | G) is an unbiased estimator of f(x), with variance bounded uniformly in x. The overall estimator with n i.i.d. replicates is

\hat{f}_{\mathrm{cmc},n}(x) = \frac{1}{n} \sum_{i=1}^{n} F'(x \mid G_i).

With this estimator, ISB = 0, so MISE = IV, which is rather easy to estimate. The idea was introduced by Asmussen (2018) for a sum of random variables.


Application to the normalized sum of standard normals

We had X = (Z_1 + Z_2 + ... + Z_s)/√s, where each Z_j ∼ N(0, 1). For CMC, we can leave out Z_s, so G = (Z_1, ..., Z_{s−1}) and

F(x | G) = P[X ≤ x | Z_1, ..., Z_{s−1}] = P[Z_s ≤ x√s − (Z_1 + \cdots + Z_{s−1})] = Φ(x√s − (Z_1 + \cdots + Z_{s−1})).

The resulting density estimator is

F'(x | G) = φ(x√s − (Z_1 + \cdots + Z_{s−1}))\, √s.

Assumption 1 is easily verified.
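A minimal sketch of this CMC estimator (only s − 1 uniforms per sample are needed; the point set is illustrative):

```python
import numpy as np
from scipy.stats import norm, qmc

def cmc_density(U, x_eval, s):
    """CMC density estimator for X = (Z_1+...+Z_s)/sqrt(s):
    f_hat(x) = (1/n) sum_i sqrt(s) * phi(x sqrt(s) - S_i),
    where S_i = Z_{i,1}+...+Z_{i,s-1} uses the first s-1 uniforms (Z_s hidden)."""
    S = norm.ppf(U[:, : s - 1]).sum(axis=1)
    arg = np.sqrt(s) * x_eval[:, None] - S[None, :]
    return np.sqrt(s) * norm.pdf(arg).mean(axis=1)

s, m = 3, 14
x_eval = np.linspace(-2, 2, 201)
U = qmc.Sobol(d=s - 1, scramble=True).random_base2(m=m)   # s-1 coordinates suffice
f_hat = cmc_density(U, x_eval, s)
# Unbiased and smooth in x; compare with the exact density norm.pdf(x_eval).
```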


              s = 3                             s = 20
         MC      Sobol+LMS  Sobol+NUS      MC      Sobol+LMS  Sobol+NUS
CMC  ν   1.019   2.116      2.094          0.988   0.961      0.982
     e19 21.36   40.81      40.65          19.27   19.58      19.54
KDE  ν   0.798   0.976      0.975          0.769   0.771      0.760
     e19 17.01   20.79      20.80          17.00   17.10      17.07

For n = 2^19, we have MISE ≈ 2^{-e19}. We see that CMC always helps, with both MC and RQMC. CMC+RQMC brings a huge gain in 3 dimensions, much less in 20 dimensions.


MISE convergence, s = 3 (left) and s = 20 (right):

[Figure: log2(MISE) vs. log2(n) for MC+CMC, MC+KDE, LMS+CMC, and LMS+KDE; two panels.]

Cantilever beam

D = \frac{4L^3}{Ewt} \sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}},

where E, X, Y are normal random variables. We want the density over [a, b] = [0.407, 1.515]. For CMC, we can leave out E, i.e., take G = (X, Y). Then

F(d \mid G) = P[D \le d \mid X, Y] = P\!\left[\frac{4L^3}{Ewt}\sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}} \le d \,\Big|\, X, Y\right] = 1 - \Phi\!\left(\frac{\frac{4L^3}{dwt}\sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}} - \mu_E}{\sigma_E}\right).

Taking the derivative with respect to d, we get

F'(d \mid G) = \phi\!\left(\frac{\frac{4L^3}{dwt}\sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}} - \mu_E}{\sigma_E}\right) \times \frac{4L^3}{d^2 w t\, \sigma_E}\sqrt{\frac{Y^2}{t^4} + \frac{X^2}{w^4}}.
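A sketch of this CMC estimator, vectorized over an evaluation grid for d (the grid range below is an illustrative guess):

```python
import numpy as np
from scipy.stats import norm, qmc

L, w, t = 100.0, 4.0, 2.0
mu_E, sd_E = 2.9e7, 1.45e6

def cmc_density_cantilever(U, d_eval):
    """Average of F'(d | X, Y) over realizations of (X, Y), with E hidden out.
    U has two columns, used to generate X and Y by inversion."""
    X = 500.0 + 100.0 * norm.ppf(U[:, 0])
    Y = 1000.0 + 100.0 * norm.ppf(U[:, 1])
    root = np.sqrt(Y**2 / t**4 + X**2 / w**4)        # sqrt term, one per sample
    d = d_eval[:, None]
    z = (4 * L**3 * root[None, :] / (d * w * t) - mu_E) / sd_E
    fprime = norm.pdf(z) * 4 * L**3 * root[None, :] / (d**2 * w * t * sd_E)
    return fprime.mean(axis=1)

d_eval = np.linspace(3.0, 6.0, 121)                  # plausible range for D
U = qmc.Sobol(d=2, scramble=True).random_base2(m=14)
f_hat = cmc_density_cantilever(U, d_eval)
```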

Estimated MISE = K n^{-ν}. For n = 2^19, we have MISE ≈ 2^{-e19}.

         MC      Strat   LMS     NUS
CMC  ν   1.02    1.80    2.22    2.16
     e19 18.54   30.64   43.18   43.08
KDE  ν   0.81    0.89    0.94    0.96
     e19 15.12   18.08   21.33   21.35

[Figure: log2(MISE) vs. log2(n) for the cantilever beam: MC+CMC, MC+KDE, Strat+CMC, Strat+KDE, LMS+CMC, LMS+KDE.]

Small example: A stochastic activity network

Recall: activity k has random duration Y_k (also the length of arc k) with known cdf F_k(y) := P[Y_k ≤ y], and the project duration X is the (random) length of the longest path from source to sink. We want to estimate the density of X, f(x) = F'(x) = (d/dx) P[X ≤ x].

We saw that \hat{F}_n'(x) is not a meaningful estimator of f(x).

[Figure: the activity network, as before.]

The SAN example, Sobol+NUS vs. independent points, KDE, summary for n = 2^19 = 524288.

        Independent points        Sobol+NUS
h       log2 IV    IV rate        log2 IV    IV rate
0.10    −16.64     0.999          −16.71     1.006
0.18    −17.96     0.999          −18.18     1.015
0.32    −19.33     0.998          −19.79     1.035
0.43    −19.99     0.998          −20.71     1.064

CMC for the SAN Example

We want to estimate the density of the longest path length X. CMC estimator of P[X ≤ x]:

F(x | G) = P[X ≤ x | {Y_j, j ∉ L}],

where L = {4, 5, 6, 8, 9} is the set of arcs whose durations are hidden (integrated out) and Y_j = F_j^{-1}(U_j). This estimator is continuous in the U_j's and in x.

[Figure: the activity network, as before.]
To compute F(x | G): for each l ∈ L, say from node a_l to node b_l, compute the length α_l of the longest path from the source to a_l, and the length β_l of the longest path from b_l to the sink. The longest path that passes through arc l does not exceed x iff α_l + Y_l + β_l ≤ x, which occurs with probability P[Y_l ≤ x − α_l − β_l] = F_l[x − α_l − β_l]. Since the Y_l are independent, we obtain

F(x \mid G) = \prod_{l \in L} F_l[x - \alpha_l - \beta_l].

To estimate the density of X, take the derivative with respect to x:

F'(x \mid G) = \frac{d}{dx} F(x \mid G) \overset{\text{w.p.1}}{=} \sum_{j \in L} f_j[x - \alpha_j - \beta_j] \prod_{l \in L,\ l \ne j} F_l[x - \alpha_l - \beta_l].

Assumption 1 holds if the F_j are smooth enough, and then E[F'(x | G)] = f_X(x).
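A sketch of F'(x | G) for one realization of G, assuming the α_l and β_l have already been computed from the realized durations Y_j, j ∉ L (e.g., with a longest-path routine as sketched earlier). In the Elmaghraby example, the hidden arcs L = {4, 5, 6, 8, 9} are all exponential, with means 16.5, 14.7, 10.3, 4.0, 20.0:

```python
import numpy as np
from scipy.stats import expon

MEANS_L = np.array([16.5, 14.7, 10.3, 4.0, 20.0])  # mu_l for l in L = {4,5,6,8,9}

def cmc_san_density(x, alpha, beta, means=MEANS_L):
    """F'(x|G) = sum_j f_j(x - a_j - b_j) * prod_{l != j} F_l(x - a_l - b_l),
    for one realization of G; alpha, beta are arrays over the hidden arcs L."""
    u = x - alpha - beta
    F = expon.cdf(u, scale=means)   # Expon(1/mu) has mean mu, i.e. scale = mu
    f = expon.pdf(u, scale=means)
    return sum(f[j] * np.prod(np.delete(F, j)) for j in range(len(F)))
```

Averaging this quantity over n independent (or RQMC) realizations of G gives the unbiased density estimator \hat{f}_{\mathrm{cmc},n}(x).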

Estimated MISE = K n^{-ν}, for the CMC estimator on the SAN example. For n = 2^19, MISE ≈ 2^{-e19}.

         MC      LMS     NUS
CMC  ν   0.99    1.34    1.32
     e19 25.48   29.67   29.66

For comparison, the base case without CMC and RQMC gave e19 ≈ 20. With MC, the IV converges as O(1/n) and there is no bias, so MISE = IV. With RQMC, we observe a convergence rate near O(n^{-4/3}) for the IV and the MISE.

[Figure: log2(MISE) vs. log2(n) for the SAN example: MC+CMC, LMS+CMC, NUS+CMC.]

Conclusion

◮ Both CMC and RQMC can improve the convergence rate of the IV and the MISE when estimating a density.
◮ With KDEs, the convergence rates observed in small examples are much better than the bounds proved from standard QMC theory. There are opportunities for QMC theoreticians here!
◮ The combination of CMC with RQMC for density estimation is very promising. Lots of potential applications.
◮ Future: density estimation for a function of the state of a Markov chain, using Array-RQMC.
◮ More applications.


Some references

◮ S. Asmussen. Conditional Monte Carlo for sums, with applications to insurance and finance. Annals of Actuarial Science, prepublication, 1–24, 2018.
◮ A. Ben Abdellah, P. L’Ecuyer, A. B. Owen, and F. Puchhammer. Density estimation by randomized quasi-Monte Carlo. Submitted, 2018.
◮ J. Dick and F. Pillichshammer. Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge, U.K., 2010.
◮ P. L’Ecuyer. A unified view of the IPA, SF, and LR gradient estimation techniques. Management Science, 36:1364–1383, 1990.
◮ P. L’Ecuyer. Quasi-Monte Carlo methods with applications in finance. Finance and Stochastics, 13(3):307–349, 2009.
◮ P. L’Ecuyer. Randomized quasi-Monte Carlo: An introduction for practitioners. In P. W. Glynn and A. B. Owen, editors, Monte Carlo and Quasi-Monte Carlo Methods 2016, 2017.
◮ P. L’Ecuyer and G. Perron. On the convergence rates of IPA and FDC derivative estimators for finite-horizon stochastic simulations. Operations Research, 42(4):643–656, 1994.
◮ A. B. Owen. Scrambled net variance for integrals of smooth functions. Annals of Statistics, 25(4):1541–1562, 1997.
◮ D. W. Scott. Multivariate Density Estimation. Wiley, 2015.