Rare events: models and simulations Josselin Garnier (Universit e - PowerPoint PPT Presentation

Rare events: models and simulations Josselin Garnier (Universit´ e Paris Diderot) http://www.proba.jussieu.fr/~garnier Cf: http://www.proba.jussieu.fr/˜garnier/expo1.pdf • Introduction to uncertainty quantification. • Estimation of the probability of a rare event (such as the failure of a complex system). • Standard methods (quadrature, Monte Carlo). • Advanced Monte Carlo methods (variance reduction techniques). • Interacting particle systems. • Quantile estimation. CEMRACS 2013 Rare events

Uncertainty quantification • General problem: • How can we model the uncertainties in physical and numerical models ? • How can we estimate (quantify) the variability of the output of a code or an experiment as a function of the variability of the input parameters ? • How can we estimate quantify the sensitivity or the variability of the output of a code or an experiment with respect to one particular parameter ? CEMRACS 2013 Rare events

✬ ✩ ✬ ✩ ✬ ✩ Step 2 Step 1 Step 0 Uncertainty Quantification Definition of the model propagation ✲ ✲ of the sources Y = f ( X ) Central trends of uncertainty Input X Rare events ✫ ✪ ✫ ✪ Law of X Output Y ✫ ✪ ✤ ✜ Quantile ✻ Step 3 Sensitivity analysis ✛ Local variational approach ✣ ✢ Global stochastic approach CEMRACS 2013 Rare events

Uncertainty propagation • Context: numerical code (black box) or experiment Y = f ( X ) with Y = output X = random input parameters, with known distribution (with pdf p ( x )) � for any A ∈ B ( R d ) P ( X ∈ A ) = p ( x ) d x A f = deterministic function R d → R (computationally expensive). • Goal: estimation of a quantity of the form E [ g ( Y )] with an “error bar” and the minimal number of simulations. Examples (for a real-valued output Y ): • g ( y ) = y → mean of Y , i.e. E [ Y ] • g ( y ) = y 2 → variance of Y , i.e. Var( Y ) = E [( Y − E [ Y ]) 2 ] = E [ Y 2 ] − E [ Y ] 2 • g ( y ) = 1 [ a, ∞ ) ( y ) → probability P ( Y ≥ a ). CEMRACS 2013 Rare events

Analytic method • The quantity to be estimated is a d -dimensional integral: � I = E [ g ( Y )] = E [ h ( X )] = R d h ( x ) p ( x ) d x where p ( x ) is the pdf of X and h ( x ) = g ( f ( x )). • In simple cases (when the pdf p and the function h have explicit expressions), one can sometimes evaluate the integral exactly (exceptional situation). CEMRACS 2013 Rare events

Quadrature method • The quantity to be estimated is a d -dimensional integral: � I = E [ g ( Y )] = E [ h ( X )] = R d h ( x ) p ( x ) d x where p ( x ) is the pdf of X and h ( x ) = g ( f ( x )). • If p ( x ) = � d i =1 p 0 ( x i ), then it is possible to apply Gaussian quadrature with a tensorized grid with n d points: n n � � ˆ I = · · · ρ j 1 · · · ρ j d h ( ξ j 1 , . . . , ξ j d ) j 1 =1 j d =1 with the weights ( ρ j ) j =1 ,...,n and the points ( ξ j ) j =1 ,...,n associated to the quadrature with weighting function p 0 . • There exist quadrature methods with sparse grids (cf Smolyak). • Quadrature methods are efficient when: - the function x → h ( x ) is smooth (and not only f ), - the dimension d is “small” (even with sparse grids). They require many calls. CEMRACS 2013 Rare events

Monte Carlo method Principle: replace the statistical expectation E [ g ( Y )] by an empirical mean. CEMRACS 2013 Rare events

Monte Carlo method: “head and tail” model A code gives a real-valued output Y = f ( X ). For a given a we want to estimate P = P ( f ( X ) ≥ a ) • Monte Carlo method: 1) n simulations are carried out with n independent realizations X 1 , . . . , X n (with the distribution of X ). 2) let us define   1 if f ( X k ) ≥ a (head) Z k = 1 [ a, ∞ ) ( f ( X k )) =  0 if f ( X k ) < a (tail) • Intuition: when n is large, the empirical proportion of “1”s is close to P Z 1 + · · · + Z n ≃ P n Therefore the empirical proportion of “1”s can be used to estimate P . • Empirical estimator of P : n � P n := 1 ˆ Z k n k =1 CEMRACS 2013 Rare events

• Empirical estimator of P : n � P n := 1 ˆ Z k n k =1 • The estimator is unbiased: � � � � n n � � 1 = 1 ˆ P n = E Z k E [ Z k ] = E [ Z 1 ]= P E n n k =1 k =1 • The law of large numbers shows that the estimator is convergent: n � P n = 1 n →∞ ˆ − → E [ Z 1 ] = P Z k n k =1 because the variables Z k are independent and identically distributed (i.i.d.). • Result (law of large numbers): Let ( Z n ) n ∈ N ∗ be a sequence of i.i.d. random variables. If E [ | Z 1 | ] < ∞ , then n � 1 n →∞ − → m with probability 1, with m = E [ Z 1 ] Z k n k =1 “The empirical mean converges to the statistical mean”. CEMRACS 2013 Rare events

Error analysis: we want to quantify the fluctuations of ˆ P n around P . • Variance calculation: � P n − P ) 2 � Var[ ˆ ( ˆ P n ] = (mean square error) E �� n �� 2 � �� Z k − P � 2 � � Z k − P n � � = = E E n n k =1 k =1 n � � ( Z k − P ) 2 � � ( Z 1 − P ) 2 � 1 = 1 = E n E n 2 k =1 � � � − P 2 � 1 Z 2 = E 1 n 1 n ( P − P 2 ) = • The relative error is therefore: � � � 1 � 1 / 2 Var( ˆ Var( ˆ P n ) P n ) 1 √ n Error = = = P − 1 E [ ˆ P P n ] CEMRACS 2013 Rare events

Confidence intervals • Question : The estimator ˆ P n gives an approximate value of P , all the better as n is larger. How to quantify the error ? • Answer : We build a confidence interval at the level 0 . 95, i.e. an empirical a n , ˆ interval [ˆ b n ] such that � � a n , ˆ P ∈ [ˆ b n ] ≥ 0 . 95 P Construction based on the De Moivre theorem: √ �� c � P − P 2 2 � � e − x 2 / 2 dx n →∞ � ˆ P n − P � < c √ n − → √ P 2 π 0 The right member is 0 . 05 if c = 1 . 96. Therefore √ √ � � �� P − P 2 P − P 2 ˆ , ˆ P ∈ P n − 1 . 96 √ n P n + 1 . 96 √ n ≃ 0 . 95 P • Result (central limit theorem): Let ( Z n ) n ∈ N ∗ be a sequence of i.i.d. random variables. If E [ Z 2 1 ] < ∞ , then � 1 � n � √ n n →∞ → N (0 , σ 2 ) in distribution Z k − m − n k =1 where m = E [ Z 1 ] and σ 2 = Var( Z 1 ). � n “For large n , the error 1 k =1 Z k − m has Gaussian distribution N (0 , σ 2 /n ).” n CEMRACS 2013 Rare events

√ √ � � �� P − P 2 P − P 2 ˆ , ˆ P ∈ P n − 1 . 96 √ n √ n ≃ 0 . 95 P n + 1 . 96 P The unknown parameter P is still in the bounds of the interval ! Two solutions: √ P − P 2 < 1 / 2 and - P ∈ [0 , 1], therefore � � �� P n − 0 . 98 1 P n + 0 . 98 1 ˆ √ n, ˆ P ∈ √ n ≥ 0 . 95 P - asymptotically, we can replace P in the bounds by ˆ P n (OK if nP > 10 and n (1 − P ) > 10): � �     P n − ˆ ˆ P n − ˆ ˆ P 2 P 2 n n  ˆ , ˆ  ≃ 0 . 95  P ∈  √ n √ n P n − 1 . 96 P n + 1 . 96 P Conclusion: There is no bounded interval of R that contains P with probability one. There are bounded intervals (called confidence intervals) that contain P with probability close to one (chosen by the user). CEMRACS 2013 Rare events

Monte Carlo estimation: black box model • Black box model (numerical code) Y = f ( X ) We want to estimate I = E [ g ( Y )], for some function g : R → R . • Empirical estimator: n � I n = 1 � g ( f ( X k )) n k =1 where ( X k ) k =1 ,...,n is a n -sample of X . This is the empirical mean of a sequence of i.i.d. random variables. • The estimator � I n is unbiased: E [ � I n ] = I . • The law of large numbers gives the convergence of the estimator: n →∞ � I n − → I with probability 1 CEMRACS 2013 Rare events

• Error: I n ) = 1 Var( � n Var( g ( Y )) Proof: the variance of a sum of i.i.d. random variables if the sum of the variances. • Asymptotic confidence interval: � � �� I n − 1 . 96 ˆ σ n I n + 1 . 96 ˆ σ n � √ n, � I ∈ √ n ≃ 0 . 95 P where � � 1 / 2 n � 1 g ( f ( X k )) 2 − � I 2 σ n = ˆ n n k =1 • Advantages of the MC method: 1) no regularity condition for f , g (condition: E [ g ( f ( X )) 2 ] < ∞ ). 2) convergence rate 1 / √ n in any dimension. 3) can be applied for any quantity that can be expressed as an expectation. • One needs to simulate samples of X . CEMRACS 2013 Rare events

Simulation of random variables • How do we generate random numbers with a computer ? There is nothing random in a computer ! • Strategy: - find a pseudo random number generator that can generate a sequence of numbers that behave like independant copies of a random variable with uniform distribution over (0 , 1). - use deterministic transforms to generate numbers with any prescribed distribution using only the uniform pseudo random number generator. CEMRACS 2013 Rare events

• Pseudo random number generator A 32-bit multiplicative congruential generator: x n +1 = ax n mod b, with a = 7 5 , b = 2 31 − 1, and some integer x 0 . This gives a sequence of integer numbers in { 0 , 1 , ..., 2 31 − 2 } . The sequence u n = x n / (2 31 − 1) gives a “quasi-real” number between 0 and 1. Note: the sequence is periodic, with period 2 31 − 1. This is the generator mcg16807 of matlab (used in early versions). Today: matlab uses mt19937ar (the period is 2 19937 − 1). CEMRACS 2013 Rare events

Rare events: models and simulations Josselin Garnier (Universit e - PowerPoint PPT Presentation

Rare events: models and simulations Josselin Garnier (Universit e Paris Diderot) http://www.proba.jussieu.fr/~garnier Cf: http://www.proba.jussieu.fr/garnier/expo1.pdf Introduction to uncertainty quantification. Estimation of the

Interacting particle systems for the analysis of rare events Josselin Garnier (Universit e

It os calculus in physics and stochastic partial differential equations Josselin Garnier

Multiscale Analysis of Wave Propagation and Imaging in Random Media Josselin Garnier (Ecole

ISBT WORKING PARTY ON RARE DONORS Israel Have a national Rare Donor Program: Yes Number of

Atomistic simulations of rare events using the using the gentlest ascent gentlest ascent

CME / CE Rare Disease Clinical Trials Module: Rare Pulmonology Clinical Trials Timothy Craig,

Rare Rare Rare-earth Rare-earth earth-based half earth-based half based half-Heusler based

COMPUTER COMPUTER COMPUTER COMPUTER SIMULATIONS SIMULATIONS SIMULATIONS SIMULATIONS

Rare cancers What is the problem, and how big is it? Rare (orphan) diseases NIH Office for Rare

ISBT WORKING PARTY ON RARE DONORS Canada Canadian Blood Services Have a national Rare Donor

RAREDISEASEDAY.ORG 1 Rare Disease Day 2020 13 th edition on Saturday 29 February 2020

ISBT WORKING PARTY ON RARE DONORS SINGAPORE Have a national Rare Donor Program: YES Number of

CME / CE Rare Disease Clinical Trials Focus on Rare Hematologic Clinical Trials Morie A Gertz

Welcome to Idorsias AGM 2018 Meeting structure 1 Opening Remarks Jean-Pierre Garnier,

Model for DNA-less cell fate. Guillaume Garnier Team GO ParisSaclay 2019 Universit

2019 Meeting structure 1 Opening Remarks Jean-Pierre Garnier, Chairman of the Board 2

Tree-based and GA tools for optimal sampling design The R User Conference 2008 August 12-14,

Advance Stochastic Gradient with Variance Reduction Jingchang Liu December 7, 2017 University

Stratified Monte Carlo Integration and Applications R. El Haddad, R. Fakhereddine, C. L ecot,

Statistics I Chapter 7 Sampling Distributions (Part 1) Ling-Chieh Kung Department of

BlinkDB (some figures were poached from the Eurosys conference talk) The Holy Grail Support

Coxs proportional hazards/regression model - model assessment Rasmus Waagepetersen October

Announcements TCE website open - please fill it out! no assignment due next week So

Sampling Methods: How to collect data Some important terms Random - occurring by chance