Advanced Simulation - Lecture 3 (Patrick Rebeschini, January 22nd, 2018)


  1. Advanced Simulation - Lecture 3. Patrick Rebeschini. January 22nd, 2018.

  2. From a statistical problem to a sampling problem
     From a statistical model you get a likelihood function and a prior on the parameters. Applying Bayes' rule, you are interested in the posterior
     \[ \pi(\theta \mid \text{observations}) = \frac{L(\text{observations};\theta)\, p(\theta)}{\int_\Theta L(\text{observations};\theta)\, p(\theta)\, d\theta}. \]
     Inference ≡ integrals with respect to the posterior distribution. Integrals can be approximated by Monte Carlo, and for Monte Carlo you need samples. Today: inversion, transformation, composition, rejection.
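
As a minimal illustration of the Monte Carlo idea (a sketch with illustrative choices, not tied to this course's examples; NumPy assumed): an expectation under a distribution is approximated by an average over samples drawn from it.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in "posterior": a Beta(3, 2) distribution (purely illustrative).
# Monte Carlo estimate of E[theta^2] under it: average h(theta) = theta^2 over samples.
samples = rng.beta(3.0, 2.0, size=100_000)
estimate = np.mean(samples ** 2)   # close to the true second moment 2/5 = 0.4
```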

  3. Inversion Method
     Consider a real-valued random variable X and its associated cumulative distribution function (cdf) F(x) = P(X ≤ x). The cdf F : R → [0, 1] is
     - increasing: if x ≤ y then F(x) ≤ F(y);
     - right continuous: F(x + ε) → F(x) as ε → 0⁺;
     - such that F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞.
     We define the generalised inverse
     \[ F^{-}(u) = \inf\{ x \in \mathbb{R} : F(x) \ge u \}, \]
     also known as the quantile function.

  4. Inversion Method
     [Figure: cumulative distribution function F, illustrating how a level u on the vertical axis is mapped to the generalised inverse F^{-}(u) on the horizontal axis.]

  5. Inversion Method
     Proposition. Let F be a cdf and U ∼ U[0, 1]. Then X = F^{-}(U) has cdf F.
     In other words, to sample from a distribution with cdf F, we can sample U ∼ U[0, 1] and then return F^{-}(U).
     Proof. F^{-}(u) ≤ x ⇔ u ≤ F(x), so for U ∼ U[0, 1] we have
     \[ P\left( F^{-}(U) \le x \right) = P\left( U \le F(x) \right) = F(x). \]

  6. Examples
     Exponential distribution. If F(x) = 1 − e^{−λx}, then F^{-}(u) = F^{−1}(u) = −log(1 − u)/λ. Thus when U ∼ U[0, 1], −log(1 − U)/λ ∼ Exp(λ), and since 1 − U ∼ U[0, 1] also −log(U)/λ ∼ Exp(λ).
     Discrete distribution. Assume X takes values x_1 < x_2 < ··· with probabilities p_1, p_2, ..., so
     \[ F(x) = \sum_{k : x_k \le x} p_k, \qquad F^{-}(u) = x_k \ \text{for}\ p_1 + \cdots + p_{k-1} < u \le p_1 + \cdots + p_k. \]
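
A minimal sketch of both inversion samplers (assuming NumPy; the function names are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_exponential(lam, n):
    """Inversion: F^-(u) = -log(1 - u)/lam; since 1 - U is also uniform, -log(U)/lam works too."""
    return -np.log(rng.uniform(size=n)) / lam

def sample_discrete(values, probs, n):
    """Inversion for a finite discrete distribution: return x_k such that
    p_1 + ... + p_{k-1} < U <= p_1 + ... + p_k."""
    u = rng.uniform(size=n)
    cum = np.cumsum(probs)
    idx = np.searchsorted(cum, u)           # smallest k with cum[k] >= u
    return np.asarray(values)[idx]

x = sample_exponential(2.0, 10_000)         # sample mean should be close to 1/2
y = sample_discrete([1, 2, 3], [0.2, 0.5, 0.3], 10_000)
```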

  7. Transformation Method
     Let Y ∼ q be a Y-valued random variable that we can simulate (e.g., by inversion), and let X ∼ π be the X-valued random variable we wish to simulate. It may be that we can find a function ϕ : Y → X with the property that if we simulate Y ∼ q and then set X = ϕ(Y), we get X ∼ π. Inversion is a special case of this idea.

  8. Transformation Method
     Gamma distribution. Let Y_i, i = 1, 2, ..., α (α a positive integer), be i.i.d. with Y_i ∼ Exp(1), and set X = β^{−1} Σ_{i=1}^{α} Y_i; then X ∼ Ga(α, β).
     Proof. The moment generating function of X is
     \[ E\left[ e^{tX} \right] = \prod_{i=1}^{\alpha} E\left[ e^{\beta^{-1} t Y_i} \right] = (1 - t/\beta)^{-\alpha}, \]
     which is the MGF of the gamma density π(x) ∝ x^{α−1} exp(−βx) with parameters α, β.
     Beta distribution. See Lecture Notes.
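
A minimal sketch of this sum-of-exponentials construction (assuming NumPy; valid only for an integer shape parameter α, as in the slide's construction):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_gamma(alpha, beta, n):
    """Ga(alpha, beta) via X = (1/beta) * sum of alpha i.i.d. Exp(1) variables (integer alpha only)."""
    y = -np.log(rng.uniform(size=(n, alpha)))   # n rows of alpha Exp(1) draws, by inversion
    return y.sum(axis=1) / beta

x = sample_gamma(alpha=3, beta=2.0, n=10_000)   # sample mean should be close to alpha/beta = 1.5
```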

  9. Transformation Method - Box-Muller Algorithm
     Gaussian distribution. Let U_1 ∼ U[0, 1] and U_2 ∼ U[0, 1] be independent and set
     \[ R = \sqrt{-2 \log U_1}, \qquad \vartheta = 2\pi U_2. \]
     We have X = R cos ϑ ∼ N(0, 1) and Y = R sin ϑ ∼ N(0, 1).
     Indeed R² ∼ Exp(1/2) and ϑ ∼ U[0, 2π], so
     \[ q\left( r^2, \theta \right) = \frac{1}{2} \exp\left( -r^2/2 \right) \cdot \frac{1}{2\pi}. \]

  10. Transformation Method - Box-Muller Algorithm
     The map is a bijection:
     \[ (x, y) = \left( \sqrt{r^2}\cos\theta,\ \sqrt{r^2}\sin\theta \right) \iff \left( r^2, \theta \right) = \left( x^2 + y^2,\ \arctan(y/x) \right), \]
     so
     \[ \pi(x, y) = q\left( r^2, \theta \right) \left| \det \frac{\partial\left( r^2, \theta \right)}{\partial (x, y)} \right|, \]
     where
     \[ \left| \det \frac{\partial\left( r^2, \theta \right)}{\partial (x, y)} \right| = \left| \det \frac{\partial (x, y)}{\partial\left( r^2, \theta \right)} \right|^{-1} = \left| \det \begin{pmatrix} \frac{1}{2r}\cos\theta & -r\sin\theta \\ \frac{1}{2r}\sin\theta & r\cos\theta \end{pmatrix} \right|^{-1} = 2. \]
     Hence we have
     \[ \pi(x, y) = \frac{1}{2\pi} \exp\left( -\left( x^2 + y^2 \right)/2 \right). \]
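
A minimal sketch of the Box-Muller transform (assuming NumPy; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def box_muller(n):
    """Return 2n i.i.d. N(0, 1) samples built from n pairs of independent uniforms."""
    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)
    r = np.sqrt(-2.0 * np.log(u1))      # radius, with R^2 ~ Exp(1/2)
    theta = 2.0 * np.pi * u2            # angle ~ U[0, 2*pi]
    return np.concatenate([r * np.cos(theta), r * np.sin(theta)])

z = box_muller(5_000)                   # sample mean ~ 0, sample variance ~ 1
```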

  11. Transformation Method - Multivariate Normal
     Let Z = (Z_1, ..., Z_d) be i.i.d. ∼ N(0, 1), let L be a real invertible d × d matrix satisfying L Lᵀ = Σ, and set X = LZ + µ. Then X ∼ N(µ, Σ).
     Indeed we have q(z) = (2π)^{−d/2} exp(−½ zᵀz) and
     \[ \pi(x) = q(z) \left| \det \frac{\partial z}{\partial x} \right|, \quad \text{where}\ \frac{\partial z}{\partial x} = L^{-1}\ \text{and}\ \left| \det L^{-1} \right| = \det(\Sigma)^{-1/2}. \]
     Additionally,
     \[ z^{T} z = (x - \mu)^{T} \left( L^{-1} \right)^{T} L^{-1} (x - \mu) = (x - \mu)^{T} \Sigma^{-1} (x - \mu). \]
     In practice, use a Cholesky factorisation Σ = L Lᵀ where L is a lower triangular matrix.
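
A minimal sketch of the Cholesky-based construction (assuming NumPy; the mean and covariance below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_mvn(mu, sigma, n):
    """N(mu, sigma) via X = L Z + mu, where sigma = L L^T is a Cholesky factorisation."""
    L = np.linalg.cholesky(sigma)                  # lower-triangular factor
    z = rng.standard_normal(size=(n, len(mu)))     # i.i.d. N(0, 1) entries
    return z @ L.T + mu

mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
x = sample_mvn(mu, sigma, 10_000)                  # empirical covariance should be close to sigma
```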

  12. Sampling via Composition
     Assume we have a joint pdf π_{X,Y} with marginal π; i.e.
     \[ \pi(x) = \int \pi_{X,Y}(x, y)\, dy, \]
     where π_{X,Y}(x, y) can always be decomposed as
     \[ \pi_{X,Y}(x, y) = \pi_{Y}(y)\, \pi_{X \mid Y}(x \mid y). \]
     It might be easy to sample from π_{X,Y}(x, y) whereas it is difficult/impossible to compute π(x). In this case, it is sufficient to sample Y ∼ π_Y and then X | Y ∼ π_{X|Y}(· | Y), so (X, Y) ∼ π_{X,Y} and hence X ∼ π.

  13. Finite Mixture of Distributions
     Assume one wants to sample from
     \[ \pi(x) = \sum_{i=1}^{p} \alpha_i\, \pi_i(x), \]
     where α_i > 0, Σ_{i=1}^{p} α_i = 1, and each π_i satisfies π_i(x) ≥ 0 and ∫ π_i(x) dx = 1.
     We can introduce Y ∈ {1, ..., p} and
     \[ \pi_{X,Y}(x, y) = \alpha_y \times \pi_y(x). \]
     To sample from π(x), first sample Y from a discrete distribution such that P(Y = k) = α_k, then X | (Y = y) ∼ π_y.
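
A minimal sketch of composition for a two-component Gaussian mixture (assuming NumPy; the weights and component parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_mixture(weights, means, sds, n):
    """Composition: draw the component index Y with P(Y = k) = weights[k],
    then draw X | Y = k ~ N(means[k], sds[k]^2)."""
    k = rng.choice(len(weights), size=n, p=weights)              # discrete step
    return rng.normal(loc=np.asarray(means)[k], scale=np.asarray(sds)[k])

x = sample_mixture(weights=[0.3, 0.7], means=[-2.0, 1.0], sds=[0.5, 1.5], n=10_000)
```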

  14. Rejection Sampling
     Basic idea: sample from a proposal q different from the target π and correct through a rejection step to obtain a sample from π.
     Algorithm (Rejection Sampling). Given two densities π, q with π(x) ≤ M q(x) for all x, we can generate a sample from π by:
     1. Draw X ∼ q, draw U ∼ U[0, 1].
     2. Accept X = x as a sample from π if
     \[ U \le \frac{\pi(x)}{M q(x)}, \]
     otherwise go back to step 1.
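
A minimal sketch of the accept/reject loop (assuming NumPy; passing the target, proposal and bound as arguments is an illustrative interface, not the lecture's notation):

```python
import numpy as np

rng = np.random.default_rng(5)

def rejection_sample(target_pdf, proposal_pdf, proposal_sampler, M, n):
    """Keep X ~ q whenever U <= pi(X) / (M q(X)); repeat until n samples are accepted."""
    out = []
    while len(out) < n:
        x = proposal_sampler()
        if rng.uniform() <= target_pdf(x) / (M * proposal_pdf(x)):
            out.append(x)
    return np.array(out)
```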

  15. Rejection Sampling
     Proposition. The distribution of the samples accepted by rejection sampling is π.
     Proof. For any (measurable) set A,
     \[ P(X \in A \mid X \text{ accepted}) = \frac{P(X \in A,\ X \text{ accepted})}{P(X \text{ accepted})}, \]
     where
     \[ P(X \in A,\ X \text{ accepted}) = \int_{\mathsf{X}} \mathbb{I}_A(x) \int_0^1 \mathbb{I}\left( u \le \frac{\pi(x)}{M q(x)} \right) du\, q(x)\, dx = \int_{\mathsf{X}} \mathbb{I}_A(x)\, \frac{\pi(x)}{M q(x)}\, q(x)\, dx = \int_{\mathsf{X}} \mathbb{I}_A(x)\, \frac{\pi(x)}{M}\, dx = \frac{\pi(A)}{M}. \]

  16. Rejection Sampling
     So
     \[ P(X \text{ accepted}) = P(X \in \mathsf{X},\ X \text{ accepted}) = \frac{\pi(\mathsf{X})}{M} = \frac{1}{M} \]
     and P(X ∈ A | X accepted) = π(A). Rejection sampling therefore produces samples from π. It requires being able to evaluate the density π point-wise, and an upper bound M on π(x)/q(x).

  17. Rejection Sampling
     In most practical scenarios, we only know π and q up to some normalising constants; i.e.
     \[ \pi = \tilde{\pi}/Z_\pi \quad \text{and} \quad q = \tilde{q}/Z_q, \]
     where π̃, q̃ are known but Z_π = ∫_X π̃(x) dx and Z_q = ∫_X q̃(x) dx are unknown.
     If Z_π, Z_q are unknown but you can find M̃ with π̃(x)/q̃(x) ≤ M̃ for all x, then using π̃, q̃ and M̃ in the algorithm is correct. Indeed we have
     \[ \frac{\tilde{\pi}(x)}{\tilde{q}(x)} \le \tilde{M} \iff \frac{\pi(x)}{q(x)} \le \tilde{M}\, \frac{Z_q}{Z_\pi} = M. \]

  18. Rejection Sampling
     Let T denote the number of pairs (X, U) that have to be generated until X is accepted for the first time.
     Lemma. T is geometrically distributed with parameter 1/M, and in particular E(T) = M.
     In the unnormalised case, this yields
     \[ P(X \text{ accepted}) = \frac{1}{M} = \frac{Z_\pi}{\tilde{M} Z_q}, \qquad E(T) = M = \frac{\tilde{M} Z_q}{Z_\pi}, \]
     and it can be used to provide unbiased estimates of Z_π/Z_q and Z_q/Z_π.

  19. Examples
     Uniform density on a bounded subset of R^p. Consider the problem of sampling uniformly over B ⊂ R^p, a bounded subset of R^p:
     \[ \pi(x) \propto \mathbb{I}_B(x). \]
     Let R be a rectangle with B ⊂ R and q(x) ∝ I_R(x). Then we can use π̃(x) = I_B(x), q̃(x) = I_R(x) and M̃ = 1, since π̃(x)/q̃(x) = I_B(x) ≤ 1. The probability of accepting a sample is then Z_π/Z_q.
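
A minimal sketch for B the unit disc in R^2 and R = [-1, 1]^2 (an illustrative choice of B and R, not from the slides; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(6)

def sample_unit_disc(n):
    """Uniform samples on the unit disc by rejection from the enclosing square [-1, 1]^2.
    With M~ = 1, a proposal is accepted exactly when it falls inside the disc (I_B(x) = 1)."""
    out = []
    while len(out) < n:
        x = rng.uniform(-1.0, 1.0, size=2)
        if x[0] ** 2 + x[1] ** 2 <= 1.0:
            out.append(x)
    return np.array(out)

samples = sample_unit_disc(1_000)    # acceptance rate ~ Z_pi / Z_q = pi/4 ~ 0.785
```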

  20. Examples
     Normal density. Let π̃(x) = exp(−x²/2) and q̃(x) = 1/(1 + x²). We have
     \[ \frac{\tilde{\pi}(x)}{\tilde{q}(x)} = \left( 1 + x^2 \right) \exp\left( -\tfrac{1}{2} x^2 \right) \le 2/\sqrt{e} = \tilde{M}, \]
     which is attained at x = ±1. The acceptance probability is
     \[ P\left( U \le \frac{\tilde{\pi}(X)}{\tilde{M}\, \tilde{q}(X)} \right) = \frac{Z_\pi}{\tilde{M} Z_q} = \frac{\sqrt{2\pi}}{(2/\sqrt{e})\, \pi} = \sqrt{\frac{e}{2\pi}} \approx 0.66, \]
     and the mean number of trials to success is approximately 1/0.66 ≈ 1.52.
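
A minimal sketch of this sampler (assuming NumPy; the proposal, proportional to a standard Cauchy density, is drawn by inversion as tan(π(U − 1/2)), a standard fact not spelled out on the slide):

```python
import numpy as np

rng = np.random.default_rng(7)

M_TILDE = 2.0 / np.sqrt(np.e)                        # bound on pi~(x)/q~(x), attained at x = +/- 1

def sample_normal_by_rejection(n):
    """N(0, 1) via rejection with unnormalised target exp(-x^2/2) and proposal shape 1/(1 + x^2)."""
    out, trials = [], 0
    while len(out) < n:
        trials += 1
        x = np.tan(np.pi * (rng.uniform() - 0.5))    # Cauchy proposal by inversion
        ratio = (1.0 + x ** 2) * np.exp(-0.5 * x ** 2)   # pi~(x) / q~(x)
        if rng.uniform() <= ratio / M_TILDE:
            out.append(x)
    return np.array(out), len(out) / trials          # acceptance rate should be ~ 0.66

samples, acceptance_rate = sample_normal_by_rejection(10_000)
```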

  21. Examples: Genetic linkage model
     We observe
     \[ (Y_1, Y_2, Y_3, Y_4) \sim \mathcal{M}\left( n;\ \tfrac{1}{2} + \tfrac{\theta}{4},\ \tfrac{1}{4}(1 - \theta),\ \tfrac{1}{4}(1 - \theta),\ \tfrac{\theta}{4} \right), \]
     where M is the multinomial distribution and θ ∈ (0, 1). The likelihood of the observations is thus
     \[ p(y_1, ..., y_4; \theta) = \frac{n!}{y_1!\, y_2!\, y_3!\, y_4!} \left( \frac{1}{2} + \frac{\theta}{4} \right)^{y_1} \left( \frac{1 - \theta}{4} \right)^{y_2 + y_3} \left( \frac{\theta}{4} \right)^{y_4} \propto (2 + \theta)^{y_1} (1 - \theta)^{y_2 + y_3} \theta^{y_4}. \]
     Bayesian approach: we select the prior p(θ) = I_{[0,1]}(θ) and are interested in
     \[ p(\theta \mid y_1, ..., y_4) \propto (2 + \theta)^{y_1} (1 - \theta)^{y_2 + y_3} \theta^{y_4}\, \mathbb{I}_{[0,1]}(\theta). \]
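
As a sketch of how rejection sampling could be applied to this posterior (not prescribed by the slide): propose θ from the uniform prior and bound the unnormalised posterior by its value at the mode, which is well defined because the log-posterior above is strictly concave on (0, 1) when the counts are positive. The counts below are purely illustrative, and NumPy/SciPy are assumed available.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(8)
y1, y2, y3, y4 = 13, 2, 2, 3            # illustrative counts, not from the lecture

def log_post_unnorm(theta):
    """log of (2 + theta)^y1 * (1 - theta)^(y2 + y3) * theta^y4 on (0, 1)."""
    return y1 * np.log(2 + theta) + (y2 + y3) * np.log(1 - theta) + y4 * np.log(theta)

# Strict concavity of log_post_unnorm means the bounded 1-d optimiser finds the global mode.
mode = minimize_scalar(lambda t: -log_post_unnorm(t), bounds=(1e-9, 1 - 1e-9), method="bounded").x
log_M = log_post_unnorm(mode)           # log of the envelope constant M~ (proposal q~ = 1 on [0, 1])

def sample_posterior(n):
    """Rejection sampling with the uniform prior as proposal, done in log space for stability."""
    out = []
    while len(out) < n:
        theta = rng.uniform()
        if np.log(rng.uniform()) <= log_post_unnorm(theta) - log_M:
            out.append(theta)
    return np.array(out)

samples = sample_posterior(5_000)
```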
