
Stochastic Simulation: Variance reduction methods

Bo Friis Nielsen

Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark. Email: bfni@dtu.dk

Bo Friis Nielsen, 8/6 2016, 02443 – lecture 7

Variance reduction methods

  • To obtain better estimates with the same resources
  • Exploit analytical knowledge and/or correlation
  • Methods:
    ⋄ Antithetic variables
    ⋄ Control variates
    ⋄ Stratified sampling
    ⋄ Importance sampling


Case: Monte Carlo evaluation of integral

Consider the integral ∫_0^1 e^x dx. We can interpret this integral as an expectation:

  E(e^U) = ∫_0^1 e^x dx = θ,   U ∼ U(0, 1)

To estimate the integral: sample the random variable e^U and take the average:

  X_i = e^{U_i},   X̄ = (1/n) Σ_{i=1}^n X_i

This is the crude Monte Carlo estimator, “crude” because we use no refinements whatsoever.
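A minimal Python sketch of the crude estimator (function and variable names are my own, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

def crude_mc(n):
    """Crude Monte Carlo estimate of the integral of e^x over [0, 1]."""
    u = rng.random(n)              # U_i ~ U(0, 1)
    x = np.exp(u)                  # X_i = e^{U_i}
    return x.mean(), x.var(ddof=1)

theta_hat, s2 = crude_mc(10_000)   # theta_hat should be close to e - 1 ≈ 1.7183
```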


Analytical considerations

It is straightforward to calculate the integral in this case:

  ∫_0^1 e^x dx = e − 1 ≈ 1.72

For the estimator X based on one observation:

  E(X) = e − 1
  V(X) = E(X^2) − E(X)^2
  E(X^2) = ∫_0^1 (e^x)^2 dx = (1/2)(e^2 − 1)
  V(X) = (1/2)(e^2 − 1) − (e − 1)^2 = 0.2420
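The arithmetic can be checked in two lines:

```python
import math

e = math.e
ex2 = 0.5 * (e**2 - 1)      # E(X^2) = (e^2 - 1)/2, from integrating e^{2x}
var_x = ex2 - (e - 1)**2    # V(X) = E(X^2) - E(X)^2 ≈ 0.2420
```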

Antithetic variables

General idea: to exploit correlation.

  • If the estimator is positively correlated with U_i (a monotone function): use 1 − U_i as well:

  Y_i = (e^{U_i} + e^{1−U_i})/2 = (e^{U_i} + e/e^{U_i})/2,   Ȳ = (1/n) Σ_{i=1}^n Y_i

  • The computational effort of calculating Ȳ should be similar to the effort needed to compute X̄.
    ⋄ By the latter expression for Y_i we can generate the same number of Y’s as X’s (only one exponential evaluation per sample).
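A Python sketch of the antithetic estimator (names are my own):

```python
import numpy as np

rng = np.random.default_rng(1)

def antithetic_mc(n):
    """Antithetic estimate of the integral of e^x over [0, 1]: pair U with 1 - U."""
    u = rng.random(n)
    eu = np.exp(u)                  # e^{U_i}, computed once per sample
    y = 0.5 * (eu + np.e / eu)      # Y_i = (e^{U_i} + e^{1-U_i})/2
    return y.mean(), y.var(ddof=1)

theta_hat, s2 = antithetic_mc(10_000)
```

The sample variance of the Y_i should come out near the analytical value 0.0039 derived on the following slide.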


Antithetic variables - analytical

We can analyse the example analytically due to its simplicity:

  E(Ȳ) = E(X̄) = θ

To calculate V(Ȳ) we start with V(Y_i):

  V(Y_i) = (1/4) V(e^{U_i}) + (1/4) V(e^{1−U_i}) + 2 · (1/4) Cov(e^{U_i}, e^{1−U_i})
         = (1/2) V(e^{U_i}) + (1/2) Cov(e^{U_i}, e^{1−U_i})

  Cov(e^{U_i}, e^{1−U_i}) = E(e^{U_i} e^{1−U_i}) − E(e^{U_i}) E(e^{1−U_i})
                          = e − (e − 1)^2 = 3e − e^2 − 1 = −0.2342

  V(Y_i) = (1/2)(0.2420 − 0.2342) = 0.0039


Comparison: Crude method vs. antithetic

Crude method:      V(X_i) = (1/2)(e^2 − 1) − (e − 1)^2 = 0.2420
Antithetic method: V(Y_i) = (1/2)(0.2420 − 0.2342) = 0.0039

I.e., a reduction by 98%, almost for free. The variance of X̄ (and of Ȳ) scales with 1/n, the number of samples. Going from the crude to the antithetic method therefore reduces the variance as much as increasing the number of samples by a factor of about 60 (0.2420/0.0039 ≈ 62).


Antithetic variables in more complex models

If X = h(U_1, . . . , U_n) where h is monotone in each of its coordinates, then we can use the antithetic variable Y = h(1 − U_1, . . . , 1 − U_n) to reduce the variance, because Cov(X, Y) ≤ 0 and therefore

  V((1/2)(X + Y)) ≤ (1/2) V(X)


Antithetic variables in the queue simulation

Can you devise the queueing model of yesterday so that the number of rejections is a monotone function of the underlying U_i’s? Yes: make sure that we always use either U_i or 1 − U_i consistently, so that a large U_i implies customers arriving quickly and remaining long.


Control variates

Use of covariates:

  Z = X + c(Y − μ_Y),   E(Y) = μ_Y (known)

  V(Z) = V(X) + c^2 V(Y) + 2c Cov(Y, X)

We can minimize V(Z) by choosing

  c = −Cov(X, Y)/V(Y)

to get

  V(Z) = V(X) − Cov(X, Y)^2/V(Y)


Example

Use U as control variate:

  Z_i = X_i + c(U_i − 1/2),   X_i = e^{U_i}

The optimal value of c can be found from

  Cov(X, Y) = Cov(U, e^U) = E(U e^U) − E(U) E(e^U) ≈ 0.14086

In practice we would not know this covariance, but estimate it empirically.

  V(Z_{c=−0.14086/(1/12)}) = V(e^U) − Cov(e^U, U)^2/V(U) = 0.0039
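A Python sketch, estimating the coefficient c empirically as the slide suggests (names are my own):

```python
import numpy as np

rng = np.random.default_rng(1)

def control_variate_mc(n):
    """Estimate the integral of e^x over [0, 1] with U as control variate (E(U) = 1/2)."""
    u = rng.random(n)
    x = np.exp(u)
    # Empirical estimate of the optimal coefficient c = -Cov(X, U)/V(U)
    c = -np.cov(x, u, ddof=1)[0, 1] / np.var(u, ddof=1)
    z = x + c * (u - 0.5)              # Z_i = X_i + c(U_i - 1/2)
    return z.mean(), z.var(ddof=1)

theta_hat, s2 = control_variate_mc(10_000)
```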


Stratified sampling

This is a general survey technique: we sample in predetermined areas, using knowledge of the structure of the sampling space. With 10 equally wide strata:

  W_i = (1/10) Σ_{j=1}^{10} e^{(j−1)/10 + U_{i,j}/10}

What is an appropriate number of strata? (In this case there is a simple answer; for complex problems not so.)
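A Python sketch with k equal-width strata (the names are my own):

```python
import numpy as np

rng = np.random.default_rng(1)

def stratified_mc(n, k=10):
    """Stratified estimate of the integral of e^x over [0, 1]: one draw per stratum."""
    u = rng.random((n, k))
    points = (np.arange(k) + u) / k     # one point in each interval [(j-1)/k, j/k)
    w = np.exp(points).mean(axis=1)     # W_i = average over the k strata
    return w.mean(), w.var(ddof=1)

theta_hat, s2 = stratified_mc(1_000)
```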


Importance sampling

Suppose we want to evaluate

  θ = E(h(X)) = ∫ h(x) f(x) dx

For g(x) > 0 whenever f(x) > 0 this is equivalent to

  θ = ∫ (h(x) f(x)/g(x)) g(x) dx = E(h(Y) f(Y)/g(Y))

where Y is distributed according to g(y). This is an efficient estimator of θ if we have chosen g such that the variance of h(Y) f(Y)/g(Y) is small. Such a g will lead to more Y’s where h(y) is large. More important regions will be sampled more often.
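A Python sketch for ∫_0^1 e^x dx. The proposal g(x) = (1 + x)/1.5 is my own illustrative choice: it roughly follows the shape of e^x and can be sampled by inverting G(y) = (y + y^2/2)/1.5:

```python
import numpy as np

rng = np.random.default_rng(1)

def importance_mc(n):
    """Importance-sampling estimate of the integral of e^x over [0, 1]."""
    u = rng.random(n)
    y = -1.0 + np.sqrt(1.0 + 3.0 * u)    # Y ~ g by inversion of G(y) = (y + y^2/2)/1.5
    w = np.exp(y) * 1.5 / (1.0 + y)      # h(Y) f(Y)/g(Y), with f = 1 on [0, 1]
    return w.mean(), w.var(ddof=1)

theta_hat, s2 = importance_mc(10_000)
```

With this g the per-sample variance drops well below the crude value 0.2420; the ideal (but circular) choice g(x) = e^x/(e − 1) would give a zero-variance estimator.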


Re-using the random numbers

We want to compare two different queueing systems. We can estimate the rejection rate of system i = 1, 2 by θ_i = E(g_i(U_1, . . . , U_n)) and then rate the two systems according to θ̂_2 − θ̂_1. But typically g_1(·) and g_2(·) are positively correlated: long service times imply many rejections.


Then a more efficient estimator is based on

  θ_2 − θ_1 = E(g_2(U_1, . . . , U_n) − g_1(U_1, . . . , U_n))

This amounts to letting the two systems run with the same input sequence of random numbers, i.e. the same arrival and service time for each customer. With some program flows this is easily obtained by re-setting the seed of the RNG. When this is not sufficient, you must store the sequence of arrival and service times, so they can be re-used.
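A Python sketch comparing two loss systems (4 vs. 5 servers) driven by one shared stream of uniforms; the model, rates, and names are my own illustrative choices, not from the slides:

```python
import numpy as np

def blocked_fraction(ua, us, m, lam=6.0, mu=1.0):
    """Fraction of arrivals rejected in an m-server loss system (no waiting room)."""
    t, busy_until, blocked = 0.0, np.zeros(m), 0
    for u, v in zip(ua, us):
        t += -np.log(u) / lam                  # interarrival time by inversion
        free = busy_until <= t
        if free.any():
            busy_until[np.argmax(free)] = t - np.log(1.0 - v) / mu  # service time
        else:
            blocked += 1
    return blocked / len(ua)

rng = np.random.default_rng(3)
ua, us = rng.random(5_000), rng.random(5_000)   # one shared input sequence
# Both systems see identical arrival and service times, so the difference
# estimate has far lower variance than two independent simulation runs.
diff = blocked_fraction(ua, us, m=5) - blocked_fraction(ua, us, m=4)
```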


Exercise 5: Variance reduction methods

  • Estimate the integral ∫_0^1 e^x dx by simulation (the crude Monte Carlo estimator). Use e.g. an estimator based on 100 samples and present the result as the point estimator and a confidence interval.
  • Estimate the integral ∫_0^1 e^x dx using antithetic variables, with comparable computer resources.
  • Estimate the integral ∫_0^1 e^x dx using a control variable, with comparable computer resources.
  • Estimate the integral ∫_0^1 e^x dx using stratified sampling, with comparable computer resources.
  • Optional: Use control variates to reduce the variance of the estimator in exercise 4 (Poisson arrivals).