

SLIDE 1

Multilevel Monte Carlo
A few words on concurrency in C++11

Vincent Lemaire
LPMA – UPMC

June 11, 2015

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 1 / 47

SLIDE 2

1. Introduction: abstract framework and assumptions; Monte Carlo type estimators; crude Monte Carlo estimator in a biased framework; Multistep Richardson-Romberg estimator

2. Multilevel estimators: definition; majoration of the effort; specification(s) of the allocation matrix T; convergence for a fixed R ≥ 2; main result

3. Applications and numerical experiments: parameters and methodology; nested Monte Carlo

4. Hardware and implementation: hardware; threads in C++11; numerical results


SLIDE 4

Abstract linear simulation framework

Y_0 ∈ L²(P) is a non-simulatable real-valued random variable. Aim: compute I_0 = E[Y_0] with a given accuracy ε > 0. Suppose we have a family (Y_h)_{h∈H} of real-valued random variables in L²(P) satisfying:

Bias error expansion (weak error expansion):

  ∃ α > 0,  E[Y_h] = E[Y_0] + c_1 h^α + c_2 h^{2α} + ⋯ + c_R h^{αR} (1 + η_R(h))    (WE_{α,R})

Strong approximation error assumption:

  ∃ β > 0,  ‖Y_h − Y_0‖₂² = E[(Y_h − Y_0)²] ≤ V_1 h^β    (SE_β)

h is the bias parameter, taking values in H = {h̄/n, n ≥ 1}.

SLIDE 5

Monte Carlo type estimators

We consider Monte Carlo type estimators I^N_π, depending on some parameters π, to compute I_0 = E[Y_0], i.e. satisfying:

  Constant bias: E[I^N_π] = E[I^1_π] for all N ≥ 1.

  Inverse linear variance: var(I^N_π) = ν(π)/N, where ν(π) = var(I^1_π).

  Linear cost: Cost(I^N_π) = N κ(π), where κ(π) = Cost(I^1_π).

Typically I^N_π is the N-empirical mean of i.i.d. random variables.

Aim: minimize the global cost of the estimator for a given error ε > 0:

  (π(ε), N(ε)) = argmin_{‖I^N_π − I_0‖₂ ≤ ε} Cost(I^N_π).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 5 / 47

SLIDE 6

Definition. The effort of the estimator I^N_π is defined for every π ∈ Π by

  φ(π) = ν(π) κ(π).

If the estimator I^N_π is unbiased then ‖I^1_π − I_0‖₂² = ν(π), so that

  π(ε) = π* = argmin_{π∈Π} φ(π),   N(ε) = ν(π*)/ε² = φ(π*)/(κ(π*) ε²).

If the estimator I^N_π is biased then ‖I^1_π − I_0‖₂² = ν(π) + µ²(π), where

  µ²(π) = (E[I^N_π] − I_0)² = (E[I^1_π] − I_0)²,

so that

  π(ε) = argmin_{π∈Π, |µ(π)|<ε} φ(π)/(ε² − µ²(π)),
  N(ε) = φ(π(ε)) / ( κ(π(ε)) (ε² − µ²(π(ε))) ).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 6 / 47

SLIDE 7

Theorem (Crude Monte Carlo, π = h). The Monte Carlo estimator of E[Y_0] defined by

  Ȳ^N_h = (1/N) Σ_{k=1}^N Y^k_h

satisfies µ(h) ∼ c_1 h^α, κ(h) = 1/h, φ(h) = var(Y_h)/h. Then the optimal parameters h*(ε) and N*(ε) are

  h*(ε) = (1 + 2α)^{−1/(2α)} (ε/|c_1|)^{1/α},
  N*(ε) = (1 + 1/(2α)) var(Y_0) (1 + θ h*(ε)^{β/2})² / ε²,

where θ = √(V_1 / var(Y_0)). Furthermore, we have

  lim sup_{ε→0} ε^{2+1/α} min_{h∈H, |µ(h)|<ε} Cost(Ȳ^N_h) ≤ |c_1|^{1/α} (1 + 1/(2α)) (1 + 2α)^{1/(2α)} var(Y_0).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 7 / 47
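The form of h*(ε) comes from a standard bias–variance split; a brief sketch (my reconstruction, not verbatim from the slides):

```latex
% MSE split for the crude estimator:
%   \mathbb{E}\big[(\bar Y^N_h - I_0)^2\big] = \mu^2(h) + \frac{\nu(h)}{N},
%   \qquad \mu(h) \simeq c_1 h^{\alpha}.
% Enforcing MSE = \varepsilon^2 and minimizing Cost = N/h over h yields
\mu^2(h^*) = \frac{\varepsilon^2}{1+2\alpha}
\;\Longrightarrow\;
|c_1|\,(h^*)^{\alpha} = \frac{\varepsilon}{\sqrt{1+2\alpha}}
\;\Longrightarrow\;
h^*(\varepsilon) = (1+2\alpha)^{-\frac{1}{2\alpha}}
\left(\frac{\varepsilon}{|c_1|}\right)^{\frac{1}{\alpha}}.
```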

SLIDE 8

Multistep Richardson-Romberg estimator  [P. 2007, MCMA]

Aim: kill the bias using the weak error expansion (WE_{α,R}).

Refiners: an R-tuple of integers n := (n_1, n_2, …, n_R) ∈ N^R satisfying n_1 = 1 < n_2 < ⋯ < n_R. We consider Y_{h,n}, the R^R-valued random vector

  Y_{h,n} = ( Y_h, Y_{h/n_2}, …, Y_{h/n_R} ).

We define the vector w = (w_1, …, w_R) as the unique solution to the Vandermonde system V w = e_1, where

  V = V(1, n_2^{−α}, …, n_R^{−α}) =
      [ 1   1              ⋯   1
        1   n_2^{−α}       ⋯   n_R^{−α}
        ⋮   ⋮              ⋯   ⋮
        1   n_2^{−α(R−1)}  ⋯   n_R^{−α(R−1)} ].

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 8 / 47

SLIDE 9

Key point: the solution w has a closed form given by Cramer's rule: for all i ∈ {1, …, R},

  w_i = (−1)^{R−i} n_i^{α(R−1)} / ( ∏_{1≤j<i} (n_i^α − n_j^α) · ∏_{i<j≤R} (n_j^α − n_i^α) ).

We also derive the following identity, of interest in what follows:

  w̃_{R+1} := Σ_{i=1}^R w_i / n_i^{αR} = (−1)^{R−1} / (n!)^α,   where n! = n_1 ⋯ n_R.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 9 / 47

SLIDE 10

Multistep Richardson-Romberg estimator

Definition (Multistep Richardson-Romberg estimator).

  Ȳ^N_{h,n} = (1/N) Σ_{k=1}^N ⟨w, Y^k_{h,n}⟩ = (1/N) Σ_{k=1}^N Σ_{i=1}^R w_i Y^k_{h/n_i},   (Y^k_{h,n})_{k≥1} i.i.d.

Note that π = (h, n, R) and

  E[⟨w, Y^1_{h,n}⟩] = ⟨w, E[Y^1_{h,n}]⟩ = E[Y_0] + w̃_{R+1} c_R h^{αR} (1 + η_{R,n}(h)).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 10 / 47

SLIDE 11

Let |n| = n_1 + ⋯ + n_R and n! = n_1 ⋯ n_R.

Theorem (Multistep Richardson-Romberg). The Multistep Richardson-Romberg estimator of E[Y_0] satisfies

  µ(h) ≃ (−1)^{R−1} c_R (h^R / n!)^α,   κ(h) = |n|/h,   φ(h) ≃ |n| var(Y_0)/h,

and for a fixed R ≥ 2 the optimal parameters h*(ε) and N*(ε) are

  h*(ε) = (1 + 2αR)^{−1/(2αR)} (ε/|c_R|)^{1/(αR)} (n!)^{1/R},
  N*(ε) = (1 + 1/(2αR)) var(Y_0) (1 + θ h*(ε)^{β/2})² / ε².

Furthermore, we have

  inf_{h∈H, |µ(h)|<ε} Cost(Ȳ^N_h) ∼ ( (1 + 2αR)^{1+1/(2αR)} / (2αR) ) |c_R|^{1/(αR)} var(Y_0) / ε^{2+1/(αR)} · |n| / (n!)^{1/R}

as ε → 0.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 11 / 47

SLIDE 12

Remarks on the Multistep estimator

If n_i = i then |n| = R(R+1)/2 and (n!)^{1/R} ∼ R/e, so that |n| / (n!)^{1/R} ∼ e(R+1)/2.

If n_i = M^{i−1} (M ≥ 2) then |n| / (n!)^{1/R} ∼ M^{(R−1)/2}.

How to avoid this drawback? Use a control variate technique such as Multilevel: we consider R independent copies (Y^{(j)}_{h,n}), j = 1, …, R, of the random vector Y_{h,n} and we split ⟨w, Y_{h,n}⟩ into

  Σ_{j=1}^R ⟨T^j, Y^{(j)}_{h,n}⟩ = Σ_{i,j=1}^R T^j_i Y^{(j)}_{h/n_i},

where T = [T^1 … T^R] is an R × R matrix such that Σ_j T^j = w.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 12 / 47


SLIDE 14

Definitions

Allocation matrix: an R × R matrix T = [T^1 … T^R] satisfying

  T^1 = e_1  and  ∀ j ∈ {2, …, R},  Σ_{i=1}^R T^j_i = 0,   (1)

so that Σ_{i,j=1}^R T^j_i = 1.

Stratification/allocation strategy: let q = (q_1, …, q_R) with q_j > 0, j = 1, …, R, and Σ_j q_j = 1. Let N_j = ⌈q_j N⌉.

Multilevel estimator of order R:

  Ȳ^{N,q}_{h,n} = Σ_{j=1}^R (1/N_j) Σ_{k=1}^{N_j} ⟨T^j, Y^{(j),k}_{h,n}⟩ ≈ (1/N) Σ_{j=1}^R (1/q_j) Σ_{k=1}^{N_j} ⟨T^j, Y^{(j),k}_{h,n}⟩   (2)

  Multilevel Monte Carlo ≡ Σ_{j=1}^R T^j = e_R
  Multilevel Richardson-Romberg ≡ Σ_{j=1}^R T^j = w

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 14 / 47

SLIDE 15

Effort of a Multilevel estimator

Parameters: π = (π_0, q) with π_0 = (h, n, R, T).

(Unitary) cost of a Multilevel estimator:

  Cost(Ȳ^{N,q}_{h,n}) = Σ_{j=1}^R N_j Σ_{i=1}^R (n_i/h) 1_{T^j_i ≠ 0} = N κ(π),

where the unitary complexity κ(π) is given by

  κ(π) = (1/h) Σ_{j=1}^R q_j Σ_{i=1}^R n_i 1_{T^j_i ≠ 0}.

Effort of a Multilevel estimator:

  φ(π) = ν(π) κ(π) = ( Σ_{j=1}^R (1/q_j) var(⟨T^j, Y^{(j)}_{h,n}⟩) ) κ(π).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 15 / 47

SLIDE 16

Theorem (Bias). Assume (WE_{α,R}) holds.

(i) Multilevel Richardson-Romberg estimator: µ(π_0, q) ≃ (−1)^{R−1} c_R (h^R / n!)^α.

(ii) Multilevel Monte Carlo estimator: µ(π_0, q) ≃ c_1 (h / n_R)^α.

In both cases, the bias does not depend on the stratification strategy q selected in the simplex S_+(R).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 16 / 47

SLIDE 17

Steps to minimize the global cost

Step 1: minimize the effort φ over admissible stratification strategies:

  q* = q*(π_0) = argmin_{q ∈ S_+(R)} φ̄(π_0, q),   where φ̄(π) ≥ φ(π).

Let φ*(π_0) = φ(π_0, q*) be the optimally stratified effort.

Step 2: minimize the resulting cost as a function of the remaining parameters π_0 for a given target error ε:

  π_0(ε) = argmin_{π_0 ∈ Π_0, |µ(π_0,q*)|<ε} φ*(π_0) / (ε² − µ²(π_0, q*)),
  N(π_0(ε)) = φ*(π_0(ε)) / ( κ(π_0(ε), q*) (ε² − µ²(π_0, q*)) ).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 17 / 47

SLIDE 18

Optimal stratification (no weak error expansion needed)

Theorem. Assume (SE_β) holds, and let θ = √(V_1 / var(Y_0)). Then the optimally stratified effort φ* satisfies

  φ*(π_0) ≤ (var(Y_0)/h) ( 1 + θ h^{β/2} Σ_{j=1}^R ( Σ_{i=1}^R |T^j_i| n_i^{−β/2} ) ( Σ_{i=1}^R n_i 1_{T^j_i ≠ 0} )^{1/2} )²,

where the optimal stratification strategy q* = q*(π_0) is given by

  q*_1(π_0) = µ* (1 + θ h^{β/2}),
  q*_j(π_0) = µ* θ h^{β/2} ( Σ_{i=1}^R |T^j_i| n_i^{−β/2} ) ( Σ_{i=1}^R n_i 1_{T^j_i ≠ 0} )^{−1/2},   j = 2, …, R,   (3)

with µ* the normalizing constant such that Σ_{j=1}^R q*_j = 1.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 18 / 47

SLIDE 19

ML2R

Among several possibilities, we specify the allocation matrix T as follows:

  T = [ W_1  −W_2                        ]
      [      W_2  −W_3                   ]
      [           ⋱      ⋱              ]
      [              W_{R−1}  −W_R       ]
      [                        W_R       ],

where W_j = Σ_{k=j}^R w_k, j = 1, …, R, so that W_1 = 1.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 19 / 47

SLIDE 20

MLMC

  T = [ 1  −1                ]
      [     1  −1            ]
      [         ⋱   ⋱       ]
      [             1  −1    ]
      [                 1    ].

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 20 / 47

SLIDE 21

Let R ≥ 2 and let n_i, i = 1, …, R, be fixed parameters.

Theorem (Bias parameter optimization (ML2R)). Assume (WE_{α,R}) and (SE_β) hold. A Multilevel Richardson-Romberg estimator satisfies

  inf_{h∈H, |µ(h,q*)|<ε} Cost(Ȳ^{N,q*}_{h,n}) ≲ ( (1 + 2αR)^{1+1/(2αR)} / (2αR) ) |c_R|^{1/(αR)} var(Y_0) / ( ε^{2+1/(αR)} (n!)^{1/R} )

as ε → 0, with q* defined by (3). Note that |n| disappeared. This asymptotically optimal bound is achieved with the bias parameter

  h*(ε, R) = (1 + 2αR)^{−1/(2αR)} (ε/|c_R|)^{1/(αR)} (n!)^{1/R}.   (4)

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 21 / 47

SLIDE 22

Let R ≥ 2 and let n_i, i = 1, …, R, be fixed parameters.

Theorem (Bias parameter optimization (MLMC)). Assume (WE_{α,R}) and (SE_β) hold. A Multilevel Monte Carlo estimator satisfies

  inf_{h∈H, |µ(h,q*)|<ε} Cost(Ȳ^{N,q*}_{h,n}) ≲ (1 + 2α)^{1+1/(2α)} |c_1|^{1/α} var(Y_0) / ( n_R ε^{2+1/α} )

as ε → 0, with q* defined by (3). This asymptotically optimal bound is achieved with a bias parameter given by

  h*(ε, R) = (1 + 2α)^{−1/(2α)} (ε/|c_1|)^{1/α} n_R.   (5)

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 22 / 47

SLIDE 23

Theorem (Main (ML2R) result when n_i = M^{i−1}, i = 1, …, R). Assume lim_R |c_R|^{1/R} = c ∈ (0, +∞). The Multilevel Richardson-Romberg estimator satisfies

  lim_{ε→0} v(β, ε) × inf_{h∈H, R≥2, |µ(h,R,q*)|<ε} Cost(Ȳ^{N,q}_{h,n}) ≤ K(α, β, M)

with

  v(β, ε) = ε² (log(1/ε))^{−1}                          if β = 1,
            ε²                                          if β > 1,
            ε² e^{−((1−β)/√α) √(2 log(1/ε) log(M))}     if β < 1.

These bounds are achieved with an order

  R*(ε) = ⌈ 1/2 + log(h̄ c^{1/α})/log(M) + √( (1/2 + log(h̄ c^{1/α})/log(M))² + 2 log(A/ε)/(α log(M)) ) ⌉,

A = √(1 + 4α), and the bias parameter h*(ε, R(ε)) ∈ (0, h̄] is given by (4).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 23 / 47

SLIDE 24

Theorem (Main result (MLMC)). The Multilevel Monte Carlo estimator satisfies

  lim_{ε→0} v(β, ε) × inf_{h∈H, R≥2, |µ(h,R,q*)|<ε} Cost(Ȳ^{N,q}_{h,n}) ≤ K(α, β, M)

with

  v(β, ε) = ε² (log(1/ε))^{−2}    if β = 1,
            ε²                    if β > 1,
            ε^{2+(1−β)/α}         if β < 1.

These bounds are achieved with an order

  R*(ε) = ⌈ 1 + log(|c_1|^{1/α} h̄)/log(M) + log(A/ε)/(α log(M)) ⌉,

A = √(1 + 2α), and a bias parameter h*(ε, R(ε)) ∈ (0, h̄] given by (5).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 24 / 47


SLIDE 26

Optimal parameters for ML2R (with n_i = M^{i−1})

  R:      ⌈ 1/2 + log(c^{1/α} h̄)/log(M) + √( (1/2 + log(c^{1/α} h̄)/log(M))² + 2 log(A/ε)/(α log(M)) ) ⌉,  A = √(1 + 4α)

  h^{−1}: ⌈ (1 + 2αR)^{1/(2αR)} ε^{−1/(αR)} M^{−(R−1)/2} ⌉

  q:      q_1 = µ* (1 + θ h^{β/2}),
          q_j = µ* θ h^{β/2} |W_j(R, M)| (n_{j−1}^{−β/2} + n_j^{−β/2}) / √(n_{j−1} + n_j),  j = 2, …, R

  N:      ⌈ (1 + 1/(2αR)) var(Y_0) ( 1 + θ h^{β/2} Σ_{j=1}^R |W_j(R, M)| (n_{j−1}^{−β/2} + n_j^{−β/2}) √(n_{j−1} + n_j) )² / ( ε² Σ_{j=1}^R q_j (n_{j−1} + n_j) ) ⌉

(with n_0 = 0)

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 26 / 47

SLIDE 27

Optimal parameters for MLMC (with n_i = M^{i−1})

  R:      ⌈ 1 + log(|c_1|^{1/α} h̄)/log(M) + log(A/ε)/(α log(M)) ⌉,  A = √(1 + 2α)

  h^{−1}: ⌈ (1 + 2α)^{1/(2α)} ε^{−1/α} |c_1|^{1/α} M^{−(R−1)} ⌉

  q:      q_1 = µ* (1 + θ h^{β/2}),
          q_j = µ* θ h^{β/2} (n_{j−1}^{−β/2} + n_j^{−β/2}) / √(n_{j−1} + n_j),  j = 2, …, R

  N:      ⌈ (1 + 1/(2α)) var(Y_0) ( 1 + θ h^{β/2} Σ_{j=1}^R (n_{j−1}^{−β/2} + n_j^{−β/2}) √(n_{j−1} + n_j) )² / ( ε² Σ_{j=1}^R q_j (n_{j−1} + n_j) ) ⌉

(with n_0 = 0)

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 27 / 47

SLIDE 28

Choice of the root M = M(ε)

Refiners: n_i = M^{i−1}, i = 1, …, R, with M an integer, M ≥ 2. The optimal parameters are all functions of M ⟹ numerical optimization of the cost function

  C_ε(M) = Cost(Ȳ^{N(M),q(M)}_{h(M),n}) := N(M) κ(h(M), R(M), q(M)),  for M ∈ {2, …, 10},

where

  κ(h, R, q) = (1/h) Σ_{j=1}^R q_j (n_{j−1} + n_j)  (with n_0 = 0)

or (for nested simulations)

  κ(h, R, q) = (1/h) Σ_{j=1}^R q_j n_j.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 28 / 47

SLIDE 29

Nested Monte Carlo

Aim: compute E[ϕ(E[X | Y])], where (X, Y) is a couple of R × R^{q_Y}-valued random variables.

Assume that X has the representation X = F(Z, Y), where F : R^{q_Z} × R^{q_Y} → R is a Borel function and Z : (Ω, A) → R^{q_Z} is independent of Y.

To comply with the multilevel framework, we set H = {1/K, K ≥ 1} and

  Y_0 = ϕ(E[X | Y]),   Y_{1/K} = ϕ( (1/K) Σ_{k=1}^K F(Z_k, Y) ).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 29 / 47

SLIDE 30

Examples of financial applications

Compound option pricing; CVA (credit value adjustment) computation; solvency capital requirement (SCR) computation.

Some references:
- Gordy & Juneja, Nested simulation in portfolio risk measurement, 2010.
- Broadie et al., Efficient risk estimation via nested sequential simulation, 2011.
- Ghamami & Zhang, Efficient Monte Carlo Counterparty Credit Risk Pricing and Measurement, 2013.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 30 / 47

SLIDE 31

Nested Monte Carlo

Lemma (Strong approximation). Assume ϕ is Lipschitz continuous. For every h ∈ H,

  ‖Y_0 − Y_h‖₂² ≤ [ϕ]²_Lip ( ‖X‖₂² − ‖E[X | Y]‖₂² ) h,   (6)

so that (Y_h)_{h∈H} satisfies (SE_β) with β = 1.

Lemma (Bias error (I)). Let R ≥ 1 and let ϕ : R → R be a (2R+1)-times differentiable function with ϕ^{(k)}, k = R+1, …, 2R+1, bounded over the real line. Assume X ∈ L^{2R+1}. Then there exist c_1, …, c_R such that

  ∀ h ∈ H,  E[Y_h] = E[Y_0] + Σ_{r=1}^R c_r h^r + O(h^{R+1/2}).   (7)

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 31 / 47

SLIDE 32

Nested Monte Carlo

Let f_{Y_0} be the density of Y_0 and let f_{Y_0 | Y_h−Y_0=ξ} be the density of Y_0 given the occurrence of the value ξ of Y_h − Y_0.

Lemma (Bias error (II), work in progress). Let R ≥ 1. Assume that the functions f_{Y_0} and f_{Y_0 | Y_h−Y_0=ξ}, for ξ ∈ R, are (2R+1)-times differentiable. Assume X ∈ L^{2R+1} and a little more... Then

  f_{Y_h}(y) = f_{Y_0}(y) + f_{Y_0}(y) Σ_{r=1}^R (h^r / r!) P_r(y) + o(h^{R+1/2})   (8)

uniformly in y, where P_r is an explicit function with polynomial growth.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 32 / 47

SLIDE 33

Put-on-call option

Payoff of a put-on-call option (involves a conditional expectation):

  ( K_1 − E[(S_{T_2} − K_2)_+ | S_{T_1}] )_+

with parameters T_1 = 1/12, T_2 = 1/2 and K_1 = 6.5, K_2 = 100. Black-Scholes model with s_0 = 100, r = 0.03 and σ = 0.3. Note that

  S_{T_2} = F(G, S_{T_1}) = S_{T_1} e^{(r − σ²/2)(T_2 − T_1) + σ √(T_2 − T_1) G},   G ~ N(0, 1).

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 33 / 47

SLIDE 34

CPU-time as a function of ε̃_L

[Figure: log-log plot of CPU-time (in seconds) versus empirical RMSE ε̃_L, for ε̃_L = 2⁻¹, …, 2⁻⁹, comparing MLMC and ML2R.]

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 34 / 47

SLIDE 35

Parameters and complexity for a given ε

                        |        ML2R              |        MLMC
  k   ε = 2^{−k}        | R  M  h^{−1}  Cost       | R  M   h^{−1}  Cost
  1   5.00·10^{−01}     | 2  5  1       1.37·10^{+03} | 2  4   1   1.14·10^{+03}
  2   2.50·10^{−01}     | 2  9  1       6.33·10^{+03} | 2  7   1   5.76·10^{+03}
  3   1.25·10^{−01}     | 3  3  1       4.65·10^{+04} | 3  4   1   4.57·10^{+04}
  4   6.25·10^{−02}     | 3  4  1       1.87·10^{+05} | 3  6   1   2.26·10^{+05}
  5   3.12·10^{−02}     | 3  5  1       7.84·10^{+05} | 3  8   1   1.06·10^{+06}
  6   1.56·10^{−02}     | 3  6  1       3.32·10^{+06} | 4  5   1   6.21·10^{+06}
  7   7.81·10^{−03}     | 3  7  1       1.41·10^{+07} | 4  7   1   3.02·10^{+07}
  8   3.91·10^{−03}     | 3  9  1       6.28·10^{+07} | 4  8   1   1.31·10^{+08}
  9   1.95·10^{−03}     | 4  4  1       3.26·10^{+08} | 4  10  1   6.06·10^{+08}

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 35 / 47


SLIDE 37

Server system Pharaon2

Name: pharaon2.lpma.math.upmc.fr
IP address: 134.157.65.205

Linux 2.6.32_64 #1 SMP
CPU(s): 64
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
NUMA node(s): 4
Model name: Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz
CPU MHz: 1200.000
BogoMIPS: 4399.44
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 16384K

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 37 / 47


SLIDE 39

Desktop system Nephtys

CPU(s): 8
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Model name: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
CPU MHz: 1600.125
CPU max MHz: 3900.0000
CPU min MHz: 1600.0000
BogoMIPS: 6803.85
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 38 / 47

SLIDE 40

Threads

A thread of execution is a sequence of instructions that can be executed concurrently with other such sequences in multithreading environments, while sharing the same address space.

template <class Fn, class... Args>
explicit thread(Fn&& fn, Args&&... args);

Constructs a thread object that represents a new joinable thread of execution. The new thread of execution calls fn, passing args as arguments. The arguments to the thread function are moved or copied by value. If a reference argument needs to be passed to the thread function, it has to be wrapped (e.g. with std::ref or std::cref). Any return value from fn is ignored.

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 39 / 47

SLIDE 41

Class std::thread

Public member class:
  id — represents the id of a thread

Public member functions:
  constructor / destructor
  operator= — moves the thread object
  joinable — checks whether the thread is joinable, i.e. potentially running in a parallel context
  get_id — returns the id of the thread
  hardware_concurrency — [static] returns the number of concurrent threads supported by the implementation
  join — waits for a thread to finish its execution
  detach — permits the thread to execute independently from the thread handle
  swap — swaps two thread objects

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 40 / 47


SLIDE 44

Initialization of the generators

using clock_type = std::chrono::high_resolution_clock;
using generator_type = std::mt19937_64;  // Mersenne Twister, 64-bit

unsigned hw_threads = std::thread::hardware_concurrency();
unsigned seed = clock_type::now().time_since_epoch().count();

// One seed per hardware thread, generated from a common seed sequence.
std::seed_seq seq({seed, seed+1, seed+2, seed+3, seed+4});
std::vector<unsigned> seeds(hw_threads);
seq.generate(seeds.begin(), seeds.end());

std::vector<generator_type> gens;
for (unsigned i = 0; i < hw_threads; ++i)
    gens.push_back(generator_type(seeds[i]));

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 41 / 47


SLIDE 47

Definition of the template class

template <typename TLinearEstimator>
class parallelize_estimator : public TLinearEstimator {
public:
    parallelize_estimator(TLinearEstimator const & X, unsigned nb_threads)
        : TLinearEstimator(X), _nb_threads(nb_threads)
    {
        for (unsigned i = 0; i < _nb_threads-1; ++i)
            _Xs.push_back(X);
    }
    parallelize_estimator & reinit() {
        TLinearEstimator::reinit();
        for (TLinearEstimator & est : _Xs) est.reinit();
        return *this;
    }
    parallelize_estimator & operator()(generator_type & first,
                                       unsigned M);
private:
    std::vector<TLinearEstimator> _Xs;
    unsigned _nb_threads;
};

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 42 / 47


SLIDE 49

Definition of the method operator()

// Overloads operator() of the class TLinearEstimator. Assumes that
// _nb_threads generator_type objects exist starting at address &first.
template <typename TLinearEstimator>
parallelize_estimator<TLinearEstimator> &
parallelize_estimator<TLinearEstimator>::operator()(generator_type & first,
                                                    unsigned M)
{
    std::list<std::thread> threads;
    unsigned q = M / _nb_threads;
    unsigned r = M % _nb_threads;
    unsigned i = 0;

    // continued on the next slide...

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 43 / 47


SLIDE 51

Definition of the method operator()

    // Create a new std::thread object for each worker copy and associate
    // it with a thread of execution.
    while (i < _nb_threads-1) {
        _Xs[i].reinit();
        unsigned Mi = q + (i < r ? 1 : 0);
        threads.push_back(std::thread(std::ref(_Xs[i]),
                                      std::ref(*(&first+i)),
                                      Mi));
        ++i;
    }

    // Call operator() in the main thread for the last chunk.
    TLinearEstimator::operator()(*(&first+i), q + (i < r ? 1 : 0));

    // Wait for every worker via join.
    for (auto & th : threads) th.join();

    // Merge the partial estimators (assumes operator+= is defined
    // for TLinearEstimator).
    for (auto const & Y : _Xs) *this += Y;

    return *this;
}

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 44 / 47


SLIDE 56

Examples

for (unsigned k = 1; k <= hw_threads; ++k) {
    auto estimator = make_multilevel(/* some parameters... */);
    auto par_estimator = make_parallelize(estimator, k);
    auto error = make_rms_error(par_estimator, N, true_value);
    error(gens[0], M_erreur_L2);
    std::cout << k << "\t" << error.time() << std::endl;
}

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 45 / 47

SLIDE 57

Examples

Alternatively, the RMS-error computation itself can be parallelized:

for (unsigned k = 1; k <= hw_threads; ++k) {
    auto estimator = make_multilevel(/* some parameters... */);
    auto error = make_rms_error(estimator, N, true_value);
    auto par_error = make_parallelize(error, k);
    par_error(gens[0], M_erreur_L2);
    std::cout << k << "\t" << error.time() << std::endl;
}

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 45 / 47


SLIDE 59

Result on Nephtys

[Figure: running time (in seconds) versus number of threads of execution (1 to 8), two series with annotated values per thread count:
  1.0  1.9  2.5  2.8  3.4  4.0  2.8  3.0
  1.0  1.9  2.7  2.9  3.5  4.2  3.8  3.9]

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 46 / 47

SLIDE 60

Result on Pharaon2

[Figure: running time (in seconds) versus number of threads of execution (1, 2, 3, 4, 8, 16, 32, 48, 64), two series with annotated values per thread count:
  1.0  1.8  2.5  3.2  6.3  11.3  9.7   10.2  7.7
  1.0  1.8  2.0  2.9  6.6  9.0   22.9  25.2  29.3]

Vincent Lemaire (LPMA – UPMC) Multilevel Monte Carlo June 11, 2015 47 / 47