Parallel Scenario Decomposition of Risk Averse 0-1 Stochastic Programs - PowerPoint PPT Presentation

SLIDE 1

Parallel Scenario Decomposition of Risk Averse 0-1 Stochastic Programs

Shabbir Ahmed (ISyE, Georgia Tech), joint work with Yan Deng and Siqian Shen (IOE, University of Michigan)

2016 ICSP

1 / 27

SLIDE 2

Outline

◮ Risk-Averse Stochastic 0-1 Program
  ◮ Dual representation of coherent risk measures
  ◮ Dual decomposition
  ◮ Distributionally robust counterpart
◮ Parallelization of the Decomposition Method
  ◮ Motivation
  ◮ Parallel Schemes

SLIDE 3

Risk Averse 0-1 Program

min ρ(f(x, ξ)) s.t. x ∈ X ⊆ {0, 1}^d

◮ ξ: a random vector with finite support {ξ_1, ..., ξ_K} and probabilities p_1, ..., p_K, where

  p ∈ A = { (p_1, ..., p_K) : ∑_{k=1}^K p_k = 1, p_k ≥ 0, ∀k = 1, ..., K }

◮ f(x, ξ): cost function, e.g.,

  f(x, ξ) = c⊤x + min_y { θ(y) : y ∈ Y(x, ξ) }

◮ ρ(·): a coherent risk measure.

SLIDE 4

Coherent Risk Measure

min ρ(f(x, ξ)) s.t. x ∈ X ⊆ {0, 1}^d

◮ Positive homogeneity: ρ(0) = 0, and ρ(εw) = ερ(w) for any ε > 0
◮ Sub-additivity: ρ(w_1 + w_2) ≤ ρ(w_1) + ρ(w_2)
◮ Monotonicity: ρ(w_1) ≥ ρ(w_2) if w_1 ≥ w_2 in all scenarios
◮ Translation invariance: ρ(w + C) = ρ(w) + C for any constant C.

SLIDE 5

Coherent Risk Measure

min ρ(f(x, ξ)) s.t. x ∈ X ⊆ {0, 1}^d

◮ Artzner et al. (1999), Shapiro and Ahmed (2004), Shapiro (2013): for some uncertainty set Q(p) ⊆ A,

  ρ(f(x, ξ)) = max_{q ∈ Q(p)} E_q[f(x, ξ)] = max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x, ξ_k).

SLIDE 6

(Same dual representation as Slide 5, illustrated for CVaR_{1−ε}(f(x, ξ)).)

[Figure: distribution of f(x, ξ) with the upper ε tail shaded, marking VaR, CVaR, and the max.]

SLIDE 7

Coherent Risk Measure

◮ Example:

  CVaR_{1−ε}(f(x, ξ)) = max { ∑_{k=1}^K q_k f(x, ξ_k) : ∑_{k=1}^K q_k = 1, 0 ≤ q_k ≤ p_k/ε, ∀k = 1, ..., K }
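The CVaR set above is concrete enough to compute with: the inner maximum is attained by greedily loading probability mass, capped at p_k/ε, onto the worst (largest-cost) scenarios. A minimal NumPy sketch on toy data, which also checks the four coherence axioms from Slide 4 numerically:

```python
import numpy as np

def cvar(w, p, eps):
    """CVaR_{1-eps} via the dual set on the slide:
    max { sum_k q_k w_k : sum_k q_k = 1, 0 <= q_k <= p_k/eps }.
    The optimum greedily fills mass on the worst scenarios first."""
    w = np.asarray(w, dtype=float)
    order = np.argsort(-w)                 # worst (largest-cost) scenario first
    q, mass = np.zeros(len(w)), 1.0
    for k in order:
        q[k] = min(p[k] / eps, mass)
        mass -= q[k]
    return float(q @ w)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=5), rng.normal(size=5)   # toy scenario costs
p, eps = np.full(5, 0.2), 0.3

# The four coherence axioms from Slide 4, checked numerically:
assert abs(cvar(2.5 * w1, p, eps) - 2.5 * cvar(w1, p, eps)) < 1e-9          # positive homogeneity
assert cvar(w1 + w2, p, eps) <= cvar(w1, p, eps) + cvar(w2, p, eps) + 1e-9  # sub-additivity
assert cvar(np.maximum(w1, w2), p, eps) >= cvar(w2, p, eps) - 1e-9          # monotonicity
assert abs(cvar(w1 + 7.0, p, eps) - (cvar(w1, p, eps) + 7.0)) < 1e-9        # translation invariance
```

The greedy fill is valid because the feasible set is a box intersected with the simplex, so the LP optimum sorts scenarios by cost.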
SLIDE 8

Coherent Risk Measure

◮ Minimax Reformulation

  min_{x ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x, ξ_k)

◮ Collado et al. (2012): risk-averse multistage stochastic linear programs
◮ Ahmed (2013): 0-1 stochastic programs
◮ Ahmed et al. (2015): 0-1 chance-constrained programs

SLIDE 9

Dual Decomposition

min_{x ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x, ξ_k)
SLIDE 10

Dual Decomposition

◮ Clone x for each scenario ⇒ x_1, ..., x_K.
◮ Force x_1 = ··· = x_K by the non-anticipativity constraint

  ∑_{k=1}^K α_k x_k = x_1, (NAC)

  where α_1, ..., α_K are positive constants that sum to 1.

SLIDE 11

Dual Decomposition

min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x_k, ξ_k) s.t. ∑_{k=1}^K α_k x_k = x_1 (NAC)

SLIDE 12

Dual Decomposition

min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x_k, ξ_k) s.t. ∑_{k=1}^K α_k x_k = x_1 (NAC)

◮ Relax (NAC) and penalize its violation by λ ∈ R^d; the relaxed objective regroups scenario-wise:

  min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} { λ⊤( ∑_{k=1}^K α_k x_k − x_1 ) + ∑_{k=1}^K q_k f(x_k, ξ_k) }
  = min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K [ (α_k − δ_k) λ⊤x_k + q_k f(x_k, ξ_k) ],

  where δ_1 = 1 and δ_k = 0 for k = 2, ..., K.
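The scenario-wise regrouping of the relaxed NAC term is a small algebraic identity, and it can be sanity-checked numerically on random data (any λ, any 0-1 copies, any weights):

```python
import numpy as np

rng = np.random.default_rng(1)
K, d = 4, 3
lam = rng.normal(size=d)                              # multiplier lambda
x = rng.integers(0, 2, size=(K, d)).astype(float)     # one 0-1 copy x_k per scenario
alpha = np.full(K, 1.0 / K)                           # positive weights summing to 1
q = np.full(K, 1.0 / K)                               # any probability vector
f = rng.normal(size=K)                                # stand-in values f(x_k, xi_k)
delta = np.zeros(K); delta[0] = 1.0                   # delta_1 = 1, delta_k = 0 otherwise

# lambda^T (sum_k alpha_k x_k - x_1) + sum_k q_k f_k
lhs = lam @ (alpha @ x - x[0]) + q @ f
# sum_k [ (alpha_k - delta_k) lambda^T x_k + q_k f_k ]
rhs = sum((alpha[k] - delta[k]) * (lam @ x[k]) + q[k] * f[k] for k in range(K))
assert abs(lhs - rhs) < 1e-9
```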

SLIDE 13

Dual Decomposition

min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x_k, ξ_k) s.t. ∑_{k=1}^K α_k x_k = x_1 (NAC)

◮ Relax (NAC) and penalize its violation by λ ∈ R^d:

  min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} { λ⊤( ∑_{k=1}^K α_k x_k − x_1 ) + ∑_{k=1}^K q_k f(x_k, ξ_k) }
  = min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K [ (α_k − δ_k) λ⊤x_k + q_k f(x_k, ξ_k) ]
  ≥ max_{q ∈ Q(p)} min_{x_1,...,x_K ∈ X} ∑_{k=1}^K [ (α_k − δ_k) λ⊤x_k + q_k f(x_k, ξ_k) ]
  = max_{q ∈ Q(p)} ∑_{k=1}^K min_{x_k ∈ X} [ (α_k − δ_k) λ⊤x_k + q_k f(x_k, ξ_k) ] =: g(λ),

  where δ_1 = 1 and δ_k = 0 for k = 2, ..., K.

SLIDE 14

(Same derivation as Slide 13; since min-max ≥ max-min, g(λ) is a valid lower bound for every λ (LB, ∀λ).)

SLIDE 15

LB Computation

g(λ) = max_{q ∈ Q(p)} ∑_{k=1}^K min_{x_k ∈ X} [ (α_k − δ_k) λ⊤x_k + q_k f(x_k, ξ_k) ]
SLIDE 16

LB Computation

◮ Approach 1: LB ← g(0), where

  g(0) = max_{q ∈ Q(p)} ∑_{k=1}^K q_k min_{x ∈ X} f(x, ξ_k)

  1: for k = 1, ..., K do
  2:   β_k ← min { f(x, ξ_k) : x ∈ X }
  3: end for
  4: ℓ ← max { ∑_{k=1}^K β_k q_k : q ∈ Q(p) }
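Approach 1 is K independent scenario minimizations followed by one maximization over Q(p). A minimal sketch on hypothetical data, with f(x, ξ_k) = c_k⊤x, X = {0,1}^d enumerated by brute force (a MIP solver would handle the subproblems in practice), and Q(p) taken as the CVaR_{1−ε} set from Slide 7:

```python
import itertools
import numpy as np

def approach1_lb(costs, p, eps):
    """Approach 1 on a toy model with f(x, xi_k) = costs[k] @ x over X = {0,1}^d.
    Step 1: beta_k = min_x f(x, xi_k) per scenario.
    Step 2: ell = max { sum_k beta_k q_k : q in Q(p) }, with Q(p) the
    CVaR_{1-eps} set { q : sum q = 1, 0 <= q_k <= p_k/eps } (greedy fill)."""
    K, d = costs.shape
    X = list(itertools.product((0, 1), repeat=d))
    beta = np.array([min(c @ np.array(x) for x in X) for c in costs])
    order = np.argsort(-beta)                  # load mass on the worst beta first
    q, mass = np.zeros(K), 1.0
    for k in order:
        q[k] = min(p[k] / eps, mass)
        mass -= q[k]
    return float(q @ beta)

rng = np.random.default_rng(2)
costs = rng.normal(size=(6, 4))                # 6 scenarios, d = 4 (toy data)
p = np.full(6, 1 / 6)
lb = approach1_lb(costs, p, eps=0.25)
# Weak duality: lb never exceeds the risk of any fixed x; here x = 0 has risk 0,
# so lb <= 0 on this toy model.
```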
SLIDE 17

LB Computation

◮ Approach 2: LB ← max_λ g(λ), via the cutting-plane master problem

  MP: max_{q ∈ Q(p), λ, φ} { φ : φ ≤ ∑_{k=1}^K min_{x ∈ X} [ (α_k − δ_k) λ⊤x + q_k f(x, ξ_k) ] }

  1: repeat
  2:   (φ̂, λ̂, q̂) ← MP
  3:   for k = 1, ..., K do
  4:     (β_k, x̂_k) ← min { (α_k − δ_k) λ̂⊤x + q̂_k f(x, ξ_k) : x ∈ X }
  5:   end for
  6:   add the cut φ ≤ ∑_{k=1}^K [ (α_k − δ_k) λ⊤x̂_k + q_k f(x̂_k, ξ_k) ] to MP
  7: until φ̂ ≤ ∑_{k=1}^K β_k

  Slow convergence: stop after some iterations and return the best ∑_{k=1}^K β_k found.

SLIDE 18

LB Computation

◮ Approaches 1 & 2: a single copy of (NAC), with multiplier λ:

  min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x_k, ξ_k) s.t. ∑_{k=1}^K α_k x_k = x_1 ∼ λ ∈ R^d

◮ Approach 3: one copy of (NAC) per scenario, with multipliers q_i λ_i:

  min_{x_1,...,x_K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x_k, ξ_k) s.t. ∑_{k=1}^K α_k x_k = x_i, ∀i = 1, ..., K ∼ q_i λ_i ∈ R^d

SLIDE 19

LB Computation

g(λ) = max_{q ∈ Q(p)} min_{x_1,...,x_K ∈ X} { ∑_{k=1}^K q_k [ f(x_k, ξ_k) − (λ_k)⊤x_k ] + ( ∑_{k=1}^K α_k x_k )⊤( ∑_{k=1}^K q_k λ_k ) }

SLIDE 20

LB Computation

Restricting q so that the cross term vanishes decomposes the bound by scenario:

g(λ) = max_{q ∈ Q(p) ∩ Q(λ)} ∑_{k=1}^K q_k min_{x ∈ X} [ f(x, ξ_k) − (λ_k)⊤x ],

where Q(λ) = { q : ∑_{k=1}^K q_k λ_k = 0 }.

◮ Approach 3: LB ← max_λ g(λ).

  1: initialize λ_1, ..., λ_K
  2: repeat
  3:   for k = 1, ..., K do
  4:     β_k ← min { f(x, ξ_k) − (λ_k)⊤x : x ∈ X }
  5:   end for
  6:   ℓ ← max { ∑_{k=1}^K β_k q_k : q ∈ Q(p) ∩ Q(λ) }
  7:   update λ_1, ..., λ_K
  8: until ℓ converges

  Slow convergence: stop after some iterations and return the best ℓ found.

SLIDE 21

Serial Algorithm

◮ LB: subproblem of scenario k
  Approach 1: min_{x ∈ X} f(x, ξ_k)
  Approach 2: min_{x ∈ X} [ (α_k − δ_k) λ⊤x + q_k f(x, ξ_k) ]
  Approach 3: min_{x ∈ X} [ f(x, ξ_k) − (λ_k)⊤x ]
◮ UB: evaluate subproblem solutions.
◮ Algorithm overview:
  1: initialize LB ℓ and UB u
  2: repeat
  3:   compute ℓ and collect subproblem solutions in S, by Approach 1/2/3
  4:   for x̂ ∈ S do
  5:     u ← min { u, ρ(f(x̂, ξ)) }
  6:   end for
  7:   X ← X \ S
  8: until u − ℓ ≤ ε
◮ No-good cut to exclude an evaluated x̂:

  ∑_{j : x̂_j = 1} (1 − x_j) + ∑_{j : x̂_j = 0} x_j ≥ 1.
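The whole serial loop fits in a short sketch on hypothetical data: f(x, ξ_k) = c_k⊤x with X = {0,1}^d enumerated explicitly, Approach 1 bounds, and ρ = CVaR via its dual. Removing evaluated points from the explicit set X plays the role of the no-good cuts; in practice the cuts would be added to a MIP solver instead.

```python
import itertools
import numpy as np

def cvar_dual(vals, p, eps):
    """CVaR_{1-eps} of finitely many scenario values, via the dual max (greedy fill)."""
    vals = np.asarray(vals, dtype=float)
    order = np.argsort(-vals)
    q, mass = np.zeros(len(vals)), 1.0
    for k in order:
        q[k] = min(p[k] / eps, mass)
        mass -= q[k]
    return float(q @ vals)

def serial_dd(costs, p, eps, tol=1e-9):
    """The slide's loop on a toy model with f(x, xi_k) = costs[k] @ x (hypothetical
    data). LB: Approach 1 over the remaining X; UB: evaluate subproblem solutions;
    set removal mimics the no-good cuts."""
    d = costs.shape[1]
    X = set(itertools.product((0, 1), repeat=d))
    lb, ub = -np.inf, np.inf
    while X and ub - lb > tol:
        # per-scenario minimizers and values over the remaining X
        sols = [min(X, key=lambda x: float(costs[k] @ x)) for k in range(len(costs))]
        beta = [float(costs[k] @ np.array(s)) for k, s in enumerate(sols)]
        lb = cvar_dual(beta, p, eps)
        for xh in set(sols):                   # UB: evaluate each candidate solution
            ub = min(ub, cvar_dual(costs @ np.array(xh), p, eps))
        X -= set(sols)                         # exclude evaluated solutions
    return ub

rng = np.random.default_rng(1)
costs = rng.normal(size=(5, 4))                # 5 scenarios, d = 4 (toy data)
p, eps = np.full(5, 0.2), 0.25
ub = serial_dd(costs, p, eps)                  # matches the brute-force optimum
```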

SLIDE 22

Distributionally Robust Risk-Averse 0-1 Program

◮ Known probability distribution p:

  min_{x ∈ X} ρ(f(x, ξ)) = min_{x ∈ X} max_{q ∈ Q_ρ(p)} E_q[f(x, ξ)]

◮ If p is not known exactly, but an uncertainty set U is given:

  min_{x ∈ X} max_{p ∈ U} ρ(f(x, ξ))

SLIDE 23

Distributionally Robust Risk-Averse 0-1 Program

◮ The two maximizations collapse into one:

  min_{x ∈ X} max_{p ∈ U} ρ(f(x, ξ))
  = min_{x ∈ X} max_{p ∈ U} max_{q ∈ Q_ρ(p)} E_q[f(x, ξ)]
  = min_{x ∈ X} max_{q ∈ {q : q ∈ Q_ρ(p), p ∈ U}} E_q[f(x, ξ)]

◮ All the proposed dual decomposition methods remain applicable.

SLIDE 24

Parallelization

◮ Parallel jobs, e.g., Sub(k), Eva(x).
◮ Synchronization and communication in between iterations.
◮ Similarly-structured methods:
  ◮ Dual decomposition [Carøe and Schultz (1999), ...]
  ◮ Benders decomposition [Benders (1962), ...]
  ◮ Progressive hedging [Rockafellar and Wets (1991), ...]
  ◮ Multi-stage decomposition [Van Slyke and Wets (1969), ...]
  ◮ Scenario decomposition [Higle and Sen (1991), ...]

SLIDE 29

Existing Work

◮ Synchronous: barriers after job solving and before reiteration.
  e.g., Nielsen and Zenios (1997), Ahmed (2013), Lubin et al. (2013), ...
◮ Master-Worker: dedicate one processor to collecting and compiling distributed information.
  e.g., Ruszczyński (1993), Birge et al. (1996), Ryan et al. (2015), ...
◮ Dynamic assignment: jobs queue for available processors.
◮ Forced reiteration:
  e.g., Linderoth and Wright (2003), ...

SLIDE 35

Our Approaches

◮ Basic Parallel (BP): synchronous; barriers separate the scenario-subproblem phase, the evaluation phase, and the result exchange.
  ◮ Duplicate effort on evaluation is possible, e.g., two processors evaluating the same solution.
◮ Master-Worker with Barriers (MWB): the master keeps the pool of solutions and removes duplicates; workers solve scenario subproblems and evaluations.
◮ Master-Worker without Barriers (MWN): the master creates jobs and updates every worker individually with results from the others.

[Diagram: processor timelines interleaving jobs such as Sub(1), ..., Sub(4), Eva((0,1,1)), Eva((1,1,1)), with and without barriers.]

SLIDE 44

Computational Results

◮ CPLEX 12.6 & C++ on a Linux workstation with four 3.4GHz processors and 16GB memory.
◮ Parallel: OpenMPI, Flux HPC Cluster.
◮ Test risk measure ρ: CVaR_{1−0.1}.
◮ Instances from SIPLIB†:

  SSLP: stochastic server location problem. Stage 1: 10 binary vars, 1 constraint; Stage 2 (per scenario): 500 binary vars + 10 continuous vars, 60 constraints. Instances _50, _100, _500, _1000 have 50, 100, 500, 1000 scenarios.
  SMKP: multiple 0-1 knapsack problem. Stage 1: 240 binary vars, 50 constraints; Stage 2 (per scenario): 120 binary vars, 5 constraints. Instances _1, ..., _4 have 20, 40, 80, 160 scenarios.

†: S. Ahmed, R. Garcia, N. Kong, L. Ntaimo, G. Parija, F. Qiu, S. Sen. SIPLIB: A Stochastic Integer Programming Test Problem Library. http://www.isye.gatech.edu/~sahmed/siplib, 2015.

SLIDE 45

Computational Efficiency

◮ MIP: call the solver on the LP reformulation of CVaR (Rockafellar and Uryasev, 2002):

  min_{x ∈ X} CVaR_α(f(x, ξ)) = min_{x ∈ X, η ∈ R} { η + (1/(1 − α)) ∑_{k=1}^K p_k [ f(x, ξ_k) − η ]_+ }

◮ DD-i: dual decomposition using the different methods for computing bounds.
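For a finite distribution, this Rockafellar-Uryasev form and the dual representation from Slide 7 (with ε = 1 − α) must agree exactly, which makes a handy cross-check. A minimal sketch on toy data; for the primal form, an optimal η lies among the scenario values, so scanning them suffices:

```python
import numpy as np

def cvar_primal(w, p, alpha):
    """Rockafellar-Uryasev form: min_eta { eta + (1/(1-alpha)) E[(w - eta)_+] }.
    The objective is piecewise linear convex in eta with breakpoints at the
    scenario values, so the minimum is attained at one of them."""
    w, p = np.asarray(w, dtype=float), np.asarray(p, dtype=float)
    return float(min(eta + (p @ np.maximum(w - eta, 0.0)) / (1.0 - alpha) for eta in w))

def cvar_dual(w, p, alpha):
    """Dual form with eps = 1 - alpha:
    max { q @ w : sum q = 1, 0 <= q_k <= p_k/eps }, solved by greedy fill."""
    eps = 1.0 - alpha
    w = np.asarray(w, dtype=float)
    order = np.argsort(-w)
    q, mass = np.zeros(len(w)), 1.0
    for k in order:
        q[k] = min(p[k] / eps, mass)
        mass -= q[k]
    return float(q @ w)

rng = np.random.default_rng(4)
w = rng.normal(size=8)                     # toy scenario costs
p = np.full(8, 0.125)
assert abs(cvar_primal(w, p, 0.7) - cvar_dual(w, p, 0.7)) < 1e-9
```

The two values coincide by LP duality, so any gap between them signals an implementation bug.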

SLIDE 46

Computational Efficiency

Table: Solution time in seconds (optimality gap if not solved in 6 hrs)

         |            SSLP                |             SMKP
         |  _50   _100   _500    _1000   |  _20     _40     _80      _160
  MIP    |  195   201    (100%)  (100%)  |  299     (0.09%) (0.11%)  (0.16%)
  DD-2S  |  415   602    7231    (9%)    |  3496    9080    (0.01%)  (0.01%)
  DD-2C  |  1276  2570   (10%)   (16%)   |  (0.02%) (0.01%) (0.02%)  (0.02%)
  DD-1   |  248   502    4663    12750   |  2692    9866    11249    18774

#: fastest among the comparison groups.

SLIDE 47

Computational Efficiency

◮ For modest and large instances, computational efficiency ranks:

  DD-1 (1-loop) > DD-2S (2-loop, subgradient) > DD-2C (2-loop, cutting-plane) > MIP

SLIDE 48

Parallel DD-1

Speedup = Serial Time / Parallel Time (= # processors under perfect parallelism)

[Figure: Speedup vs. number of processes (4 to 32) for SSLP_50, SSLP_100, SSLP_500, SSLP_1000, SMKP_20, SMKP_40, SMKP_80, SMKP_160; curves for Perfect, BP, MWB, MWN.]

◮ MWB and BP cross over.
◮ MWN (MWB) scales better under a smaller (larger) number of scenarios.
◮ Super-linear speedup: smaller total workload in parallel than in serial.

SLIDE 49

Communication Time Tradeoff

◮ Communication: collective vs. point-to-point.

[Diagram: processor timelines of computation jobs with collective vs. point-to-point communication.]

◮ BP: collective; MWB: mixed; MWN: point-to-point.
◮ Time tradeoff:
  ◮ Computation time: BP > MWB ≈ MWN
  ◮ Collective communication time (grows with the number of processors): BP > MWB ≫ MWN = 0
  ◮ Point-to-point communication time (grows with the number of scenarios): MWN > MWB ≫ BP = 0

SLIDE 57

Conclusion

Thank you! Questions?