Parallel Scenario Decomposition of Risk Averse 0-1 Stochastic Programs
Shabbir Ahmed (ISyE, Georgia Tech); joint work with Yan Deng and Siqian Shen (IOE, University of Michigan)
2016 ICSP
1 / 27
Outline
◮ Risk-Averse Stochastic 0-1 Program
  ◮ Dual representation of coherent risk measures
  ◮ Dual decomposition
  ◮ Distributionally robust counterpart
◮ Parallelization of the Decomposition Method
  ◮ Motivation
  ◮ Parallel schemes
2 / 27
◮ ξ: a random vector with finite support {ξ^1, . . . , ξ^K} and probabilities p_1, . . . , p_K
◮ Risk-averse stochastic 0-1 program:
    min_{x ∈ X} ρ(f(x, ξ)),  with X a set of 0-1 (binary) decision vectors
◮ ρ(·): coherent risk measure
3 / 27
A coherent risk measure ρ (Artzner et al. (1999)) satisfies:
◮ Positive homogeneity: ρ(tZ) = tρ(Z) for all t ≥ 0
◮ Sub-additivity: ρ(Z_1 + Z_2) ≤ ρ(Z_1) + ρ(Z_2)
◮ Monotonicity: Z_1 ≤ Z_2 implies ρ(Z_1) ≤ ρ(Z_2)
◮ Translation invariance: ρ(Z + c) = ρ(Z) + c for any constant c
4 / 27
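As a numeric sanity check of the four axioms above, the sketch below (illustrative, not from the talk; the function name `cvar` and the test data are made up) verifies them for CVaR at level 1−ϵ on a finite distribution, using the fact that the Rockafellar-Uryasev minimization over η attains its minimum at one of the scenario values.

```python
# Sanity check (illustrative): CVaR_{1-eps} on a finite distribution
# satisfies the four coherence axioms listed on this slide.
import random

def cvar(Z, p, eps):
    """CVaR_{1-eps} of discrete loss Z with probabilities p:
    min over eta of eta + (1/eps)*E[(Z - eta)_+]; for a finite
    distribution the minimum is attained at some eta = Z_k."""
    return min(eta + sum(pk * max(zk - eta, 0.0) for zk, pk in zip(Z, p)) / eps
               for eta in Z)

random.seed(0)
K, eps = 6, 0.3
p = [1.0 / K] * K
Z1 = [random.uniform(-1, 1) for _ in range(K)]
Z2 = [random.uniform(-1, 1) for _ in range(K)]

# Positive homogeneity: rho(tZ) = t*rho(Z) for t >= 0
assert abs(cvar([2.5 * z for z in Z1], p, eps) - 2.5 * cvar(Z1, p, eps)) < 1e-9
# Sub-additivity: rho(Z1 + Z2) <= rho(Z1) + rho(Z2)
assert cvar([a + b for a, b in zip(Z1, Z2)], p, eps) <= cvar(Z1, p, eps) + cvar(Z2, p, eps) + 1e-9
# Monotonicity: Z1 <= Z1 + 0.5 componentwise, so the risk cannot decrease
assert cvar(Z1, p, eps) <= cvar([z + 0.5 for z in Z1], p, eps) + 1e-9
# Translation invariance: rho(Z + c) = rho(Z) + c
assert abs(cvar([z + 0.7 for z in Z1], p, eps) - (cvar(Z1, p, eps) + 0.7)) < 1e-9
```

The assertions pass for any sample, since CVaR is coherent; the random data merely exercises them.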
◮ Artzner et al. (1999), Shapiro and Ahmed (2004), Shapiro (2013): a coherent risk measure admits the dual representation
    ρ(Z) = max_{q ∈ Q(p)} E_q[Z] = max_{q ∈ Q(p)} ∑_{k=1}^K q_k Z_k
  for a convex, compact set Q(p) of probability distributions
◮ Example: for ρ = CVaR_{1−ϵ},
    Q(p) = {q : 0 ≤ q_k ≤ p_k/ϵ, k = 1, . . . , K, ∑_{k=1}^K q_k = 1}
5 / 27
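A minimal numeric check of this dual representation for CVaR (a sketch with made-up data; the dual set Q(p) = {q : 0 ≤ q_k ≤ p_k/ϵ, ∑ q_k = 1} for CVaR_{1−ϵ} is the standard one and is assumed here). The inner maximization over Q(p) can be solved greedily by loading mass onto the largest losses, and its value matches the Rockafellar-Uryasev minimization formula.

```python
# Illustrative check: primal (minimization) and dual (maximization over Q(p))
# expressions of CVaR_{1-eps} agree on a finite distribution.
def cvar_primal(Z, p, eps):
    # min_eta eta + (1/eps) * E[(Z - eta)_+]; minimum at a scenario value
    return min(eta + sum(pk * max(zk - eta, 0.0) for zk, pk in zip(Z, p)) / eps
               for eta in Z)

def cvar_dual(Z, p, eps):
    # max { sum_k q_k Z_k : 0 <= q_k <= p_k/eps, sum_k q_k = 1 },
    # solved greedily: put as much mass as allowed on the largest losses
    order = sorted(range(len(Z)), key=lambda k: -Z[k])
    val, mass = 0.0, 1.0
    for k in order:
        qk = min(p[k] / eps, mass)
        val += qk * Z[k]
        mass -= qk
        if mass <= 1e-12:
            break
    return val

Z, p, eps = [3.0, -1.0, 4.0, 0.5], [0.4, 0.3, 0.2, 0.1], 0.25
assert abs(cvar_primal(Z, p, eps) - cvar_dual(Z, p, eps)) < 1e-9
```

The greedy step is valid because the dual feasible set is a box intersected with a simplex, so an extreme maximizer saturates the caps p_k/ϵ in decreasing order of Z_k.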
◮ Minimax reformulation: by the dual representation,
    min_{x ∈ X} ρ(f(x, ξ)) = min_{x ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x, ξ^k)
◮ Scenario decomposition:
  ◮ Ahmed (2013): 0-1 stochastic program
  ◮ Ahmed et al. (2015): 0-1 chance constrained program
7 / 27
◮ Create a copy x^k of the decision for each scenario:
    min_{x ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x, ξ^k) = min_{x^1,...,x^K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x^k, ξ^k)
◮ Force x^1 = · · · = x^K by the non-anticipativity constraint (NAC)
8 / 27
◮ Scenario formulation:
    z* = min_{x^1,...,x^K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x^k, ξ^k)   s.t.  x^1 = · · · = x^K  (NAC)
◮ Relax (NAC) and punish violation by λ ∈ R^d.
◮ Exchange min and max in the relaxation: the inner minimization then separates over the copies x^k ∈ X, giving one subproblem per scenario and a lower bound on z*.
11 / 27
◮ With λ = 0, the exchange gives the bound
    z* ≥ max_{q ∈ Q(p)} ∑_{k=1}^K q_k min_{x ∈ X} f(x, ξ^k) = max { ∑_{k=1}^K β_k q_k : q ∈ Q(p) },
  where each β_k = min_{x ∈ X} f(x, ξ^k) is a single-scenario 0-1 subproblem
◮ The bound can be tightened by optimizing jointly over q ∈ Q(p) and the multipliers (λ, φ)
14 / 27
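The λ = 0 bound above can be illustrated on a toy instance (all data, names, and the brute-force subproblem solves below are illustrative stand-ins; the talk's implementation solves MIP subproblems): compute β_k per scenario, combine them through the CVaR dual set, and compare against the true optimum.

```python
# Illustrative lower bound: max_{q in Q(p)} sum_k q_k beta_k, where
# beta_k = min_{x in X} f(x, xi^k), for rho = CVaR_{1-eps}.
from itertools import product

def cvar_dual(vals, p, eps):
    # max { sum_k q_k vals_k : 0 <= q_k <= p_k/eps, sum q_k = 1 }, greedy
    order = sorted(range(len(vals)), key=lambda k: -vals[k])
    out, mass = 0.0, 1.0
    for k in order:
        qk = min(p[k] / eps, mass)
        out += qk * vals[k]
        mass -= qk
        if mass <= 1e-12:
            break
    return out

# toy data: f(x, xi^k) = C[k].x + d[k] over X = {x in {0,1}^3 : x1 + x2 >= 1}
C = [(3, -1, 2), (-2, 4, 1), (1, 1, -3), (0, 2, 2)]
d = [1.0, 0.0, 2.0, -1.0]
p, eps = [0.25] * 4, 0.5
X = [x for x in product((0, 1), repeat=3) if x[0] + x[1] >= 1]
f = lambda x, k: sum(ci * xi for ci, xi in zip(C[k], x)) + d[k]

beta = [min(f(x, k) for x in X) for k in range(4)]   # one subproblem per scenario
LB = cvar_dual(beta, p, eps)                         # the lambda = 0 lower bound
zstar = min(cvar_dual([f(x, k) for k in range(4)], p, eps) for x in X)
assert LB <= zstar + 1e-9                            # valid lower bound
```

The gap LB < z* on this instance is exactly why the talk goes on to tighten the bound with nonzero multipliers.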
◮ Approach 3: dualize (NAC) directly in the scenario formulation
    min_{x^1,...,x^K ∈ X} max_{q ∈ Q(p)} ∑_{k=1}^K q_k f(x^k, ξ^k)   s.t.  x^1 = · · · = x^K  (NAC)
  with scenario-dependent multipliers λ^1, . . . , λ^K
15 / 27
◮ Penalize the (NAC) violation by ∑_{k=1}^K q_k λ^k⊤ x^k and restrict q to
    Q(λ) = { q : ∑_{k=1}^K q_k λ^k = 0 },
  so that the penalty vanishes for every non-anticipative solution
◮ Resulting lower bound, with β_k(λ) = min_{x ∈ X} { f(x, ξ^k) + λ^k⊤ x }:
    max { ∑_{k=1}^K β_k(λ) q_k : q ∈ Q(p) ∩ Q(λ) }
17 / 27
◮ LB: solve β_k = min_{x ∈ X} f(x, ξ^k) per scenario, then combine over q ∈ Q(p)
◮ UB: evaluate candidate solutions x̂ collected from the scenario subproblems
◮ Algorithm overview: alternate lower bounding and candidate evaluation until the bounds meet
◮ No-good cut to exclude an evaluated candidate x̂:
    ∑_{j: x̂_j = 1} (1 − x_j) + ∑_{j: x̂_j = 0} x_j ≥ 1
18 / 27
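A serial toy sketch of such a scenario decomposition loop (illustrative data and names; brute-force enumeration stands in for the CPLEX-solved MIP subproblems, and removing a tuple from the candidate pool plays the role of the no-good cut):

```python
# Toy scenario decomposition loop for min_{x in X} CVaR_{1-eps}(f(x, xi)):
# lower bounds from per-scenario subproblems Sub(k), upper bounds from
# evaluating candidates Eva(x); evaluated candidates are excluded (no-good cut).
from itertools import product

def cvar_dual(vals, p, eps):
    order = sorted(range(len(vals)), key=lambda k: -vals[k])
    out, mass = 0.0, 1.0
    for k in order:
        qk = min(p[k] / eps, mass)
        out += qk * vals[k]
        mass -= qk
        if mass <= 1e-12:
            break
    return out

C = [(3, -1, 2), (-2, 4, 1), (1, 1, -3), (0, 2, 2)]
d = [1.0, 0.0, 2.0, -1.0]
p, eps = [0.25] * 4, 0.5
X0 = list(product((0, 1), repeat=3))
f = lambda x, k: sum(c * xi for c, xi in zip(C[k], x)) + d[k]

evaluated, UB, best = set(), float("inf"), None
while True:
    feasible = [x for x in X0 if x not in evaluated]   # no-good cuts applied
    if not feasible:
        break
    argmins = [min(feasible, key=lambda x: f(x, k)) for k in range(4)]  # Sub(k)
    beta = [f(xk, k) for k, xk in enumerate(argmins)]
    LB = cvar_dual(beta, p, eps)
    if LB >= UB - 1e-9:          # remaining candidates cannot beat the incumbent
        break
    for x in set(argmins):       # Eva(x): exact risk of each scenario solution
        val = cvar_dual([f(x, k) for k in range(4)], p, eps)
        if val < UB:
            UB, best = val, x
        evaluated.add(x)         # no-good cut: exclude x from future subproblems

brute = min(cvar_dual([f(x, k) for k in range(4)], p, eps) for x in X0)
assert abs(UB - brute) < 1e-9    # the loop recovers the true optimum
```

Each pass evaluates at least one new candidate, so the loop terminates, and the bound test guarantees the incumbent is optimal at exit.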
◮ Known probability distribution p:
    min_{x ∈ X} ρ(f(x, ξ)) = min_{x ∈ X} max_{q ∈ Q_ρ(p)} E_q[f(x, ξ)]
◮ If p is not known exactly, but an uncertainty set P is given:
    min_{x ∈ X} max_{p ∈ P} ρ(f(x, ξ)) = min_{x ∈ X} max_{q ∈ {q : q ∈ Q_ρ(p), p ∈ P}} E_q[f(x, ξ)]
◮ All the proposed dual decomposition methods are still applicable.
19 / 27
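A hedged toy illustration of the distributionally robust counterpart (the finite two-element ambiguity set and all data are made up): for a finite P, the inner maximization over the combined set {q ∈ Q_ρ(p) : p ∈ P} reduces to taking the worst CVaR over the candidate distributions, and the robust optimal value can only be larger than the nominal one.

```python
# Distributionally robust toy example: min_x max_{p in P} CVaR_{1-eps}(f(x, xi)).
from itertools import product

def cvar_dual(vals, p, eps):
    order = sorted(range(len(vals)), key=lambda k: -vals[k])
    out, mass = 0.0, 1.0
    for k in order:
        qk = min(p[k] / eps, mass)
        out += qk * vals[k]
        mass -= qk
        if mass <= 1e-12:
            break
    return out

C = [(3, -1, 2), (-2, 4, 1), (1, 1, -3), (0, 2, 2)]
d = [1.0, 0.0, 2.0, -1.0]
eps = 0.5
P = [[0.25] * 4, [0.4, 0.3, 0.2, 0.1]]   # ambiguity set: two candidate p's
X = list(product((0, 1), repeat=3))
f = lambda x, k: sum(c * xi for c, xi in zip(C[k], x)) + d[k]

def robust_risk(x):
    # worst-case risk over the ambiguity set = max over the combined dual set
    return max(cvar_dual([f(x, k) for k in range(4)], p, eps) for p in P)

z_dr = min(robust_risk(x) for x in X)                                  # robust optimum
z_nom = min(cvar_dual([f(x, k) for k in range(4)], P[0], eps) for x in X)  # nominal
assert z_dr >= z_nom - 1e-9   # robustness can only increase the optimal risk
```

Because the inner problem is still a maximization of a linear functional of q, the same decomposition machinery carries over unchanged, as the slide notes.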
◮ Parallel jobs, e.g., scenario subproblems Sub(k) and candidate evaluations Eva(x).
◮ Synchronization and communication in between.
◮ Similarly-structured methods:
  ◮ Dual decomposition [Carøe and Schultz (1999), ...]
  ◮ Benders decomposition [Benders (1962), ...]
  ◮ Progressive hedging [Rockafellar and Wets (1991), ...]
  ◮ Multi-stage decomposition [Van Slyke and Wets (1969), ...]
  ◮ Scenario decomposition [Higle and Sen (1991), ...]
20 / 27
◮ Synchronous: barriers after job solving and before information exchange.
◮ Master-Worker: dedicate one processor to collect and distribute information.
◮ Dynamic assignment: jobs queue for available processors.
◮ Force reiteration.
21 / 27
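The dynamic assignment scheme can be sketched with a shared work queue (an illustrative stand-in: threads play the role of processors, a trivial computation plays the role of solving Sub(k); the talk's experiments use OpenMPI processes instead):

```python
# Dynamic assignment sketch: jobs wait in a queue and each one goes to
# whichever worker becomes available next, with no barriers between jobs.
import queue
import threading

jobs = queue.Queue()
for k in range(8):
    jobs.put(("Sub", k))          # scenario subproblem jobs

results, lock = [], threading.Lock()

def worker():
    while True:
        try:
            kind, k = jobs.get_nowait()   # grab the next queued job, if any
        except queue.Empty:
            return                        # queue drained: this worker retires
        val = k * k                       # placeholder for solving Sub(k)
        with lock:                        # results list is shared state
            results.append((kind, k, val))

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(r[1] for r in results) == list(range(8))  # every job ran once
```

Since all jobs are enqueued before the workers start, each job is processed exactly once regardless of how the three workers interleave, which is the load-balancing appeal of this scheme.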
◮ Basic Parallel (BP): synchronous.
  ◮ Duplicate efforts on evaluation possible, e.g., the same candidate evaluated on several processors.
◮ Master-Worker with Barriers (MWB): master keeps collecting results; workers synchronize at barriers.
◮ Master-Worker without Barriers (MWN): master creates and assigns jobs continuously, with no barriers.
[Schedule diagrams: jobs Sub(1)-Sub(4), Eva((0,1,1)), Eva((1,1,1)) placed on processors, with barriers marked.]
22 / 27
◮ CPLEX 12.6 & C++ on a Linux workstation with four 3.4 GHz processors.
◮ Parallel: OpenMPI on the Flux HPC Cluster.
◮ Test risk measure ρ: CVaR_{1−0.1}.
◮ Instances from SIPLIB†.
†: S. Ahmed, R. Garcia, N. Kong, L. Ntaimo, G. Parija, F. Qiu, S. Sen. SIPLIB: A Stochastic Integer Programming Test Problem Library. http://www.isye.gatech.edu/~sahmed/siplib, 2015.
23 / 27
◮ MIP: call the solver on the LP reformulation of CVaR (Rockafellar and Uryasev (2000)):
    min_{x ∈ X} CVaR_{1−ϵ}(f(x, ξ)) = min_{x ∈ X, η} η + (1/ϵ) ∑_{k=1}^K p_k [f(x, ξ^k) − η]_+
◮ DD-i: dual decomposition using different methods for computing the lower bound.
◮ Computational efficacy compared on modest and large instances.
24 / 27
[Scaling plots: performance vs. number of processors (4-32), one panel each for SSLP_50, SSLP_100, SSLP_500, SSLP_1000, SMKP_20, SMKP_40, SMKP_80, SMKP_160]
◮ MWB and BP cross over.
◮ MWN (MWB) scales better under a smaller (larger) number of scenarios.
◮ Super-linear speedup: smaller total workload in parallel than in serial.
25 / 27
◮ Communication
  ◮ Collective vs. point-to-point.
  [Timeline diagrams: computation jobs, collective communication, point-to-point communication.]
  ◮ BP: collective; MWB: mixed; MWN: point-to-point.
◮ Time tradeoff
  ◮ Computation time: ց with num of processors.
  ◮ Collective communication time: ր with num of processors.
  ◮ Point-to-point communication time: ր with num of scenarios.
26 / 27
27 / 27