KTH ROYAL INSTITUTE OF TECHNOLOGY

Distributed L-shaped Algorithms in Julia

Martin Biel

KTH - Royal Institute of Technology

November 16, 2018

Motivation - Stochastic programming

decision x → observation ξ(ω) → recourse y

  • Determine the optimal decision x̂ based on predicted outcomes {ωᵢ}ⁿᵢ₌₁
  • Numerous industry applications
    ◮ Power systems
    ◮ Finance
    ◮ Transportation
  • Traditional procedure
    ◮ Formulate the deterministically equivalent optimization problem
    ◮ Optimize the extended form using standard solvers

Martin Biel (KTH) November 16, 2018 2/19

Motivation - Limitations of standard approaches

  • Industry-scale applications typically involve 10,000+ scenarios
  • Example: 24-hour unit commitment problem [Petra et al (2014)]

    ◮ 16,384 scenarios
    ◮ 1.95 billion variables and constraints in the extended form
    ◮ ~1 hour of computation time on the Titan supercomputer

  • Long computation time required to optimize
  • Memory requirement exceeds the capacity of a single machine

Parallel algorithms that work on distributed data are required


Contribution

  • Framework for formulating and solving stochastic programs
  • A collection of L-shaped algorithms
  • Distributed-memory setting
  • Complex functionality using simple abstractions in Julia

Rapidly formulate and solve real-world problems as stochastic programs

Contribution - Framework

Run time:
  • Master: global problem, global solver, worker coordination
  • Workers 1, …, N: local scenario data, local JuMP model, cuts for the global problem

Design time:
  • HydroModels.jl: domain-specific models
  • StochasticPrograms.jl: algebraic modeling language
  • LShapedSolvers.jl: scalable distributed solvers

Figure: Overview of the software framework.

Outline

  • Background
  • Implementation
  • Numerical experiments
  • Final remarks

Background - Stochastic programming

Two-stage linear stochastic program:

    minimize_{x ∈ Rⁿ}   cᵀx + E_ω[Q(x, ξ(ω))]
    s.t.   Ax = b
           x ≥ 0

where

    Q(x, ξ(ω)) = min_{y ∈ Rᵐ}   q_ωᵀ y
                 s.t.   T_ω x + W y = h_ω
                        y ≥ 0

Deterministically equivalent form:

    minimize_{x ∈ Rⁿ, yᵢ ∈ Rᵐ}   cᵀx + Σᵢ₌₁ⁿ πᵢ qᵢᵀ yᵢ
    s.t.   Ax = b
           Tᵢ x + Wᵢ yᵢ = hᵢ,   i = 1, …, n
           x ≥ 0,  yᵢ ≥ 0,   i = 1, …, n

The constraint matrix of the extended form has the characteristic L-shaped structure:

    ⎡ A            ⎤
    ⎢ T₁  W        ⎥
    ⎢ T₂      W    ⎥
    ⎢ ⋮         ⋱  ⎥
    ⎣ Tₙ         W ⎦

Figure: L-shaped structure.

Background - L-shaped algorithm

Cutting-plane algorithms operate on the first-stage feasible set X = {x ∈ Rⁿ | Ax = b, x ≥ 0} and its subset X̂, the optimal set.

Figure: L-shaped cutting planes. Successive iterates x₀, x₁, x₂ add cutting planes that refine the approximation of the recourse cost over X until the optimal set X̂ is located.

Master problem:

    minimize_{x ∈ Rⁿ}   cᵀx + Σᵢ₌₁ⁿ θᵢ
    s.t.   Ax = b
           ∂Qⱼ x + θᵢ ≥ qⱼ,   i = 1, …, n
           x ≥ 0

Subproblems, given an iterate xⱼ:

    Qᵢᵏ = min_{yᵢ ∈ Rᵐ}   qᵢᵀ yᵢ
          s.t.   W yᵢ = hᵢ − Tᵢ xⱼ
                 yᵢ ≥ 0

Each solved subproblem, with dual solution λᵢ,ⱼ, contributes an optimality cut with coefficients

    ∂Qⱼ = πᵢ λᵢ,ⱼᵀ Tᵢ,    qⱼ = πᵢ λᵢ,ⱼᵀ hᵢ

  • One master problem, n subproblems
  • Theoretical convergence guarantees
  • Convergence can be improved through regularization procedures
  • Readily extended to operate in parallel on distributed data
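The cut coefficients are simple to assemble once the subproblem duals are available. A minimal sketch in Julia (the function name and toy data below are made up for illustration):

```julia
# Assemble an optimality cut  ∂Q x + θ ≥ q̄  from a subproblem dual solution λ,
# scenario probability π, technology matrix T and right-hand side h:
#   ∂Q = π λᵀ T,   q̄ = π λᵀ h
function optimality_cut(π, λ, T, h)
    ∂Q = π * (λ' * T)   # 1×n row vector: gradient part of the cut
    q̄ = π * (λ' * h)    # scalar: right-hand side of the cut
    return ∂Q, q̄
end

# Toy data: identity technology matrix, two second-stage rows
T = [1.0 0.0; 0.0 1.0]
h = [2.0, 3.0]
λ = [0.5, 1.0]
∂Q, q̄ = optimality_cut(0.5, λ, T, h)   # ∂Q = [0.25 0.5], q̄ = 2.0
```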

Implementation - StochasticPrograms.jl

  • Flexible problem definition
  • Deferred model instantiation
  • Scenario data injection
  • Memory-distributed
  • Minimize data passing
    ◮ Lightweight sampler objects to generate data
    ◮ Lightweight model recipes to generate second-stage problems

  • Interface to structure-exploiting solver algorithms
  • Registered Julia package

Implementation - Model recipes

@first_stage sp = begin
    @variable(model, x1 >= 40)
    @variable(model, x2 >= 20)
    @objective(model, Min, 100*x1 + 150*x2)
    @constraint(model, x1 + x2 <= 120)
end

@second_stage sp = begin
    @decision x1 x2
    ξ = scenario
    @variable(model, 0 <= y1 <= ξ.d1)
    @variable(model, 0 <= y2 <= ξ.d2)
    @objective(model, Min, ξ.q1*y1 + ξ.q2*y2)
    @constraint(model, 6*y1 + 10*y2 <= 60*x1)
    @constraint(model, 8*y1 + 5*y2 <= 80*x2)
end


The first-stage recipe corresponds to

    minimize_{x₁, x₂ ∈ R}   100x₁ + 150x₂
    s.t.   x₁ + x₂ ≤ 120
           x₁ ≥ 40,  x₂ ≥ 20

and the second-stage recipe, for a given scenario ξ and first-stage decision (x₁, x₂), to

    minimize_{y₁, y₂ ∈ R}   q₁(ξ) y₁ + q₂(ξ) y₂
    s.t.   6y₁ + 10y₂ ≤ 60x₁
           8y₁ + 5y₂ ≤ 80x₂
           0 ≤ y₁ ≤ d₁(ξ)
           0 ≤ y₂ ≤ d₂(ξ)
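Written out by hand, the deterministically equivalent form of this example is an ordinary JuMP model. The sketch below assumes two hypothetical scenarios with made-up data (d1, d2, q1, q2 and probabilities) and the GLPK solver; it illustrates the extended form rather than the model the recipe macros actually generate:

```julia
using JuMP, GLPK

# Two hypothetical scenarios (demands d, prices q) with probabilities
ξs = [(d1 = 500.0, d2 = 100.0, q1 = -24.0, q2 = -28.0),
      (d1 = 300.0, d2 = 300.0, q1 = -28.0, q2 = -32.0)]
probs = [0.4, 0.6]

m = Model(GLPK.Optimizer)
@variable(m, x1 >= 40)
@variable(m, x2 >= 20)
@variable(m, 0 <= y1[s = 1:2] <= ξs[s].d1)   # one recourse copy per scenario
@variable(m, 0 <= y2[s = 1:2] <= ξs[s].d2)
@objective(m, Min, 100x1 + 150x2 +
    sum(probs[s]*(ξs[s].q1*y1[s] + ξs[s].q2*y2[s]) for s in 1:2))
@constraint(m, x1 + x2 <= 120)
@constraint(m, [s = 1:2], 6y1[s] + 10y2[s] <= 60x1)
@constraint(m, [s = 1:2], 8y1[s] + 5y2[s] <= 80x2)

optimize!(m)
value(x1), value(x2)   # optimal first-stage decision
```

The extended form grows linearly with the number of scenarios, which is exactly what motivates decomposition.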


Implementation - LShapedSolvers.jl

  • Collection of L-shaped algorithms
  • Eight variants in total
    ◮ Three different regularization procedures
    ◮ Distributed variants of each algorithm
    ◮ Numerous tweakable parameters
  • Trait-based implementation
  • Interfaces to StochasticPrograms.jl

Implementation - Distributed computations in Julia

  • Distributed computing in Julia revolves around two primitives:
    ◮ Remote references: administer which node data resides on
    ◮ Remote calls: schedule execution tasks on the distributed data
  • A Channel administers data on one process
  • The Channel data is readable and writable from all processes
  • Calling put! on a remote channel involves data passing
  • Calling fetch on a remote channel involves data fetching
  • A remote call returns a Future to the result
  • A process can wait for data to arrive on a remote reference
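These primitives can be sketched in a few lines. The snippet below assumes a fresh local Julia session; addprocs spawns worker processes, and the channel and remote call shown are illustrative:

```julia
using Distributed
addprocs(2)   # spawn two local worker processes

# A RemoteChannel wraps a Channel that lives on one process (here the master)
# but is readable and writable from every process.
decisions = RemoteChannel(() -> Channel{Vector{Float64}}(10))

# A remote call schedules a task on a worker and immediately returns a Future.
fut = remotecall(sum, workers()[1], [1.0, 2.0, 3.0])
fetch(fut)                     # blocks until the result is available: 6.0

put!(decisions, [0.0, 0.0])    # data passing: the vector moves to the owning process
x = fetch(decisions)           # data fetching: read the value without removing it
take!(decisions)               # remove it so the channel is empty again
```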

Implementation - Distributed L-shaped channels

Master node

  • Decisions: Master solutions (D)
  • CutQueue: Optimality cuts from workers (C)

Worker nodes

  • Worker: Local subproblems (S)
  • Work: Index into Decisions (W)
  • The master determines first-stage decisions and schedules worker tasks
  • Workers solve subproblems given first-stage decisions and generate optimality cuts
  • The number of cuts needed to proceed is governed by an asynchronicity parameter κ
  • Timestamps are used to synchronize and check convergence

Implementation - Distributed L-shaped tasks

Master node

function do_work!(master::Master, cuts::CutQueue, decisions::Decisions, workers::Vector{Work})
    x = initialize()
    put!(decisions, 0, x)
    send_work(workers, 1)
    while true
        wait(cuts)
        (t, Q, cut) = take!(cuts)
        add_cut!(master, cut)
        if added_cuts(master, t) ≥ κ*nscenarios(master)
            # Enough information to resolve the master problem
            x = solve(master)
            # Send the new decision to the remote nodes
            put!(decisions, t+1, x)
            send_work(workers, t+1)
        end
        if added_cuts(master, t) == nscenarios(master) && converged(master)
            return :Optimal
        end
    end
end

Worker nodes

function do_work!(worker::Worker, work::Work, cuts::CutQueue, decisions::Decisions)
    subproblems::Vector{SubProblem} = fetch(worker)
    while true
        wait(work)
        t::Int = take!(work)
        if t == -1
            # Worker finished
            return
        end
        x = fetch(decisions, t)
        # Update and solve all local subproblems
        @sync for subproblem in subproblems
            @async begin
                update_subproblem!(subproblem, x)
                cut = subproblem()
                Q = cut(x)
                # Send the optimality cut to the master, together with the
                # timestamp of the decision and the objective value
                put!(cuts, (t, Q, cut))
            end
        end
    end
end

Implementation - Distributed L-shaped

Figure: Distributed L-shaped procedure. The master publishes each new decision x⁰, x¹, … on the Decisions channel (D) and schedules timestamps on the Work channels (W₁, …, Wr). Each worker fetches the current decision xᵗ and resolves its local subproblems

    minimize_{yₛ ∈ Rᵐ}   qₛᵀ yₛ
    s.t.   W yₛ = hₛ − Tₛ xᵗ
           yₛ ≥ 0

and passes the resulting optimality cuts back on the CutQueue channel (C). The master adds incoming cuts to its problem, solves for the next decision, and checks convergence through |Q − Θ| ≤ τ(ε + |Q|).
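The relative convergence test can be written as a one-line predicate; the sketch below assumes τ and ε are relative and absolute tolerance parameters, with names chosen for illustration:

```julia
# Relative optimality test comparing the incumbent objective Q
# with the master model objective Θ (τ: relative, ε: absolute tolerance).
is_converged(Q, Θ; τ = 1e-6, ε = 1e-10) = abs(Q - Θ) <= τ*(ε + abs(Q))

is_converged(3.6e7, 3.6e7 - 1.0)   # true: the gap of 1.0 is within τ·|Q| = 36.0
```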

Numerical experiments - Day-ahead problem

  • Optimal order strategies on a deregulated electricity market
  • From the perspective of a hydropower producer
  • First stage: Hourly electricity volume bids for the upcoming day
  • Second stage: Optimize production when market price is known
  • Market data taken from NordPool, used to sample scenarios
  • Physical data on the hydroelectric plants in the Skellefteälven river
  • Full model included in HydroModels.jl

Numerical experiments - Convergence

Figure: L-shaped convergence for a day-ahead problem with 10 price scenarios.

Numerical experiments - Single node

Figure: Median computation time required to solve day-ahead problems with 10-300 scenarios, comparing Gurobi on the extended form with the trust-region, linearized regularized, and linearized level-set L-shaped variants.

Numerical experiments - Strong scaling

  • Day-ahead problem with 1000 price scenarios
  • This results in 2.5 million variables and 1.4 million constraints
  • Solving the extended form required 350+ seconds

Figure: Computation time against the number of cores (1-16) for the distributed TR, L-shaped, LV, and RD variants.

Figure: Parallel efficiency as a percentage of linear scaling.

Numerical experiments - Load imbalance

Figure: Master and worker computation time per iteration, together with the mean computation time per iteration, for 4 and for 16 workers.

Figure: Master and worker computation time per iteration for 4 workers, with κ = 1 and with κ = 0.5.

Final Remarks

Discussion

  • L-shaped algorithms outperform solving the extended form directly
  • Scalability is affected by load imbalance
  • Performance of the regularized variants is affected by a flat objective

Outlook on future work

  • Evaluate on other applied problems
    ◮ Larger scale
    ◮ Less flat objectives
  • Multi-node testing
  • Algorithmic improvements
  • Bundling procedures to reduce load imbalance

Final Remarks

Julia as an alternative to MPI

  • Complexity versus implementation effort
    ◮ Abstractions for distributed computing are simple and efficient
    ◮ High-level features for modeling optimization problems
  • MPI communicators can be used through MPI.jl
  • Prototype on a laptop, run the same code on a supercomputer

Final Remarks

Summary

  • Memory-distributed stochastic programs
  • L-shaped algorithms that run in parallel on distributed data
  • Simple Julia abstractions enable complex parallel algorithms
  • Framework for formulating and solving stochastic programs

Martin Biel (KTH) November 16, 2018 19/19

slide-81
SLIDE 81

Final Remarks

Summary

  • Memory-distributed stochastic programs
  • L-shaped algorithms that run in parallel on distributed data
  • Simple Julia abstractions enable complex parallel algorithms
  • Framework for formulating and solving stochastic programs
  • The full framework is open-source and freely available on GitHub

https://github.com/martinbiel
