Dampening the Curse of Dimensionality: Decomposition Methods for Stochastic Optimization Problems (PowerPoint PPT Presentation)


SLIDE 1

Dampening the Curse of Dimensionality

Decomposition Methods for Stochastic Optimization Problems
Vadim Gorski, Adriana Kiszka, Nils Löhndorf, Goncalo Terca, David W.
Conference on Computational Management Science, Mathematical Methods in Industry and Economics, 03/2019

SLIDE 2

Outline

- Multi-stage stochastic programming
  - Discretization and complexity
  - The curse of dimensionality
- Stochastic optimization with a Markovian structure
  - Scenario trees and scenario lattices
- Decomposition algorithms
  - Solving stochastic optimization problems on scenario lattices
- Illustrative problems
  - Dynamic newsvendor models
  - Aggregate production planning
- The QUASAR framework

SLIDE 4

Optimization Under Uncertainty

Question 1: How does uncertainty enter the problem?
- As a discrete or continuous random variable
- As a discrete or continuous random process
- As a set of scenarios without probabilities

Question 2: What is the nature of the decision problem?
- Optimizing a worst-case outcome
- Optimizing a (possibly risk-adjusted) expectation
- Problems with or without recourse decisions
- Almost-sure or probabilistic constraints
- Discrete-time or continuous-time problems
- Finite or infinite horizon
- Convex or non-convex problems

SLIDE 6

Setting of this Talk

Problem Class: In this talk we consider convex, multi-stage decision problems
- in discrete time with a finite planning horizon,
- with relatively complete recourse,
- where constraints have to hold almost surely,
- where randomness is modeled by a Markov process.

Standard Approach: Scenario Trees
- Model the stochastic process as a scenario tree
- Solve the problem as a deterministic equivalent

SLIDE 9

Multi-Stage Stochastic Optimization

Multi-stage problem with T stages:
- ξ = (ξ_1, . . . , ξ_T) with ξ_t(ω) ∈ R^m and ξ^t = (ξ_1, . . . , ξ_t)
- x = (x_1, . . . , x_T) decisions with x^t = (x_1, . . . , x_t) and x_t ∈ X_t
- x_t measurable w.r.t. σ(ξ^t)

Define the value functions

  V_T(x_{T−1}, ξ_T) = max { R_T(x_T, ξ_T) : x_T ∈ X_T(x_{T−1}, ξ_T) }

  V_t(x_{t−1}, ξ_t) = max { R_t(x_t, ξ_t) + E[ V_{t+1}(x_t, ξ_{t+1}) | ξ_t ] : x_t ∈ X_t(x_{t−1}, ξ_t) },  ∀t < T

Note
- The functions x_t ↦ E[ V_{t+1}(x_t, ξ_{t+1}) | ξ_t ] have to be evaluated
- Closed-form expressions are rarely available
- Discretization of ξ is required for numerical solutions
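On a fully discretized process, the backward recursion above can be sketched directly. This is a minimal toy illustration, not the talk's model: two equally likely outcomes per stage, a small action grid, and no resource state; all names and numbers are made up.

```python
# Toy backward induction for V_t on a discretized process (illustrative only).
outcomes = [(0.5, 1.0), (0.5, 2.0)]   # (probability, realization of xi_t)
actions = [0.0, 0.5, 1.0, 1.5, 2.0]   # feasible decisions x_t
T = 3                                 # number of stages

def reward(x, xi):
    # stage reward R_t: sell min(x, xi) units at price 1, cost 0.2 per unit
    return min(x, xi) - 0.2 * x

def value(t, xi):
    # V_t(xi) = max over x of R(x, xi) + E[V_{t+1}(xi_{t+1})]
    if t == T:
        return 0.0
    cont = sum(p * value(t + 1, nxt) for p, nxt in outcomes)
    return max(reward(x, xi) + cont for x in actions)

v0 = value(0, 1.0)   # first-stage value for initial realization xi_1 = 1.0
```

Even in this tiny example the cost of the recursion grows with the number of outcomes per stage raised to the power of T, which is exactly the curse of dimensionality the talk addresses.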

SLIDE 12

Complexity of Stochastic Programming

- The number of required points grows exponentially in the dimension of ξ
- Even if the solution algorithm is well behaved, problems are intractable

Curse of dimensionality in stochastic optimization: Swamy [2005], Shapiro and Nemirovski [2005], Shmoys and Swamy [2006], Dyer and Stougie [2006], Hanasusanto et al. [2016]

SLIDE 13

Complexity of Stochastic Programming

Drivers of the curse of dimensionality
- Number of random variables per stage
- Number of stages

SLIDE 15

Scenario Tree Models

- Nodes represent values, arcs the possible transitions
- The tree model represents the conditional probabilities, allowing us to solve

  V_t(x_{t−1}, ξ_t) = max { R_t(x_t, ξ_t) + E[ V_{t+1}(x_t, ξ_{t+1}) | ξ_t ] : x_t ∈ X_t }

SLIDE 16

Scenario Tree Models

Restriction
- Branching constrains the scenario locations
- Discretization becomes harder

SLIDE 17

Scenario Tree Models

- Sufficient branching is important for realistic models
- Finance: to avoid arbitrage, n assets require n + 1 successors

SLIDE 19

Scenario Tree Models

- Sufficient branching is important for realistic models
- A tree in which every node has at least 2 successors has at least 2^T nodes
- Alternative: trees where some nodes have only one successor
  - Sub-problems on these nodes are deterministic
  - The decision maker is partially clairvoyant
  - Overly optimistic planning and flawed policies

SLIDE 20

Scenario Tree Models

Scenario generation literature
- Monte Carlo: Shapiro [2003, 2008]
- Probability metrics: Pflug [2001, 2009], Pflug and Pichler [2012], Dupacová et al. [2003], Heitsch and Römisch [2003]
- Moment matching: Høyland and Wallace [2001], Høyland et al. [2003], Kaut and Wallace [2003]
- Integration quadratures: Pennanen [2005, 2009]

SLIDE 23

Lattices: A Compressed Representation

Definition: A scenario lattice is a scenario tree in which a node can have multiple predecessors.

[Figure: scenario tree vs. scenario lattice]

- No history → only works for Markov processes
- A tree represents |N_T| scenarios
- A lattice (potentially) represents |N_1| × |N_2| × · · · × |N_T| scenarios
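The compression can be made concrete with a quick count. The sketch below compares a binary tree with a lattice of n nodes per stage; T and n are illustrative choices, not figures from the talk.

```python
# Node and scenario counts: binary scenario tree vs. scenario lattice.
T, n = 10, 10

tree_nodes = sum(2 ** t for t in range(T))   # 2^0 + ... + 2^(T-1) nodes
tree_scenarios = 2 ** (T - 1)                # one scenario per leaf, |N_T|
lattice_nodes = n * T                        # n nodes on each of T stages
lattice_scenarios = n ** T                   # up to |N_1| x ... x |N_T| paths
```

With these numbers the tree needs 1,023 nodes for 512 scenarios, while the lattice covers 10^10 scenario paths with only 100 nodes, which is why lattices dampen the curse of dimensionality.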

SLIDE 27

Optimal Lattice Generation

Aim: Given a Markov process ξ = (ξ_1, . . . , ξ_T), find a small scenario lattice ξ̂ such that |V_t(x, ξ_t) − V_t(x, ξ̂_t)| is small for all x.

- Bally and Pagès [2003] reduce a GBM by minimizing the Wasserstein metric using stochastic gradient descent.
- Löhndorf and Wozabal [2018] generalize these ideas to general processes using a second-order stochastic gradient method.
  - Fast, parameter-free method
- Kiszka and Wozabal [2018] adapt ideas from Pflug and Pichler [2012] to lattices with randomness in the constraints.
  - Theoretically superior to Löhndorf and Wozabal [2018]
  - Not (yet) suitable for large models
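The stochastic-gradient quantization idea can be sketched in one dimension: repeatedly sample the process and pull the nearest lattice node toward the sample, shrinking the empirical Wasserstein distance. This is a deliberately crude sketch of the principle, far simpler than the cited methods; node count, step-size rule, and the standard normal stand-in for ξ_t are all assumptions.

```python
# One-stage stochastic-gradient quantization sketch (illustrative only).
import random

random.seed(0)
nodes = [random.gauss(0, 1) for _ in range(5)]    # initial stage-t nodes
counts = [0] * len(nodes)

for _ in range(5000):
    xi = random.gauss(0, 1)                       # sample from the process
    j = min(range(len(nodes)), key=lambda k: abs(nodes[k] - xi))
    counts[j] += 1
    step = 1.0 / counts[j]                        # Robbins-Monro step size
    nodes[j] += step * (xi - nodes[j])            # pull nearest node to sample

total = sum(counts)
probs = [c / total for c in counts]               # empirical node weights
```

In the actual methods the same update runs per stage, and the sample-to-node assignments also yield the transition probabilities between consecutive stages.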

SLIDE 28

Solving Stochastic Optimization on Trees

- Assign a decision x_n to every node n ∈ N_t, 1 ≤ t ≤ T
- Set up the deterministic equivalent formulation
- Find optimal decisions x*_n

SLIDE 31

Stochastic Optimization on Lattices

- Generally, the decision depends on the path leading to n ∈ N_t: x_n = x_n(x_{n−1}, ξ_t)
- Define the resource state at the beginning of period t: S_t(x_1, ξ_1, . . . , x_{t−1}, ξ_{t−1}) = S_t(S_{t−1}, x_{t−1}, ξ_{t−1})

Example: Hydro storage optimization
The storage level S_t is given by

  S_t = S_0 + Σ_{i=1}^{t−1} (x_i + ξ_i) = S_{t−1} + x_{t−1} + ξ_{t−1},

where the x_i are decisions and the ξ_i are the natural inflows in period i.
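The storage recursion can be checked on a toy path; the decisions and inflows below are invented numbers, chosen only to show that the recursive and cumulative forms of S_t agree.

```python
# Hydro storage state recursion S_t = S_{t-1} + x_{t-1} + xi_{t-1} (toy data).
S0 = 10.0
x = [-2.0, -1.0, -3.0]    # release decisions (negative = water released)
xi = [1.5, 0.5, 2.0]      # natural inflows

S = [S0]
for t in range(len(x)):
    S.append(S[-1] + x[t] + xi[t])

# the recursive form matches the cumulative form S_0 + sum(x_i + xi_i)
assert S[-1] == S0 + sum(a + b for a, b in zip(x, xi))
```

Because S_t summarizes the whole decision/inflow history in one number, decisions can condition on (S_t, ξ_t) alone, which is what makes the lattice formulation on the following slides possible.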

SLIDE 33

Stochastic Optimization on Lattices

- Generally, the decision depends on the path leading to n ∈ N_t: x_n = x_n(x_{n−1}, ξ_t)
- Define the resource state at the beginning of period t: S_t(x_1, ξ_1, . . . , x_{t−1}, ξ_{t−1}) = S_t(S_{t−1}, x_{t−1}, ξ_{t−1})
- Decisions are based on the current state of the system: x_t(x_1, ξ_1, . . . , x_{t−1}, ξ_{t−1}, ξ_t) = x_t(S_t, ξ_t)

Markov Decision Process (MDP)
If ξ is Markov and the V_t only depend on the current state, i.e.,

  E[ V_t(x_{t−1}, ξ_t) | ξ_{t−1} ] = E[ V_t(S_t, ξ_t) | ξ_{t−1} ],

then the problem has a Markov structure.

SLIDE 36

Solving MDPs

Discretization Strategy
- Discretize events as a lattice (environmental state)
- Continuous actions and resource states S_t
- Solving the MDP is equivalent to finding the functions S ↦ V_t(S, ξ_t)

Decomposition methods
- L-shaped method: Van Slyke and Wets [1969]
- Benders decomposition: Benders [1962], Louveaux [1980]
- Stochastic Dual Dynamic Programming (SDDP): Pereira and Pinto [1991]
- Approximate Dual Dynamic Programming (ADDP): Löhndorf et al. [2013], Löhndorf and Wozabal [2018]

SLIDE 41

How to Find Value Functions?

Assume that for each realization ξ̂_T of the discretized Markov process

  V_T(S_T, ξ̂_T) = max_{x_T}  c_{ξ̂_T}^⊤ x_T
                   s.t.  W_{ξ̂_T} x_T ≤ b_{ξ̂_T} + B_{ξ̂_T} S_T

with V_T(S_T, ξ̂_T) > −∞ for every S_T.

- S ↦ V_T(S, ξ̂_T) is a piecewise linear concave function
- A supergradient of S ↦ V_T can be extracted from the dual solution

Idea: Approximate S ↦ V_T(S, ξ̂_T) by the (minimum of) supergradients.

Problem: How to determine for which S to calculate supergradients?

Solution: Sampling. Sample S using the current value function approximations.

SLIDE 45

ADDP: Approximating Value Functions

Given an approximation S ↦ V̂_t(S, ξ̂_t) for all t and all ξ̂_t:

Forward pass
- Sample a path (ξ̂_1, . . . , ξ̂_T) from the lattice
- Decide on x_t using V̂_t(S, ξ̂_t) and store the resource path (s_1, . . . , s_T)

Backward pass (from t = T, . . . , 1)
- At t, calculate the tangent l to S ↦ V̂_t(S, ξ̂_t) at the point s_t:

  V̂_t(s_t, ξ̂_t) = max_{x_t}  c_{ξ̂_t}^⊤ x_t + E[ V̂_{t+1}(S_{t+1}, ξ̂_{t+1}) | ξ̂_t ]
                   s.t.  W_{ξ̂_t} x_t ≤ b_{ξ̂_t} + B_{ξ̂_t} s_t

- Update  V̂_t(S, ξ̂_t) ← min{ l, V̂_t(S, ξ̂_t) }.

Exploitation
- Enumeration of all possible states (exploration) is too expensive
- Sampling exploits the structure of the problem
- The approximations V̂ are only tight at relevant points
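The cut-building loop can be illustrated on a one-stage, scalar toy problem. Here the "stage LP" is replaced by a hand-written concave piecewise linear function V(S) = min(2S, 10) with a known supergradient; the sampling range and iteration count are arbitrary. Tangents of a concave function lie above it, so the pointwise minimum of cuts is an upper bound that is tight at sampled points.

```python
# Toy sketch of ADDP-style cut generation (one stage, scalar state).
import random

random.seed(1)

def V(s):
    # stand-in for the stage LP's optimal value: concave piecewise linear
    return min(2 * s, 10.0)

def supergradient(s):
    # slope of V at s (a supergradient, here computed by hand)
    return 2.0 if s < 5 else 0.0

cuts = []   # list of (intercept, slope) pairs

def V_hat(s):
    # current approximation: pointwise minimum over all cuts
    return min((a + g * s for a, g in cuts), default=float("inf"))

for _ in range(20):
    s = random.uniform(0, 10)        # "forward pass": sample a state
    g = supergradient(s)             # "backward pass": tangent at s
    cuts.append((V(s) - g * s, g))   # cut l(S) = V(s) + g*(S - s)

# the approximation is an upper bound on V everywhere
assert all(V_hat(s) >= V(s) - 1e-9 for s in [0, 2.5, 5, 7.5, 10])
```

After a few samples on each side of the kink, V_hat coincides with V, which mirrors the slide's point that the approximation only needs to be tight at the states the policy actually visits.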

SLIDE 46

ADDP Convergence

Proposition: The approximate value functions S ↦ V̂_t(S, ξ_t)
- are upper bounds for the true value functions;
- converge to the true value functions.

For linear problems, convergence occurs in finitely many iterations.

Differences to SDDP
- Works with Markov processes; no restriction to stage-wise independence.
- Randomness can occur everywhere in the problem, not only in the right-hand side of the constraints.

SLIDE 48

ADDP: Convergence Check

Lower Bound
- Simulate N trajectories (ξ̂_1, . . . , ξ̂_T) on the lattice
- Use V̂_t to compute cumulative profits Π̂_i, i = 1, . . . , N
- Define a lower bound LB = N^{−1} Σ_{i=1}^N Π̂_i
- Calculate the α-quantile F^{−1}(α) of LB by bootstrapping
- Check whether V̂_1(S_1, ξ_1) × (1 − gap) ≤ F^{−1}(α)
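A minimal sketch of that test, with made-up numbers: simulated profits stand in for the Π̂_i, and a plain bootstrap of the sample mean gives the α-quantile of the lower bound. The profit distribution, V̂_1 value, and gap are all illustrative assumptions, not the implementation's defaults.

```python
# Sketch of the ADDP convergence check via a bootstrapped quantile.
import random

random.seed(42)
profits = [100 + random.gauss(0, 5) for _ in range(500)]   # simulated Pi_i

def bootstrap_quantile(data, alpha=0.05, reps=200):
    # alpha-quantile of the resampled sample mean (the lower bound LB)
    means = []
    for _ in range(reps):
        sample = [random.choice(data) for _ in range(len(data))]
        means.append(sum(sample) / len(sample))
    means.sort()
    return means[int(alpha * reps)]

LB_alpha = bootstrap_quantile(profits)   # F^{-1}(alpha) of LB
V1_hat = 101.0                           # upper bound from V-hat_1 (assumed)
gap = 0.03                               # tolerated optimality gap
converged = V1_hat * (1 - gap) <= LB_alpha
```

If `converged` is false, the upper bound from V̂_1 still exceeds the statistical lower bound by more than the tolerated gap, and more forward/backward iterations are needed.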

SLIDE 54

Value of a Policy

Out-of-Sample Value
- Average profit under the original model of the process
- Simulate N trajectories (ξ_1, . . . , ξ_T) of the environmental state
- Use the V̂_t of the closest lattice node for each decision
- Obtain N simulated profits Π̂_i, i = 1, . . . , N, and set OV = N^{−1} Σ_{i=1}^N Π̂_i
- OV is the value of the policy under the original model
- If |Objective − OV| is small, the lattice is sufficiently sized
- Not (easily) possible with scenario trees!
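The key step, mapping an out-of-sample realization to the closest lattice node, is just a nearest-neighbor lookup per stage. The node values and the simulated realization below are invented for illustration.

```python
# Out-of-sample evaluation step: use V-hat of the closest lattice node.
lattice_t = [80.0, 100.0, 125.0]                 # stage-t node values of xi
v_hat_at = {80.0: 5.0, 100.0: 7.5, 125.0: 9.0}   # toy V-hat_t per node

def closest_node(xi, nodes):
    # nearest lattice node to the simulated (non-lattice) realization
    return min(nodes, key=lambda n: abs(n - xi))

xi_sim = 97.3                     # out-of-sample realization of xi_t
node = closest_node(xi_sim, lattice_t)
v = v_hat_at[node]                # value function used for the decision
```

Repeating this lookup along each simulated trajectory and averaging the resulting profits yields OV; comparing OV with the in-sample objective is the lattice-adequacy check described above.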

SLIDE 56

Numerical Examples

Idea: Benchmark ADDP against state-of-the-art tree-based modeling.

Approach: For a given problem,
- solve in a rolling stochastic fashion N times;
- use common random numbers (scenarios) to build trees and lattices;
- use the same out-of-sample scenarios to evaluate decisions;
- use SCENRED2 (GAMS) to build scenario trees;
- solve tree-based problems via their deterministic equivalents;
- test whether observed differences are significant.

SLIDE 57

Rolling Stochastic Optimization

SLIDE 59

Flower Girl

  max E[ Σ_{t=1}^T  s_t p_t^R − b_t p_t^W ]
  s.t.  s_t ≤ max(D_t, I_t),  ∀t
        I_t = γ I_{t−1} + b_{t−1} − s_t,  ∀t
        0 ≤ I_t ≤ Ī,  ∀t

- Newsvendor with storage I_t of capacity Ī
- Optimal buying (b_t) and selling (s_t) decisions for flowers
- Deterministic wholesale and retail prices p_t^W, p_t^R
- Random demand D_t following a driftless GBM

   T | Profit Tree | Profit ADDP | Diff % | paired t-test
   5 |       3,919 |       3,925 |  0.15% | –
  10 |       5,743 |       5,945 |  3.51% | –
  15 |      17,297 |      18,940 |  9.50% | 4.1%
  20 |      21,303 |      23,423 |  9.95% | 4.1%
  50 |      57,343 |      72,543 | 26.51% | 2.0%

SLIDE 60

Flower Girl

- Same model, now with a random sales price p_t^R following a driftless GBM
- Sales prices and demand are assumed to be independent

   T | Profit Tree | Profit ADDP | Diff % | paired t-test
   5 |       2,973 |       3,180 |  6.97% | –
  10 |      10,508 |      11,813 | 12.42% | –
  15 |      57,389 |      66,122 | 15.22% | –
  20 |      82,332 |     102,433 | 24.41% | 5.1%
  50 |     120,321 |     155,312 | 29.08% | 1.0%

SLIDE 61

Flower Girl

  max (1 − λ) E[ Σ_{t=1}^T s_t p_t^R − b_t p_t^W ] + λ CVaR_α( Σ_{t=1}^T s_t p_t^R − b_t p_t^W )
  s.t.  s_t ≤ max(D_t, I_t),  ∀t
        I_t = γ I_{t−1} + b_{t−1} − s_t,  ∀t
        0 ≤ I_t ≤ Ī,  ∀t

- CVaR-expectation mixture in the objective
- Sales prices deterministic

   T | Profit Tree | Profit ADDP | Diff % | paired t-test
   5 |       2,936 |       3,143 |  7.04% | –
  10 |       9,284 |      10,675 | 14.98% | –
  15 |      57,893 |      71,050 | 22.73% | 4.5%
  20 |      79,871 |     103,212 | 29.22% | 2.1%

SLIDE 62

Aggregate Production Planning

  min E[ Σ_{t=1}^T Σ_{i=1}^I  c_it X_it + h_it I_it + r_t R_1 + o_t O_t ]
  s.t.  I_{i,t} − I_{i,t+1} + X_it = D_it
        Σ_{i=1}^I m_i X_it ≤ R_1 + O_t
        R_1 ≤ b
        X, I, R, O ≥ 0

Aggregate production planning with n products, Bitran et al. [1981], where
- c_it . . . unit production costs excluding labor
- h_it . . . inventory carrying cost
- m_i . . . hours required to produce one unit
- r_t, o_t . . . regular and overtime labor costs
- D_it . . . random demand for each of the i products
- R_1 . . . regular labor hiring decision
- O_t . . . overtime labor hiring decision
- X_it . . . production decision for each of the i products
- I_it . . . inventory decision for each of the i products

SLIDE 63

Aggregate Production Planning

Characteristics
- Stochastic problem with scalable dimension (number of products)
- Two types of decisions
  - Inventory and production: in anticipation of the next stage
  - Regular labor: first-stage decision only

SLIDE 64

Aggregate Production Planning

- Random demand modeled as a GBM
- Correlation between products modeled by a random covariance matrix sampled according to Lewandowski et al. [2009]
- Different levels of dependency tested
- Lattices and trees built from 50,000 to 1,000,000 scenarios
- Problem parameters similar to Bitran et al. [1981]

SLIDE 65

Results APP

  I |  T | Correlation | Cost Tree | Cost Lattice | Diff % | p-value
  1 |  5 | medium      |    12,474 |       11,253 |  10.85 | <1%
  3 |  5 | medium      |     3,084 |        2,739 |  12.57 | <1%
  5 |  5 | medium      |     7,974 |        7,040 |  13.26 | <1%
  7 |  5 | medium      |    16,275 |       14,315 |  13.69 | <1%
  9 |  5 | medium      |    17,558 |       11,745 |  49.49 | <1%
  5 |  5 | very high   |    12,432 |        9,957 |  19.91 | <1%
  5 |  5 | high        |    80,095 |       70,307 |  12.22 | <1%
  5 |  5 | medium      |    79,129 |       70,254 |  11.22 | <1%
  5 |  5 | low         |    79,263 |       70,230 |  11.40 | <1%
  5 |  3 | medium      |     4,466 |        4,146 |   7.15 | 2.8%
  5 |  5 | medium      |     8,009 |        7,066 |  11.78 | <1%
  5 | 10 | medium      |    15,843 |       10,382 |  34.47 | <1%

SLIDE 66

QUASAR: An ADDP Implementation

- ADDP implementation in Java
  - Generation of lattices
  - Parallelized solution algorithm
- iPython and MATLAB interfaces
  - Easy algebraic formulation of problems and stochastic processes
  - Analysis and export facilities
- Free for academic use: http://www.quantego.com

SLIDE 67

QUASAR Flower Girl Implementation

The MATLAB listing below is reconstructed from the slide; the OCR-damaged identifiers (`order`, `optBuilder`, `optProblem`) have been repaired, and the line defining `gamma` from the otherwise unused `loss` parameter is an assumption added for completeness.

```matlab
T = 10; orderPrice = 2; sellPrice = 3; loss = 0.2; storageSize = 1000;
gamma = 1 - loss;   % storage retention factor (assumed; not on the slide)

model = quasarDecisionProblem('FlowerGirl');
prevStorage = 0; lastOrder = 0;
profit = quasarExpression();
for t = 0:T-1
    storage    = model.addVariable(t, 'storage');  % amount in storage
    storeDelta = model.addVariable(t, 'storeD');   % change in storage
    discard    = model.addVariable(t, 'discard');  % flowers discarded
    order      = model.addVariable(t, 'order');    % flowers ordered
    sell       = model.addVariable(t, 'sell');     % amount of flowers sold

    model.addConstraint(storage == gamma*prevStorage + storeDelta - discard);
    model.addConstraint(storeDelta == lastOrder - sell + discard);
    model.addConstraint(storage <= storageSize);
    model.addConstraint(sell <= quasarRandomVariable('D'));

    profit = profit - orderPrice * order + sellPrice * sell;
    lastOrder = order; prevStorage = storage;
end
model.maximize(profit);

vol = 0.1; rho = 1.0; D0 = 1000;
errors = quasarLognormalDist.createUnivariate(0, vol);
process = quasarAutoregressiveModel.createUnivariate('D', D0, rho, errors);
process.geometric;

optBuilder = quasarDynamicOptimizerBuilder.create(model, process, 100);
optProblem = optBuilder.build; optProblem.solve;
```

SLIDE 68

Literature 1

- V. Bally and G. Pagès. A quantization algorithm for solving multi-dimensional discrete-time optimal stopping problems. Bernoulli, 9(6):1003–1049, 2003.
- J. F. Benders. Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4:238–252, 1962.
- G. Bitran, E. Haas, and A. Hax. Hierarchical production planning: A single-stage system. Operations Research, 1:717–743, 1981.
- J. Dupacová, N. Gröwe-Kuska, and W. Römisch. Scenario reduction in stochastic programming: An approach using probability metrics. Mathematical Programming, Ser. A, 95:493–511, 2003.
- M. Dyer and L. Stougie. Computational complexity of stochastic programming problems. Mathematical Programming, 106(3):423–432, 2006.
- G. A. Hanasusanto, D. Kuhn, and W. Wiesemann. A comment on "computational complexity of stochastic programming problems". Mathematical Programming, 159(1):557–569, 2016.
- H. Heitsch and W. Römisch. Scenario reduction algorithms in stochastic programming. Computational Optimization and Applications, 24(2-3):187–206, 2003.
- K. Høyland and S. W. Wallace. Generating scenario trees for multistage decision problems. Management Science, 47(2):295–307, 2001.
- K. Høyland, M. Kaut, and S. W. Wallace. A heuristic for moment-matching scenario generation. Computational Optimization and Applications, 24(2):169–185, 2003.
- M. Kaut and S. W. Wallace. Evaluation of scenario-generation methods for stochastic programming. Stochastic Programming E-Print Series, 14(1):1–14, 2003.
- A. Kiszka and D. Wozabal. A stability result for linear Markov decision processes. Working paper, 2018.
- D. Lewandowski, D. Kurowicka, and H. Joe. Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100:2177–2189, 2009.
- N. Löhndorf and D. Wozabal. Indifference pricing of natural gas storage contracts. Technical report, 2018.
- N. Löhndorf, D. Wozabal, and S. Minner. Optimizing trading decisions for hydro storage systems using approximate dual dynamic programming. Operations Research, 61:810–823, 2013.
- F. V. Louveaux. A solution method for multistage stochastic programs with recourse with application to an energy investment problem. Operations Research, 28:889–902, 1980.
- T. Pennanen. Epi-convergent discretizations of multistage stochastic programs. Mathematics of Operations Research, 30(1):245–256, 2005.
- T. Pennanen. Epi-convergent discretizations of multistage stochastic programs via integration quadratures. Mathematical Programming, 116(1-2):461–479, 2009.
- M. V. F. Pereira and L. M. V. G. Pinto. Multi-stage stochastic optimization applied to energy planning. Mathematical Programming, 52(2):359–375, 1991.

SLIDE 69

Literature 2

- G. Pflug and A. Pichler. A distance for multistage stochastic optimization models. SIAM Journal on Optimization, 22(1):1–23, 2012.
- G. Ch. Pflug. Version-independence and nested distributions in multistage stochastic optimization. SIAM Journal on Optimization, 20(3):1406–1420, 2009.
- G. Ch. Pflug. Scenario tree generation for multiperiod financial optimization by optimal discretization. Mathematical Programming, 89(2):251–271, 2001.
- A. Shapiro. Inference of statistical bounds for multistage stochastic programming problems. Mathematical Methods of Operations Research, 58(1):57–68, 2003.
- A. Shapiro. Stochastic programming approach to optimization under uncertainty. Mathematical Programming, 112(1):183–220, 2008.
- A. Shapiro and A. Nemirovski. On complexity of stochastic programming problems. In V. Jeyakumar and A. Rubinov, editors, Continuous Optimization, volume 99 of Applied Optimization, pages 111–146, 2005.
- D. B. Shmoys and C. Swamy. An approximation scheme for stochastic linear programming and its application to stochastic integer programs. Journal of the ACM, 53(6):978–1012, 2006.
- C. Swamy. Sampling-based approximation algorithms for multistage stochastic optimization. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pages 357–366, 2005.
- R. Van Slyke and R. J-B. Wets. L-shaped linear programs with applications to optimal control and stochastic programming. SIAM Journal on Applied Mathematics, 12:638–663, 1969.