

slide-1
SLIDE 1

Utility Preference Robust Optimization: Piecewise Linear Approximation and Statistical Robustness

Huifu Xu

School of Mathematical Sciences, University of Southampton, UK. (Joint work with Shaoyan Guo)

Presented at CMS & MMEI Chemnitz 29 March 2019

1 / 59

slide-2
SLIDE 2

The PRO model

Expected utility maximization

Consider the following one-stage expected utility maximization problem

max_{x∈X} E_P[u(f(x, ξ(ω)))],

where u : ℝ → ℝ is a real-valued utility function, f is a continuous function of x and ξ representing a financial position or an engineering design, x ∈ X is a decision vector, and ξ : Ω → Ξ is a vector of random variables defined over a probability space (Ω, F, P) with a bounded support set Ξ ⊂ ℝ^k.

2 / 59
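As a quick numerical illustration of the slide above, the sample-average version of the expected utility maximization can be sketched as follows; the log utility, linear payoff f(x, ξ) = xξ and the return distribution are illustrative assumptions, not from the talk:

```python
import numpy as np

# Sample-average sketch of max_{x in X} E_P[u(f(x, xi))] over a grid on X = [0, 1].
rng = np.random.default_rng(1)
xi = rng.normal(0.05, 0.2, size=5000)            # hypothetical return scenarios

def u(t):                                        # a concave utility (assumption)
    return np.log1p(np.maximum(t, -0.99))

def f(x, xi_):                                   # position x in a risky asset
    return x * xi_

X = np.linspace(0.0, 1.0, 101)
sa_values = [float(u(f(x, xi)).mean()) for x in X]   # sample-average objectives
x_opt = float(X[int(np.argmax(sa_values))])
```

The grid search stands in for whatever solver one would use once u and f are fixed.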

slide-3
SLIDE 3

The PRO model

Ambiguity in utility function

  • There is often inadequate information for the decision maker to identify a precise utility function
  • It is too complex to elicit through case studies
  • Stakeholders fail to reach a consensus on utility preference
  • · · ·

3 / 59


slide-5
SLIDE 5

The PRO model

Preference robust optimization

max_{x∈X} E_P[u(f(x, ξ(ω)))]

⇓

(PRO)   ϑ := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ(ω)))],

where U is an ambiguity set of utility functions which contains or approximates the true unknown utility function with a high likelihood.

Question: How to construct U? Can the (PRO) model be solved efficiently?

4 / 59

slide-6
SLIDE 6

The PRO model

Construction of the ambiguity set

  • Parametric utility functions, e.g., the S-shaped utility

    u(t) = t^α if t ≥ 0,  and  u(t) = −λ(−t)^β otherwise.

  • Mixture of some utility functions, e.g., αu₁(t) + (1 − α)u₂(t) for α ∈ (0, 1)
  • Kantorovich metric
  • Relative entropy, i.e., Kullback-Leibler divergence
  • · · ·

5 / 59
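The S-shaped utility on the slide can be written down directly; the parameter values below (α = β = 0.88, λ = 2.25) are the classic prospect-theory choices and are only illustrative:

```python
import numpy as np

# S-shaped utility: power gains above 0, loss-averse power losses below 0.
def s_shaped(t, alpha=0.88, beta=0.88, lam=2.25):
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0, np.abs(t) ** alpha, -lam * np.abs(t) ** beta)
```

Such a parametric family reduces the ambiguity set to a set of parameter vectors (α, β, λ).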


slide-10
SLIDE 10

The PRO model

Construction of the ambiguity set – moment type conditions

Hu and Mehrotra (2015):

U := { u ∈ 𝒰 : ∫_a^b ψ_j(t) du(t) ≤ c_j, for j = 1, · · · , m },

where 𝒰 is restricted to a class of nonconstant increasing functions defined over [a, b] with u(a) = 0, u(b) = 1, and ψ_j : [a, b] → ℝ, j = 1, · · · , m, are a class of u-integrable functions.

Comment: Each utility function in 𝒰 can be regarded as a cumulative distribution function (cdf) of some random variable supported over [a, b].

6 / 59


slide-14
SLIDE 14

The PRO model

Example 1: Pairwise comparison

Let A and B be two prospects and FA and FB their respective cdfs. In a case study or a survey, the decision maker is found to prefer B to A:

E_P[u(A)] ≤ E_P[u(B)], that is, ∫_a^b u(t) dFA(t) ≤ ∫_a^b u(t) dFB(t).

By integration by parts, the inequality above is equivalent to

∫_a^b FA(t) du(t) ≥ ∫_a^b FB(t) du(t),

so

U := { u : ∫_a^b (FB(t) − FA(t)) du(t) ≤ 0 }.

7 / 59
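The equivalence on this slide rests on the integration-by-parts identity ∫u dF + ∫F du = u(b)F(b) − u(a)F(a). A numerical check with a hypothetical utility u(t) = (t/2)² on [0, 2] and two simple prospects (A pays 0.5 surely; B pays 0 or 2 with equal probability):

```python
import numpy as np

a, b = 0.0, 2.0
t = np.linspace(a, b, 200001)
u = (t / 2.0) ** 2                 # hypothetical utility: u(0) = 0, u(2) = 1
FA = (t >= 0.5).astype(float)      # cdf of A: pays 0.5 surely
FB = np.where(t < 2.0, 0.5, 1.0)   # cdf of B: pays 0 or 2 with prob 0.5 each

def stieltjes(g, h):
    """Trapezoidal approximation of the Stieltjes integral int g dh on the grid."""
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(h)))

EuA, EuB = stieltjes(u, FA), stieltjes(u, FB)   # E[u(A)], E[u(B)]
```

For this u the decision maker prefers B, and the du-form inequality ∫FA du ≥ ∫FB du holds as the slide claims.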


slide-17
SLIDE 17

The PRO model

Example 2: Certainty equivalent (Hu and Mehrotra (2015))

Consider a lottery where an investor wins 200% with probability 0.25 and nothing with probability 0.75. The investor’s certainty equivalent of the lottery lies in [0.16, 0.24], which means the investor will play the game if he is offered a cash amount of 0.16 or less and leave the game if offered a cash amount of 0.24 or more. This can be described as

u(0.16) ≤ 0.75u(0) + 0.25u(2),   u(0.24) ≥ 0.75u(0) + 0.25u(2).

Since u(0) = 0 and u(2) = 1, this gives u(0.16) ≤ 0.25 and u(0.24) ≥ 0.25.

8 / 59

slide-18
SLIDE 18

The PRO model

Example 2 (continued)

This is equivalent to

U := { u : ∫_0^2 −✶_{t≥0.16} du(t) ≤ −0.75,  ∫_0^2 ✶_{t≥0.24} du(t) ≤ 0.75 },

where ✶_{t≥t₀}(·) is an indicator function.

9 / 59
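The certainty-equivalent elicitation above reduces to two inequalities on u. A small sketch of the check; both candidate utilities below are hypothetical, chosen so that u(0) = 0 and u(2) = 1:

```python
# CE in [0.16, 0.24] for the lottery (win 2 w.p. 0.25, 0 w.p. 0.75) means
# u(0.16) <= 0.25 and u(0.24) >= 0.25, given the normalization u(0)=0, u(2)=1.
def satisfies_ce(u, lo=0.16, hi=0.24, lottery_utility=0.25):
    return u(lo) <= lottery_utility and u(hi) >= lottery_utility

u_steep = lambda t: min(1.25 * t, 1.0)   # u(0.2) = 0.25: consistent with the CE
u_linear = lambda t: t / 2.0             # u(0.24) = 0.12 < 0.25: inconsistent
```

Each elicited CE interval thus contributes two linear constraints on u, of the same moment type as before.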


slide-21
SLIDE 21

The PRO model

How to solve the (PRO)?

Given a concrete ambiguity set U, how do we solve

(PRO)   ϑ := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ(ω)))]?

Existing methods:
  • When u is concave, represent u by the upper envelope of linear functions (Armbruster and Delage (2012), Hu and Mehrotra (2015)) and reformulate (PRO) as a linear program.
  • When u is quasiconcave, use hockey-stick type functions max(at + b, c) and reformulate (PRO) as a MILP (Haskell, Huang and Xu (2018)).

10 / 59

slide-22
SLIDE 22

The PRO model

Upper envelope of a concave function by linear functions

11 / 59

slide-23
SLIDE 23

The PRO model

Upper envelope of a quasiconcave function by hockey stick type functions

12 / 59

slide-24
SLIDE 24

Part I: Piecewise linear approximation

13 / 59

slide-25
SLIDE 25

Part I: Piecewise linear approximation

Figure: a piecewise linear function of t on [0, 2]

14 / 59

slide-26
SLIDE 26

Part I: Piecewise linear approximation

Approximated ambiguity set

UN := { uN ∈ 𝒰N : ∫_a^b ψ_j(t) duN(t) ≤ c_j, for j = 1, · · · , m },

where 𝒰N is a class of continuous, piecewise linear functions defined over the interval [t1, tN] with kinks on T := {t1, · · · , tN}, an ordered sequence of points in [a, b] with t1 < · · · < tN.

15 / 59

slide-27
SLIDE 27

Part I: Piecewise linear approximation

Solve

(PRO)   ϑ := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ(ω)))]

via solving

(PRO-N)   ϑN := max_{x∈X} min_{u∈UN} E_P[u(f(x, ξ(ω)))].

Comment: Since UN ⊂ U, the inner minimum over UN is no smaller than that over U, so ϑN ≥ ϑ, i.e., ϑN provides an upper bound for ϑ.

16 / 59


slide-29
SLIDE 29

Part I: Piecewise linear approximation

Concave case

(PRO-N)   ϑN := max_{x∈X} min_{u∈UN} Σ_{k=1}^K p_k u(f(x, ξ^k)).

max_{x∈X} min_{a_k,b_k,α,β}  Σ_{k=1}^K p_k (a_k f(x, ξ^k) + b_k)   (1a)
s.t.  α_{i+1} − α_i = β_i (t_{i+1} − t_i), ∀i ∈ I∖{N},   (1b)
      β_i ≥ β_{i+1}, ∀i ∈ I∖{N},   (1c)
      β_i ≥ 0, ∀i ∈ I∖{N},   (1d)
      Σ_{i=1}^{N−1} β_i ∫_{t_i}^{t_{i+1}} ψ_j(t) dt ≤ c_j, ∀j = 1, · · · , m,   (1e)
      a_k t_i + b_k ≥ α_i, ∀i ∈ I, k = 1, · · · , K,   (1f)
      a_k ≥ 0, k = 1, · · · , K.   (1g)

17 / 59

slide-30
SLIDE 30

Part I: Piecewise linear approximation

Figure: a piecewise linear utility with kinks at t1, · · · , t5 and piece slopes βi, evaluated at f(x, ξ^k)

18 / 59

slide-31
SLIDE 31

Part I: Piecewise linear approximation

max_{x∈X, θ, λ, µ, v}  Σ_{j=1}^m λ_j c_j + θ_{N−1} − v_{N−1} + Σ_{k=1}^K µ_{Nk}   (2a)
s.t.  p_k f(x, ξ^k) − Σ_{i=1}^N µ_{ik} t_i ≥ 0, k = 1, · · · , K,   (2b)
      p_k − Σ_{i=1}^N µ_{ik} = 0, k = 1, · · · , K,   (2c)
      θ_i t_i − θ_i t_{i+1} + v_{i−1}(t_i − t_{i−1}) + Σ_{j=1}^m λ_j ∫_{t_i}^{t_{i+1}} ψ_j(t) dt ≥ 0, i = 2, · · · , N − 1,   (2d)
      θ_1 t_1 − θ_1 t_2 + Σ_{j=1}^m λ_j ∫_{t_1}^{t_2} ψ_j(t) dt ≥ 0,   (2e)
      v_{N−1}(t_N − t_{N−1}) ≥ 0,   (2f)
      θ_{i−1} − θ_i + Σ_{k=1}^K µ_{ik} − v_{i−1} + v_i = 0, i = 2, · · · , N − 1,   (2g)
      λ_j ≥ 0, j = 1, · · · , m,   (2h)
      µ_{ik} ≥ 0, i = 1, · · · , N, k = 1, · · · , K,   (2i)
      v_i ≥ 0, i = 1, · · · , N − 1,   (2j)

where θ ∈ ℝ^{N−1}, λ ∈ ℝ^m, µ ∈ ℝ^{N×K} and v ∈ ℝ^{N−1}.

19 / 59


slide-36
SLIDE 36

Part I: Piecewise linear approximation

Without concavity

(PRO-N)   max_{x∈X} min_{u∈UN} Σ_{k=1}^K p_k u(f(x, ξ^k)).

Algorithm 2.1
Step 0. Choose some initial x0 ∈ X.
Step 1. For s = 1, 2, · · · , do
    u_s = arg min_{u∈UN} E[u(f(x_{s−1}, ξ))],
    x_s = arg max_{x∈X} E[u_s(f(x, ξ))].
Step 2. Stop when x_s = x_{s−1}.

20 / 59
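Algorithm 2.1 alternates the inner minimization and the outer maximization. The loop can be sketched as follows, with the LP step replaced by a search over a small finite family of candidate utilities and a grid over X; the payoff, scenarios, and candidate family are all illustrative assumptions:

```python
import numpy as np

# Toy stand-ins: X is a grid on [0, 1], f(x, xi) = 1 + x*xi, and U_N is a small
# finite family of increasing utilities on [0, 2] sampled at fixed knots.
rng = np.random.default_rng(0)
xi = rng.uniform(-0.5, 1.0, size=200)              # scenarios, equal weights
X = np.linspace(0.0, 1.0, 21)
knots = np.linspace(0.0, 2.0, 9)
candidates = [(knots / 2.0) ** g for g in (0.3, 0.5, 0.7, 1.0)]  # u(0)=0, u(2)=1

def expected_utility(x, u_vals):
    return float(np.interp(1.0 + x * xi, knots, u_vals).mean())

x_s = X[0]                                          # Step 0: initial point
for _ in range(50):                                 # Step 1: alternate min/max
    u_s = min(candidates, key=lambda u: expected_utility(x_s, u))
    x_next = max(X, key=lambda x: expected_utility(x, u_s))
    if x_next == x_s:                               # Step 2: stop at a fixed point
        break
    x_s = x_next
```

In the actual algorithm the inner step is the LP of the next slide rather than a finite search.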


slide-38
SLIDE 38

Part I: Piecewise linear approximation

u_s = arg min_{u∈UN} E[u(f(x_{s−1}, ξ))]

u(t) = (a_1 t + b_1) ✶_{[t1,t2]}(t) + Σ_{j=2}^{N−1} (a_j t + b_j) ✶_{(t_j,t_{j+1}]}(t).

(a^s, b^s) ∈ arg min_{(a_j,b_j), j=1,··· ,N−1}  Σ_{k=1}^K p_k [ (a_1 f(x_{s−1}, ξ^k) + b_1) ✶_{[t1,t2]}(f(x_{s−1}, ξ^k)) + Σ_{j=2}^{N−1} (a_j f(x_{s−1}, ξ^k) + b_j) ✶_{(t_j,t_{j+1}]}(f(x_{s−1}, ξ^k)) ]
s.t.  a_{j−1} t_j + b_{j−1} = a_j t_j + b_j, j = 2, · · · , N − 1,
      a_1 t_1 + b_1 = 0,  a_{N−1} t_N + b_{N−1} = 1,
      0 ≤ a_j ≤ L, j = 1, · · · , N − 1,
      Σ_{j=1}^{N−1} a_j ∫_{t_j}^{t_{j+1}} ψ_i(t) dt ≤ c_i, i = 1, · · · , m,

which is a linear programming problem.

21 / 59
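For fixed x the problem above is an LP in the slopes a_j (the intercepts b_j are pinned down by continuity and the endpoint conditions). A brute-force sketch of that inner minimization over a coarse grid of feasible increment profiles, with the moment constraints ∫ψ_i du ≤ c_i dropped for simplicity; payoffs and knots are hypothetical:

```python
import itertools
import numpy as np

t = np.linspace(0.0, 2.0, 5)             # knots t_1 < ... < t_5 (4 pieces)
f_vals = np.array([0.3, 0.9, 1.4, 1.9])  # payoffs f(x, xi_k) for a fixed x
p = np.full(4, 0.25)                     # scenario probabilities p_k

def u_from_increments(w, s):
    """Piecewise linear u with u(t_1)=0, u(t_5)=1 and per-piece increments w."""
    vals = np.concatenate([[0.0], np.cumsum(w)])   # values of u at the knots
    return np.interp(s, t, vals)

best = None
for w in itertools.product(np.linspace(0.0, 1.0, 6), repeat=4):
    if abs(sum(w) - 1.0) > 1e-9:         # normalization u(t_5) - u(t_1) = 1
        continue                         # (nonnegative increments => monotone u)
    val = float(p @ u_from_increments(np.array(w), f_vals))
    if best is None or val < best:
        best = val
```

Because the objective is linear in the increments, the true LP optimum is attained at a vertex of the feasible polytope; the grid here only illustrates the structure.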


slide-41
SLIDE 41

Part I: Piecewise linear approximation

x_s = arg max_{x∈X} E[u_s(f(x, ξ))]

u(t) = (a_1 t + b_1) ✶_{[t1,t2]}(t) + Σ_{j=2}^{N−1} (a_j t + b_j) ✶_{(t_j,t_{j+1}]}(t).

x_s ∈ arg max_{x∈X}  Σ_{k=1}^K p_k [ (a^s_1 f(x, ξ^k) + b^s_1) ✶_{[t1,t2]}(f(x, ξ^k)) + Σ_{j=2}^{N−1} (a^s_j f(x, ξ^k) + b^s_j) ✶_{(t_j,t_{j+1}]}(f(x, ξ^k)) ].

Comment: When f is linear in x, E[u_s(f(x, ξ))] is quasiconcave.

22 / 59


slide-43
SLIDE 43

Part I: Piecewise linear approximation

Proposition 2.1 (Convergence) Algorithm 2.1 either terminates in a finite number of steps with a solution to the (PRO-N) model or generates a sequence {x_s, u_s} whose cluster points are optimal solutions of the (PRO-N) model.

Question: What is the difference between (PRO) and (PRO-N) in terms of the optimal value?

23 / 59


slide-45
SLIDE 45

Part I: Piecewise linear approximation

Quantify the difference between UN and U

U := { u ∈ 𝒰 : ∫_a^b ψ_j(t) du(t) ≤ c_j, for j = 1, · · · , m },

UN := { uN ∈ 𝒰N : ∫_a^b ψ_j(t) duN(t) ≤ c_j, for j = 1, · · · , m }.

Let Ψ := (ψ1, · · · , ψm) and C := (c1, · · · , cm). Then U and UN can be rewritten succinctly as U = {u ∈ 𝒰 : ⟨Ψ, u⟩ ≤ C} and UN = {uN ∈ 𝒰N : ⟨Ψ, uN⟩ ≤ C}.

24 / 59


slide-48
SLIDE 48

Part I: Piecewise linear approximation

How to measure the difference between U and UN?

Let G be a set of measurable functions defined over [a, b]. For u, v ∈ 𝒰, define the pseudo-metric between u and v by

dl_G(u, v) := sup_{g∈G} |⟨g, u⟩ − ⟨g, v⟩|.

dl_G(u, v) = 0 ⇔ ⟨g, u⟩ = ⟨g, v⟩ for all g ∈ G; see Römisch (2003).

dl_G(u, U) := inf_{v∈U} dl_G(u, v).

25 / 59

slide-49
SLIDE 49

Part I: Piecewise linear approximation

Assumption 2.1 There exists a positive number θ such that

sup_{g∈G, u∈𝒰} ∫_a^b |g(t)| du(t) < θ.   (4)

Assumption 2.2 Each function u ∈ 𝒰 is differentiable and Lipschitz continuous over [a, b], with modulus bounded by L, and sup_{t∈[a,b]} |u″(t)| ≤ L̃.

26 / 59

slide-50
SLIDE 50

Part I: Piecewise linear approximation

Measuring deviation of u from the ambiguity set U

Let ⟨ψ_j, u⟩ = ∫_a^b ψ_j(t) du(t), and write the ambiguity set as U = {u ∈ 𝒰 : ⟨Ψ, u⟩ ≤ C}. The ambiguity set is the solution set of a linear inequality system! For u ∈ 𝒰, what is dl_G(u, U)?

27 / 59

slide-51
SLIDE 51

Part I: Piecewise linear approximation

Lemma 2.1 (Hoffman’s lemma) Assume (Slater’s condition): there exist a positive constant α and a function u0 ∈ 𝒰 such that ⟨Ψ, u0⟩ − C + αB ⊂ ℝ^m_−. Then

dl_G(u, U) ≤ (∆(u, u0)/α) ‖(⟨Ψ, u⟩ − C)_+‖,

where (a)_+ = max(0, a) with the maximum taken componentwise, and ∆(u, u0) := sup_{g∈G} |⟨g, u⟩ − ⟨g, u0⟩|.

28 / 59


slide-55
SLIDE 55

Part I: Piecewise linear approximation

How to estimate ∆(u, u0)?

  • If we set G := { g : [a, b] → ℝ | g is measurable, sup_{t∈[a,b]} |g(t)| ≤ 1 }, then ∆(u, u0) corresponds to the total variation metric, in which case ∆(u, u0) ≤ 2.
  • If we set G := { g : [a, b] → ℝ | g is Lipschitz with modulus bounded by 1 }, then ∆(u, u0) corresponds to the Kantorovich metric, in which case ∆(u, u0) ≤ b − a.
  • If each g ∈ G is restricted to be both Lipschitz continuous with modulus bounded by 1 and bounded by 1 in absolute value, then ∆(u, u0) corresponds to the bounded Lipschitz metric, in which case ∆(u, u0) ≤ max(2, b − a).

29 / 59

slide-56
SLIDE 56

Part I: Piecewise linear approximation

Proposition 2.2 Assume that u(·) is differentiable and Lipschitz continuous over the interval [a, b] with modulus L, and let

uN(t) := u(t_{i−1}) + [(u(t_i) − u(t_{i−1})) / (t_i − t_{i−1})] (t − t_{i−1}), for t ∈ [t_{i−1}, t_i], i = 2, · · · , N,

where t1 = a, tN = b. Then

sup_{t∈[a,b]} |uN(t) − u(t)| ≤ L βN,  where βN := max_{i=2,··· ,N} (t_i − t_{i−1}).

30 / 59
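Proposition 2.2 is easy to check numerically; a sketch for u(t) = sin t on [0, π], which has Lipschitz modulus L = 1 (the choice of u and mesh is illustrative):

```python
import numpy as np

# Piecewise linear interpolant of u(t) = sin(t) on N equally spaced knots.
a, b, N = 0.0, np.pi, 20
knots = np.linspace(a, b, N)                  # t_1 = a, ..., t_N = b
beta_N = float(np.max(np.diff(knots)))        # mesh size beta_N
s = np.linspace(a, b, 10001)
u_N = np.interp(s, knots, np.sin(knots))      # interpolant u_N
err = float(np.max(np.abs(u_N - np.sin(s)))) # sup-norm error
```

The observed error is in fact much smaller than L·βN (it scales like βN² when u″ is bounded), so the slide's bound is conservative here.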

slide-57
SLIDE 57

Part I: Piecewise linear approximation

Figure: utilities u1, · · · , u4, an approximation v(u), and the deviation u − v(u)

31 / 59

slide-58
SLIDE 58

Part I: Piecewise linear approximation

Theorem 2.1 (Error bound on the discrete approximation of the ambiguity set) Assume that sup_{t∈[a,b]} |u″(t)| ≤ L̃. Then there exist α̂ < α and N0 such that

H_G(UN, U) ≤ ( L̃θ/L + (∆/α̂) √m (b − a) sup_{j=1,··· ,m, t∈[a,b]} |ψ_j(t)| L̃ ) βN

for all N ≥ N0, where ∆ := sup_{u,ũ∈U} ∆(u, ũ) and βN := max_{i=2,··· ,N} (t_i − t_{i−1}).

32 / 59

slide-59
SLIDE 59

Part I: Piecewise linear approximation

Theorem 2.2 (Error bound on the optimal value) Assume the setting and conditions of Theorem 2.1. Then

|ϑN − ϑ| ≤ ( L̃θ/L + (∆/α̂) √m (b − a) sup_{j=1,··· ,m, t∈[a,b]} |ψ_j(t)| L̃ ) βN

for all N ≥ N0, where L̃, θ, α̂ and N0 are defined as in Theorem 2.1.

33 / 59


slide-61
SLIDE 61

Part I: Piecewise linear approximation

Stability against variation of data in the definition of ambiguity set U

Recall

(PRO)   ϑ := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ(ω)))].

With true data, U = {u ∈ 𝒰 : ⟨Ψ, u⟩ ≤ C}. With perturbed data, Ũ = {u ∈ 𝒰 : ⟨Ψ̃, u⟩ ≤ C̃}. How to quantify the difference between Ũ and U under the metric dl_G?

34 / 59


slide-63
SLIDE 63

Part I: Piecewise linear approximation

Proposition 2.3 (Stability of the ambiguity set) Let Assumptions 2.1 and 2.2 hold, and let Slater’s condition be fulfilled. Then

H_G(Ũ, U) ≤ (∆/α) ( sup_{ũ∈Ũ∪U} ‖⟨Ψ − Ψ̃, ũ⟩‖ + ‖C − C̃‖ ),

where ∆ := sup_{u,ũ∈U} ∆(u, ũ) and ∆(u, ũ) is defined as in Lemma 2.1. Moreover,

sup_{ũ∈Ũ∪U} ‖⟨Ψ − Ψ̃, ũ⟩‖ ≤ L ( Σ_{i=1}^m ( ∫_a^b |ψ_i(t) − ψ̃_i(t)| dt )² )^{1/2}.

35 / 59

slide-64
SLIDE 64

Part I: Piecewise linear approximation

Theorem 2.3 (Stability of the optimal value) Under the setting and conditions of Proposition 2.3,

|ϑ̃ − ϑ| ≤ (∆/α) ( sup_{ũ∈Ũ} ‖⟨Ψ − Ψ̃, ũ⟩‖ + ‖C − C̃‖ ),

where ∆ := sup_{u,ũ∈U} ∆(u, ũ).

36 / 59

slide-65
SLIDE 65

Part I: Piecewise linear approximation

Portfolio investment problem (Hu and Mehrotra (2015))

Concave utility function case:

max_{x∈X} min_{u∈U} Σ_{k=1}^K p_k u(1 + xᵀξ^k).

The ambiguity set of utility functions:

U := { u ∈ 𝒰 :
  ∫_0^2 (0.7✶_{t≥0.2} + 0.3✶_{t≥1.2} − 0.5✶_{t≥1} − 0.5✶_{t≥0}) du(t) ≤ 0,
  ∫_0^2 (0.3✶_{t≥0.8} + 0.7✶_{t≥1.8} − 0.5✶_{t≥1} − 0.5✶_{t≥2}) du(t) ≤ 0,
  ∫_0^2 −✶_{t≥0.16} du(t) ≤ −0.75,   ∫_0^2 ✶_{t≥0.24} du(t) ≤ 0.75,
  ∫_0^2 −✶_{t≥0.46} du(t) ≤ −0.5,   ∫_0^2 ✶_{t≥0.54} du(t) ≤ 0.5,
  ∫_0^2 −✶_{t≥0.96} du(t) ≤ −0.25,   ∫_0^2 −✶_{t≥1.04} du(t) ≤ 0.25 },

where 𝒰 is a set of utility functions mapping from [0, 2] to [0, 1].

37 / 59

slide-66
SLIDE 66

Part I: Piecewise linear approximation

Increasing concave utility case

Table: Optimal portfolio decision and optimal value: increasing concave case

Assets: DJI, EFA, GOX, GSPC, IXIC, TNX, TYX, W5000

N = 10: 0.2493, 0.7477, 0.0030; optimal value 0.7389
N = 50: 0.1120, 0.6960, 0.1920; optimal value 0.7377
N = 100: 0.1623, 0.5837, 0.2539; optimal value 0.7372
N = 200: 0.2077, 0.3688, 0.4235; optimal value 0.7369
N = 500: 0.2358, 0.3662, 0.3980; optimal value 0.7366
N = 800: 0.2255, 0.3761, 0.3984; optimal value 0.7366

38 / 59

slide-67
SLIDE 67

Part I: Piecewise linear approximation

Perturbation of the data

U := { u ∈ 𝒰 :
  ∫_0^2 (0.7✶_{t≥0.2} + 0.3✶_{t≥1.2} − 0.5✶_{t≥1} − 0.5✶_{t≥0}) du(t) ≤ δ,
  ∫_0^2 (0.3✶_{t≥0.8} + 0.7✶_{t≥1.8} − 0.5✶_{t≥1} − 0.5✶_{t≥2}) du(t) ≤ δ,
  ∫_0^2 −✶_{t≥0.16} du(t) ≤ −0.75,   ∫_0^2 ✶_{t≥0.24} du(t) ≤ 0.75 + δ,
  ∫_0^2 −✶_{t≥0.46} du(t) ≤ −0.5,   ∫_0^2 ✶_{t≥0.54} du(t) ≤ 0.5 + δ,
  ∫_0^2 −✶_{t≥0.96} du(t) ≤ −0.25,   ∫_0^2 −✶_{t≥1.04} du(t) ≤ 0.25 + δ }.

39 / 59

slide-68
SLIDE 68

Part I: Piecewise linear approximation

Increasing concave utility case: stability analysis

Figure: Optimal value w.r.t. variation of δ: increasing concave case

40 / 59

slide-69
SLIDE 69

Part I: Piecewise linear approximation

Increasing utility case

Table: Optimal portfolio decision and optimal value: increasing case

Assets: DJI, EFA, GOX, GSPC, IXIC, TNX, TYX, W5000

N = 10: 0.2493, 0.7477, 0.0030; optimal value 0.7134
N = 12: 0.1703, 0.7510, 0.0786; optimal value 0.7058
N = 14: 0.1965, 0.6693, 0.1314; optimal value 0.7005
N = 16: 0.0160, 0.4973, 0.4867; optimal value 0.6982
N = 18: 0.0886, 0.7110, 0.2004; optimal value 0.6964
N = 20: 0.1431, 0.5698, 0.2872; optimal value 0.6941

41 / 59

slide-70
SLIDE 70

Part I: Piecewise linear approximation

Increasing utility case

Figure: Optimal value w.r.t. variation of N: increasing case

42 / 59

slide-71
SLIDE 71

Part I: Piecewise linear approximation

Increasing utility case

Figure: Optimal piecewise linear utility functions on [0, 2] for N = 10, 12, 14, 16, 18, 20

43 / 59

slide-72
SLIDE 72

Part II: Statistical Robustness in PRO

Part II: Statistical Robustness in Utility Preference Robust Optimization Models

44 / 59

slide-73
SLIDE 73

Part II: Statistical Robustness in PRO

Motivation – data driven problem

Consider

(PRO)   ϑ(P) := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ))].

Let ξ1, · · · , ξN be an iid sample and PN := (1/N) Σ_{j=1}^N ✶_{ξj}(·).

Comment: If the empirical data are generated by the true probability distribution, then ϑ(PN) is an estimator of ϑ(P). Is ϑ(PN) a good estimator if the empirical data contain some noise?

45 / 59

slide-74
SLIDE 74

Part II: Statistical Robustness in PRO

Empirical distributions

Let P ∈ P(Ξ) be the true probability distribution and PN(·) := (1/N) Σ_{i=1}^N ✶_{ξi}(·) its empirical distribution. Let Q be a perturbation of P and QN(·) := (1/N) Σ_{i=1}^N ✶_{ξ̃i}(·) the empirical distribution of Q.

Q: When Q is close to P, is ϑ(QN) close to ϑ(PN)?

46 / 59


slide-76
SLIDE 76

Part II: Statistical Robustness in PRO

ϑ(QN) → ϑ(Q)  and  ϑ(PN) → ϑ(P), linked by continuity of ϑ.

Sufficient and necessary conditions:
  • Continuity: ϑ(Q) is close to ϑ(P) for all Q close to P.
  • Uniform Glivenko-Cantelli property: QN is close to Q uniformly for all Q close to P.

Statistical robustness: Hampel (1971), Huber and Ronchetti (2009), Cont et al. (2010), Krätschmer, Schied and Zähle (2012), etc.

47 / 59

slide-77
SLIDE 77

Part II: Statistical Robustness in PRO

ψ-weak topology

Let ψ : ℝ^k → [0, ∞) be a continuous function and

M^ψ_k := { P′ ∈ P(ℝ^k) : ∫_{ℝ^k} ψ(t) P′(dt) < ∞ }.

In the particular case where ψ = ‖·‖^p, with ‖·‖ the Euclidean norm on ℝ^k and p a positive number, we write M^p_k for M^{‖·‖^p}_k. M^ψ_k defines a set of probability distributions such that ψ has a finite moment.

48 / 59

slide-78
SLIDE 78

Part II: Statistical Robustness in PRO

Prohorov metric

Let dl_ψ : M^ψ_k × M^ψ_k → ℝ be defined by

dl_ψ(P′, P″) := dl_Proh(P′, P″) + | ∫_{ℝ^k} ψ(t) P′(dt) − ∫_{ℝ^k} ψ(t) P″(dt) |,   (5)

where dl_Proh : P(ℝ^k) × P(ℝ^k) → ℝ_+ is the Prohorov metric defined as follows:

dl_Proh(P′, P″) := inf{ ǫ > 0 : P′(A) ≤ P″(A^ǫ) + ǫ, ∀A ∈ B(ℝ^k) },

where A^ǫ := A + B_ǫ(0) denotes the Minkowski sum of A and the open ball of radius ǫ centred at 0 (w.r.t. the Euclidean norm).

49 / 59

slide-79
SLIDE 79

Part II: Statistical Robustness in PRO

Growth condition

Recall (PRO) ϑ(P) := max

x∈X min u∈U EP[u(f (x, ξ))].

50 / 59

slide-80
SLIDE 80

Part II: Statistical Robustness in PRO

Growth condition

Recall

(PRO)   ϑ(P) := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ))].

Growth condition: there exists a function φ such that |u(f(x, t))| ≤ φ(t) for all (x, t) ∈ ℝ^n × ℝ^k and u(·) ∈ U.

50 / 59


slide-82
SLIDE 82

Part II: Statistical Robustness in PRO

Continuity

Recall

(PRO)   ϑ(P) := max_{x∈X} min_{u∈U} E_P[u(f(x, ξ))].

Theorem 3.1 Let P, Q ∈ M^φ_k. Then

lim_{Q →_{dl_φ} P} ϑ(Q) = ϑ(P).
51 / 59


slide-87
SLIDE 87

Part II: Statistical Robustness in PRO

Example (continuity fails)

Let U be a singleton with u(t) = t for t ∈ ℝ, f(x, ξ) = xξ with x ∈ [0, 1], and let ξ : ℝ → ℝ be a random variable. Assume that E_P[ξ] > 0. Then

ϑ(P) = max_{x∈[0,1]} E_P[u(f(x, ξ))] = max_{x∈[0,1]} x E_P[ξ] = E_P[ξ].

Let ξ1, · · · , ξ_{N−1} be an iid sample generated by a random variable U having uniform distribution over [0, 1], and let ξN = N. Let PN := (1/N) Σ_{i=1}^N ✶_{ξi}(·). Then PN converges weakly to P∗, the cdf of U[0, 1]. However,

ϑ(PN) = E_{PN}[ξ] = (Σ_{j=1}^{N−1} ξj)/N + (1/N) × N → 3/2 > 1/2 = ϑ(P∗).
52 / 59
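The divergence in this example is easy to reproduce; a quick simulation of ϑ(PN) = E_{PN}[ξ] with sample size 10⁵ (the seed is arbitrary):

```python
import numpy as np

# N-1 uniform[0, 1] samples plus one outlier at N: the empirical mean tends
# to 3/2 rather than to 1/2 = E[U].
rng = np.random.default_rng(42)
N = 100_000
sample = np.append(rng.uniform(0.0, 1.0, N - 1), float(N))
vartheta_PN = float(sample.mean())       # approx 0.5*(N-1)/N + 1
```

A single outlier carrying probability 1/N is enough to shift the estimator by a constant.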


slide-89
SLIDE 89

Part II: Statistical Robustness in PRO

What is wrong?

PN does not converge to P∗ under the | · |-weak topology, i.e., ∫ |t| PN(dt) ↛ ∫ |t| P∗(dt): the outlier has taken too much probability!

Revise PN to P̃N, where

P̃N(ξj) = 1/N² for j = N,  and  P̃N(ξj) = (1 − 1/N²) · 1/(N − 1) for j = 1, · · · , N − 1;

then ∫ |t| P̃N(dt) → ∫ |t| P∗(dt).

53 / 59
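The reweighting on this slide restores convergence of the first moment; a quick numerical check (same sampling scheme as before, seed arbitrary):

```python
import numpy as np

# Outlier gets mass 1/N^2, the remaining N-1 points share 1 - 1/N^2:
# the first moment of the reweighted empirical distribution is near 1/2.
rng = np.random.default_rng(7)
N = 100_000
xi = np.append(rng.uniform(0.0, 1.0, N - 1), float(N))
w = np.full(N, (1.0 - 1.0 / N**2) / (N - 1))
w[-1] = 1.0 / N**2
moment = float(w @ xi)                   # approx 1/2 + 1/N
```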

slide-90
SLIDE 90

Part II: Statistical Robustness in PRO

Uniform Glivenko-Cantelli property

Definition 3.1 (Uniform Glivenko-Cantelli property) Let M be a subset of M^φ_k. We say the metric space (M, dl_φ) has the uniform Glivenko-Cantelli (UGC) property if for every ǫ > 0 and δ > 0 there exists N0 ∈ ℕ such that for all Q ∈ M and N ≥ N0,

Q^{⊗N} ( { (ξ1, · · · , ξN) : dl_φ(QN, Q) ≥ δ } ) ≤ ǫ.   (6)

54 / 59

slide-91
SLIDE 91

Part II: Statistical Robustness in PRO

Qualitative robustness of the optimal value

Theorem 3.2 Let p > 1 and M^{φ^p}_{k,κ} := { P ∈ P(ℝ^k) : ∫_{ℝ^k} φ(t)^p P(dt) ≤ κ }. Let M ⊂ M^{φ^p}_{k,κ}. Then:

(i) For any small number ǫ > 0, there exist positive numbers δ > 0 and N0 ∈ ℕ such that for all Q ∈ M,

dl_φ(Q, P) ≤ δ  ⇒  dl_Proh( Q^{⊗N} ◦ ϑ(QN)^{−1}, P^{⊗N} ◦ ϑ(PN)^{−1} ) ≤ ǫ  for N ≥ N0.

55 / 59

slide-92
SLIDE 92

Part II: Statistical Robustness in PRO

Qualitative robustness of the optimal solution

(ii) If, in addition, for each Q ∈ M the sets of optimal solutions S(QN) and S(Q) are singletons, then for any small number ǫ > 0 there exist positive numbers δ > 0 and N0 ∈ ℕ such that for all Q ∈ M,

dl_φ(Q, P) ≤ δ  ⇒  dl_Proh( Q^{⊗N} ◦ S(QN)^{−1}, P^{⊗N} ◦ S(PN)^{−1} ) ≤ ǫ  for N ≥ N0.

56 / 59


slide-94
SLIDE 94

Part II: Statistical Robustness in PRO

Specific cases covered

  • Stochastic programming, when U is a singleton with u(t) = t for t ∈ ℝ:

    (SP)   max_{x∈X} E_P[f(x, ξ)].

  • Utility optimization, when X is a singleton X = {x0}:

    min_{u∈U} E_P[u(f(x0, ξ))].

57 / 59

slide-95
SLIDE 95

Part II: Statistical Robustness in PRO

Conclusion

  • Piecewise linear approximation provides an avenue for solving PRO
  • Quantified the error for the ambiguity set and the optimal value
  • Provided stability analysis against perturbation of the data
  • Derived conditions under which PRO models are statistically robust:
      the more conservative, the more statistically robust;
      the more risk taking, the less statistically robust

58 / 59

slide-96
SLIDE 96

Part II: Statistical Robustness in PRO

Further research

  • Using different approaches to construct the ambiguity set
  • Numerical experiments on statistical robustness
  • Extension to risk management, e.g., robust spectral risk measures

59 / 59