Data-Driven Optimization under Distributional Uncertainty Shuo Han - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Data-Driven Optimization under Distributional Uncertainty

Shuo Han

Postdoctoral Researcher, Electrical and Systems Engineering, University of Pennsylvania

slide-2
SLIDE 2

Shuo Han UTC-IASE, Apr 2017

Internet of Things (IoT)

  • A network of physical objects
    • Devices
    • Vehicles
    • Buildings
  • Allows objects to be
    • Sensed and controlled
    • Remotely across the network
  • Growing rapidly, by 2020:
    • 50 billion devices
    • 6.58 devices per person

slide-3
SLIDE 3

What IoT Brings: More Sensing (and More Data)

  • Total size of dataset: 267 GB
  • 1.1 billion taxi and Uber trips (2009 - 2015)
  • Pick-up and drop-off dates/times, locations, distances...

[Source: nyc.gov]

TLC: Taxi and Limousine Commission

slide-4
SLIDE 4

What IoT Brings: More Control

  • Connected and Autonomous Vehicles
  • Smart Home Appliances
  • Wireless Traffic Light Control
  • Smart Buildings

slide-5
SLIDE 5

Smart Cities: IoT + Decision Support

[Diagram: City Infrastructure; Sensor Data; Control & Optimization Algorithm; Actions]

slide-6
SLIDE 6

Investment in Smart Cities

“... an infrastructure to continuously improve the collection, aggregation, and use of data to improve the life of their residents – by harnessing the growing data revolution, low-cost sensors, and research collaborations, and doing so securely to protect safety and privacy.”

The Smart Cities Initiative from the White House (Sep 2015)

TerraSwarm

slide-7
SLIDE 7

TerraSwarm: Swarm at The Edge of The Cloud

  • “How should we collect data?”
  • “How should we send data?”
  • “How should we make use of data?”

TerraSwarm

slide-8
SLIDE 8

Research Interests

  • Theory: Convex Optimization, Control Theory, Statistics
  • Applications: Energy, Transportation
  • Research Topics: Multi-Agent Systems, Stochastic Systems, Network Dynamics

slide-9
SLIDE 9

Research Overview

Data-Driven Optimization

[Figure: power generation over one day (6am to 6pm)]

[ACC13], [SIOPT15] [CDC15], [TASE16], [ICCPS17]

Pricing for Ridesharing

[Allerton14], [TAC16] [ACC17]

Privacy Solutions for Cyber-Physical Systems

slide-10
SLIDE 10

Research Overview

Data-Driven Optimization

[Figure: power generation over one day (6am to 6pm)]

[ACC13], [SIOPT15] [CDC15], [TASE16], [ICCPS17]

Privacy Solutions for Cyber-Physical Systems

[Allerton14], [TAC16] [ACC17]

Pricing for Ridesharing

slide-11
SLIDE 11

Motivation: Wind Energy Integration

[Diagram: conventional power plant + wind power + storage devices]

Wind Energy Data

[Figure: wind power generation over one day (6am to 6pm). Source: AESO]

Control Action: Allocation of energy storage

How can we make use of the wind power generation data to maximally utilize wind power?

slide-12
SLIDE 12

Motivation: On-Demand Ridesharing in Cities

How can we make use of the trip data to reduce the average wait time for passengers?

  • Pick-up and drop-off times
  • Pick-up/drop-off locations
  • Travel distances

Cause: Mismatch between supply and demand

Control Action: Redistribution of empty vehicles

slide-13
SLIDE 13

Background: Stochastic Programming

$$\underset{x}{\text{minimize}} \quad \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$

  • $f$: Objective function
  • $x$: Decision variable
    • Ridesharing: Redirection of empty vehicles
    • Wind power integration: Allocation of storage
  • $\theta$: Stochastic phenomenon
    • Ridesharing: Future passenger demand
    • Wind power integration: Wind power generation
  • $d$: Probability distribution of $\theta$ (the probability distribution that models the stochastic phenomenon)

slide-14
SLIDE 14

Distribution is Not Always Available

We often do not have: the distribution $d$

Instead, we have: samples of $\theta$ [Figure: power generation over one day (6am to 6pm)]

Question: How should these samples be used in a computationally tractable way with performance guarantees?

slide-15
SLIDE 15

Using Sampled Data: Previous Methods

Sample average approximation

$$\underset{x}{\text{minimize}} \quad \frac{1}{n} \sum_{i=1}^{n} f(x, \theta_i)$$

  • Weak guarantee on performance

Robust optimization

$$\underset{x}{\text{minimize}} \quad \max_{\theta \in \Theta} f(x, \theta)$$

  • Can be extremely conservative

[Figure: samples $\theta_i$ and the uncertainty set $\Theta$]

Distributional Information + Uncertainty?
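The contrast between the two methods can be seen on a toy problem. The sketch below is illustrative only: the cost `f(x, theta) = (x - theta)**2`, the synthetic Beta-distributed samples, the interval `[0, 1]` used as the uncertainty set, and the grid search are all assumptions made for the example, not part of the talk.

```python
import random


def f(x, theta):
    """Hypothetical cost (not from the slides): squared mismatch."""
    return (x - theta) ** 2


def saa_objective(x, samples):
    """Sample average approximation of E[f(x, theta)]."""
    return sum(f(x, th) for th in samples) / len(samples)


def robust_objective(x, theta_lo, theta_hi):
    """Worst case of f over the interval [theta_lo, theta_hi].

    f is convex in theta, so the maximum is attained at an endpoint.
    """
    return max(f(x, theta_lo), f(x, theta_hi))


def argmin_on_grid(objective, grid):
    """Brute-force minimizer over a finite grid of candidate decisions."""
    return min(grid, key=objective)


rng = random.Random(0)
# Synthetic stand-in data in [0, 1]; real data would replace this list.
samples = [rng.betavariate(2, 5) for _ in range(200)]
grid = [i / 1000 for i in range(1001)]

x_saa = argmin_on_grid(lambda x: saa_objective(x, samples), grid)
x_rob = argmin_on_grid(lambda x: robust_objective(x, 0.0, 1.0), grid)
print(f"SAA decision: {x_saa:.3f}, robust decision: {x_rob:.3f}")
```

For this cost the sample-average decision tracks the sample mean, while the robust decision sits at the midpoint of the interval no matter what the data look like, which is one way to read "can be extremely conservative".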

slide-16
SLIDE 16

Using Sampled Data: Distributional Uncertainty

  • Distributional uncertainty
    • An ambiguity set $\mathcal{D}$ in the space of probability distributions
    • No assumption on the type (continuous vs. discrete, Gaussian, uniform, ...) of distributions
    • Contains the true distribution $d$ with high probability
    • Informally: “Uncertainty of uncertainty”

[Figure: ambiguity set $\mathcal{D}$ containing $d$ (true distribution)]

$$\underset{x}{\text{minimize}} \quad \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)] \qquad \Big(\text{vs. } \underset{x}{\text{minimize}} \quad \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]\Big)$$

  • Decision making problem: Distributionally robust optimization
    • Strong worst-case guarantees
    • Subsumes conventional robust optimization
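The "subsumes" claim can be checked in one line. As a sketch, assume the ambiguity set is $\mathcal{P}(\Theta)$, the set of all probability distributions supported on $\Theta$ (this notation is introduced here, not in the slides), and that $f(x, \cdot)$ attains its maximum on $\Theta$:

```latex
\max_{d \in \mathcal{P}(\Theta)} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]
  \;=\; \max_{\theta \in \Theta} f(x, \theta)
```

The left side cannot exceed the right because an average never exceeds the maximum, and equality is reached by the Dirac mass at a maximizer of $f(x, \cdot)$. At the other extreme, a singleton ambiguity set $\{d\}$ recovers the stochastic program $\min_x \mathbb{E}_{\theta \sim d}[f(x, \theta)]$.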
slide-17
SLIDE 17

Distributional Uncertainty

  • Method 1: Based on certain (pseudo)metric $M$
    • KL divergence
    • Wasserstein metric (earth mover’s distance)
  • Metric ball centered at the empirical distribution $\hat{d}$

$$\mathcal{D}_M(\epsilon) = \{\, d : M(d, \hat{d}) \le \epsilon \,\}$$

  • The ball contains $d$ with high probability
  • Advantage: “Nonparametric” characterization
  • Disadvantage: Complexity of decision making against $\mathcal{D}$ grows quickly with the number of samples

[Figure: ball $\mathcal{D}$ centered at the empirical distribution $\hat{d}$, containing $d$ (true distribution)]
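To make the metric-ball idea concrete, here is a minimal sketch for the Wasserstein case in one dimension. It relies on a standard fact: for two empirical distributions on the real line with the same number of equally weighted samples, the 1-Wasserstein distance is the mean absolute difference of the sorted samples. The function names and the membership test are hypothetical helpers. The ambiguity set itself contains all distributions within the radius, whereas this sketch only tests candidates that are themselves empirical distributions with the same sample count.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equally weighted empirical
    distributions on the real line with the same number of samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def in_ball(candidate, data, eps):
    """Check whether the empirical distribution of `candidate` lies in the
    Wasserstein ball of radius eps around the empirical distribution of `data`."""
    return wasserstein_1d(candidate, data) <= eps


data = [0.0, 1.0, 2.0]
shifted = [0.5, 1.5, 2.5]  # every sample moved by 0.5
print(wasserstein_1d(shifted, data))
print(in_ball(shifted, data, eps=0.25))
```

Shifting every sample by the same amount moves the distribution by exactly that amount in this metric, so the shifted data fall outside a ball of radius 0.25 but inside one of radius 0.5.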

slide-18
SLIDE 18

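Distributional Uncertainty (cont’d)

  • Method 2: Based on generalized moments (this talk)

$$\mathcal{D} = \{\, d : \mathbb{E}_{\theta \sim d}[g(\theta)] \le 0 \,\}$$

  • Assume: $g$ is easily bounded
  • Examples
    • Moments: $\mathbb{E}[\theta] = \hat{\theta}$, $\operatorname{cov}[\theta] = \hat{\Sigma}$
    • Tail probability: $\mathbb{P}(\theta \ge \bar{\theta}) \le \epsilon$
  • Classical concentration inequalities can be used to compute the probability that $\mathcal{D}$ contains $d$

[Figure: ambiguity set $\mathcal{D}$ containing $d$ (true distribution)]

Example (Hoeffding’s inequality):

$$\mathbb{P}\left( \frac{1}{n} \sum_{i=1}^{n} \theta_i \ge \mathbb{E}\theta + t \right) \le \exp(-2nt^2)$$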
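Hoeffding's inequality can be inverted to size a mean-based ambiguity set. The sketch below assumes i.i.d. samples taking values in $[0, 1]$ (the boundedness condition under which the bound holds in the stated form) and uses a hypothetical failure probability `beta`. Setting $\exp(-2nt^2) = \beta$ gives $t = \sqrt{\ln(1/\beta)/(2n)}$. The two-sided interval applies the one-sided bound to each tail with $\beta/2$.

```python
import math


def hoeffding_radius(n, beta):
    """Smallest t with exp(-2 n t^2) <= beta, for samples in [0, 1]."""
    if n <= 0 or not 0 < beta < 1:
        raise ValueError("need n > 0 and 0 < beta < 1")
    return math.sqrt(math.log(1 / beta) / (2 * n))


def mean_interval(samples, beta):
    """Interval that contains the true mean with probability >= 1 - beta.

    Union bound over the two tails, each with failure probability beta / 2.
    """
    n = len(samples)
    sample_mean = sum(samples) / n
    t = hoeffding_radius(n, beta / 2)
    return sample_mean - t, sample_mean + t


t = hoeffding_radius(n=100, beta=0.05)
lo, hi = mean_interval([0.2, 0.4, 0.6, 0.8], beta=0.1)
print(f"radius for n=100, beta=0.05: {t:.4f}")
print(f"mean interval from 4 samples: [{lo:.3f}, {hi:.3f}]")
```

The resulting set of distributions whose mean lies in `[lo, hi]` fits the generalized-moment form with the two constraints $\mathbb{E}[\theta - \text{hi}] \le 0$ and $\mathbb{E}[\text{lo} - \theta] \le 0$.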

slide-19
SLIDE 19

Challenges and My Contribution

$$\underset{x}{\text{minimize}} \quad \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$

  • Challenge: Finding the worst-case distribution
    • Infinite-dimensional optimization problem
    • Not numerically tractable
  • My contribution
    • Formulate equivalent convex optimization problem (under certain conditions)
    • Tractable numerical solutions
    • Conditions apply to many resource allocation and scheduling problems
  • Previous work on special instances
    • [Scarf, 1958]: Analytical solution for a special case
    • [Bertsimas, Popescu, 2005]: Optimal probability inequalities
    • [Vandenberghe, Boyd, Comanor, 2007]: Optimal Chebyshev bounds
    • [Delage, Ye, 2010]: Piecewise affine functions
slide-20
SLIDE 20

Main Result: Equivalent Convex Optimization Problem

Theorem: There exists an equivalent convex optimization problem for computing the worst-case distribution if

  • The objective $f$ is piecewise concave: $f(\theta) = \max_k f^{(k)}(\theta)$, where each $f^{(k)}$ is concave
  • The constraint $g$ is piecewise convex: $g(\theta) = \min_l g^{(l)}(\theta)$, where each $g^{(l)}$ is convex

[Figure: example plots of $f(\theta)$ and $g(\theta)$]

Shuo Han, Molei Tao, Ufuk Topcu, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification,” SIAM Journal on Optimization, 25(3), 1368–1387, 2015.

slide-21
SLIDE 21

Piecewise Concave Functions

  • Concave
  • Piecewise affine: $\max_{k \in K} \{a_k^T \theta + b_k\}$ (resource allocation/scheduling)
  • 0-1 indicator: $I(\theta \ge a)$ (failure rate)

slide-22
SLIDE 22

Piecewise Convex Functions

  • Linear (mean)
  • Convex (covariance & higher moments)
  • 0-1 indicator: $I(\theta \ge a)$ (tail probability)

slide-23
SLIDE 23

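The Convex Optimization Problem

For:

$$f(\theta) = \max_{k \in \{1, 2, \cdots, K\}} f^{(k)}(\theta), \qquad g(\theta) = \min_{l \in \{1, 2, \cdots, L\}} g^{(l)}(\theta)$$

$$\begin{aligned}
\underset{\{p_{kl}, \gamma_{kl}\}_{k,l}}{\text{maximize}} \quad & \sum_{k,l} p_{kl}\, f^{(k)}(\gamma_{kl}/p_{kl}) \\
\text{subject to} \quad & \sum_{k,l} p_{kl} = 1 \\
& p_{kl} \ge 0, \quad \forall k, l \\
& \sum_{k,l} p_{kl}\, g^{(l)}(\gamma_{kl}/p_{kl}) \le 0
\end{aligned}$$

  • The worst case is always attained by a discrete distribution
  • Total number of Dirac masses in the distribution: $K \cdot L$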
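A classical special case, not taken from the slides, shows what a discrete worst-case distribution looks like. Suppose $\theta \ge 0$ has known mean $\mu$ and the objective is the tail probability $\mathbb{P}(\theta \ge a)$ with $0 < \mu \le a$. Markov's inequality gives the bound $\mu / a$, and a two-point distribution attains it. The sketch below builds that distribution with exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def markov_worst_case(mu, a):
    """Two-point distribution on [0, inf) with mean mu that maximizes
    P(theta >= a), valid for 0 < mu <= a.

    Returns a list of (atom, weight) pairs.
    """
    if not 0 < mu <= a:
        raise ValueError("need 0 < mu <= a")
    p = Fraction(mu) / Fraction(a)
    return [(Fraction(0), 1 - p), (Fraction(a), p)]


def expectation(dist, h):
    """Expectation of h(theta) under a discrete distribution."""
    return sum(weight * h(atom) for atom, weight in dist)


mu, a = Fraction(1, 2), Fraction(2)
dist = markov_worst_case(mu, a)
mean = expectation(dist, lambda th: th)
tail = expectation(dist, lambda th: 1 if th >= a else 0)
print(f"atoms and weights: {dist}")
print(f"mean = {mean}, P(theta >= a) = {tail}, Markov bound = {mu / a}")
```

The distribution places mass $\mu/a$ at $a$ and the rest at $0$, so the mean constraint holds and the tail probability equals the Markov bound exactly. No distribution with the same mean can do worse.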
slide-24
SLIDE 24

Storage Allocation for Power Grid

  • $x_i$: storage
  • $f_{ij}$: power flow
  • $\theta_i$: wind power (stochastic)

  • Storage Allocation Problem

$$\underset{x}{\text{min.}} \; \max_{d} \; \mathbb{E}_{\theta \sim d} \Big[ \underbrace{\min_{f} \; \text{Wind Energy Wasted}(x, \theta, f)}_{\text{piecewise concave in } \theta} \Big]$$

Optimal power flow

slide-25
SLIDE 25

Numerical Example: IEEE 14-Bus Test Case

  • Network with 5 generators
  • Time: one day, 3-hour interval
  • Mean and covariance obtained from real wind generation data

[Figure: wind power generation over one day (6am to 6pm). Source: AESO]

slide-26
SLIDE 26

The Influence of Information Constraints

[Figure: expected cost vs. total storage. Curves: Exact distribution; Support Information Only; Support + Mean + Covariance]

Shuo Han, Ufuk Topcu, Molei Tao, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification: Algorithms and a case study in energy storage placement for power grids,” American Control Conference, 2013.

slide-27
SLIDE 27

On-Demand Ridesharing

[Diagram: Dispatch Center; Vehicle; Customer; Predicted Demand; Dispatch Command]

$$\underset{X_{1:T}}{\text{min.}} \quad \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right]$$

Labels on the slide: Distribution of Customer Demand; Vehicle Flows; Wait Time; Cost of Rebalancing; Demand

slide-28
SLIDE 28

Robust vs. Non-Robust

  • Robust optimization against demand uncertainty

$$\underset{X_{1:T}}{\text{min.}} \quad \max_{r_{1:T} \in \Delta} \; \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right]$$

NYC Dataset: 4 years, 100 GB

[Figure: boxplot of total requests number during one hour, by region ID (1 to 16)]

[Figure: costs distribution of dispatch solutions (number of experiments vs. cost range), non-robust solutions vs. robust solutions]

Robust solution: 35.5% reduction

Fei Miao, Shuo Han, Shan Lin, George J. Pappas, “Taxi dispatch under model uncertainties,” IEEE Conference on Decision and Control, 2015.

slide-29
SLIDE 29

Conventional vs. Distributionally Robust

  • Distributionally robust formulation

$$\underset{X_{1:T}}{\text{min.}} \quad \max_{d \in \mathcal{D}} \; \mathbb{E}_{r \sim d} \left\{ \sum_{t=1}^{T} \left[ J_D(X_t) + J_E(X_t, r_t) \right] \right\}$$

[Figure: average cost comparison of DRO and RO. Average cost ($\times 10^4$) vs. confidence level $\epsilon$. Curves: NR (non-robust), DRO (distr. robust), SOC and Box (robust #1 and robust #2)]

Note: Confidence level not optimized for distr. robust opt.

Confidence level: Probability the true parameter/distribution lies outside ambiguity set

Fei Miao, Shuo Han, Abdeltawab Hendawi, et al., “Data-driven distributionally robust vehicle balancing with dynamic region partition,” ACM/IEEE Intl. Conf. on Cyber-Physical Systems, 2017.

slide-30
SLIDE 30

Future Directions

  • Large Datasets
    • Distributed computation
    • Approximate algorithms
  • Structured Models
    • Markov properties
    • Prior knowledge
  • Online Optimization
    • When to discard old data
    • When to re-learn

slide-31
SLIDE 31

Summary

  • Distributional uncertainty: A new approach to data-driven optimization
    • A rigorous way to make use of sampled data
    • Probabilistic guarantees
    • Worst-case analysis/design: Often required for engineering applications
  • Computationally efficient
    • Convex formulation available for a large class of problems
    • Examples: Resource allocation and scheduling

[Figure: power generation over one day (6am to 6pm); ambiguity set $\mathcal{D}$ containing $d$ (true distribution)]