Data-Driven Optimization under Distributional Uncertainty
Shuo Han
Postdoctoral Researcher, Electrical and Systems Engineering, University of Pennsylvania
Shuo Han UTC-IASE, Apr 2017
Internet of Things (IoT)
- A network of physical objects
- Devices
- Vehicles
- Buildings
- Allows objects to be
- Sensed and controlled
- Remotely across the network
- Growing rapidly, by 2020:
- 50 billion devices
- 6.58 devices per person
What IoT Brings: More Sensing (and More Data)
- Total size of dataset: 267 GB
- 1.1 billion taxi and Uber trips (2009 - 2015)
- Pick-up and drop-off dates/times, locations, distances...
[Source: nyc.gov]
TLC: Taxi and Limousine Commission
What IoT Brings: More Control
- Connected and autonomous vehicles
- Smart home appliances
- Wireless traffic light control
- Smart buildings
Smart Cities: IoT + Decision Support
[Diagram: a feedback loop between City Infrastructure and a Control & Optimization Algorithm; the infrastructure supplies Sensor Data and the algorithm returns Actions]
Investment in Smart Cities
“... an infrastructure to continuously improve the collection, aggregation, and use of data to improve the life of their residents – by harnessing the growing data revolution, low-cost sensors, and research collaborations, and doing so securely to protect safety and privacy.”
The Smart Cities Initiative from the White House (Sep 2015)
TerraSwarm
TerraSwarm: Swarm at The Edge of The Cloud
- “How should we make use of data?”
- “How should we send data?”
- “How should we collect data?”
Research Interests
- Theory: Convex Optimization, Control Theory, Statistics
- Applications: Energy, Transportation
- Research Topics: Multi-Agent Systems, Stochastic Systems, Network Dynamics
Research Overview
- Data-Driven Optimization: [ACC13], [SIOPT15], [CDC15], [TASE16], [ICCPS17]
- Pricing for Ridesharing; Privacy Solutions for Cyber-Physical Systems: [Allerton14], [TAC16], [ACC17]
[Figure: power generation over one day (6am, 12pm, 6pm)]
Motivation: Wind Energy Integration
[Diagram: conventional power plant + wind power, with storage devices]
Wind Energy Data
[Figure: power generation over one day (6am, 12pm, 6pm). Source: AESO]
Control Action: Allocation of energy storage
How can we make use of the wind power generation data to maximally utilize wind power?
Motivation: On-Demand Ridesharing in Cities
Trip data:
- Pick-up and drop-off times
- Pick-up/drop-off locations
- Travel distances
Cause of long wait times: Mismatch between supply and demand
Control Action: Redistribution of empty vehicles
How can we make use of the trip data to reduce the average wait time for passengers?
Background: Stochastic Programming
$$\min_{x} \; \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$
- $f$: Objective function
- $x$: Decision variable
  - Ridesharing: Redirection of empty vehicles
  - Wind power integration: Allocation of storage
- $\theta$: Stochastic phenomenon
  - Ridesharing: Future passenger demand
  - Wind power integration: Wind power generation
- $d$: Probability distribution of $\theta$, which models the stochastic phenomenon
Distribution is Not Always Available
- We often do not have: the distribution $d$
- Instead, we have: samples $\theta_1, \theta_2, \dots, \theta_n$ drawn from $d$
[Figure: power generation over one day (6am, 12pm, 6pm)]
- Question: How should these samples be used in a computationally tractable way with performance guarantees?
Using Sampled Data: Previous Methods
Sample average approximation
$$\min_{x} \; \frac{1}{n} \sum_{i=1}^{n} f(x, \theta_i)$$
- Weak guarantee on performance
Robust optimization
$$\min_{x} \; \max_{\theta \in \Theta} f(x, \theta)$$
- $\Theta$: an uncertainty set containing the samples $\theta_i$
- Can be extremely conservative
Distributional Information + Uncertainty?
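As a rough illustration of the two formulations above, the sketch below solves a toy one-dimensional problem by grid search. The cost function `cost` (an over/under-supply penalty), the sample values, and the choice of Θ as the interval spanned by the samples are all assumptions made for this example; none of them come from the talk.

```python
# Toy comparison of sample average approximation (SAA) and robust optimization.
# The cost function, samples, and grid are hypothetical.

def cost(x, theta):
    """Penalty for supplying x when demand is theta (shortage costs 3x more)."""
    return 3.0 * max(theta - x, 0.0) + 1.0 * max(x - theta, 0.0)

samples = [2.0, 3.0, 3.0, 4.0, 10.0]       # observed theta_i
grid = [i / 10 for i in range(0, 121)]     # candidate decisions x in [0, 12]

def saa_objective(x):
    # Average cost over the observed samples.
    return sum(cost(x, th) for th in samples) / len(samples)

def robust_objective(x):
    # Theta is taken to be [min sample, max sample]. The cost is convex in
    # theta, so the worst case over the interval is attained at an endpoint.
    return max(cost(x, min(samples)), cost(x, max(samples)))

x_saa = min(grid, key=saa_objective)
x_rob = min(grid, key=robust_objective)
print("SAA decision:", x_saa, "robust decision:", x_rob)
```

In this toy instance the SAA decision is 4.0 and the robust decision is 8.0: the robust choice is driven entirely by the single extreme sample, while SAA gives no protection against demand outside what was observed.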
Using Sampled Data: Distributional Uncertainty
- Distributional uncertainty
  - An ambiguity set $\mathcal{D}$ in the space of probability distributions
  - No assumption on the type (continuous vs. discrete, Gaussian, uniform, ...) of distributions
  - Contains the true distribution $d$ with high probability
  - Informally: “Uncertainty of uncertainty”
- Decision making problem: Distributionally robust optimization
$$\min_{x} \; \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)] \qquad \Big(\text{vs. } \min_{x} \; \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]\Big)$$
  - Strong worst-case guarantees
  - Subsumes conventional robust optimization
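One way to see why the distributionally robust formulation subsumes robust optimization: suppose the ambiguity set is taken to be $\mathcal{P}(\Theta)$, the set of all distributions supported on $\Theta$ (a choice made here for illustration), and that the maximum of $f(x, \cdot)$ over $\Theta$ is attained at some $\theta^\star$. Any expectation is bounded by the maximum, and the Dirac distribution at $\theta^\star$ achieves it:

```latex
\mathbb{E}_{\theta \sim d}\,[f(x,\theta)] \;\le\; \max_{\theta \in \Theta} f(x,\theta)
\quad \text{for all } d \in \mathcal{P}(\Theta),
\qquad
\mathbb{E}_{\theta \sim \delta_{\theta^\star}}[f(x,\theta)] = f(x,\theta^\star),
\]
\[
\text{so}\quad
\max_{d \in \mathcal{P}(\Theta)} \mathbb{E}_{\theta \sim d}\,[f(x,\theta)]
\;=\; \max_{\theta \in \Theta} f(x,\theta).
```

At the other extreme, an ambiguity set containing a single distribution recovers the stochastic program.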
Distributional Uncertainty
- Method 1: Based on certain (pseudo)metric $M$
  - KL divergence
  - Wasserstein metric (earth mover's distance)
- Metric ball centered at the empirical distribution $\hat{d}$
$$\mathcal{D}_M(\epsilon) = \{\, d : M(d, \hat{d}) \le \epsilon \,\}$$
- The ball contains $d$ (the true distribution) with high probability
- Advantage: “Nonparametric” characterization
- Disadvantage: Complexity of decision making against $\mathcal{D}$ grows quickly with the number of samples
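A minimal sketch of the metric-ball idea for one-dimensional data, taking M to be the 1-Wasserstein distance. It relies on the fact that, for two empirical distributions on the real line with the same number of equally weighted samples, this distance equals the mean absolute difference between the sorted samples. The function names `wasserstein1` and `in_ball`, the sample values, and the radii are hypothetical.

```python
def wasserstein1(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical distributions."""
    if len(xs) != len(ys):
        raise ValueError("this shortcut assumes equally many samples")
    # Optimal transport in 1-D matches the i-th smallest x to the i-th smallest y.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def in_ball(candidate, empirical, eps):
    """True if the candidate lies in the ball of radius eps around the empirical data."""
    return wasserstein1(candidate, empirical) <= eps

empirical = [1.0, 2.0, 4.0, 7.0]
shifted = [x + 0.5 for x in empirical]       # every sample moved by 0.5

print(wasserstein1(empirical, shifted))
print(in_ball(shifted, empirical, eps=1.0))
print(in_ball(shifted, empirical, eps=0.25))
```

This only tests membership for other empirical distributions. The ball itself also contains continuous distributions, which is part of why optimizing against it is harder than this check suggests.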
Distributional Uncertainty (cont’d)
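- Method 2: Based on generalized moments (this talk)
$$\mathcal{D} = \{\, d : \mathbb{E}_{\theta \sim d}[g(\theta)] \le 0 \,\}$$
- Assume: $g$ is easily bounded
- Examples
  - Moments: $\mathbb{E}[\theta] = \hat{\theta}$, $\operatorname{cov}[\theta] = \hat{\Sigma}$
  - Tail probability: $P(\theta \ge \bar{\theta}) \le \epsilon$
- Classical concentration inequalities can be used to compute the probability that $\mathcal{D}$ contains $d$ (the true distribution)
- Example (Hoeffding's inequality), for i.i.d. samples $\theta_i \in [0, 1]$:
$$P\left( \frac{1}{n} \sum_{i=1}^{n} \theta_i \ge \mathbb{E}[\theta] + t \right) \le \exp(-2 n t^2)$$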
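To make the Hoeffding example concrete, the bound can be inverted: setting exp(−2nt²) = δ gives t = sqrt(ln(1/δ) / (2n)). The sketch below computes this radius, assuming i.i.d. samples scaled to [0, 1] (the condition under which the bound takes this form). The function name `hoeffding_radius` and the numbers are illustrative.

```python
import math

def hoeffding_radius(n, delta):
    """Smallest t with exp(-2 n t^2) <= delta, for i.i.d. samples in [0, 1]."""
    if n <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("need n > 0 and 0 < delta < 1")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

# The radius shrinks like 1 / sqrt(n) as more samples are collected.
for n in (100, 1000, 10000):
    print(n, round(hoeffding_radius(n, delta=0.05), 4))
```

With probability at least 1 − δ the sample mean exceeds the true mean by less than t, which gives a one-sided bound on E[θ] that can serve as a moment constraint in the ambiguity set.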
Challenges and My Contribution
- Challenge: Finding the worst-case distribution
$$\min_{x} \; \max_{d \in \mathcal{D}} \mathbb{E}_{\theta \sim d}\,[f(x, \theta)]$$
  - Infinite-dimensional optimization problem
  - Not numerically tractable
- My contribution
  - Formulate equivalent convex optimization problem (under certain conditions)
  - Tractable numerical solutions
  - Conditions apply to many resource allocation and scheduling problems
- Previous work on special instances
  - [Scarf, 1958]: Analytical solution for a special case
  - [Bertsimas, Popescu, 2005]: Optimal probability inequalities
  - [Vandenberghe, Boyd, Comanor, 2007]: Optimal Chebyshev bounds
  - [Delage, Ye, 2010]: Piecewise affine functions
Main Result: Equivalent Convex Optimization Problem
Theorem: There exists an equivalent convex optimization problem for computing the worst-case distribution if
- The objective is piecewise concave: $f(\theta) = \max_{k} f^{(k)}(\theta)$, with each $f^{(k)}$ concave
- The constraint is piecewise convex: $g(\theta) = \min_{l} g^{(l)}(\theta)$, with each $g^{(l)}$ convex
Shuo Han, Molei Tao, Ufuk Topcu, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification,” SIAM Journal on Optimization, 25(3), 1368–1387, 2015.
Piecewise Concave Functions
- Concave
- Piecewise affine: $\max_{k \in K} \{ a_k^T \theta + b_k \}$ (resource allocation/scheduling)
- 0-1 indicator: $I(\theta \ge a)$ (failure rate)
Piecewise Convex Functions
- Linear: mean
- Convex: covariance & higher moments
- 0-1 indicator: tail probability, $I(\theta \ge a)$
The Convex Optimization Problem
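For:
$$f(\theta) = \max_{k \in \{1, 2, \dots, K\}} f^{(k)}(\theta), \qquad g(\theta) = \min_{l \in \{1, 2, \dots, L\}} g^{(l)}(\theta)$$
$$\begin{aligned}
\underset{\{p_{kl},\, \gamma_{kl}\}_{k,l}}{\text{maximize}} \quad & \sum_{k,l} p_{kl}\, f^{(k)}(\gamma_{kl} / p_{kl}) \\
\text{subject to} \quad & \sum_{k,l} p_{kl} = 1, \qquad p_{kl} \ge 0, \;\; \forall k, l \\
& \sum_{k,l} p_{kl}\, g^{(l)}(\gamma_{kl} / p_{kl}) \le 0
\end{aligned}$$
- The worst case is always attained by a discrete distribution
- Total number of Dirac masses in the distribution: $K \cdot L$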
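A small sanity check of the formulation above, under several assumptions that are not in the talk: θ is scalar with support [0, 1], the only moment information is a known mean μ (an equality, written as Σ γ = μ), and f(θ) = max(0, θ − c), so K = 2 and L = 1. Reading p as the probability weights and γ/p as the locations of the Dirac masses, the support condition becomes 0 ≤ γ ≤ p. With affine pieces the objective p·f(γ/p) is linear in (p, γ), and for an instance this small a brute-force grid search stands in for a convex solver.

```python
# Worst-case E[max(0, theta - c)] over distributions on [0, 1] with mean mu,
# written in the (p, gamma) variables and solved by grid search.
# All numbers are hypothetical; a real instance would use a convex solver.

MU, C, N = 0.3, 0.5, 100   # mean, kink location, grid resolution
TOL = 1e-12

best_value, best_atoms = float("-inf"), None
for i in range(N + 1):             # p2 = weight attached to the piece theta - c
    p2 = i / N
    p1 = 1.0 - p2                  # weights sum to one
    for j in range(i + 1):         # gamma2 <= p2 keeps gamma2 / p2 in [0, 1]
        g2 = j / N
        g1 = MU - g2               # mean constraint: gamma1 + gamma2 = mu
        if g1 < -TOL or g1 > p1 + TOL:
            continue               # gamma1 / p1 would leave the support
        # p1 * f1(g1 / p1) + p2 * f2(g2 / p2) with f1 = 0, f2(theta) = theta - c
        value = g2 - C * p2
        if value > best_value + TOL:
            best_value = value
            best_atoms = [(g / p, p) for g, p in ((g1, p1), (g2, p2)) if p > TOL]

print("worst-case value:", round(best_value, 6))
print("atoms (location, weight):", best_atoms)
```

For this instance the search returns two atoms, at 0 and 1, which matches the K · L = 2 count, and the value μ(1 − c) that also follows directly from the convexity of f on [0, 1].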
Storage Allocation for Power Grid
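- $x_i$: storage
- $f_{ij}$: power flow
- $\theta_i$: wind power (stochastic)
$$\min_{x} \; \max_{d} \; \mathbb{E}_{\theta \sim d} \Big[ \underbrace{\min_{f} \; \text{Wind Energy Wasted}(x, \theta, f)}_{\text{piecewise concave in } \theta} \Big]$$
- Storage Allocation Problem: outer minimization over $x$
- Optimal power flow: inner minimization over $f$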
Numerical Example: IEEE 14-Bus Test Case
- Network with 5 generators
- Time horizon: one day, in 3-hour intervals
- Mean and covariance obtained from real wind generation data
[Figure: power generation over one day (6am, 12pm, 6pm). Source: AESO]
The Influence of Information Constraints
[Figure: expected cost vs. total storage for three levels of information: exact distribution; support information only; support + mean + covariance]
Shuo Han, Ufuk Topcu, Molei Tao, Houman Owhadi, Richard M. Murray, “Convex optimal uncertainty quantification: Algorithms and a case study in energy storage placement for power grids,” American Control Conference, 2013.
On-Demand Ridesharing
[Diagram: the Dispatch Center uses Predicted Demand (the distribution of customer demand) to send Dispatch Commands to Vehicles serving Customers]
$$\min_{X_{1:T}} \; \sum_{t=1}^{T} \big[ J_D(X_t) + J_E(X_t, r_t) \big]$$
- $X_t$: vehicle flows
- $r_t$: demand
- $J_D$, $J_E$: cost of rebalancing and wait time
Robust vs. Non-Robust
- Robust optimization against demand uncertainty
$$\min_{X_{1:T}} \; \max_{r_{1:T} \in \Delta} \; \sum_{t=1}^{T} \big[ J_D(X_t) + J_E(X_t, r_t) \big]$$
- NYC Dataset: 4 years, 100 GB
[Figure: boxplot of total requests number during one hour, by region ID (1 to 16)]
[Figure: cost distribution of dispatch solutions (number of experiments vs. cost range), non-robust vs. robust solutions]
- Robust solution: 35.5% reduction
Fei Miao, Shuo Han, Shan Lin, George J. Pappas, “Taxi dispatch under model uncertainties,” IEEE Conference on Decision and Control, 2015.
Conventional vs. Distributionally Robust
- Distributionally robust formulation
$$\min_{X_{1:T}} \; \max_{d \in \mathcal{D}} \; \mathbb{E}_{r \sim d} \left\{ \sum_{t=1}^{T} \big[ J_D(X_t) + J_E(X_t, r_t) \big] \right\}$$
[Figure: average cost comparison of DRO and RO; average cost (×10^4) vs. confidence level ϵ (0.1 to 1); curves: NR (non-robust), DRO (distr. robust), SOC and Box (robust #1 and #2)]
- Confidence level: Probability that the true parameter/distribution lies outside the ambiguity set
- Note: Confidence level not optimized for distr. robust opt.
Fei Miao, Shuo Han, Abdeltawab Hendawi, et al., “Data-driven distributionally robust vehicle balancing with dynamic region partition,” ACM/IEEE Intl. Conf. on Cyber-Physical Systems, 2017.
Future Directions
- Large Datasets
  - Distributed computation
  - Approximate algorithms
- Structured Models
  - Markov properties
  - Prior knowledge
- Online Optimization
  - When to discard old data
  - When to re-learn
Summary
- Distributional uncertainty: A new approach to data-driven optimization
- A rigorous way to make use of sampled data
- Probabilistic guarantees
- Worst-case analysis/design: Often required for engineering applications
- Computationally efficient
- Convex formulation available for a large class of problems
- Examples: Resource allocation and scheduling