Online Convex Optimization Using Predictions Niangjun Chen - PowerPoint PPT Presentation

Online Convex Optimization Using Predictions
Niangjun Chen
Joint work with Anish Agarwal, Lachlan Andrew, Siddharth Barman, and Adam Wierman

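[Figure: round 1. Cost function $c_1$, decision $x_1$ in the set $F$, incurred cost $c_1(x_1)$.]

[Figure: round 2. Cost function $c_2$, decisions $x_1$ and $x_2$ in $F$, incurred cost $c_2(x_2)$, switching cost $\beta\|x_2 - x_1\|$.]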

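[Figure: round 3. Cost function $c_3$, decisions $x_1$, $x_2$, $x_3$ in $F$, incurred cost $c_3(x_3)$, switching cost $\beta\|x_3 - x_2\|$.]

Online: $x_1, c_1, x_2, c_2, x_3, c_3, \ldots$

$$\min_{x_t \in F} \sum_t c_t(x_t) + \beta\|x_t - x_{t-1}\|$$

Here $c_t$ is convex and $\beta\|x_t - x_{t-1}\|$ is the switching cost.

Goal: Algorithms to minimize cost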
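The objective above can be evaluated for a given decision sequence with a few lines of Python. This is a minimal sketch for scalar decisions only. The function name `total_cost`, the initial point `x0 = 0`, and the quadratic example costs are assumptions made for illustration and do not come from the slides.

```python
def total_cost(costs, xs, beta, x0=0.0):
    """Sum of c_t(x_t) + beta * |x_t - x_{t-1}| over all rounds (scalar case).

    costs: list of callables c_t; xs: list of decisions x_t.
    x0 is an assumed starting point; the slides do not specify one.
    """
    total = 0.0
    prev = x0
    for c_t, x_t in zip(costs, xs):
        total += c_t(x_t)               # hitting cost of this round
        total += beta * abs(x_t - prev)  # switching cost from the previous decision
        prev = x_t
    return total


# Hypothetical instance: quadratic costs centred on moving targets.
targets = [1.0, 3.0, 2.0]
costs = [lambda x, y=y: 0.5 * (y - x) ** 2 for y in targets]
print(total_cost(costs, [1.0, 2.0, 2.0], beta=1.0))
```

Raising `beta` makes movement more expensive, so sequences that chase every target become less attractive than smoother ones.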

Lots of applications …
• Dynamic capacity management in data centers [Tu et al. 2013]
• Power system generation/load scheduling [Lu et al. 2013]
• Portfolio management [Cover 1991] [Boyd et al. 2012]
• Video streaming [Sen et al. 2000] [Liu et al. 2008]
• Network routing [Bansal et al. 2003] [Kodialam et al. 2003]
• Geographical load balancing [Hindman et al. 2011] [Lin et al. 2012]
• …

In most applications, predictions are crucial.
But we do not have a good understanding of how (imperfect) predictions impact online algorithm design.

This talk: Online Convex Optimization Using Predictions

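[Figure: round 1 with predictions. Predicted costs $c_{1|0}, c_{2|0}, c_{3|0}$; realized cost function $c_1$; decision $x_1$ in $F$; incurred cost $c_1(x_1)$.]

[Figure: round 2 with predictions. Predicted costs $c_{2|1}, c_{3|1}, c_{4|1}$; realized cost function $c_2$; decisions $x_1$, $x_2$ in $F$; incurred cost $c_2(x_2)$; switching cost $\beta\|x_2 - x_1\|$.]

[Figure: round 3 with predictions. Predicted costs $c_{3|2}, c_{4|2}, c_{5|2}$; realized cost function $c_3$; decisions $x_1$, $x_2$, $x_3$ in $F$; incurred cost $c_3(x_3)$; switching cost $\beta\|x_3 - x_2\|$.]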

Online convex optimization using predictions

Online: $x_1, y_1, x_2, y_2, x_3, y_3, \ldots$

$$\min_{x_t \in F} \sum_t c(x_t, y_t) + \beta\|x_t - x_{t-1}\|$$

Here $c$ is convex and $\beta\|x_t - x_{t-1}\|$ is the switching cost.

e.g. online tracking cost $c(x_t, y_t) = \frac{1}{2}\|y_t - Kx_t\|^2$

Given: prediction of $y_t$ at time $\tau$, written $y_{t|\tau}$

Time   Information Available                     Decision
1      $y_{1|0}$, $y_{2|0}$, $y_{3|0}$, …        $x_1$
2      $y_1$, $y_{2|1}$, $y_{3|1}$, …            $x_2$
3      $y_1$, $y_2$, $y_{3|2}$, …                $x_3$
4      $y_1$, $y_2$, $y_3$, …                    $x_4$

How do algorithms model prediction noise?
• Learning and Algorithms: Worst case analysis
  Perfect lookahead model: (near) perfect lookahead for $w$ time steps and then adversarial
  Both too optimistic and pessimistic
• Control and Signal Processing: Average case analysis
  Stochastic model: assume a stochastic process and derive optimal predictor
  Too sensitive to assumptions
• Systems Design: Numeric evaluation
  Test predictor given historic traces
  No guarantee for performance

Our contribution: a general and tractable model for prediction
Key message: prediction allows
1. Overcoming "impossibility" results for OCO with minimal structural assumptions
2. Mixture of average case and worst case analysis

Outline
1. Background: regret and competitive ratio
   OCO without prediction
   OCO with worst case prediction
2. Our prediction noise model
3. Algorithm design
4. OCO with stochastic prediction noise

Two communities, two metrics

Online Learning
$$\mathrm{Regret}(Alg) = \sup_y \left[\mathrm{Cost}(Alg) - \mathrm{Cost}(STA)\right]$$
Goal: sublinear regret

Online Algorithm
$$\mathrm{Competitive\ ratio}(Alg) = \sup_y \frac{\mathrm{Cost}(Alg)}{\mathrm{Cost}(OPT)}$$
Goal: constant competitive ratio

Real applications want both
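To make the two metrics concrete, the sketch below evaluates the bracketed difference and the ratio on a single instance, without taking the supremum that the definitions require. It assumes a scalar tracking cost $\frac{1}{2}(y_t - x_t)^2$ with $K = 1$, decisions restricted to a finite grid, an initial point `x0 = 0`, and reads STA as the best fixed decision in hindsight. All of these, along with every function name, are illustrative assumptions rather than details taken from the slides.

```python
def trajectory_cost(targets, xs, beta, x0=0.0):
    """Tracking cost plus switching cost of a scalar decision sequence."""
    total, prev = 0.0, x0
    for y, x in zip(targets, xs):
        total += 0.5 * (y - x) ** 2 + beta * abs(x - prev)
        prev = x
    return total


def offline_opt_cost(targets, grid, beta, x0=0.0):
    """Cost of the best grid trajectory in hindsight, by dynamic programming.

    best[j] holds the cheapest cost of any trajectory so far that ends at grid[j].
    """
    best = [0.5 * (targets[0] - g) ** 2 + beta * abs(g - x0) for g in grid]
    for y in targets[1:]:
        best = [
            0.5 * (y - g) ** 2
            + min(b + beta * abs(g - h) for b, h in zip(best, grid))
            for g in grid
        ]
    return min(best)


def static_opt_cost(targets, grid, beta, x0=0.0):
    """Cost of the best single grid point held for every round."""
    return min(trajectory_cost(targets, [g] * len(targets), beta, x0) for g in grid)


# Hypothetical instance: targets that oscillate between 0 and 1.
targets = [0.0, 1.0, 0.0, 1.0]
grid = [i / 10 for i in range(11)]
beta = 1.0

alg = trajectory_cost(targets, targets, beta)  # greedy: jump to each target
sta = static_opt_cost(targets, grid, beta)
opt = offline_opt_cost(targets, grid, beta)

print("Cost(Alg) - Cost(STA):", alg - sta)
print("Cost(Alg) / Cost(OPT):", alg / opt)
```

On this instance the greedy rule pays a switching cost in every round, which is why it compares poorly against both benchmarks when `beta` is large.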

Guarantees without prediction
• Sublinear regret?
  Yes, [Kivinen & Vempala 2002] [Bansal et al. 2003] [Zinkevich 2003] [Hazan et al. 2007] [Lin et al. 2012] …
• Constant CR?
  Yes, but only for scalar case [Blum et al. 1992] [Borodin et al. 1992] [Blum & Burch 2000] [Lin et al. 2011] [Lin et al. 2012] …
• Sublinear regret and constant CR?
  Not even in scalar case! [Andrew et al. 2013]

Guarantees with prediction
1st cut, perfect lookahead: $y_{t|\tau} = y_t$ for any time $t \le \tau + w$
• Sublinear regret?
  Yes, [Kivinen & Vempala 2002] [Bansal et al. 2003] [Zinkevich 2003] [Hazan et al. 2007] [Lin et al. 2012] …
• Constant CR?
  Yes in general [Lin et al. 2013]
• Sublinear regret and constant CR?
  Not without a lot of prediction [Chen et al. 2015]

Theorem: An online algorithm with perfect lookahead requires an unbounded lookahead window $w$ to simultaneously achieve sublinear regret and a constant competitive ratio.
Unbounded means $w = \omega(1)$ as $T$ grows.
We may be using the wrong prediction model

What do we want in a prediction noise model?
• Predictions are "refined" as time goes forward
• Predictions are more noisy as you look further ahead
• Prediction errors can be correlated
• Should be general enough to incorporate detailed models

A more realistic prediction noise model

$$y_t = y_{t|\tau} + \sum_{s=\tau+1}^{t} f(t-s)\,e(s)$$

• $y_t$: realization that the algorithm is trying to track
• $y_{t|\tau}$: prediction for time $t$ given to the algorithm at time $\tau$
• $\sum_{s=\tau+1}^{t} f(t-s)\,e(s)$: prediction error
• $e(s)$: per-step noise
• $f(t-s)$: weighting factor. How important is the noise at time $t - s$ for the prediction of $t$?

A more realistic prediction noise model

$$y_t = y_{t|\tau} + \underbrace{\sum_{s=\tau+1}^{t} f(t-s)\,e(s)}_{\text{prediction error}}$$

• Predictions are "refined" as time goes forward
• Predictions are more noisy as you look further ahead:
  $$\mathbb{E}\|y_t - y_{t|\tau}\|^2 = \sigma^2 \sum_{s=0}^{t-\tau-1} \|f(s)\|^2$$
• Prediction errors can be correlated
• Form of errors matches many classic models
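The variance formula above can be checked numerically. The sketch below works in the scalar case and assumes the per-step noise $e(s)$ is independent, zero-mean Gaussian with variance $\sigma^2$. The geometric weights `f(k) = 0.8 ** k` and all function names are hypothetical choices for illustration, not values from the slides.

```python
import random


def prediction_error(f, e, tau, t):
    """Prediction error sum_{s=tau+1}^{t} f(t - s) * e(s) for scalar noise e[s]."""
    return sum(f(t - s) * e[s] for s in range(tau + 1, t + 1))


def error_variance(f, sigma, tau, t):
    """Closed form sigma^2 * sum_{s=0}^{t-tau-1} f(s)^2 (scalar case)."""
    return sigma ** 2 * sum(f(s) ** 2 for s in range(t - tau))


def empirical_variance(f, sigma, tau, t, trials=20000, seed=0):
    """Monte Carlo estimate of E[(y_t - y_{t|tau})^2] under i.i.d. Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # e[0] is drawn but unused; indices 1..t match the time steps s.
        e = [rng.gauss(0.0, sigma) for _ in range(t + 1)]
        total += prediction_error(f, e, tau, t) ** 2
    return total / trials


def f(k):
    """Assumed weighting factor: geometric decay."""
    return 0.8 ** k


print("closed form:", error_variance(f, 1.0, tau=0, t=5))
print("simulated:  ", empirical_variance(f, 1.0, tau=0, t=5))
```

Calling `error_variance` with a larger `tau` for the same `t` gives a smaller value, which is the "refinement" property: the sum has fewer terms once more noise has been realized. Increasing `t` for a fixed `tau` adds terms, so predictions further ahead are noisier.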

A more realistic prediction noise model

$$y_t = y_{t|\tau} + \underbrace{\sum_{s=\tau+1}^{t} f(t-s)\,e(s)}_{\text{prediction error}}$$

This form of prediction error matches what occurs in
• Prediction of a wide-sense stationary process using a Wiener filter
• Prediction of a linear dynamical system using a Kalman filter
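A simple worked case shows how the error form arises. The scalar AR(1) process used here is an assumed example chosen for illustration and does not appear in the slides. It takes $|a| < 1$ and zero-mean white noise $e(t)$, with the predictor being the conditional mean given $y_\tau$.

```latex
% Assumed example: scalar AR(1) process driven by zero-mean white noise e(t).
y_t = a\,y_{t-1} + e(t)

% Unrolling the recursion from time \tau to time t:
y_t = a^{\,t-\tau}\,y_\tau + \sum_{s=\tau+1}^{t} a^{\,t-s}\,e(s)

% The conditional-mean predictor made at time \tau drops the unseen noise:
y_{t|\tau} = a^{\,t-\tau}\,y_\tau

% Hence the prediction error has the stated form with f(k) = a^{k}:
y_t - y_{t|\tau} = \sum_{s=\tau+1}^{t} f(t-s)\,e(s), \qquad f(k) = a^{k}
```

In this example the weight $f(k) = a^k$ decays with $k$, so noise realized long before time $t$ contributes less to the error.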

A more realistic prediction noise model

$$y_t = y_{t|\tau} + \sum_{s=\tau+1}^{t} f(t-s)\,e(s)$$

Key observation: No assumption about $y_t$ or how predictions are made
Allows adversarial analysis using stochastic prediction noise

$$\mathrm{Regret}(Alg) = \sup\, \mathbb{E}_e\left[\mathrm{cost}(Alg) - \mathrm{cost}(STA)\right]$$

$$\mathrm{Competitive\ Ratio}(Alg) = \sup\, \mathbb{E}_e\left[\frac{\mathrm{cost}(Alg)}{\mathrm{cost}(Opt)}\right]$$

A natural suggestion: Model Predictive Control (MPC)

Predictions available at time $t$: $y_{t+1|t}, y_{t+2|t}, \ldots, y_{t+w|t}, y_{t+w+1|t}, y_{t+w+2|t}, \ldots$

$$x_{t+1}, x_{t+2}, \ldots, x_{t+w} = \operatorname*{argmin} \sum_{s=t+1}^{t+w} \frac{1}{2}\|y_{s|t} - Kx_s\|^2 + \beta\|x_s - x_{s-1}\|_1$$
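The window optimization above can be sketched for the scalar case with a small dynamic program. This is not the authors' implementation. It assumes scalar decisions restricted to a finite grid, a scalar $K$, and a known previous decision `x_prev`; the function name `mpc_plan` and the example numbers are hypothetical.

```python
def mpc_plan(preds, x_prev, grid, beta, K=1.0):
    """Minimise sum_s 0.5*(y_{s|t} - K*x_s)^2 + beta*|x_s - x_{s-1}| over grid values.

    preds: predictions y_{t+1|t}, ..., y_{t+w|t} for the next w steps.
    Returns the planned decisions x_{t+1}, ..., x_{t+w} as a list.
    """
    n = len(grid)
    # cost[j]: cheapest cost of a partial plan whose latest decision is grid[j].
    cost = [0.5 * (preds[0] - K * g) ** 2 + beta * abs(g - x_prev) for g in grid]
    back = []  # back[k][j]: best predecessor index for grid[j] at window step k + 1
    for y in preds[1:]:
        new_cost, ptr = [], []
        for g in grid:
            j = min(range(n), key=lambda i: cost[i] + beta * abs(g - grid[i]))
            new_cost.append(cost[j] + beta * abs(g - grid[j]) + 0.5 * (y - K * g) ** 2)
            ptr.append(j)
        cost = new_cost
        back.append(ptr)
    # Trace the cheapest final state back to the first step of the window.
    j = min(range(n), key=lambda i: cost[i])
    plan = [grid[j]]
    for ptr in reversed(back):
        j = ptr[j]
        plan.append(grid[j])
    plan.reverse()
    return plan


# Hypothetical usage: a window of w = 3 predictions, starting from x_prev = 0.
grid = [i / 10 for i in range(11)]
print(mpc_plan([0.2, 0.7, 0.9], x_prev=0.0, grid=grid, beta=0.1))
```

In common receding-horizon practice only the first planned decision is applied, and the plan is recomputed at the next step with refreshed predictions. The slide itself only states the window optimization.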
