
Regret Minimization for Online Buffering Problems Using the Weighted Majority Algorithm
Sascha Geulen, Berthold Vöcking, Melanie Winkler
Department of Computer Science, RWTH Aachen University
June 27, 2010
Melanie Winkler (RWTH Aachen University) Online Buffering June 27, 2010

Online Buffering
Toy example: a buffer of bounded size $B$. In every time step $t = 1, \dots, T$:
◮ demand $d_t \le B$
◮ $p_t \in [0, 1]$, price per unit of the resource, OR
◮ $f_t(x)$, price function to buy $x$ units
How much should be purchased in time step $t$?
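The per-unit-price variant of the toy example can be sketched as a small cost evaluator. This is only an illustration under assumptions not stated on the slide: the buffer starts empty, the purchase happens before the demand is served, and whatever is left after serving the demand must fit into the buffer. The function name `schedule_cost` and its parameters are hypothetical.

```python
from typing import Sequence


def schedule_cost(prices: Sequence[float], demands: Sequence[float],
                  purchases: Sequence[float], B: float) -> float:
    """Return the total cost of a purchase schedule.

    Assumed order of events per step (an assumption, not from the slides):
    buy x_t units at price p_t, serve the demand d_t, keep the rest.
    Raises ValueError if the schedule is infeasible.
    """
    eps = 1e-12          # tolerance for floating-point comparisons
    level = 0.0          # units currently in the buffer (assumed empty at start)
    total = 0.0
    for p, d, x in zip(prices, demands, purchases):
        if x < 0:
            raise ValueError("purchases must be non-negative")
        level += x
        if level < d - eps:
            raise ValueError("demand cannot be served")
        level -= d
        if level > B + eps:
            raise ValueError("buffer capacity exceeded")
        total += p * x
    return total
```

For example, with prices `[0, 1, 0, 1]`, demands of `0.25` per step, and `B = 1`, buying one unit in the first step costs nothing, while buying exactly the demand in every step costs `0.5`.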

Online Buffering: Applications
Main application: battery management of hybrid cars
◮ two energy resources (combustion / electrical)
◮ given the requested torque of the car and the battery level
◮ determine the torque of the combustion engine

Online Learning: Motivation
◮ Online buffering problems have been studied in worst-case analysis.
◮ The resulting algorithm is "threat-based", i.e., it buys enough to ensure the competitive factor in the next step for all possible extensions of the price sequence.

Online Learning: Problems
Online Learning Applied to Online Buffering
Algorithm 1 (Randomized Weighted Majority (RWM))
1: $w_i^1 = 1$, $q_i^1 = \frac{1}{N}$, for all $i \in \{1, \dots, N\}$
2: for $t = 1, \dots, T$ do
3:   choose expert $e^t$ at random according to $Q^t = (q_1^t, \dots, q_N^t)$
4:   $w_i^{t+1} = w_i^t (1 - \eta)^{c_i^t}$, for all $i$
5:   $q_i^{t+1} = \frac{w_i^{t+1}}{\sum_{j=1}^N w_j^{t+1}}$, for all $i$
6: end for
Problem:
$\binom{p_t}{d_t} = \binom{0}{0} \left( \binom{0}{1/4} \binom{1}{1/4} \binom{0}{1/4} \binom{1}{1/4} \right)^{T'}$
The first expert purchases $1/2$ unit in the initial step and afterwards one unit in the third step of every round.
The second expert purchases one unit in the first step of every round.
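Algorithm 1 above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `rwm`, the cost-matrix layout `costs[t][i]`, and the seeded random generator are assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def rwm(costs: Sequence[Sequence[float]], eta: float,
        rng: Optional[random.Random] = None) -> Tuple[List[int], List[float]]:
    """Randomized Weighted Majority.

    costs[t][i] is the cost c_i^t of expert i in step t (layout is assumed).
    Returns the list of chosen experts and the final distribution Q^{T+1}.
    """
    rng = rng or random.Random(0)
    n = len(costs[0])
    w = [1.0] * n                          # line 1: w_i^1 = 1
    chosen = []
    for c in costs:                        # line 2: for t = 1, ..., T
        total = sum(w)
        q = [wi / total for wi in w]       # Q^t, normalized weights
        chosen.append(rng.choices(range(n), weights=q)[0])   # line 3
        w = [wi * (1.0 - eta) ** ci for wi, ci in zip(w, c)]  # line 4
    total = sum(w)
    return chosen, [wi / total for wi in w]  # line 5 for the last step
```

Note that a fresh expert is drawn in every step, so the chosen expert can change often even when the distribution barely moves.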

Our Approach: Shrinking Dartboard
Online Learning for Online Buffering
Algorithm 2 (Shrinking Dartboard (SD))
1: $w_i^1 = 1$, $q_i^1 = \frac{1}{N}$, for all $i$
2: choose expert $e^1$ at random according to $Q^1 = (q_1^1, \dots, q_N^1)$
3: for $t = 2, \dots, T$ do
4:   $w_i^t = w_i^{t-1} (1 - \eta)^{c_i^{t-1}}$, for all $i$
5:   $q_i^t = \frac{w_i^t}{\sum_{j=1}^N w_j^t}$, for all $i$
6:   with probability $\frac{w_{e^{t-1}}^t}{w_{e^{t-1}}^{t-1}}$ do not change expert, i.e., set $e^t = e^{t-1}$
7:   else choose $e^t$ at random according to $Q^t = (q_1^t, \dots, q_N^t)$
8: end for
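A minimal Python sketch of Algorithm 2 follows. The function name `shrinking_dartboard`, the layout `costs[t][i]`, and the seeded generator are assumptions for illustration. The keep probability is computed as $(1 - \eta)^{c}$ for the current expert's last cost, which equals the weight ratio in line 6 and avoids dividing by a weight.

```python
import random
from typing import List, Optional, Sequence


def shrinking_dartboard(costs: Sequence[Sequence[float]], eta: float,
                        rng: Optional[random.Random] = None) -> List[int]:
    """Shrinking Dartboard: return the expert followed in each step.

    costs[t][i] is the cost of expert i in step t (layout is assumed).
    """
    rng = rng or random.Random(0)
    n = len(costs[0])
    w = [1.0] * n                                        # line 1
    current = rng.choices(range(n), weights=w)[0]        # line 2
    chosen = [current]
    for t in range(1, len(costs)):                       # line 3
        prev = costs[t - 1]
        w = [wi * (1.0 - eta) ** ci for wi, ci in zip(w, prev)]  # line 4
        # Line 6: keep the expert with probability w^t / w^{t-1},
        # which is (1 - eta) ** (cost of the current expert in step t-1).
        keep_prob = (1.0 - eta) ** prev[current]
        if rng.random() >= keep_prob:
            # Line 7: the dart fell outside the active area, so redraw
            # proportionally to the new weights (rng.choices normalizes).
            current = rng.choices(range(n), weights=w)[0]
        chosen.append(current)
    return chosen
```

An expert that incurs no cost is never abandoned in this sketch, because its keep probability is exactly 1.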

Our Approach: Shrinking Dartboard
Shrinking Dartboard Algorithm
Idea: dartboard of size $N$, area of size 1 for expert $i$
1. set active area of expert $i$ to 1
2. throw dart into active area to choose an expert
3. if weight of expert $i$ decreases
   ◮ decrease active area of that expert
4. dart outside of active area ⇒ throw new dart
⇒ distribution to choose an expert is the same as for RWM in every step, but depends on $e^{t-1}$
Theorem
For $\eta = \min\{\sqrt{\ln N / (BT)}, 1/2\}$, the expected cost of SD satisfies
$C_{SD}^T \le C_{best}^T + O(\sqrt{BT \log N})$.

Our Approach: Shrinking Dartboard
Regret of Shrinking Dartboard
Proof idea:
Observation: $E[c_{SD}] \le \sum_t c^t_{\text{chosen expert}} + B \cdot E[\text{number of expert changes}]$
1. expected cost of chosen expert ⇔ cost of RWM: $(1 + \eta) C_{best}^T + \frac{\ln N}{\eta}$
2. additional cost for every expert change is at most $B$
   ◮ due to difference in number of units in the storage
3. estimate number of expert changes
   ◮ $W^t$, remaining size of dartboard in step $t$ ($W^t = \sum_{i=1}^N w_i^t$)
   ◮ size of dartboard is at least the weight of the best expert ($W^{T+1} \ge (1 - \eta)^{C_{best}^T}$, since each weight is multiplied by $(1 - \eta)^{c_i^t}$ in every step)
   ◮ $W^{T+1}$ equals the product of the fractions of the dartboard which remain from $t$ to $t + 1$, multiplied by $N$ ($N \prod_{t=1}^T \left(1 - \frac{W^t - W^{t+1}}{W^t}\right)$)
4. combining those equations leads to $C_{SD}^T \le C_{best}^T + O(\sqrt{BT \log N})$.
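The estimate in step 3 can be worked out as follows. This is a sketch of one standard way to carry out the calculation, not necessarily the authors' exact derivation. The symbol $\delta_t$ is introduced here for illustration. The derivation assumes that $e^t$ is distributed according to $Q^t$, as stated for SD, and uses the weight update $w_i^{t+1} = w_i^t (1 - \eta)^{c_i^t}$.

```latex
% delta_t: fraction of the dartboard that disappears from step t to t+1.
\[
  \delta_t := \frac{W^t - W^{t+1}}{W^t}
  = \sum_{i=1}^N q_i^t \left( 1 - \frac{w_i^{t+1}}{w_i^t} \right),
\]
% which is the probability that a new dart is thrown when e^t ~ Q^t.
% Hence E[number of expert changes] <= sum_t delta_t.
\begin{align*}
  (1 - \eta)^{C_{best}^T}
    &\le W^{T+1}
     = N \prod_{t=1}^T (1 - \delta_t)
     \le N \exp\Bigl( -\sum_{t=1}^T \delta_t \Bigr), \\
  \sum_{t=1}^T \delta_t
    &\le \ln N + C_{best}^T \ln \frac{1}{1 - \eta}
     \le \ln N + (\eta + \eta^2)\, C_{best}^T
     \qquad \text{for } 0 \le \eta \le \tfrac{1}{2}.
\end{align*}
```

Each expert change costs at most $B$, so under these assumptions the switching overhead is at most $B \left( \ln N + (\eta + \eta^2) C_{best}^T \right)$, which is then balanced against the RWM term by the choice of $\eta$.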

Our Approach: Weighted Fractional
Weighted Fractional Algorithm
Algorithm 3 (Weighted Fractional (WF))
1: $w_i^1 = 1$, $q_i^1 = \frac{1}{N}$, for all $i$
2: for $t = 2, \dots, T$ do
3:   purchase $x^t = \sum_{i=1}^N q_i x_i$ units, $x_i$ amount purchased by $i$
4:   $w_i^t = w_i^{t-1} (1 - \eta)^{c_i^{t-1}}$, for all $i$
5:   $q_i^t = \frac{w_i^t}{\sum_{j=1}^N w_j^t}$, for all $i$
6: end for
Idea: purchased amount is a weighted sum of the recommendations of the experts
Theorem
Suppose the price functions $f_t(x)$ are convex, for $1 \le t \le T$. Then for $\eta = \min\{\sqrt{\ln N / (BT)}, 1/2\}$ the cost of WF satisfies
$C_{WF}^T \le C_{best}^T + O(\sqrt{BT \log N})$.
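The weighted-sum idea of Algorithm 3 can be sketched as below. The function name `weighted_fractional` and the input layout are assumptions for illustration. The loop is also indexed slightly differently from the slide: in each step it purchases with the current distribution and then updates the weights with that step's costs. The sketch does not check whether the resulting schedule respects the buffer size.

```python
from typing import List, Sequence


def weighted_fractional(expert_purchases: Sequence[Sequence[float]],
                        expert_costs: Sequence[Sequence[float]],
                        eta: float) -> List[float]:
    """Weighted Fractional: return the amount purchased in each step.

    expert_purchases[t][i]: amount expert i recommends buying in step t.
    expert_costs[t][i]:     cost incurred by expert i in step t.
    Both layouts are assumptions made for this example.
    """
    n = len(expert_purchases[0])
    w = [1.0] * n                                   # w_i^1 = 1
    amounts = []
    for x, c in zip(expert_purchases, expert_costs):
        total = sum(w)
        q = [wi / total for wi in w]                # current distribution
        # Purchase the q-weighted sum of the experts' recommendations.
        amounts.append(sum(qi * xi for qi, xi in zip(q, x)))
        # Multiplicative update with the costs observed in this step.
        w = [wi * (1.0 - eta) ** ci for wi, ci in zip(w, c)]
    return amounts
```

With two experts, recommendations `[1, 0]` and costs `[1, 0]` in both steps, and `eta = 0.5`, the sketch buys `1/2` in the first step and `1/3` in the second, as the weight shifts toward the cheaper expert.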

Our Approach: Lower Bound
Lower Bound
Theorem
For every $T$, there exists a sequence of length $T$ together with $N$ experts s.t. every learning algorithm with a buffer of size $B$ suffers a regret of $\Omega(\sqrt{BT \log N})$.
Proof idea:
[Input sequence $\binom{p_t}{d_t}$, a three-column block repeated $T'$ times. The matrix is garbled in extraction: the surviving entries are 2, {0, 4}, 4 and 0, 0, 1, together with factors of $B$ whose placement is not recoverable.]
a) The expert purchases $B$ units in the first phase.
b) The expert purchases $B$ units in the second phase.
◮ every expert chooses one of the strategies uniformly at random in every round
◮ cost of experts: $N$ independent random walks of length $T'$ with step length $B$
◮ expected minimum of those random walks: $\frac{2}{3} T - \Omega(\sqrt{BT \log N})$, expected cost: $\frac{2}{3} T$

Summary
Shrinking Dartboard achieves low regret for online buffering
◮ A similar regret bound is also possible for Follow the Perturbed Leader [Kalai, Vempala, 2005]
Weighted Fractional achieves low regret also against an adaptive adversary
The regret bounds of the algorithms are tight
Thank you for your attention! Any questions?
