


Learning with Global Cost in Stochastic Environments

Eyal Even-dar, Shie Mannor and Yishay Mansour

Technion

COLT, June 2010 (You didn’t hear it last year. Really.)

Shie Mannor (Technion) Learning with Global Cost in Stochastic Environments COLT, June 2010 1 / 26


Table of contents

1. Introduction
2. The Framework
3. Natural algorithms that don’t work
4. Algorithms that sort of work
5. Analysis
6. Conclusions and open problems


Regret Minimization

Let L be a sequence of losses of length T. Then

R(T, L) = E[max(Cost(alg, L) − Cost(opt in hindsight, L), 0)],   R(T) = max_L R(T, L).

An algorithm is no-regret if R(T) is sublinear in T. The cost is in general not additive.


Regret Minimization (biased view)

So we have come a long way:

- N experts (full and partial information)
- Shortest path (full and partial information)
- Strongly convex functions (better bounds)
- Many more... (40% of papers this year)

But there is some room to grow:

- There is no memory/state (in most works).
- Losses are assumed to be additive across time (in almost all works).
- Most algorithms are essentially greedy (bad for job talks).

Regret Minimization with State

- Routing [AK ...]
- MDPs [EKM, YMS]
- Paging [BBK]
- Data structures [BCK]
- Load balancing – this talk


Are we optimizing the true loss function?

- Predicting click-through rates (calibration)
- Handwriting recognition (calibration)
- Relevant documents, viral marketing (submodular functions)
- Load balancing

Model

- N alternatives.
- The algorithm chooses a distribution $\bar p_t$ over the alternatives and then observes a loss vector $\bar\ell_t$.
- Algorithm accumulated loss: $\bar L^A_t = \sum_{\tau=1}^{t} \bar\ell_\tau \cdot \bar p_\tau$ (componentwise).
- Overall loss: $\bar L_t = \sum_{\tau=1}^{t} \bar\ell_\tau$.
- Algorithm cost: $C(\bar L^A_t)$, where $C$ is a global cost function.
- OPT cost: $C^*(\bar L_t) = \min_{\alpha \in \Delta(N)} C(\alpha \cdot \bar L_t)$.
- Regret: $\max\{C(\bar L^A_t) - C^*(\bar L_t),\, 0\}$.
- Assume $C$ is an $L_d$ norm ($d \ge 1 \Rightarrow C$ is convex and $C^*$ concave).
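For concreteness, the model can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code; the two-machine source and the makespan closed form $C^*(\bar L) = 1/\sum_i 1/L_i$ used below are taken from the later slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 2, 10_000

# IID source: each step the loss vector is one standard basis vector.
losses = np.eye(N)[rng.integers(0, N, size=T)]   # T x N loss vectors
p = np.full((T, N), 1.0 / N)                     # a fixed distribution (.5, .5)

L_alg = (losses * p).sum(axis=0)   # algorithm's accumulated load (componentwise)
L_tot = losses.sum(axis=0)         # overall loss vector

cost_alg = L_alg.max()                 # makespan C(L^A_T)
cost_opt = 1.0 / (1.0 / L_tot).sum()   # C*(L_T) for the makespan
regret = max(cost_alg - cost_opt, 0.0)
print(cost_alg, cost_opt, regret)
```

With the fixed (.5, .5) distribution the algorithm's load vector is exactly half the overall loss vector, which makes the comparison to OPT transparent.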


Model - load balancing with makespan

Assume makespan: $C = \|\cdot\|_\infty$.

Time | Loss  | Dist.     | Alg Accu.   | C(Alg) | Overall loss | C*
  1  | (1,1) | (.5,.5)   | (.5,.5)     |  .5    | (1,1)        | .5
  2  | (1,0) | (.5,.5)   | (1,.5)      |  1     | (2,1)        | .66
  3  | (1,0) | (.33,.66) | (1.33,.5)   |  1.33  | (3,1)        | .75
  4  | (0,1) | (.25,.75) | (1.33,1.25) |  1.33  | (3,2)        | 1.2

Minimizing the sum of losses does not minimize C* and vice versa.
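As a sanity check, the table can be replayed mechanically. A sketch; the closed form 1/(1/L₁ + 1/L₂) for C* is from the next slide.

```python
# Replay the four rounds of the table above and recompute each column.
rows = [((1, 1), (0.5, 0.5)),
        ((1, 0), (0.5, 0.5)),
        ((1, 0), (1 / 3, 2 / 3)),
        ((0, 1), (0.25, 0.75))]
acc = [0.0, 0.0]   # algorithm's accumulated load
tot = [0.0, 0.0]   # overall loss vector
for loss, dist in rows:
    acc = [a + q * l for a, q, l in zip(acc, dist, loss)]
    tot = [t + l for t, l in zip(tot, loss)]
    c_alg = max(acc)
    c_opt = 1.0 / (1.0 / tot[0] + 1.0 / tot[1])
    print(acc, c_alg, tot, round(c_opt, 2))
```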

Model - load balancing with makespan

Let’s focus on the makespan ($L_\infty$) for now. The optimal policy in hindsight for the load vector $\bar L$ is

$p_i = \frac{1/L_i}{\sum_{j=1}^N 1/L_j}$.

The cost of the optimal policy is

$C^*(\bar L) = \frac{1}{\sum_{j=1}^N 1/L_j} = \frac{\prod_{j=1}^N L_j}{\sum_{j=1}^N \prod_{i \neq j} L_i}$.
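A small sketch checking that the allocation $p_i \propto 1/L_i$ equalizes the per-machine loads at exactly $C^*(\bar L)$ (the load vector (3, 2) is the last row of the table above):

```python
def opt_allocation(L):
    # weights proportional to 1/L_i (hindsight-optimal for the makespan)
    inv = [1.0 / x for x in L]
    s = sum(inv)
    return [w / s for w in inv]

L = [3.0, 2.0]
p = opt_allocation(L)
loads = [pi * Li for pi, Li in zip(p, L)]
c_star = 1.0 / sum(1.0 / x for x in L)
print(p, loads, c_star)   # every machine carries the same load, C*(L) = 1.2
```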


The Loss Model

The loss sequence is generated by a stochastic source. In the talk we consider a very simple case; the results, however, hold in general.

The loss vectors are drawn IID from some measure D. (Note: within a vector, the arms are possibly correlated.)

Known D and unknown D are both interesting. (We thought known D would be easy - how hard can the stochastic case be if you solved the adversarial case and you know the source?)

Known source - a simple example

Consider two machines: at each time, w.p. 1/2 the loss vector is (1, 0) and w.p. 1/2 it is (0, 1). W.h.p. the cost of the best fixed policy in hindsight is T/4 − O(1). What is the optimal policy?
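A Monte Carlo sketch of this claim (illustrative, not from the paper): since L₁ + L₂ = T, the gap between T/4 and the hindsight cost equals (L₁ − T/2)²/T, whose expectation is Var(L₁)/T = 1/4, i.e. O(1).

```python
import numpy as np

rng = np.random.default_rng(1)
T, runs = 10_000, 200
gaps = []
for _ in range(runs):
    L1 = rng.integers(0, 2, size=T).sum()   # load on machine 1
    L2 = T - L1                             # each step puts its unit on one machine
    c_star = (L1 * L2) / T                  # = 1 / (1/L1 + 1/L2) since L1 + L2 = T
    gaps.append(T / 4 - c_star)
print(np.mean(gaps))   # about 1/4, independent of T
```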


“Naive model based”

A standard technique in control/machine learning:

1. Learn the model.
2. Compute the optimal policy for the learned model.

AKA “certainty equivalence”. Here we already know the model, so let’s do the following:

Naive model based algorithm: at each time-step assign half of the job to machine 1 and half to machine 2.

How good is it? Is it optimal?

“Naive model based” - Simulation

Performance analysis

Cost ingredients:
- The sum of the actual loads on the two machines.
- The difference between the machines.

$\max(x, y) = \frac{x + y}{2} + \frac{|x - y|}{2}$


“Naive model based” - Analysis

- Expected sum: T/2 (like every algorithm...).
- Difference: w.h.p. the load on one machine is at least T/4 + √T/2 and on the second machine it is at most T/4 − √T/2; thus the difference is √T.
- Cost: at least T/4 + √T/2.
- Regret: O(√T).

Can we do better?
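The √T behaviour is easy to see numerically (a sketch): under the (.5, .5) policy the algorithm's load vector is exactly half the overall loss vector, so its cost is max(L₁, L₂)/2.

```python
import numpy as np

rng = np.random.default_rng(2)
T, runs = 10_000, 200
regrets = []
for _ in range(runs):
    L1 = rng.integers(0, 2, size=T).sum()
    L2 = T - L1
    cost_alg = max(L1, L2) / 2        # alg load vector is (L1/2, L2/2)
    c_star = L1 * L2 / T
    regrets.append(max(cost_alg - c_star, 0.0))
print(np.mean(regrets) / np.sqrt(T))   # bounded away from 0: regret grows like sqrt(T)
```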


Least loaded machine

The algorithm: at every time-step assign the next job to the least loaded machine.

Analysis:
- Expected sum: T/2
- Expected difference: < 1
- Expected cost: T/4
- Expected regret: O(√T)
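A sketch of the least-loaded rule on the same source (illustrative code, not the authors'): the gap between the machines stays bounded, yet the realized cost still fluctuates around T/4 on the order of √T.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 10_000
loads = [0.0, 0.0]   # algorithm's accumulated loads
tot = [0, 0]         # overall loss vector
for _ in range(T):
    hit = int(rng.random() < 0.5)         # machine receiving this step's unit loss
    tot[hit] += 1
    i = 0 if loads[0] <= loads[1] else 1  # put all weight on the least loaded machine
    if i == hit:
        loads[i] += 1.0
gap = abs(loads[0] - loads[1])
c_alg = max(loads)
c_star = tot[0] * tot[1] / T
print(gap)              # stays at 0 or 1
print(c_alg - c_star)   # still fluctuates on the order of sqrt(T)
```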

Comparison - Simulation


Regret and bias-variance tradeoffs

Least loaded machine: in terms of expected loss, the algorithm is optimal, yet the regret is still O(√T). In this setting the regret measures both the bias and the variance! Can we lower the regret while maintaining the optimal expected loss?


Balance at the end

End balance:
- Until time T − 4√T: play at random (.5, .5).
- After time T − 4√T: use the least loaded machine algorithm.

Analysis:
- Expected sum: T/2
- Expected difference: < 1
- Expected cost: T/4
- Expected regret: O(T^{1/4})
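A sketch of end balance (illustrative): random halves until T − 4√T, then the least-loaded rule, which w.h.p. has enough steps left to close the accumulated gap.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 10_000
switch = T - int(4 * np.sqrt(T))   # the last 4*sqrt(T) steps balance the gap
loads = [0.0, 0.0]
for t in range(T):
    hit = int(rng.random() < 0.5)
    if t < switch:
        loads[hit] += 0.5                      # play (.5, .5)
    else:
        i = 0 if loads[0] <= loads[1] else 1   # least loaded machine
        if i == hit:
            loads[i] += 1.0
print(abs(loads[0] - loads[1]))   # driven back near 0 by the final phase
```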

Comparison - Simulation


Improved algorithms

Recursive balance:
- Partition time into blocks of sizes T/2, T/4, T/8, ..., 1. (Yes: the blocks become smaller.)
- In every block play 1/2 + ε so as to balance the “offset” from the previous block.
- “Offset”: the deviation of the process from its true probability (not influenced by the algorithm).
- Regret of the algorithm is O(log T).

Anytime:
- Set ε = 1/T^{1/4}.
- At every time step put weight 1/2 + ε on the least loaded machine.
- Regret of the algorithm is O(T^{1/4}) at any time.
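A sketch of the anytime rule with ε = T^{−1/4} (illustrative code; the constant tilt toward the lagging machine is the whole idea):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 10_000
eps = T ** -0.25                     # the tilt toward the lagging machine
loads = np.zeros(2)
for _ in range(T):
    loss = np.zeros(2)
    loss[rng.integers(0, 2)] = 1.0
    p = np.full(2, 0.5 - eps)
    p[np.argmin(loads)] = 0.5 + eps  # weight 1/2 + eps on the least loaded machine
    loads += p * loss
print(abs(loads[0] - loads[1]))      # kept on the order of T^(1/4) or below
```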

Comparison - Simulation

Analysis

We define generic properties of a desired phased algorithm:
- P1: The empirical frequencies of the losses are close to their true expectations.
- P2: The base weights are close to the optimal weights for all actions.
- P3: The phase length does not shrink too fast.

We analyze a generic algorithm with the above properties: the regret is small if the properties hold for most phases with high probability.

The generic algorithm uses a base weight vector $w^*$. During phase k the weight of action i does not change, and it equals

$w_k(i) = w^*(i) + \frac{T_{k-1}}{T_k}\,(\mathrm{opt}_{k-1}(i) - w^*(i))$
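The phase-k weights, as reconstructed above, in code. The numbers below are illustrative, not from the paper, and `phase_weights` is a hypothetical helper name.

```python
def phase_weights(w_star, opt_prev, T_prev, T_k):
    # w_k(i) = w*(i) + (T_{k-1}/T_k) * (opt_{k-1}(i) - w*(i))
    return [w + (T_prev / T_k) * (o - w) for w, o in zip(w_star, opt_prev)]

w_star   = [0.5, 0.5]    # base weight vector
opt_prev = [0.55, 0.45]  # empirical optimum observed in the previous phase
print(phase_weights(w_star, opt_prev, T_prev=2, T_k=1))
```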

Makespan: known probabilities

Theorem: For known probabilities the regret is at most O(log T).

- Set $w^*(i) = \frac{1/p(i)}{P}$, where $P = \sum_{i=1}^n 1/p(i)$, i.e., the optimal allocation for p.
- Set the number of phases m = log T.
- Set the length of phase k to $T_k = T/2^k$ for k ∈ [1, m].
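The schedule in code (a sketch; the log is taken base 2 here, which is an assumption matching the halving phase lengths):

```python
import math

def schedule(p, T):
    P = sum(1.0 / pi for pi in p)
    w_star = [(1.0 / pi) / P for pi in p]             # optimal allocation for p
    m = int(math.log2(T))                             # number of phases
    phases = [T // 2 ** k for k in range(1, m + 1)]   # T_k = T / 2^k
    return w_star, phases

w_star, phases = schedule([0.5, 0.5], 1024)
print(w_star, phases)
```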


Makespan: Unknown probabilities

Theorem: For unknown probabilities, w.h.p. the regret is at most O(log² T).

We don’t have the true p, and estimating it from the entire past leads to a difficult analysis (we couldn’t solve it). Instead we divide the time horizon into blocks, and each block into phases:
- Partition T into log(T/2) blocks, where the r-th block, $B_r$, has $2^r$ time steps.
- Set the reference $w^{r,*}(i)$ using the observed probabilities in block $B_{r-1}$.
- In block $B_r$ we have m = r phases, where the duration of phase k is $T_{r,k} = |B_r|/2^k = 2^{r-k}$. (Decreasing phase lengths.)

Not known if tight.
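The block/phase layout in code (a sketch of the arithmetic above; `blocks_and_phases` is an illustrative helper name):

```python
import math

def blocks_and_phases(T):
    R = int(math.log2(T / 2))   # log(T/2) blocks
    # block r has 2^r steps, split into r phases of lengths 2^(r-1), ..., 1
    return {r: [2 ** (r - k) for k in range(1, r + 1)] for r in range(1, R + 1)}

layout = blocks_and_phases(64)
print(layout[3])   # [4, 2, 1]
```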


Conclusions and open problems

- Stochastic vs adversarial model: different rates. (Not really surprising.)
- The information model here is specific; other information models are possible.
- Next COLT? Calibration without calibrating.

Still open:
- Lower bounds. (Looks really hard.)
- For which other global functions is no-regret possible?
- Relaxed goals for global functions.

Thank you!