

SLIDE 1

Online Learning and Online Convex Optimization

Nicolò Cesa-Bianchi
Università degli Studi di Milano

SLIDE 2

Summary

1. My beautiful regret
2. A supposedly fun game I'll play again
3. The joy of convex


SLIDE 4

Machine learning

  • Classification/regression tasks
  • Predictive models: $h$ maps data instances $X$ to labels $Y$ (e.g., a binary classifier)
  • Training data $S_T = (X_1, Y_1), \dots, (X_T, Y_T)$ (e.g., email messages with spam vs. nonspam annotations)
  • A learning algorithm $A$ (e.g., a Support Vector Machine) maps the training data $S_T$ to a model $h = A(S_T)$
  • Evaluate the risk of the trained model $h$ with respect to a given loss function


SLIDE 6

Two notions of risk

  • View data as a statistical sample: statistical risk
    $\mathbb{E}\,\ell\big(\underbrace{A(S_T)}_{\text{trained model}}, \underbrace{(X,Y)}_{\text{test example}}\big)$
    Training set $S_T = (X_1,Y_1),\dots,(X_T,Y_T)$ and test example $(X,Y)$ drawn i.i.d. from the same unknown and fixed distribution
  • View data as an arbitrary sequence: sequential risk
    $\sum_{t=1}^{T} \ell\big(\underbrace{A(S_{t-1})}_{\text{trained model}}, \underbrace{(X_t,Y_t)}_{\text{test example}}\big)$
    Sequence of models trained on the growing prefixes $S_t = (X_1,Y_1),\dots,(X_t,Y_t)$ of the data sequence


SLIDE 8

Regrets, I had a few

A learning algorithm $A$ maps datasets to models in a given class $\mathcal{H}$.

Variance error in statistical learning:
$\mathbb{E}\,\ell\big(A(S_T),(X,Y)\big) - \inf_{h\in\mathcal{H}} \mathbb{E}\,\ell\big(h,(X,Y)\big)$
(compare to the expected loss of the best model in the class)

Regret in online learning:
$\sum_{t=1}^{T} \ell\big(A(S_{t-1}),(X_t,Y_t)\big) - \inf_{h\in\mathcal{H}} \sum_{t=1}^{T} \ell\big(h,(X_t,Y_t)\big)$
(compare to the cumulative loss of the best model in the class)


SLIDE 11

Incremental model update

A natural blueprint for online learning algorithms (sketched in code below). For $t = 1, 2, \dots$

1. Apply the current model $h_{t-1}$ to the next data element $(X_t, Y_t)$
2. Update the current model: $h_{t-1} \to h_t \in \mathcal{H}$ (local optimization)

Goal: control the regret
$\sum_{t=1}^{T} \ell\big(h_{t-1},(X_t,Y_t)\big) - \inf_{h\in\mathcal{H}} \sum_{t=1}^{T} \ell\big(h,(X_t,Y_t)\big)$

View this as a repeated game between a player generating predictors $h_t \in \mathcal{H}$ and an opponent generating data $(X_t, Y_t)$.
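In code, the blueprint is a short loop. Below is a minimal sketch (not from the slides): `model`, `predict`, `update`, and `loss` are illustrative names for the abstract objects above.

```python
# Minimal sketch of the incremental-update blueprint (names are illustrative).
# `model` is assumed to expose predict(x) and update(x, y); `loss` is any loss.

def online_learning_loop(model, stream, loss):
    """Run the protocol: predict on (x_t, y_t) with h_{t-1}, then update to h_t."""
    total_loss = 0.0
    for x_t, y_t in stream:                  # data arrives one element at a time
        prediction = model.predict(x_t)      # apply the current model h_{t-1}
        total_loss += loss(prediction, y_t)  # suffer loss on the new element
        model.update(x_t, y_t)               # local optimization: h_{t-1} -> h_t
    return total_loss
```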

SLIDE 12

Summary

1. My beautiful regret
2. A supposedly fun game I'll play again
3. The joy of convex

SLIDE 13

Theory of repeated games

James Hannan (1922–2010) and David Blackwell (1919–2010): learning to play a game (1956), that is, playing a game repeatedly against a possibly suboptimal opponent.

SLIDE 14

Zero-sum 2-person games played more than once

An $N \times M$ known loss matrix with entries $\ell(i, j)$: the row player (player) has $N$ actions, the column player (opponent) has $M$ actions.

For each game round $t = 1, 2, \dots$
  • The player chooses action $i_t$ and the opponent chooses action $y_t$
  • The player suffers loss $\ell(i_t, y_t)$ (= gain of the opponent)
  • The player can learn from the opponent's history of past choices $y_1, \dots, y_{t-1}$

SLIDE 15

Prediction with expert advice

Volodya Vovk and Manfred Warmuth. The loss matrix now has one column per round $t = 1, 2, \dots$ and one row per action $i = 1, \dots, N$, with entries $\ell_t(i)$. The opponent's moves $y_1, y_2, \dots$ define a sequential prediction problem with a time-varying loss function $\ell(i_t, y_t) = \ell_t(i_t)$.


SLIDE 18

Playing the experts game

A sequential decision problem with $N$ actions and an unknown deterministic assignment of losses to actions: $\ell_t = \big(\ell_t(1), \dots, \ell_t(N)\big) \in [0,1]^N$ for $t = 1, 2, \dots$

For $t = 1, 2, \dots$

1. The player picks an action $I_t$ (possibly using randomization) and incurs loss $\ell_t(I_t)$
2. The player gets full feedback information: $\ell_t(1), \dots, \ell_t(N)$


SLIDE 20

Regret analysis

Regret: $R_T \stackrel{\text{def}}{=} \mathbb{E}\Big[\sum_{t=1}^{T} \ell_t(I_t)\Big] - \min_{i=1,\dots,N} \sum_{t=1}^{T} \ell_t(i)$; we want $R_T = o(T)$.

Lower bound using random losses [Experts' paper, 1997]
  • $\ell_t(i) \to L_t(i) \in \{0,1\}$, an independent random coin flip
  • For any player strategy, $\mathbb{E}\Big[\sum_{t=1}^{T} L_t(I_t)\Big] = \frac{T}{2}$
  • Then the expected regret is $\mathbb{E}\Big[\max_{i=1,\dots,N} \sum_{t=1}^{T} \Big(\frac12 - L_t(i)\Big)\Big] = \big(1 - o(1)\big)\sqrt{\frac{T \ln N}{2}}$ for $N, T \to \infty$

SLIDE 21

Exponentially weighted forecaster (Hedge)

At time $t$ pick action $I_t = i$ with probability proportional to $\exp\Big(-\eta \sum_{s=1}^{t-1} \ell_s(i)\Big)$; the sum in the exponent is the total loss of action $i$ up to now.

Regret bound [Experts' paper, 1997]: if $\eta = \sqrt{(8 \ln N)/T}$ then
$R_T \le \sqrt{\frac{T \ln N}{2}}$
matching the lower bound including constants. The dynamic choice $\eta_t = \sqrt{(8 \ln N)/t}$ only loses small constants.
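A minimal sketch of Hedge under the assumptions above (losses in $[0,1]$, fixed $\eta$); the demo data and array shapes are illustrative, and the tuning follows the slide.

```python
import numpy as np

def hedge(losses, eta, seed=0):
    """Exponentially weighted forecaster on a T x N array of losses in [0,1].

    Each round, action i is drawn with probability proportional to
    exp(-eta * cumulative loss of i). Returns the sequence of chosen actions.
    """
    T, N = losses.shape
    cum_loss = np.zeros(N)
    rng = np.random.default_rng(seed)
    actions = []
    for t in range(T):
        w = np.exp(-eta * (cum_loss - cum_loss.min()))  # shift for stability
        p = w / w.sum()
        actions.append(rng.choice(N, p=p))
        cum_loss += losses[t]           # full information: all losses revealed
    return actions

# Example: T rounds, N actions, eta tuned as on the slide.
T, N = 1000, 10
losses = np.random.default_rng(1).random((T, N))
hedge(losses, eta=np.sqrt(8 * np.log(N) / T))
```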

SLIDE 25

The nonstochastic bandit problem

For $t = 1, 2, \dots$

1. The player picks an action $I_t$ (possibly using randomization) and incurs loss $\ell_t(I_t)$
2. The player gets partial information: only $\ell_t(I_t)$ is revealed

The player is still competing against the best offline action:
$R_T = \mathbb{E}\Big[\sum_{t=1}^{T} \ell_t(I_t)\Big] - \min_{i=1,\dots,N} \sum_{t=1}^{T} \ell_t(i)$


SLIDE 27

The Exp3 algorithm [Auer et al., 2002]

Hedge with estimated losses (see the sketch below):
$P_t(I_t = i) \propto \exp\Big(-\eta \sum_{s=1}^{t-1} \widehat{\ell}_s(i)\Big)$, $i = 1, \dots, N$, where
$\widehat{\ell}_t(i) = \dfrac{\ell_t(i)}{P_t\big(\ell_t(i) \text{ observed}\big)}$ if $I_t = i$, and $0$ otherwise.
Only one non-zero component in $\widehat{\ell}_t$.

Properties of the importance-weighting estimator:
  • $\mathbb{E}_t\big[\widehat{\ell}_t(i)\big] = \ell_t(i)$ (unbiasedness)
  • $\mathbb{E}_t\big[\widehat{\ell}_t(i)^2\big] \le \dfrac{1}{P_t\big(\ell_t(i) \text{ observed}\big)}$ (variance control)
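A minimal Exp3 sketch under the slide's assumptions (losses in $[0,1]$); in the bandit case $P_t(\ell_t(i)\text{ observed}) = P_t(I_t = i)$, so the played arm's loss is divided by its probability. The $\eta$ in the demo is one standard tuning, not taken from the slide.

```python
import numpy as np

def exp3(loss_fn, N, T, eta, seed=0):
    """Exp3: Hedge run on importance-weighted loss estimates (bandit feedback).

    loss_fn(t, i) returns the loss of arm i at round t, in [0, 1];
    only the played arm's loss is used, as in the bandit protocol.
    """
    rng = np.random.default_rng(seed)
    cum_est = np.zeros(N)                  # cumulative estimated losses
    total = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum_est - cum_est.min()))
        p = w / w.sum()
        i = rng.choice(N, p=p)             # play I_t ~ p_t
        ell = loss_fn(t, i)                # only this loss is observed
        total += ell
        cum_est[i] += ell / p[i]           # unbiased: E_t[estimate] = true loss
    return total

# Example run: 5 arms, arm 0 slightly better on average.
exp3(lambda t, i: 0.3 if i == 0 else 0.5, N=5, T=10000,
     eta=np.sqrt(2 * np.log(5) / (5 * 10000)))
```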


SLIDE 30

Exp3 regret bound

$R_T \le \frac{\ln N}{\eta} + \frac{\eta}{2}\,\mathbb{E}\Bigg[\sum_{t=1}^{T}\sum_{i=1}^{N} P_t(I_t = i)\,\mathbb{E}_t\big[\widehat{\ell}_t(i)^2\big]\Bigg] \le \frac{\ln N}{\eta} + \frac{\eta}{2}\,\mathbb{E}\Bigg[\sum_{t=1}^{T}\sum_{i=1}^{N} \frac{P_t(I_t = i)}{P_t\big(\ell_t(i) \text{ is observed}\big)}\Bigg] = \frac{\ln N}{\eta} + \frac{\eta}{2}\,NT = \sqrt{NT \ln N}$

Lower bound: $\Omega\big(\sqrt{NT}\big)$. Improved matching upper bound by [Audibert and Bubeck, 2009].

The full information (experts) setting: the player observes the vector of losses $\ell_t$ after each play, so $P_t\big(\ell_t(i) \text{ is observed}\big) = 1$ and $R_T \le \sqrt{T \ln N}$.

SLIDE 34

Nonoblivious opponents

The adaptive adversary: the loss of action $i$ at time $t$ depends on the player's past $m$ actions, $\ell_t(i) \to \ell_t(I_{t-m}, \dots, I_{t-1}, i)$. Example: bandits with switching costs.

Nonoblivious regret:
$R_T^{\text{non}} = \mathbb{E}\Bigg[\sum_{t=1}^{T} \ell_t(I_{t-m},\dots,I_{t-1},I_t) - \min_{i=1,\dots,N} \sum_{t=1}^{T} \ell_t(I_{t-m},\dots,I_{t-1},i)\Bigg]$

Policy regret:
$R_T^{\text{pol}} = \mathbb{E}\Bigg[\sum_{t=1}^{T} \ell_t(I_{t-m},\dots,I_{t-1},I_t) - \min_{i=1,\dots,N} \sum_{t=1}^{T} \ell_t(\underbrace{i,\dots,i}_{m \text{ times}}, i)\Bigg]$


SLIDE 36

Bandits and reactive opponents

Bounds on the nonoblivious regret (even when $m$ depends on $T$):
$R_T^{\text{non}} = O\big(\sqrt{TN \ln N}\big)$, achieved by Exp3 with biased loss estimates. Is the $\sqrt{\ln N}$ factor necessary?

Bounds on the policy regret for any constant $m \ge 1$:
$R_T^{\text{pol}} = O\big((N \ln N)^{1/3}\, T^{2/3}\big)$
achieved by a very simple player strategy, and optimal up to log factors! [Dekel, Koren, and Peres, 2014]

SLIDE 37

Partial monitoring: not observing any loss

Dynamic pricing: perform as well as the best fixed price.

1. Post a T-shirt price
2. Observe whether the next customer buys or not
3. Adjust the price

Feedback does not reveal the player's loss:

Loss matrix (row = posted price, column = customer's valuation):

        1  2  3  4  5
    1   0  1  2  3  4
    2   c  0  1  2  3
    3   c  c  0  1  2
    4   c  c  c  0  1
    5   c  c  c  c  0

Feedback matrix (1 = the customer buys, blank = no sale):

        1  2  3  4  5
    1   1  1  1  1  1
    2      1  1  1  1
    3         1  1  1
    4            1  1
    5               1


SLIDE 39

A characterization of minimax regret

Special case: multiarmed bandits, where the loss and feedback matrices are the same.

A general gap theorem [Bartók, Foster, Pál, Rakhlin and Szepesvári, 2013]: a constructive characterization of the minimax regret for any pair of loss/feedback matrices. Only three possible rates for nontrivial games:

1. Easy games (e.g., bandits): $\Theta\big(\sqrt{T}\big)$
2. Hard games (e.g., revealing action): $\Theta\big(T^{2/3}\big)$
3. Impossible games: $\Theta(T)$

SLIDE 40

A game equivalent to prediction with expert advice

Online linear optimization in the simplex:

1. Play $p_t$ from the $N$-dimensional simplex $\Delta_N$
2. Incur the linear loss $\mathbb{E}\big[\ell_t(I_t)\big] = p_t^\top \ell_t$
3. Observe the loss gradient $\ell_t$

Regret: compete against the best point in the simplex,
$\sum_{t=1}^{T} p_t^\top \ell_t - \min_{q\in\Delta_N} \sum_{t=1}^{T} q^\top \ell_t = \sum_{t=1}^{T} p_t^\top \ell_t - \min_{i=1,\dots,N} \sum_{t=1}^{T} \ell_t(i)$
(a linear function over the simplex is minimized at a vertex)

SLIDE 41

From game theory to machine learning

[Diagram: unlabeled data flows into the classification system, which outputs a guessed label; the opponent supplies the true label.]

  • The opponent's moves $y_t$ are viewed as values or labels assigned to observations $x_t \in \mathbb{R}^d$ (e.g., categories of documents)
  • A repeated game between the player choosing an element $w_t$ of a linear space and the opponent choosing a label $y_t$ for $x_t$
  • Regret with respect to the best element in the linear space

SLIDE 42

Summary

1. My beautiful regret
2. A supposedly fun game I'll play again
3. The joy of convex


SLIDE 45

Online convex optimization [Zinkevich, 2003]

1. Play $w_t$ from a convex and compact subset $S$ of a linear space
2. Observe a convex loss $\ell_t : S \to \mathbb{R}$ and pay $\ell_t(w_t)$
3. Update: $w_t \to w_{t+1} \in S$

Examples:
  • Regression with the square loss: $\ell_t(w) = \big(w^\top x_t - y_t\big)^2$, $y_t \in \mathbb{R}$
  • Classification with the hinge loss: $\ell_t(w) = \big[1 - y_t\, w^\top x_t\big]_+$, $y_t \in \{-1,+1\}$

Regret: $R_T(u) = \sum_{t=1}^{T} \ell_t(w_t) - \sum_{t=1}^{T} \ell_t(u)$, $u \in S$

SLIDE 47

Finding a good online algorithm

Follow the leader: $w_{t+1} = \operatorname{arginf}_{w\in S} \sum_{s=1}^{t} \ell_s(w)$

Regret can be linear due to lack of stability (verified numerically below). Take $S = [-1,+1]$ with
$\ell_1(w) = \frac{w}{2}$, and $\ell_t(w) = -w$ if $t$ is even, $+w$ if $t$ is odd.

Note: $\sum_{s=1}^{t} \ell_s(w) = -\frac{w}{2}$ if $t$ is even, $+\frac{w}{2}$ if $t$ is odd. Hence $\ell_{t+1}(w_{t+1}) = 1$ for all $t = 1, 2, \dots$
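The counterexample is easy to check numerically; a small sketch (checking the endpoints suffices because the cumulative loss is linear in $w$):

```python
# Numerical check of the FTL counterexample: the leader flips sign every
# round, so FTL pays loss ~1 per round while the fixed point u = 0 pays 0.
def ftl_counterexample(T):
    def loss(t, w):                          # losses from the slide, t = 1, 2, ...
        if t == 1:
            return w / 2
        return -w if t % 2 == 0 else w

    total = 0.0
    w = 0.0                                  # arbitrary first play
    for t in range(1, T + 1):
        total += loss(t, w)
        # Follow the leader: the cumulative loss is linear in w, so its
        # minimizer over S = [-1, +1] is always one of the two endpoints.
        w = min((-1.0, 1.0), key=lambda v: sum(loss(s, v) for s in range(1, t + 1)))
    return total

print(ftl_counterexample(100))               # 99.0: linear regret
```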

SLIDE 48

Follow the regularized leader [Shalev-Shwartz, 2007; Abernethy, Hazan and Rakhlin, 2008]

$w_{t+1} = \operatorname{argmin}_{w\in S} \Bigg(\eta \sum_{s=1}^{t} \ell_s(w) + \Phi(w)\Bigg)$

where $\Phi$ is a strongly convex regularizer and $\eta > 0$ is a scale parameter.


SLIDE 51

Convexity, smoothness, and duality

Strong convexity: $\Phi : S \to \mathbb{R}$ is $\beta$-strongly convex w.r.t. a norm $\|\cdot\|$ if for all $u, v \in S$
$\Phi(v) \ge \Phi(u) + \nabla\Phi(u)^\top (v-u) + \frac{\beta}{2}\,\|u-v\|^2$

Smoothness: $\Phi : S \to \mathbb{R}$ is $\alpha$-smooth w.r.t. a norm $\|\cdot\|$ if for all $u, v \in S$
$\Phi(v) \le \Phi(u) + \nabla\Phi(u)^\top (v-u) + \frac{\alpha}{2}\,\|u-v\|^2$

If $\Phi$ is $\beta$-strongly convex w.r.t. $\|\cdot\|_2$, then $\nabla^2\Phi \succeq \beta I$; if $\Phi$ is $\alpha$-smooth w.r.t. $\|\cdot\|_2$, then $\nabla^2\Phi \preceq \alpha I$.


SLIDE 55

Examples

  • Euclidean norm: $\Phi = \frac12\|\cdot\|_2^2$ is 1-strongly convex w.r.t. $\|\cdot\|_2$
  • $p$-norm: $\Phi = \frac12\|\cdot\|_p^2$ is $(p-1)$-strongly convex w.r.t. $\|\cdot\|_p$ (for $1 < p \le 2$)
  • Entropy: $\Phi(p) = \sum_{i=1}^{d} p_i \ln p_i$ is 1-strongly convex w.r.t. $\|\cdot\|_1$ (for $p$ in the probability simplex)
  • Power norm: $\Phi(w) = \frac12\, w^\top A w$ is 1-strongly convex w.r.t. $\|w\| = \sqrt{w^\top A w}$ (for $A$ symmetric and positive definite)

SLIDE 56

Convex duality

Definition: the convex dual of $\Phi$ is $\Phi^*(\theta) = \max_{w\in S} \big(\theta^\top w - \Phi(w)\big)$.

A 1-dimensional example: take a convex $f : \mathbb{R} \to \mathbb{R}$ such that $f(0) = 0$, so that
$f^*(\theta) = \max_{w\in\mathbb{R}} \big(w\theta - f(w)\big)$
The maximizer is the $w_0$ such that $f'(w_0) = \theta$, which gives $f^*(\theta) = w_0 f'(w_0) - f(w_0)$. As $f(0) = 0$, $f^*(\theta)$ is the error in approximating $f(0)$ with a first-order expansion around $f(w_0)$.

SLIDE 57

Convex duality

[Figure illustrating the construction above; thanks to Shai Shalev-Shwartz for the image.]

SLIDE 61

Convexity, smoothness, and duality

Examples:
  • Euclidean norm: $\Phi = \frac12\|\cdot\|_2^2$ and $\Phi^* = \Phi$
  • $p$-norm: $\Phi = \frac12\|\cdot\|_p^2$ and $\Phi^* = \frac12\|\cdot\|_q^2$ where $\frac1p + \frac1q = 1$
  • Entropy: $\Phi(p) = \sum_{i=1}^{d} p_i \ln p_i$ and $\Phi^*(\theta) = \ln\big(e^{\theta_1} + \cdots + e^{\theta_d}\big)$
  • Power norm: $\Phi(w) = \frac12\, w^\top A w$ and $\Phi^*(\theta) = \frac12\,\theta^\top A^{-1}\theta$


SLIDE 63

Some useful properties

If $\Phi : S \to \mathbb{R}$ is $\beta$-strongly convex w.r.t. $\|\cdot\|$, then its convex dual $\Phi^*$ is everywhere differentiable and $\frac1\beta$-smooth w.r.t. $\|\cdot\|_*$ (the dual norm of $\|\cdot\|$), with
$\nabla\Phi^*(\theta) = \operatorname{argmax}_{w\in S}\big(\theta^\top w - \Phi(w)\big)$

Recall follow the regularized leader (FTRL):
$w_{t+1} = \operatorname{argmin}_{w\in S}\Bigg(\eta \sum_{s=1}^{t} \ell_s(w) + \Phi(w)\Bigg)$
where $\Phi$ is a strongly convex regularizer and $\eta > 0$ is a scale parameter.

SLIDE 64

Using the loss gradient

Linearization of convex losses (write $\ell_t$ for the gradient $\nabla\ell_t(w_t)$):
$\ell_t(w_t) - \ell_t(u) \le \nabla\ell_t(w_t)^\top w_t - \nabla\ell_t(w_t)^\top u$

FTRL with linearized losses (set $\theta_{t+1} = -\eta\sum_{s=1}^{t} \ell_s$):
$w_{t+1} = \operatorname{argmin}_{w\in S}\Bigg(\eta \sum_{s=1}^{t} \ell_s^\top w + \Phi(w)\Bigg) = \operatorname{argmax}_{w\in S}\big(\theta_{t+1}^\top w - \Phi(w)\big) = \nabla\Phi^*\big(\theta_{t+1}\big)$

Note: $w_{t+1} \in S$ always holds.

SLIDE 65

The Mirror Descent algorithm [Nemirovsky and Yudin, 1983]

Recall: $w_{t+1} = \nabla\Phi^*\big(\theta_{t+1}\big) = \nabla\Phi^*\Bigg(-\eta \sum_{s=1}^{t} \nabla\ell_s(w_s)\Bigg)$

Online Mirror Descent (FTRL with linearized losses)
Parameters: strongly convex regularizer $\Phi$ with domain $S$, $\eta > 0$
Initialize: $\theta_1 = 0$  // primal parameter
For $t = 1, 2, \dots$

1. Use $w_t = \nabla\Phi^*(\theta_t)$  // dual parameter (via mirror step)
2. Suffer loss $\ell_t(w_t)$
3. Observe the loss gradient $\nabla\ell_t(w_t)$
4. Update $\theta_{t+1} = \theta_t - \eta\,\nabla\ell_t(w_t)$  // gradient step
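A compact sketch of this loop. As one concrete instantiation (chosen here for illustration, not prescribed by the slide), take the entropic regularizer on the simplex, for which the mirror step $\nabla\Phi^*(\theta)$ is the softmax map:

```python
import numpy as np

def omd_entropy(grad_fn, d, T, eta):
    """Online Mirror Descent with the entropic regularizer on the simplex.

    For Phi(p) = sum_i p_i log p_i restricted to the simplex, the mirror map
    is grad Phi*(theta) = softmax(theta), so step 1 below is a closed form.
    grad_fn(t, w) returns the loss gradient at w.
    """
    theta = np.zeros(d)                      # primal parameter, theta_1 = 0
    plays = []
    for t in range(T):
        z = np.exp(theta - theta.max())      # mirror step: w_t = softmax(theta_t)
        w = z / z.sum()
        plays.append(w)
        g = grad_fn(t, w)                    # observe the loss gradient
        theta = theta - eta * g              # gradient step in the dual
    return plays

# Example: linear losses l_t(w) = w . c_t, so the gradient is c_t.
costs = np.random.default_rng(0).random((500, 4))
omd_entropy(lambda t, w: costs[t], d=4, T=500, eta=0.1)
```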


SLIDE 67

An equivalent formulation

Under some assumptions on the regularizer $\Phi$, OMD can be equivalently written in terms of projected gradient descent.

Online Mirror Descent (alternative version)
Parameters: strongly convex regularizer $\Phi$ and learning rate $\eta > 0$
Initialize: $z_1 = \nabla\Phi^*(0)$ and $w_1 = \operatorname{argmin}_{w\in S} D_\Phi(w, z_1)$
For $t = 1, 2, \dots$

1. Use $w_t$ and suffer loss $\ell_t(w_t)$
2. Observe the loss gradient $\nabla\ell_t(w_t)$
3. Update $z_{t+1} = \nabla\Phi^*\big(\nabla\Phi(z_t) - \eta\,\nabla\ell_t(w_t)\big)$  // gradient step
4. $w_{t+1} = \operatorname{argmin}_{w\in S} D_\Phi(w, z_{t+1})$  // projection step

$D_\Phi$ is the Bregman divergence induced by $\Phi$.


SLIDE 69

Some examples

Online Gradient Descent (OGD) [Zinkevich, 2003; Gentile, 2003]
$\Phi(w) = \frac12\|w\|^2$; $p$-norm version: $\Phi(w) = \frac12\|w\|_p^2$
Update: $w' = w_t - \eta\,\nabla\ell_t(w_t)$, then $w_{t+1} = \operatorname{arginf}_{w\in S}\|w - w'\|_2$ (a code sketch follows below)

Exponentiated Gradient (EG) [Kivinen and Warmuth, 1997]
$\Phi(p) = \sum_{i=1}^{d} p_i \ln p_i$, $p \in S \equiv$ the simplex
$p_{t+1,i} = \dfrac{p_{t,i}\, e^{-\eta\,\nabla\ell_t(p_t)_i}}{\sum_{j=1}^{d} p_{t,j}\, e^{-\eta\,\nabla\ell_t(p_t)_j}}$
Note: when the losses are linear, this is Hedge.
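For comparison with the EG instantiation of OMD above, here is a sketch of OGD with $S$ an L2 ball of radius $B$, where the projection step has a closed form (rescaling); $B$ and the demo losses are illustrative.

```python
import numpy as np

def ogd_ball(grad_fn, d, T, eta, B):
    """Online Gradient Descent over S = {w : ||w||_2 <= B}.

    Update: w' = w_t - eta * grad, then project w' back onto the ball
    (the Euclidean projection just rescales vectors that are too long).
    """
    w = np.zeros(d)
    plays = []
    for t in range(T):
        plays.append(w.copy())
        g = grad_fn(t, w)
        w = w - eta * g                      # gradient step
        norm = np.linalg.norm(w)
        if norm > B:                         # projection onto the L2 ball
            w = w * (B / norm)
    return plays

# Example: square loss l_t(w) = (w.x_t - y_t)^2, gradient 2*(w.x_t - y_t)*x_t.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 3)), rng.normal(size=200)
ogd_ball(lambda t, w: 2 * (w @ X[t] - y[t]) * X[t], d=3, T=200, eta=0.05, B=1.0)
```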

SLIDE 70

Regret analysis

Regret bound [Kakade, Shalev-Shwartz and Tewari, 2012]:
$R_T(u) \le \frac{\Phi(u) - \min_{w\in S}\Phi(w)}{\eta} + \frac{\eta}{2\beta} \sum_{t=1}^{T} \|\nabla\ell_t(w_t)\|_*^2$
for all $u \in S$, where $\ell_1, \ell_2, \dots$ are arbitrary convex losses.

$R_T(u) \le GD\sqrt{T}$ for all $u \in S$ when $\eta$ is tuned w.r.t.
$\sup_{w\in S} \|\nabla\ell_t(w)\|_* \le G$ and $\sup_{u,w\in S}\big(\Phi(u) - \Phi(w)\big) \le D$

Boundedness of the gradients of $\ell_t$ w.r.t. $\|\cdot\|_*$ is equivalent to Lipschitzness of $\ell_t$ w.r.t. $\|\cdot\|$. The regret bound is optimal for general convex losses $\ell_t$.

SLIDE 73

Analysis relies on smoothness of Φ*

$\Phi^*(\theta_{t+1}) - \Phi^*(\theta_t) \le \underbrace{\nabla\Phi^*(\theta_t)}_{w_t}{}^{\!\top} \underbrace{(\theta_{t+1} - \theta_t)}_{-\eta\nabla\ell_t(w_t)} + \frac{1}{2\beta}\,\|\theta_{t+1} - \theta_t\|_*^2$

$\sum_{t=1}^{T}\big(-\eta\, u^\top\nabla\ell_t(w_t)\big) - \Phi(u) = u^\top\theta_{T+1} - \Phi(u) \le \Phi^*\big(\theta_{T+1}\big)$  (Fenchel–Young inequality)
$= \sum_{t=1}^{T}\Big(\Phi^*\big(\theta_{t+1}\big) - \Phi^*\big(\theta_t\big)\Big) + \Phi^*\big(\theta_1\big) \le \sum_{t=1}^{T}\Big(-\eta\, w_t^\top\nabla\ell_t(w_t) + \frac{\eta^2}{2\beta}\,\|\nabla\ell_t(w_t)\|_*^2\Big) + \Phi^*(0)$

$\Phi^*(0) = \max_{w\in S}\big(w^\top 0 - \Phi(w)\big) = -\min_{w\in S} \Phi(w)$

SLIDE 76

Some examples

Losses of the form $\ell_t(w) = \ell_t\big(w^\top x_t\big)$ with $\max_t |\ell_t'| \le L$ and $\max_t \|x_t\|_p \le X_p$.

Bounds for OGD with convex losses:
$R_T(u) \le B L X_2 \sqrt{T} = O\big(L\sqrt{dT}\big)$ for all $u$ such that $\|u\|_2 \le B$

Bounds logarithmic in the dimension: for EG run in the simplex, $S = \Delta_d$,
$R_T(q) \le L X_\infty \sqrt{(\ln d)\,T} = O\big(L\sqrt{(\ln d)\,T}\big)$, $q \in \Delta_d$
The same bound holds for the $p$-norm regularizer with $p = \frac{\ln d}{\ln d - 1}$. If the losses are linear with coefficients in $[0,1]$, we recover the bound for Hedge.


SLIDE 78

Exploiting curvature: minimization of the SVM objective

Training set $(x_1,y_1),\dots,(x_m,y_m) \in \mathbb{R}^d \times \{-1,+1\}$. SVM objective over $\mathbb{R}^d$:
$F(w) = \frac1m \sum_{t=1}^{m} \underbrace{\big[1 - y_t\, w^\top x_t\big]_+}_{\text{hinge loss } h_t(w)} + \frac{\lambda}{2}\,\|w\|^2$

Rewrite $F(w) = \frac1m \sum_{t=1}^{m} \ell_t(w)$ where $\ell_t(w) = h_t(w) + \frac{\lambda}{2}\|w\|^2$. Each loss $\ell_t$ is $\lambda$-strongly convex.

The Pegasos algorithm: run OGD on a random sequence of $T$ training examples. Then
$\mathbb{E}\Bigg[F\Bigg(\frac1T \sum_{t=1}^{T} w_t\Bigg)\Bigg] \le \min_{w\in\mathbb{R}^d} F(w) + \frac{G^2}{2\lambda}\,\frac{\ln T + 1}{T}$
$O(\ln T)$ rates hold for any sequence of strongly convex losses.
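A Pegasos-style sketch. The $1/(\lambda t)$ step size is the usual choice for OGD on $\lambda$-strongly convex losses and is assumed here rather than stated on the slide; the averaged iterate matches the quantity in the bound.

```python
import numpy as np

def pegasos(X, y, lam, T, seed=0):
    """Pegasos sketch: OGD on l_t(w) = hinge(w) + (lam/2)||w||^2, step 1/(lam*t).

    Samples one training example per round and returns the averaged iterate.
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    w_sum = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(m)                  # random training example
        eta_t = 1.0 / (lam * t)
        if y[i] * (w @ X[i]) < 1:            # subgradient of the hinge part
            grad = lam * w - y[i] * X[i]
        else:
            grad = lam * w
        w = w - eta_t * grad
        w_sum += w
    return w_sum / T                         # averaged iterate, as in the bound

# Example on synthetic linearly separable data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = np.sign(X @ np.ones(5) + 1e-9)
w_bar = pegasos(X, y, lam=0.1, T=5000)
```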

SLIDE 79

Exp-concave losses

Exp-concavity (strong convexity along the gradient direction): a convex $\ell : S \to \mathbb{R}$ is $\alpha$-exp-concave when $g(w) = e^{-\alpha\,\ell(w)}$ is concave. For twice-differentiable losses this is
$\nabla^2\ell(w) \succeq \alpha\,\nabla\ell(w)\,\nabla\ell(w)^\top$ for all $w \in S$
Example: $\ell_t(w) = -\ln\big(w^\top x_t\big)$ is exp-concave.


SLIDE 82

Online Newton Step [Hazan, Agarwal and Kale, 2007]

Update: $w' = w_t - A_t^{-1}\,\nabla\ell_t(w_t)$, then $w_{t+1} = \operatorname{argmin}_{w\in S} \|w - w'\|_{A_t}$
where $A_t = \varepsilon I + \sum_{s=1}^{t} \nabla\ell_s(w_s)\,\nabla\ell_s(w_s)^\top$. Note: not an instance of OMD.

Logarithmic regret bound for exp-concave losses:
$R_T(u) \le 5d\,\Big(\frac1\alpha + GD\Big)\ln(T+1)$, $u \in S$

Extension of ONS to convex losses [Luo, Agarwal, C-B, Langford, 2016]: for $\ell_t(w) = \ell_t\big(w^\top x_t\big)$ with $\max_t|\ell_t'| \le L$,
$R_T(u) \le O\big(CL\sqrt{dT}\big)$ for all $u$ s.t. $\big|u^\top x_t\big| \le C$
Invariance to linear transformations of the data.
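A sketch of the ONS update. The generalized projection onto $S$ in the $\|\cdot\|_{A_t}$ norm is a small quadratic program with no closed form in general, so this sketch omits it (i.e., it assumes the iterates stay inside $S$); the demo loss is the slide's exp-concave example.

```python
import numpy as np

def online_newton_step(grad_fn, d, T, eps=1.0, w0=None):
    """Online Newton Step sketch (projection omitted).

    Maintains A_t = eps*I + sum_s g_s g_s^T and sets w' = w_t - A_t^{-1} g_t.
    The generalized projection argmin_{w in S} ||w - w'||_{A_t} is skipped,
    i.e. this sketch assumes the iterates stay inside S.
    """
    w = np.zeros(d) if w0 is None else np.asarray(w0, dtype=float).copy()
    A = eps * np.eye(d)
    plays = []
    for t in range(T):
        plays.append(w.copy())
        g = grad_fn(t, w)
        A += np.outer(g, g)                  # rank-one update of A_t
        w = w - np.linalg.solve(A, g)        # Newton-style step
    return plays

# Example: exp-concave losses l_t(w) = -log(w . x_t), gradient -x_t/(w . x_t),
# started from an interior point so that w . x_t stays positive.
rng = np.random.default_rng(0)
X = rng.random((300, 3)) + 0.1
grad = lambda t, w: -X[t] / max(w @ X[t], 1e-3)   # guard against w . x <= 0
online_newton_step(grad, d=3, T=300, w0=np.ones(3) / 3)
```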

SLIDE 83

Online Ridge Regression [Vovk, 2001; Azoury and Warmuth, 2001]

Logarithmic regret for the square loss $\ell_t(u) = \big(u^\top x_t - y_t\big)^2$, with $Y = \max_{t=1,\dots,T}|y_t|$ and $X = \max_{t=1,\dots,T}\|x_t\|$.

OMD with the adaptive regularizer $\Phi_t(w) = \frac12\|w\|_{A_t}^2$, where
$A_t = I + \sum_{s=1}^{t} x_s x_s^\top$ and $\theta_t = -\sum_{s=1}^{t} y_s x_s$

Regret bound (oracle inequality):
$\sum_{t=1}^{T} \ell_t(w_t) \le \inf_{u\in\mathbb{R}^d}\Bigg(\sum_{t=1}^{T} \ell_t(u) + \|u\|^2\Bigg) + dY^2 \ln\Big(1 + \frac{TX^2}{d}\Big)$

Parameterless; scale-free (unbounded comparison set).
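A sketch of the resulting predictor in its familiar ridge form $w_t = A_{t-1}^{-1}\sum_{s<t} y_s x_s$ (the Vovk and Azoury–Warmuth forecasters differ in how $x_t$ enters $A_t$; this is the plain online ridge variant, assumed here for illustration):

```python
import numpy as np

def online_ridge(X, y):
    """Online ridge regression: predict with w_t = A^{-1} b before seeing y_t.

    A = I + sum_s x_s x_s^T and b = sum_s y_s x_s over the examples seen so
    far. Returns the sequence of predictions w_t . x_t.
    """
    T, d = X.shape
    A = np.eye(d)
    b = np.zeros(d)
    preds = []
    for t in range(T):
        w = np.linalg.solve(A, b)            # current ridge solution
        preds.append(w @ X[t])               # predict on x_t
        A += np.outer(X[t], X[t])            # then update with (x_t, y_t)
        b += y[t] * X[t]
    return np.array(preds)

# Example: noisy linear data.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=400)
online_ridge(X, y)
```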

SLIDE 84

Scale-free algorithm for convex losses [Orabona and Pál, 2015]

OMD with the adaptive regularizer
$\Phi_t(w) = \Phi_0(w)\,\sqrt{\sum_{s=1}^{t-1} \|\nabla\ell_s(w_s)\|_*^2}$
where $\Phi_0$ is a $\beta$-strongly convex base regularizer.

Regret bound (oracle inequality) for convex loss functions $\ell_t$:
$\sum_{t=1}^{T} \ell_t(w_t) \le \inf_{u\in\mathbb{R}^d}\Bigg(\sum_{t=1}^{T} \ell_t(u) + \Big(\Phi_0(u) + \frac1\beta\Big)\max_t \|\nabla\ell_t(w_t)\|_*\,\sqrt{T}\Bigg)$

SLIDE 85

Regularization via stochastic smoothing

$w_{t+1} = \mathbb{E}_Z\Bigg[\operatorname{argmin}_{w\in S} \Bigg(\eta \sum_{s=1}^{t} \nabla\ell_s(w_s) + Z\Bigg)^{\!\top} w\Bigg]$

  • The distribution of $Z$ must be "stable" (small variance and small average sensitivity)
  • Regret bound similar to FTRL/OMD
  • For some choices of $Z$, FPL becomes equivalent to OMD [Abernethy, Lee, Sinha and Tewari, 2014]
  • Linear losses: the Follow the Perturbed Leader (FPL) algorithm [Kalai and Vempala, 2005], sketched below
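For linear losses over $N$ actions, FPL is one line per round: perturb the cumulative losses and play the argmin. A sketch with fresh exponential perturbations each round (a standard choice for FPL, assumed here; the tuning of $\eta$ is likewise illustrative):

```python
import numpy as np

def fpl(losses, eta, seed=0):
    """Follow the Perturbed Leader on a T x N array of linear losses.

    Each round, draw a fresh perturbation Z (exponential, scale 1/eta) and
    play the action minimizing cumulative loss minus perturbation.
    """
    T, N = losses.shape
    rng = np.random.default_rng(seed)
    cum = np.zeros(N)
    total = 0.0
    for t in range(T):
        z = rng.exponential(scale=1.0 / eta, size=N)  # fresh perturbation
        i = int(np.argmin(cum - z))                   # perturbed leader
        total += losses[t, i]
        cum += losses[t]
    return total

# Example with eta on the order of sqrt(log(N)/T), as for Hedge-type bounds.
T, N = 2000, 8
losses = np.random.default_rng(1).random((T, N))
fpl(losses, eta=np.sqrt(np.log(N) / T))
```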

SLIDE 86

Shifting regret [Herbster and Warmuth, 2001]

Nonstationarity: if the data source is not fitted well by any model in the class, then comparing to the best fixed model $u \in S$ is trivial. Compare instead to the best sequence $u_1, u_2, \dots \in S$ of models.

Shifting regret for OMD [Zinkevich, 2003]:
$\underbrace{\sum_{t=1}^{T} \ell_t(w_t)}_{\text{cumulative loss}} \le \inf_{u_1,\dots,u_T\in S}\Bigg(\underbrace{\sum_{t=1}^{T} \ell_t(u_t)}_{\text{model fit}} + \underbrace{\sum_{t=1}^{T} \|u_t - u_{t-1}\|}_{\text{shifting model cost}}\Bigg) + \operatorname{diam}(S) + \cdots$


SLIDE 89

Strongly adaptive regret [Daniely, Gonen, Shalev-Shwartz, 2015]

Definition: for all intervals $I = \{r, \dots, s\}$ with $1 \le r < s \le T$,
$R_{T,I}(u) = \sum_{t\in I} \ell_t(w_t) - \sum_{t\in I} \ell_t(u)$

Regret bound for strongly adaptive OGD:
$R_{T,I}(u) \le \big(BLX_2 + \ln(T+1)\big)\sqrt{|I|}$ for all $u$ such that $\|u\|_2 \le B$

Remarks: a generic black-box reduction, applicable to any online learning algorithm; it runs a logarithmic number of instances of the base learner.


SLIDE 94

Online bandit convex optimization

1. Play $w_t$ from a convex and compact subset $S$ of a linear space
2. Observe $\ell_t(w_t)$, where $\ell_t : S \to \mathbb{R}$ is an unobserved convex loss
3. Update: $w_t \to w_{t+1} \in S$

Regret: $R_T(u) = \sum_{t=1}^{T} \ell_t(w_t) - \sum_{t=1}^{T} \ell_t(u)$, $u \in S$

Results:
  • Linear losses: $\Omega\big(d\sqrt{T}\big)$ [Dani, Hayes, and Kakade, 2008]
  • Linear losses: $O\big(d\sqrt{T}\big)$ [Bubeck, C-B, and Kakade, 2012]
  • Strongly convex and smooth losses: $O\big(d^{3/2}\sqrt{T}\big)$ [Hazan and Levy, 2014]
  • Convex losses: $O\big(d^{9.5}\sqrt{T}\big)$ [Bubeck, Eldan, and Lee, 2016]