The Online Approach to Machine Learning
Nicolò Cesa-Bianchi, PowerPoint PPT Presentation
  1. The Online Approach to Machine Learning
     Nicolò Cesa-Bianchi
     Università degli Studi di Milano
     N. Cesa-Bianchi (UNIMI), Online Approach to ML, 1 / 53

  2. Summary
     1. My beautiful regret
     2. A supposedly fun game I'll play again
     3. A graphic novel
     4. The joy of convex
     5. The joy of convex (without the gradient)


  4. Machine learning
     Classification / regression tasks
     Predictive models h mapping data instances X to labels Y (e.g., binary classifier)
     Training data S_T = ( (X_1, Y_1), ..., (X_T, Y_T) ) (e.g., email messages with spam vs. nonspam annotations)
     Learning algorithm A (e.g., Support Vector Machine) maps training data S_T to model h = A(S_T)
     Evaluate the risk of the trained model h with respect to a given loss function

  5. Two notions of risk
     View data as a statistical sample: statistical risk
       E[ loss( A(S_T), (X, Y) ) ]   where A(S_T) is the trained model and (X, Y) the test example
     Training set S_T = ( (X_1, Y_1), ..., (X_T, Y_T) ) and test example (X, Y) drawn i.i.d. from the same unknown and fixed distribution
     View data as an arbitrary sequence: sequential risk
       sum_{t=1}^{T} loss( A(S_{t-1}), (X_t, Y_t) )   where A(S_{t-1}) is the trained model and (X_t, Y_t) the test example
     Sequence of models trained on growing prefixes S_t = ( (X_1, Y_1), ..., (X_t, Y_t) ) of the data sequence
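
The two risks can be contrasted on a toy problem. A minimal sketch, assuming an illustrative "learning algorithm" A (predict the mean label seen so far) and the squared loss; these choices are not from the talk:

```python
import random

def A(prefix):
    """Train on a prefix S_t: return a model predicting the mean label."""
    ys = [y for _, y in prefix]
    m = sum(ys) / len(ys) if ys else 0.5
    return lambda x: m

def loss(h, example):
    x, y = example
    return (h(x) - y) ** 2

rng = random.Random(0)
data = [(None, rng.random()) for _ in range(1000)]

# Sequential risk: cumulative loss of models trained on growing prefixes
seq_risk = sum(loss(A(data[:t]), data[t]) for t in range(len(data)))

# Statistical risk: expected loss of the final model A(S_T) on fresh
# i.i.d. test examples, estimated here on a held-out sample
h = A(data)
test = [(None, rng.random()) for _ in range(1000)]
stat_risk = sum(loss(h, ex) for ex in test) / len(test)
```

The sequential risk retrains on every prefix, matching the "arbitrary sequence" view; the statistical risk evaluates one final model under the i.i.d. assumption.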

  6. Regrets, I had a few
     Learning algorithm A maps datasets to models in a given class H
     Variance error in statistical learning:
       E[ loss( A(S_T), (X, Y) ) ] - inf_{h in H} E[ loss( h, (X, Y) ) ]
     compare to expected loss of best model in the class
     Regret in online learning:
       sum_{t=1}^{T} loss( A(S_{t-1}), (X_t, Y_t) ) - inf_{h in H} sum_{t=1}^{T} loss( h, (X_t, Y_t) )
     compare to cumulative loss of best model in the class

  7. Incremental model update
     A natural blueprint for online learning algorithms
     For t = 1, 2, ...
       1. Apply current model h_{t-1} to next data element (X_t, Y_t)
       2. Update current model: h_{t-1} -> h_t in H
     Goal: control regret
       sum_{t=1}^{T} loss( h_{t-1}, (X_t, Y_t) ) - inf_{h in H} sum_{t=1}^{T} loss( h, (X_t, Y_t) )
     View this as a repeated game between a player generating predictors h_t in H and an opponent generating data (X_t, Y_t)
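
The predict-then-update blueprint can be sketched on one concrete instance. Here the model class H is "predict a constant", the loss is squared error, and the update is a running mean; the data and the comparison grid are made up for the example:

```python
def run_online(data):
    """Apply h_{t-1} to the next point, then update h_{t-1} -> h_t."""
    h, n, cum_loss = 0.0, 0, 0.0
    for y in data:
        cum_loss += (h - y) ** 2   # loss of current model h_{t-1} on Y_t
        n += 1
        h += (y - h) / n           # incremental update: running mean
    return h, cum_loss

data = [0.0, 1.0, 1.0, 0.0, 1.0]
h, cum = run_online(data)

# regret against the best fixed constant from a small comparison grid
best = min(sum((c - y) ** 2 for y in data) for c in [0.0, 0.5, 1.0])
regret = cum - best
```

The final model h is the mean of the whole sequence, and the cumulative loss exceeds that of the best fixed constant in hindsight, which is exactly the quantity the regret measures.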


  9. Theory of repeated games
     James Hannan (1922-2010), David Blackwell (1919-2010)
     Learning to play a game (1956)
     Play a game repeatedly against a possibly suboptimal opponent

  10. Zero-sum 2-person games played more than once
     N x M known loss matrix with entries ℓ(i, y)
     Row player (player) has N actions; column player (opponent) has M actions
     For each game round t = 1, 2, ...
       Player chooses action i_t and opponent chooses action y_t
       The player suffers loss ℓ(i_t, y_t) (= gain of opponent)
     Player can learn from opponent's history of past choices y_1, ..., y_{t-1}
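
One classical way to "learn from the opponent's history", not spelled out on the slide, is Hannan-style fictitious play: best-respond to the empirical frequencies of the opponent's past moves. The matrix, history, and names below are made up for the sketch:

```python
L = [[0.0, 1.0],
     [1.0, 0.0]]           # 2x2 loss matrix ℓ(i, y) for the row player

history = [1, 1, 0, 1, 1]  # opponent's moves y_1, ..., y_5
counts = [0, 0]            # opponent's action counts observed so far
total = 0.0
for y in history:
    n = sum(counts)
    # expected loss of each row action against the empirical distribution
    avg = [sum(L[i][j] * counts[j] for j in range(2)) / n if n else 0.5
           for i in range(2)]
    i = min(range(2), key=lambda r: avg[r])   # best response to history
    total += L[i][y]                          # suffer loss ℓ(i_t, y_t)
    counts[y] += 1

# cumulative loss of the best fixed action in hindsight, for comparison
best_fixed = min(sum(L[i][y] for y in history) for i in range(2))
```

Against this history the learner loses 2 while the best fixed action loses 1; the gap is the regret notion the later slides study.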

  11. Prediction with expert advice
     [table: losses ℓ_1(i), ℓ_2(i), ... for each action i = 1, ..., N over rounds t = 1, 2, ...]
     Volodya Vovk, Manfred Warmuth
     Opponent's moves y_1, y_2, ... define a sequential prediction problem with a time-varying loss function ℓ(i_t, y_t) = ℓ_t(i_t)

  14. Playing the experts game
     N actions
     For t = 1, 2, ...
       1. Loss ℓ_t(i) in [0, 1] is assigned to every action i = 1, ..., N (hidden from the player)
       2. Player picks an action I_t (possibly using randomization) and incurs loss ℓ_t(I_t)
       3. Player gets feedback information: ℓ_t(1), ..., ℓ_t(N)

  15. Oblivious opponents
     Losses ℓ_t(1), ..., ℓ_t(N) for all t = 1, 2, ... are fixed beforehand, and unknown to the (randomized) player
     Oblivious regret minimization: want
       R_T := E[ sum_{t=1}^{T} ℓ_t(I_t) ] - min_{i=1,...,N} sum_{t=1}^{T} ℓ_t(i) = o(T)

  16. Bounds on regret [Experts' paper, 1997]
     Lower bound using random losses: ℓ_t(i) -> L_t(i) in {0, 1}, independent random coin flips
     For any player strategy, E[ sum_{t=1}^{T} L_t(I_t) ] = T/2
     Then the expected regret is
       E[ max_{i=1,...,N} sum_{t=1}^{T} ( 1/2 - L_t(i) ) ] = (1 - o(1)) sqrt( (T ln N) / 2 )

  17. Exponentially weighted forecaster
     At time t pick action I_t = i with probability proportional to
       exp( -eta * sum_{s=1}^{t-1} ℓ_s(i) )
     the sum in the exponent is the total loss of action i up to now
     Regret bound [Experts' paper, 1997]:
       If eta = sqrt( (8 ln N) / T ) then R_T <= sqrt( (T ln N) / 2 )
     Matching lower bound including constants
     Dynamic choice eta_t = sqrt( (8 ln N) / t ) only loses small constants
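
A sketch of the forecaster with this tuning; the loss sequence below is an arbitrary made-up example. The forecaster's expected loss is computed exactly from its action distribution, so no sampling is needed:

```python
import math
import random

def ewf_expected_loss(losses, eta):
    """losses: T lists of N losses in [0, 1].
    Returns (expected cumulative loss, loss of best fixed action)."""
    N = len(losses[0])
    cum = [0.0] * N                            # total loss of each action
    exp_loss = 0.0
    for lt in losses:
        w = [math.exp(-eta * c) for c in cum]  # weights exp(-eta * L_{t-1}(i))
        Z = sum(w)
        exp_loss += sum(wi / Z * li for wi, li in zip(w, lt))
        cum = [c + li for c, li in zip(cum, lt)]
    return exp_loss, min(cum)

rng = random.Random(0)
T, N = 1000, 10
losses = [[rng.random() for _ in range(N)] for _ in range(T)]

eta = math.sqrt(8 * math.log(N) / T)
exp_loss, best = ewf_expected_loss(losses, eta)
regret = exp_loss - best
bound = math.sqrt(T * math.log(N) / 2)   # R_T <= sqrt((T ln N) / 2)
```

The bound holds for any loss sequence in [0, 1], so `regret <= bound` is guaranteed here, not just likely.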

  21. The bandit problem: playing an unknown game
     N actions
     For t = 1, 2, ...
       1. Loss ℓ_t(i) in [0, 1] is assigned to every action i = 1, ..., N (hidden from the player)
       2. Player picks an action I_t (possibly using randomization) and incurs loss ℓ_t(I_t)
       3. Player gets feedback information: only ℓ_t(I_t) is revealed
     Many applications: ad placement, dynamic content adaptation, routing, online auctions
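
The slide defines only the protocol. A standard algorithm for this feedback model (not given on the slide) is Exp3, which plugs importance-weighted loss estimates into the exponential-weights update; the loss model and tuning below are illustrative assumptions:

```python
import math
import random

def exp3(loss_fn, T, N, eta, rng):
    cum_est = [0.0] * N                 # importance-weighted loss estimates
    total = 0.0
    for t in range(T):
        w = [math.exp(-eta * c) for c in cum_est]
        Z = sum(w)
        p = [wi / Z for wi in w]
        i = rng.choices(range(N), weights=p)[0]   # sample I_t ~ p_t
        loss = loss_fn(t, i)            # only ℓ_t(I_t) is revealed
        total += loss
        cum_est[i] += loss / p[i]       # unbiased estimate of ℓ_t(i)
    return total

rng = random.Random(1)
T, N = 2000, 5

def loss_fn(t, i):                      # toy losses: action 0 is best
    return rng.random() * (0.5 if i == 0 else 1.0)

total = exp3(loss_fn, T, N, eta=math.sqrt(2 * math.log(N) / (T * N)), rng=rng)
avg_loss = total / T
```

Dividing the observed loss by the probability of the played action keeps each estimate unbiased, which is what lets the full-information analysis carry over to bandit feedback.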


  23. Relationships between actions [Mannor and Shamir, 2011]
     [figures: an undirected and a directed feedback graph over the actions]

  24-26. A graph of relationships over actions
     [figure, shown in three builds: ten actions connected by a graph; after the player's choice, the losses of the chosen action and of its graph neighbors are revealed]

  27. Recovering expert and bandit settings
     Experts: clique (every action's loss is observed)
     Bandits: empty graph (only the played action's loss is observed)
     [figures: the same ten actions with a clique and with an empty feedback graph]
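
A small sketch of the graph feedback model (names illustrative): after playing I_t, the player observes ℓ_t(j) for the played action and for every neighbor j of I_t, so the clique recovers the experts feedback and the empty graph recovers the bandit feedback:

```python
def observed(graph, i, losses):
    """graph: action -> set of neighbors.  Returns the revealed losses."""
    return {j: losses[j] for j in graph[i] | {i}}

N = 4
losses = [0.7, 0.3, 0.2, 0.4]
clique = {i: set(range(N)) - {i} for i in range(N)}  # experts: see all
empty = {i: set() for i in range(N)}                 # bandits: own loss only

full = observed(clique, 1, losses)   # all N losses revealed
own = observed(empty, 1, losses)     # only the played action's loss
```

Intermediate graphs interpolate between the two settings, which is the point of the Mannor-Shamir model on the previous slides.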
