
On Compiling (Online) Combinatorial Learning Problems, by Frédéric Koriche



  1. On Compiling (Online) Combinatorial Learning Problems
     Frédéric Koriche
     CRIL - CNRS UMR 8188, Univ. Artois
     koriche@cril.fr
     Dagstuhl'17, New Trends in Knowledge Compilation

  2. Outline
     1. Online Learning
     2. The Convex Case
     3. The Combinatorial Case
     4. Compiling Hedge

  3. Online Learning
     Online learning is a zero-sum repeated game between a learning algorithm and its environment. The components of the game are:
     - A class $\mathcal{H}$ of hypotheses (the learner's moves)
     - A space $\mathcal{Z}$ of instances (the environment's moves)
     - A loss function $\ell : \mathcal{H} \times \mathcal{Z} \to \mathbb{R}$ (the game "matrix")

  13. Online Learning
      [Diagram: over rounds $1, \dots, T$ the learner plays $h_1, h_2, h_3, \dots, h_T$, the environment plays $z_1, z_2, z_3, \dots, z_T$, and the losses $\ell(h_1, z_1) + \ell(h_2, z_2) + \ell(h_3, z_3) + \dots + \ell(h_T, z_T)$ accumulate.]
      During each round $t$ of the game,
      - The learner plays a hypothesis $h_t \in \mathcal{H}$
      - Simultaneously, the environment plays an instance $z_t \in \mathcal{Z}$
      - Then $z_t$ is revealed to the learner, which incurs the loss $\ell(h_t, z_t)$
      The goal for the learner is to minimize its cumulative loss over the course of the game.
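The round structure above can be sketched as a short loop. This is only an illustration under stated assumptions: the names `play_game`, `choose`, and `running_mean` are hypothetical, the learner is modelled as a function of the past instances, and the environment is a fixed sequence rather than an adaptive opponent.

```python
from typing import Callable, Iterable, List, TypeVar

H = TypeVar("H")  # hypothesis type
Z = TypeVar("Z")  # instance type


def play_game(
    choose: Callable[[List[Z]], H],
    instances: Iterable[Z],
    loss: Callable[[H, Z], float],
) -> float:
    """Run the online protocol and return the learner's cumulative loss.

    `choose` maps the history z_1..z_{t-1} to h_t, so the learner
    commits to h_t before z_t is revealed.
    """
    history: List[Z] = []
    cumulative = 0.0
    for z_t in instances:
        h_t = choose(history)         # learner moves without seeing z_t
        cumulative += loss(h_t, z_t)  # z_t is revealed, the loss is incurred
        history.append(z_t)
    return cumulative


# Toy learner (hypothetical): predict the mean of the past values.
def running_mean(history: List[float]) -> float:
    return sum(history) / len(history) if history else 0.0


def squared_loss(h: float, z: float) -> float:
    return (h - z) ** 2


total = play_game(running_mean, [1.0, 1.0, 1.0], squared_loss)
print(total)  # 1.0: only the first round, played blind, costs anything
```

Passing the history to `choose` keeps the sketch stateless; a learner with internal state would work equally well.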

  16. Online Learning
      Example: Online Linear Classification
      On each round $t$,
      - The learner plays a separating hyperplane $h_t = \operatorname{sign} \langle w_t, \cdot \rangle$
      - Simultaneously, the environment plays a labeled example $z_t = (x_t, y_t)$
      - Then, the learner incurs the hinge loss $\ell(h_t, z_t) = \max(0, 1 - y_t \langle x_t, w_t \rangle)$
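As a small illustration of the hinge loss above, the following sketch evaluates it on a few hand-picked points. The helper names `dot`, `predict`, and `hinge_loss` are assumptions made for this example, as is the convention that the sign of zero is $+1$.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    """h(x) = sign<w, x>, with sign(0) taken as +1 (an assumed convention)."""
    return 1 if dot(w, x) >= 0 else -1


def hinge_loss(w: Sequence[float], x: Sequence[float], y: int) -> float:
    """max(0, 1 - y <x, w>) for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * dot(w, x))


w = [1.0, -1.0]
print(hinge_loss(w, [2.0, 0.0], +1))  # margin 2, loss 0.0
print(hinge_loss(w, [0.5, 0.0], +1))  # margin 0.5, loss 0.5
print(hinge_loss(w, [2.0, 0.0], -1))  # margin -2, loss 3.0
```

The second point is classified correctly yet still pays a loss, because its margin is below 1.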

  19. Online Learning
      [Figure: a probability distribution $h_t$ over the instances $z_1, \dots, z_8$, on a vertical axis from 0 to 1, with the played instance $z_t$ marked.]
      Example: Online Density Estimation
      On each round $t$,
      - The learner plays a probability distribution $h_t$ over $\mathcal{Z}$
      - Simultaneously, the environment plays an instance $z_t$
      - Then, the learner incurs the log loss $\ell(h_t, z_t) = - \ln h_t(z_t)$
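The log loss can be sketched for a finite instance space as follows. Representing $h_t$ as a dictionary from instances to probabilities is an assumption made for this example, and `log_loss` is a hypothetical helper name.

```python
import math
from typing import Dict, Hashable


def log_loss(h: Dict[Hashable, float], z: Hashable) -> float:
    """Return -ln h(z); infinite when h assigns zero probability to z."""
    p = h.get(z, 0.0)
    return math.inf if p == 0.0 else -math.log(p)


# Uniform distribution over eight instances, labelled 1..8 here.
uniform = {z: 1 / 8 for z in range(1, 9)}
print(log_loss(uniform, 3))  # equals ln 8, whichever instance is played
```

A distribution that puts zero mass on the played instance incurs an unbounded loss, which is why learners for this game typically keep every probability strictly positive.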

  22. Online Learning
      [Diagram: the same game over rounds $1, \dots, T$, with the environment drawing $z_1, z_2, z_3, \dots, z_T$ from distributions $D_1, D_2, D_3, \dots, D_T$.]
      Online learning applies to a wide range of tasks, from statistical learning, where the environment is an oblivious player modelled by a fixed probability distribution $D$ over $\mathcal{Z}$, to adversarial learning, where the environment is an active player who changes its distribution at each iteration in response to the learner's moves.

  23. Online Learning
      In a nutshell, online learning is particularly suited to:
      - Adaptive environments, where the data distribution can change over time
      - Streaming applications, where not all the data is available in advance
      - Large-scale datasets, by processing only one instance at a time

  26. Online Learning
      The performance of an online learning algorithm $A$ is measured according to two metrics:
      Minimax Regret
      Defined by the maximum, over every sequence $z_{1:T} = (z_1, \dots, z_T) \in \mathcal{Z}^T$, of the cumulative relative loss between $A$ and the best hypothesis in $\mathcal{H}$:
      $$\max_{z_{1:T} \in \mathcal{Z}^T} \left[ \sum_{t=1}^{T} \ell(h_t, z_t) - \min_{h \in \mathcal{H}} \sum_{t=1}^{T} \ell(h, z_t) \right]$$
      $A$ is Hannan consistent if its minimax regret is sublinear in $T$.
      Per-round Complexity
      Given by the number of computational operations spent by $A$ at each trial $t$ for choosing a hypothesis $h_t$ in $\mathcal{H}$ and evaluating its loss $\ell(h_t, z_t)$.
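To make the bracketed quantity concrete, here is a minimal sketch that computes the regret on one fixed sequence against a finite hypothesis class. The minimax regret would additionally take the maximum over all sequences, which this sketch does not attempt. The names `regret` and `zero_one` and the toy data are assumptions for illustration.

```python
from typing import Callable, Sequence, TypeVar

H = TypeVar("H")
Z = TypeVar("Z")


def regret(
    played: Sequence[H],
    instances: Sequence[Z],
    hypotheses: Sequence[H],
    loss: Callable[[H, Z], float],
) -> float:
    """Learner's cumulative loss minus that of the best fixed hypothesis."""
    learner = sum(loss(h_t, z_t) for h_t, z_t in zip(played, instances))
    best = min(sum(loss(h, z_t) for z_t in instances) for h in hypotheses)
    return learner - best


def zero_one(h: int, z: int) -> float:
    return 0.0 if h == z else 1.0


instances = [0, 1, 1, 1]
played = [0, 0, 1, 0]  # the learner's choices h_1..h_4
print(regret(played, instances, hypotheses=[0, 1], loss=zero_one))  # 1.0
```

Here the learner loses 2 in total, while always playing 1 would have lost 1, so the regret on this sequence is 1.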

  28. The Convex Case
      [Figure: the hinge loss $\ell(h_t, z_t) = \max(0, 1 - y_t \langle w_t, x_t \rangle)$ (in blue) plotted against $y_t \langle w_t, x_t \rangle$, for $\mathcal{H} = \{ w \in \mathbb{R}^n : \|w\| \le b \}$.]
      Online Convex Learning
      An online learning problem $(\mathcal{H}, \mathcal{Z}, \ell)$ is convex if:
      - $\mathcal{H}$ is a closed convex subset of $\mathbb{R}^d$
      - $\ell$ is convex in its first argument, i.e. $\ell(\cdot, z)$ is convex for all $z \in \mathcal{Z}$
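As a worked check that the hinge-loss example fits this definition, the short derivation below verifies convexity in the first argument. It is a sketch of the standard argument, with $w, w'$ any two points of $\mathcal{H}$ and $\lambda \in [0, 1]$; the shorthand $m(w)$ is introduced here only for readability.

```latex
% For fixed z = (x, y), write m(w) = 1 - y <w, x>, which is affine in w.
\begin{align*}
\ell(\lambda w + (1 - \lambda) w', z)
  &= \max\bigl(0,\; \lambda\, m(w) + (1 - \lambda)\, m(w')\bigr) \\
  &\le \lambda \max\bigl(0, m(w)\bigr) + (1 - \lambda) \max\bigl(0, m(w')\bigr) \\
  &= \lambda\, \ell(w, z) + (1 - \lambda)\, \ell(w', z).
\end{align*}
```

The inequality holds because $\max(0, \cdot)$ is convex. The ball $\{ w : \|w\| \le b \}$ is closed and convex, so both conditions are met.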

  30. The Convex Case
      Online Gradient Descent
      Start with the vector $w_1 = 0$. During each round $t$:
      - Play $w_t$
      - Observe $z_t$ and incur the loss $\ell(w_t, z_t)$
      - Update the hypothesis as follows:
        $$w_{t+1} = \operatorname*{argmin}_{w \in \mathcal{H}} \|w - w'_t\|^2 \quad \text{where} \quad w'_t = w_t - \eta_t \nabla \ell(w_t, z_t)$$
      The regret of OGD with respect to any $w^* \in \mathcal{H}$ is bounded by
      $$\frac{\|w^*\|^2}{\eta_T} + \frac{1}{2} \sum_{t=1}^{T} \eta_t \|\nabla \ell(w_t, z_t)\|^2$$
      Thus, if $\ell$ is $L$-Lipschitz and $\mathcal{H}$ is $D$-bounded, then using $\eta_t = D / (L \sqrt{t})$, OGD is Hannan consistent with regret $2DL\sqrt{T}$.
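The update rule can be sketched in plain Python for the hinge loss. Several choices here are assumptions not taken from the slides: $\mathcal{H}$ is the Euclidean ball of radius $D$ (so the argmin is a rescaling), a subgradient stands in for the gradient at the kink of the hinge loss, and the toy data satisfies $\|x_t\| \le L$. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def project_ball(w: Vector, radius: float) -> Vector:
    """Euclidean projection onto {w : ||w|| <= radius}."""
    norm = math.sqrt(dot(w, w))
    return w if norm <= radius else [radius * wi / norm for wi in w]


def hinge_subgradient(w: Vector, x: Vector, y: int) -> Vector:
    """A subgradient of max(0, 1 - y <w, x>) with respect to w."""
    if 1.0 - y * dot(w, x) > 0.0:
        return [-y * xi for xi in x]
    return [0.0] * len(w)


def ogd(examples: Sequence[Tuple[Vector, int]], D: float, L: float) -> Tuple[Vector, float]:
    """Run OGD with eta_t = D / (L sqrt(t)); return (final w, cumulative loss)."""
    w = [0.0] * len(examples[0][0])  # w_1 = 0
    total = 0.0
    for t, (x, y) in enumerate(examples, start=1):
        total += max(0.0, 1.0 - y * dot(w, x))  # play w_t, incur the loss
        eta = D / (L * math.sqrt(t))
        g = hinge_subgradient(w, x, y)
        w = project_ball([wi - eta * gi for wi, gi in zip(w, g)], D)
    return w, total


# Toy separable data with ||x|| = 1, so L = 1 is a valid Lipschitz constant.
examples = [([1.0, 0.0], 1), ([0.0, 1.0], -1)] * 50
w_final, total_loss = ogd(examples, D=2.0, L=1.0)
print(w_final, total_loss)
```

On this data the comparator $w^* = (1, -1)$ lies in the ball and has zero hinge loss on every example, so the cumulative loss returned is also the regret against $w^*$, and it can be compared directly with $2DL\sqrt{T} = 40$.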
