CS 170 Section 13 Multiplicative Updates Owen Jow April 25, 2018

CS 170 Section 13 Multiplicative Updates Owen Jow April 25, 2018 University of California, Berkeley

Table of Contents
1. Multiplicative Updates Intro
2. Follow the Regularized Leader

Multiplicative Updates Intro

The Experts Problem
• Every day, you enter a transaction in which you lose between 0 and 1 dollars
• Life is hard
• There are $n$ experts, each of whom gives different advice
• Instead of making your own decisions, you choose an expert every day and follow his advice
• The next day you find out how all the experts performed, and you can choose another expert if you wish
• Goal: minimize regret

Terminology
• There are $n$ experts
• There are $T$ days ($T$ is very large)
• The $i$-th expert on day $t$ costs you $c_i^t \in [0, 1]$
• You choose expert $i(t)$ on day $t$
• $R$ is your regret

Regret
Figure 1: we would like to minimize our regret $R$.

$$R = \frac{1}{T}\left(\sum_{t=1}^{T} c_{i(t)}^{t} - \min_{i} \sum_{t=1}^{T} c_{i}^{t}\right)$$

i.e. on average, (how you did) − (how the best expert did)
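The regret formula can be evaluated directly once all costs are known. The sketch below is only an illustration: the function name `average_regret`, the `costs[t][i]` layout and the toy numbers are assumptions, not part of the section.

```python
from typing import Sequence


def average_regret(costs: Sequence[Sequence[float]], choices: Sequence[int]) -> float:
    """Average regret R for a cost table and a sequence of chosen experts.

    costs[t][i] is the cost of expert i on day t (assumed to lie in [0, 1]);
    choices[t] is the index i(t) of the expert followed on day t.
    """
    T = len(costs)
    n = len(costs[0])
    # Total cost actually paid by following i(t) each day.
    incurred = sum(costs[t][choices[t]] for t in range(T))
    # Total cost of the single best expert in hindsight.
    best_fixed = min(sum(costs[t][i] for t in range(T)) for i in range(n))
    return (incurred - best_fixed) / T


# Toy data (hypothetical): 4 days, 2 experts.
costs = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
choices = [1, 0, 0, 0]
regret = average_regret(costs, choices)
```

Here the chosen experts cost 2 in total while expert 0 alone would have cost 1, so the average regret over 4 days is 0.25.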

Goal Reframed
• More specifically, you would like an algorithm for choosing experts that achieves $R \approx 0$ no matter what costs $c_i^t$ the environment throws at you (i.e. even in the worst case)
• For this you can use multiplicative weight updates

Notes
• You want your algorithm to do as well as the one that picks the best expert from the start and sticks with him
• Regret is defined at the end (how did you do in comparison to how you’d have done if you chose the best expert at the start and followed him every day?)
• It is impossible to match the best expert on a day-to-day basis, but it is possible to match the single best expert throughout
• The adversary is the environment, which provides the cost values

Multiplicative Weight Updates
MWU is a randomized algorithm. It keeps a weight $w_i^t > 0$ for each expert $i$ on day $t$ and chooses expert $i$ with probability proportional to that weight.

Algorithm 1: Multiplicative Weight Updates
  Initialize all weights to $w_i^1 = 1$.
  for $t = 1$ to $T$ do
    Choose expert $i$ with probability $w_i^t / \sum_j w_j^t$
    Update weights for all experts: $w_i^{t+1} = w_i^t \cdot (1 - \epsilon)^{c_i^t}$
  end for
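The algorithm translates almost line for line into code. This is a minimal sketch, not the course's reference implementation: the function name `mwu`, the `costs[t][i]` layout, the fixed random seed and the two-expert example are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


def mwu(costs: Sequence[Sequence[float]], epsilon: float, seed: int = 0) -> List[int]:
    """Run multiplicative weight updates and return the expert chosen each day.

    costs[t][i] is the cost of expert i on day t, assumed to lie in [0, 1].
    The costs of day t are only used after the choice for day t is made.
    """
    rng = random.Random(seed)          # seeded so the run is reproducible
    n = len(costs[0])
    weights = [1.0] * n                # all experts start with weight 1
    choices = []
    for day_costs in costs:
        # Choose expert i with probability w_i / sum_j w_j.
        i = rng.choices(range(n), weights=weights, k=1)[0]
        choices.append(i)
        # Punish every expert by a factor (1 - epsilon) ** cost.
        weights = [w * (1.0 - epsilon) ** c for w, c in zip(weights, day_costs)]
    return choices


# Hypothetical environment: expert 0 is consistently good, expert 1 consistently bad.
T = 500
costs = [[0.1, 0.9] for _ in range(T)]
choices = mwu(costs, epsilon=0.1)
times_best_chosen = choices.count(0)
```

For very long horizons the raw weights shrink toward zero, so a practical version would likely renormalize them each day; that detail is omitted here to stay close to the pseudocode.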

Multiplicative Weight Updates
• $(1 - \epsilon)^{c_i^t}$ will be less than or equal to 1. It’ll be much less than 1 if the expert ruined you; the bigger $c_i^t$ is, the more you punish expert $i$.
• In the words of a certain theoretical computer scientist, “$c_i^t$ is the amount of money this bastard made you pay.”
• Weights “absorb” all past performances of experts
• Experts who perform the best end up with the highest weights

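Multiplicative Weight Updates
• This algorithm can be proven to give almost zero regret.
• The proof is left as an exercise.
• Just kidding. For the proof, see the notes.

$$R = \frac{1}{T}(\text{MWU} - \text{OPT}) \le \frac{\ln n}{\epsilon T} + \frac{\epsilon \, \text{OPT}}{T} \le \frac{\ln n}{\epsilon T} + \epsilon \le 2\sqrt{\frac{\ln n}{T}}$$

The second inequality uses $\text{OPT} \le T$, since each daily cost is at most 1. The last one holds for the choice $\epsilon = \sqrt{\ln n / T}$, which makes the two terms equal.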
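As a quick numerical sanity check of the last step in the bound, the sketch below evaluates $\frac{\ln n}{\epsilon T} + \epsilon$ for a few values of $\epsilon$. The function names and the values $n = 10$, $T = 10{,}000$ are assumptions chosen only for illustration.

```python
import math


def regret_bound(n: int, T: int, epsilon: float) -> float:
    """Upper bound ln(n) / (epsilon * T) + epsilon on the average regret."""
    return math.log(n) / (epsilon * T) + epsilon


def balanced_epsilon(n: int, T: int) -> float:
    """The epsilon that makes both terms of the bound equal: sqrt(ln(n) / T)."""
    return math.sqrt(math.log(n) / T)


n, T = 10, 10_000                      # hypothetical problem size
eps_star = balanced_epsilon(n, T)
best = regret_bound(n, T, eps_star)    # equals 2 * sqrt(ln(n) / T)

# Any other epsilon gives a bound that is at least as large.
worse_small = regret_bound(n, T, eps_star / 4)
worse_large = regret_bound(n, T, eps_star * 4)

# Quadrupling T halves the bound, which is why a larger T means smaller regret.
best_longer = regret_bound(n, 4 * T, balanced_epsilon(n, 4 * T))
```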

Notes
• With this algorithm, higher $T$ means smaller regret.
• MWU punishes bad experts exponentially severely. By the crushing weight of exponentiation, if an expert is the best you’ll be choosing him all the time.

Life Advice
“If you want zero regret in life, notice what works in a very conservative fashion – by giving it a little more weight every time. In the long run, this means perfection.”
A theoretical computer scientist

Follow the Regularized Leader

Exercise 1a
• You are playing $T$ rounds of a game
• At round $t$ you pick strategy $i \in \{1, \ldots, n\}$ and receive payoff $A(t, i) \in [0, 1]$
• What happens if you choose at each round the strategy which has given the highest average payoff so far? (Even though you throw in your lot with one strategy, you get to observe how all of them do.)

Exercise 1b
• The problem: if you choose strategies deterministically, an adversarial environment can design payoffs to ruin you
• So let’s try a randomized strategy
• To the adversary: good luck outplaying randomness
• Pick each strategy at random from a distribution $D_t$
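To make the problem with deterministic choices concrete, here is a small simulation under an assumed adversary: knowing that the player follows the current leader, it gives that strategy payoff 0 and every other strategy payoff 1. The function name, the tie-breaking rule (lowest index wins) and the sizes $n = 3$, $T = 300$ are hypothetical.

```python
from typing import Tuple


def follow_the_leader_vs_adversary(n: int, T: int) -> Tuple[float, float]:
    """Return (player's total payoff, best fixed strategy's total payoff)."""
    totals = [0.0] * n                 # cumulative payoff of each strategy
    player_payoff = 0.0
    for _ in range(T):
        # All strategies have been observed for the same number of rounds, so
        # the highest average payoff is the highest total; ties go to the lowest index.
        leader = max(range(n), key=lambda i: totals[i])
        # The adversary zeroes out exactly the strategy the player will pick.
        payoffs = [0.0 if i == leader else 1.0 for i in range(n)]
        player_payoff += payoffs[leader]
        for i in range(n):
            totals[i] += payoffs[i]
    return player_payoff, max(totals)


n, T = 3, 300
player_total, best_fixed_total = follow_the_leader_vs_adversary(n, T)
```

In this run the player collects nothing, while the adversary hands out $n - 1$ units of payoff per round, so some fixed strategy earns at least $T(n-1)/n$ in total.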

Exercise 1b
• $D_t$ assigns a probability $p_t(i)$ to each strategy $i$
• At round $t$, “follow the leader” will approximately maximize

$$\sum_{i=1}^{n} \left( p_t(i) \cdot \sum_{\tau \in \{1, \ldots, t-1\}} A(\tau, i) \right)$$

• Why is this no better than before?

Exercise 1c
• Let’s add an entropy regularizer, now maximizing at time step $t$

$$\sum_{i=1}^{n} \left( p_t(i) \cdot \sum_{\tau \in \{1, \ldots, t-1\}} A(\tau, i) \right) - \eta \sum_{i=1}^{n} p_t(i) \ln p_t(i)$$

• Suddenly, “follow the regularized leader” is the same as MWU.
• Show that for any distribution $p_t$, our objective is at most

$$\eta \ln \left( \sum_{i=1}^{n} e^{\sum_{\tau \in \{1, \ldots, t-1\}} A(\tau, i) / \eta} \right)$$
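Before attempting the proof, the claimed upper bound can be spot-checked numerically. The sketch below compares the regularized objective against the log-sum-exp expression for many random distributions; it is evidence, not a proof. The cumulative payoffs, $\eta = 0.5$ and all function names are assumptions made for illustration.

```python
import math
import random
from typing import Sequence


def regularized_objective(p: Sequence[float], cumulative: Sequence[float], eta: float) -> float:
    """sum_i p(i) * L_i - eta * sum_i p(i) * ln p(i), where L_i is the cumulative payoff."""
    linear = sum(pi * Li for pi, Li in zip(p, cumulative))
    neg_entropy = sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return linear - eta * neg_entropy


def log_sum_exp_value(cumulative: Sequence[float], eta: float) -> float:
    """eta * ln(sum_i exp(L_i / eta)), computed stably by factoring out the maximum."""
    m = max(cumulative)
    return m + eta * math.log(sum(math.exp((L - m) / eta) for L in cumulative))


rng = random.Random(0)
cumulative = [1.3, 1.8, 1.4]           # hypothetical cumulative payoffs L_i
eta = 0.5
bound = log_sum_exp_value(cumulative, eta)

gaps = []
for _ in range(1000):
    raw = [rng.random() + 1e-12 for _ in cumulative]
    total = sum(raw)
    p = [r / total for r in raw]       # a random probability distribution
    gaps.append(bound - regularized_objective(p, cumulative, eta))
smallest_gap = min(gaps)               # stays non-negative if the bound holds
```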

Exercise 1d
When computing $p_t$ using multiplicative weight updates, we can say for some choice of $\epsilon$ (dependent on $\eta$) that the objective

$$\sum_{i=1}^{n} \left( p_t(i) \cdot \sum_{\tau \in \{1, \ldots, t-1\}} A(\tau, i) \right) - \eta \sum_{i=1}^{n} p_t(i) \ln p_t(i)$$

is equal to

$$\eta \ln \left( \sum_{i=1}^{n} e^{\sum_{\tau \in \{1, \ldots, t-1\}} A(\tau, i) / \eta} \right)$$

Show this. Also, how does $\epsilon$ depend on $\eta$?
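The equality can also be checked numerically for one concrete multiplicative rule. The sketch below assumes each weight is multiplied by $e^{A(\tau, i)/\eta}$ after round $\tau$; how that factor is written in terms of $\epsilon$ is exactly what the exercise asks, so it is left open here. The payoff table, $\eta = 0.5$ and the function name are hypothetical.

```python
import math
from typing import List, Sequence


def exponential_weights(payoffs: Sequence[Sequence[float]], eta: float) -> List[float]:
    """Multiply each weight by exp(A(tau, i) / eta) once per observed round."""
    n = len(payoffs[0])
    weights = [1.0] * n
    for round_payoffs in payoffs:
        weights = [w * math.exp(a / eta) for w, a in zip(weights, round_payoffs)]
    return weights


# Hypothetical payoffs A(tau, i) for 3 past rounds and 3 strategies.
payoffs = [[0.2, 0.9, 0.5], [0.7, 0.1, 0.6], [0.4, 0.8, 0.3]]
eta = 0.5

weights = exponential_weights(payoffs, eta)
total = sum(weights)
p = [w / total for w in weights]       # distribution p_t induced by the weights

cumulative = [sum(row[i] for row in payoffs) for i in range(3)]
objective = (sum(pi * Li for pi, Li in zip(p, cumulative))
             - eta * sum(pi * math.log(pi) for pi in p))
closed_form = eta * math.log(sum(math.exp(L / eta) for L in cumulative))
```

With these weights the two quantities agree up to floating-point error, which matches the statement that the regularized objective attains its upper bound.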
