Multi-agent learning
Repeated games
Gerard Vreeswijk, Intelligent Systems Group, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands.
Friday 3rd May, 2019
Author: Gerard Vreeswijk. Slides last modified on May 3rd, 2019 at 12:39 Multi-agent learning: Repeated games, slide 2
The stage game may be played:

■ A finite number of times.
■ An indefinite (same: indeterminate) number of times.
■ An infinite number of times.
■ NE in normal form games that are repeated a finite number of times (backward induction).
■ NE in normal form games that are repeated an indefinite number of times (discount factor, Folk Theorem).
* H. Peters (2008): Game Theory: A Multi-Leveled Approach. Springer, ISBN: 978-3-540-69290-4. Ch. 8: Repeated games.
■ Even if mixed strategies are allowed, the PD possesses only one Nash equilibrium.
■ This equilibrium is Pareto sub-optimal.
■ Does the situation change if two parties get to play the Prisoners' Dilemma repeatedly?
■ The following diagram (hopefully) shows that playing the PD two times in succession does not change the situation.
Cumulative payoffs in the twice-repeated PD, arranged by round-1 outcome (rows) and round-2 outcome (columns); the stage game pays CC = (3, 3), CD = (0, 5), DC = (5, 0), DD = (1, 1):

         CC        CD        DC        DD
CC     (6, 6)    (3, 8)    (8, 3)    (4, 4)
CD     (3, 8)    (0, 10)   (5, 5)    (1, 6)
DC     (8, 3)    (5, 5)    (10, 0)   (6, 1)
DD     (4, 4)    (1, 6)    (6, 1)    (2, 2)

Backward induction assigns the subgames after CC, CD, DC and DD the values (4, 4), (1, 6), (6, 1) and (2, 2) respectively, and the root the value (2, 2).
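The twice-repeated payoffs above can be reproduced mechanically. Below is a small sketch in Python; the stage-game payoffs are the ones used throughout these slides, while the indexing by per-player two-round action sequences is my own bookkeeping choice, not notation from the deck.

```python
from itertools import product

# Stage-game payoffs used on these slides (row player, column player).
STAGE = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
         ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def total(row_seq, col_seq):
    """Cumulative payoff when each player commits to a fixed action sequence."""
    r = c = 0
    for a, b in zip(row_seq, col_seq):
        pr, pc = STAGE[(a, b)]
        r += pr
        c += pc
    return (r, c)

# One row/column per two-round plan of a player: CC, CD, DC, DD.
plans = [''.join(p) for p in product('CD', repeat=2)]
matrix = {(p, q): total(p, q) for p in plans for q in plans}

print(matrix[('CC', 'CC')])  # (6, 6): mutual cooperation twice
print(matrix[('DD', 'DD')])  # (2, 2): mutual defection twice
```

Each cell of the 4 × 4 matrix is simply the sum of the two stage payoffs along that pair of plans.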
■ The action profile (DD, DD) is the only Nash equilibrium.
■ With 3 successive games, we obtain a 2^3 × 2^3 matrix, where the action profile (DDD, DDD) is the only Nash equilibrium.
■ Generalise to N repetitions: (DD^{N−1}, DD^{N−1}), i.e. D in every one of the N rounds, still is the only Nash equilibrium.
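This claim can be checked by brute force for 3 repetitions: enumerate the 2^3 × 2^3 matrix of fixed action plans and test every cell for the Nash property. A sketch, with the stage payoffs used in the earlier diagram (CC = (3, 3), CD = (0, 5), DC = (5, 0), DD = (1, 1)):

```python
from itertools import product

STAGE = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
         ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def payoff(p, q):
    """Cumulative payoffs of two fixed 3-round action plans."""
    r = c = 0
    for a, b in zip(p, q):
        pr, pc = STAGE[(a, b)]
        r += pr
        c += pc
    return r, c

plans = list(product('CD', repeat=3))  # 2^3 = 8 plans per player

def is_nash(p, q):
    """No unilateral change of plan may improve either player."""
    r0, c0 = payoff(p, q)
    return (all(payoff(p2, q)[0] <= r0 for p2 in plans) and
            all(payoff(p, q2)[1] <= c0 for q2 in plans))

nash = [(p, q) for p in plans for q in plans if is_nash(p, q)]
print(nash)  # only the all-defect pair remains
```

Note that this checks the Nash property among fixed (open-loop) action plans, matching the slide's matrix; history-dependent strategies form a larger space, but the backward-induction outcome is the same.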
■ . . . ten times. That's hopefully clear.
■ . . . a finite number of times. May mean: a fixed number.
■ . . . an indefinite number of times. Means: the number of repetitions is not determined in advance.
■ . . . an infinite number of times. When play is infinite, total payoffs must be discounted or averaged to remain well-defined.
■ A Pareto-suboptimal outcome can be avoided in case the following conditions hold: the game is repeated an indefinite number of times, and future payoffs are weighted by a discount factor.
■ Under these conditions suddenly infinitely many Nash equilibria exist.
■ Various Folk Theorems exist.
■ Here we discuss one version of "the" Folk Theorem.
■ Horizon. The game may be repeated indefinitely (present case) or infinitely often.
■ Information. Players may act on the basis of CKR (present case), or . . .
■ Reward. Players may collect their payoff through a discount factor (present case) or as a limiting average.
■ Subgame perfectness. Subgame perfect equilibria (present case) or plain Nash equilibria.
■ Equilibrium. We may be interested in Nash equilibria (present case), or . . .
■ Let G be a game in normal form.
■ The repeated game G*(δ) is G, played an indefinite number of times, with future payoffs discounted by a factor δ.
■ G is called the stage game.
■ A history h of length t of a repeated game is a sequence of action profiles.
■ The set of all possible histories (of any length) is denoted by H.
■ A strategy for Player i is a function s_i : H → ∆{C, D} such that . . .
■ A strategy profile s is a combination of strategies, one for each player.
■ The expected payoff for Player i under strategy profile s is ∑_{t=0}^{∞} δ^t u_i(a_t), where a_t is the action profile played in round t.
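The expected payoff, a discounted sum ∑_{t=0}^{∞} δ^t u_i(a_t), is easy to evaluate for simple play paths. A sketch, assuming the stage payoffs used earlier in the deck; `discounted_payoffs` and its prefix/tail encoding are illustrative names, not notation from the slides.

```python
STAGE = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
         ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def discounted_payoffs(prefix, tail, delta):
    """Evaluate sum_t delta^t * u(a_t) for a path that plays the finite
    list `prefix` of action profiles and then repeats `tail` forever."""
    r = c = 0.0
    for t, a in enumerate(prefix):
        pr, pc = STAGE[a]
        r += delta ** t * pr
        c += delta ** t * pc
    T = len(prefix)
    pr, pc = STAGE[tail]
    geom = delta ** T / (1 - delta)  # closed form of sum_{t >= T} delta^t
    return r + geom * pr, c + geom * pc

# Cooperating forever at delta = 0.5 is worth 3 / (1 - 0.5) = 6 to each player.
print(discounted_payoffs([], ('C', 'C'), 0.5))  # (6.0, 6.0)
```

The closed form of the geometric tail avoids truncating the infinite sum.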
¹ A notation like D* or (worse) D^∞ is suggestive. Mathematically it makes no sense, but intuitively it does.
■ Suppose the opponent plays grim trigger, and a player cooperates up to round N and deviates for the first time in round N. He then collects

    ∑_{t=0}^{N−1} δ^t · 3  +  δ^N · 5  +  ∑_{t=N+1}^{∞} δ^t · 1.

■ Cooperating throughout instead yields ∑_{t=0}^{∞} δ^t · 3, which means he cannot gain by deviating as soon as

    δ^N · 5 + ∑_{t=N+1}^{∞} δ^t · 1  ≤  ∑_{t=N}^{∞} δ^t · 3,

  which simplifies to δ ≥ 1/2.
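Whether cooperating against grim trigger beats a one-shot deviation can be checked numerically. With the stage payoffs 3 (mutual cooperation), 5 (successful defection) and 1 (mutual defection), deviation stops paying exactly at δ = 1/2; a deviation at a later round N just scales both sides by δ^N, so comparing values at t = 0 suffices. A minimal sketch:

```python
def coop_value(delta):
    """Cooperate forever against grim trigger: sum_t delta^t * 3."""
    return 3 / (1 - delta)

def deviate_value(delta):
    """Defect immediately (payoff 5), then be punished with 1 forever."""
    return 5 + delta / (1 - delta)

for delta in (0.3, 0.5, 0.7):
    print(delta, coop_value(delta) >= deviate_value(delta))
# Deviating pays for delta < 1/2 and stops paying at delta = 1/2.
```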
■ Every convex combination² of the four stage-game payoff vectors can be realised as a limiting average payoff.
■ Ensure that (C, C) occurs (in the long run) in a fraction α1 of the rounds, (C, D) in α2, (D, C) in α3, and (D, D) in α4.
■ As long as these limiting average payoffs exceed payoff(D, D) for both players, deviations can be deterred by reverting to D forever.
■ For δ high enough, these strategies again form a SGP NE.
² Meaning αi ≥ 0 and α1 + α2 + α3 + α4 = 1.
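The fractions α1, . . . , α4 translate directly into limiting average payoffs. A sketch, with the stage payoffs of these slides; the helper `average_payoff` is an illustrative name of my own:

```python
# Stage-game payoffs (row, col) as used on these slides.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def average_payoff(alpha):
    """Limiting average payoff when outcome o occupies a fraction alpha[o]
    of the rounds; fractions must be >= 0 and sum to 1."""
    assert abs(sum(alpha.values()) - 1) < 1e-9
    r = sum(f * PAYOFF[o][0] for o, f in alpha.items())
    c = sum(f * PAYOFF[o][1] for o, f in alpha.items())
    return r, c

# Cooperate half the time, alternate (C,D) and (D,C) in the other half.
avg = average_payoff({('C', 'C'): 0.5, ('C', 'D'): 0.25, ('D', 'C'): 0.25})
print(avg)  # (2.75, 2.75) -- both exceed the punishment payoff 1
```

Since both averages exceed the mutual-defection payoff of 1, this schedule is sustainable in the sense of the Folk Theorem above.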
[Figure: the payoff space of the stage game (both axes running from 1 to 5), with the cooperative payoff vector (3, 3) marked inside the convex hull of attainable payoffs.]
■ We have seen that many Nash equilibria exist.
■ What about the existence of subgame perfect equilibria?
■ Without the requirement of subgame perfectness, equilibria are easier to construct.
■ However, non-SGPs imply threats that are not credible.
■ The punishment strategy of row is mixed (0.8, 0.2)*.
■ The punishment strategy of col is pure R*.
■ T1 ⇒ T2. If row plays (the non-degenerated part of) T1, then col . . .
■ T2 ⇒ T1. If at all, the best moment for row to deviate is at D, for that . . .
Author: Gerard Vreeswijk. Slides last modified on May 3rd, 2019 at 12:39 Multi-agent learning: Repeated games, slide 27
■ T2 ⇒ T1 (continued). Total payoff for row player: 0 (for cheating) + ∑_{k=0}^{∞} …, to be compared with ∑_{k=0}^{∞} ….
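The bookkeeping behind this comparison — a one-shot gain from cheating followed by discounted punishment forever, versus the discounted stream from complying — can be sketched numerically. The summands on the slide are not recoverable, so the per-round payoffs below (comply, cheat, punish) are placeholders, not the values of this game.

```python
def geometric_sum(x, gamma):
    """Closed form of sum_{k=0}^inf gamma^k * x, valid for 0 <= gamma < 1."""
    return x / (1.0 - gamma)

def payoff_comply(comply_payoff, gamma):
    # Cooperate every round: comply_payoff discounted forever.
    return geometric_sum(comply_payoff, gamma)

def payoff_deviate(cheat_payoff, punish_payoff, gamma):
    # One round of the deviation payoff, then the punishment payoff forever,
    # discounted from the next round onward.
    return cheat_payoff + gamma * geometric_sum(punish_payoff, gamma)

# Example with assumed payoffs: complying pays 3 per round, a deviation
# grabs 5 once but is punished down to 1 per round afterwards.
gamma = 0.9
print(payoff_comply(3, gamma))       # ≈ 30.0
print(payoff_deviate(5, 1, gamma))   # ≈ 14.0 — deviating does not pay
```

For a patient enough player (gamma close to 1) the discounted punishment outweighs any one-shot gain, which is the engine behind folk-theorem constructions like T1 and T2.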
Author: Gerard Vreeswijk. Slides last modified on May 3rd, 2019 at 12:39 Multi-agent learning: Repeated games, slide 28
■ Col can punish row maximally
■ If row plays D∗ then col will
■ Row can punish col even more. Against row's mixed strategy (u, d), col's expected payoff is

  max_{l,r} ( u·l·1 + d·r·4 ) = max_{l} ( l(5u − 4) + 4(1 − u) ),

  using d = 1 − u and r = 1 − l.
Author: Gerard Vreeswijk. Slides last modified on May 3rd, 2019 at 12:39 Multi-agent learning: Repeated games, slide 29
■ If 5u − 4 = 0, it does not matter which l col plays.
■ If 5u − 4 > 0, col will play l = 1.
■ If 5u − 4 < 0, col will play l = 0.
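Assuming col's stage payoff is u·l·1 + (1 − u)(1 − l)·4 — the reading consistent with the 5u − 4 term on this slide — row's punishment strategy can be verified numerically: row picks u to minimise col's best-response payoff, and the minimum lands at u = 0.8, matching the mixed punishment strategy (0.8, 0.2)∗.

```python
def col_payoff(u, l):
    # Col earns 1 with probability u*l and 4 with probability (1-u)*(1-l).
    return u * l * 1 + (1 - u) * (1 - l) * 4

def col_best_response(u):
    # The payoff is linear in l with slope 5u - 4, so the best response
    # is a corner of [0, 1] (anything goes when the slope vanishes).
    slope = 5 * u - 4
    if slope > 0:
        return 1.0
    if slope < 0:
        return 0.0
    return 0.5  # indifferent

# Row punishes col by minimising col's best-response payoff over u.
grid = [i / 1000 for i in range(1001)]
u_star = min(grid, key=lambda u: col_payoff(u, col_best_response(u)))
print(u_star)                                        # 0.8
print(col_payoff(u_star, col_best_response(u_star)))  # ≈ 0.8
```

The minimax value 4/5 is what col can guarantee itself, and u∗ = 0.8 is exactly the punishment mix from the earlier slide.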
Author: Gerard Vreeswijk. Slides last modified on May 3rd, 2019 at 12:39 Multi-agent learning: Repeated games, slide 30
■ Reinforcement Learning. Agents simply execute the action(s) with the highest estimated value.
■ No-regret learning. Agents execute the action(s) with maximal accumulated regret.
■ Fictitious Play. Sample the actions of opponent(s) and play a best response.
■ Gradient Dynamics. Approximate Nash equilibria of single-shot games by following the payoff gradient.
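As an illustration of the fictitious-play entry above, a minimal sketch for a 2×2 zero-sum game (matching pennies, chosen only as an example): each player keeps empirical counts of the opponent's past actions and best-responds to that empirical mix.

```python
# Row's payoff matrix for matching pennies; col receives the negative.
A = [[1, -1], [-1, 1]]

def best_response(payoff_rows, opp_counts):
    # Expected payoff of each own action against the opponent's
    # empirical mixture of past play; ties break toward the first action.
    total = sum(opp_counts)
    scores = [sum(p * c for p, c in zip(row, opp_counts)) / total
              for row in payoff_rows]
    return max(range(len(scores)), key=lambda i: scores[i])

def fictitious_play(rounds=10000):
    row_counts, col_counts = [1, 1], [1, 1]  # uniform prior over actions
    B = [[-A[i][j] for i in range(2)] for j in range(2)]  # col's view
    for _ in range(rounds):
        a = best_response(A, col_counts)
        b = best_response(B, row_counts)
        row_counts[a] += 1
        col_counts[b] += 1
    return ([c / sum(row_counts) for c in row_counts],
            [c / sum(col_counts) for c in col_counts])

row_mix, col_mix = fictitious_play()
print(row_mix, col_mix)  # empirical mixes approach (0.5, 0.5), the mixed NE
```

In zero-sum games such as this one the empirical frequencies of fictitious play are known to converge to a Nash equilibrium; in general games convergence is not guaranteed.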