On the Agenda(s) of Research on Multi-Agent Learning
by Yoav Shoham and Rob Powers and Trond Grenager
Learning against opponents with bounded memory
by Rob Powers and Yoav Shoham
Presented by: Ece Kamar and Philip Hendrix April 3, 2006 CS 286r
– Friend class: Q values define a globally optimal action profile
– Foe class: Q values define a game with a saddle point
– Friend-Q updates V similar to regular Q-learning
– Foe-Q updates V similar to maximin
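
The two value computations named in these bullets can be sketched as follows. This is a minimal illustration, not the presenters' or the original authors' code: the function names (`friend_value`, `foe_value_pure`, `has_pure_saddle_point`, `q_update`) and the default `alpha` and `gamma` are assumptions made for the example. The Foe value here is restricted to pure strategies, which matches the true game value only when the stage game has a pure-strategy saddle point; the general case would need a maximin over mixed strategies, typically solved as a linear program.

```python
from typing import Sequence

# Q-values for one state: rows are our actions, columns are the other agent's actions.
Matrix = Sequence[Sequence[float]]


def friend_value(q: Matrix) -> float:
    """Friend class: V is the best entry over all joint actions."""
    return max(max(row) for row in q)


def foe_value_pure(q: Matrix) -> float:
    """Foe class (pure strategies only, an assumption of this sketch):
    V is the max over our actions of the min over the opponent's actions."""
    return max(min(row) for row in q)


def has_pure_saddle_point(q: Matrix) -> bool:
    """True when pure maximin equals pure minimax, i.e. a pure saddle point exists."""
    maximin = max(min(row) for row in q)
    minimax = min(max(col) for col in zip(*q))
    return maximin == minimax


def q_update(q_sa: float, reward: float, next_value: float,
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Standard Q-learning style update; only the definition of next_value
    (friend_value or foe_value_pure of the next state) differs by class."""
    return (1 - alpha) * q_sa + alpha * (reward + gamma * next_value)


if __name__ == "__main__":
    q_next = [[3.0, 1.0],
              [4.0, 2.0]]
    print("Friend V:", friend_value(q_next))
    print("Foe V (pure):", foe_value_pure(q_next))
    print("Pure saddle point:", has_pure_saddle_point(q_next))
```

In this sketch the update rule is shared between the two classes, so switching between Friend and Foe behaviour amounts to swapping the function that computes the next-state value.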
any zero-sum game with infinite exploration
equilibrium in common payoff games under the condition of self play and decreasing exploration
Foe games.
Prisoner’s Dilemma
memory
a minimum probability of playing any given action
Potential discounted sum implementation