Multi-agent learning
Comparing algorithms empirically
Gerard Vreeswijk, Intelligent Software Systems, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands.
Sunday 21st June, 2020
Author: Gerard Vreeswijk. Slides last modified on June 21st, 2020 at 21:18 Multi-agent learning: Comparing algorithms empirically, slide 2
Pitting games against each …
■ Until 2005, say, MAL research …
■ Later, MAL algorithms were …
■ We will look into the …
■ Axelrod: organising tournaments to let algorithms play the IPD.
Axelrod, Robert. The Evolution of Cooperation. New York: Basic Books (1984).
■ Zawadzki et al.: straight but thorough.
Zawadzki, Erik, Asher Lipson, and Kevin Leyton-Brown. "Empirically evaluating multiagent learning algorithms." arXiv preprint arXiv:1401.8074 (2014).
■ Bouzy et al.: elimination ("knock-out").
Bouzy, Bruno, and Marc Métivier. "Multi-agent learning experiments on repeated matrix games." In: Proc. of the 27th Int. Conf. on Machine Learning (2010).
■ Airiau et al.: evolutionary dynamics.
Airiau, Stéphane, Sabyasachi Saha, and Sandip Sen. "Evolutionary tournament-based comparison of learning and non-learning algorithms for iterated games." Journal of Artificial Societies and Social Simulation 10.3 (2007).
Grand table: head-to-head scores, performance measures
■ Entries are performance measures for the protagonist (row), which …
■ Often each entry is computed multiple times, to even …
■ Sometimes there is a settling-in phase (a.k.a. burn-in phase) in which …
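The bullets above can be sketched in code. This is a hypothetical illustration, not the slides' actual setup: the agent interface (`act`/`observe`), the Prisoner's-Dilemma payoffs, and the `RandomAgent` stand-in are all assumptions; what it shows is how one grand-table entry combines repeated matches with a discarded burn-in phase.

```python
import random

# One grand-table entry: the row algorithm (protagonist) plays the column
# algorithm; the match is repeated to even out randomness, and rounds in
# the settling-in (burn-in) phase are excluded from the score.

PD = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
      ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

class RandomAgent:
    """Stand-in learner: acts uniformly at random, ignores feedback."""
    def act(self):
        return random.choice(['C', 'D'])
    def observe(self, own, other, reward):
        pass

def play_match(row, col, rounds=1000, burn_in=100):
    total, counted = 0.0, 0
    for t in range(rounds):
        a, b = row.act(), col.act()
        r_row, r_col = PD[(a, b)]
        row.observe(a, b, r_row)
        col.observe(b, a, r_col)
        if t >= burn_in:                 # settling-in rounds don't count
            total += r_row
            counted += 1
    return total / counted

def grand_table(factories, repeats=5, **kw):
    """table[i][j] = mean performance of algorithm i against algorithm j."""
    n = len(factories)
    return [[sum(play_match(factories[i](), factories[j](), **kw)
                 for _ in range(repeats)) / repeats
             for j in range(n)]
            for i in range(n)]
```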
Zero-Determinant strategies
■ One game to test: the …
■ Contestants: 14 constructed …
■ Grand table: all pairs play 200 …
■ Winner: Tit-for-tat.
Axelrod, Robert. "Effective choice in the prisoner's dilemma." Journal of Conflict Resolution 24.1 (1980): 3–25.
■ Second tournament: 64 …
■ In 2012, Alexander Stewart and …
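An Axelrod-style round-robin, where every pair plays 200 rounds and total payoffs are accumulated, can be sketched as follows. The three textbook strategies and the standard payoff values (T=5, R=3, P=1, S=0) are illustrative stand-ins, not Axelrod's actual 14 entrants; in such a tiny pool the winner need not be Tit-for-tat.

```python
# Axelrod-style round-robin sketch: every pair of strategies (including
# self-play) plays 200 rounds of the Prisoner's Dilemma.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opp_history):
    # cooperate first, then mirror the opponent's last move
    return opp_history[-1] if opp_history else 'C'

def all_defect(opp_history):
    return 'D'

def all_cooperate(opp_history):
    return 'C'

def play(s1, s2, rounds=200):
    hist1, hist2 = [], []            # moves made by player 1 / player 2
    score1 = score2 = 0
    for _ in range(rounds):
        a, b = s1(hist2), s2(hist1)  # each side sees the opponent's history
        r1, r2 = PAYOFF[(a, b)]
        hist1.append(a)
        hist2.append(b)
        score1 += r1
        score2 += r2
    return score1, score2

def tournament(strategies):
    totals = {s.__name__: 0 for s in strategies}
    for i, s1 in enumerate(strategies):
        for s2 in strategies[i:]:    # round-robin, including self-play
            r1, r2 = play(s1, s2)
            totals[s1.__name__] += r1
            if s2 is not s1:
                totals[s2.__name__] += r2
    return totals
```

With only these three strategies, always-defect actually outscores Tit-for-tat; Axelrod's result depends on a diverse pool of mostly-nice entrants.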
■ Contestants: FP, Determinate, …
■ Games: a suite of 13 interesting …
■ Game pool: 600 games: 100 …
■ Grand table: each algorithm …
■ Evaluation: through …
■ Conclusion: Q-learning is the …
Mean reward over all opponents and games. Mean regret over all opponents and games.
Mean reward against different game suites. Mean reward against different opponents.
■ Compute the average difference d̄ …
■ If the two series are generated by the same random process, the test statistic t = d̄ / (s_d / √n) follows a Student t-distribution.
■ If t is too eccentric, then we'll have to reject that possibility, since …
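A minimal sketch of this paired t-test, assuming the two score series come from matched runs (same games, same seeds) so that pairing is meaningful:

```python
from math import sqrt
from statistics import mean, stdev

# Paired t-test: compute the mean difference d_bar of matched scores and
# the statistic t = d_bar / (s_d / sqrt(n)). Under the null hypothesis
# that both series come from the same random process, t follows a
# Student t-distribution with n - 1 degrees of freedom.

def paired_t(xs, ys):
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                   # sample std. dev. of differences
    return d_bar / (s_d / sqrt(n))
```

`scipy.stats.ttest_rel` computes the same statistic together with a p-value.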
■ Test whether two distributions are generated by the same random …
■ The test statistic is the maximum distance between the empirical cumulative distribution functions of the two samples.
■ The p …
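The Kolmogorov–Smirnov statistic just described can be computed directly; a minimal sketch:

```python
from bisect import bisect_right

# Two-sample Kolmogorov-Smirnov statistic: the largest vertical distance
# between the empirical cumulative distribution functions (ECDFs) of the
# two samples, evaluated at the sample points.

def ks_statistic(xs, ys):
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)   # ECDF of sample 1 at v
        fy = bisect_right(ys, v) / len(ys)   # ECDF of sample 2 at v
        d = max(d, abs(fx - fy))
    return d
```

`scipy.stats.ks_2samp` computes the same statistic plus a p-value.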
■ The relation between game sizes and rewards. Outcome: no relation.
■ The correlation between regret and average reward.
■ The correlation between distance to nearest Nash and average reward.
■ Which algorithms probabilistically dominate which other algorithms.
■ The difference between average reward and maxmin value (enforceable payoff).
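For the last bullet, the maxmin value is what a player can enforce regardless of the opponent. As a simplified sketch, here is the security level restricted to pure strategies; the true enforceable payoff allows mixed strategies and is found with a linear program.

```python
# Pure-strategy security level of the row player: the best payoff the
# row player can guarantee, max_i min_j A[i][j]. Simplification: the
# maxmin over mixed strategies can be higher.

def pure_maxmin(rows):
    return max(min(row) for row in rows)
```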
■ Contestants: Minimax, FP, QL, …
■ Games: random 2-player, …
■ Grand table: each pair plays …
■ Final ranking: UCB, M3, Sat, …
■ Evaluation: Plot with x-axis = …
■ Bouzy et al. are familiar with …
■ Algorithm: Repeat: eliminate by lag …
■ Final ranking: M3, Sat, UCB, …
■ Algorithm: Repeat: … p(np − 1)2 …
■ Final ranking: M3, Sat, UCB, …
Ranking evolution according to the number of steps played in games (log scale). The key is ordered according to the final ranking.
Ranking based on eliminations (log scale). The key is ordered according to the final ranking.
■ Only cooperative games (shared payoffs): Exp3, M3, Bully, JR, …
■ Only competitive games (zero-sum payoffs): Exp3, M3, Minimax, JR, …
■ Specific matrix games: penalty game, climbing game, coordination …
■ Different numbers of actions (n × n games).
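The two restricted game classes above can be sketched as random matrix-game generators; uniform payoffs in [0, 1) are an assumption for illustration, not the slides' generator.

```python
import random

# Cooperative (shared-payoff) games: both players receive identical
# payoffs. Competitive (zero-sum) games: the payoffs sum to zero in
# every cell.

def random_cooperative(n):
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    return a, a                          # both players share one matrix

def random_zero_sum(n):
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[-x for x in row] for row in a]
    return a, b
```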
■ Contestants: Maxmin, Nash, …
■ Games: 57 distinct 2 × 2 normal …
■ Grand table: Each algorithm …
■ Methodology = evolutionary …
■ Final ranking: BRFP, FP, Saby, …
With tournament selection.
With modified tournament selection.*
* Modified tournament selection is a hybrid of fitness-proportionate selection and 2-sample tournament selection. Cf. Sec. 2.7 of Airiau et al.'s 2007 paper.
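The two standard ingredients of the selection schemes mentioned here can be sketched as follows; Airiau et al.'s modified tournament selection is their hybrid of these two and is not reproduced.

```python
import random

# Fitness-proportionate (roulette-wheel) selection and 2-sample
# tournament selection over a population with a fitness table.

def fitness_proportionate(pop, fitness):
    total = sum(fitness[p] for p in pop)
    r = random.uniform(0, total)
    acc = 0.0
    for p in pop:
        acc += fitness[p]
        if acc >= r:
            return p
    return pop[-1]                       # guard against rounding

def tournament_2(pop, fitness):
    a, b = random.sample(pop, 2)         # draw two, keep the fitter
    return a if fitness[a] >= fitness[b] else b
```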
Diagram: implications among solution concepts. SN = strict Nash, ESS = evolutionarily stable strategy, NSS = neutrally stable strategy, GSS = globally stable state, ASS = asymptotically stable state, LSS = Lyapunov stable state, LP = limit point, NE = Nash equilibrium, FP = fixed point; * = only if fully mixed, i = isolated Nash equilibrium. Dotted lines are indirect implications.