Outline: Overview of Game Theory



  1. Outline
     A. Introduction
     B. Single Agent Learning
     C. Game Theory
     D. Multiagent Learning
     E. Future Issues and Open Problems

     Overview of Game Theory
     Models of Interaction
     – Normal-Form Games
     – Repeated Games
     – Stochastic Games
     Solution Concepts

     Normal-Form Games
     A normal-form game is a tuple $(n, A_{1 \ldots n}, R_{1 \ldots n})$, where
     $n$ is the number of players,
     $A_i$ is the set of actions available to player $i$
     – $A$ is the joint action space $A_1 \times \cdots \times A_n$,
     $R_i$ is player $i$'s payoff function $A \to \mathbb{R}$.

     Example: Rock-Paper-Scissors
     Two players. Each simultaneously picks an action: Rock, Paper, or Scissors.
     The rewards:
     Rock beats Scissors
     Scissors beats Paper
     Paper beats Rock
     The matrices: $R_1$ and $R_2$, each $3 \times 3$ with rows and columns indexed by R, P, S.
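The normal-form definition can be made concrete with a small Python sketch of Rock-Paper-Scissors. The slide's matrix entries did not survive, so the values used here (+1 for a win, -1 for a loss, 0 for a tie) are an assumption, and the names `BEATS`, `payoff`, `R1`, and `R2` are illustrative only.

```python
from itertools import product

# Action set shared by both players (A_1 = A_2).
ACTIONS = ("Rock", "Paper", "Scissors")

# Each key beats its value, as stated in the rewards list.
BEATS = {"Rock": "Scissors", "Scissors": "Paper", "Paper": "Rock"}


def payoff(a1, a2):
    """Return (R_1, R_2) for a joint action.

    Assumed magnitudes: win = +1, loss = -1, tie = 0.
    """
    if a1 == a2:
        return (0, 0)
    return (1, -1) if BEATS[a1] == a2 else (-1, 1)


# Joint action space A = A_1 x A_2.
JOINT_ACTIONS = list(product(ACTIONS, ACTIONS))

# Payoff matrices: row = player 1's action, column = player 2's action.
R1 = [[payoff(a1, a2)[0] for a2 in ACTIONS] for a1 in ACTIONS]
R2 = [[payoff(a1, a2)[1] for a2 in ACTIONS] for a1 in ACTIONS]
```

Under these assumed values every entry of `R2` is the negation of the matching entry of `R1`, so one player's gain is exactly the other's loss.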

  2. More Examples
     Matching Pennies: payoff matrices $R_1$, $R_2$ over actions H, T.
     Coordination Game: payoff matrices over actions A, B.
     Bach or Stravinsky: payoff matrices over actions B, S.

     More Examples
     Prisoner's Dilemma: payoff matrices $R_1$, $R_2$ over actions C, D.
     Three-Player Matching Pennies

     Three-Player Matching Pennies
     Three players. Each simultaneously picks an action: Heads or Tails.
     The rewards:
     Player One wins by matching Player Two,
     Player Two wins by matching Player Three,
     Player Three wins by not matching Player One.

     Three-Player Matching Pennies
     The matrices: two $2 \times 2$ tables over H, T for each of the three players' payoffs.
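The three-player matching pennies rules are enough to rebuild a payoff table, as the Python sketch below does. The reward magnitudes (+1 for winning, -1 otherwise) are an assumption, since the original matrix entries are not recoverable; `payoffs` and `R` are illustrative names.

```python
from itertools import product

ACTIONS = ("H", "T")


def payoffs(a1, a2, a3):
    """Return (R_1, R_2, R_3) for a joint action.

    Assumed magnitudes: a player who wins gets +1, otherwise -1.
    """
    r1 = 1 if a1 == a2 else -1   # Player One wins by matching Player Two
    r2 = 1 if a2 == a3 else -1   # Player Two wins by matching Player Three
    r3 = 1 if a3 != a1 else -1   # Player Three wins by not matching Player One
    return (r1, r2, r3)


# Full payoff table over the joint action space (2^3 = 8 joint actions).
R = {joint: payoffs(*joint) for joint in product(ACTIONS, repeat=3)}

# The distinct totals across joint actions show whether payoffs sum to a constant.
payoff_sums = {sum(r) for r in R.values()}
```

With these values `payoff_sums` contains more than one total, so the payoffs do not sum to a constant across joint actions.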

  3. Strategies
     What can players do?
     – Pure strategies ($a_i$): select an action.
     – Mixed strategies ($\sigma_i$): select an action according to some probability distribution.

     Strategies
     Notation.
     – $\sigma$ is a joint strategy for all players.
     – $\sigma_{-i}$ is a joint strategy for all players except $i$.
     – $\langle \sigma_i, \sigma_{-i} \rangle$ is the joint strategy where $i$ uses strategy $\sigma_i$ and everyone else $\sigma_{-i}$.

     Types of Games
     Zero-Sum Games (a.k.a. constant-sum games): $R_1(a) + R_2(a) = 0$ for all $a \in A$.
     Examples: Rock-paper-scissors, matching pennies.
     Team Games: $R_i(a) = R_j(a)$ for all players $i, j$ and all $a \in A$.
     Examples: Coordination game.
     General-Sum Games (a.k.a. all games)
     Examples: Bach or Stravinsky, three-player matching pennies, prisoner's dilemma.

     Repeated Games
     You can't learn if you only play a game once.
     Repeatedly playing a game raises new questions.
     – How many times? Is this common knowledge?
       Finite Horizon or Infinite Horizon
     – Trading off present and future reward?
       Average Reward: $\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} r_t$
       Discounted Reward: $\sum_{t=1}^{\infty} \gamma^{t-1} r_t$
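A minimal Python sketch of the ideas on these slides: expected payoff under mixed strategies, the zero-sum and team-game conditions, and the two ways of aggregating reward over repeated play. The matching pennies payoffs (+1 and -1) are assumed rather than taken from the slides. All function names are illustrative, and the average is taken over a finite reward sequence as a stand-in for the infinite-horizon limit.

```python
from itertools import product

ACTIONS = ("H", "T")

# Assumed matching pennies payoffs: player 1 gets +1 on a match, -1 otherwise.
R1 = {(a1, a2): (1 if a1 == a2 else -1) for a1, a2 in product(ACTIONS, ACTIONS)}
R2 = {joint: -r for joint, r in R1.items()}


def expected_payoff(R, sigma1, sigma2):
    """Expected payoff when the players independently sample from sigma1 and sigma2."""
    return sum(sigma1[a1] * sigma2[a2] * R[(a1, a2)] for a1, a2 in R)


def is_zero_sum(Ra, Rb):
    """True if the two payoffs cancel for every joint action."""
    return all(Ra[a] + Rb[a] == 0 for a in Ra)


def is_team_game(Ra, Rb):
    """True if both players receive the same payoff for every joint action."""
    return all(Ra[a] == Rb[a] for a in Ra)


def average_reward(rewards):
    """Mean reward over a finite sequence of rounds."""
    return sum(rewards) / len(rewards)


def discounted_reward(rewards, gamma):
    """Sum of rewards, with the reward of round t (t = 0, 1, ...) weighted by gamma**t."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


uniform = {"H": 0.5, "T": 0.5}   # a mixed strategy
pure_heads = {"H": 1.0, "T": 0.0}  # a pure strategy written as a distribution
```

For example, `expected_payoff(R1, uniform, uniform)` evaluates to `0.0`, and `discounted_reward([1, 1, 1], 0.5)` evaluates to `1.75`.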

  4. Repeated Games: Strategies
     What can players do?
     – Strategies can depend on the history of play:
       $\sigma_i : H \to PD(A_i)$ where $H = \bigcup_{n=0}^{\infty} A^n$
     – Markov strategies a.k.a. stationary strategies
     – $k$-Markov strategies

     Repeated Games: Examples
     Iterated Prisoner's Dilemma: payoff matrices $R_1$, $R_2$ over actions C, D.
     – The single most examined repeated game!
     – Repeated play can justify behavior that is not rational in the one-shot game.
     – Tit-for-Tat (TFT)
       Play opponent's last action (C on round 1).
       A 1-Markov strategy.

     Stochastic Games
     MDPs: Single Agent, Multiple State
     Repeated Games: Multiple Agent, Single State
     Stochastic Games: Multiple Agent, Multiple State

     Stochastic Games: Definition
     A stochastic game is a tuple $(n, S, A_{1 \ldots n}, T, R_{1 \ldots n})$, where
     $n$ is the number of agents,
     $S$ is the set of states,
     $A_i$ is the set of actions available to agent $i$
     – $A$ is the joint action space $A_1 \times \cdots \times A_n$,
     $T$ is the transition function $S \times A \times S \to [0, 1]$,
     $R_i$ is the reward function for the $i$th agent $S \times A \to \mathbb{R}$.
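Tit-for-Tat in the iterated prisoner's dilemma can be sketched in a few lines of Python. The payoff numbers below (3, 0, 5, 1) are a commonly used choice and are assumed here, because the slide's own matrix entries are not recoverable. `play`, `tit_for_tat`, and `always_defect` are illustrative names.

```python
# Assumed payoffs: (my action, opponent's action) -> my reward.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(my_history, opp_history):
    """Play the opponent's last action; cooperate on round 1."""
    return opp_history[-1] if opp_history else "C"


def always_defect(my_history, opp_history):
    """A stationary strategy that ignores the history entirely."""
    return "D"


def play(strategy1, strategy2, rounds):
    """Run the repeated game and return both histories and total rewards."""
    h1, h2 = [], []
    total1 = total2 = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past rounds.
        a1 = strategy1(h1, h2)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
        total1 += PAYOFF[(a1, a2)]
        total2 += PAYOFF[(a2, a1)]
    return h1, h2, total1, total2
```

Because `tit_for_tat` reads only `opp_history[-1]`, it depends on a single past joint action, which is what makes it a 1-Markov strategy.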

  5. Stochastic Games: Policies
     What can players do?
     – Policies depend on history and the current state:
       $\pi_i : H \times S \to PD(A_i)$ where $H = \bigcup_{n=0}^{\infty} (S \times A)^n$
     – Markov policies a.k.a. stationary policies:
       $\pi_i : S \to PD(A_i)$
     – Focus on learning Markov policies, but the learning itself is a non-Markovian policy.

     Example: Soccer (Littman, 1994)
     [Figure: grid field with players A and B]
     Players: Two.
     States: Player positions and ball possession (780).
     Actions: N, S, E, W, Hold (5).
     Transitions:
     – Simultaneous action selection, random execution.
     – Collision could change ball possession.
     Rewards: Ball enters a goal.

     Example: Goofspiel
     Players' hands and the deck have cards $1 \ldots n$.
     Card from the deck is bid on secretly.
     Highest card played gets points equal to the card from the deck.
     Both players discard the cards bid.
     Repeat for all $n$ deck cards.

     n     |S|     |S × A|    SIZEOF(π or Q)    V(det)    V(random)
     4     692     15150      59KB
     8                        47MB
     13                       2.5TB

     Stochastic Games: Facts
     If $n = 1$, it is an MDP.
     If $|S| = 1$, it is a repeated game.
     If the other players play a stationary policy, it is an MDP to the remaining player.
     – The interesting case, then, is when the other agents are not stationary, i.e., are learning.
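The fact that a stationary opponent turns a stochastic game into an MDP for the remaining player can be illustrated by averaging the opponent out of the transition and reward functions. The Python sketch below does this for a two-player game stored in plain dictionaries. The data layout, the function name `induced_mdp`, and the toy game are all hypothetical, chosen only for illustration.

```python
def induced_mdp(states, my_actions, opp_actions, T, R, opp_policy):
    """Marginalize a stationary opponent policy out of a two-player stochastic game.

    T[(s, a, b)] maps next states to probabilities, R[(s, a, b)] is my reward,
    and opp_policy[s] maps opponent actions to probabilities.
    Returns (T_hat, R_hat), the single-agent transition and reward functions.
    """
    T_hat, R_hat = {}, {}
    for s in states:
        for a in my_actions:
            next_probs, reward = {}, 0.0
            for b in opp_actions:
                p_b = opp_policy[s].get(b, 0.0)
                reward += p_b * R[(s, a, b)]
                for s2, p in T[(s, a, b)].items():
                    next_probs[s2] = next_probs.get(s2, 0.0) + p_b * p
            T_hat[(s, a)] = next_probs
            R_hat[(s, a)] = reward
    return T_hat, R_hat


# Hypothetical toy game: matching actions moves to s1 and pays 1, otherwise s0 and 0.
STATES = ("s0", "s1")
MY_ACTIONS = OPP_ACTIONS = ("x", "y")
T = {(s, a, b): ({"s1": 1.0} if a == b else {"s0": 1.0})
     for s in STATES for a in MY_ACTIONS for b in OPP_ACTIONS}
R = {(s, a, b): (1.0 if a == b else 0.0)
     for s in STATES for a in MY_ACTIONS for b in OPP_ACTIONS}
# Stationary opponent: the same action distribution in every state.
OPP_POLICY = {s: {"x": 0.25, "y": 0.75} for s in STATES}

T_hat, R_hat = induced_mdp(STATES, MY_ACTIONS, OPP_ACTIONS, T, R, OPP_POLICY)
```

If the opponent's policy changed between rounds, as it does while the opponent is learning, `T_hat` and `R_hat` would change too, and the remaining player would no longer face a fixed MDP.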
