Game Theory 1

Game Theory 1 (Instructor: Gianni A. Di Caro): Ice-Cream Wars - PowerPoint PPT Presentation



  1. 15-382 Collective Intelligence, S18. Lecture 26: Game Theory 1. Instructor: Gianni A. Di Caro

  2. Ice-Cream Wars: http://youtu.be/jILgxeNBK_8

  3. Game Theory
     • Game theory is the formal study of conflict and cooperation in (rational) multi-agent systems.
     • Decision-making where several players must make choices that potentially affect the interests of other players: the effects of the actions of several agents are interdependent (and the agents are aware of it).
     • Example: auctioning!
     • Psychology: theory of social situations.
     15781 Fall 2016: Lecture 22

  4. Elements of a Game
     • The players: how many players are there? Does nature/chance play a role? Players are assumed to be rational.
     • A complete description of what the players can do: the set of all possible actions.

  5. Elements of a Game
     • A description of the payoff / consequences for each player for every possible combination of actions chosen by all players playing the game.
     • A description of all players’ preferences over payoffs: a utility function for each player.

  6. Agent vs. Mechanism Design
     • Agent strategy design (ASD): game theory can be used to compute the expected utility of each decision, and to use it to determine the best strategy (and its expected return) against a rational player. Strategy ≡ Policy.
     • System-level mechanism design: define the rules of the game such that the collective utility of the agents is maximized when each agent's strategy is designed to maximize its own utility according to ASD.

  7. Making Decisions: Basic Definitions
     • Decision-making can involve choosing:
       o one single action, or
       o a sequence of actions.
     • Action outcomes can be certain or subject to uncertainty.
     • A set $B$ of alternative actions to choose from is given; it can be either discrete or continuous.
     • Payoff (for a single agent): a function $\rho: B \to \mathbb{R}$ that associates a numerical value with every action in $B$.
     • Optimal action $b^*$ (for a single-agent scenario): $\rho(b^*) \ge \rho(b) \;\; \forall b \in B$.
     • Payoff (for a multi-agent scenario): the payoff of the action $b$ for agent $j$ depends on the actions of the other players! $\rho: B^o \to \mathbb{R}$, where $o$ is the number of players.
     • Strategy: a rule for choosing an action at every point where a decision might have to be made (depending or not on the other agents).
     • The strategy defines the behavior of an agent.
     • The observed behavior of an agent following a given strategy is the outcome of the strategy.
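
The single-agent optimality condition on this slide can be illustrated with a minimal Python sketch. The action labels and payoff values below are hypothetical, chosen only for illustration; they do not come from the lecture.

```python
# Hypothetical discrete action set B with payoffs rho(b); values are illustrative only.
payoffs = {"b1": 2.0, "b2": 5.0, "b3": 3.5}


def optimal_action(rho):
    """Return an action b* such that rho[b*] >= rho[b] for every b in B."""
    return max(rho, key=rho.get)


best = optimal_action(payoffs)
print(best, payoffs[best])
```

For a continuous action set the same condition would call for an optimization routine rather than an enumeration.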

  8. Pure vs. Randomized Strategies
     • Pure strategy: a strategy in which there is no randomization; one specific action is selected with certainty at each decision node.
     • All possible pure strategies define the pure strategy set $T$.
     • A decision tree can be used to represent a sequence of decisions.
     [Figure: decision trees with decision nodes 1, 2, 3 and actions $b_1$/$b_2$, $c_1$/$c_2$, $d_1$/$d_2$]
     • $B_1 = \{b_1, b_2\}$, $B_2 = \{c_1, c_2\}$, $B_3 = \{d_1, d_2\}$: three action sets (the actions may be the same), which result in the pure strategy set
       $T = \{b_1 c_1 d_1,\ b_1 c_1 d_2,\ b_1 c_2 d_1,\ b_1 c_2 d_2,\ b_2 c_1 d_1,\ b_2 c_1 d_2,\ b_2 c_2 d_1,\ b_2 c_2 d_2\}$
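
The eight-element pure strategy set above is the Cartesian product of the three action sets. A short Python sketch, assuming every decision node is reached so that one action must be specified at each of them:

```python
from itertools import product

# Action sets at the three decision nodes, as named on the slide.
B1 = ["b1", "b2"]
B2 = ["c1", "c2"]
B3 = ["d1", "d2"]

# A pure strategy picks exactly one action at every decision node.
pure_strategies = ["".join(choice) for choice in product(B1, B2, B3)]

print(len(pure_strategies))
print(pure_strategies)
```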

  9. Pure vs. Randomized Strategies
     • In a game, we may observe only a subset of the possible outcomes of a strategy, depending on the starting conditions and on the strategies of the other agents.
     • Strategies that give the same outcome lead to the same payoff.
     • Reduced strategy set: the set formed by combining all pure strategies that lead to indistinguishable outcomes.
     [Figure: decision tree with decision nodes 1, 2, 3 and actions $b_1$/$b_2$, $c_1$/$c_2$, $d_1$/$d_2$]
     • Let the pure strategy set be $\{b_1, b_2\}$: the behavior specifies using $b_1$ with probability $q$ and $b_2$ with probability $1 - q$.
     • A mixed strategy $\gamma$ specifies the probability $q(t)$ with which each of the pure strategies $t \in T$ is used.
     • Payoff for using $\gamma$ (for a single agent): $\rho(\gamma) = \sum_{b \in B} q(b)\,\rho(b)$
     • Payoff in an uncertain world: $\rho(\gamma \mid y) = \sum_{b \in B} q(b)\,\rho(b \mid y)$, where $y$ is the state.
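
The expected payoff of a mixed strategy is a probability-weighted sum over actions. The sketch below evaluates it for a two-action set; the payoff values and the probability 0.25 are hypothetical, chosen only for illustration.

```python
def mixed_payoff(q, rho):
    """Expected payoff: sum over actions b of q(b) * rho(b)."""
    if abs(sum(q.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(q[b] * rho[b] for b in q)


# Hypothetical payoffs for the two pure actions.
rho = {"b1": 4.0, "b2": 1.0}

# b1 is used with probability q_value and b2 with probability 1 - q_value.
q_value = 0.25
q = {"b1": q_value, "b2": 1 - q_value}

expected = mixed_payoff(q, rho)
print(expected)
```

A pure strategy is recovered as the degenerate case in which one action has probability 1.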

  10. Strategies (Policies)
      • Strategy: tells a player what to do for every possible situation throughout the game (a complete algorithm for playing the game). It can be deterministic or stochastic.
      • Strategy set: what strategies are available for the players to play. The set can be finite or infinite (e.g., the beach war game).
      • Strategy profile: a set of strategies for all players which fully specifies all actions in a game. A strategy profile must include one and only one strategy for every player.
      • Pure strategy: one specific element from the strategy set, a single strategy which is played 100% of the time (deterministic).
      • Mixed strategy: an assignment of a probability to each pure strategy (stochastic). A pure strategy is a degenerate case of a mixed strategy.

  11. Information
      • Complete information game: utility functions, payoffs, strategies and “types” of players are common knowledge.
      • Incomplete information game: players may not possess full information about their opponents (e.g., in auctions, each player knows its own utility but not that of the other players). The “parameters” of the game are not fully known.
      • Perfect information game: each player, when making any decision, is perfectly informed of all the events that have previously occurred (e.g., chess). [Full observability]
      • Imperfect information game: not all information is accessible to the player (e.g., poker, prisoner’s dilemma). [Partial observability]

  12. Turn-Taking vs. Simultaneous Moves
      • Static games
        o All players take actions “simultaneously” → imperfect information games
        o Complete information
        o Single-move games
        o Example (figure): Morra
      • Dynamic games
        o Turn-taking games
        o Fully observable ↔ perfect information games
        o Complete information
        o Repeated moves
        [Figure: game tree with max and min levels and leaf values 10, 10, 9, 100]

  13. (Strategic-) Normal-Form Game
      • Let’s focus on static games.
      • There is a strategic interaction among players.
      • A game in normal form consists of:
        o A set of players $O = \{1, \dots, o\}$
        o A strategy set $T$
        o For each $j \in O$, a utility function $v_j$ defined over the set of all possible strategy profiles, $v_j: T^o \to \mathbb{R}$
        o If each player $k \in O$ plays the strategy $t_k \in T$, the utility of player $j$ is $v_j(t_1, \dots, t_o)$, which is the same as player $j$’s payoff when the strategy profile $(t_1, \dots, t_o)$ is chosen
      [Figure: payoff matrix]
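
One way to make the normal-form definition concrete is to tabulate the utility of every player for every strategy profile. The sketch below does this for a finite strategy set; the strategy labels and the utility function are hypothetical placeholders, not a game from the lecture.

```python
from itertools import product

players = [1, 2]      # the player set O = {1, ..., o}, here with o = 2
T = ["x", "y"]        # hypothetical finite strategy set


def v(j, profile):
    """Hypothetical utility of player j: 1 if all players match, else 0."""
    return 1.0 if len(set(profile)) == 1 else 0.0


# Payoff table: one tuple of utilities per strategy profile in T^o.
payoff_table = {
    profile: tuple(v(j, profile) for j in players)
    for profile in product(T, repeat=len(players))
}

for profile, utilities in payoff_table.items():
    print(profile, utilities)
```

For two players this dictionary holds the same information as a payoff matrix, with one entry per cell.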

  14. The Ice Cream Wars
      • $O = \{1, 2\}$
      • $T = [0, 1]$
      • $t_j$ is the fraction of beach
      • …
      • $v_j(t_j, t_k) = \begin{cases} \dfrac{t_j + t_k}{2}, & t_j < t_k \\ 1 - \dfrac{t_j + t_k}{2}, & t_j > t_k \\ \dfrac{1}{2}, & t_j = t_k \end{cases}$
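
The piecewise utility above translates directly into code. A minimal Python sketch follows; the function name and the example positions are assumptions made for illustration.

```python
def ice_cream_utility(t_j, t_k):
    """Utility of the vendor at position t_j when the rival is at t_k, both in [0, 1]."""
    if not (0 <= t_j <= 1 and 0 <= t_k <= 1):
        raise ValueError("positions must lie in [0, 1]")
    midpoint = (t_j + t_k) / 2
    if t_j < t_k:
        return midpoint        # everything to the left of the midpoint
    if t_j > t_k:
        return 1 - midpoint    # everything to the right of the midpoint
    return 0.5                 # same position: the beach is split evenly


left = ice_cream_utility(0.5, 0.75)
right = ice_cream_utility(0.75, 0.5)
print(left, right, left + right)
```

In every case the two utilities sum to 1, so whatever one vendor gains the other loses.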

  15. The Prisoner’s Dilemma (1962)
      • Two men are charged with a crime.
      • They can’t communicate with each other.
      • They are told that:
        o If one rats out and the other does not, the rat will be freed and the other jailed for 9 years
        o If both rat out, both will be jailed for 6 years
      • They also know that if neither rats out, both will be jailed for 1 year.
      [Figure: illustration labeled with the sentences 6, 6, 9, 9]

  16. The Prisoner’s Dilemma (1962)

  17. Prisoner’s Dilemma: Payoff Matrix
      • Don’t confess = Don’t rat out = Cooperate with each other
      • Confess = Defect = Don’t cooperate with each other, act selfishly!

      Payoffs are listed as (A, B):

                            B: Don’t Confess    B: Confess
      A: Don’t Confess          -1, -1             -9, 0
      A: Confess                 0, -9             -6, -6

      What would you do?

  18. Prisoner’s Dilemma: Payoff Matrix
      B doesn’t confess:
      • If A doesn’t confess, B gets -1
      • If A confesses, B gets -9
      B confesses:
      • If A doesn’t confess, B gets 0
      • If A confesses, B gets -6

      Payoffs are listed as (A, B):

                            B: Don’t Confess    B: Confess
      A: Don’t Confess          -1, -1             -9, 0
      A: Confess                 0, -9             -6, -6

      Rational agent B opts to confess.
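
B's case-by-case reasoning amounts to computing a best reply to each of A's actions. A minimal Python sketch using the payoffs from the matrix above; the dictionary layout and the action labels "dont" and "confess" are naming assumptions.

```python
# PAYOFF[(A's action, B's action)] = (payoff to A, payoff to B), from the slide's matrix.
PAYOFF = {
    ("dont", "dont"): (-1, -1),
    ("dont", "confess"): (-9, 0),
    ("confess", "dont"): (0, -9),
    ("confess", "confess"): (-6, -6),
}
ACTIONS = ["dont", "confess"]


def best_reply_for_b(a_action):
    """Return B's payoff-maximizing action given A's action."""
    return max(ACTIONS, key=lambda b_action: PAYOFF[(a_action, b_action)][1])


replies = {a_action: best_reply_for_b(a_action) for a_action in ACTIONS}
print(replies)
```

Whatever A does, B's best reply is to confess, which matches the slide's conclusion.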

  19. Prisoner’s Dilemma
      • Confess (defection, acting selfishly) is a dominant strategy for B: no matter what A plays, the best reply is always to confess.
      • (Strictly) dominant strategy: yields a player a strictly higher payoff than any of its other strategies, no matter which decision(s) the other player(s) make.
      • Weakly dominant: ties are allowed in some cases.
      • Confess is a dominant strategy for A as well.
      • A will reason as follows: B’s dominant strategy is to Confess; therefore, given that we are both rational agents, B will also Confess and we will both get 6 years.
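
Strict dominance can be checked mechanically by comparing an action against every alternative for every choice of the opponent. The sketch below does this for the two-player prisoner's dilemma payoffs from the lecture; the function name and the player indexing (0 for A, 1 for B) are assumptions.

```python
# PAYOFF[(A's action, B's action)] = (payoff to A, payoff to B).
PAYOFF = {
    ("dont", "dont"): (-1, -1),
    ("dont", "confess"): (-9, 0),
    ("confess", "dont"): (0, -9),
    ("confess", "confess"): (-6, -6),
}
ACTIONS = ["dont", "confess"]


def is_strictly_dominant(player, action):
    """True if `action` beats every alternative for `player` (0 = A, 1 = B)
    against every action of the other player."""
    for other in ACTIONS:
        for alternative in ACTIONS:
            if alternative == action:
                continue
            chosen = (action, other) if player == 0 else (other, action)
            rival = (alternative, other) if player == 0 else (other, alternative)
            if PAYOFF[chosen][player] <= PAYOFF[rival][player]:
                return False
    return True


for player, name in enumerate(["A", "B"]):
    for action in ACTIONS:
        print(name, action, is_strictly_dominant(player, action))
```

Replacing `<=` with `<` in the comparison would give a simplified check that tolerates ties, in the spirit of weak dominance.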

  20. Prisoner’s Dilemma
      • But is the dominant strategy (C,C) the best strategy?

      Payoffs are listed as (A, B):

                            B: Don’t Confess    B: Confess
      A: Don’t Confess          -1, -1             -9, 0
      A: Confess                 0, -9             -6, -6
