Game Theory Preliminaries: Playing and Solving Games

Zero-sum games with perfect information (R&N 6)
• Definitions
• Game evaluation
• Optimal solutions – Minimax
• Non-deterministic games (first take)

Types of Games (informal)

                          Deterministic          Chance
Perfect information       Chess, Checkers, Go    Backgammon, Monopoly
Imperfect information     Battleship             Bridge, Poker, Scrabble, wargames

Note: This initial material uses the common definition of what a “game” is. More interesting is the generalization of the theory to scenarios that are far more useful to a wide range of decision-making problems. Stay tuned.

Definitions
• Two-player game: Player A and Player B. Player A starts.
• Deterministic: None of the moves/states are subject to chance (no random draws).
• Perfect information: Both players see all the states and decisions. Decisions are made sequentially.
• Zero-sum: Player A’s gain is exactly equal to Player B’s loss. Either one of the players wins or there is a draw (both gains are equal).

Example
• Initially, a single stack of pennies stands between the two players, A and B.
• On each turn, a player divides one of the current stacks into two unequal stacks.
• The game ends when every stack contains one or two pennies.
• The first player who cannot play loses.

Game tree for a starting stack of 7 pennies (one level per turn):
A’s turn: (7)
B’s turn: (6, 1), (5, 2), (4, 3)
A’s turn: (5, 1, 1), (4, 2, 1), (3, 2, 2), (3, 3, 1)
B’s turn: (4, 1, 1, 1), (3, 2, 1, 1), (2, 2, 2, 1) [B loses]
A’s turn: (3, 1, 1, 1, 1), (2, 2, 1, 1, 1) [A loses]
B’s turn: (2, 1, 1, 1, 1, 1) [B loses]

Search Problem
• States: Board configuration + next player to move
• Successor: List of states that can be reached from the current state through legal moves
• Terminal state: State at which the game ends
• Payoff/Utility: Numerical value assigned to each terminal state. Example:
  – U(s) = +1 for A win, -1 for B win, 0 for draw
• Game value: The value of the terminal state that will be reached assuming optimal strategies from both players (minimax value)
• Search: Find the move that maximizes the game value from the current state
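The search-problem formulation can be made concrete for the penny-splitting example. The following is a minimal sketch, not part of the original slides: the names `successors`, `utility`, and `game_value` are illustrative, and a state is assumed to be a descending tuple of stack sizes together with a flag saying whether A is to move.

```python
from functools import lru_cache


def successors(stacks):
    """States reachable by splitting one stack into two unequal stacks."""
    result = set()
    for i, n in enumerate(stacks):
        rest = stacks[:i] + stacks[i + 1:]
        # a < n - a guarantees that the two new stacks are unequal
        for a in range(1, (n - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (a, n - a), reverse=True)))
    return sorted(result, reverse=True)


def utility(a_to_move):
    """Payoff to A at a terminal state: the player who cannot move loses."""
    return -1 if a_to_move else +1


@lru_cache(maxsize=None)
def game_value(stacks, a_to_move=True):
    """Value of the terminal state reached when both players play optimally."""
    succ = successors(stacks)
    if not succ:  # terminal: every stack holds one or two pennies
        return utility(a_to_move)
    values = [game_value(s, not a_to_move) for s in succ]
    return max(values) if a_to_move else min(values)


print(successors((7,)))   # first level of the tree
print(game_value((7,)))   # value of the 7-penny game for A
```

Under these assumptions, `game_value((7,))` evaluates to -1: with 7 pennies and A moving first, B wins if both players play optimally.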

Terminal utilities in the example: U(2, 2, 2, 1) = +1, U(2, 2, 1, 1, 1) = -1, U(2, 1, 1, 1, 1, 1) = +1.

Optimal (minimax) Strategies
• Search the game tree such that:
  – A’s turn to move: find the move that yields the maximum payoff from the corresponding subtree. This is the move most favorable to A.
  – B’s turn to move: find the move that yields the minimum payoff from the corresponding subtree. This is the move most favorable to B.

Minimax

Minimax(s)
  If s is terminal
    Return U(s)
  If next move is A
    Return $\max_{s' \in \mathrm{Succs}(s)} \mathrm{Minimax}(s')$
  Else
    Return $\min_{s' \in \mathrm{Succs}(s)} \mathrm{Minimax}(s')$

Example (A moves at the root, B moves at the second level):
A: 3 = max(3, 2, 2)
B: 3 = min(3, 12, 8), 2 = min(2, 4, 6), 2 = min(14, 5, 2)
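As a rough illustration of the minimax recursion, the sketch below evaluates the three-branch example tree. The nested-list encoding (a list is an internal node, a number is a terminal utility) and the function name `minimax` are assumptions made for this example only.

```python
def minimax(node, a_to_move=True):
    """Minimax value of a tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):  # terminal state: return U(s)
        return node
    values = [minimax(child, not a_to_move) for child in node]
    return max(values) if a_to_move else min(values)


# Leaves grouped under the three B nodes of the example.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

print([minimax(child, a_to_move=False) for child in tree])  # values of the B nodes
print(minimax(tree))                                        # value at the root (A)
```

The B nodes evaluate to 3, 2, and 2, so the root value is 3.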

Minimax Properties
• Complete: if the game is finite
• Optimal: if the opponent plays optimally
• Essentially depth-first search (DFS)
• Efficiency:
  – αβ pruning
  – Use heuristic evaluation functions to cut off search early
  – Example: Weighted sum of number of pieces (material value of state)
  – Stop search based on cutoff test (e.g., maximum depth)

Choice of Value?
• Absolute game value is different in the two cases
• Minimax solution is the same
• Only the relative ordering of values matters, not the absolute values: ordinal utility values
• True only for deterministic games
• Evaluation functions can be any function that preserves the ordering of the utility values
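The claim that only the ordering of utilities matters in deterministic games can be checked on a small case. This is a sketch under stated assumptions: the tree is the nested-list example used for minimax, the helper names `best_move` and `transform` are hypothetical, and squaring is chosen only because it preserves order on non-negative values.

```python
def minimax(node, a_to_move=True):
    """Minimax value of a tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not a_to_move) for child in node]
    return max(values) if a_to_move else min(values)


def best_move(tree):
    """Index of A's best move at the root (B moves next)."""
    values = [minimax(child, a_to_move=False) for child in tree]
    return values.index(max(values))


def transform(node, f):
    """Apply f to every leaf utility, keeping the tree shape."""
    if isinstance(node, list):
        return [transform(child, f) for child in node]
    return f(node)


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
squared = transform(tree, lambda u: u * u)  # order-preserving for u >= 0

print(minimax(tree), best_move(tree))        # original value and move
print(minimax(squared), best_move(squared))  # transformed value and move
```

The game value changes from 3 to 9, but the selected move stays the same.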

Non-Deterministic Games
[Figure: game tree with a layer of chance nodes between Player A’s moves and Player B’s moves]

• Includes states where neither player makes a choice. A random decision is made (e.g., rolling dice).
• Use the expected value of successors at chance nodes:
  $\sum_{s' \in \mathrm{Succs}(s)} p(s') \, \mathrm{Minimax}(s')$

Non-Deterministic Minimax

Minimax(s)
  If s is terminal
    Return U(s)
  If next move is A
    Return $\max_{s' \in \mathrm{Succs}(s)} \mathrm{Minimax}(s')$
  If next move is B
    Return $\min_{s' \in \mathrm{Succs}(s)} \mathrm{Minimax}(s')$
  If chance node
    Return $\sum_{s' \in \mathrm{Succs}(s)} p(s') \, \mathrm{Minimax}(s')$
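A small sketch of minimax with chance nodes follows. The tuple encoding `(kind, children)`, the function name `expectiminimax`, and the example tree with its probabilities are all assumptions made for illustration; they do not come from the slides.

```python
def expectiminimax(node):
    """Minimax extended with chance nodes.

    A node is either a number (terminal utility) or a pair (kind, children),
    where kind is "max" (A moves), "min" (B moves), or "chance".
    Children of a chance node are (probability, child) pairs.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


# Hypothetical tree: A chooses between two chance nodes, then B moves.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.9, ("min", [1, 3])), (0.1, ("min", [20, 30]))]),
])

print(expectiminimax(tree))
```

Here the first chance node is worth 0.5·2 + 0.5·6 = 4 and the second 0.9·1 + 0.1·20 = 2.9, so A prefers the first branch.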

Choice of Utility Values
• Different utility values may yield radically different results even though the order is the same: absolute utility values do matter
• Utility should be proportional to the actual payoff; it is not sufficient to follow the same order
• Think of choosing between 2 lotteries with the same odds but radically different payoff distributions
• Implication: Evaluation functions must be positive linear functions of the utility
• Kind of obvious, but an important consideration for later developments
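The lottery argument can be illustrated numerically. In the sketch below, the two lotteries, the order-preserving relabeling, and the helper names `expected` and `best_choice` are hypothetical values chosen for the example, not data from the slides.

```python
def expected(lottery, f=lambda u: u):
    """Expected value of a lottery given as (probability, utility) pairs."""
    return sum(p * f(u) for p, u in lottery)


def best_choice(lotteries, f=lambda u: u):
    """Index of the lottery with the highest expected (transformed) utility."""
    values = [expected(lottery, f) for lottery in lotteries]
    return values.index(max(values))


# Two lotteries with the same odds but different payoffs.
lotteries = [[(0.9, 2), (0.1, 3)], [(0.9, 1), (0.1, 4)]]

# Keeps the order 1 < 2 < 3 < 4 but changes the spacing between values.
order_preserving = {1: 1, 2: 20, 3: 30, 4: 400}.get


# Positive linear transformation a*u + b with a > 0.
def linear(u):
    return 3 * u + 5


print(best_choice(lotteries))                    # original utilities
print(best_choice(lotteries, order_preserving))  # same order, new spacing
print(best_choice(lotteries, linear))            # positive linear transform
```

With the original utilities the first lottery is preferred (2.1 versus 1.3). The order-preserving relabeling flips the choice (21 versus 40.9), while the positive linear transformation leaves it unchanged.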

R&N Chapter 6:
• Definitions
• Game evaluation
• Optimal solutions – Minimax
• Non-deterministic games

R&N Section 17.6:
• Matrix Form of Games

• Assumptions so far:
  – Two-player game: Player A and Player B.
  – Perfect information: Both players see all the states and decisions. Decisions are made sequentially.
  – Zero-sum: Player A’s gain is exactly equal to Player B’s loss.
• We are going to eliminate these constraints. We will first eliminate the assumption of “perfect information”, which leads to far more realistic models.
  – Some more game-theoretic definitions: matrix games
  – Minimax results for perfect information games
  – Minimax results for hidden information games

Extensive form of game: Represent the game by a tree.
[Figure: game tree in extensive form. Player A moves at node 1 (L or R), Player B moves at nodes 2 and 3 (L or R), and Player A moves again at node 4 (L or R). The leaves hold the payoffs.]

A pure strategy for a player defines the move that the player would make for every possible state that the player would see.

Pure strategies for A:
  Strategy I: (1 → L, 4 → L)
  Strategy II: (1 → L, 4 → R)
  Strategy III: (1 → R, 4 → L)
  Strategy IV: (1 → R, 4 → R)

Pure strategies for B:
  Strategy I: (2 → L, 3 → L)
  Strategy II: (2 → L, 3 → R)
  Strategy III: (2 → R, 3 → L)
  Strategy IV: (2 → R, 3 → R)

In general: If N states and B moves, how many pure strategies exist?
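Pure strategies can be enumerated mechanically as one move choice per state. The sketch below assumes every state offers the same set of moves; the function name `pure_strategies` is illustrative. Under that assumption a player with N states and B moves per state has B^N pure strategies, which the enumeration confirms for small cases.

```python
from itertools import product


def pure_strategies(states, moves):
    """All assignments of one move to each state the player may see."""
    return [dict(zip(states, choice))
            for choice in product(moves, repeat=len(states))]


a_strategies = pure_strategies([1, 4], ["L", "R"])  # Player A decides at nodes 1 and 4
b_strategies = pure_strategies([2, 3], ["L", "R"])  # Player B decides at nodes 2 and 3

for name, strategy in zip(["I", "II", "III", "IV"], a_strategies):
    print(name, strategy)

# N = 3 states and B = 2 moves give 2**3 = 8 strategies.
print(len(pure_strategies(["s1", "s2", "s3"], ["L", "R"])))
```

The enumeration order of `itertools.product` happens to match the labeling I to IV used for both players.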

Matrix form of games

[Figure: game tree. At node 1, A chooses L (to node 2) or R (to node 3). At node 2, B chooses L (to node 4) or R (payoff +2). At node 3, B chooses L (payoff +5) or R (payoff +1). At node 4, A chooses L (payoff -1) or R (payoff +4).]

Rows: pure strategies for Player A. Columns: pure strategies for Player B.

         I     II    III   IV
I       -1    -1    +2    +2
II      +4    +4    +2    +2
III     +5    +1    +5    +1
IV      +5    +1    +5    +1

Each entry is Player A’s payoff. For example, the entry in row I, column III (+2) is Player A’s payoff if the game is played with strategy I by Player A and strategy III by Player B.

• Matrix normal form of games: The table contains the payoffs for all the possible combinations of pure strategies for Player A and Player B.
• The table characterizes the game completely; there is no need for any additional information about rules, etc.
• Although, in many cases, the number of pure strategies may be too large for the table to be represented explicitly, the matrix representation is the basic representation that is used for deriving fundamental properties of games.
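The payoff matrix can be derived from the extensive form by playing every pair of pure strategies against each other. The sketch below is one way to do this; the tree encoding (a pair of node id and children for decision nodes, a number for leaves) and the helper names `play` and `strategies` are assumptions, and the leaf payoffs are the ones implied by the matrix.

```python
from itertools import product

# Decision node: (node_id, {"L": child, "R": child}); leaf: payoff to A.
tree = (1, {
    "L": (2, {"L": (4, {"L": -1, "R": +4}), "R": +2}),
    "R": (3, {"L": +5, "R": +1}),
})


def play(node, strategy):
    """Follow the moves prescribed by a combined strategy down to a leaf."""
    while isinstance(node, tuple):
        node_id, children = node
        node = children[strategy[node_id]]
    return node


def strategies(nodes):
    """All pure strategies over the given decision nodes, in order I, II, ..."""
    return [dict(zip(nodes, choice)) for choice in product("LR", repeat=len(nodes))]


# Rows: A's strategies over nodes 1 and 4; columns: B's strategies over nodes 2 and 3.
matrix = [[play(tree, {**a, **b}) for b in strategies([2, 3])]
          for a in strategies([1, 4])]

for row in matrix:
    print(row)
```

Rows III and IV come out identical because node 4 is never reached once A plays R at node 1.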

Minimax: Matrix version

         I     II    III   IV    | Min value across each row
I       -1    -1    +2    +2    | -1
II      +4    +4    +2    +2    | +2
III     +5    +1    +5    +1    | +1
IV      +5    +1    +5    +1    | +1

Max value of all the rows = game value = +2

$$\max_{\text{rows } i} \; \min_{\text{columns } j} M(i, j)$$

• For each strategy (each row of the game matrix), Player A should assume that Player B will use the optimal strategy given Player A’s strategy (the strategy with the minimum value in the row of the matrix). Therefore the best value that Player A can achieve is the maximum over all the rows of the minimum values across each of the rows.
• The corresponding pure strategy is the optimal solution for this game: it is the optimal strategy for A assuming that B plays optimally.
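The row-wise computation is short enough to write out directly. This is a minimal sketch using the 4×4 payoff matrix from the slides; the variable names are illustrative.

```python
# Rows: Player A's strategies I..IV; columns: Player B's strategies I..IV.
M = [
    [-1, -1, +2, +2],
    [+4, +4, +2, +2],
    [+5, +1, +5, +1],
    [+5, +1, +5, +1],
]

row_minima = [min(row) for row in M]     # B's best reply to each row
maximin_value = max(row_minima)          # best guaranteed payoff for A
best_row = row_minima.index(maximin_value)

print(row_minima)                        # minimum of each row
print(maximin_value)                     # max over rows of the row minima
print(["I", "II", "III", "IV"][best_row])
```

The row minima are -1, +2, +1, +1, so Player A secures +2 by playing strategy II.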

Minimax or Maximin?

                                 I     II    III   IV
I                               -1    -1    +2    +2
II                              +4    +4    +2    +2
III                             +5    +1    +5    +1
IV                              +5    +1    +5    +1
Max value across each column:   +5    +4    +5    +2

Min of all the columns = game value = +2

$$\min_{\text{columns } j} \; \max_{\text{rows } i} M(i, j)$$

• But we could have used the opposite argument:
• For each strategy (each column of the game matrix), Player B should assume that Player A will use the optimal strategy given Player B’s strategy (the strategy with the maximum value in the column of the matrix).
• Therefore the best value that Player B can achieve is the minimum over all the columns of the maximum values across each of the columns.
• Problem: Do we get to the same result?
• Is there always a solution?
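For this particular matrix, the two orders of optimization can be compared directly. The sketch below makes no claim about the general case raised by the closing questions; it only checks this example, and the variable names are illustrative.

```python
# Rows: Player A's strategies I..IV; columns: Player B's strategies I..IV.
M = [
    [-1, -1, +2, +2],
    [+4, +4, +2, +2],
    [+5, +1, +5, +1],
    [+5, +1, +5, +1],
]

col_maxima = [max(col) for col in zip(*M)]   # A's best reply to each column
minimax_value = min(col_maxima)              # B's view: min over columns of column maxima
maximin_value = max(min(row) for row in M)   # A's view: max over rows of row minima

# Entries that are both the minimum of their row and the maximum of their column.
saddle_points = [(i, j)
                 for i, row in enumerate(M)
                 for j, value in enumerate(row)
                 if value == min(row) and value == col_maxima[j]]

print(col_maxima)
print(minimax_value, maximin_value)
print(saddle_points)   # zero-based (row, column) indices
```

Both values equal +2 here, and the only entry that is simultaneously a row minimum and a column maximum is row II, column IV.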
