Chapter 6: Adversarial Search (2007-04-19)




Game Theory

  • Studied by mathematicians, economists, and finance theorists
  • In AI we limit games to those that are:
      • deterministic
      • turn-taking
      • two-player
      • zero-sum (a win-lose game: one side's gain is exactly the other's loss)
      • of perfect information

This means deterministic, fully observable environments in which there are two agents whose actions must alternate and in which the utility values at the end of the game are always equal and opposite.


Types of Games

  • Game playing was one of the first tasks undertaken in AI.
  • Machines have surpassed humans on checkers and Othello, and have defeated human champions in chess and backgammon.

  • In Go, computers perform at the amateur level.

                          Deterministic             Chance
  Perfect information     Chess, Checkers, Go,      Backgammon, Monopoly
                          Othello
  Imperfect information   Blind tic-tac-toe         Bridge, Poker


Games as Search Problems

  • Games offer pure, abstract competition.
  • A chess-playing computer would be an existence proof of a machine doing something generally thought to require intelligence.
  • Games are idealizations of worlds in which:
      • the world state is fully accessible;
      • the (small number of) actions are well-defined;
      • uncertainty remains, both due to moves by the opponent and due to the complexity of the game.


Games as Search Problems (cont.-1)

  • Games are usually much too hard to solve. For example, in a typical chess game:
      • average branching factor: 35
      • average moves by each player: 50
      • total number of nodes in the search tree: 35^100, or about 10^154 (although the total number of distinct legal positions is only about 10^40)
  • Time limits require making good decisions without complete search.


Games as Search Problems (cont.-2)

  • Initial State
      • How does the game start?
  • Successor Function
      • A list of legal (move, state) pairs for each state
  • Terminal Test
      • Determines when the game is over
  • Utility Function
      • Provides a numeric value for all terminal states
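These four components can be written down directly as code. A minimal sketch for tic-tac-toe (all function names and the board encoding are illustrative, not from the slides):

```python
# Game formulation for tic-tac-toe: initial state, successor function,
# terminal test, and utility function (names are illustrative).
WIN_LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def initial_state():
    return (' ',) * 9, 'X'            # empty board; MAX ('X') moves first

def successors(state):
    # A list of legal (move, state) pairs for the given state.
    board, player = state
    nxt = 'O' if player == 'X' else 'X'
    return [(i, (board[:i] + (player,) + board[i+1:], nxt))
            for i in range(9) if board[i] == ' ']

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def terminal_test(state):
    board, _ = state
    return winner(board) is not None or ' ' not in board

def utility(state):
    # Zero-sum: +1 if MAX ('X') won, -1 if MIN ('O') won, 0 for a draw.
    return {'X': 1, 'O': -1}.get(winner(state[0]), 0)
```

The same four-function interface works for any deterministic, turn-taking, two-player game; only the board representation and the rules change.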

Partial Game Tree


Optimal strategies

  • Find the contingent strategy for MAX assuming an infallible MIN opponent.
  • Assumption: both players play optimally!
  • Given a game tree, the optimal strategy can be determined by using the minimax value of each node:

MinimaxValue(n) =
    Utility(n)                                   if n is a terminal state
    max_{s ∈ Successors(n)} MinimaxValue(s)      if n is a MAX node
    min_{s ∈ Successors(n)} MinimaxValue(s)      if n is a MIN node


Minimax

  • Perfect play for deterministic, perfect-information games
  • Idea: choose the move leading to the position with the highest minimax value, i.e. the best achievable payoff against best play.


Two-Ply Game Tree


Two-Ply Game Tree (cont.-1)


Two-Ply Game Tree (cont.-2)


Two-Ply Game Tree (cont.-3)

The minimax decision

Minimax maximizes the worst-case outcome for MAX.


Minimax Algorithm


The Minimax Algorithm (cont.)

  • Generate the whole game tree.
  • Apply the utility function to each terminal state.
  • Determine the utility of the nodes one level up from the terminal nodes.
  • Continue backing up the values toward the root.
  • At the root, MAX chooses the move leading to the highest utility value.

Utility(n) = max/min(n.1, n.2, …, n.b)
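The backing-up procedure above is the standard recursive minimax. A minimal sketch over an explicit game tree (the representation, where a leaf is its utility value and an internal node is a list of children, is an assumption for illustration):

```python
# Minimax over an explicit game tree: a leaf is a number (its utility);
# an internal node is a list of children; levels alternate MAX / MIN.
def minimax_value(node, is_max=True):
    if isinstance(node, (int, float)):        # terminal state
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

def minimax_decision(root):
    # MAX chooses the child (move index) with the highest minimax value.
    values = [minimax_value(child, is_max=False) for child in root]
    return max(range(len(root)), key=values.__getitem__)
```

On a two-ply tree such as [[3, 12, 8], [2, 4, 6], [14, 5, 2]], the MIN-level values back up as 3, 2, 2, so the root value is 3 and MAX picks the first move.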


Analysis of Minimax

Complete?  Yes, if the tree is finite.
Optimal?   Yes, against an optimal opponent. (Otherwise, even better outcomes may be possible.)
Time?      O(b^m) — a complete depth-first search (m: max depth, b: number of legal moves)
Space?     O(bm) if all successors are generated at once, or O(m) if successors are generated one at a time

For chess, b ≈ 35 and m ≈ 100 for "reasonable" games ⇒ exact solution is completely infeasible.


Optimal Decisions in Multiplayer Games

  • Extend the minimax idea to multiplayer games
  • Replace the single value for each node with a vector of values
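The vector-of-values idea can be sketched directly; each leaf holds one utility per player, and the player to move maximizes its own component (the tuple layout and names are assumptions, not from the slides):

```python
# Multiplayer minimax: a leaf is a utility vector (one entry per player,
# as a tuple); an internal node is a list of children.  At an internal
# node, the player to move picks the child maximizing its own component.
def multi_minimax(node, player, num_players):
    if isinstance(node, tuple):               # terminal: utility vector
        return node
    nxt = (player + 1) % num_players          # turns rotate among players
    child_values = [multi_minimax(c, nxt, num_players) for c in node]
    return max(child_values, key=lambda v: v[player])
```

For two-player zero-sum games this collapses to ordinary minimax, since maximizing one component is the same as minimizing the other.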


α-β Pruning

  • The problem with minimax search: the number of states to examine is exponential in the number of moves.
  • α-β pruning returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
  • α: the value of the best (highest-value) choice found so far for MAX
  • β: the value of the best (lowest-value) choice found so far for MIN
  • The order of considering successors matters (look at step f of Fig 6.5, p. 168).
  • If possible, consider the best successors first.

α-β Pruning (cont.)

If m is better than n for Player, we will never reach n in actual play, so n can simply be pruned.


α-β Pruning Example

(figure: depth-first search until the first leaf; each node starts with the full range of possible values [-∞, +∞])


α-β Pruning Example (cont.-1)

(figure: node value ranges [-∞, 3] and [-∞, +∞])


α-β Pruning Example (cont.-2)

(figure: node value ranges [-∞, 3] and [-∞, +∞])


α-β Pruning Example (cont.-3)

(figure: node value ranges [3, +∞] and [3, 3])


α-β Pruning Example (cont.-4)

(figure: node value ranges [-∞, 2], [3, +∞], [3, 3]; this node is worse for MAX)


α-β Pruning Example (cont.-5)

(figure: node value ranges [-∞, 2], [3, 14], [3, 3], [-∞, 14])


α-β Pruning Example (cont.-6)

(figure: node value ranges [-∞, 2], [3, 5], [-∞, 5])


α-β Pruning Example (cont.-7)

(figure: node value ranges [2, 2], [-∞, 2], [3, 3], [3, 3])


α-β Pruning Example (cont.-8)

(figure: node value ranges [2, 2], [-∞, 2], [3, 3], [3, 3])


The α-β Algorithm


The α-β Algorithm (cont.)
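The algorithm figures are not reproduced here; a minimal sketch over the same explicit-tree representation used for minimax (leaves are numbers, internal nodes are lists — an illustrative assumption):

```python
# Alpha-beta search: same result as minimax, but prunes branches that
# cannot influence the decision.  alpha = best value found so far for
# MAX along the path; beta = best (lowest) value found so far for MIN.
def alphabeta(node, alpha=float('-inf'), beta=float('inf'), is_max=True):
    if isinstance(node, (int, float)):        # terminal state
        return node
    if is_max:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            if value >= beta:                 # MIN above will never allow this
                return value                  # beta cutoff: prune the rest
            alpha = max(alpha, value)
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        if value <= alpha:                    # alpha cutoff
            return value
        beta = min(beta, value)
    return value
```

On the two-ply tree [[3, 12, 8], [2, 4, 6], [14, 5, 2]] it returns the same root value 3 as minimax, while never examining the 4 and 6 under the second MIN node.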


Analysis of α-β Algorithm

  • Pruning does not affect final result.
  • Entire subtrees can be pruned.
  • Good move ordering improves its effectiveness: performance is highly dependent on the order in which successors are examined ⇒ it is worthwhile to examine first the successors that are likely to be best. E.g. in Figure 6.5 (e, f): if the successors of D are ordered 2, 5, 14 (instead of 14, 5, 2), then 5 and 14 can be pruned.


Analysis of α-β Algorithm (cont.)

  • With best-move-first ordering:
      • the total number of nodes examined is O(b^(d/2))
      • the effective branching factor becomes b^(1/2) — for chess, about 6 instead of 35 — i.e. α-β can look ahead roughly twice as far as minimax in the same amount of time.
  • With random ordering:
      • the total number of nodes examined is O(b^(3d/4)) for moderate b
  • Repeated states are again possible.
  • Store their values in memory: the transposition table.
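A transposition table is just a cache of values keyed by position. A sketch layered over plain minimax (the interface, with the game passed in as functions, is an assumption for illustration):

```python
# Transposition table: cache the minimax value of repeated states so
# each distinct (state, player-to-move) pair is evaluated only once.
def minimax_tt(state, successors, utility, is_terminal, is_max, table=None):
    if table is None:
        table = {}
    key = (state, is_max)                     # state must be hashable
    if key in table:                          # transposition: reuse value
        return table[key]
    if is_terminal(state):
        value = utility(state)
    else:
        vals = [minimax_tt(s, successors, utility, is_terminal,
                           not is_max, table) for s in successors(state)]
        value = max(vals) if is_max else min(vals)
    table[key] = value
    return value
```

In a real game program the key is a hash of the board (often a Zobrist hash) rather than the whole position, and entries also record search depth.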

Imperfect, Real-Time Decisions

  • Minimax and alpha-beta pruning require too many leaf-node evaluations.
  • This may be impractical within a reasonable amount of time.
  • Shannon (1950) proposed:
      • apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta)
      • cut off the search earlier (replacing the terminal test by a cutoff test)


Heuristic Evaluation Functions

  • Produce an estimate of the expected utility of the game from a given position.
  • Performance depends on the quality of EVAL.
  • Requirements:
      • EVAL should order terminal nodes in the same way as UTILITY.
      • The computation must not take too long.
      • For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
  • Most evaluation functions work by calculating various features of the state. What are features of chess? E.g. the number of pawns possessed, etc.
  • Weighted linear function:

        Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)

    The addition assumes the features are independent of one another.


Heuristic Evaluation Functions (cont.-1)

  • Give a material value to each piece (from chess books):

        pawn                1
        knight or bishop    3
        rook                5
        queen               9
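A weighted linear evaluation built from these material values, in code. Each feature is the white-minus-black count of one piece type; the dict-of-counts board encoding is an illustrative assumption:

```python
# Weighted linear evaluation: Eval(s) = w1*f1(s) + ... + wn*fn(s).
# Here feature fi is (white count - black count) for piece type i,
# weighted by the standard material values quoted above.
WEIGHTS = {'pawn': 1, 'knight': 3, 'bishop': 3, 'rook': 5, 'queen': 9}

def material_eval(white_pieces, black_pieces):
    # white_pieces / black_pieces: dicts mapping piece type -> count
    return sum(w * (white_pieces.get(p, 0) - black_pieces.get(p, 0))
               for p, w in WEIGHTS.items())
```

For example, a position where White is up a rook with pawns otherwise equal evaluates to +5 from White's point of view.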


Heuristic Evaluation Functions (cont.-2)

Heuristic difficulties (when the heuristic merely counts material won), e.g. two slightly different chess positions: (a) Black has an advantage of a knight and two pawns and will win the game; (b) Black will lose after White captures the queen.


Cutting Off Search

  • When do you recurse, and when do you use the evaluation function?

    if Cutoff-Test(state, depth) then return Eval(state)

  • The simplest way to control the amount of search is to set a fixed depth limit d: Cutoff-Test(state, depth) returns true for all depths greater than d, at which point the evaluation function is applied.
  • Cutoff beyond a certain depth.
  • Cutoff if the state is stable (more predictable).
  • Cutoff moves you know are bad (forward pruning).
  • Cutting off can have a disastrous effect if the evaluation function is not sophisticated enough.
  • The search should continue until a quiescent position is found (one with no wild swings in value in the near future).
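The substitution of Cutoff-Test for Terminal-Test is mechanical. A sketch over the explicit-tree representation, with a fixed depth limit (the quiescence check is omitted, and all names are illustrative):

```python
# Depth-limited minimax: Cutoff-Test replaces Terminal-Test, and Eval
# replaces Utility whenever the cutoff fires before a true terminal.
def dl_minimax(node, eval_fn, limit, depth=0, is_max=True):
    if isinstance(node, (int, float)):   # true terminal state
        return node                      # exact utility
    if depth >= limit:                   # Cutoff-Test(state, depth)
        return eval_fn(node)             # heuristic estimate instead
    vals = [dl_minimax(c, eval_fn, limit, depth + 1, not is_max)
            for c in node]
    return max(vals) if is_max else min(vals)
```

With a generous limit the result equals full minimax; with a tight limit, the answer is only as good as eval_fn at the frontier.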


Cutting Off Search (cont.)

  • Does it work in practice?

    With a budget of b^m = 10^6 nodes and b = 35, we get m ≈ 4.

    A 4-ply lookahead is a hopeless chess player:
      4-ply  ≈ human novice
      8-ply  ≈ typical PC, human master
      12-ply ≈ Deep Blue, Kasparov


Horizon Effect

  • The horizon effect arises when the program is facing a move by the opponent that causes serious damage and is ultimately unavoidable.
  • At present, no general solution has been found for the horizon problem. (From Russell, 1st ed., p. 129:) a series of checks by the black rook forces the inevitable queening move by White "over the horizon," making the position look like a win for Black when it is really a win for White.


Quiescence

  • A quiescent position is one that is unlikely to exhibit a dramatic change in value in the near future.
  • Quiescence searches are typically applied only to certain types of moves (in chess, e.g., capture moves and moves protecting the king).


Deterministic Games in Practice

  • Checkers (computer wins, 1994)
    Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
  • Chess (Deep Blue wins, 1997)
    Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses a very sophisticated evaluation function, and undisclosed methods for extending some lines of search up to 40 ply.
  • Othello (computer wins)
    Human champions refuse to compete against computers, which are too good.
  • Go (human wins)
    Human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Nondeterministic Games: Backgammon

White moves clockwise toward 25; Black moves counterclockwise toward 0. A piece can move to any position unless multiple opponent pieces are there; if there is exactly one opponent piece, it is captured and must start over. White has rolled 6-5 and must choose among four legal moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), and (5-11, 11-16).


Nondeterministic Games: Backgammon (cont.)

Chance nodes are included in the game tree.


Nondeterministic Games in General

In nondeterministic games, chance is introduced by dice rolls, card-shuffling, and the like.


Algorithm for Nondeterministic Games

  • Expectiminimax gives perfect play.
  • Expectiminimax(n) =
        Utility(n)                                           if n is a terminal state
        max_{s ∈ Successors(n)} Expectiminimax(s)            if n is a MAX node
        min_{s ∈ Successors(n)} Expectiminimax(s)            if n is a MIN node
        Σ_{s ∈ Successors(n)} P(s) · Expectiminimax(s)       if n is a chance node
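The definition translates directly to code. A sketch over tagged trees (the tagging scheme — ('max', …), ('min', …), ('chance', …) with probability-child pairs — is an illustrative assumption):

```python
# Expectiminimax over tagged trees: ('max', [children]),
# ('min', [children]), ('chance', [(p, child), ...]);
# a bare number is a terminal utility.
def expectiminimax(node):
    if isinstance(node, (int, float)):        # terminal state
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average of child values
    return sum(p * expectiminimax(c) for p, c in children)
```

For backgammon the levels alternate MAX, chance, MIN, chance, but the recursion handles any interleaving of node kinds.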


Digression: Exact Values DO Matter

Behavior is preserved only by a positive linear transformation of Eval. Hence Eval should be proportional to the expected payoff.

(figure: the same chance tree evaluated with two different leaf-value scales; under one scale move A1 is best, under the other move A2 is best)
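The point can be checked numerically: with chance nodes, an order-preserving but non-linear transformation of the leaf values can flip the preferred move. This toy example (two moves, each a 50/50 gamble) is illustrative, not the figure's actual numbers:

```python
# With chance nodes, only positive linear transformations of Eval
# preserve behavior.  Squaring is order-preserving but non-linear,
# and it flips the decision below.
def expected(leaves, transform=lambda x: x):
    # Expected value of a 50/50 chance node over the given leaves.
    return sum(0.5 * transform(v) for v in leaves)

a1, a2 = [2, 2], [0, 3]                  # leaf values under each move
best_raw = 'A1' if expected(a1) > expected(a2) else 'A2'          # 2.0 vs 1.5
best_sq = 'A1' if expected(a1, lambda x: x * x) > \
                  expected(a2, lambda x: x * x) else 'A2'         # 4.0 vs 4.5
```

The raw values prefer A1 (2.0 > 1.5); after squaring every leaf, A2 wins (4.5 > 4.0), even though squaring preserves the ordering of individual leaves.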


Games of Imperfect Information

  • E.g. card games, where the opponent's initial cards are unknown.
  • Typically we can calculate a probability for each possible deal.
  • This seems just like having one big dice roll at the beginning of the game.
  • Idea: compute the minimax value of each action in each deal, then choose the action with the highest expected value over all deals.
  • Special case: if an action is optimal for all deals, it is optimal.
  • The GIB program (Ginsberg, 1999), the current best bridge program, approximates this idea by (1) generating 100 deals consistent with the hidden information and (2) picking the action that wins the most tricks on average.
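The averaging-over-deals idea can be sketched generically; the deal generator and per-deal scorer below are stand-ins for illustration, not GIB's actual interface:

```python
import random

# Monte-Carlo idea behind GIB-style play: sample deals consistent with
# the visible information, score each candidate action with a per-deal
# solver, and pick the action with the best average score.
def monte_carlo_action(actions, sample_deal, score_fn, n_deals=100, seed=0):
    rng = random.Random(seed)                 # seeded for reproducibility
    totals = {a: 0.0 for a in actions}
    for _ in range(n_deals):
        deal = sample_deal(rng)               # one deal consistent with info
        for a in actions:
            totals[a] += score_fn(a, deal)    # e.g. tricks won in this deal
    return max(actions, key=lambda a: totals[a])
```

With enough samples, the averages approach the true expected values, so the chosen action approximates the expectation-maximizing one.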


Example


Example (cont.-1)


Example (cont.-2)


Summary

  • Games are fun to work on!
  • They illustrate several important points about AI:
      • perfection is unattainable, so we must approximate;
      • it is a good idea to think about what to think about;
      • uncertainty constrains the assignment of values to states.
  • Games are to AI as Grand Prix racing is to automobile design.