
Games and Adversarial Search



  1. Lecture 17: Games and Adversarial Search
     Marco Chiarandini
     Department of Mathematics & Computer Science, University of Southern Denmark
     Slides by Stuart Russell and Peter Norvig

  2. Course Overview
     ✔ Introduction
       ✔ Artificial Intelligence
       ✔ Intelligent Agents
     ✔ Search
       ✔ Uninformed Search
       ✔ Heuristic Search
     ✔ Uncertain knowledge and Reasoning
       ✔ Probability and Bayesian approach
       ✔ Bayesian Networks
       ✔ Hidden Markov Chains
       ✔ Kalman Filters
     ✔ Learning
       ✔ Supervised: Decision Trees, Neural Networks, Learning Bayesian Networks
       ✔ Unsupervised: EM Algorithm
       ✔ Reinforcement Learning
     ◮ Games and Adversarial Search
       ◮ Minimax search and Alpha-beta pruning
       ◮ Multiagent search
     ◮ Knowledge representation and Reasoning
       ◮ Propositional logic
       ◮ First order logic
       ◮ Inference
     ◮ Planning

  3. Outline
     ♦ Games
     ♦ Perfect play
       – minimax decisions
       – α–β pruning
     ♦ Resource limits and approximate evaluation
     ♦ Games of chance
     ♦ Games with imperfect information

  4. Outline
     1. Introduction
     2. Minimax
     3. α–β Algorithm
     4. Stochastic Games

  5. Multiagent environments
     Multiagent environments:
     ◮ cooperative
     ◮ competitive ➨ adversarial search in games
     AI game theory (combinatorial game theory):
     ◮ deterministic/stochastic
     ◮ turn taking
     ◮ two players
     ◮ zero-sum games = utility values equal and opposite
     ◮ perfect/imperfect information
     ◮ agents are restricted to a small number of actions described by rules
     “Classical” (economic) game theory includes cooperation, chance, imperfect knowledge, and simultaneous moves, and tends to represent real-life decision-making situations.

  6. Types of Games

                             deterministic                          chance
     perfect information     chess, checkers, kalaha, go, othello   backgammon, monopoly
     imperfect information   battleships, blind tictactoe           bridge, poker, scrabble

  7. Games vs. search problems
     “Unpredictable” opponent ⇒ the solution is a strategy/policy specifying a move for every possible opponent reply ➨ contingency strategy
     Optimal strategy: the one that leads to outcomes at least as good as any other strategy when one is playing an infallible opponent
     Search problem → game tree
     ◮ initial state: root of game tree
     ◮ successor function: game rules/moves
     ◮ terminal test (is the game over?)
     ◮ utility function: gives a value for terminal nodes (e.g., +1, −1, 0)
     Terminology:
     ◮ Two players called MAX and MIN.
     ◮ MAX searches the game tree.
     ◮ Ply: one move by one player (a half-move), from “reply”. [A. Samuel 1959]

  8. Game tree (2-player, deterministic, turns)
     [Figure: partial tic-tac-toe game tree. Levels alternate between MAX (X) and MIN (O) down to TERMINAL positions with utility −1, 0, or +1.]

  9. Measures of Game Complexity
     ◮ state-space complexity: number of legal game positions reachable from the initial position of the game.
       An upper bound can often be computed by including illegal positions.
       E.g., TicTacToe:
         $3^9$ = 19,683 positions, including illegal ones
         5,478 after removal of illegal positions
         765 essentially different positions after eliminating symmetries
     ◮ game tree size: total number of possible games that can be played, i.e., the number of leaf nodes in the game tree rooted at the game’s initial position.
       E.g., TicTacToe:
         $9!$ = 362,880 possible games
         255,168 possible games halting when one side wins
         26,830 after removal of rotations and reflections
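The two tic-tac-toe counts that do not involve symmetry (255,168 games, 5,478 legal positions) can be checked by brute force. The sketch below is not from the slides; the board encoding (a 9-character string, X moving first) and the names `winner` and `explore` are assumptions made for illustration.

```python
# Exhaustively enumerate tic-tac-toe, counting complete games (leaves of
# the game tree) and distinct reachable positions (states).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def explore(board, player, seen):
    """Count complete games from `board`; record every position in `seen`."""
    seen.add(board)
    if winner(board) or ' ' not in board:
        return 1  # game over: one leaf of the game tree
    games = 0
    for i in range(9):
        if board[i] == ' ':
            child = board[:i] + player + board[i + 1:]
            games += explore(child, 'O' if player == 'X' else 'X', seen)
    return games

positions = set()
games = explore(' ' * 9, 'X', positions)
print(games, len(positions))  # 255168 5478
```

The position count includes the empty board. Reproducing the symmetry-reduced figures (765 and 26,830) would additionally require canonicalising each board under rotations and reflections, which this sketch leaves out.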

  11. [Figure] First three levels of the tic-tac-toe state space reduced by symmetry: $12 \times 7!$

  12. Outline
     1. Introduction
     2. Minimax
     3. α–β Algorithm
     4. Stochastic Games

  13. Minimax
     Perfect play for deterministic, perfect-information games
     Idea: choose the move to the position with the highest minimax value (i.e., utility for MAX)
       = best achievable payoff against best play
     E.g., 2-ply game:

     MAX root: value 3
     Move   MIN value   Leaves (utilities)
     A1     3           A11 = 3,  A12 = 12, A13 = 8
     A2     2           A21 = 2,  A22 = 4,  A23 = 6
     A3     2           A31 = 14, A32 = 5,  A33 = 2
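The 2-ply example can be evaluated mechanically. The following is a minimal sketch, assuming the tree is encoded as nested Python lists with numbers as leaf utilities; this encoding and the function name `minimax` are illustrative choices, not taken from the slides.

```python
def minimax(node, maximizing):
    """Minimax value of a tree given as nested lists; numbers are leaf utilities."""
    if not isinstance(node, list):
        return node  # terminal node: return its utility
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Leaves of A1, A2, A3 from the 2-ply example.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
min_values = [minimax(subtree, False) for subtree in tree]
root_value = minimax(tree, True)
best_move = min_values.index(root_value)  # index 0 corresponds to A1
print(min_values, root_value, best_move)  # [3, 2, 2] 3 0
```

MAX therefore plays A1, the only move whose worst-case outcome is 3.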

  14. Minimax algorithm
     Recursive depth-first search:
     [Figure: minimax pseudocode; not recovered in this transcript]
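Since the pseudocode on this slide did not survive, here is one possible recursive depth-first formulation, applied to tic-tac-toe. It is a sketch under stated assumptions rather than the slide's own algorithm: the board encoding, the names `utility` and `minimax_value`, and the use of `functools.lru_cache` to memoise repeated positions are all additions for illustration.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def utility(board):
    """+1 if X (MAX) won, -1 if O (MIN) won, 0 for a draw, None if not terminal."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return 1 if board[a] == 'X' else -1
    return 0 if ' ' not in board else None

@lru_cache(maxsize=None)  # memoisation is an optimisation, not part of plain minimax
def minimax_value(board, player):
    """Minimax value of `board` with `player` ('X' = MAX, 'O' = MIN) to move."""
    u = utility(board)
    if u is not None:
        return u  # terminal test passed: return the utility
    other = 'O' if player == 'X' else 'X'
    values = [minimax_value(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == ' ']
    return max(values) if player == 'X' else min(values)

print(minimax_value(' ' * 9, 'X'))  # 0: tic-tac-toe is a draw under perfect play
```

The recursion visits successors depth-first and backs values up the tree, taking the maximum at MAX nodes and the minimum at MIN nodes.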

  15. Properties of minimax
     (b = branching factor, m = maximum depth of the tree in plies)
     Complete? Yes, if the tree is finite (chess has specific rules for this)
     Time complexity? $O(b^m)$
     Space complexity? $O(bm)$ (depth-first exploration)
     But do we need to explore every path?

  16. Measures of Game Complexity
     ◮ game-tree complexity: number of leaf nodes in the smallest full-width decision tree that establishes the value of the initial position.
       A full-width tree includes all nodes at each depth.
       It estimates the number of positions to evaluate in a minimax search to determine the value of the initial position.
       Approximation: the game’s average branching factor to the power of the number of plies in an average game.
       E.g., chess: $b \approx 35$, $m \approx 100$ for “reasonable” games ⇒ exact solution completely infeasible
     ◮ computational complexity: applies to generalized games (e.g., $n \times n$ boards)
       E.g., TicTacToe on an $m \times n$ board with $k$ in a row: solved in $\mathrm{DSPACE}(mn)$ by searching the entire game tree

  17. Historical view
     Time limits ⇒ unlikely to find goal, must approximate
     Plan of attack:
     ◮ Computer considers possible lines of play (Babbage, 1846)
     ◮ Algorithm for perfect play - MINIMAX - (Zermelo, 1912; Von Neumann, 1944)
     ◮ Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948; Shannon, 1950)
     ◮ First chess program (Turing, 1951)
     ◮ Machine learning to improve evaluation accuracy (Samuel, 1952–57)
     ◮ Pruning to allow deeper search - α–β algorithm - (McCarthy, 1956)

  18. Resource limits
     Standard approaches:
     ◮ n-ply lookahead: depth-limited search
     ◮ heuristic descent
     ◮ heuristic cutoff
       1. Use Cutoff-Test instead of Terminal-Test, e.g., a depth limit (perhaps add quiescence search)
       2. Use Eval instead of Utility, i.e., an evaluation function that estimates the desirability of a position
     Suppose we have 100 seconds and explore $10^4$ nodes/second ⇒ $10^6$ nodes per move $\approx 35^{8/2}$
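The node-budget arithmetic can be spelled out as follows. This is a rough sketch: it assumes the node count is about $b^d$ for plain minimax to depth $d$ and about $b^{d/2}$ for α–β with perfect move ordering (the bound quoted on the "Properties of α–β" slide), and the helper name `max_depth` is hypothetical.

```python
seconds_per_move = 100
nodes_per_second = 10**4
budget = seconds_per_move * nodes_per_second  # 10**6 nodes per move
b = 35  # approximate branching factor of chess

def max_depth(budget, nodes_at_depth):
    """Largest depth d >= 0 such that nodes_at_depth(d) <= budget."""
    d = 0
    while nodes_at_depth(d + 1) <= budget:
        d += 1
    return d

plain = max_depth(budget, lambda d: b ** d)        # plain minimax: b^d nodes
ideal = max_depth(budget, lambda d: b ** (d / 2))  # perfectly ordered alpha-beta
print(budget, plain, ideal, b ** 4)  # 1000000 3 7 1500625
```

Strictly within the budget, plain minimax reaches depth 3 and ideally ordered α–β reaches depth 7. The slide's $35^{8/2}$ is about 1.5 million nodes, the same order of magnitude as the budget, which is why the slide treats depth 8 as roughly attainable.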

  19. Heuristic Descent
     [Figure] Heuristic measuring conflict applied to states of tic-tac-toe

  20. Evaluation functions
     [Figure: two chess positions. Left: Black to move, White slightly better. Right: White to move, Black winning.]
     For chess, typically a linear weighted sum of features:
       $\mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$
     e.g., $w_1 = 9$ with $f_1(s)$ = (number of white queens) − (number of black queens), etc.
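A linear evaluation function of this form might look like the sketch below. Only the queen weight of 9 comes from the slide; the other weights are the conventional material values, assumed here for illustration, and the piece-count dictionaries stand in for a real board representation.

```python
# Queen weight 9 is from the slide; the remaining weights are conventional
# material values assumed for illustration.
WEIGHTS = {'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}

def features(white, black):
    """f_i(s) = (number of white pieces of type i) - (number of black pieces of type i)."""
    return {piece: white.get(piece, 0) - black.get(piece, 0) for piece in WEIGHTS}

def evaluate(white, black):
    """Eval(s) = sum over i of w_i * f_i(s), from White's point of view."""
    f = features(white, black)
    return sum(WEIGHTS[piece] * f[piece] for piece in WEIGHTS)

# Hypothetical material counts for one position.
white = {'Q': 1, 'R': 2, 'P': 6}
black = {'Q': 0, 'R': 2, 'B': 1, 'P': 7}
print(evaluate(white, black))  # 9*1 + 5*0 + 3*(-1) + 3*0 + 1*(-1) = 5
```

A positive value favours White, a negative value favours Black, and equal material gives 0.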

  21. Thrashing
     [Figure only; no text recovered]

  22. Digression: Exact values don’t matter
     [Figure: two 2-ply game trees. Left: leaves (1, 2) and (2, 4), MIN values 1 and 2. Right: leaves (1, 20) and (20, 400), MIN values 1 and 20. MAX chooses the same branch in both.]
     Behaviour is preserved under any monotonic transformation of Eval
     Only the order matters: payoff in deterministic games acts as an ordinal utility function
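The claim can be checked on the two trees from the figure. In this sketch the helper `minimax_choice` and the lookup table `transform` are illustrative assumptions; the table is simply one order-preserving map that turns the left tree's leaves into the right tree's.

```python
def minimax_choice(tree):
    """Index of MAX's best move in a 2-ply tree: a list of MIN nodes, each a list of leaves."""
    min_values = [min(leaves) for leaves in tree]
    return max(range(len(tree)), key=lambda i: min_values[i])

original = [[1, 2], [2, 4]]
transform = {1: 1, 2: 20, 4: 400}  # order-preserving: 1 < 2 < 4 maps to 1 < 20 < 400
transformed = [[transform[v] for v in leaves] for leaves in original]

print(transformed, minimax_choice(original), minimax_choice(transformed))
# [[1, 20], [20, 400]] 1 1
```

MAX picks the second branch in both trees, even though the backed-up values differ (2 versus 20).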

  23. Outline
     1. Introduction
     2. Minimax
     3. α–β Algorithm
     4. Stochastic Games

  24. Example
     [Figure only; no text recovered]

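  25. α–β pruning example
     [Figure: 2-ply tree with MAX at the root and three MIN nodes. Leaves: (3, 12, 8), (2, X, X), (14, 5, 2), where X marks a pruned leaf. Root value: 3.]
     $\mathrm{Minimax}(\mathit{root}) = \max\{3, \min\{2, x, y\}, \min\{\dots\}\}$
     Since $\min\{2, x, y\} \le 2 < 3$, the values of x and y cannot change the result, so those two leaves need not be evaluated.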

  26. Why is it called α–β?
     [Figure: a path through alternating MAX and MIN levels down to a node with value V]
     α is the best value (to MAX) found so far along the current path
     If V is worse (<) than α, MAX will avoid it ⇒ prune that branch
     Define β similarly for MIN

  27. The α–β algorithm
     α is the best value for MAX found so far at any choice point above the current node in the game tree. β is defined similarly for MIN.
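One common way to write α–β search is sketched below. It is not necessarily the formulation shown on the original slide; the nested-list tree encoding and the function name `alphabeta` are assumptions, and the leaf values are those of the 2-ply example used throughout the lecture.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if not isinstance(node, list):
        return node  # leaf utility
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)  # best value for MAX so far on this path
            if alpha >= beta:
                break                  # MIN above will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)        # best value for MIN so far on this path
        if alpha >= beta:
            break                      # MAX above already has something better
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # 3
```

On this tree the second MIN node is abandoned after its first leaf (2), because MAX already has 3 available from the first move.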

  28. Properties of α–β
     ◮ Pruning does not affect the final result
     ◮ Good move ordering improves the effectiveness of pruning
     ◮ With “perfect ordering,” time complexity = $O(b^{m/2})$ ⇒ doubles the solvable depth
     ◮ If b is relatively small, random ordering leads to $O(b^{3m/4})$
     ◮ Unfortunately, $35^{50}$ is still impossible!
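The effect of move ordering can be seen by counting evaluated leaves on the lecture's 2-ply example. This is an illustrative sketch: the counting helper `leaves_visited` and the reordered tree are assumptions, with the third MIN node's leaves sorted so that its lowest value is examined first.

```python
import math

def alphabeta_count(node, maximizing, alpha, beta, visited):
    """Alpha-beta search that appends every evaluated leaf to `visited`."""
    if not isinstance(node, list):
        visited.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta_count(child, not maximizing, alpha, beta, visited)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining siblings cannot affect the result
    return value

def leaves_visited(tree):
    """Return (root value, number of leaves evaluated) for a MAX-rooted tree."""
    visited = []
    value = alphabeta_count(tree, True, -math.inf, math.inf, visited)
    return value, len(visited)

as_drawn = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
reordered = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]  # third MIN node: lowest leaf first
print(leaves_visited(as_drawn), leaves_visited(reordered))  # (3, 7) (3, 5)
```

Both orderings return the same root value, 3, but the better ordering evaluates 5 of the 9 leaves instead of 7.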
