Game Playing Daniil Pakhomov (slides by Philipp Koehn) 26 February

  1. Game Playing Daniil Pakhomov (slides by Philipp Koehn) 26 February 2019 Philipp Koehn Artificial Intelligence: Game Playing 26 February 2019

  2. Outline
     ● Games
     ● Perfect play
       – minimax decisions
       – α–β pruning
     ● Resource limits and approximate evaluation
     ● Games of chance
     ● Games of imperfect information

  3. games

  4. Games vs. Search Problems
     ● “Unpredictable” opponent ⇒ solution is a strategy specifying a move for every possible opponent reply
     ● Time limits ⇒ unlikely to find goal, must approximate
     ● Plan of attack:
       – computer considers possible lines of play (Babbage, 1846)
       – algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
       – finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948; Shannon, 1950)
       – first Chess program (Turing, 1951)
       – machine learning to improve evaluation accuracy (Samuel, 1952–57)
       – pruning to allow deeper search (McCarthy, 1956)

  5. Types of Games

                              deterministic                    chance
     perfect information      Chess, Checkers, Go, Othello     Backgammon, Monopoly
     imperfect information    battleships, Blind Tic Tac Toe   Bridge, Poker, Scrabble

  6. Game Tree (2-player, Deterministic, Turns)

  7. Simple Game Tree
     ● 2 player game
     ● Each player has one move
     ● You move first
     ● Goal: optimize your payoff (utility)
     Figure levels: Start, Your move, Opponent move, Your payoff

  8. minimax

  9. Minimax
     ● Perfect play for deterministic, perfect-information games
     ● Idea: choose move to position with highest minimax value = best achievable payoff against best play
     ● E.g., 2-player game, one move each (figure not preserved)

  10. Minimax Algorithm

      function MINIMAX-DECISION(state) returns an action
        inputs: state, current state in game
        return the a in ACTIONS(state) maximizing MIN-VALUE(RESULT(a, state))

      function MAX-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← −∞
        for a, s in SUCCESSORS(state) do v ← MAX(v, MIN-VALUE(s))
        return v

      function MIN-VALUE(state) returns a utility value
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← ∞
        for a, s in SUCCESSORS(state) do v ← MIN(v, MAX-VALUE(s))
        return v
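The pseudocode above can be sketched in Python roughly as follows. The tree encoding is an assumption made only for illustration (a leaf is a number giving the payoff to MAX, an internal node is a list of child subtrees), and the sample tree is a made-up toy, not necessarily the one drawn on the slides.

```python
import math

# Assumed toy encoding (not from the slides): a leaf is a number (payoff to
# MAX), an internal node is a list of child subtrees.
def is_terminal(state):
    return not isinstance(state, list)

def max_value(state):
    """Value of a node where MAX is to move."""
    if is_terminal(state):
        return state
    v = -math.inf
    for s in state:
        v = max(v, min_value(s))
    return v

def min_value(state):
    """Value of a node where MIN is to move."""
    if is_terminal(state):
        return state
    v = math.inf
    for s in state:
        v = min(v, max_value(s))
    return v

def minimax_decision(state):
    """Index of the child with the highest MIN-VALUE (MAX moves at the root)."""
    return max(range(len(state)), key=lambda a: min_value(state[a]))

# Hypothetical two-ply tree: MAX picks a branch, then MIN picks a leaf.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_decision(tree), max_value(tree))  # prints: 0 3
```

In this toy tree MIN would answer the three branches with 3, 2 and 2, so MAX prefers the first branch.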

  11. Properties of Minimax
      ● Complete? Yes, if tree is finite
      ● Optimal? Yes, against an optimal opponent. Otherwise??
      ● Time complexity? $O(b^m)$
      ● Space complexity? $O(bm)$ (depth-first exploration)
      ● For Chess, b ≈ 35, m ≈ 100 for “reasonable” games ⇒ exact solution completely infeasible
      ● But do we need to explore every path?
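To get a feel for why an exact solution is infeasible, the short calculation below evaluates the time bound b^m and the space bound b·m for the Chess figures quoted above (b ≈ 35, m ≈ 100). It is only an order-of-magnitude illustration; the variable names are arbitrary.

```python
# Rough size of the full minimax tree for the Chess figures quoted above.
b, m = 35, 100

nodes = b ** m             # time bound O(b^m), as an exact Python integer
digits = len(str(nodes))   # number of decimal digits in b^m
frontier = b * m           # space bound O(bm) for depth-first exploration

print(digits)    # prints: 155  (b^m is roughly 10^154)
print(frontier)  # prints: 3500
```

Memory is not the obstacle here; the number of nodes to visit is.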

  12. α–β Pruning Example

  13. α–β Pruning Example

  14. α–β Pruning Example

  15. α–β Pruning Example

  16. α–β Pruning Example

  17. α–β Pruning Example

  18. α–β Pruning Example

  19. Why is it Called α–β?
      ● α is the best value (to MAX) found so far off the current path
      ● If the value V of the node being examined is worse than α, MAX will avoid it ⇒ prune that branch
      ● Define β similarly for MIN

  20. The α–β Algorithm

      function ALPHA-BETA-DECISION(state) returns an action
        return the a in ACTIONS(state) maximizing MIN-VALUE(RESULT(a, state), −∞, +∞)

      function MAX-VALUE(state, α, β) returns a utility value
        inputs: state, current state in game
                α, the value of the best alternative for MAX along the path to state
                β, the value of the best alternative for MIN along the path to state
        if TERMINAL-TEST(state) then return UTILITY(state)
        v ← −∞
        for a, s in SUCCESSORS(state) do
          v ← MAX(v, MIN-VALUE(s, α, β))
          if v ≥ β then return v
          α ← MAX(α, v)
        return v

      function MIN-VALUE(state, α, β) returns a utility value
        same as MAX-VALUE but with roles of α, β reversed
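A possible Python rendering of the α–β pseudocode is sketched below, with MIN-VALUE written out explicitly. As assumptions for illustration only: game trees are nested lists with numeric leaves (payoff to MAX), and the root decision carries its running best value forward as α, which is a small refinement not spelled out on the slide.

```python
import math

# Assumed toy encoding: a leaf is a number, an internal node is a list.
def is_terminal(state):
    return not isinstance(state, list)

def max_value(state, alpha, beta):
    if is_terminal(state):
        return state
    v = -math.inf
    for s in state:
        v = max(v, min_value(s, alpha, beta))
        if v >= beta:          # MIN already has a better alternative: prune
            return v
        alpha = max(alpha, v)
    return v

def min_value(state, alpha, beta):
    # Same as max_value with the roles of alpha and beta reversed.
    if is_terminal(state):
        return state
    v = math.inf
    for s in state:
        v = min(v, max_value(s, alpha, beta))
        if v <= alpha:         # MAX already has a better alternative: prune
            return v
        beta = min(beta, v)
    return v

def alpha_beta_decision(state):
    """Index of MAX's best move; alpha is tightened across root moves."""
    best_action, alpha = None, -math.inf
    for a, s in enumerate(state):
        v = min_value(s, alpha, math.inf)
        if v > alpha:
            best_action, alpha = a, v
    return best_action

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # hypothetical example tree
print(alpha_beta_decision(tree))            # prints: 0
```

On this tree the second branch is abandoned after its first leaf (2 ≤ α = 3), yet the chosen move matches what plain minimax would select.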

  21. Properties of α–β
      ● Safe: Pruning does not affect final result
      ● Good move ordering improves effectiveness of pruning
      ● With “perfect ordering,” time complexity = $O(b^{m/2})$ ⇒ doubles solvable depth
      ● A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
      ● Unfortunately, $35^{50}$ is still impossible!
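The effect of move ordering can be seen on a tiny example by counting how many leaves α–β actually evaluates. The sketch below uses a single recursive function and two hand-made orderings of the same hypothetical tree (nested lists, numeric leaves); both the encoding and the trees are assumptions for illustration.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta value of node; every evaluated leaf is appended to visited."""
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:
            return v
        beta = min(beta, v)
    return v

def leaves_examined(tree):
    visited = []
    value = alphabeta(tree, -math.inf, math.inf, True, visited)
    return value, len(visited)

# Same leaves, different move order (hypothetical toy trees).
good = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]   # best replies tried first
bad = [[6, 4, 2], [14, 5, 2], [12, 8, 3]]    # best replies tried last

print(leaves_examined(good))  # prints: (3, 5)
print(leaves_examined(bad))   # prints: (3, 9)
```

Both orderings return the same value, consistent with pruning being safe, but the well-ordered tree needs 5 leaf evaluations instead of all 9.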

  22. Solved Games
      ● A game is solved if optimal strategy can be computed
      ● Tic Tac Toe can be trivially solved
      ● Biggest solved game: Checkers
        – proof by Schaeffer in 2007
        – both players can force at least a draw
      ● Most games (Chess, Go, etc.) too complex to be solved
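As an illustration of how small Tic Tac Toe is, the following sketch computes its game value by exhaustive minimax with memoization. The board encoding (a 9-character string of "X", "O" and spaces) and the function names are assumptions chosen for brevity.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value for X under perfect play: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == " "]
    return max(results) if player == "X" else min(results)

print(value(" " * 9, "X"))  # prints: 0
```

The value 0 from the empty board means that with perfect play on both sides the game ends in a draw.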

  23. resource limits

  24. Resource Limits
      ● Standard approach:
        – Use CUTOFF-TEST instead of TERMINAL-TEST, e.g., depth limit (perhaps add quiescence search)
        – Use EVAL instead of UTILITY, i.e., evaluation function that estimates desirability of position
      ● Suppose we have 100 seconds, explore $10^4$ nodes/second ⇒ $10^6$ nodes per move ≈ $35^{8/2}$ ⇒ α–β reaches depth 8 ⇒ pretty good Chess program
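The back-of-the-envelope estimate above can be checked with a few lines of Python. The sketch assumes the perfect-ordering bound b^(d/2) for α–β and the slide's figures (100 seconds, 10^4 nodes per second, b ≈ 35); the variable names are arbitrary.

```python
import math

seconds_per_move = 100
nodes_per_second = 10 ** 4
b = 35  # approximate branching factor for Chess

budget = seconds_per_move * nodes_per_second   # nodes we can afford per move
# With perfect ordering alpha-beta visits about b^(d/2) nodes, so solve for d.
depth = 2 * math.log(budget) / math.log(b)

print(budget)   # prints: 1000000
print(b ** 4)   # prints: 1500625  (35^(8/2), the same order as the budget)
print(depth)    # a little under 8
```

The budget of 10^6 nodes is slightly below 35^4, so "depth 8" is an approximation that holds only when move ordering is close to ideal.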

  25. Evaluation Functions
      ● For Chess, typically linear weighted sum of features
        $\mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$
        e.g., $f_1(s)$ = (number of white queens) – (number of black queens)

  26. Evaluation Function for Chess
      ● Long experience of playing Chess ⇒ Evaluation of positions included in Chess strategy books
        – bishop is worth 3 pawns
        – knight is worth 3 pawns
        – rook is worth 5 pawns
        – good pawn position is worth 0.5 pawns
        – king safety is worth 0.5 pawns
        – etc.
      ● Pawn count → weight for features
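A minimal sketch of such a linear evaluation, restricted to material features, is given below. Each feature is (number of white pieces of a type) minus (number of black pieces of that type), weighted in pawns. The bishop, knight and rook weights follow the list above; the queen weight of 9 is a conventional value assumed here, not taken from the slides. Positions are given as the piece-placement field of a FEN string (uppercase for White, lowercase for Black), and the function name is hypothetical.

```python
# Piece weights in pawns. The value for "q" is an assumption (the
# conventional 9); the others follow the list above.
WEIGHTS = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0}

def material_eval(placement):
    """Weighted material balance from White's point of view."""
    score = 0.0
    for ch in placement:
        w = WEIGHTS.get(ch.lower())   # kings, digits and "/" carry no weight
        if w is not None:
            score += w if ch.isupper() else -w
    return score

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
no_black_queen = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"

print(material_eval(start))           # prints: 0.0
print(material_eval(no_black_queen))  # prints: 9.0
```

Positional terms such as pawn structure or king safety would enter the same sum as further weighted features.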

  27. Learning Evaluation Functions
      ● Designing good evaluation functions requires a lot of expertise
      ● Machine learning approach
        – collect a large database of games played
        – note for each game who won
        – try to predict game outcome from features of position
        ⇒ learned weights
      ● May also learn evaluation functions from self-play
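One simple way to realize "predict game outcome from features" is logistic regression trained by gradient ascent, sketched below in plain Python. This is an assumed technique chosen for illustration; the slides do not say which learner is used. The six training examples are made up, and the two features (material difference, pawn-structure score) are hypothetical.

```python
import math

# Made-up toy data: (material difference, pawn-structure score) -> 1 if
# White won the game, 0 otherwise. Not taken from any real game database.
games = [((3.0, 0.5), 1), ((1.0, 0.0), 1), ((0.0, 0.5), 1),
         ((-1.0, 0.0), 0), ((-3.0, -0.5), 0), ((0.0, -0.5), 0)]

def predict(weights, features):
    """Predicted probability that White wins (logistic of the weighted sum)."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(data, steps=500, lr=0.1):
    """Stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(data[0][0])
    for _ in range(steps):
        for features, won in data:
            error = won - predict(weights, features)
            weights = [w + lr * error * f for w, f in zip(weights, features)]
    return weights

weights = train(games)
print(weights)                        # both learned weights come out positive
print(predict(weights, (2.0, 0.0)))   # above 0.5: White favoured when ahead
```

The learned weights play the role of the w_i in the linear evaluation function; a real system would use far more games and features.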

  28. Some Concerns
      ● Quiescence
        – position evaluation not reliable if board is unstable
        – e.g., Chess: queen will be lost in next move
        → deeper search of game-changing moves required
      ● Horizon Effect
        – adverse move can be delayed, but not avoided
        – search may prefer to delay, even if costly

  29. Forward Pruning
      ● Idea: avoid computation on clearly bad moves
      ● Cut off searches with bad positions before they reach max-depth
      ● Risky: initially inferior positions may lead to better positions
      ● Beam search: explore fixed number of promising moves deeper
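The beam-search idea, and its risk, can be sketched as a depth-limited minimax that expands only the `width` most promising children at each node. Everything below is an illustrative assumption: nodes are tuples `(static_eval, children)`, and `successors` and `evaluate` are hypothetical callbacks supplied by the caller.

```python
def beam_minimax(state, depth, width, maximizing, successors, evaluate):
    """Depth-limited minimax that keeps only `width` children per node."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    # Forward pruning: rank children by static evaluation from the mover's
    # point of view and drop everything beyond the beam width.
    ranked = sorted(children, key=evaluate, reverse=maximizing)[:width]
    values = [beam_minimax(c, depth - 1, width, not maximizing,
                           successors, evaluate) for c in ranked]
    return max(values) if maximizing else min(values)

# Hypothetical toy tree: each node is (static_eval, [children]).
tree = (0, [(5, [(4, []), (6, [])]),    # looks good, but is worth only 4
            (1, [(9, []), (8, [])])])   # looks poor, but is worth 8

def successors(node):
    return node[1]

def evaluate(node):
    return node[0]

print(beam_minimax(tree, 2, 2, True, successors, evaluate))  # prints: 8
print(beam_minimax(tree, 2, 1, True, successors, evaluate))  # prints: 4
```

With a beam of width 1 the branch that looks inferior at first is never explored, so the search settles for 4 although 8 was achievable.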

  30. Lookup instead of Search
      ● Library of opening moves
        – even expert Chess players use standard opening moves
        – these can be memorized and followed until divergence
      ● End game
        – if only few pieces left, optimal final moves may be computed
        – Chess end game with 6 pieces left solved in 2006
        – can be used instead of evaluation function

  31. Digression: Exact Values do not Matter
      ● Behaviour is preserved under any monotonic transformation of EVAL
      ● Only the order matters: payoff in deterministic games acts as an ordinal utility function
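This invariance can be checked on a small example: applying an increasing function to every leaf value leaves the minimax decision unchanged, while a transformation that reverses the order can change it. The sketch assumes a nested-list tree with numeric leaves; the tree and the two transformations are arbitrary illustrations.

```python
def minimax(node, maximizing=True):
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def best_move(tree):
    """Index of MAX's best move at the root."""
    return max(range(len(tree)), key=lambda i: minimax(tree[i], False))

def transform(node, f):
    """Apply f to every leaf value, keeping the tree shape."""
    if isinstance(node, list):
        return [transform(child, f) for child in node]
    return f(node)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # hypothetical example tree

increasing = transform(tree, lambda x: x ** 3 - 100)  # order-preserving
reversing = transform(tree, lambda x: -x)             # order-reversing

print(best_move(tree), best_move(increasing))  # prints: 0 0
print(best_move(reversing))                    # prints: 1
```

Only the ordering of the leaf values enters the max and min comparisons, which is why the cubic rescaling has no effect on the chosen move.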
