§4 Game Trees


  1. §4 Game Trees
     • perfect information games
       • no hidden information
     • two-player, perfect information games
       • Noughts and Crosses
       • Chess
       • Go
     • imperfect information games
       • Poker
       • Backgammon
       • Monopoly
     • zero-sum property
       • one player's gain equals another player's loss

     Game tree
     • all possible plays of two-player, perfect information games can be represented with a game tree
       • nodes: positions (or states)
       • edges: moves
     • players: MAX (has the first move) and MIN
     • ply = the length of the path between two nodes
       • MAX has even plies counting from the root node
       • MIN has odd plies counting from the root node

     Division Nim with seven matches
     [Figure: game tree for Division Nim with seven matches]

     Problem statement
     Given a node v in a game tree
     • find a winning strategy for MAX (or MIN) from v
     or (equivalently)
     • show that MAX (or MIN) can force a win from v

     Minimax
     • assumption: players are rational and try to win
     • given a game tree, we know the outcome in the leaves
       • assign the leaves to win, draw, or loss (or a numeric value like +1, 0, –1) according to MAX's point of view
     • at nodes one ply above the leaves, we choose the best outcome among the children (which are leaves)
       • MAX: win if possible; otherwise, draw if possible; else loss
       • MIN: loss if possible; otherwise, draw if possible; else win
     • recurse through the nodes until in the root
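The minimax recursion described above can be sketched on Division Nim with seven matches. The slides do not spell out the rules, so this sketch assumes the usual ones: a move splits one pile into two unequal, non-empty piles, and the player who cannot move loses. The names `moves` and `minimax` are illustrative, not taken from the slides.

```python
from functools import lru_cache


def moves(piles):
    """Return all positions reachable by splitting one pile into two unequal piles.

    Assumed rule (not stated in the slides): a pile of n matches may be split
    into a and n - a with 1 <= a < n - a.
    """
    result = set()
    for i, n in enumerate(piles):
        rest = piles[:i] + piles[i + 1:]
        for a in range(1, (n - 1) // 2 + 1):  # guarantees a < n - a
            result.add(tuple(sorted(rest + (a, n - a))))
    return sorted(result)


@lru_cache(maxsize=None)
def minimax(piles, max_to_move):
    """Value of the position from MAX's point of view: +1 win, -1 loss."""
    children = moves(piles)
    if not children:
        # Assumed rule: the player who cannot move loses.
        return -1 if max_to_move else +1
    values = [minimax(child, not max_to_move) for child in children]
    # MAX picks the maximum of the children, MIN picks the minimum.
    return max(values) if max_to_move else min(values)


print(minimax((7,), True))  # -1: under the assumed rules MIN can force a win
```

Positions are stored as sorted tuples so that equivalent pile arrangements share one cache entry; draws cannot occur in this game, so only +1 and –1 appear.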

  2. [Figure: game tree for Division Nim with seven matches; levels alternate between MAX and MIN, nodes are labelled with minimax values +1 or –1, and the root (MAX) has value –1]

     Minimax rules
     1. If the node is labelled MAX, assign it the maximum value of its children.
     2. If the node is labelled MIN, assign it the minimum value of its children.
     • MIN minimizes, MAX maximizes → minimax

     Analysis
     • simplifying assumptions
       • internal nodes have the same branching factor b
       • game tree is searched to a fixed depth d
     • time consumption is proportional to the number of expanded nodes
       • 1: root node (the initial ply)
       • b: nodes in the first ply
       • b^2: nodes in the second ply
       • b^d: nodes in the d-th ply
     • overall running time O(b^d)

     Rough estimates on running times when d = 5
     • suppose expanding a node takes 1 ms
     • branching factor b depends on the game
       • Draughts (b ≈ 3): t = 0.243 s
       • Chess (b ≈ 30): t = 6¾ h
       • Go (b ≈ 300): t = 77 a (about 77 years)
     • alpha-beta pruning reduces b

     Controlling the search depth
     • usually the whole game tree is too large
       → limit the search depth
       → a partial game tree
       → partial minimax
     • n-move look-ahead strategy
       • stop searching after n moves
       • turn the internal nodes at the search limit (i.e., the frontier nodes) into leaves
       • use an evaluation function to 'guess' the outcome

     Evaluation function
     • combination of numerical measurements m_i(s, p) of the game state
       • single measurement: m_i(s, p)
       • difference measurement: m_i(s, p) − m_j(s, q)
       • ratio of measurements: m_i(s, p) / m_j(s, q)
     • aggregate the measurements maintaining the zero-sum property
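The running-time estimates for d = 5 can be reproduced with a few lines of arithmetic. This is a sketch under the slides' own assumptions (uniform branching factor b, 1 ms per expanded node); the function names are illustrative. The quoted times match counting only the b^d nodes of the deepest ply, which dominates the total.

```python
def deepest_ply_nodes(b, d):
    """Number of nodes in the d-th ply of a uniform tree: b^d."""
    return b ** d


def total_nodes(b, d):
    """All nodes from the root down to ply d: 1 + b + b^2 + ... + b^d."""
    return sum(b ** k for k in range(d + 1))


def estimate_seconds(b, d, ms_per_node=1.0):
    """Time to expand the deepest ply, assuming a fixed cost per node."""
    return deepest_ply_nodes(b, d) * ms_per_node / 1000.0


SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * 3600

for game, b in [("Draughts", 3), ("Chess", 30), ("Go", 300)]:
    t = estimate_seconds(b, 5)
    print(f"{game}: {t:.3f} s = {t / SECONDS_PER_HOUR:.2f} h "
          f"= {t / SECONDS_PER_YEAR:.1f} years")
```

For draughts this gives 3^5 = 243 nodes, i.e. 0.243 s; for chess 30^5 ms is 6.75 h; for Go 300^5 ms is roughly 77 years.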

  3. Example: Noughts and Crosses
     • heuristic evaluation function e:
       • count the winning lines open to MAX
       • subtract the number of winning lines open to MIN
     • forced wins
       • state is evaluated +∞, if it is a forced win for MAX
       • state is evaluated –∞, if it is a forced win for MIN

     Examples of the evaluation
     [Figure: three board positions with their evaluations]
     • e(•) = 6 – 5 = 1
     • e(•) = 4 – 5 = –1
     • e(•) = +∞

     The deeper the better...?
     • assumptions:
       • n-move look-ahead
       • branching factor b, depth d,
       • leaves with uniform random distribution
     • minimax convergence theorem:
       • n increases → root value converges to f(b, d)
     • last player theorem:
       • root values from odd and even plies not comparable
     • minimax pathology theorem:
       • n increases → probability of selecting non-optimal move increases (← uniformity assumption!)

     Drawbacks of partial minimax
     • horizon effect
       • heuristically promising path can lead to an unfavourable situation
       • staged search: extend the search on promising nodes
       • iterative deepening: increase n until out of memory or time
       • phase-related search: opening, midgame, end game
       • however, horizon effect cannot be totally eliminated
     • bias
       • we want to have an estimate of minimax but get a minimax of estimates
       • distortion in the root: odd plies → win, even plies → loss
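The open-lines heuristic e for Noughts and Crosses, combined with an n-move look-ahead, can be sketched as follows. Several things here are assumptions for illustration: MAX plays "X", MIN plays "O", the board is a 9-character string with "." for empty cells, and only an already completed line is scored as ±∞ (the slides' "forced win" is broader). The two example boards are not the ones from the lost figures; they are simply positions that yield the same arithmetic, 6 – 5 and 4 – 5.

```python
import math

# The eight winning lines as index triples into a 9-character board string.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def open_lines(board, opponent):
    """Count winning lines that contain no mark of `opponent`."""
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def has_won(board, player):
    return any(all(board[i] == player for i in line) for line in LINES)


def evaluate(board):
    """e = (lines open to MAX) - (lines open to MIN), from MAX's point of view."""
    if has_won(board, "X"):
        return math.inf   # simplification: only completed lines count as wins
    if has_won(board, "O"):
        return -math.inf
    return open_lines(board, "O") - open_lines(board, "X")


def successors(board, player):
    return [board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == "."]


def partial_minimax(board, depth, max_to_move):
    """n-move look-ahead: nodes at the depth limit become leaves scored by e."""
    game_over = has_won(board, "X") or has_won(board, "O") or "." not in board
    if depth == 0 or game_over:
        return evaluate(board)
    player = "X" if max_to_move else "O"
    values = [partial_minimax(s, depth - 1, not max_to_move)
              for s in successors(board, player)]
    return max(values) if max_to_move else min(values)


print(evaluate("XO......."))   # X in a corner, O on the adjacent edge: 6 - 5 = 1
print(evaluate("X...O...."))   # X in a corner, O in the centre: 4 - 5 = -1
print(partial_minimax("." * 9, 2, True))  # two-ply look-ahead from the empty board
```

Because the root value is a minimax of estimates, changing `depth` from an even to an odd number can shift the result noticeably, which is the bias the slides warn about.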
