
Minimax strategies, alpha-beta pruning (Lirong Xia)



  1. Minimax strategies, alpha beta pruning Lirong Xia

  2. How to find good heuristics?
     - There is no really mechanical way
       - It is more art than science
     - General guideline: relax constraints
       - e.g. Pacman can pass through the walls
     - Mimic what you would do

  3. Arc Consistency of a CSP
     - A simple form of propagation makes sure all arcs are consistent: whenever a value at the tail of an arc has no compatible value at the head, delete it from the tail
     - If V loses a value, neighbors of V need to be rechecked!
     - Arc consistency detects failure earlier than forward checking
     - Can be run as a preprocessor or after each assignment
     - Might be time-consuming

  4. Limitations of Arc Consistency
     - After running arc consistency:
       - Can have one solution left
       - Can have multiple solutions left
       - Can have no solutions left (and not know it)

  5. “Sum to 2” game
     - Player 1 moves, then player 2, finally player 1 again
     - Move = 0 or 1
     - Player 1 wins if and only if all moves together sum to 2
     - Game tree (figure): player 1 chooses 0 or 1 at the root, player 2 chooses 0 or 1 at the second level, and player 1 chooses 0 or 1 at the third level. Of the eight leaves, the three whose moves sum to 2 (0,1,1; 1,0,1; 1,1,0) have utility 1 and the other five have utility -1
     - Player 1’s utility is in the leaves; player 2’s utility is the negative of this

  6. Today’s schedule
     - Adversarial games
     - Minimax search
     - Alpha-beta pruning algorithm

  7. Adversarial Games
     - Deterministic, zero-sum games:
       - Tic-tac-toe, chess, checkers
       - The MAX player maximizes the result
       - The MIN player minimizes the result
     - Minimax search:
       - A search tree
       - Players alternate turns
       - Each node has a minimax value: the best achievable utility against a rational adversary

  8. Computing Minimax Values
     - This is depth-first search (DFS)
     - Two mutually recursive functions:
       - max-value takes the maximum of the successors’ values
       - min-value takes the minimum of the successors’ values
     - Def value(state):
         If the state is a terminal state: return the state’s utility
         If the agent at the state is MAX: return max-value(state)
         If the agent at the state is MIN: return min-value(state)
     - Def max-value(state):
         Initialize max = -∞
         For each successor of state:
           Compute value(successor)
           Update max accordingly
         Return max
     - Def min-value(state): similar to max-value, starting from min = +∞ and keeping the smallest value
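
The recursion on this slide can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the `game` interface (`is_terminal`, `utility`, `to_move`, `successors`) and the `SumToTwo` class are assumptions made for this example, with `SumToTwo` encoding the “Sum to 2” game from slide 5 as a tuple of the moves made so far.

```python
import math


def value(state, game):
    """Minimax value of `state`: dispatch on terminal / MAX / MIN."""
    if game.is_terminal(state):
        return game.utility(state)
    if game.to_move(state) == "MAX":
        return max_value(state, game)
    return min_value(state, game)


def max_value(state, game):
    """Largest value among the successors of a MAX node."""
    best = -math.inf
    for successor in game.successors(state):
        best = max(best, value(successor, game))
    return best


def min_value(state, game):
    """Smallest value among the successors of a MIN node."""
    best = math.inf
    for successor in game.successors(state):
        best = min(best, value(successor, game))
    return best


class SumToTwo:
    """Hypothetical encoding of the "Sum to 2" game (state = moves so far)."""

    def is_terminal(self, state):
        return len(state) == 3  # player 1, player 2, player 1

    def utility(self, state):
        # Player 1 (MAX) wins iff all moves together sum to 2.
        return 1 if sum(state) == 2 else -1

    def to_move(self, state):
        # Only the second move belongs to player 2 (MIN).
        return "MIN" if len(state) == 1 else "MAX"

    def successors(self, state):
        return [state + (move,) for move in (0, 1)]


game = SumToTwo()
root_value = value((), game)
print(root_value)
```

Under these assumptions the root value is 1: player 1 opens with 1 and can then always complete the sum to 2, whereas opening with 0 lets player 2 answer 0 and force a loss.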

  9. Minimax Example (figure: game tree with node values 3, 3, 2, 2)

  10. Tic-tac-toe Game Tree

  11. Renju
      - 15×15 board
      - 5 in a row (horizontal, vertical, or diagonal) wins
      - No double-3 or double-4 moves for black
      - Otherwise (without these restrictions) black’s winning strategy was computed: L. Victor Allis, 1994 (PhD thesis)

  12. Minimax Properties
      - Time complexity? $O(b^m)$
      - Space complexity? $O(bm)$
      - For chess, $b \approx 35$, $m \approx 100$
        - Exact solution is completely infeasible
        - But, do we need to explore the whole tree?
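
To make "completely infeasible" concrete, here is a rough back-of-the-envelope estimate using the slide's figures for chess ($b \approx 35$, $m \approx 100$). It only counts the order of magnitude of nodes in a full-depth tree:

$$
b^m \approx 35^{100} = 10^{100 \log_{10} 35} \approx 10^{100 \times 1.544} \approx 10^{154}
$$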

  13. Resource Limits
      - Cannot search to the leaves
      - Depth-limited search
        - Instead, search a limited depth of the tree
        - Replace terminal utilities with an evaluation function for non-terminal positions
      - Guarantee of optimal play is gone

  14. Evaluation Functions
      - Functions that score non-terminal states
      - Ideal function: returns the minimax utility of the position
      - In practice: typically a weighted linear sum of features: $\mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \cdots + w_n f_n(s)$
      - e.g. $f_1(s) = \#\text{white queens} - \#\text{black queens}$, etc.
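
A weighted linear sum of features might look like the sketch below. The dictionary-based state, the second feature `pawn_difference`, and the weights 9.0 and 1.0 are all assumptions chosen for illustration; only the queen-difference feature comes from the slide.

```python
def evaluate(state, features, weights):
    """Eval(s) = w_1 * f_1(s) + ... + w_n * f_n(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))


def queen_difference(state):
    """f_1(s) = # white queens - # black queens (the slide's example)."""
    return state["white_queens"] - state["black_queens"]


def pawn_difference(state):
    """Hypothetical second feature, not from the slides."""
    return state["white_pawns"] - state["black_pawns"]


features = [queen_difference, pawn_difference]
weights = [9.0, 1.0]  # assumed weights, for illustration only

# Hypothetical position summarized by piece counts.
state = {"white_queens": 1, "black_queens": 0,
         "white_pawns": 6, "black_pawns": 8}

score = evaluate(state, features, weights)  # 9.0 * 1 + 1.0 * (-2)
print(score)
```

Keeping features and weights in separate lists makes it easy to tune the weights without touching the feature code.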

  15. Minimax with limited depth
      - Suppose you are the MAX player
      - Given a depth d and the current state
      - Compute value(state, d), which searches down to depth d
        - at depth d, use an evaluation function to estimate the value if the state is non-terminal
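
A depth-limited variant can be sketched as below, assuming a toy representation in which a node is either a number (a terminal utility) or a list of child nodes. The tree and the `average_leaf` heuristic are hypothetical and deliberately crude; they are chosen so that the shallow search prefers a different move than the full search.

```python
def value(node, depth, is_max, evaluate):
    """Depth-limited minimax on a nested-list tree.

    node: a number (terminal utility) or a list of child nodes.
    depth: remaining plies to search before falling back to `evaluate`.
    """
    if isinstance(node, (int, float)):
        return node                    # terminal: exact utility
    if depth == 0:
        return evaluate(node)          # cut off: heuristic estimate
    child_values = [value(child, depth - 1, not is_max, evaluate)
                    for child in node]
    return max(child_values) if is_max else min(child_values)


def leaves(node):
    """All terminal utilities below `node`."""
    if isinstance(node, (int, float)):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def average_leaf(node):
    """Hypothetical heuristic: mean of the leaves below the node."""
    vals = leaves(node)
    return sum(vals) / len(vals)


tree = [[3, 12, 8], [2, 4, 6], [14, 15, 2]]  # MAX at the root, MIN below

full = value(tree, 2, True, average_leaf)     # reaches every leaf
shallow = value(tree, 1, True, average_leaf)  # estimates the MIN nodes
print(full, shallow)
```

With the full search the root is worth 3 (via the first child), while the depth-1 search rates the third child highest because its average is inflated by 14 and 15. This is one way the guarantee of optimal play can be lost.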

  16. Improving minimax: pruning

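  17. Pruning in Minimax Search
      - Suppose I am a MIN node and an ancestor is a MAX node
        - the ancestor already has an option better than my current solution
        - my future solution can only be smaller
        - so the ancestor will never choose me, and my remaining children need not be searched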

  18. Alpha-beta pruning
      - Pruning = cutting off parts of the search tree (because you realize you don’t need to look at them)
        - When we considered A* we also pruned large parts of the search tree
      - Maintain
        - α = value of the best option for the MAX player along the path so far
        - β = value of the best option for the MIN player along the path so far
        - Initialized to be α = -∞ and β = +∞
      - Maintain and update α and β for each node
        - α is updated at MAX player’s nodes
        - β is updated at MIN player’s nodes

  19. Alpha-Beta Pruning
      - General configuration
        - We’re computing the MIN-VALUE at n
        - We’re looping over n’s children
        - n’s value estimate is dropping
        - α is the best value that MAX can get at any choice point along the current path
        - If n becomes worse than α, MAX will avoid it, so we can stop considering n’s other children
        - Define β similarly for MIN
        - α is usually smaller than β
          - Once α >= β, return to the upper layer

  20. Alpha-Beta Pruning Example
      - α is MAX’s best alternative here or above
      - β is MIN’s best alternative here or above

  21. Alpha-Beta Pruning Example (figure: search tree annotated with α and β at each node, starting from α = -∞ and β = +∞ at the root; annotations mark where α is raised and where β is lowered)
      - α is MAX’s best alternative here or above
      - β is MIN’s best alternative here or above

  22. Alpha-Beta Pseudocode
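
The slide's pseudocode is not reproduced in this transcript. As a stand-in, here is a minimal Python sketch of alpha-beta pruning that follows the rules stated on the earlier slides (α updated at MAX nodes, β at MIN nodes, return once α >= β). The nested-list tree representation and the example tree are assumptions made for this illustration and are not the tree from the slides' figures.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node` with alpha-beta pruning.

    node: a number (terminal utility) or a list of child nodes.
    alpha: best value MAX can already guarantee on the path to this node.
    beta: best value MIN can already guarantee on the path to this node.
    """
    if isinstance(node, (int, float)):
        return node
    if is_max:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)   # alpha is raised at MAX nodes
            if alpha >= beta:          # MIN above will never allow this
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)         # beta is lowered at MIN nodes
        if alpha >= beta:              # MAX above already has better
            break
    return best


# Hypothetical example tree: MAX at the root, three MIN children.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = alphabeta(tree, True)
print(root_value)
```

On this tree the second MIN child is abandoned after its first leaf (2), because the root already has an option worth 3.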

  23. Alpha-Beta Pruning Properties
      - This pruning has no effect on the final result at the root
      - Values of intermediate nodes might be wrong!
        - Important: children of the root may have the wrong value
      - Good child ordering improves the effectiveness of pruning
      - With “perfect ordering”:
        - Time complexity drops to $O(b^{m/2})$
        - Doubles solvable depth!
        - Your action looks smarter: more forward-looking with a good evaluation function
        - Full search of, e.g. chess, is still hopeless…
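
The effect of child ordering can be illustrated by counting how many leaves alpha-beta visits on the same tree under two orderings. This is a sketch under assumptions: the nested-list trees and the helper names `alphabeta_count` and `leaves_visited` are hypothetical, and the two trees contain the same subtrees with the children arranged differently.

```python
import math


def alphabeta_count(node, is_max, alpha, beta, counter):
    """Alpha-beta search that also counts visited leaves in counter[0]."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    best = -math.inf if is_max else math.inf
    for child in node:
        v = alphabeta_count(child, not is_max, alpha, beta, counter)
        if is_max:
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if alpha >= beta:  # remaining children cannot change the result
            break
    return best


def leaves_visited(tree):
    """Return (root value, number of leaves visited) for a MAX root."""
    counter = [0]
    root = alphabeta_count(tree, True, -math.inf, math.inf, counter)
    return root, counter[0]


# Best MAX move first, and each MIN node's smallest leaf first.
good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
# Same subtrees, but best MAX move last and smallest leaves last.
bad_order = [[14, 5, 2], [6, 4, 2], [12, 8, 3]]

print(leaves_visited(good_order))
print(leaves_visited(bad_order))
```

Both orderings give the same root value of 3, but the good ordering visits 5 of the 9 leaves while the bad ordering visits all 9.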

  24. Project 2
      - Q1: write an evaluation function for (state, action) pairs
        - the evaluation function is for this question only
      - Q2: minimax search with arbitrary depth and multiple MIN players (ghosts)
        - an evaluation function on states has been implemented for you
      - Q3: alpha-beta pruning with arbitrary depth and multiple MIN players (ghosts)
