
Adversarial Search, Lecture 6: How can we use search to plan ahead?

Wentworth Institute of Technology, COMP3770 Artificial Intelligence, Spring 2017, Derbinsky. Lecture 6: Adversarial Search. How can we use search to plan ahead when other agents are planning against us? March 12, 2017.


  1. Adversarial Search (Lecture 6): How can we use search to plan ahead when other agents are planning against us?

  2. Agenda
      • Games: context, history
      • Searching via Minimax
      • Scaling
          – α-β pruning
          – Depth-limiting
          – Evaluation functions
      • Handling uncertainty with Expectiminimax

  3. Characterizing Games
      • There are many kinds of games, and several ways to classify them
          – Deterministic vs. stochastic
          – Perfect vs. imperfect information
          – One, two, or multiple players
          – Utility (how agents value outcomes), e.g., zero-sum
      • Algorithmic goal: calculate a strategy (or policy) that decides a move in each state

  4. Utility
      • Zero/Constant-Sum
          – Opposite utilities
          – Adversarial, pure competition
      • General Games
          – Independent utilities
          – Cooperation, indifference, competition, and more are all possible

  5. Examples: Perception vs. Chance

                              Deterministic                   Stochastic
      Perfect information     Chess, Checkers, Go, Othello    Backgammon, Monopoly
      Imperfect information   Battleship                      Bridge, Poker, Scrabble

  6. Checkers
      • 1950: First computer player
      • 1994: Chinook, the first computer champion, ended the 40-year reign of human champion Marion Tinsley, using a complete 8-piece endgame database
      • 1995: Chinook defended its title against Don Lafferty
      • 2007: Solved!

  7. Chess
      • 1997: Deep Blue defeats human champion Garry Kasparov in a six-game match
      • Deep Blue examined 200M positions per second, used a very sophisticated evaluation function, and used undisclosed methods for extending some lines of search up to 40 ply
      • Current programs are even better, if less historic
      [figure: Deep Blue]

  8. Go
      • Until recently, programs were not competitive at champion level
      • 2016: AlphaGo beat the European champion
          – World champion game pending…
      • ANNs for policy (what to do) and evaluation (how good is a board state)
      [figure: AlphaGo]

  9. More Progress
      • Othello: 1997, defeated world champion
      • Bridge: 1998, competitive with human champions
      • Scrabble: 2006, defeated world champion

  10. Game Formalism
      • States: $S$ (start at $s_0$)
      • Players: $P = \{1, \dots, N\}$ (typically take turns)
      • Actions: $\text{Action}(s)$, returns legal options
      • Transition function: $S \times A \to S$
      • Terminal test: $\text{Terminal}(s)$, returns T/F
      • Utility: $S \times P \to \mathbb{R}$
      • Solution for a player is a policy: $S \to A$
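The formalism above maps naturally onto a small interface. The sketch below is one hypothetical way to write it in Python. The toy subtraction game (players alternately take 1 or 2 tokens, and whoever takes the last token wins) is an assumption made for illustration and is not part of the lecture, as are all class and method names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    tokens: int      # tokens left on the table
    to_move: int     # player whose turn it is (1 or 2)

class SubtractionGame:
    """Toy game (illustrative assumption): take 1 or 2 tokens; taking the last token wins."""
    players = (1, 2)                           # P = {1, ..., N} with N = 2

    def __init__(self, tokens=5):
        self.start = State(tokens, 1)          # s_0

    def actions(self, s):                      # Action(s): legal options
        return [k for k in (1, 2) if k <= s.tokens]

    def result(self, s, a):                    # transition function S x A -> S
        return State(s.tokens - a, 3 - s.to_move)

    def terminal(self, s):                     # Terminal(s): True/False
        return s.tokens == 0

    def utility(self, s, p):                   # S x P -> R (zero-sum: +1 / -1)
        winner = 3 - s.to_move                 # the player who just moved took the last token
        return 1.0 if p == winner else -1.0

game = SubtractionGame(tokens=3)
s1 = game.result(game.start, 2)                # player 1 takes two tokens
s2 = game.result(s1, 1)                        # player 2 takes the last token
```

A policy in this sketch would simply be a function from `State` to one of the values returned by `actions`.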

  11. Game Plan :)
      • Start with deterministic, two-player adversarial games
      • Issues to come
          – Multiple players
          – Resource limits
          – Stochasticity

  12. Single-Agent Game Tree
      [figure: single-agent game tree; values shown: 8, 2, 0, …, 2, 6, …, 4, 6]

  13. Value of a State
      • Value of a state: the best achievable outcome (utility) from that state
      • Non-terminal states: [formula not recovered]
      • Terminal states: [formula not recovered]
      [figure: single-agent game tree; values shown: 8, 2, 0, …, 2, 6, …, 4, 6]
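Written out, the definition above is a simple recursion. One standard way to state it for a single-agent tree, assuming $V(s)$ denotes the value of state $s$ and $s'$ ranges over its successors (this notation is an assumption, not recovered from the slide):

```latex
V(s) =
\begin{cases}
\text{Utility}(s) & \text{if } s \text{ is terminal} \\[4pt]
\max\limits_{s' \in \text{successors}(s)} V(s') & \text{otherwise}
\end{cases}
```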

  14. Adversarial Game Trees
      [figure: adversarial game tree; values shown: -20, -8, …, -18, -5, …, -10, +4, -20, +8]

  15. Minimax Values
      • States under agent's control: [formula not recovered]
      • States under opponent's control: [formula not recovered]
      • Terminal states: [formula not recovered]
      [figure: game tree; values shown: -8, -5, -10, +8]
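The three cases named on this slide are commonly combined into one recursive definition. The version below is a standard textbook form, using assumed notation ($V(s)$ for the minimax value, $s'$ for a successor of $s$) rather than the slide's own formulas:

```latex
V(s) =
\begin{cases}
\text{Utility}(s) & \text{if } s \text{ is terminal} \\[4pt]
\max\limits_{s' \in \text{successors}(s)} V(s') & \text{if the agent (MAX) controls } s \\[4pt]
\min\limits_{s' \in \text{successors}(s)} V(s') & \text{if the opponent (MIN) controls } s
\end{cases}
```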

  16. Tic-Tac-Toe Game Tree
      [figure: tic-tac-toe game tree]

  17. Adversarial Search via Minimax
      • Deterministic, zero-sum games
          – Tic-tac-toe, chess
          – One player maximizes
          – The other minimizes
      • Minimax search
          – A search tree
          – Players alternate turns
          – Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
      [figure: two-ply tree. Terminal values (part of the game): 8, 2, 5, 6. Minimax values (computed recursively): MIN nodes 2 and 5, MAX root 5]

  18. Minimax Implementation

      def value(state):
          if the state is a terminal state: return the state's utility
          if the next agent is MAX: return max-value(state)
          if the next agent is MIN: return min-value(state)

      def max-value(state):
          initialize v = -∞
          for each successor of state:
              v = max(v, value(successor))
          return v

      def min-value(state):
          initialize v = +∞
          for each successor of state:
              v = min(v, value(successor))
          return v
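The pseudocode above translates almost line for line into runnable Python. The following is a minimal sketch, assuming the game tree is given as nested lists whose leaves are numeric utilities and that MAX and MIN strictly alternate. That representation and the function name `minimax` are illustrative choices, not the course's implementation.

```python
import math

def minimax(node, maximizing):
    """Return the minimax value of a tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):      # terminal state: the leaf is its own utility
        return node
    if maximizing:                      # MAX's turn: take the largest child value
        v = -math.inf
        for child in node:
            v = max(v, minimax(child, False))
        return v
    v = math.inf                        # MIN's turn: take the smallest child value
    for child in node:
        v = min(v, minimax(child, True))
    return v

# MAX root with three MIN children; each inner list holds that MIN node's leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = minimax(tree, True)        # MIN values are 3, 2, 2, so the root is 3
```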

  19. Minimax Evaluation
      • Time: $O(b^m)$, where $b$ is the branching factor and $m$ is the maximum depth
          – For chess: $b \approx 35$, $m \approx 100$
      • Space: $O(bm)$
      • Complete: only if the game tree is finite
      • Optimal: yes, against an optimal opponent
      [figure labels: Minimax-Min, Minimax-Avg]
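To get a feel for these bounds, the chess estimates can be plugged in directly. This is only back-of-the-envelope arithmetic on the figures quoted above (b ≈ 35, m ≈ 100), ignoring constant factors. The variable names are illustrative.

```python
import math

b, m = 35, 100                       # chess estimates quoted on the slide

exponent = m * math.log10(b)         # log10(b**m), about 154.4
nodes = b ** m                       # exact value; Python integers have arbitrary precision
digits = len(str(nodes))             # number of decimal digits in b**m
space = b * m                        # the O(bm) space term, without constants

print(f"b^m is roughly 10^{exponent:.1f} ({digits} digits); b*m = {space}")
```

The time bound is about 10^154 nodes while the space term is only 3500, which is why time rather than memory is the obstacle for depth-first minimax.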

  20. Multiple Players
      • Add a ply per player
      • Independent utility: use a vector of values; each player maximizes its own utility
      • Zero-sum: each team takes its MIN or MAX layer in turn
      • In Pacman, each Pacman move is followed by one MIN layer per ghost
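The independent-utility case can be sketched as a small variant of minimax in which every leaf carries one utility per player and the player to move picks the child whose vector is best in its own component. The code below is a minimal illustration under assumed conventions (players numbered from 0, strict turn rotation, internal nodes as lists, leaves as tuples). The function name `max_n` and the tree values are made up for the example.

```python
def max_n(node, player, num_players):
    """Return the utility vector reached when each player maximizes its own component."""
    if isinstance(node, tuple):                 # terminal: one utility per player
        return node
    next_player = (player + 1) % num_players    # players take turns in a fixed order
    best = None
    for child in node:
        vec = max_n(child, next_player, num_players)
        if best is None or vec[player] > best[player]:
            best = vec                          # keep the vector best for the mover
    return best

# Three players, one ply each; leaves are (u0, u1, u2) utility vectors.
tree = [
    [[(1, 2, 6), (4, 3, 2)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (2, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
result = max_n(tree, 0, 3)
```

Ties are broken in favor of the first child examined, which is a design choice of this sketch rather than part of the technique.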

  21. Scaling to Larger Games
      • Tree Pruning
      • Depth-Limiting + Evaluation

  22. Minimax Example
      [figure: MAX root with value 3; three MIN nodes with values 3, 2, 2; leaves 3, 12, 8 | 2, 4, 6 | 14, 5, 2]

  23. Minimax Pruning
      [figure: the example tree annotated with value intervals as the search proceeds. Interval labels shown: [−∞, ∞], [3, ∞], [3, 3], [2, 2], [−∞, 3], [−∞, 2], [−∞, 14], [−∞, 5]. Leaves shown: 3, 12, 8, 2, 14, 5, 2]

  24. General Case
      • α is the best value (to MAX) found so far among the alternatives branching off the current path
      • If V is worse than α, MAX will avoid it, so prune that branch
      • Define β similarly for MIN

  25. Alpha-Beta Pruning
      α: MAX's best option on path
      β: MIN's best option on path

      def max-value(state, α, β):
          initialize v = -∞
          for each successor of state:
              v = max(v, value(successor, α, β))
              if v ≥ β return v
              α = max(α, v)
          return v

      def min-value(state, α, β):
          initialize v = +∞
          for each successor of state:
              v = min(v, value(successor, α, β))
              if v ≤ α return v
              β = min(β, v)
          return v
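A runnable version of the two functions above can be written as a single recursive function. The sketch below again assumes a tree of nested lists with numeric leaves and alternating MAX/MIN levels. The optional `visited` list is an addition for illustration only, so that the leaves actually evaluated can be inspected.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if not isinstance(node, list):              # terminal state
        if visited is not None:
            visited.append(node)                # record the leaves actually evaluated
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, visited))
            if v >= beta:                       # MIN above would never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta, visited))
        if v <= alpha:                          # MAX above already has something better
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
root_value = alphabeta(tree, True, visited=seen)
```

On this tree the second MIN node is abandoned after its first leaf (2), so the leaves 4 and 6 are never evaluated, while the root value is still 3.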

  26. Alpha-Beta Properties
      • Has no effect on the minimax value computed for the root!
      • Good child ordering improves effectiveness of pruning
      • With "perfect ordering":
          – Time complexity drops to $O(b^{m/2})$
          – Doubles solvable depth!
          – Full search of, e.g., chess is still hopeless…
      • This is a simple example of metareasoning (computing about what to compute)
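The effect of child ordering can be seen by counting evaluated leaves on two arrangements of the same nine leaf values. This is a small illustrative experiment, not a measurement from the lecture. The function `count_leaves` and both orderings are assumptions made for the example.

```python
import math

def count_leaves(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search returning (value, number of leaves evaluated)."""
    if not isinstance(node, list):
        return node, 1
    v = -math.inf if maximizing else math.inf
    total = 0
    for child in node:
        child_v, n = count_leaves(child, not maximizing, alpha, beta)
        total += n
        if maximizing:
            v = max(v, child_v)
            if v >= beta:
                break                            # prune remaining children
            alpha = max(alpha, v)
        else:
            v = min(v, child_v)
            if v <= alpha:
                break                            # prune remaining children
            beta = min(beta, v)
    return v, total

# Same subtrees and leaf values, ordered favorably and unfavorably for pruning.
good = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # best MAX move first; refuting leaf first
bad = [[14, 5, 2], [6, 4, 2], [12, 8, 3]]    # best MAX move last; refuting leaf last

good_result = count_leaves(good, True)
bad_result = count_leaves(bad, True)
```

Both orderings give the same root value of 3, but the favorable ordering evaluates 5 leaves while the unfavorable one evaluates all 9.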

  27. Checkup #1
      [figure: game tree; leaf values shown: 10, 8, 4, 50]

  28. Checkup #2
      [figure: game tree; leaf values shown: 10, 6, 100, 8, 1, 2, 20, 4]
