Adversarial Search: Lecture 6

Wentworth Institute of Technology COMP3770 – Artificial Intelligence | Spring 2017 | Derbinsky Adversarial Search Lecture 6 How can we use search to plan ahead when other agents are planning against us? Adversarial Search March 12, 2017 1

Agenda
• Games: context, history
• Searching via Minimax
• Scaling
  – α-β pruning
  – Depth-limiting
  – Evaluation functions
• Handling uncertainty with Expectiminimax

Characterizing Games
• There are many kinds of games, and several ways to classify them
  – Deterministic vs. stochastic
  – [Im]perfect information
  – One, two, multi-player
  – Utility (how agents value outcomes)
    • Zero-sum
• Algorithmic goal: calculate a strategy (or policy) that decides a move in each state

Utility
Zero/Constant-Sum
• Opposite utilities
• Adversarial, pure competition
General Games
• Independent utilities
• Cooperation, indifference, competition, and more are all possible

Examples: Perception vs. Chance

            Deterministic                  Stochastic
Perfect     Chess, Checkers, Go, Othello   Backgammon, Monopoly
Imperfect   Battleship                     Bridge, Poker, Scrabble

Checkers
• 1950: First computer player
• 1994: First computer champion (Chinook) ended the 40-year reign of human champion Marion Tinsley, using a complete 8-piece endgame database
• 1995: Defended against Don Lafferty
• 2007: Solved!

Chess
• 1997: Deep Blue defeats human champion Garry Kasparov in a six-game match
• Deep Blue examined 200M positions per second and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply
• Current programs are even better, if less historic
(Image: Deep Blue)

Go
• Until recently, programs were not competitive at champion level
• 2016: AlphaGo beat the European champion
  – World champion game pending…
• ANNs for policy (what to do) and evaluation (how good is a board state)
(Image: AlphaGo)

More Progress
• Othello: 1997, defeated world champion
• Bridge: 1998, competitive with human champions
• Scrabble: 2006, defeated world champion

Game Formalism
• States: S (start at S₀)
• Players: P = {1, …, N} (typically take turns)
• Actions: Action(s), returns legal options
• Transition function: S × A → S
• Terminal test: Terminal(s), returns T/F
• Utility: S × P → ℝ
• Solution for a player is a policy: S → A
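As a minimal sketch of how the components of the formalism fit together, consider a hypothetical toy game that is not from the lecture: a single pile of stones from which two players alternately take 1 or 2, and whoever takes the last stone wins. The function names `actions`, `result`, `terminal`, and `utility` are illustrative choices, and players are numbered 0 and 1 here for convenience.

```python
from typing import List, Tuple

# A state is (stones_left, player_to_move); players are numbered 0 and 1.
State = Tuple[int, int]

START: State = (5, 0)  # start state: five stones, player 0 moves first

def actions(s: State) -> List[int]:
    """Action(s): the legal numbers of stones to take in state s."""
    stones, _ = s
    return [n for n in (1, 2) if n <= stones]

def result(s: State, a: int) -> State:
    """Transition function S x A -> S: remove a stones, pass the turn."""
    stones, player = s
    return (stones - a, 1 - player)

def terminal(s: State) -> bool:
    """Terminal(s): the game ends when no stones remain."""
    return s[0] == 0

def utility(s: State, p: int) -> float:
    """Utility S x P -> R: +1 for the player who took the last stone, else -1."""
    stones, to_move = s
    assert stones == 0, "utility is only defined on terminal states"
    winner = 1 - to_move  # the player who just moved took the last stone
    return 1.0 if p == winner else -1.0
```

Because the two utilities always sum to zero, this toy game is zero-sum; a policy would be any function from `State` to one of the values returned by `actions`.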

Game Plan :)
• Start with deterministic, two-player adversarial games
• Issues to come
  – Multiple players
  – Resource limits
  – Stochasticity

Single-Agent Game Tree
(Figure: single-agent game tree; node values shown: 8, 2, 0, …, 2, 6, …, 4, 6)

Value of a State
• Value of a state: the best achievable outcome (utility) from that state
(Figure: single-agent game tree with separate value definitions for non-terminal states and terminal states; node values shown: 8, 2, 0, …, 2, 6, …, 4, 6)
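The definition can be sketched as a short recursion. This assumes, as the phrase "best achievable outcome" suggests, that a terminal state's value is its given utility and a non-terminal state's value is the maximum over its successors. The nested-list tree encoding and the example numbers are illustrative, not the tree from the figure.

```python
from typing import List, Union

# A tree is either a terminal utility (int) or a list of successor trees.
Tree = Union[int, List["Tree"]]

def value(node: Tree) -> int:
    """Best achievable utility for a single agent choosing every move."""
    if isinstance(node, int):          # terminal state: utility is given
        return node
    return max(value(child) for child in node)  # non-terminal: best successor

# Hypothetical example tree: the agent can reach a leaf worth 8.
example = [[2, 0], [8, [4, 6]]]
best = value(example)
```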

Adversarial Game Trees
(Figure: adversarial game tree; node values shown: -20, -8, …, -18, -5, …, -10, +4, -20, +8)

Minimax Values
(Figure: game tree with value definitions for states under the agent's control, states under the opponent's control, and terminal states; node values shown: -8, -5, -10, +8)

Tic-Tac-Toe Game Tree
(Figure: tic-tac-toe game tree)

Adversarial Search via Minimax
• Deterministic, zero-sum games
  – Tic-tac-toe, chess
  – One player maximizes
  – The other minimizes
• Minimax search
  – A search tree
  – Players alternate turns
  – Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
(Figure: minimax values computed recursively. The terminal values 8, 2, 5, 6 are part of the game; the MIN nodes have values 2 and 5; the MAX root has value 5.)

Minimax Implementation

def value(state):
    if the state is a terminal state: return the state's utility
    if the next agent is MAX: return max-value(state)
    if the next agent is MIN: return min-value(state)

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
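A runnable rendering of this pseudocode might look like the sketch below. It assumes a game tree encoded as nested Python lists with numeric leaves as terminal utilities, and it assumes MAX and MIN alternate strictly by depth, which is why an explicit `is_max` flag replaces the "next agent" test. Both choices are illustrative simplifications.

```python
import math
from typing import List, Union

# A tree is either a terminal utility (number) or a list of successor trees.
Tree = Union[float, List["Tree"]]

def value(node: Tree, is_max: bool) -> float:
    """Minimax value of node; is_max says whose turn it is at this node."""
    if not isinstance(node, list):     # terminal state: return its utility
        return node
    return max_value(node) if is_max else min_value(node)

def max_value(node: List[Tree]) -> float:
    v = -math.inf
    for successor in node:
        v = max(v, value(successor, False))  # MIN moves next
    return v

def min_value(node: List[Tree]) -> float:
    v = math.inf
    for successor in node:
        v = min(v, value(successor, True))   # MAX moves next
    return v

# Terminal values 8, 2, 5, 6 under two MIN nodes and a MAX root.
root_value = value([[8, 2], [5, 6]], True)
```

Here the MIN nodes evaluate to 2 and 5, so the MAX root evaluates to 5.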

Minimax Evaluation
• Time: O(b^m)
  – For chess: b ≈ 35, m ≈ 100
• Space: O(bm)
• Complete: only if finite
• Optimal: yes, against an optimal opponent
(Figure labels: Minimax-Min, Minimax-Avg)
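To get a feel for these bounds, the chess figures quoted on the slide can be plugged in directly. This is only back-of-the-envelope arithmetic with the constant factors hidden by the O-notation ignored; the variable names are illustrative.

```python
b, m = 35, 100  # approximate chess branching factor and game length (in plies)

nodes = b ** m            # order of the full game-tree size, O(b^m)
digits = len(str(nodes))  # how many decimal digits that count has

frontier = b * m          # order of the depth-first memory footprint, O(bm)

print(f"b^m has {digits} decimal digits; b*m = {frontier}")
```

The time bound is a number with more than 150 digits while the space bound is a few thousand, which is why time, not memory, is the limiting resource for exhaustive minimax.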

Multiple Players
• Add a ply per player
• Independent utility: use a vector of values; each player maximizes its own utility
• Zero-sum: each team sequentially MIN/MAX
• In Pacman, there are multiple MIN layers (one per ghost) for each Pacman move
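The independent-utility case can be sketched as follows. The assumptions here are illustrative: the tree is a nested list whose leaves are tuples holding one utility per player, and players move in fixed rotation, one ply each. The function name `max_n` and the example tree are hypothetical.

```python
from typing import List, Tuple, Union

Utilities = Tuple[float, ...]            # one utility per player at a leaf
Tree = Union[Utilities, List["Tree"]]    # leaf tuple or list of successors

def max_n(node: Tree, player: int, num_players: int) -> Utilities:
    """Return the utility vector reached when each player maximizes its own entry."""
    if isinstance(node, tuple):          # terminal state: utilities are given
        return node
    next_player = (player + 1) % num_players
    outcomes = [max_n(child, next_player, num_players) for child in node]
    # The player to move picks the outcome that is best for itself.
    return max(outcomes, key=lambda u: u[player])

# Three players: player 0 moves at the root, then player 1, then player 2.
tree = [
    [[(1, 2, 6), (4, 3, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (2, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
outcome = max_n(tree, 0, 3)
```

Note that player 0 ends up with utility 2 even though leaves worth 7 to it exist, because the other players steer toward their own best outcomes.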

Scaling to Larger Games
• Tree pruning
• Depth-limiting + evaluation

Minimax Example
(Figure: a MAX root with minimax value 3 over three MIN nodes, whose leaves are 3, 12, 8; 2, 4, 6; and 14, 5, 2)
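Reading the nine leaf values as three groups of three, one group under each MIN node (an assumption about the figure's layout that is consistent with the root value of 3), the minimax value works out as:

```latex
\begin{aligned}
\min(3, 12, 8) &= 3, \qquad \min(2, 4, 6) = 2, \qquad \min(14, 5, 2) = 2 \\
\max(3, 2, 2) &= 3
\end{aligned}
```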

Minimax Pruning
(Figure: the same tree annotated with value intervals that narrow as leaves are examined, from [−∞, ∞] to [3, 3] at the root; leaves examined: 3, 12, 8, 2, 14, 5, 2, so the leaves 4 and 6 are never visited)

General Case
• α is the best value (to MAX) found so far off the current path
• If V is worse than α, MAX will avoid it, so prune that branch
• Define β similarly for MIN

Alpha-Beta Pruning
α: MAX's best option on path
β: MIN's best option on path

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α return v
        β = min(β, v)
    return v
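The pseudocode can be made runnable along the following lines. As a sketch, it assumes a nested-list tree with numeric leaves, strict alternation between MAX and MIN (hence the `is_max` flag), and a module-level `visited` list that records which leaves are actually evaluated; these are illustrative additions, not part of the slide.

```python
import math
from typing import List, Union

Tree = Union[float, List["Tree"]]
visited: List[float] = []  # leaves actually evaluated, in order

def value(node: Tree, alpha: float, beta: float, is_max: bool) -> float:
    if not isinstance(node, list):   # terminal state: record and return utility
        visited.append(node)
        return node
    if is_max:
        return max_value(node, alpha, beta)
    return min_value(node, alpha, beta)

def max_value(node: List[Tree], alpha: float, beta: float) -> float:
    v = -math.inf
    for successor in node:
        v = max(v, value(successor, alpha, beta, False))
        if v >= beta:                # MIN above would never allow this: prune
            return v
        alpha = max(alpha, v)
    return v

def min_value(node: List[Tree], alpha: float, beta: float) -> float:
    v = math.inf
    for successor in node:
        v = min(v, value(successor, alpha, beta, True))
        if v <= alpha:               # MAX above already has better: prune
            return v
        beta = min(beta, v)
    return v

# Tree with leaves (3, 12, 8), (2, 4, 6), (14, 5, 2) under three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root = value(tree, -math.inf, math.inf, True)
```

On this tree the root value is still 3, but only seven of the nine leaves are evaluated: once the second MIN node sees the leaf 2, it cannot beat the 3 that MAX already has, so 4 and 6 are skipped.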

Alpha-Beta Properties
• Has no effect on the minimax value computed for the root!
• Good child ordering improves effectiveness of pruning
• With "perfect ordering":
  – Time complexity drops to O(b^(m/2))
  – Doubles solvable depth!
  – Full search of, e.g., chess is still hopeless…
• This is a simple example of metareasoning (computing about what to compute)
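The effect of child ordering can be illustrated by counting leaf evaluations on two orderings of the same small game. This is a sketch under stated assumptions: a compact single-function alpha-beta over nested lists, strict MAX/MIN alternation, and two hand-picked orderings (`good` and `bad`) that are illustrative rather than taken from the lecture.

```python
import math
from typing import List, Tuple, Union

Tree = Union[float, List["Tree"]]

def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, counter: List[int]) -> float:
    """Return the minimax value of node; counter[0] counts leaves examined."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            v = max(v, child_value)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, child_value)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v

def leaves_examined(tree: Tree) -> Tuple[float, int]:
    counter = [0]
    root_value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return root_value, counter[0]

# Same game, two child orderings: best moves first vs. best moves last.
good = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]
bad = [[6, 4, 2], [14, 5, 2], [12, 8, 3]]
good_result = leaves_examined(good)
bad_result = leaves_examined(bad)
```

Both orderings return the same root value, as the first bullet promises, but the favorable ordering examines 5 leaves where the unfavorable one examines all 9.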

Checkup #1
(Figure: game-tree exercise with leaf values 10, 8, 4, 50)

Checkup #2
(Figure: game-tree exercise with leaf values 10, 6, 100, 8, 1, 2, 20, 4)
