Adversarial Search
Rob Platt, Northeastern University
Some images and slides are used from: AIMA and CS188 (UC Berkeley).
What is adversarial search?
Adversarial search: planning used to play a game such as chess or checkers. The algorithms are similar to graph search, except that we plan under the assumption that our opponent will maximize their own advantage...
Examples of adversarial search

A game is "solved" when its outcome can be predicted from any initial state, assuming both players play perfectly.

- Chess: unsolved (~10^40 states)
- Checkers: solved (~10^20 states)
- Tic-tac-toe: solved (fewer than 9! = 362,880 states)
- Go: unsolved
Different types of games

- Deterministic / stochastic
- Two-player / multi-player
- Zero-sum / non-zero-sum
- Perfect information / imperfect information
Zero-sum:
- utilities of all players sum to zero
- pure competition

Non-zero-sum:
- utility function of each player could be arbitrary
- optimal strategies could involve cooperation
Formalizing a Game

Given:
- a set of states (with an initial state)
- the legal actions available in each state
- a transition model (the state that results from taking an action)
- a terminal test
- a utility function assigning a value to each terminal state for each player

Calculate a policy: the action that player p should take from state s.
How do we solve for a policy?

Use adversarial search!
- build a game tree
This is a game tree for tic-tac-toe

The levels of the tree alternate between the players: You, Them, You, Them, ... Utilities are assigned at the terminal states.
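A tree like this can be generated from a successor function. A minimal sketch for tic-tac-toe, using an illustrative board encoding (a 9-tuple of 'X', 'O', or None) that is not from the slides:

```python
def successors(board, player):
    """Yield every board reachable by `player` placing a mark in an empty cell."""
    for i, cell in enumerate(board):
        if cell is None:
            # tuples are immutable, so build a new board with cell i filled in
            yield board[:i] + (player,) + board[i + 1:]

empty = (None,) * 9
print(len(list(successors(empty, 'X'))))  # 9 possible first moves for You (X)
```

Alternating the `player` argument level by level ('X', then 'O', then 'X', ...) generates exactly the You/Them/You structure above.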
What is Minimax?

Consider a simple game:
1. you make a move
2. your opponent makes a move
3. game ends

What does the minimax tree look like in this case?
What is Minimax?

The tree has three levels: Max (you) at the root, Min (them) below it, and terminal states at the bottom. The terminal utilities, left to right, are 3, 8, 12; 2, 6, 4; 14, 2, 5.

These are terminal utilities: assume we know what these values are.
What is Minimax?

Each Min node backs up the minimum of its children's utilities: min(3, 8, 12) = 3, min(2, 6, 4) = 2, min(14, 2, 5) = 2. The Max root then backs up the maximum of those values: max(3, 2, 2) = 3.

This is called "backing up" the values.
Minimax

Okay, so we know how to back up values... but how do we construct the tree? (The tree in the example above was already built.)
Minimax

Notice that we only get utilities at the bottom of the tree, so depth-first search makes sense: expand the tree depth-first, left to right, backing up each node's value as soon as all of its children have been evaluated.
- since most games have forward progress, the distinction between tree search and graph search is less important
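The depth-first backup above can be sketched in a few lines. This is a minimal illustration, not the slides' implementation; a tree is encoded as a nested list, where an internal node is a list of children and a leaf is its terminal utility:

```python
def minimax(node, maximizing=True):
    """Back up minimax values with a depth-first traversal of the game tree."""
    if not isinstance(node, list):      # terminal state: return its utility
        return node
    # recurse on children, alternating between Max and Min levels
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The 9-leaf example tree from the slides: leaves 3, 8, 12 | 2, 6, 4 | 14, 2, 5
tree = [[3, 8, 12], [2, 6, 4], [14, 2, 5]]
print(minimax(tree))  # backed-up root value: 3
```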
Minimax

Is it always correct to assume your opponent plays optimally?

Consider a Max root whose left Min child has leaves 10 and 10, and whose right Min child has leaves 9 and 100. Minimax backs up 10 and 9 and picks the left branch, even though the right branch risks only one point of utility for a chance at 100.

Minimax properties

Is minimax optimal? Is it complete?
Minimax properties

Is minimax optimal? Is it complete?
- Time complexity = O(b^m)
- Space complexity = O(bm)
(where b is the branching factor and m is the maximum depth of the tree)

Is it practical? In chess, b = 35, d = 100, and 35^100 is a big number...
So what can we do?

Evaluation functions

Key idea: cut off the search at a certain depth and give the corresponding nodes an estimated value. The evaluation function makes this estimate.
Evaluation functions

How does the evaluation function make the estimate? It depends upon the domain. For example, in chess, the value of a state might equal the sum of piece values:
- a pawn counts for 1
- a rook counts for 5
- a knight counts for 3
...
A weighted linear evaluation function

Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)

Here the features fi(s) might be the number of pawns on the board, the number of knights on the board, and so on, with weights given by the piece values (a pawn counts for 1, a knight counts for 3).

Example evaluations from the board positions shown: Eval = 3 - 2.5 = 0.5 and Eval = 3 + 2.5 + 1 + 1 - 2.5 = 5.

Maybe consider other factors as well?
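The weighted linear form is easy to sketch. The feature vector and weights below are illustrative choices, not taken from the slides' board positions:

```python
def eval_linear(features, weights):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)"""
    return sum(w * f for w, f in zip(weights, features))

# features: material differences (my pieces minus theirs) for pawns, knights, rooks
features = (2, 1, 0)   # up two pawns and a knight
weights = (1, 3, 5)    # piece values: pawn 1, knight 3, rook 5
print(eval_linear(features, weights))  # 2*1 + 1*3 + 0*5 = 5
```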
Problem: in realistic games, we cannot search to the leaves!

Solution: depth-limited search
- search only to a limited depth in the tree
- replace terminal utilities with an evaluation function for non-terminal positions

Example: suppose we have 100 seconds and can explore 10K nodes/sec, so we can check 1M nodes per move.

- the guarantee of optimal play is gone
- more plies makes a BIG difference
- use iterative deepening for an anytime algorithm
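A depth-limited version of the earlier minimax sketch, under the same nested-list tree assumption. The `eval_fn` here (averaging the leaves below a node) is a stand-in chosen only so the snippet runs; a real game would use a domain-specific estimate like the weighted piece sum above:

```python
def eval_fn(node):
    """Crude stand-in evaluation: average of all leaves under the node."""
    if not isinstance(node, list):
        return node
    leaves, stack = [], [node]
    while stack:                        # iterative collection of leaf utilities
        n = stack.pop()
        if isinstance(n, list):
            stack.extend(n)
        else:
            leaves.append(n)
    return sum(leaves) / len(leaves)

def depth_limited_minimax(node, depth, maximizing=True):
    if not isinstance(node, list):      # true terminal state
        return node
    if depth == 0:                      # cutoff: estimate instead of searching on
        return eval_fn(node)
    values = [depth_limited_minimax(c, depth - 1, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

tree = [[3, 8, 12], [2, 6, 4], [14, 2, 5]]
print(depth_limited_minimax(tree, depth=1))  # max of the children's estimates (23/3 here)
print(depth_limited_minimax(tree, depth=2))  # deep enough to reach the leaves: 3
```

With depth 2 the search reaches the true terminal utilities and agrees with plain minimax; with depth 1 the answer depends entirely on `eval_fn`, which is why evaluation quality matters near the cutoff.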
Evaluation functions

At what depth do you run the evaluation function?
- Option 1: cut off the search at a fixed depth
- Option 2: cut off the search at particular states deeper than a certain threshold

The deeper your threshold, the less the quality of the evaluation function matters...
Alpha/Beta pruning

Consider the earlier minimax tree (terminal utilities 3, 8, 12; 2, 6, 4; 14, 2, 5). After the first Min node is fully evaluated, Max knows it can get at least 3. When we start the second Min node and see a child with value 2, that node's backed-up value can be at most 2, so Max will never choose it. We don't need to expand that node's remaining children!

So we don't need to expand these nodes in order to back up correct values at the root. That's alpha-beta pruning.
Alpha/Beta pruning: algorithm

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β: return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α: return v
        β = min(β, v)
    return v

α: MAX's best option on path to root
β: MIN's best option on path to root
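The pseudocode above can be ported to runnable Python (the nested-list tree encoding is an illustrative choice, not from the slides):

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax with alpha-beta pruning over a nested-list game tree."""
    if not isinstance(node, list):          # terminal state: return its utility
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            if v >= beta:                   # MIN above will never allow this branch
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            if v <= alpha:                  # MAX above already has a better option
                return v
            beta = min(beta, v)
        return v

tree = [[3, 8, 12], [2, 6, 4], [14, 2, 5]]
print(alphabeta(tree))  # same root value as plain minimax: 3
```

On this tree the second Min node returns as soon as it sees the leaf 2 (since 2 ≤ α = 3), exactly the prune described above.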
Alpha/Beta pruning: a worked trace

Start at the root with the range (-inf, +inf).
- First Min node: after seeing the leaf 3, its range becomes (-inf, 3), the best value so far for MIN along the path to the root. The leaves 12 and 8 don't lower it, so the node's value is 3.
- Back at the root, MAX's best option along the path to the root is now 3, so the range becomes (3, +inf).
- Second Min node, range (3, +inf): its first leaf is 2. Prune, because the value (2) is out of the alpha-beta range.
- Third Min node, range (3, +inf): the leaf 14 narrows the range to (3, 14), the leaf 5 narrows it to (3, 5), and the leaf 2 makes the node's value 2.
- The root backs up max(3, 2, 2) = 3.
Alpha/Beta properties

Is it complete? How much does alpha/beta help relative to minimax?
- Minimax time complexity = O(b^m)
- With perfect move ordering, alpha/beta's time complexity improves to O(b^(m/2))
- in general, the improvement with alpha/beta depends upon the move ordering: the order in which we expand a node's children

How to choose a move ordering? Use iterative deepening (IDS):
- on each iteration of IDS, use the prior run to inform the ordering of the next node expansions
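The ordering idea can be sketched as follows. This is an illustration, not the slides' method: here the children are scored by a full backup, whereas in practice IDS would score them with the previous, shallower iteration's values. It redefines `minimax` so the snippet is self-contained:

```python
def minimax(node, maximizing=True):
    if not isinstance(node, list):
        return node
    values = [minimax(c, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def order_children(children, maximizing=True):
    """Sort moves so the most promising come first; searching them first
    tightens alpha/beta early and maximizes pruning on the next iteration."""
    scored = [(minimax(c, not maximizing), c) for c in children]
    scored.sort(key=lambda t: t[0], reverse=maximizing)
    return [c for _, c in scored]

tree = [[3, 8, 12], [2, 6, 4], [14, 2, 5]]
print([minimax(c, False) for c in order_children(tree, True)])  # [3, 2, 2]
```

With the best move (value 3) searched first, alpha is set to 3 immediately, so both remaining subtrees can be cut off early.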
Expectimax

What if your opponent does not maximize his/her utility? For example, suppose he/she picks moves uniformly at random?

Using the earlier example (left Min child with leaves 10 and 10; right Min child with leaves 9 and 100):
- Minimax backup for a rational agent: the Min nodes back up 10 and 9, so Max picks the left branch.
- Backup for an agent who selects actions uniformly at random: the nodes back up the averages 10 and 54.5, so Max picks the right branch.

Instead of backing up min values for min-plies, back up the average.
- We could also account for agents who are somewhere in between rational and uniformly random. How?
- Later, this idea will be generalized using Markov Decision Processes.
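The change from minimax is a single line: replace `min` with the average at opponent plies. A minimal sketch over the same nested-list trees as before (an illustrative encoding, not the slides'):

```python
def expectimax(node, maximizing=True):
    """Like minimax, but opponent nodes back up the expected value
    under a uniformly random move choice instead of the minimum."""
    if not isinstance(node, list):          # terminal utility
        return node
    values = [expectimax(child, not maximizing) for child in node]
    if maximizing:
        return max(values)
    return sum(values) / len(values)        # average over uniform-random moves

# The slides' example: one branch with leaves (10, 10), the other with (9, 100)
tree = [[10, 10], [9, 100]]
print(expectimax(tree))  # max(10.0, 54.5) = 54.5, so Max prefers the right branch
```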
Backgammon

(Figure: a backgammon board with numbered points and checkers.)
Mixing these ideas: nondeterministic games

In nondeterministic games, chance is introduced by dice or card-shuffling. A simplified example with coin-flipping: the tree now has three kinds of levels, max, chance, and min, where each chance node backs up the expected value of its children (here, each branch has probability 0.5).

(Figure: a small max/chance/min tree; with 0.5/0.5 branches, the chance nodes back up values such as 0.5 · 2 + 0.5 · 4 = 3.)
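The max/chance/min pattern generalizes the expectimax sketch. The encoding below is an illustrative assumption (a chance node is a list of (probability, child) pairs), not the slides' notation, and the example tree's numbers are made up rather than read off the figure:

```python
def expectiminimax(node, level="max"):
    """Back up values through alternating max, chance, and min levels."""
    if not isinstance(node, (list, tuple)):
        return node                         # terminal utility
    if level == "max":
        return max(expectiminimax(c, "chance") for c in node)
    if level == "chance":                   # node = [(prob, child), ...]
        return sum(p * expectiminimax(c, "min") for p, c in node)
    return min(expectiminimax(c, "max") for c in node)

# Root max node over two coin-flip chance nodes, each over two min nodes
tree = [
    [(0.5, [2, 4]), (0.5, [6, 8])],   # backs up 0.5*2 + 0.5*6 = 4.0
    [(0.5, [0, 5]), (0.5, [-2, 3])],  # backs up 0.5*0 + 0.5*(-2) = -1.0
]
print(expectiminimax(tree))  # max(4.0, -1.0) = 4.0
```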