
Adversarial Search
Berlin Chen 2004

References:
1. S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Chapter 6
2. N. J. Nilsson. Artificial Intelligence: A New Synthesis. Chapter 12
3. S. Russell’s teaching materials

Introduction
• Game theory
  – First developed by von Neumann and Morgenstern
  – Widely studied by economists, mathematicians, financiers, etc.
  – The action of one player (agent) can significantly affect the utilities of the others
    • Cooperative or competitive
• Deals with environments with multiple agents
• Most games studied in AI are
  – Deterministic (but strategic): (state, action(state)) → next state
  – Turn-taking
  – Two-player
  – Zero-sum
  – Perfect information
• This means deterministic, fully observable environments in which there are two agents whose actions must alternate and in which the utility values at the end of the game are always equal and opposite
• But not physical games
AI 2004 – Berlin Chen

Types of Games

                          Deterministic                   Chance
  Perfect information     Chess, Checkers, Go, Othello    Backgammon
  Imperfect information                                   Bridge, Poker

• Games are one of the first tasks undertaken in AI
  – The abstract nature of (nonphysical) games makes them an appealing subject in AI
• Computers have surpassed humans at checkers and Othello, and have defeated human champions in chess and backgammon
• However, in Go, computers still perform at the amateur level (as of 2004)

Games as Search Problems
• Games are usually too hard to solve
  – E.g., a chess game
    • Average branching factor: 35
    • Average number of moves by each player: 50
    • Total number of nodes in the search tree: $35^{100}$, or about $10^{154}$
    • Total number of distinct states: about $10^{40}$
• The solution is a strategy that specifies a move for every possible opponent reply
  – Time limit: how to make the best possible use of time?
    • Calculating the optimal decision may be infeasible
    • Pruning is needed
  – Uncertainty: due to the opponent’s actions and game complexity
    • Imperfect information
    • Chance
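The tree-size estimate above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are ours and do not come from the slides.

```python
import math

branching_factor = 35   # average number of legal moves per position (from the slide)
moves_per_player = 50   # average number of moves made by each player (from the slide)
plies = 2 * moves_per_player  # tree depth counted in single moves by either player

# log10(35^100) = 100 * log10(35)
exponent = plies * math.log10(branching_factor)
print(f"35^{plies} is about 10^{exponent:.0f}")
```

The exponent comes out slightly above 154, which is where the $10^{154}$ figure comes from.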

Scenario
• Games with two players, taking turns
  – MAX moves first
  – MIN moves second
  – At the end of the game
    • The winner is awarded points and the loser is penalized
    • Or the game is a draw
  – Can be formally defined as a kind of search problem
[Figure: Sense → Plan → Act]

Games as Search Problems
• Main components should be specified
  – Initial State
    • Board position and which player is to move
  – Successor Function
    • A list of legal (move, state) pairs for each state, each indicating a legal move and the resulting state
  – (The initial state and the successor function together define the game tree)
  – Terminal Test
    • Determines when the game is over
    • Terminal states: states where the game has ended
  – Utility Function (objective/payoff function)
    • Gives numeric values for all terminal states, from the viewpoint of MAX, e.g.:
      – Win, loss or draw: +1, -1, 0
      – Or values with a wider variety
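As a rough illustration of these four components, here is a minimal sketch for tic-tac-toe. The board encoding (a tuple of nine cells) and all function names are assumptions made for this example; the slides do not prescribe a representation.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]          # 9 cells, each "X", "O" or " " (assumed encoding)
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state() -> Board:
    """Initial state: empty board, MAX ("X") to move."""
    return (" ",) * 9

def to_move(board: Board) -> str:
    """X moves whenever both players have placed the same number of marks."""
    return "X" if board.count("X") == board.count("O") else "O"

def successors(board: Board) -> List[Tuple[int, Board]]:
    """Successor function: all legal (move, state) pairs; a move is an empty cell index."""
    player = to_move(board)
    return [(i, board[:i] + (player,) + board[i + 1:])
            for i, cell in enumerate(board) if cell == " "]

def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def terminal_test(board: Board) -> bool:
    """The game is over when someone has won or the board is full."""
    return winner(board) is not None or " " not in board

def utility(board: Board) -> int:
    """Utility of a terminal state from MAX's ("X") viewpoint: +1, -1 or 0."""
    return {"X": 1, "O": -1, None: 0}[winner(board)]
```

A caller is expected to check `terminal_test` before asking for `successors`; the sketch does not enforce this.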

Example Game Tree for Tic-Tac-Toe
• Tic-Tac-Toe is also called Noughts and Crosses
  – 2-player, deterministic, alternating game tree
  – The numbers on the leaves indicate the utility values of the terminal states from the point of view of MAX

Minimax Search
• A strategy/solution for optimal decisions
• Examine the minimax value of each node in the game tree

$$
\text{Minimax-Value}(n) =
\begin{cases}
\text{Utility}(n) & \text{if } n \text{ is a terminal state} \\
\max_{s \in \text{Successors}(n)} \text{Minimax-Value}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{Successors}(n)} \text{Minimax-Value}(s) & \text{if } n \text{ is a MIN node}
\end{cases}
$$

  – The minimax value is just the utility from the point of view of MAX
  – Assumes the two players (MAX and MIN) play optimally (infallibly) from the current node to the end of the game
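The recursive definition translates almost directly into code. The sketch below assumes a toy encoding in which a leaf is an integer utility and an inner node is a list of children; the tree literal uses the leaf values 3, 12, 8 / 2, 4, 6 / 14, 5, 2 from the worked example in these slides.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]   # leaf = utility, inner node = list of children (assumed encoding)

def minimax_value(node: Tree, is_max: bool = True) -> int:
    """Minimax value of a node; levels alternate between MAX and MIN."""
    if isinstance(node, int):            # terminal state: return its utility
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # MAX root with three MIN successors
print(minimax_value(tree))  # 3
```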

Minimax Search (cont.)
• Example: a trivial 2-ply (one-move-deep) game
  – Perfect play for a deterministic, perfect-information game
    • MAX and MIN play optimally
  – Idea: choose the move to the position with the highest minimax value, i.e., the best achievable payoff against best play
• A ply is a single move by one player; one full move consists of two plies, one by MAX and one by MIN

Tree for Tic-Tac-Toe
[Figures on three consecutive slides: the tic-tac-toe game tree with alternating MAX and MIN levels]

Minimax Search: Algorithm
[Figure: the minimax algorithm, with separate cases for MAX nodes and MIN nodes]

Minimax Search: Example
[Figures: step-by-step trace on a two-ply tree. The root A is a MAX node; its successors B, C and D are MIN nodes with leaf utilities (3, 12, 8), (2, 4, 6) and (14, 5, 2).]
• v_A is initialized to -∞ and v_B to +∞
• B: the terminal test succeeds at leaf 3, so v_B = 3; leaves 12 and 8 leave v_B = 3; backed up to the root, v_A = 3
• C: v_C starts at +∞; leaf 2 gives v_C = 2; leaves 4 and 6 leave v_C = 2; backed up to the root, v_A stays 3
• D: v_D starts at +∞; leaves 14, 5 and 2 give v_D = 14, 5 and 2 in turn; backed up to the root, v_A stays 3
• Result: v_B = 3, v_C = 2, v_D = 2, v_A = 3

Minimax Search (cont.)
• Explanation of the minimax algorithm
  – A complete depth-first, recursive exploration of the game tree
  – The utility function is applied to each terminal state
  – The utilities (min or max values) of the internal tree nodes are calculated and then backed up through the tree as the recursion unwinds
  – At the root, MAX chooses the move leading to the highest utility
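The depth-first back-up described above can be sketched with a pair of mutually recursive functions, in the style of the textbook algorithm. The node names, the `CHILDREN` and `UTILITY` tables, and the leaf labels are hypothetical and only encode the A/B/C/D example tree.

```python
import math

# Hypothetical encoding of the two-ply example tree.
CHILDREN = {"A": ["B", "C", "D"],
            "B": ["b1", "b2", "b3"],
            "C": ["c1", "c2", "c3"],
            "D": ["d1", "d2", "d3"]}
UTILITY = {"b1": 3, "b2": 12, "b3": 8,
           "c1": 2, "c2": 4, "c3": 6,
           "d1": 14, "d2": 5, "d3": 2}

def max_value(state):
    """Backed-up value of a MAX node (or the utility of a terminal state)."""
    if state in UTILITY:                 # terminal test
        return UTILITY[state]
    v = -math.inf
    for s in CHILDREN[state]:
        v = max(v, min_value(s))
    return v

def min_value(state):
    """Backed-up value of a MIN node (or the utility of a terminal state)."""
    if state in UTILITY:                 # terminal test
        return UTILITY[state]
    v = math.inf
    for s in CHILDREN[state]:
        v = min(v, max_value(s))
    return v

def minimax_decision(state):
    """At a MAX node, pick the successor with the highest backed-up value."""
    return max(CHILDREN[state], key=min_value)

print(minimax_decision("A"), max_value("A"))  # B 3
```

Ties at the root are broken in favour of the first successor examined, which is a choice of this sketch rather than something the slides specify.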

Properties of Minimax Search
• Is complete if the tree is finite
• Is optimal if the opponent acts optimally
• Time complexity: $O(b^m)$
  – b: the branching factor
  – m: the maximum depth of the tree
• Space complexity: $O(bm)$, or $O(m)$ when successors are generated one at a time
• For chess, b ≈ 35 and m ≈ 100 for “reasonable” games, i.e., an exact solution is completely infeasible
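To get a feel for the $O(b^m)$ growth, the following sketch counts the nodes of a uniform tree. It assumes every node has exactly b successors down to depth m, which real games only approximate; the function name is ours.

```python
def nodes_in_full_tree(b: int, m: int) -> int:
    """Number of nodes in a uniform tree with branching factor b and depth m."""
    return sum(b ** d for d in range(m + 1))

# Growth for a chess-like branching factor of 35 at small depths.
for depth in (2, 4, 6, 8):
    print(depth, nodes_in_full_tree(35, depth))
```

Even at depth 4 the count is already above 1.5 million nodes, while depth-first search only needs to keep on the order of b·m nodes in memory at once.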

Optimal Decisions in Multiplayer Games
• Extend the minimax idea to multiplayer games
• Replace the single value for each node with a vector of values (utility vector)
• Alliances among players are sometimes involved
  – E.g., A and B form an alliance to attack C
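One way to read the utility-vector idea is that the player to move picks the successor whose vector is best in that player's own component. The sketch below assumes this rule, a fixed turn order, and a made-up three-player tree; it does not model alliances.

```python
from typing import List, Tuple, Union

Vector = Tuple[int, ...]                 # one utility per player
Node = Union[Vector, List["Node"]]       # leaf = utility vector, inner node = list of children

def multiplayer_value(node: Node, player: int, n_players: int) -> Vector:
    """Back up utility vectors; `player` is the index of the player to move."""
    if isinstance(node, tuple):          # terminal state: return its utility vector
        return node
    next_player = (player + 1) % n_players
    values = [multiplayer_value(child, next_player, n_players) for child in node]
    # The player to move maximizes their own component of the vector.
    return max(values, key=lambda v: v[player])

# Hypothetical tree: player 0 (A) moves at the root, then player 1 (B), then player 2 (C).
tree = [
    [[(1, 2, 6), (4, 2, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]
print(multiplayer_value(tree, player=0, n_players=3))  # (1, 2, 6)
```

Ties are broken in favour of the first successor, which matters at the root of this example because both options give player 0 a utility of 1.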

α-β Pruning
• The problem with minimax search
  – The number of nodes to examine is exponential in the number of moves
• α-β pruning
  – Applied to the minimax tree
  – Returns the same moves as minimax would, but prunes away branches that cannot possibly influence the final decision
    • α: the value of the best (highest-value) choice found so far in the search for MAX
    • β: the value of the best (lowest-value) choice found so far in the search for MIN

α-β Pruning (cont.)
• Example (figures over several slides: root A is a MAX node; B, C and D are its MIN successors)
  – After B has been evaluated to 3, the subtree to be explored next should have a utility equal to or higher than 3 to be of interest to MAX
  – At C, the first leaf is 2, so the utility of this subtree will be no more than 2 (lower than the current α); the remaining children of C can be pruned
  – At D, none of the successors can be pruned at all, because the worst successors of D (from MIN’s point of view) are generated first

α-β Pruning (cont.)

$$
\begin{aligned}
\text{Minimax-Value}(\text{root}) &= \max(\min(3, 12, 8),\ \min(2, x, y),\ \min(14, 5, 2)) \\
&= \max(3,\ \min(2, x, y),\ 2) \\
&= \max(3,\ z,\ 2) \quad \text{where } z \le 2 \\
&= 3
\end{aligned}
$$

• The value of the root is independent of the values of the pruned leaves x and y
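The independence claim can be spot-checked numerically by plugging many values of x and y into the same expression. The range of values tried here is arbitrary and chosen only for illustration.

```python
import itertools

def root_value(x: int, y: int) -> int:
    """Root value of the example tree with the two pruned leaves set to x and y."""
    return max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))

# Try every pair (x, y) with both values in -100..100 and collect the distinct results.
values = {root_value(x, y) for x, y in itertools.product(range(-100, 101), repeat=2)}
print(values)  # {3}
```

This is a sanity check rather than a proof; the argument itself is the inequality min(2, x, y) ≤ 2 < 3.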

Tree for Tic-Tac-Toe (cont.)
[Figure: tic-tac-toe game tree annotated with alpha value = -1 and beta value = -1]

α-β Pruning (cont.)
• Algorithm
  – For a MAX node, pruning: if one of its children has a value greater than or equal to β, the best (lowest) value found so far for its MIN predecessor nodes, return immediately
  – For a MIN node, pruning: if one of its children has a value less than or equal to α, the best (highest) value found so far for its MAX predecessor nodes, return immediately
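The two pruning rules can be sketched as follows. The nested-list tree encoding and the `visited` list (which records the leaves that actually get evaluated) are assumptions of this example, not part of the slides.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]   # leaf = utility, inner node = list of children (assumed encoding)

def alphabeta(node: Tree, alpha: float, beta: float, is_max: bool,
              visited: List[int]) -> float:
    """Minimax value of `node`, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        visited.append(node)             # record every leaf that is evaluated
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:                # MIN above would never allow this node
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                   # MAX above already has a better option
            return v
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

On the example tree the leaves 4 and 6 under C are never evaluated, while all three leaves under D are.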

α-β Pruning (cont.)
[Figure: a path through alternating MAX and MIN levels, with a node m higher up and a node n further down]
• If m is better than n for the player (MAX), n will never be reached in play and can therefore be pruned
• Some of n’s descendants must be examined to reach this conclusion

Properties of α-β Pruning
• Pruning does not affect the final result
• The effectiveness of alpha-beta pruning is highly dependent on the order in which the successors are examined
  – It is worthwhile to try to examine first the successors that are likely to be best
  – E.g., if the third successor “2” of node D had been generated first, the other two, “14” and “5”, could have been pruned
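The effect of successor ordering can be measured by counting evaluated leaves for the two orderings of D mentioned above. This is a compact sketch under the same assumed nested-list encoding; the `counter` argument and the helper `leaves_evaluated` are ours.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, is_max=True, counter=None):
    """Alpha-beta value of a nested-list tree; counter[0] counts evaluated leaves."""
    if isinstance(node, int):
        if counter is not None:
            counter[0] += 1
        return node
    best = -math.inf if is_max else math.inf
    for child in node:
        value = alphabeta(child, alpha, beta, not is_max, counter)
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:                # remaining children cannot change the decision
            break
    return best

def leaves_evaluated(tree):
    """Return (root value, number of leaves evaluated) for a MAX root."""
    counter = [0]
    value = alphabeta(tree, counter=counter)
    return value, counter[0]

print(leaves_evaluated([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # (3, 7)
print(leaves_evaluated([[3, 12, 8], [2, 4, 6], [2, 14, 5]]))  # (3, 5)
```

With D's leaves in the original order, seven of the nine leaves are evaluated; with "2" generated first, only five are, and the root value is unchanged.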
