Adversarial Search R&N 5.1–5.5 Jacques Fleuriot University of Edinburgh – PowerPoint PPT Presentation



SLIDE 1

THE UNIVERSITY OF EDINBURGH

Adversarial Search

R&N 5.1–5.5 Jacques Fleuriot

University of Edinburgh, School of Informatics jdf@ed.ac.uk

Jacques Fleuriot Adversarial Search, R&N 5.1–5.5 1/25

SLIDE 2

Overview

• Perfect play
• α–β pruning
• Resource limits
• Games of chance

SLIDE 3

Games vs. search problems

A game can be formally defined as a kind of search problem:

• S0: the initial state, which specifies how the game is set up at the start.
• PLAYER(s): defines which player has the move in a state.
• ACTIONS(s): returns the set of legal moves in a state.
• RESULT(s, a): the transition model, which defines the result of a move.
• TERMINAL-TEST(s): true when the game is over and false otherwise. States where the game has ended are called terminal states.
• UTILITY(s, p): a utility function (objective or payoff) that defines the final numeric value for a game that ends in terminal state s for a player p. In chess, the outcome is a win (1), loss (0), or draw (1/2).

• “Unpredictable” opponent ⇒ solution is a strategy
• Time limits ⇒ unlikely to find goal, must approximate
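As a concrete (and purely illustrative) sketch, the six components above can be written down for a toy game. The Nim variant, the state encoding and all names below are our own choices, not part of the slides:

```python
# Formal game interface sketch, using a toy Nim game: a pile of stones,
# each player removes 1 or 2 stones, and taking the last stone wins.
# States are (pile, player-to-move) tuples -- our own encoding.

S0 = (3, "MAX")                      # initial state: 3 stones, MAX to move

def player(s):                       # PLAYER(s): who has the move?
    return s[1]

def actions(s):                      # ACTIONS(s): legal moves in s
    pile, _ = s
    return [n for n in (1, 2) if n <= pile]

def result(s, a):                    # RESULT(s, a): transition model
    pile, p = s
    return (pile - a, "MIN" if p == "MAX" else "MAX")

def terminal_test(s):                # TERMINAL-TEST(s): is the game over?
    return s[0] == 0

def utility(s, p):                   # UTILITY(s, p): 1 = win, 0 = loss for p
    # The player to move at an empty pile has just lost: the opponent
    # took the last stone.
    loser = s[1]
    return 0 if p == loser else 1
```

For instance, `actions(S0)` is `[1, 2]`, and any state with an empty pile is terminal, with the player who took the last stone winning.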

SLIDE 4

Types of games

SLIDE 5

Game tree (2-player, deterministic, turns)

[Figure: partial game tree for tic-tac-toe. Levels alternate MAX (X), MIN (O), MAX (X), MIN (O), … down to TERMINAL states, whose utilities (−1, 0, +1) are shown at the bottom.]

Utility for each terminal state is from MAX’s point of view.

SLIDE 6

Optimal Decisions

Normal search: an optimal decision is a sequence of actions leading to a goal state (i.e. a winning terminal state).
Adversarial search: MIN has a say in the game, so MAX needs to find a contingent strategy that specifies:

• MAX’s move in the initial state, then...
• MAX’s moves in the states resulting from every response by MIN to that move, then...
• MAX’s moves in the states resulting from every response by MIN to all those moves, etc.

minimax value of a node = utility for MAX of being in corresponding state:

MINIMAX(s) =
    UTILITY(s)                                    if TERMINAL-TEST(s)
    max_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MIN

SLIDE 7

Minimax

Perfect play for deterministic, perfect-information games.
Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play.
Example: 2-ply game:

[Figure: 2-ply game tree. MAX’s moves a1, a2, a3 lead to three MIN nodes whose leaves have utilities (3, 12, 8), (2, 4, 6) and (14, 5, 2); the backed-up MIN values are 3, 2 and 2, so the minimax value of the root is 3.]

Idea: proceed all the way down to the leaves of the tree, then back the minimax values up through the tree.
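The backed-up values can be checked directly from the leaf utilities in the 2-ply example:

```python
# Back up the value of each MIN node, then take the MAX at the root.
# Leaf utilities are those of the 2-ply example on this slide.
min_values = [min(3, 12, 8), min(2, 4, 6), min(14, 5, 2)]
root_value = max(min_values)
```

This gives MIN values 3, 2, 2 and a root minimax value of 3, matching the slide.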

SLIDE 8

Minimax algorithm

function MINIMAX-DECISION(state) returns an action
    return argmax_{a ∈ ACTIONS(state)} MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← −∞
    for each a in ACTIONS(state) do
        v ← MAX(v, MIN-VALUE(RESULT(state, a)))
    return v

function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for each a in ACTIONS(state) do
        v ← MIN(v, MAX-VALUE(RESULT(state, a)))
    return v
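The pseudocode translates almost line-for-line into Python. The nested-list tree encoding below (an int is a terminal utility, a list holds the successor subtrees) is an illustrative assumption, not from the slides:

```python
# Minimax on game trees given as nested lists -- our own toy encoding.

def minimax(node, maximizing=True):
    if isinstance(node, int):            # TERMINAL-TEST / UTILITY
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def minimax_decision(tree):
    # Index of the best root move for MAX (the argmax in the pseudocode).
    values = [minimax(child, maximizing=False) for child in tree]
    return max(range(len(values)), key=values.__getitem__)

# The 2-ply example from the previous slide:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On the example tree, `minimax(tree)` is 3 and the best root move is the first one (a1).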

SLIDE 9

Properties of minimax

Complete? Optimal? Time complexity? Space complexity?

SLIDE 10

Properties of minimax

Complete? Yes, if tree is finite Optimal? Time complexity? Space complexity?

SLIDE 11

Properties of minimax

Complete? Yes, if tree is finite Optimal? Yes, against an optimal opponent. Otherwise? Time complexity? Space complexity?

SLIDE 12

Properties of minimax

Complete? Yes, if tree is finite Optimal? Yes, against an optimal opponent. Otherwise? Time complexity? O(b^m) Space complexity?

SLIDE 13

Properties of minimax

Complete? Yes, if tree is finite Optimal? Yes, against an optimal opponent. Otherwise? Time complexity? O(b^m) Space complexity? O(bm) (depth-first exploration)

SLIDE 14

Properties of minimax

Complete? Yes, if tree is finite Optimal? Yes, against an optimal opponent. Otherwise? Time complexity? O(b^m) Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for “reasonable” games
⇒ exact solution completely infeasible!
⇒ would like to eliminate (large) parts of the game tree

SLIDE 15

A Prolog implementation of Minimax [1]

minimax(Pos, BestNextPos, Val) :-
    bagof(NextPos, move(Pos, NextPos), NextPosList),
    bestmove(NextPosList, BestNextPos, Val), !.
minimax(Pos, _, Val) :-
    utility(Pos, Val).

bestmove([Pos], Pos, Val) :-
    minimax(Pos, _, Val), !.
bestmove([Pos1 | PosList], BestPos, BestVal) :-
    minimax(Pos1, _, Val1),
    bestmove(PosList, Pos2, Val2),
    betterOf(Pos1, Val1, Pos2, Val2, BestPos, BestVal).

betterOf(Pos0, Val0, _, Val1, Pos0, Val0) :-
    min_to_move(Pos0), Val0 > Val1, !
    ;
    max_to_move(Pos0), Val0 < Val1, !.
betterOf(_, _, Pos1, Val1, Pos1, Val1).

[1] Algorithm adapted from Prolog Programming for Artificial Intelligence by Bratko

SLIDE 16

α–β pruning

It is possible to compute the correct minimax decision without looking at every node in the game tree. When applied to a standard minimax tree, α–β pruning returns the same move as minimax would, but it prunes away branches that cannot possibly influence the final decision.

SLIDE 17

α–β pruning example

[Figure: α–β example, step 1 — the first MIN node’s leaves 3, 12 and 8 are evaluated, so its value is 3 and MAX’s root value is at least 3.]

SLIDE 18

α–β pruning example

[Figure: α–β example, step 2 — the second MIN node’s first leaf is 2, so its value is at most 2 ≤ 3; its remaining successors are pruned (marked X X).]

SLIDE 19

α–β pruning example

[Figure: α–β example, step 3 — the third MIN node’s first leaf is 14, so its value is at most 14; the search must continue.]

SLIDE 20

α–β pruning example

[Figure: α–β example, step 4 — the next leaf is 5, lowering the third MIN node’s bound to at most 5; still above 3, so the search continues.]

SLIDE 21

α–β pruning example

[Figure: α–β example, step 5 — the last leaf is 2, so the third MIN node’s value is 2 and the root’s minimax value is 3, exactly as with full minimax.]

SLIDE 22

Properties of α–β

• Pruning does not affect the final result (as we saw in the example)
• Good move ordering improves the effectiveness of pruning (how could the tree in the example be better?)
• With “perfect ordering,” time complexity = O(b^{m/2})

    • branching factor effectively goes from b to √b
    • (alternative view) doubles the depth of search compared to minimax

A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
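The “doubles the depth” claim is just exponent arithmetic; a quick sketch, with b = 35 as quoted for chess and illustrative depths:

```python
# With perfect ordering, alpha-beta searches to depth m in about b**(m/2)
# nodes, so a fixed node budget reaches roughly twice the depth.

b = 35                                # chess branching factor (from slide 14)

def minimax_nodes(m):
    return b ** m                     # plain minimax, depth m

def alphabeta_nodes(m):
    return b ** (m // 2)              # perfect-ordering alpha-beta, even m

budget = minimax_nodes(4)             # nodes minimax needs at depth 4
# The same budget lets alpha-beta reach depth 8:
```

Here `alphabeta_nodes(8)` equals `minimax_nodes(4)`: depth 8 with pruning costs what depth 4 costs without.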

SLIDE 23

Why is it called α–β?

[Figure: a path in the game tree alternating MAX and MIN nodes, ending at a node with value v.]

α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX
If v is worse than α, MAX will avoid it ⇒ prune that branch
Define β similarly for MIN

SLIDE 24

The α–β algorithm I

Basically Minimax + keep track of α, β + prune

function ALPHA-BETA-SEARCH(state) returns an action
    v ← MAX-VALUE(state, −∞, +∞)
    return the action in ACTIONS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← −∞
    for each a in ACTIONS(state) do
        v ← MAX(v, MIN-VALUE(RESULT(state, a), α, β))
        if v ≥ β then return v
        α ← MAX(α, v)
    return v

SLIDE 25

The α–β algorithm II

function MIN-VALUE(state, α, β) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for each a in ACTIONS(state) do
        v ← MIN(v, MAX-VALUE(RESULT(state, a), α, β))
        if v ≤ α then return v
        β ← MIN(β, v)
    return v
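Putting MAX-VALUE and MIN-VALUE together, a Python sketch of α–β on the same nested-list tree encoding as before (our own choice, not from the slides) also lets us count how many leaves are actually evaluated:

```python
# Alpha-beta pruning on nested-list game trees (ints are leaf utilities).

import math

visited = []                              # leaves actually evaluated

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            if v >= beta:
                return v                  # beta-cutoff
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            if v <= alpha:
                return v                  # alpha-cutoff
            beta = min(beta, v)
        return v

# The 2-ply tree from the earlier example: 9 leaves in total.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On this tree `alphabeta(tree)` returns 3, the same value as full minimax, while evaluating only 7 of the 9 leaves (4 and 6 are pruned, matching the X X markers in the example slides).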

SLIDE 26

Resource limits

Suppose we have 100 seconds and explore 10^4 nodes/second ⇒ 10^6 nodes per move
Standard approach:
• Evaluation function = estimated desirability of position
• Cutoff test, e.g. a depth limit (perhaps add quiescence search, which tries to search interesting positions to a greater depth than quiet ones)

SLIDE 27

Evaluation functions

For chess, typically a linear weighted sum of features:
Eval(s) = w_1 f_1(s) + w_2 f_2(s) + … + w_n f_n(s)
e.g., w_1 = 9 with f_1(s) = (number of white queens) − (number of black queens), etc.
Assumes that the contribution of each feature is independent of the values of the other features
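A minimal sketch of such a weighted-sum evaluator; the piece-count representation and helper name are our own, with the classical material weights (queen = 9, rook = 5, bishop/knight = 3, pawn = 1):

```python
# Eval(s) = sum_i w_i * f_i(s), where each feature f_i is
# (# white pieces of type i) - (# black pieces of type i).

WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def material_eval(white_counts, black_counts):
    # Positive scores favour White; missing piece types count as zero.
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in WEIGHTS.items())

# White is a queen up, Black has one extra pawn:
score = material_eval({"Q": 1, "P": 7}, {"Q": 0, "P": 8})
```

Here `score` is 9·(1−0) + 1·(7−8) = 8, illustrating the independence assumption: each feature contributes on its own, regardless of the others.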

SLIDE 28

Cutting off search

Standard approach:
• Use CUTOFF-TEST instead of TERMINAL-TEST
• Use EVAL instead of UTILITY, i.e. an evaluation function that estimates the desirability of the position
Heuristic minimax:

H-MINIMAX(s, d) =
    EVAL(s)                                             if CUTOFF-TEST(s, d)
    max_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1) if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1) if PLAYER(s) = MIN
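H-MINIMAX can be sketched by adding a depth argument and a cutoff to plain minimax. The nested-list tree encoding and the trivial stand-in `eval_fn` below are illustrative assumptions, not from the slides:

```python
# Depth-limited minimax: exact utilities at true terminals (ints),
# heuristic estimates at the cutoff frontier.

def h_minimax(node, d, depth_limit, maximizing=True, eval_fn=lambda n: 0):
    if isinstance(node, int):            # true terminal: exact utility
        return node
    if d >= depth_limit:                 # CUTOFF-TEST(s, d): estimate instead
        return eval_fn(node)
    values = [h_minimax(c, d + 1, depth_limit, not maximizing, eval_fn)
              for c in node]
    return max(values) if maximizing else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

With `depth_limit=2` the whole 2-ply example tree is searched and the exact minimax value 3 comes back; with `depth_limit=0` the root is immediately cut off and the (here trivial) evaluation estimate is returned instead.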

SLIDE 29

Deterministic games in practice

Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions.
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997.
Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

SLIDE 30

Nondeterministic (Stochastic) games

Example: in backgammon, the dice rolls determine the legal moves.
Simplified example with coin-flipping instead of dice-rolling:

[Figure: simplified game tree with CHANCE nodes between the MAX and MIN levels; each coin flip has probability 0.5, and expected values (e.g. 3 and −1) are backed up through the chance nodes.]

SLIDE 31

Algorithm for nondeterministic games

Expectiminimax gives perfect play.
Just like Minimax, except we must also handle chance nodes:
. . . if state is a chance node then return the expected ExpectiMinimax value of the successors of state . . .
(Recall:) The expected value is the sum of the value over all outcomes, weighted by the probability of each chance action

SLIDE 32

Algorithm for nondeterministic games

Just like Minimax, except we must also handle chance nodes:

EXPECTIMINIMAX(s) =
    UTILITY(s)                                          if TERMINAL-TEST(s)
    max_{a ∈ ACTIONS(s)} EXPECTIMINIMAX(RESULT(s, a))   if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} EXPECTIMINIMAX(RESULT(s, a))   if PLAYER(s) = MIN
    Σ_r P(r) EXPECTIMINIMAX(RESULT(s, r))               if PLAYER(s) = CHANCE

where r is a chance event, e.g. a possible dice roll.

Eval for a position in games of chance: use Monte Carlo simulation to evaluate the position: the algorithm plays thousands of games (against itself) with random dice rolls. The resulting win percentage is a good approximation of the value of the position. This type of random simulation is often known as a rollout. Some of these ideas, e.g. rollouts, are central to Monte Carlo Tree Search.
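A sketch of EXPECTIMINIMAX on small hand-written trees. The node encoding and the coin-flip example below are our own, merely in the spirit of the slide's figure, not a transcription of it:

```python
# Node encoding (our own): an int is a terminal utility; otherwise a node
# is ("max", [children]), ("min", [children]) or
# ("chance", [(prob, child), ...]).

def expectiminimax(node):
    if isinstance(node, int):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        # Sum over chance events r, weighted by P(r).
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")

# A coin-flip chance node: heads and tails each with probability 0.5,
# leading to MIN nodes.
chance_node = ("chance", [(0.5, ("min", [2, 4])),
                          (0.5, ("min", [6, 5]))])
```

Here the chance node's value is 0.5 · min(2, 4) + 0.5 · min(6, 5) = 0.5 · 2 + 0.5 · 5 = 3.5: exactly the probability-weighted average from the equation above.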

SLIDE 33

Summary I

• A game can be defined by the initial state, the legal actions in each state, the result of each action, a terminal test and a utility function that applies to terminal states.
• In two-player zero-sum games with perfect information, the minimax algorithm can select optimal moves by a depth-first enumeration of the game tree.
• The alpha–beta search algorithm computes the same optimal move as minimax, but achieves much greater efficiency by eliminating subtrees that are provably irrelevant.
• Usually, it is not feasible to consider the whole game tree (even with alpha–beta), so we need to cut the search off at some point and apply a heuristic evaluation function that estimates the utility of a state.
• Games of chance can be handled by an extension to the minimax algorithm that evaluates a chance node by taking the average utility of all its children, weighted by the probability of each child.

SLIDE 34

Summary II

Games are fun to work on! They illustrate several important points about AI:
• perfection is unattainable ⇒ must approximate
• good idea to think about what to think about
• uncertainty constrains the assignment of values to states
“Games are to AI as grand prix racing is to automobile design”
