Wentworth Institute of Technology COMP3770 – Artificial Intelligence | Summer 2017 | Derbinsky
Adversarial Search
Lecture 7
How can we use search to plan ahead when other agents are planning against us?
June 10, 2017 Adversarial Search 1
                         Deterministic                   Stochastic
Perfect information      Chess, Checkers, Go, Othello    Backgammon, Monopoly
Imperfect information    Battleship                      Bridge, Poker, Scrabble
DeepBlue
– 2015: beat Fan Hui, European champion (2-dan; 5-0)
– 2016: beat Lee Sedol (9-dan; 4-1)
– 2017: beat Ke Jie, #1 in the world (9-dan; 3-0)
AlphaGo
– 120,000 hands played
Libratus
– Tic-tac-toe, chess
– One player maximizes
– The other minimizes
– A search tree
– Players alternate turns
– Compute each node’s minimax value: the best achievable utility against a rational (optimal) adversary
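[Figure: two-ply game tree with a max root over two min nodes. Terminal values (8, 2, 5, 6) are part of the game; minimax values (2 and 5 at the min nodes, 5 at the root) are computed recursively.]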
import math

# `state` is assumed to expose is_terminal(), utility(), next_agent(),
# and successors(); these names are placeholders for the game at hand.

def min_value(state):
    v = math.inf
    for successor in state.successors():
        v = min(v, value(successor))
    return v

def max_value(state):
    v = -math.inf
    for successor in state.successors():
        v = max(v, value(successor))
    return v

def value(state):
    if state.is_terminal():
        return state.utility()
    if state.next_agent() == "MAX":
        return max_value(state)
    return min_value(state)  # the next agent is MIN
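As a rough illustration of the recursion, the sketch below evaluates a two-ply tree with terminal values 8, 2, 5, 6. Representing the tree as nested Python lists, and the function name `minimax`, are assumptions made for this example rather than part of the lecture.

```python
def minimax(node, is_max=True):
    """Return the minimax value of a tree given as nested lists of numbers."""
    if not isinstance(node, list):
        return node  # a bare number is a terminal value
    child_values = [minimax(child, not is_max) for child in node]
    return max(child_values) if is_max else min(child_values)

# MAX root over two MIN nodes with terminal values (8, 2) and (5, 6).
tree = [[8, 2], [5, 6]]
root_value = minimax(tree)
print(root_value)  # 5: the MIN nodes evaluate to 2 and 5, and MAX picks 5
```

Players alternate by flipping `is_max` at each level, which is enough for a strictly alternating two-player game.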
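Time: O(b^m), for branching factor b and maximum depth m
– For chess: b ≈ 35, m ≈ 100
Space: O(bm)
Complete: yes, if the game tree is finite
Optimal: yes, against an optimal opponent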
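To get a feel for the chess figures, the short calculation below counts the digits in 35^100, assuming the search cost grows as (branching factor)^(depth). The variable names are chosen for this sketch only.

```python
b, m = 35, 100   # approximate branching factor and game length for chess
nodes = b ** m   # Python integers have arbitrary precision
digits = len(str(nodes))
print(digits)    # 155, i.e. on the order of 10^154 nodes
```

No realistic machine can enumerate a tree of that size, which motivates pruning and depth limits.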
Minimax-Min Minimax-Avg
Multiple MIN layers (one per ghost) for each Pacman move
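One way to picture the layering is a tree whose levels cycle through the agents: agent 0 (Pacman) maximizes, and every other agent (a ghost) minimizes. The nested-list tree, the function name, and the leaf values below are hypothetical and only meant as a sketch.

```python
def multi_agent_value(node, agent=0, num_agents=3):
    """Minimax where agent 0 is MAX and agents 1..num_agents-1 are MIN."""
    if not isinstance(node, list):
        return node  # terminal value
    next_agent = (agent + 1) % num_agents
    child_values = [multi_agent_value(c, next_agent, num_agents) for c in node]
    return max(child_values) if agent == 0 else min(child_values)

# Pacman (MAX) moves, then ghost 1 (MIN), then ghost 2 (MIN).
tree = [
    [[3, 7], [5, 9]],  # Pacman action A
    [[4, 6], [8, 2]],  # Pacman action B
]
best = multi_agent_value(tree)
print(best)  # 3: action A is worth min(3, 5) = 3, action B is worth min(4, 2) = 2
```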
Depth-Limiting + Evaluation
[Figure: example game tree; node values as extracted from the slide: 12 8 5 2 3 2 14 4 6 3 2 2 3]
[Figure: alpha-beta pruning trace on the example tree. Each node is annotated with a value interval that starts at [−∞, ∞] and narrows as leaves are examined; the root ends at [3, 3].]
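import math

# alpha: MAX's best option on the path to the root
# beta: MIN's best option on the path to the root
# `state` is assumed to expose is_terminal(), utility(), next_agent(),
# and successors(); these names are placeholders.

def min_value(state, alpha, beta):
    v = math.inf
    for successor in state.successors():
        v = min(v, value(successor, alpha, beta))
        if v <= alpha:
            return v  # MAX already has a better option elsewhere
        beta = min(beta, v)
    return v

def max_value(state, alpha, beta):
    v = -math.inf
    for successor in state.successors():
        v = max(v, value(successor, alpha, beta))
        if v >= beta:
            return v  # MIN already has a better option elsewhere
        alpha = max(alpha, v)
    return v

def value(state, alpha=-math.inf, beta=math.inf):
    if state.is_terminal():
        return state.utility()
    if state.next_agent() == "MAX":
        return max_value(state, alpha, beta)
    return min_value(state, alpha, beta)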
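The pruning rule can be tried on a small tree. The sketch below assumes a nested-list tree with the leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2 grouped under three MIN nodes (the grouping is an assumption), and records which leaves are actually examined; the `visited` list is an addition for illustration only.

```python
import math

def alphabeta(node, is_max=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Alpha-beta value of a nested-list tree; logs examined leaves in `visited`."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, visited))
            if v >= beta:
                return v  # prune: MIN above will never let play reach here
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta, visited))
        if v <= alpha:
            return v  # prune: MAX above already has something at least as good
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
result = alphabeta(tree, visited=visited)
print(result)   # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]: leaves 4 and 6 are never examined
```

The second MIN node is abandoned after its first leaf because 2 is already no better for MAX than the 3 found earlier.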
– Time complexity drops to O(b^(m/2)) (with perfect move ordering)
– Doubles solvable depth!
– Full search of, e.g., chess is still hopeless…
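The "doubles solvable depth" claim follows from halving the exponent. As a back-of-the-envelope sketch, assume a hypothetical budget of 10^9 nodes and a branching factor of 35, and solve for the depth that fits the budget under each cost model.

```python
import math

def reachable_depth(node_budget, b, exponent_scale=1.0):
    """Solve b ** (exponent_scale * m) == node_budget for the depth m."""
    return math.log(node_budget) / (exponent_scale * math.log(b))

budget = 10 ** 9  # hypothetical number of nodes we can afford to expand
plain = reachable_depth(budget, 35)                       # cost ~ b^m
pruned = reachable_depth(budget, 35, exponent_scale=0.5)  # cost ~ b^(m/2)
print(round(plain, 1), round(pruned, 1))  # about 5.8 and 11.7 plies
```

Even the doubled depth is far short of a full chess game, so exhaustive search remains out of reach.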
[Figure: game tree with values 10, 8, 50, 4]
[Figure: game tree with values 10, 6, 100, 8, 1, 2, 20, 4]
[Figure: game tree; node values as extracted from the slide: 5 6 4 3 6 7 6 7 5 6 9 5 9 8]
1. Search only to a limited depth in the tree
2. Replace terminal utilities with an evaluation function for non-terminal positions
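The two steps above can be sketched on a toy game that is not from the lecture: players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins (+1 for MAX, -1 for MIN). The game, the function names, and the `mod3_eval` heuristic are all assumptions chosen to keep the example small.

```python
def depth_limited_value(stones, max_to_move, depth, evaluate):
    """Minimax cut off at `depth`, scoring cut-off positions with `evaluate`."""
    if stones == 0:
        # The previous player took the last stone and therefore won.
        return -1 if max_to_move else 1
    if depth == 0:
        return evaluate(stones, max_to_move)  # estimate for a non-terminal position
    children = [
        depth_limited_value(stones - take, not max_to_move, depth - 1, evaluate)
        for take in (1, 2) if take <= stones
    ]
    return max(children) if max_to_move else min(children)

def mod3_eval(stones, max_to_move):
    """Heuristic: the player to move is losing when the pile is a multiple of 3."""
    mover_loses = stones % 3 == 0
    if max_to_move:
        return -1 if mover_loses else 1
    return 1 if mover_loses else -1

informed = depth_limited_value(10, True, 2, mod3_eval)
uninformed = depth_limited_value(10, True, 2, lambda stones, max_to_move: 0)
full = depth_limited_value(10, True, 100, lambda stones, max_to_move: 0)
print(informed, uninformed, full)  # 1 0 1
```

With a good evaluation function the depth-2 search already agrees with the full search; with an uninformative one it cannot tell the positions apart.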
Depth2 Depth10
e.g. f₁(s) = (num white queens – num black queens)
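A feature like this is cheap to compute. The sketch below assumes, purely for illustration, that a position is given as a string of piece letters with uppercase for white and lowercase for black; the function name and the encoding are not from the lecture.

```python
def queen_difference(pieces: str) -> int:
    """Number of white queens minus number of black queens."""
    return pieces.count("Q") - pieces.count("q")

# White still has its queen; black has lost its queen.
position = "RNBQKBNR" + "rnbkbnr"
print(queen_difference(position))  # 1
```

A positive value favors white, a negative value favors black, and zero means the queens are balanced.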
– Pacman knows his score will go up by eating a dot now
– He knows his score will go up just as much by eating a dot later
– There are no point-scoring opportunities after eating a dot (within the horizon, two here)
– Therefore, waiting seems just as good as eating: he may go east, then back west in the next round of replanning!
Thrashing Thrashing-Fixed SmartGhosts-1 SmartGhosts-2
[Figure: max root over two min nodes; leaf values 10, 10, 9, 100]
Why might we not know what the result of an action will be?
– Explicit randomness: rolling dice
– Unpredictable opponents: the ghosts respond randomly
– Actions can fail: when moving a robot, wheels might slip
Values should then reflect average-case (expectimax) outcomes rather than worst-case (minimax) outcomes
Expectimax search computes the average score under optimal play
– Max nodes as in minimax search
– Chance nodes are like min nodes but the outcome is uncertain
– Calculate their expected utilities
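The difference between worst-case and average-case values shows up clearly on a tree with leaf values 10, 10, 9, 100. The sketch below assumes a nested-list tree and a uniform distribution at chance nodes; the `opponent` switch is a convenience invented for this example.

```python
def tree_value(node, is_max=True, opponent="min"):
    """Value of a nested-list tree against a MIN or a uniform-chance opponent."""
    if not isinstance(node, list):
        return node
    values = [tree_value(child, not is_max, opponent) for child in node]
    if is_max:
        return max(values)
    if opponent == "min":
        return min(values)            # worst case (minimax)
    return sum(values) / len(values)  # average case (uniform chance node)

tree = [[10, 10], [9, 100]]
worst_case = tree_value(tree, opponent="min")
average_case = tree_value(tree, opponent="chance")
print(worst_case, average_case)  # 10 54.5
```

Minimax prefers the left branch (a guaranteed 10), while expectimax prefers the right branch because its average of 9 and 100 is much higher.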
[Figure: max root over chance nodes; values 10, 4, 5, 7 and 10, 10, 9, 100]
– Random variable: an event whose outcome is unknown
– Outcomes: the possible values the random variable can take
– Distribution: an assignment of weights to outcomes
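These terms map directly onto a small data structure. In the sketch below the random variable is a travel time, its outcomes are dictionary keys, and the distribution is the weight attached to each key; the specific times and weights are hypothetical.

```python
# Outcomes (minutes) mapped to their probabilities; the numbers are made up.
travel_time = {20: 0.25, 30: 0.5, 60: 0.25}

# A distribution's weights must sum to 1.
total_weight = sum(travel_time.values())

# The expectation is the probability-weighted average of the outcomes.
expected = sum(outcome * p for outcome, p in travel_time.items())
print(total_weight, expected)  # 1.0 35.0
```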
# `state` is assumed to expose is_terminal(), utility(), next_agent(),
# successors(), and probability(successor); these names are placeholders.

import math

def exp_value(state):
    v = 0
    for successor in state.successors():
        p = state.probability(successor)
        v += p * value(successor)
    return v

def max_value(state):
    v = -math.inf
    for successor in state.successors():
        v = max(v, value(successor))
    return v

def value(state):
    if state.is_terminal():
        return state.utility()
    if state.next_agent() == "MAX":
        return max_value(state)
    return exp_value(state)  # the next agent is EXP (a chance node)
[Figure: chance node whose three branches have probabilities 1/2, 1/3, 1/6]
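A chance node with branch probabilities 1/2, 1/3, 1/6 can be worked through exactly with rational arithmetic. The child values 8, 24 and -12 below are hypothetical, picked only so that the arithmetic is easy to follow.

```python
from fractions import Fraction

def exp_value(children):
    """Expected value of a chance node given (probability, value) pairs."""
    return sum(p * v for p, v in children)

children = [
    (Fraction(1, 2), 8),    # contributes 4
    (Fraction(1, 3), 24),   # contributes 8
    (Fraction(1, 6), -12),  # contributes -2
]
expected = exp_value(children)
print(expected)  # 10
```

Using `Fraction` avoids floating-point rounding, so the result is exactly 4 + 8 - 2 = 10.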
In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state
– Model could be a simple uniform distribution (roll a die)
– Model could be sophisticated and require a great deal of computation
– We have a chance node for any outcome outside our control (opponent or environment)
– The model might say that adversarial actions are likely!
For now, assume each chance node magically comes along with probabilities that specify the distribution over its outcomes
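One hypothetical opponent model, not taken from the lecture, mixes the two extremes: with probability `p_adversarial` the opponent picks the outcome that is worst for us, and otherwise it picks uniformly at random. The sketch below computes the resulting chance-node value.

```python
def modeled_opponent_value(child_values, p_adversarial):
    """Chance-node value under a mixed adversarial/uniform opponent model."""
    worst = min(child_values)                        # what a pure minimizer picks
    average = sum(child_values) / len(child_values)  # what a uniform opponent yields
    return p_adversarial * worst + (1 - p_adversarial) * average

children = [9, 100]
always_adversarial = modeled_opponent_value(children, 1.0)  # same as a MIN node
never_adversarial = modeled_opponent_value(children, 0.0)   # uniform chance node
mostly_adversarial = modeled_opponent_value(children, 0.8)  # roughly 18.1
print(always_adversarial, never_adversarial)  # 9.0 54.5
```

Setting `p_adversarial` to 1 recovers minimax at that node, so a model that says adversarial actions are likely pulls the expectimax value toward the minimax value.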