4 Game Trees Game tree 4 Game Trees Game tree perfect information - - PDF document



SLIDE 1


§4 Game Trees

  • perfect information games
      • no hidden information
  • two-player, perfect information games
      • Noughts and Crosses
      • Chess
      • Go
  • imperfect information games
      • Poker
      • Backgammon
      • Monopoly
  • zero-sum property
      • one player’s gain equals another player’s loss

Game tree

  • all possible plays of two-player, perfect information games can be represented with a game tree
      • nodes: positions (or states)
      • edges: moves
  • players: MAX (has the first move) and MIN
  • ply = the length of the path between two nodes
      • MAX has even plies counting from the root node
      • MIN has odd plies counting from the root node
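The node, edge and ply vocabulary can be sketched in code. This is a minimal sketch, not taken from the slides: the `Node` class and the helpers `player_at` and `plies` are hypothetical names, and the root is assumed to sit at ply 0 with MAX to move.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A position (state) in the game tree; hypothetical illustration type."""
    state: str
    # Each entry is reached by one edge, i.e. one move from this position.
    children: list["Node"] = field(default_factory=list)


def player_at(ply: int) -> str:
    """Return who moves at a node `ply` edges below the root (root = ply 0)."""
    return "MAX" if ply % 2 == 0 else "MIN"


def plies(node: Node, ply: int = 0):
    """Yield (ply, player to move, state) for every node, depth first."""
    yield ply, player_at(ply), node.state
    for child in node.children:
        yield from plies(child, ply + 1)


# Made-up tree: the root has two children, one of which has a child of its own.
root = Node("start", [Node("a", [Node("a1")]), Node("b")])
for ply, player, state in plies(root):
    print(ply, player, state)
```

The parity test in `player_at` is the whole rule: even plies belong to MAX, odd plies to MIN.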

Division Nim with seven matches

Problem statement

Given a node v in a game tree, find a winning strategy for MAX (or MIN) from v

  • or (equivalently) show that MAX (or MIN) can force a win from v
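For Division Nim the question "can the player to move force a win from v?" can be answered by exhaustive recursion. The sketch below assumes the usual rules of the game, which the slides do not spell out: a move splits one pile into two non-empty piles of unequal size, and the player who cannot move loses. The names `moves` and `mover_can_force_win` are hypothetical.

```python
from functools import lru_cache


def moves(piles):
    """Yield every position reachable by splitting one pile into two unequal piles."""
    for i, pile in enumerate(piles):
        rest = piles[:i] + piles[i + 1:]
        # a < pile - a guarantees two non-empty parts of different size.
        for a in range(1, (pile + 1) // 2):
            yield tuple(sorted(rest + (a, pile - a)))


@lru_cache(maxsize=None)
def mover_can_force_win(piles):
    """True if the player to move at `piles` (a sorted tuple) can force a win."""
    # A position is won if some move leads to a position lost for the opponent.
    # With no legal move, any() is False: the player to move has lost.
    return any(not mover_can_force_win(nxt) for nxt in moves(piles))


# MAX moves first from a single pile of seven matches.
print(mover_can_force_win((7,)))
```

Under these assumed rules the call prints `False`: every first move by MAX leaves a position from which MIN can force a win.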

Minimax

  • assumption: players are rational and try to win
  • given a game tree, we know the outcome in the leaves
      • assign the leaves to win, draw, or loss (or a numeric value like +1, 0, –1) according to MAX’s point of view
  • at nodes one ply above the leaves, we choose the best outcome among the children (which are leaves)
      • MAX: win if possible; otherwise, draw if possible; else loss
      • MIN: loss if possible; otherwise, draw if possible; else win
  • recurse through the nodes until the root is reached

SLIDE 2


Minimax rules

1. If the node is labelled MAX, assign it the maximum value of its children.
2. If the node is labelled MIN, assign it the minimum value of its children.

  • MIN minimizes, MAX maximizes → minimax

[Figure: game tree with alternating MAX and MIN levels; node values are +1 or –1]
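The two minimax rules can be sketched as a short recursion. This is a minimal sketch under assumptions that are not in the slides: the tree is a nested Python list, leaves are numbers from MAX's point of view (+1 win, 0 draw, -1 loss), and the example tree is made up.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of `node` from MAX's point of view.

    A leaf is a number; an internal node is a list of child nodes.
    """
    if not isinstance(node, list):
        return node  # leaf: the outcome is known
    values = [minimax(child, not maximizing) for child in node]
    # Rule 1 for MAX nodes, rule 2 for MIN nodes.
    return max(values) if maximizing else min(values)


# Hypothetical tree: MAX at the root, three MIN nodes below, leaves below those.
tree = [[+1, -1], [0, +1], [-1, -1]]
print(minimax(tree))  # MIN nodes evaluate to -1, 0, -1, so the root is 0
```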

Analysis

  • simplifying assumptions
      • internal nodes have the same branching factor b
      • game tree is searched to a fixed depth d
  • time consumption is proportional to the number of expanded nodes
      • 1: root node (the initial ply)
      • b: nodes in the first ply
      • b^2: nodes in the second ply
      • b^d: nodes in the dth ply
  • overall running time O(b^d)
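The bound follows from summing the per-ply node counts. For b ≥ 2 the deepest ply dominates the geometric series:

```latex
\[
  \sum_{i=0}^{d} b^{i}
  \;=\; \frac{b^{d+1}-1}{b-1}
  \;\le\; \frac{b}{b-1}\, b^{d}
  \;\le\; 2\, b^{d}
  \;=\; O(b^{d}) \qquad (b \ge 2)
\]
```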

Rough estimates on running times when d = 5

  • suppose expanding a node takes 1 ms
  • branching factor b depends on the game
      • Draughts (b ≈ 3): t = 0.243 s
      • Chess (b ≈ 30): t = 6¾ h
      • Go (b ≈ 300): t ≈ 77 years
  • alpha-beta pruning reduces b
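The three estimates can be reproduced with a few lines of arithmetic. The sketch below counts only the b^d nodes of the deepest ply, which is what the listed figures correspond to; the function name `search_time_seconds` is hypothetical.

```python
MS_PER_NODE = 1  # cost of expanding one node, as supposed above
DEPTH = 5


def search_time_seconds(b, d=DEPTH, ms_per_node=MS_PER_NODE):
    """Estimated time to expand the b**d nodes of the deepest ply."""
    return b ** d * ms_per_node / 1000


for game, b in [("Draughts", 3), ("Chess", 30), ("Go", 300)]:
    t = search_time_seconds(b)
    years = t / (3600 * 24 * 365)
    print(f"{game}: {t:.3f} s = {t / 3600:.2f} h = {years:.1f} years")
```

Each tenfold increase in b multiplies the time by 10^5, which is why Draughts takes a fraction of a second and Go takes decades at the same depth.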

Controlling the search depth

  • usually the whole game tree is too large → limit the search depth → a partial game tree → partial minimax
  • n-move look-ahead strategy
      • stop searching after n moves
      • make the internal nodes (i.e., frontier nodes) leaves
      • use an evaluation function to ‘guess’ the outcome
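An n-move look-ahead can be sketched by adding a depth counter to the minimax recursion. This is a sketch under stated assumptions: `children` and `evaluate` are hypothetical caller-supplied callbacks, and the toy game (integers whose successors are 2x and 2x+1, scored by an arbitrary formula) exists only to make the example runnable.

```python
def partial_minimax(state, n, children, evaluate, maximizing=True):
    """Minimax over a partial game tree of depth n.

    `children(state)` returns the successor states; `evaluate(state)`
    returns a heuristic value from MAX's point of view.
    """
    successors = children(state)
    if n == 0 or not successors:
        # Frontier node (or true leaf): guess the outcome instead of searching on.
        return evaluate(state)
    values = [partial_minimax(s, n - 1, children, evaluate, not maximizing)
              for s in successors]
    return max(values) if maximizing else min(values)


def toy_children(x):
    """Made-up game: x leads to 2x and 2x + 1 until x reaches 8."""
    return [] if x >= 8 else [2 * x, 2 * x + 1]


def toy_evaluate(x):
    """Arbitrary heuristic with values in {-1, 0, +1}."""
    return x % 3 - 1


for n in range(4):
    print(n, partial_minimax(1, n, toy_children, toy_evaluate))
```

In this toy game the root value changes with n (0, 1, 0, 1 for n = 0 to 3), so the chosen look-ahead depth affects the result.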

Evaluation function

  • combination of numerical measurements m_i(s, p) of the game state
      • single measurement: m_i(s, p)
      • difference measurement: m_i(s, p) − m_j(s, q)
      • ratio of measurements: m_i(s, p) / m_j(s, q)
  • aggregate the measurements maintaining the zero-sum property
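One way to aggregate measurements while keeping the zero-sum property is a weighted sum of difference measurements. The sketch below is an assumption-laden illustration, not the slides' method: the measurements `material` and `mobility`, the weights, and the state layout are all made up.

```python
def material(state, player):
    """Hypothetical measurement m_1(s, p): number of pieces of `player`."""
    return state[player]["material"]


def mobility(state, player):
    """Hypothetical measurement m_2(s, p): number of legal moves of `player`."""
    return state[player]["mobility"]


def evaluate(state, player, opponent,
             measurements=(material, mobility), weights=(10, 1)):
    """Weighted sum of the differences m_i(s, player) - m_i(s, opponent)."""
    return sum(w * (m(state, player) - m(state, opponent))
               for m, w in zip(measurements, weights))


state = {"MAX": {"material": 5, "mobility": 12},
         "MIN": {"material": 4, "mobility": 15}}
print(evaluate(state, "MAX", "MIN"))  # 10 * (5 - 4) + 1 * (12 - 15) = 7
print(evaluate(state, "MIN", "MAX"))  # -7
```

Because every term is a difference between the two players, swapping them flips the sign, so one player's gain is exactly the other's loss.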

SLIDE 3


Example: Noughts and Crosses

  • heuristic evaluation function e:
      • count the winning lines open to MAX
      • subtract the number of winning lines open to MIN
  • forced wins
      • state is evaluated +∞, if it is a forced win for MAX
      • state is evaluated –∞, if it is a forced win for MIN

Examples of the evaluation

e(•) = 6 – 5 = 1
e(•) = 4 – 5 = –1
e(•) = +∞
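The open-lines heuristic can be sketched as follows. Several things here are assumptions: a line counts as open to a player if it contains none of the opponent's marks, MAX plays "X", the board is a 9-character string read row by row with "." for empty cells, and only the simplest forced win (an already completed line) is detected. The example positions are hypothetical ones chosen to give 6 – 5 and 4 – 5.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def open_lines(board, opponent):
    """Count winning lines that contain no mark of `opponent`."""
    return sum(all(board[i] != opponent for i in line) for line in LINES)


def e(board, max_mark="X", min_mark="O"):
    """Lines open to MAX minus lines open to MIN; infinite for a completed line."""
    for line in LINES:
        marks = {board[i] for i in line}
        if marks == {max_mark}:
            return math.inf
        if marks == {min_mark}:
            return -math.inf
    return open_lines(board, min_mark) - open_lines(board, max_mark)


print(e("XO......."))  # X in a corner, O beside it: 6 - 5 = 1
print(e("X...O...."))  # X in a corner, O in the centre: 4 - 5 = -1
print(e("XXX.OO..."))  # completed top row for X: inf
```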

Drawbacks of partial minimax

  • horizon effect
      • heuristically promising path can lead to an unfavourable situation
      • staged search: extend the search on promising nodes
      • iterative deepening: increase n until out of memory or time
      • phase-related search: opening, midgame, end game
      • however, horizon effect cannot be totally eliminated
  • bias
      • we want to have an estimate of minimax but get a minimax of estimates
      • distortion in the root: odd plies → win, even plies → loss
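Iterative deepening can be sketched as a loop around a depth-limited minimax. This is a simplified sketch under assumptions: the clock is checked only between iterations (a real implementation would also abort inside the search), `max_depth` is an extra safety cap, and the toy game and the names `depth_limited` and `iterative_deepening` are hypothetical.

```python
import time


def depth_limited(state, n, children, evaluate, maximizing=True):
    """Minimax to depth n, using `evaluate` at frontier nodes and leaves."""
    successors = children(state)
    if n == 0 or not successors:
        return evaluate(state)
    values = [depth_limited(s, n - 1, children, evaluate, not maximizing)
              for s in successors]
    return max(values) if maximizing else min(values)


def iterative_deepening(state, children, evaluate, time_budget_s, max_depth):
    """Search with n = 1, 2, ... and return the last completed root value."""
    deadline = time.monotonic() + time_budget_s
    best = evaluate(state)  # n = 0 fallback if there is no time at all
    for n in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break  # out of time: keep the result of the previous depth
        best = depth_limited(state, n, children, evaluate)
    return best


def toy_children(x):
    """Made-up game: x leads to 2x and 2x + 1 until x reaches 8."""
    return [] if x >= 8 else [2 * x, 2 * x + 1]


def toy_evaluate(x):
    """Arbitrary heuristic with values in {-1, 0, +1}."""
    return x % 3 - 1


print(iterative_deepening(1, toy_children, toy_evaluate, 60.0, 3))
```

The returned value always comes from the deepest search that finished, so the answer improves as long as time remains.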

The deeper the better...?

  • assumptions:
      • n-move look-ahead
      • branching factor b, depth d,
      • leaves with uniform random distribution
  • minimax convergence theorem:
      • n increases → root value converges to f(b, d)
  • last player theorem:
      • root values from odd and even plies not comparable
  • minimax pathology theorem:
      • n increases → probability of selecting non-optimal move increases (← uniformity assumption!)