AAAI'08 Tutorial
General Game Playing
Michael Thielscher, Dresden

Some of the material presented in this tutorial originates in work by Michael Genesereth and the Stanford Logic Group. We greatly appreciate their contribution.
In the early days, game-playing machines were considered a key to Artificial Intelligence (AI). But chess computers are highly specialized systems; Deep Blue's intelligence was limited. It couldn't even play a decent game of Tic-Tac-Toe or Rock-Paper-Scissors. With General Game Playing, many of the original expectations for game-playing machines are revived.

A General Game Player is a system that
- understands formal descriptions of arbitrary strategy games
- learns to play these games well without human intervention

A General Game Player needs to exhibit much broader intelligence: abstract thinking, strategic planning, learning. Traditional research on game playing focuses on constructing specific evaluation functions and building libraries for specific games. The intelligence lies with the programmer, not with the program!
Rather than being concerned with a specialized solution to a narrow problem, General Game Playing encompasses a variety of AI areas: Game Playing Knowledge Representation Planning and Search Learning General Game Playing is considered a grand AI Challenge
Games                                       Agents
Deterministic, complete information         Competitive environments
Nondeterministic, partially observable      Uncertain environments
Rules partially unknown                     Unknown environment model
Robotic player                              Real-world environments
Commercially available chess computers can't be used for a game of Bughouse Chess. An adaptable game computer would allow the user to modify the rules for arbitrary variants of a game.
A General Game Playing system could be used for negotiations, marketing strategies, pricing, etc. It can easily be adapted to changes in the business processes and rules, new competitors, etc. The rules of an electronic marketplace can be published as a game description, so that agents can automatically learn how to participate.
(deterministic games with complete information only)
Game description language
Variety of games / actual matches
Basic player available for download
Annual world cup @ AAAI (since 2005), prize money US$ 10,000
games.stanford.edu
The Game Description Language GDL: Knowledge Representation How to make legal moves: Automated Reasoning How to solve simple games: Planning & Search How to play well: Learning
Every finite game can be modeled as a state transition system. But a direct encoding is impossible in practice: Chess has ~10^43 legal positions (even Tic-Tac-Toe has 3^9 = 19,683 states).
Fluents:
  cell(X,Y,C)     X ∈ {a,...,h}, Y ∈ {1,...,8}, C ∈ {whiteKing,...,blank}
  control(P)      P ∈ {white,black}
  canCastle(P,S)  P ∈ {white,black}, S ∈ {kingsSide,queensSide}
  enPassant(C)    C ∈ {a,...,h}

Moves:
  move(U,V,X,Y)   U,X ∈ {a,...,h}, V,Y ∈ {1,...,8}
  promote(X,Y,P)  X,Y ∈ {a,...,h}, P ∈ {whiteQueen,...}

Players, initial position, legal moves:
  roles([white,black])
  init(cell(a,1,whiteRook)) ∧ ...
  legal(white,promote(X,Y,P)) <= true(cell(X,7,whitePawn)) ∧ ...
Position updates:
  next(cell(X,Y,C)) <= does(P,move(U,V,X,Y)) ∧ true(cell(U,V,C))

End of game:
  terminal <= checkmate ∨ stalemate

Result:
  goal(white,100) <= true(control(black)) ∧ checkmate
  goal(white, 50) <= stalemate
Variables: X, Y, Z
Constants: a, b, c
Functions: f, g, h
Predicates: p, q, r, =
Logical operators: ¬, ∧, ∨, <=
Terms: X, Y, Z, a, b, c, f(a), g(a,X), h(a,b,f(Y))
Atoms: p(a,b)
Literals: p(a,b), ¬q(X,f(a))
Clauses: Head <= Body
  Head: relational sentence
  Body: logical sentence built from ∧, ∨, literals
GDL keyword relations:
  roles(list-of(player)), init(fluent), true(fluent), does(player,move),
  next(fluent), legal(player,move), goal(player,value), terminal

Tic-Tac-Toe vocabulary:
  Fluents: cell(X,Y,M)  X,Y ∈ {1,2,3}, M ∈ {x,o,b};  control(P)  P ∈ {xplayer,oplayer}
  Moves: mark(X,Y)  X,Y ∈ {1,2,3};  noop
  Constants: xplayer, oplayer (players); x, o, b (marks)
  Functions: cell(number,number,mark) fluent; control(player) fluent; mark(number,number) action
  Predicates: row(number,mark), column(number,mark), diagonal(mark), line(mark)
roles([xplayer,oplayer]) init(cell(1,1,b)) init(cell(1,2,b)) init(cell(1,3,b)) init(cell(2,1,b)) init(cell(2,2,b)) init(cell(2,3,b)) init(cell(3,1,b)) init(cell(3,2,b)) init(cell(3,3,b)) init(control(xplayer))
legal(P,mark(X,Y)) <= true(cell(X,Y,b)) ∧ true(control(P))
legal(xplayer,noop) <= true(cell(X,Y,b)) ∧ true(control(oplayer))
legal(oplayer,noop) <= true(cell(X,Y,b)) ∧ true(control(xplayer))
next(cell(M,N,x)) <= does(xplayer,mark(M,N))
next(cell(M,N,o)) <= does(oplayer,mark(M,N))
next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
next(cell(M,N,b)) <= true(cell(M,N,b)) ∧ does(P,mark(J,K)) ∧ (¬M=J ∨ ¬N=K)
next(control(xplayer)) <= true(control(oplayer))
next(control(oplayer)) <= true(control(xplayer))
terminal <= line(x) ∨ line(o)
terminal <= ¬open
line(W) <= row(M,W)
line(W) <= column(N,W)
line(W) <= diagonal(W)
row(M,W) <= true(cell(M,1,W)) ∧ true(cell(M,2,W)) ∧ true(cell(M,3,W))
column(N,W) <= true(cell(1,N,W)) ∧ true(cell(2,N,W)) ∧ true(cell(3,N,W))
diagonal(W) <= true(cell(1,1,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,3,W))
diagonal(W) <= true(cell(1,3,W)) ∧ true(cell(2,2,W)) ∧ true(cell(3,1,W))
goal(xplayer,100) <= line(x)
goal(xplayer, 50) <= ¬line(x) ∧ ¬line(o) ∧ ¬open
goal(xplayer,  0) <= line(o)
goal(oplayer,100) <= line(o)
goal(oplayer, 50) <= ¬line(x) ∧ ¬line(o) ∧ ¬open
goal(oplayer,  0) <= line(x)
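The rules above can be mirrored in ordinary code. Below is a minimal Python sketch of the Tic-Tac-Toe game model; the state representation (a dict of cells plus the player to move) and all function names are our own illustrative choices, not part of GDL.

```python
# A minimal Python sketch of the Tic-Tac-Toe rules above.
# State: (board, control) where board maps (x, y) -> 'x', 'o', or 'b'.

LINES = [[(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)] \
      + [[(j, i) for j in (1, 2, 3)] for i in (1, 2, 3)] \
      + [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]]

def init():
    return {(x, y): 'b' for x in (1, 2, 3) for y in (1, 2, 3)}, 'x'

def legal_moves(state):
    board, control = state
    return [(x, y) for (x, y), m in board.items() if m == 'b']

def next_state(state, move):
    board, control = state
    new = dict(board)
    new[move] = control                           # effect of mark(X,Y)
    return new, ('o' if control == 'x' else 'x')  # control alternates

def line(board, w):
    return any(all(board[c] == w for c in l) for l in LINES)

def terminal(state):
    board, _ = state
    return line(board, 'x') or line(board, 'o') or 'b' not in board.values()

def goal(state, role):
    board, _ = state
    if line(board, role):
        return 100
    other = 'o' if role == 'x' else 'x'
    return 0 if line(board, other) else 50
```

The functions correspond one-to-one to the GDL keywords legal, next, terminal, and goal.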
Finite Environment:
- Game "world" with finitely many states
- One initial state and one or more terminal states
- Fixed finite number of players, each with finitely many "percepts" and "actions", each with one or more goal states

Causal Model:
- Environment changes only in response to moves
- Synchronous actions
An n-player game is a structure with components:
  S – set of states
  A1, ..., An – n sets of actions, one for each player
  l1, ..., ln – where li ⊆ Ai × S, the legality relations
  u: S × A1 × ... × An → S – update function
  s1 ∈ S – initial game state
  t ⊆ S – the terminal states
  g1, ..., gn – where gi ⊆ S × ℕ, the goal relations
role(bidder_1)
...
role(bidder_n)
init(highestBid(0))
init(round(0))
legal(P,bid(X)) <= true(highestBid(Y)) ∧ greaterthan(X,Y)
legal(P,noop)
terminal <= true(round(10))
next(winner(P)) <= does(P,bid(X)) ∧ bestbid(X)
next(highestBid(X)) <= does(P,bid(X)) ∧ bestbid(X)
next(winner(P)) <= true(winner(P)) ∧ not bid
next(highestBid(X)) <= true(highestBid(X)) ∧ not bid
next(round(N)) <= true(round(M)) ∧ successor(M,N)
bid <= does(P,bid(X))
bestbid(X) <= does(P,bid(X)) ∧ not overbid(X)
role(you)
init(step(1))
init(cell(1,onecoin))
init(cell(Y,onecoin)) <= succ(X,Y)
succ(1,2)  succ(2,3)  ...  succ(7,8)

next(step(Y)) <= true(step(X)) ∧ succ(X,Y)
next(cell(X,zerocoins)) <= does(you,jump(X,Y))
next(cell(Y,twocoins)) <= does(you,jump(X,Y))
next(cell(X,C)) <= does(you,jump(Y,Z)) ∧ true(cell(X,C)) ∧ distinct(X,Y) ∧ distinct(X,Z)

terminal <= ¬continuable
continuable <= legal(you,M)
goal(you,100) <= true(step(5))
goal(you,0) <= true(cell(X,onecoin))

legal(you,jump(X,Y)) <= true(cell(X,onecoin)) ∧ true(cell(Y,onecoin)) ∧ (twobetween(X,Y) ∨ twobetween(Y,X))

zerobetween(X,Y) <= succ(X,Y)
zerobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ zerobetween(Z,Y)
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,zerocoins)) ∧ twobetween(Z,Y)
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,onecoin)) ∧ ...
twobetween(X,Y) <= succ(X,Z) ∧ true(cell(Z,twocoins)) ∧ zerobetween(Z,Y)
Game descriptions are a good example of knowledge representation with formal logic. Automated reasoning about actions is necessary to
- determine legal moves
- update positions
- recognize the end of the game
McCarthy's Situation Calculus (1963): starting from the initial situation s0, actions generate new situations do(A1,s0), ..., do(An,s0), do(Aj,do(Ai,s0)), and so on.
Effect Axioms: (∀S)(∀M,N) cell(M,N,x,do(xplayer,mark(M,N),S))

The Frame Problem (McCarthy & Hayes, 1969) arises because mere effect axioms do not suffice to infer non-effects! How does cell(2,2,o,s) imply cell(2,2,o,do(xplayer,mark(3,3),s))?
A frame axiom for Tic-Tac-Toe:
  (∀S)(∀...) cell(M,N,W,do(P,mark(J,K),S)) <= cell(M,N,W,S) ∧ (M ≠ J ∨ N ≠ K)

Compare this to the GDL axioms:
  next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
  next(cell(M,N,b)) <= true(cell(M,N,b)) ∧ does(P,mark(J,K)) ∧ (¬M=J ∨ ¬N=K)

In a domain with m actions and n fluents, on the order of n·m frame axioms are needed.
“If AI can be said to have a classic problem, then the Frame Problem is it. Like all good open problems it is subtle, challenging, and it has led to significant new technical and conceptual developments in the field.” (Reiter, 1991)
A successor state axiom (Reiter, 1991) for every fluent F avoids extra frame axioms:

  (∀P,A,S) F(do(P,A,S)) <=> γF+ ∨ [F(S) ∧ ¬γF−]

γF+: reasons for F to become true
γF−: reasons for F to become false
(∀P,A,S)(∀...) cell(M,N,W,do(P,A,S)) <=>
     W=x ∧ P=xplayer ∧ A=mark(M,N)
  ∨ W=o ∧ P=oplayer ∧ A=mark(M,N)
  ∨ cell(M,N,W,S) ∧ ¬A=mark(M,N)

(∀P,A,S)(∀R) control(R,do(P,A,S)) <=>
     R=xplayer ∧ control(oplayer,S)
  ∨ R=oplayer ∧ control(xplayer,S)
A state update axiom (T., 1999) for every action a avoids separate update axioms for every fluent:

  (∀S) Δ1(S) → state(do(P,a,S)) = state(S) − ϑ1− + ϑ1+
    ∧ ...
    ∧ Δk(S) → state(do(P,a,S)) = state(S) − ϑk− + ϑk+

ϑ+: fluents that become true
ϑ−: fluents that become false
(where the subtraction and addition in z − ϑ− + ϑ+ are axiomatically defined)
(∀S)(∀...)
    control(xplayer,S) → state(do(xplayer,mark(M,N),S)) = state(S) − control(xplayer) + control(oplayer) + cell(M,N,x)
  ∧ control(oplayer,S) → state(do(oplayer,mark(M,N),S)) = state(S) − control(oplayer) + control(xplayer) + cell(M,N,o)
(∀S)(∀P) state(do(P,noop,S)) = state(S)
Morgan & Claypool Publishers
Michael Thielscher Synthesis Lectures on Artificial Intelligence and Machine Learning 2008
Game Description → Compiled Theory → Reasoner (Move List, Termination & Goal, State Update)
Breadth-first search
Advantage: finds the shortest solution
Disadvantage: consumes a large amount of space
Depth-first search
Advantage: small intermediate storage
Disadvantages: susceptible to garden paths; susceptible to infinite loops
Worst case for search depth d, solution at depth k:

Time             Binary branching (b = 2)    General branching b
Depth-First      2^d − 2^(d−k)               (b^d − b^(d−k)) / (b − 1)
Breadth-First    2^k − 1                     (b^k − 1) / (b − 1)

Space            Binary branching (b = 2)    General branching b
Depth-First      d                           (b − 1)(d − 1) + 1
Breadth-First    2^(k−1)                     b^(k−1)
Run depth-limited search repeatedly, starting with a small initial depth d and incrementing on each iteration (d := d + 1), until success or until we run out of alternatives.
d = 1: a
d = 2: a b c d
d = 3: a b e f c g h d i j
Advantages: small intermediate storage; finds the shortest solution; not susceptible to garden paths; not susceptible to infinite loops
Worst case for branching factor 2:

Depth    Iterative Deepening    Depth-First
1        1                      1
2        4                      3
3        11                     7
4        26                     15
5        57                     31
n        2^(n+1) − n − 2        2^n − 1

Theorem: The cost of iterative deepening search is b/(b−1) times the cost of depth-first search (where b is the branching factor).
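The procedure above can be sketched in Python. `expand` and `is_goal` are hypothetical stand-ins for a game's move generator and goal test, not part of the tutorial's player.

```python
# Iterative deepening depth-first search: repeat depth-limited DFS with
# an increasing depth bound until a solution is found.

def depth_limited(node, expand, is_goal, limit):
    """Return a solution path of length <= limit, or None."""
    if is_goal(node):
        return [node]
    if limit == 0:
        return None
    for child in expand(node):
        path = depth_limited(child, expand, is_goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iterative_deepening(root, expand, is_goal, max_depth=50):
    for d in range(max_depth + 1):        # d := d + 1 on each iteration
        path = depth_limited(root, expand, is_goal, d)
        if path is not None:
            return path                   # shortest solution found first
    return None
```

Because shallower bounds are tried first, the first solution returned is a shortest one, at the b/(b−1) overhead stated in the theorem.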
legal(P,mark(X,Y)) <= true(cell(X,Y,b)) ∧ true(control(P))
next(cell(M,N,x)) <= does(xplayer,mark(M,N))
next(cell(M,N,W)) <= true(cell(M,N,W)) ∧ ¬W=b
terminal <= line(x) ∨ line(o)
goal(xplayer,100) <= line(x)
function legals(role, node)
  findall(X, legal(role,X), node.position ∪ gamerules)

function simulate(node, moves)
  findall(true(P), next(P), node.position ∪ moves ∪ gamerules)

function terminal(node)
  prove(terminal, node.position ∪ gamerules)

function goal(role, node)
  findone(X, goal(role,X), node.position ∪ gamerules)

Game Description → Compiled Theory → Reasoner (Move List, Termination & Goal, State Update) → Search
function expand(node)
begin
  al := [];
  for a in legals(role,node) do
    data := simulate(node, {does(role,a)});
    new := create_node(data);
    al := {(a,new)} ∪ al
  end-for;
  return al
end

function bestmove(node)
begin
  max := 0;
  best := head(node.actionlist);
  for a in node.actionlist do
    score := maxscore(a.new.alist);
    if score = 100 then return a;
    if score > max then max := score; best := a end-if
  end-for;
  return best
end

function maxscore(alist)  % returns best score among the alist actions
Simple move list: [(a,s2),(b,s3)]
Multiple-player move list: [([a,a],s2),([a,b],s1),([b,a],s3),([b,b],s4)]
Bipartite move list: [(a,[([a,a],s2),([a,b],s1)]), (b,[([b,a],s3),([b,b],s4)])]
function expand(node)
begin
  al := [];
  for a in legals(role,node) do
    jl := [];
    for j in joints(role,a,node) do
      data := simulate(node, jointactions(j));
      new := create_node(data);
      jl := {(j,new)} ∪ jl
    end-for;
    al := {(a,jl)} ∪ al
  end-for;
  return al
end

function joints(role,action,node)  % returns combinatorial list of all legal joint actions where role does action
function jointactions(j)           % returns set of does atoms for joint action j

function bestmove(node)
begin
  max := 0;
  (best,jl) := head(node.alist);
  for (a,jl) in node.alist do
    score := minscore(jl);
    if score = 100 then return a;
    if score > max then max := score; best := a end-if
  end-for;
  return best
end
Note: This makes the paranoid assumption that the other players make the most harmful (for us) joint move.
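The paranoid backup can be sketched in Python. The `game` interface (legal_moves, joint_moves, next_state, terminal, goal) is a hypothetical reasoner API introduced only for this sketch.

```python
# Paranoid move selection: for each of our moves, assume the opponents
# jointly pick the reply that minimizes our score.

def minscore(game, state, my_move, role):
    """Worst score over all joint moves in which `role` plays `my_move`."""
    worst = 101
    for joint in game.joint_moves(state, role, my_move):
        nxt = game.next_state(state, joint)
        worst = min(worst, maxscore(game, nxt, role))
    return worst

def maxscore(game, state, role):
    if game.terminal(state):
        return game.goal(state, role)
    return max(minscore(game, state, m, role)
               for m in game.legal_moves(state, role))

def bestmove(game, state, role):
    best, best_score = None, -1
    for m in game.legal_moves(state, role):
        score = minscore(game, state, m, role)
        if score == 100:
            return m                      # cannot do better, stop searching
        if score > best_score:
            best, best_score = m, score
    return best
```

Note that a move with a worse paranoid value can still be better against rational opponents, which is exactly the criticism raised later in the tutorial.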
[Minimax example: with leaf scores 75 40 50 80 40 60 35 20 10, the three min nodes back up 40, 40, 10 and the max node selects 40.]

[A deeper tree with leaf scores 60 45 75 90 10 50 35 30 35 40 20 15 whose backed-up root value is 60; branches bounded by ≤ 60 show where pruning applies.]
[Alpha-beta windows during the search: the root starts with α = 0, β = 100; after the first subtree returns 60 the window tightens to α = 60, β = 100; inside min nodes it narrows to α = 0, β = 60.]
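A standard alpha-beta implementation over the [0, 100] score range used in these games can be sketched as follows; `children` and `value` are hypothetical accessors for the game tree.

```python
# Alpha-beta pruning for a two-player, zero-sum game tree with scores
# in [0, 100]: the (alpha, beta) window bounds the achievable value.

def alphabeta(node, depth, alpha, beta, maximizing, children, value):
    """Return the minimax value of `node` within the window [alpha, beta]."""
    kids = children(node)
    if depth == 0 or not kids:
        return value(node)
    if maximizing:
        for c in kids:
            alpha = max(alpha, alphabeta(c, depth - 1, alpha, beta,
                                         False, children, value))
            if alpha >= beta:
                break                     # beta cutoff: min will avoid this
        return alpha
    else:
        for c in kids:
            beta = min(beta, alphabeta(c, depth - 1, alpha, beta,
                                       True, children, value))
            if beta <= alpha:
                break                     # alpha cutoff: max will avoid this
        return beta

# Initial call: alphabeta(root, d, 0, 100, True, children, value)
```

The initial window (0, 100) corresponds to the GDL goal-value range.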
The game tree for Tic-Tac-Toe has approximately 700,000 nodes. Searching the tree requires about 140 times more work than searching the game graph. But recognizing a repeated state takes time that varies with the size of the graph seen so far. Solution: transposition tables.
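A transposition table is simply a cache keyed by a canonical encoding of the state. The single-player `solve` below and its `game` interface are illustrative assumptions, not the tutorial's player.

```python
# Transposition table: a position reached via different move orders is
# solved only once, turning tree search into graph search.

def solve(state, game, table=None):
    """Exact value of `state` in a single-player game (hypothetical API)."""
    if table is None:
        table = {}
    key = game.key(state)                 # canonical, hashable encoding
    if key in table:
        return table[key]                 # transposition: already solved
    if game.terminal(state):
        v = game.goal(state)
    else:
        v = max(solve(game.next_state(state, m), game, table)
                for m in game.legal_moves(state))
    table[key] = v
    return v
```

The cost of `game.key` matters: a cheap canonical encoding keeps the lookup time independent of the number of states seen so far.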
Symmetries can be logically derived from the rules of a game. A symmetry relation over the elements of a domain is an equivalence relation such that two symmetric states are either both terminal or both non-terminal; if they are terminal, they have the same goal value; if they are non-terminal, the legal moves in each of them are symmetric and yield symmetric states.
Connect-3
Capture Go
Hodgepodge = Chess + Othello, with branching factors a and b for the subgames:
  Branching factor as given to players: a · b
  Fringe of tree at depth n as given: (a · b)^n
  Fringe of tree at depth n, factored: a^n + b^n
Branching factor: 81, 64, 49, 36, 25, 16, 9, 4, 1 Branching factor (factored): 9, 8, 7, 6, 5, 4, 3, 2, 1 (times 2)
A set of game elements is a factor if there are no connections between the fluents and moves in the set and those outside of it.
Behavioral factoring Goal factoring
Append plans Interleave plans Parallelize plans with simultaneous actions
The “paranoid” assumption says that opponents choose the joint move that is most harmful for us. This is usually too pessimistic for other than zero-sum games and games with n > 2 players. A rational opponent chooses the move that's best for him rather than the one that's worst for us. Moreover, from a game theoretic point of view, it is incorrect to model simultaneous moves as a sequence of our move followed by the joint moves of our opponents. Example: Rock-Paper-Scissors
Game model:
  S – set of states
  A1, ..., An – n sets of actions, one for each player
  l1, ..., ln – where li ⊆ Ai × S, the legality relations
  g1, ..., gn – where gi ⊆ S × ℕ, the goal relations

A strategy xi for player i maps every state to a legal move for i:
  xi : S → Ai  such that  (xi(S),S) ∈ li

(Remark: The set of strategies is always finite in a finite game. However, there are more strategies in Chess than atoms in the universe ...)
An n-player game in normal form is an (n+1)-tuple Γ = (X1, ..., Xn, u), where Xi is the set of strategies for player i and

  u = (u1, ..., un): ∏(i=1..n) Xi → ℕ^n

are the utilities of the players for each n-tuple of strategies. (Remark: Each n-tuple of strategies directly determines the moves.)
Let Γ = (X1, ..., Xn, u) be an n-player game. (x1*, ..., xn*) is an equilibrium if for all i = 1, ..., n and all xi ∈ Xi:

  ui(x1*, ..., xi−1*, xi, xi+1*, ..., xn*) ≤ ui(x1*, ..., xn*)

An equilibrium is a tuple of optimal strategies: no player has a reason to deviate from his or her strategy, given the opponents' strategies.
A strategy x ∈ Xi dominates a strategy y ∈ Xi if

  ui(x1, ..., xi−1, x, xi+1, ..., xn) ≥ ui(x1, ..., xi−1, y, xi+1, ..., xn)

for all (x1, ..., xi−1, xi+1, ..., xn) ∈ X1 × ... × Xi−1 × Xi+1 × ... × Xn.

A strategy x ∈ Xi strongly dominates a strategy y ∈ Xi if x dominates y and y does not dominate x.

Assume that opponents are rational: they don't choose a strongly dominated strategy.
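Iterated elimination of strongly dominated strategies can be sketched in Python, using the definition above (x strongly dominates y iff x dominates y and y does not dominate x). The payoff-table representation is an assumption for this sketch.

```python
# Iterated elimination of strongly dominated strategies, two players.
# u1[r][c] and u2[r][c] are hypothetical payoff tables (row r, column c).

def dominates(u, s, t, opp_strats, is_row):
    """s is at least as good as t against every opponent strategy."""
    pick = (lambda a, b: u[a][b]) if is_row else (lambda a, b: u[b][a])
    return all(pick(s, o) >= pick(t, o) for o in opp_strats)

def strongly_dominates(u, s, t, opp, is_row):
    return dominates(u, s, t, opp, is_row) and not dominates(u, t, s, opp, is_row)

def eliminate(u1, u2, rows, cols):
    """Repeatedly remove strongly dominated rows and columns."""
    changed = True
    while changed:
        changed = False
        for t in list(rows):
            if any(strongly_dominates(u1, s, t, cols, True)
                   for s in rows if s != t):
                rows.remove(t); changed = True
        for t in list(cols):
            if any(strongly_dominates(u2, s, t, rows, False)
                   for s in cols if s != t):
                cols.remove(t); changed = True
    return rows, cols
```

Each pass shrinks one player's strategy set, which can expose new dominances for the other player, hence the outer loop.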
Consider a game where both players have strategies {a, b, c, d, e}. Let the goal values be given by
[Payoff matrices for Players 1 and 2 over strategies {a, ..., e}: iterated elimination of strongly dominated strategies removes strategies step by step, leaving the 2×2 game below.]

         b        c
  a    (7,8)    (7,6)
  c    (6,1)    (7,9)
  (rows: Player 1, columns: Player 2)
[Game tree with payoff pairs (Player 1, Player 2) at the leaves: (75,25) (40,40) (50,30) (80,40) (40,40) (60,50) (35,60) (20,60) (10,50). Backing up with each player maximizing his own component gives (40,40), (60,50), (20,60) at the inner nodes and (60,50) at the root. Can bounds such as ≤ 40 or ≤ 35 still justify pruning here?]
Let (X1, ..., Xn, u) be an n-player game; its mixed extension is Γ = (P1, ..., Pn, (e1, ..., en)), where for each i = 1, ..., n

  Pi = {pi : pi is a probability measure over Xi}

and for each (p1, ..., pn) ∈ P1 × ... × Pn

  ei(p1, ..., pn) = Σ(x1 ∈ X1) ... Σ(xn ∈ Xn) ui(x1, ..., xn) · p1(x1) · ... · pn(xn)

Nash's Theorem: Every mixed extension of an n-player game has at least one equilibrium.
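The expected utility e_i can be computed directly from its definition; the dictionary-based representation of u_i and of the mixed strategies is an assumption for this sketch.

```python
# Expected utility e_i(p1, ..., pn): sum u_i over all pure profiles,
# weighted by the product of the players' strategy probabilities.

from itertools import product

def expected_utility(ui, strategies, mixed):
    """ui maps pure-profile tuples to payoffs; mixed is one prob. dict per player."""
    total = 0.0
    for profile in product(*strategies):
        weight = 1.0
        for player, choice in enumerate(profile):
            weight *= mixed[player].get(choice, 0.0)
        total += ui[profile] * weight
    return total
```

For example, in matching pennies a uniform mixed profile yields expected utility 0 for the row player.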
Let a zero-sum game be given by (rows: Player 1, columns: Player 2):

         a    b    c
  a          10   8
  b     6    4    4
  c     3    8    7

Then p1 = (1/2, 0, 1/2) dominates p1' = (0, 1, 0). Hence, for all (pa', pb', pc') ∈ P1 with pb' > 0 there exists a dominating strategy (pa, 0, pc) ∈ P1.

Now p2 = (1/2, 1/2, 0) dominates p2' = (0, 0, 1).

The unique equilibrium is ((1/3, 0, 2/3), (1/2, 1/2, 0)).
Heuristics Detecting Structures Generating Evaluation Functions The Viking Method
Simple games like Tic-Tac-Toe and Rock-Paper-Scissors can be searched completely. "Real" games like Peg Jumping, Chinese Checkers, Chess cannot.
At the search horizon, states receive estimated values. This requires the ability to automatically generate evaluation functions.
Besides efficient inference and search algorithms, the ability to automatically generate a good evaluation function distinguishes good from bad General Game Playing programs. Existing approaches: Mobility and Novelty Heuristics Structure Detection Fuzzy Goal Evaluation The Viking Method: Monte-Carlo Tree Search
Mobility: more moves means a better state.
Advantage: in many games, being cornered or forced into making a move is quite bad; it usually goes along with pieces of lower value or less control of the board (e.g., being in check limits what you can do compared to not being in check).
Disadvantage: mobility is bad for some games.
Inverse mobility: having fewer things to do is better. This works in some games, like Nothello and Suicide Chess, where you might in fact want to lose pieces. How to decide between mobility and inverse-mobility heuristics?
Novelty: changing the game state is better.
Advantage: introduces some directed randomness.
Disadvantage: ...
Evaluation functions are typically designed by human programmers: a great deal of thought and empirical testing goes into choosing them. But this requires knowledge of the game's structure, semantics, play order, etc.
Domains of fluents are identified by a dependency graph; e.g., the argument of step/1 inherits its domain from the arguments of succ/2:

succ(0,1)  succ(1,2)  succ(2,3)
init(step(0))
next(step(X)) <= true(step(Y)) ∧ succ(Y,X)
A successor relation is a binary relation that is antisymmetric, functional, and injective. Examples:
  succ(1,2) ∧ succ(2,3) ∧ succ(3,4) ∧ ...
  next(a,b) ∧ next(b,c) ∧ next(c,d) ∧ ...

An order relation is a binary relation that is antisymmetric and transitive. Example:
  lessthan(A,B) <= succ(A,B)
  lessthan(A,C) <= succ(A,B) ∧ lessthan(B,C)
An (m-dimensional) board is an n-ary fluent (n = m+1) with m arguments whose domains are successor relations and 1 output argument.
Example: cell(a,1,whiteRook) ∧ cell(b,1,whiteKnight) ∧ ...

A marker is an element of the domain of a board's output argument. A piece is a marker which is in at most one board cell at a time. Example: pebbles in Othello, the White King in Chess.

goal(xplayer,100) <= line(x)
line(P) <= row(P) ∨ col(P) ∨ diag(P)
Value of intermediate state = Degree to which it satisfies the goal
goal(xplayer,100) <= line(x)
line(P) <= row(P) ∨ col(P) ∨ diag(P)
row(P) <= true(cell(1,Y,P)) ∧ true(cell(2,Y,P)) ∧ true(cell(3,Y,P))
col(P) <= true(cell(X,1,P)) ∧ true(cell(X,2,P)) ∧ true(cell(X,3,P))
diag(P) <= true(cell(1,1,P)) ∧ true(cell(2,2,P)) ∧ true(cell(3,3,P))
diag(P) <= true(cell(3,1,P)) ∧ true(cell(2,2,P)) ∧ true(cell(1,3,P))
goal(x,100) <=   true(cell(1,Y,x)) ∧ true(cell(2,Y,x)) ∧ true(cell(3,Y,x))
               ∨ true(cell(X,1,x)) ∧ true(cell(X,2,x)) ∧ true(cell(X,3,x))
               ∨ true(cell(1,1,x)) ∧ true(cell(2,2,x)) ∧ true(cell(3,3,x))
               ∨ true(cell(3,1,x)) ∧ true(cell(2,2,x)) ∧ true(cell(1,3,x))
3 literals are true after does(x,mark(1,1))
2 literals are true after does(x,mark(1,2))
4 literals are true after does(x,mark(2,2))

Our t-norms are instances of the Yager family (with parameter q):
  T(a,b) = 1 − S(1−a, 1−b)
  S(a,b) = (a^q + b^q)^(1/q)

Evaluation function for formulas:
  eval(f ∧ g) = T'(eval(f), eval(g))
  eval(f ∨ g) = S'(eval(f), eval(g))
  eval(¬f) = 1 − eval(f)

Degree to which f(x,a) is true given that f(x,b) holds:
  (1−p) − (1−p) · δ(b,a) / |dom(f(x))|

With p = 0.9, eval(cell(green,e,5)) is
  0.082 if true(cell(green,f,10))
  0.085 if true(cell(green,j,5))
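The Yager-family evaluation can be sketched as follows. The parameter q = 2 is an arbitrary example value, the nested-tuple formula representation is our own, and since the slides leave T' and S' unspecified, the sketch uses T and S directly.

```python
# Fuzzy goal evaluation with the Yager family of t-norms:
# S is the Yager t-conorm with parameter q, and T its dual t-norm.

def S(a, b, q=2.0):
    return min(1.0, (a**q + b**q) ** (1.0 / q))

def T(a, b, q=2.0):
    return 1.0 - S(1.0 - a, 1.0 - b, q)

def evaluate(formula, atom_value, q=2.0):
    """formula: nested tuples ('and', f, g), ('or', f, g), ('not', f), or an atom."""
    op = formula[0] if isinstance(formula, tuple) else None
    if op == 'and':
        return T(evaluate(formula[1], atom_value, q),
                 evaluate(formula[2], atom_value, q), q)
    if op == 'or':
        return S(evaluate(formula[1], atom_value, q),
                 evaluate(formula[2], atom_value, q), q)
    if op == 'not':
        return 1.0 - evaluate(formula[1], atom_value, q)
    return atom_value(formula)        # atomic: degree of truth in the state
```

`atom_value` is where the distance-based degree for fluents like cell(green,e,5) would be plugged in.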
init(cell(green,j,13)) ∧ ...
goal(green,100) <= true(cell(green,e,5)) ∧ ...

Truth degree of a goal literal = 1 / (distance to current value)

Order relations (binary, antisymmetric, functional, injective):
  succ(1,2). succ(2,3). succ(3,4).
  file(a,b). file(b,c). file(c,d).
Order relations define a metric on functional features:
  δ(cell(green,j,13), cell(green,e,5)) = 13
Game Description Compiled Theory Reasoner Move List Termination & Goal State Update Evaluation Function Search
Fuzzy goal evaluation works particularly well for games with independent sub-goals:
- 15-Puzzle: converges to the goal
- Chinese Checkers: quantitative goal
- Othello: partial goals
- Peg Jumping, Chinese Checkers with >2 players
The Viking Method, aka Monte-Carlo Tree Search, is used by Cadiaplayer (Reykjavik University). [Figure: game-tree search estimates values at a fixed horizon; MC tree search simulates games to terminal states with actual scores such as 100, 0, 50.]
Value of move = Average score returned by simulation
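Playout-based evaluation can be sketched like this; the `game` interface (terminal, legal_moves, next_state, goal) is a hypothetical single-role reasoner API introduced only for the sketch.

```python
# Monte-Carlo evaluation: the value of a state is the average score over
# random playouts from that state to the end of the game.

import random

def playout(game, state, role):
    """Play uniformly random legal moves until a terminal state is reached."""
    while not game.terminal(state):
        move = random.choice(game.legal_moves(state))
        state = game.next_state(state, move)
    return game.goal(state, role)

def mc_value(game, state, role, n=100):
    """Average playout score: the Monte-Carlo estimate of the state's value."""
    return sum(playout(game, state, role) for _ in range(n)) / n
```

The value of a move is then the Monte-Carlo value of the state it leads to.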
[Example: children visited n = 22, 18, 20 times with values v = 20, 20, 80; the root then has n = 60 and v = (22·20 + 18·20 + 20·80)/60 = 40.]
Play one random game for each move. For the next simulation, choose the move with the highest upper confidence bound:

  argmax_i ( v_i + C · sqrt(log n / n_i) )

Example: n1 = 4, v1 = 20; n2 = 24, v2 = 65; n3 = 32, v3 = 80.
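The selection rule can be computed directly; the constant C and the (move, n_i, v_i) child representation are assumptions for this sketch.

```python
# UCT move selection: pick the child maximizing the upper confidence
# bound v_i + C * sqrt(log(n) / n_i), where n is the total visit count.

from math import log, sqrt

def ucb_select(children, C=40.0):
    """children: list of (move, n_i, v_i); returns the move to simulate next."""
    n = sum(ni for _, ni, _ in children)
    def bound(child):
        _, ni, vi = child
        if ni == 0:
            return float('inf')          # always try unvisited moves first
        return vi + C * sqrt(log(n) / ni)
    return max(children, key=bound)[0]
```

A larger C shifts the balance from exploitation (high v_i) toward exploration (low n_i).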
Monte-Carlo Tree Search works particularly well for games which
- converge to the goal (Checkers)
- reward greedy behavior
- have a large branching factor
- do not admit a good heuristic
The Game Master communicates with Player1, Player2, ..., Playern as follows:
1. Game description, time to think (1,800 sec), time per move (45 sec), your role
2. Start
3. "Your move, please"
4. Players send their individual moves
5. Game Master broadcasts the joint move
6. End of game
[Competition standings by points: 2690.75, 2573.75, 2370.50, 1948.25 in one event and 2724, 2356, 2253, 2122, 1798 in another; UT Austin was among the participants.]
Much like RoboCup, General Game Playing
- combines a variety of AI areas
- fosters developmental research
- has great public appeal
- has the potential to significantly advance AI

In contrast to RoboCup, GGP has the advantage to
- focus on high-level intelligence
- have low entry cost
- make a great hands-on course for AI students
Natural Language Understanding: rules of a game given in natural language
Robotics: robot playing the actual, physical game
Computer Vision: vision system sees board, pieces, cards, rule book, ...
Uncertainty: nondeterministic games with incomplete information
Stanford GGP initiative games.stanford.edu
GGP in Germany general-game-playing.de
Palamedes palamedes-ide.sourceforge.net
Heuristic evaluation functions for general game playing. AAAI 2007.
Simulation-based approach to general game playing. AAAI 2008.
General game playing. AI Magazine 26(2), 2006.
Automatic heuristic construction in a complete general game player. AAAI 2006.
Fluxplayer: a successful general game player. AAAI 2007.