Recent Advances in Computer Poker and Future Research for Artificial Intelligence in Video Games Richard Gibson
SIAT Faculty Search Presentation February 28, 2013
One Slide Summary
2009-2013: Computer Poker Research
Image source: co-optimus.com Image source: arcadelearningenvironment.org
– Motivation – Background
– Research + Hyperborean3p
– StarCraft AI, ALE, automated content generation
– Game design, AI in video games
– Deterministic
– Binary outcomes (+ draw)
– Perfect Information
Image sources: Wikipedia Image source: spectrum.ieee.org
– Stochastic elements (figure: "Flop? Flop? Flop? ...")
– Varying outcomes (figure: Pot 1, Pot 2, Pot 3)
– Imperfect information
Image sources: Wikipedia, ebaumsworld.com
– Airport security [Pita et al., AI Magazine 2009]
– Adaptive treatment strategies [Chen and Bowling, NIPS 2012]
– Sequential auctions [?]
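[Figure: extensive-form game tree for a small poker game. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then take actions labeled c, b, and f; leaf payoffs are +1 or +2.]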
between.
[Figure: an example hand traced through the game tree: "Bet!", "Fold? Call?", "Call.", then "Win!" / "Lose." with a payoff of +2.]
Extensive-Form Game
Strategy Profile: one strategy per player, where each strategy assigns every decision point a probability distribution over actions.
[Figure: the same game tree annotated with an example strategy profile; the action probabilities shown include 0.6/0.4, 0.8/0.2, 0.7/0.3, and 1.]
– Nash equilibrium
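[Figure: rock-paper-scissors drawn as a game tree. Player 1 chooses r, p, or s, then player 2 chooses r, p, or s; payoffs of +1 are marked at the leaves.]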
– “No one can change their strategy and do better.”
[Figure: rock-paper-scissors tree in which both players choose r, p, and s with probability 1/3 each.]
– “I can't lose no matter what my opponent does.”
[Figure: rock-paper-scissors tree in which player 1 chooses r, p, and s with probability 1/3 each and player 2's probabilities are left unknown (? ? ?).]
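The "can't lose" claim can be checked numerically for rock-paper-scissors. The following is a minimal sketch, not taken from the slides: the payoff convention (+1 win, -1 loss, 0 tie) and the helper names `payoff`, `expected_value`, and `pure` are assumptions made for illustration. It shows that the uniform strategy has expected value 0 against every pure opponent strategy, and therefore against any mixture of them.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("r", "p", "s")
BEATS = {"r": "s", "p": "r", "s": "p"}  # each key beats its value


def payoff(a1, a2):
    """Payoff to player 1 (assumed convention: +1 win, -1 loss, 0 tie)."""
    if a1 == a2:
        return 0
    return 1 if BEATS[a1] == a2 else -1


def expected_value(strategy1, strategy2):
    """Expected payoff to player 1 when both players mix independently."""
    return sum(strategy1[a1] * strategy2[a2] * payoff(a1, a2)
               for a1, a2 in product(ACTIONS, ACTIONS))


def pure(action):
    """Strategy that plays `action` with probability 1."""
    return {a: Fraction(int(a == action)) for a in ACTIONS}


UNIFORM = {a: Fraction(1, 3) for a in ACTIONS}

# Worst case for the uniform strategy over all pure opponent strategies.
worst_case = min(expected_value(UNIFORM, pure(b)) for b in ACTIONS)
print(worst_case)
```

By contrast, a predictable strategy such as `pure("r")` loses 1 per game to `pure("p")`, which is why the guarantee depends on the equilibrium mixture.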
Extensive-Form Game → ? → Nash Equilibrium Strategy Profile
Source: clker.com
Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]:
– Iteration 1: deal cards, “play” poker, producing Strategy Profile 1
– Iteration 2: deal cards, “play” poker, producing Strategy Profile 2
– ...
– Iteration T (T in the billions): deal cards, “play” poker, producing Strategy Profile T
– Average Strategy Profile = (Strategy 1 + Strategy 2 + ... + Strategy T) / T
– As T → ∞, the Average Strategy Profile approaches a Nash Equilibrium Strategy Profile
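The play-update-average loop can be illustrated on a much smaller game. The sketch below is not CFR on poker: it runs regret matching, the per-decision update rule that CFR builds on, in self-play on rock-paper-scissors. The payoff matrix, the function names, and the small initial regret used to move play away from the equilibrium are all assumptions made for this example.

```python
NUM_ACTIONS = 3  # rock, paper, scissors
# PAYOFF[i][j]: payoff to player 1 when player 1 plays i and player 2 plays j.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations):
    # Assumption: player 1 starts with a small regret for "rock" so that
    # play does not begin exactly at the equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * NUM_ACTIONS, [0.0] * NUM_ACTIONS]
    for _ in range(iterations):
        s1 = regret_matching(regrets[0])
        s2 = regret_matching(regrets[1])
        # Expected payoff of each pure action against the opponent's current strategy.
        u1 = [sum(PAYOFF[i][j] * s2[j] for j in range(NUM_ACTIONS))
              for i in range(NUM_ACTIONS)]
        u2 = [-sum(PAYOFF[i][j] * s1[i] for i in range(NUM_ACTIONS))
              for j in range(NUM_ACTIONS)]
        for player, (strategy, utils) in enumerate(((s1, u1), (s2, u2))):
            value = sum(p * u for p, u in zip(strategy, utils))
            for a in range(NUM_ACTIONS):
                regrets[player][a] += utils[a] - value   # accumulate regret
                strategy_sums[player][a] += strategy[a]  # accumulate strategy
    # Average strategy profile: (Strategy 1 + ... + Strategy T) / T.
    return [[total / iterations for total in sums] for sums in strategy_sums]


average_profile = self_play(20000)
print(average_profile)
```

The individual iterations keep cycling between actions; it is the average over all iterations that settles near the uniform (1/3, 1/3, 1/3) equilibrium.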
Extensive-Form Game → CFR → Nash Equilibrium Strategy Profile
Texas Hold'em (> 10^14) → CFR → Nash Equilibrium Strategy Profile (> 5 million GB)
Extensive-Form Game → Abstract Game
Abstract Game → CFR (“play” “poker” billions of times, dealing buckets instead of cards) → Abstract Strategy Profile → Abstract Game Equilibrium Strategy
Abstract Game Equilibrium Strategy → Approximate Full Game Equilibrium Strategy
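Dealing buckets instead of cards means grouping many distinct hands into a few classes that the abstract game treats as identical. The sketch below is only one simple way to do this, assuming each hand already has a scalar strength score; the function `assign_buckets` and the strength numbers are hypothetical, and the abstractions used for the agents in this talk may be built differently.

```python
def assign_buckets(hand_strengths, num_buckets):
    """Map each hand to a bucket index by rank of its strength (0 = weakest)."""
    ranked = sorted(hand_strengths, key=hand_strengths.get)
    return {hand: rank * num_buckets // len(ranked)
            for rank, hand in enumerate(ranked)}


# Hypothetical strength scores in [0, 1]; illustrative numbers, not real equities.
strengths = {"AA": 0.85, "KK": 0.82, "AKs": 0.67,
             "JTs": 0.58, "72o": 0.35, "32o": 0.31}

buckets = assign_buckets(strengths, num_buckets=3)
print(buckets)
```

With three buckets, the six example hands collapse to three abstract "cards", so a strategy computed for the abstract game needs one entry per bucket rather than one per hand.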
3-or-more Player Abstract Game → CFR → (Not equilibrium)
Annual Computer Poker Competition, 3-Player Limit Texas Hold'em, 2009

Agent            Total Bankroll (mbb/g)
Hyperborean3p    319 ± 2
dpp              171 ± 2
akuma            151 ± 2
CMURingLimit
dcu3pl
Bluechip
Dominated Strategies
[Figure: the game tree with some action branches removed; fewer actions and payoffs remain than in the full tree.]
Iteratively Dominated Strategy
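Iterated dominance is easiest to see in a small matrix game. The sketch below is an illustration under stated assumptions, not the analysis from the talk: it works on a two-player normal-form game rather than a poker tree, it only checks strict domination by another pure strategy (domination by mixed strategies is ignored), and the function names and the example payoffs are made up for this example.

```python
def strictly_dominated(payoffs, candidate, alternatives, opponent_actions):
    """True if some other remaining action beats `candidate` against every opponent action."""
    return any(
        all(payoffs[other][o] > payoffs[candidate][o] for o in opponent_actions)
        for other in alternatives if other != candidate
    )


def iterated_elimination(row_payoffs, col_payoffs):
    """Repeatedly delete strictly dominated actions; payoffs are indexed [row][col]."""
    num_rows, num_cols = len(row_payoffs), len(row_payoffs[0])
    rows, cols = list(range(num_rows)), list(range(num_cols))
    # Column player's payoffs re-indexed as [col][row] so one check serves both players.
    col_view = [[col_payoffs[r][c] for r in range(num_rows)] for c in range(num_cols)]
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if strictly_dominated(row_payoffs, r, rows, cols):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if strictly_dominated(col_view, c, cols, rows):
                cols.remove(c)
                changed = True
    return rows, cols


# Example game: rows U, D for player 1; columns L, M, R for player 2.
row_payoffs = [[1, 1, 0],
               [0, 0, 2]]
col_payoffs = [[0, 2, 1],
               [3, 1, 0]]

surviving_rows, surviving_cols = iterated_elimination(row_payoffs, col_payoffs)
print(surviving_rows, surviving_cols)
```

In this example R is dominated by M outright, while D and L only become dominated after earlier deletions, which is what makes them iteratively dominated.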
3-or-more Player Abstract Game → CFR:
– Average Strategy Profile, T → ∞: No Iteratively Dominated Strategies (New! [G., submitted to EC 2013])
– “Current” Strategy Profile, Finite T: No Iteratively Dominated Strategies (New! [G., submitted to EC 2013])
3-Player Limit Texas Hold'em - 2012
New! [G., submitted to EC 2013]
– 2-player Limit Texas Hold'em Abstract Game: ≈ 59,000,000 “Turn” Deals → 540,000 “Turn” Buckets
– 3-player Limit Texas Hold'em Abstract Game: ≈ 59,000,000 “Turn” Deals → 540 “Turn” Buckets
– 3-Player Limit Texas Hold'em Abstract Game (540 “Turn” Buckets) → Abstract Game Strategy
– 2-player Sub-games (540,000 “Turn” Buckets) → 2-player Experts
– Abstract Game Strategy + 2-player Experts → 3-player Limit Texas Hold'em Stitched Strategy [Gibson and Szafron, NIPS 2011]
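One way to picture a stitched strategy is as a lookup that follows the coarse 3-player strategy until play enters a 2-player sub-game, then hands over to the expert for that sub-game. The sketch below is a hypothetical illustration of that dispatch only; the table layout, the betting-history strings, the bucket numbers, and the function `stitched_action_probs` are all assumptions and do not reproduce the implementation in [Gibson and Szafron, NIPS 2011].

```python
def stitched_action_probs(base_strategy, experts, history, coarse_bucket, fine_bucket):
    """Return action probabilities from an expert if its sub-game was reached, else from the base."""
    # experts maps a history prefix that starts a 2-player sub-game
    # (here: the first player folding) to that expert's strategy table.
    for prefix, expert_strategy in experts.items():
        if history.startswith(prefix):
            return expert_strategy[(history, fine_bucket)]
    return base_strategy[(history, coarse_bucket)]


# Hypothetical toy tables keyed by (betting history, bucket).
# Actions: f = fold, c = call, r = raise.
base = {("", 3): {"f": 0.1, "c": 0.6, "r": 0.3}}
experts = {"f": {("fr", 3021): {"f": 0.2, "c": 0.5, "r": 0.3}}}

opening = stitched_action_probs(base, experts, "", 3, 3021)
heads_up = stitched_action_probs(base, experts, "fr", 3, 3021)
print(opening, heads_up)
```

The two bucket arguments reflect the idea that the expert can afford a much finer abstraction than the base strategy because its sub-game is smaller.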
Extensive-Form Game → Abstraction 1, Abstraction 2, ..., Abstraction K → “Frankenstein” Abstract Game (New!) → Frankenstein-Game Strategy → Full Game Strategy [Gibson and Szafron, NIPS 2011]
3-player Limit Texas Hold'em → abstractions with 200 “Turn” Buckets and 765,000 “Turn” Buckets → “Frankenstein” Abstract Game → Frankenstein-Game Strategy → 3-player Texas Hold'em Strategy (New!)
[Figure: Hyperborean3p tournament results; labels: CFR, 2-player Experts; 34/35 top-3 finishes; chart values 5, 1, 21, 8, 5.]
– Imperfect information
– Large state space
– Actions taken in real-time
– Improved single player experience
– Game balance
– Winner plays human professional
– AI currently no match for humans
– Poor high-level strategies
Image source: Flickr
StarCraft → Abstract Game → CFR? → Abstract Game Equilibrium Strategy → High-level StarCraft Strategy
Atari 2600 games
– Simple, yet still challenging domains
– At the heart of artificial intelligence
– Auto-detect glitches – Evaluate difficulty
– Death detection in reinforcement learning
Image source: Wikipedia
– Levels for a platforming game
– Music for different game contexts
– League schedules in a sports game, etc.
– More content for “free”
– Content tailor-made for individual players
– Constraint satisfaction + optimization
– I want to learn more!
Image source: Infinite Mario Bros. screenshot Image source: nhl.com
– IAT 167, 265, 312, 410
– Experience as lab instructor for introductory programming course
– Hands-on experience implementing real AI – NPC behaviour, simple sports AI, etc.
– Exposure to research in the field:
Image source: amazon.ca
– Primary author of Hyperborean3p – Success in poker competitions
content generation
– Email: rggibson@cs.ualberta.ca – Website: http://cs.ualberta.ca/~rggibson/ – Twitter: @RichardGGibson
Clip art images used in this presentation can be found at clker.com