Computer Poker Research at The University of Alberta
Richard Gibson Computing Science Honours Seminar February 25, 2013
Games have been used to showcase advances in artificial intelligence...
– Checkers (Source: spectrum.ieee.org)
– Chess (Source: robertamsterdam.com; Wikipedia)
Goal: Build a computer poker program capable of defeating the world's best human players!
– Why is poker research interesting?
– Computer Poker Research Group
– Nash equilibrium
– Abstraction
– Annual Computer Poker Competition (Programs vs. Programs)
– Man vs. Machine Competitions
Example hand (table image source: ebaumsworld.com; card images source: Wikipedia)
– Before the flop: Raise! Call.
– Flop: Check. Check.
– Turn: Bet! Call.
– River: Check. Bet! Raise! Call.
– Result: Winner! Loser.
fun!
Source: maps.google.com
[Figure: many possible flops (Flop? Flop? Flop? ...) and multiple pots (Pot 1, Pot 2, Pot 3)]
Source: Wikipedia
Example: Driving a car.
Source: clker.com
Example: Online Advertisement Auctions.
Source: blog.revizzit.com
Example: Sequential Auctions.
Source: wikipedia.com
Example: “Adaptive Treatment Strategies”
– For instance: Insulin for diabetes patients
Source: clker.com
[Chen and Bowling, NIPS 2012]
– Loki (1997)
– Poki (1999)
– PsOpti / Sparbot (2002)
– Vexbot (2003)
Limit Texas Hold'em Heads-up (2-player) Limit Texas Hold'em
– Polaris (vs. Humans)
– Hyperborean (vs. Programs)
– Heads-up Limit Texas Hold'em
– Heads-up No-limit Texas Hold'em
– Three-player Limit Texas Hold'em
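Extensive-Form Game
[Figure: game tree with actions f, c, r, and k at the decision points and payoffs such as +2 at the leaves]
Strategy Profile
[Figure: the same tree with a probability on each action, e.g. 0.2 / 0.8, 0.9 / 0.1, 0.3 / 0.7, 0.4 / 0.6]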
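To make the idea of a strategy concrete, here is a minimal sketch of one player's strategy stored as a table of action probabilities. The decision-point keys, the probabilities, and the helper names `is_valid` and `sample_action` are all hypothetical and chosen only for illustration; the action letters f, c, r are borrowed from the game tree on the slide. A full strategy profile would hold one such table per player, and in poker each decision point would also depend on the player's private cards.

```python
import random

# Hypothetical decision points, keyed by the betting sequence seen so far.
# Each entry is a probability distribution over the actions f, c, r.
strategy = {
    "": {"f": 0.0, "c": 0.2, "r": 0.8},
    "r": {"f": 0.1, "c": 0.9, "r": 0.0},
}


def is_valid(strategy, tol=1e-9):
    """Check that every decision point holds a proper probability distribution."""
    return all(
        all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) <= tol
        for dist in strategy.values()
    )


def sample_action(strategy, decision_point, rng):
    """Draw one action at random according to the stored probabilities."""
    dist = strategy[decision_point]
    actions = list(dist)
    return rng.choices(actions, weights=[dist[a] for a in actions], k=1)[0]


rng = random.Random(0)
print(is_valid(strategy), sample_action(strategy, "", rng))
```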
– Nash equilibrium
Source: clker.com
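[Figure: rock-paper-scissors game tree; each player picks r, p, or s, with payoffs shown as +1]
– “No one can change their strategy and do better.”
[Figure: rock-paper-scissors with both players choosing r, p, and s with probability 1/3 each]
– “I can't lose no matter what my opponent does.”
[Figure: one player choosing r, p, and s with probability 1/3 each against an unknown opponent strategy (? ? ?)]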
possible!
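The two quoted properties can be checked numerically for rock-paper-scissors. The sketch below assumes a payoff of +1 for a win, -1 for a loss, and 0 for a tie (the slide only shows +1), and the function names are illustrative. It computes the best payoff any pure strategy can obtain against a given mixed strategy: against the uniform 1/3, 1/3, 1/3 strategy that value is 0, so no deviation does better and the uniform player cannot lose in expectation.

```python
from fractions import Fraction

ACTIONS = ("r", "p", "s")
BEATS = {"r": "s", "p": "r", "s": "p"}  # rock beats scissors, and so on


def payoff(a, b):
    """Payoff to the player choosing a against b (assumed: +1 win, -1 loss, 0 tie)."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(x, y):
    """Expected payoff to the player using mixed strategy x against mixed strategy y."""
    return sum(x[a] * y[b] * payoff(a, b) for a in ACTIONS for b in ACTIONS)


def best_response_value(y):
    """Highest expected payoff any pure strategy achieves against y."""
    pure_strategies = (
        {a: Fraction(int(a == chosen)) for a in ACTIONS} for chosen in ACTIONS
    )
    return max(expected_payoff(pure, y) for pure in pure_strategies)


uniform = {a: Fraction(1, 3) for a in ACTIONS}
always_rock = {"r": Fraction(1), "p": Fraction(0), "s": Fraction(0)}

print(best_response_value(uniform))      # nothing beats the uniform strategy
print(best_response_value(always_rock))  # a predictable strategy can be exploited
```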
Pot 2 Pot 1 Pot 3
possible!
– [Bard and Bowling, AAAI 2007]
– [Johanson, Zinkevich, and Bowling, NIPS 2007]
– [Johanson and Bowling, AISTATS 2009]
but still lots of work to be done!
Extensive-Form Game
Nash Equilibrium Strategy Profile
Source: clker.com
Source: clker.com
[Zinkevich et al., NIPS 2007].
[Figure: a loop that repeatedly “plays” poker, deals cards, and updates the strategy profile, ending in a Nash equilibrium strategy profile]
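The loop of repeated play followed by a strategy update can be illustrated on a much smaller game. The sketch below is not the group's poker implementation: it runs plain regret matching in self-play on rock-paper-scissors, a game with a single decision per player, under an assumed +1 / -1 / 0 payoff convention. The function names, the starting bias toward rock, and the iteration count are assumptions made for illustration. The average strategies over all iterations approach the 1/3, 1/3, 1/3 equilibrium.

```python
ACTIONS = ("r", "p", "s")
BEATS = {"r": "s", "p": "r", "s": "p"}


def payoff(a, b):
    """Payoff to the player choosing a against b (assumed: +1 win, -1 loss, 0 tie)."""
    if a == b:
        return 0.0
    return 1.0 if BEATS[a] == b else -1.0


def current_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run regret matching for both players and return their average strategies."""
    # Player 0 starts biased toward rock so the run does not begin at equilibrium.
    regrets = [list(initial_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected value of each action against the opponent's current strategy.
            values = [
                sum(opponent[j] * payoff(a, b) for j, b in enumerate(ACTIONS))
                for a in ACTIONS
            ]
            expected = sum(p * v for p, v in zip(strategies[player], values))
            for i in range(len(ACTIONS)):
                regrets[player][i] += values[i] - expected
                strategy_sums[player][i] += strategies[player][i]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play(20000)
print(average_strategies)
```

The current strategies keep cycling from iteration to iteration; it is the average over iterations that settles near equilibrium.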
Limit
Extensive-Form Game: 10^18
Strategy Profile: 5 million GB
Extensive-Form Game
Nash Equilibrium Strategy Profile
[Figure: the extensive-form game is reduced to an abstract game]
– Rank hands from best to worst.
– For 10 buckets, put top 10% into bucket 1, next 10% into bucket 2, etc.
[Figure: hands ordered from best to worst and divided into Bucket 1 through Bucket 10]
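The two bucketing steps above can be sketched in a few lines. The hand names and strength scores below are hypothetical placeholders, since the slides do not say how hands are ranked, and `percentile_buckets` is an illustrative name.

```python
def percentile_buckets(hand_strengths, num_buckets=10):
    """Rank hands from best to worst and split them into equal-sized buckets.

    hand_strengths maps a hand label to a score where higher means better.
    Bucket 1 holds the strongest hands and bucket num_buckets the weakest.
    """
    ranked = sorted(hand_strengths, key=hand_strengths.get, reverse=True)
    count = len(ranked)
    return {
        hand: (position * num_buckets) // count + 1
        for position, hand in enumerate(ranked)
    }


# Hypothetical example: 20 hands with made-up strength scores.
strengths = {f"hand{i}": i / 20 for i in range(20)}
buckets = percentile_buckets(strengths, num_buckets=10)
print(buckets["hand19"], buckets["hand0"])
```

With 20 hands and 10 buckets, each bucket receives two hands: the two strongest land in bucket 1 and the two weakest in bucket 10.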
– Old bucketing technique
– New bucketing technique
[Figure: the extensive-form game is abstracted into an abstract game; CFR “plays” “poker” in the abstract game billions of times, dealing buckets and updating the abstract strategy profile; the resulting abstract game equilibrium strategy is used as an approximate full game equilibrium strategy]
– We use Compute Canada's largest supercomputers.
– Parallel implementations of abstraction, CFR.
Source: rqchp.ca
[Figure labels: Old abstraction, CFR, New abstraction, Supercomputers, Fancy new CFR variant]
– Programs vs. Programs.
– Three different Texas Hold'em games:
Total Bankroll
Pot 2 Pot 1 Pot 3
Bankroll Instant Run-off
Nash Equilibrium Strategy Profile
– Finished 4th in 2012 Heads-up limit total bankroll.
21 8 5
Source: clker.com
– Heads-up limit only
– Opponents: Phil “The Unabomber” Laak and Ali Eslami.
– YouTube Video
– 500 duplicate hands per session, $10/$20 blinds
Results (columns: Ali Eslami, Phil Laak, Combined Human Score)
– Session 1: +$395; Statistical Draw
– Session 2: +$1570; Polaris Wins
– Session 3: +$1455, +$820; Humans Win
– Session 4: +$460, +$110, +$570; Humans Win
– Overall: +$2670, +$395; Humans Win
– Again, just heads-up limit
– Opponents: Matt Hawrilenko, Ijay Palansky, Nick Grudzien, Kyle Hendon, Rich McRoberts, Victor Acosta, Mark Newhouse
– 500 duplicate hands per session, $1000/$2000 blinds
Results (columns: Human 1, Human 2, Combined Human Score)
– Session 1: +$199,500, +$25,500; Humans Win
– Session 2: Polaris Wins
– Session 3: +$37,000; Statistical Draw
– Session 4: +$89,500, +$50,000; Humans Win
– Session 5: +$251,500; Polaris Wins
– Session 6: Polaris Wins
– Overall: Polaris Wins
Lose Win
– We are still far from equilibrium in no-limit.
– Website: http://cs.ualberta.ca/~poker
– Twitter: @PolarisPoker
– Email: rggibson@cs.ualberta.ca
– Website: http://cs.ualberta.ca/~rggibson
– Twitter: @RichardGGibson