Richard Gibson SIAT Faculty Search Presentation February 28, 2013



slide-1
SLIDE 1

Recent Advances in Computer Poker and Future Research for Artificial Intelligence in Video Games Richard Gibson

SIAT Faculty Search Presentation February 28, 2013

slide-2
SLIDE 2

One Slide Summary

  • 2009 – 2013: Computer Poker Research
slide-3
SLIDE 3

One Slide Summary

  • 2009 – 2013: Computer Poker Research
  • Future: AI in Video Games

Image source: co-optimus.com Image source: arcadelearningenvironment.org

slide-4
SLIDE 4

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft AI, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-5
SLIDE 5

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft AI, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-6
SLIDE 6

Why Poker Research?

  • Classic games, such as chess and checkers, are:

– Deterministic
– Binary outcomes (+ draw)
– Perfect information

Image sources: Wikipedia Image source: spectrum.ieee.org

slide-7
SLIDE 7

Why Poker Research?

  • However, poker is a game with:

– Stochastic elements

Image sources: Wikipedia

[Figure: a series of possible flop deals, each labelled "Flop?"]

slide-8
SLIDE 8

Why Poker Research?

  • However, poker is a game with:

– Stochastic elements
– Varying outcomes

[Figure: three pots labelled Pot 1, Pot 2, and Pot 3]

Image source: ebaumsworld.com

slide-9
SLIDE 9

Why Poker Research?

  • However, poker is a game with:

– Stochastic elements
– Varying outcomes
– Imperfect information

? ?

slide-10
SLIDE 10

Why Poker Research?

  • Poker research is applicable in other areas:

– Airport security [Pita et al., AI Magazine 2009]
– Adaptive treatment strategies [Chen and Bowling, NIPS 2012]
– Sequential auctions [?]

slide-11
SLIDE 11

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft AI, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-12
SLIDE 12

Poker Research Background

  • Model poker as an extensive-form game:

[Figure: game tree. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then choose among the actions c, b, and f; leaf payoffs are +1, -1, +2, or -2.]
slide-13
SLIDE 13

Poker Research Background

  • Information sets: Sets of states a player cannot distinguish between.

[Figure: game tree. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then choose among the actions c, b, and f; leaf payoffs are +1, -1, +2, or -2.]
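To make the idea of an information set concrete, here is a minimal sketch that groups states by what the acting player can observe. It assumes the tree contains only the two deals shown (QJ and QK, so player 1 always holds the Q) and that c, b, and f stand for check/call, bet, and fold; the names DEALS, DECISION_POINTS, and information_sets are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Assumption: only the two deals on the slide; first letter is player 1's card.
DEALS = ["QJ", "QK"]

# Assumption: betting sequences at which a player must act, and who acts.
# "" = start, "c" = check, "b" = bet, "cb" = check followed by a bet.
DECISION_POINTS = {"": 1, "c": 2, "b": 2, "cb": 1}


def information_sets():
    """Group states (deal, betting) by what the acting player can observe."""
    info_sets = defaultdict(list)
    for deal in DEALS:
        for betting, player in DECISION_POINTS.items():
            own_card = deal[player - 1]  # a player sees only their own card
            info_sets[(player, own_card, betting)].append((deal, betting))
    return dict(info_sets)


for key, states in sorted(information_sets().items()):
    print(key, "->", states)
```

Under these assumptions player 1 cannot tell QJ from QK, so each of player 1's information sets contains two states, while each of player 2's contains one.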
slide-14
SLIDE 14

Poker Research Background

  • Example: Kuhn Poker
slide-15
SLIDE 15

Poker Research Background

  • Example: Kuhn Poker
slide-16
SLIDE 16

Poker Research Background

  • Example: Kuhn Poker

?

slide-17
SLIDE 17

Poker Research Background

  • Example: Kuhn Poker

?

Bet! Fold? Call?

slide-18
SLIDE 18

Poker Research Background

  • Example: Kuhn Poker

?

Call.

slide-19
SLIDE 19

Poker Research Background

  • Example: Kuhn Poker
slide-20
SLIDE 20

Poker Research Background

  • Example: Kuhn Poker

Win! +2
Lose. -2
slide-21
SLIDE 21

Poker Research Background

  • Example: Kuhn Poker

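[Figure: game tree for the Kuhn poker example. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then choose among the actions c, b, and f; leaf payoffs are +1, -1, +2, or -2.]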
slide-22
SLIDE 22

Poker Research Background

Extensive-Form Game

Strategy Profile

slide-23
SLIDE 23

Poker Research Background

  • A strategy profile maps each information set to a probability distribution over actions.

[Figure: the Kuhn poker game tree with each action labelled by its probability under an example strategy profile (0.6/0.4, 0.8/0.2, 0.7/0.3, and 1).]
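As a rough illustration, a strategy profile can be stored as a dictionary from information sets to action distributions. The keys (player, own card, betting so far) and the assignment of probabilities to particular information sets below are assumptions made for the sketch; only the probability values themselves echo the slide.

```python
import random

# Hypothetical profile: keys are (player, own card, betting so far).
profile = {
    (1, "Q", ""):   {"c": 0.6, "b": 0.4},
    (1, "Q", "cb"): {"f": 0.7, "c": 0.3},
    (2, "J", "c"):  {"c": 0.8, "b": 0.2},
    (2, "J", "b"):  {"f": 1.0},
    (2, "K", "c"):  {"b": 1.0},
    (2, "K", "b"):  {"c": 1.0},
}


def is_valid(profile, tol=1e-9):
    """Check that every information set maps to a probability distribution."""
    return all(
        all(p >= 0 for p in dist.values())
        and abs(sum(dist.values()) - 1.0) <= tol
        for dist in profile.values()
    )


def sample_action(profile, info_set, rng):
    """Draw one action at an information set according to the profile."""
    actions, probs = zip(*profile[info_set].items())
    return rng.choices(actions, weights=probs, k=1)[0]


print(is_valid(profile))
print(sample_action(profile, (1, "Q", ""), random.Random(0)))
```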

slide-24
SLIDE 24

Poker Research Background

  • What type of strategy profile do we want?

– Nash equilibrium

  • Example: Rock-Paper-Scissors
slide-25
SLIDE 25

Poker Research Background

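[Figure: Rock-Paper-Scissors as an extensive-form game. Player 1 picks r, p, or s; player 2 then picks r, p, or s; the leaf payoffs shown are +1 and -1.]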

slide-26
SLIDE 26

Poker Research Background

  • A Nash equilibrium strategy profile for Rock-Paper-Scissors.

– “No one can change their strategy and do better.”

[Figure: the Rock-Paper-Scissors game tree with every action of both players labelled 1/3.]

slide-27
SLIDE 27

Poker Research Background

  • A Nash equilibrium in a 2-player game is a defensive strategy:

– “I can't lose no matter what my opponent does.”

[Figure: the Rock-Paper-Scissors game tree with player 1's actions labelled 1/3 each and player 2's actions labelled "?".]
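The two quoted properties can be checked numerically for Rock-Paper-Scissors. The sketch below computes the expected payoff of each pure response to a mixed strategy, assuming the usual +1 / -1 / 0 payoffs for win / loss / tie; the rock_heavy strategy is a made-up comparison point, not something from the slides.

```python
from fractions import Fraction

ACTIONS = ["r", "p", "s"]
BEATS = {"r": "s", "p": "r", "s": "p"}  # key beats value


def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, -1 loss, 0 tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def response_values(opponent):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return {a: sum(opponent[b] * payoff(a, b) for b in ACTIONS) for a in ACTIONS}


uniform = {a: Fraction(1, 3) for a in ACTIONS}
rock_heavy = {"r": Fraction(1, 2), "p": Fraction(1, 4), "s": Fraction(1, 4)}

print(response_values(uniform))     # every response has expected payoff 0
print(response_values(rock_heavy))  # paper now has a positive expected payoff
```

Against the uniform strategy no response earns more than 0 in expectation, which is the sense in which the equilibrium player "can't lose"; against the rock-heavy strategy, paper gains 1/4 per game.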

slide-28
SLIDE 28

Poker Research Background

Extensive-Form Game

Nash Equilibrium Strategy Profile

?

slide-29
SLIDE 29

Poker Research Background

  • Use minimax (alpha-beta) search to compute Nash?

Source: clker.com

slide-30
SLIDE 30

Poker Research Background

  • Use minimax (alpha-beta) search to compute Nash?

[Figure: the Kuhn poker game tree with each action labelled by its probability under an example strategy profile (0.6/0.4, 0.8/0.2, 0.7/0.3, and 1).]

slide-31
SLIDE 31

Poker Research Background

  • Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007].

[Figure: one CFR iteration: deal cards, then “play” poker with Strategy Profile 1.]

slide-32
SLIDE 32

Poker Research Background

  • Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007].

[Figure: T iterations (T in the billions); each iteration deals cards and “plays” poker with Strategy Profile 1, Strategy Profile 2, ..., Strategy Profile T.]

slide-33
SLIDE 33

Poker Research Background

  • Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007].

\[
\text{Average Strategy Profile}_T = \frac{\text{Strategy}_1 + \text{Strategy}_2 + \dots + \text{Strategy}_T}{T} \;\longrightarrow\; \text{Nash Equilibrium Strategy Profile}
\]
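The averaging step can be illustrated on Rock-Paper-Scissors with regret matching in self-play. This is only the single-decision special case, not full CFR: CFR applies this kind of regret-matching update at every information set using counterfactual values. The starting regrets, function names, and iteration count are assumptions chosen for the sketch.

```python
NUM_ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
# PAYOFF[a][b]: payoff to the player choosing a against an opponent choosing b.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / NUM_ACTIONS] * NUM_ACTIONS


def train(iterations):
    """Return each player's average strategy after self-play."""
    # Assumption: bias player 1 toward rock so play does not start at uniform.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * NUM_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            values = [
                sum(opponent[b] * PAYOFF[a][b] for b in range(NUM_ACTIONS))
                for a in range(NUM_ACTIONS)
            ]
            expected = sum(
                strategies[player][a] * values[a] for a in range(NUM_ACTIONS)
            )
            for a in range(NUM_ACTIONS):
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average = train(20_000)
print([[round(p, 3) for p in strategy] for strategy in average])
```

The individual per-iteration strategies keep cycling, but their average drifts toward the uniform 1/3, 1/3, 1/3 equilibrium.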

slide-34
SLIDE 34

Poker Research Background

Extensive-Form Game

Nash Equilibrium Strategy Profile

CFR

slide-35
SLIDE 35

Poker Research Background

  • Huge problem (no pun intended):

[Figure: Texas Hold'em (> 10^14) → CFR → Nash Equilibrium Strategy Profile (> 5 million GB)]

slide-36
SLIDE 36

Poker Research Background

Extensive-Form Game

Nash Equilibrium Strategy Profile

?

slide-37
SLIDE 37

Poker Research Background

Extensive-Form Game Abstract Game

slide-38
SLIDE 38

Poker Research Background

Abstract Game

  • Merge card deals into buckets.

Extensive-Form Game

slide-39
SLIDE 39

Poker Research Background

Abstract Game

  • Merge card deals into buckets.

Extensive-Form Game
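One simple way to picture "merging card deals into buckets" is to rank hands by some strength score and split the ranking into equal-sized groups. The slides do not say which bucketing method was used, so the sketch below is an assumed stand-in, and the strength values are invented purely for illustration.

```python
def assign_buckets(strengths, num_buckets):
    """Map each hand to a bucket index by the rank of its strength score."""
    ranked = sorted(strengths, key=strengths.get)  # weakest hand first
    return {
        hand: i * num_buckets // len(ranked)
        for i, hand in enumerate(ranked)
    }


# Hypothetical strength scores in [0, 1]; not real equity values.
strengths = {
    "AA": 0.85, "KK": 0.82, "AKs": 0.67,
    "JTs": 0.58, "T9o": 0.52, "72o": 0.35,
}

buckets = assign_buckets(strengths, 3)
print(buckets)
```

A strategy computed for the abstract game then needs one entry per bucket instead of one per deal, which is where the reduction in size comes from.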

slide-40
SLIDE 40

Poker Research Background

[Figure: Extensive-Form Game (> 10^14) → Abstract Game (≈ 10^9)]

slide-41
SLIDE 41

Poker Research Background

[Figure: Extensive-Form Game (> 10^14) → Abstract Game (≈ 10^9) → CFR → Abstract Game Equilibrium Strategy. CFR deals buckets and “plays” “poker” with the abstract strategy profile billions of times.]

slide-42
SLIDE 42

Poker Research Background

[Figure: Extensive-Form Game (> 10^14) → Abstract Game (≈ 10^9) → Abstract Game Equilibrium Strategy → Approximate Full Game Equilibrium Strategy; ≈ 100 GB]

slide-43
SLIDE 43

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-44
SLIDE 44

Contribution 1: Domination

[Figure: a player 2 decision node with two actions, f (leaf labelled +1) and c (leaf labelled +2).]

slide-45
SLIDE 45

Domination

3-or-more Player Abstract Game

?

(Not equilibrium)

CFR

slide-46
SLIDE 46

Domination

Annual Computer Poker Competition, 3-Player Limit Texas Hold'em, 2009

Agent            Total Bankroll (mbb/g)
Hyperborean3p     319 ± 2
dpp               171 ± 2
akuma             151 ± 2
CMURingLimit      -37 ± 2
dcu3pl            -63 ± 2
Bluechip         -548 ± 2

[Figure: 3-or-more Player Abstract Game → CFR → ? (Not equilibrium)]

slide-47
SLIDE 47

Domination

[Figure: the Kuhn poker game tree. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then choose among the actions c, b, and f; leaf payoffs are +1, -1, +2, or -2.]
slide-48
SLIDE 48

Domination

[Figure: the Kuhn poker game tree. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then choose among the actions c, b, and f; leaf payoffs are +1, -1, +2, or -2.]
slide-49
SLIDE 49

Domination

[Figure: the Kuhn poker game tree with the dominated strategies marked.]

Dominated Strategies

slide-50
SLIDE 50

Domination

[Figure: the Kuhn poker game tree with the dominated actions removed.]
slide-51
SLIDE 51

Domination

[Figure: the reduced Kuhn poker game tree with an iteratively dominated strategy marked.]

Iteratively Dominated Strategy
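Iterated domination is easiest to see in a small matrix game. The sketch below repeatedly removes pure strategies that are strictly dominated by another pure strategy in a two-player normal-form game; this is a simplification of the extensive-form, 3-or-more-player setting of the talk, and the payoff matrices U1 and U2 are invented for the example.

```python
def dominated(payoff, own, other):
    """Strategies in `own` strictly dominated by another strategy in `own`.

    payoff[s][t] is this player's payoff for playing s against opponent t;
    only the opponent strategies still in `other` are considered.
    """
    return {
        s for s in own
        if any(all(payoff[d][t] > payoff[s][t] for t in other)
               for d in own if d != s)
    }


def iterated_elimination(u1, u2):
    """Return the rows and columns that survive iterated strict domination."""
    rows = set(range(len(u1)))
    cols = set(range(len(u1[0])))
    # Column player's payoffs re-indexed as [column][row].
    u2_t = [[u2[r][c] for r in range(len(u2))] for c in range(len(u2[0]))]
    while True:
        bad_rows = dominated(u1, rows, cols)
        bad_cols = dominated(u2_t, cols, rows - bad_rows)
        if not bad_rows and not bad_cols:
            return sorted(rows), sorted(cols)
        rows -= bad_rows
        cols -= bad_cols


# Hypothetical 2x3 game: U1[r][c] is the row player's payoff, U2[r][c] the
# column player's.
U1 = [[1, 1, 0],
      [0, 0, 2]]
U2 = [[0, 2, 1],
      [3, 1, 0]]

print(iterated_elimination(U1, U2))
```

In this example column 2 is dominated outright; row 1 and then column 0 only become dominated after earlier removals, which is what makes them iteratively dominated.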

slide-52
SLIDE 52

Domination

[Figure: 3-or-more Player Abstract Game → CFR → Average Strategy Profile T: No Iteratively Dominated Strategies. New! [G., submitted to EC 2013]]

slide-53
SLIDE 53

Domination

[Figure: 3-or-more Player Abstract Game → CFR → Average Strategy Profile T: No Iteratively Dominated Strategies. New!]

[Figure: 3-or-more Player Abstract Game → CFR → “Current” Strategy Profile T (Finite T): No Iteratively Dominated Strategies. New!]

[G., submitted to EC 2013]

slide-54
SLIDE 54

Domination

3-Player Limit Texas Hold'em - 2012

New! [G., submitted to EC 2013]

slide-55
SLIDE 55

Contribution 2: Strategy Stitching

slide-56
SLIDE 56

Strategy Stitching

                               Full game   Abstract game   “Turn” deals    “Turn” buckets
2-player Limit Texas Hold'em   ≈ 10^14     ≈ 10^9          ≈ 59,000,000    540,000

slide-57
SLIDE 57

Strategy Stitching

                               Full game   Abstract game   “Turn” deals    “Turn” buckets
2-player Limit Texas Hold'em   ≈ 10^14     ≈ 10^9          ≈ 59,000,000    540,000
3-player Limit Texas Hold'em   ≈ 10^17     ≈ 10^9          ≈ 59,000,000    540

slide-58
SLIDE 58

Strategy Stitching

[Figure: 3-player Limit Texas Hold'em Abstract Game (540 “Turn” buckets) → Abstract Game Strategy → 3-player Limit Texas Hold'em Stitched Strategy (≈ 59,000,000 “Turn” deals)]

slide-59
SLIDE 59

Strategy Stitching

[Figure: the 3-player Limit Texas Hold'em Abstract Game Strategy (540 “Turn” buckets) is combined with 2-player Experts for 2-player Sub-games (540,000 “Turn” buckets) into a 3-player Limit Texas Hold'em Stitched Strategy (≈ 59,000,000 “Turn” deals).]

  • Generalizes 3 previous approaches [Gibson and Szafron, NIPS 2011]
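A rough sketch of how a stitched strategy might be queried: use a coarse base strategy by default, and switch to a finer-grained expert once play enters a sub-game that an expert covers. The class, the string encoding of histories, and the bucket numbers are all hypothetical; the slides do not describe the actual data structures.

```python
class StitchedStrategy:
    """Hypothetical lookup table combining a base strategy with experts."""

    def __init__(self, base, experts):
        # base: {(history, coarse_bucket): action distribution}
        self.base = base
        # experts: {sub-game root history: {(history, fine_bucket): distribution}}
        self.experts = experts

    def lookup(self, history, coarse_bucket, fine_bucket):
        """Use an expert if the history lies in its sub-game, else the base."""
        for root, expert in self.experts.items():
            if history.startswith(root):
                return expert[(history, fine_bucket)]
        return self.base[(history, coarse_bucket)]


# Toy data: "f" means the first player folded, leaving a 2-player sub-game.
base = {("", 3): {"c": 0.5, "r": 0.5}, ("c", 3): {"c": 0.9, "r": 0.1}}
experts = {"f": {("f", 417): {"c": 0.2, "r": 0.8}}}

stitched = StitchedStrategy(base, experts)
print(stitched.lookup("c", 3, 417))  # answered by the base strategy
print(stitched.lookup("f", 3, 417))  # answered by the 2-player expert
```

The design point is that the expensive, fine-grained abstraction is only spent on the sub-games where it is affordable, while the rest of the game falls back to the coarse strategy.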

slide-60
SLIDE 60

Strategy Stitching

[Figure: Extensive-Form Game → Abstraction 1, Abstraction 2, ..., Abstraction K → “Frankenstein” Abstract Game (New!) → Frankenstein-Game Strategy → Full Game Strategy]

[Gibson and Szafron, NIPS 2011]

slide-61
SLIDE 61

Strategy Stitching

[Figure: 3-player Limit Texas Hold'em → abstractions with 200 “Turn” buckets and 765,000 “Turn” buckets → “Frankenstein” Abstract Game → Frankenstein-Game Strategy → 3-player Texas Hold'em Strategy. New!]

slide-62
SLIDE 62

Strategy Stitching

Hyperborean3p Tournament

CFR 2-player Experts

slide-63
SLIDE 63

Poker Competition Results

  • 3-player Hold'em 2010 – 2012:
  • Over all competitions:

(34/35 top-3 finishes) 5 1 21 8 5

slide-64
SLIDE 64

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft AI, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-65
SLIDE 65

StarCraft AI

  • Real-time strategy game with

– Imperfect information
– Large state space
– Actions taken in real-time

  • Better AI can help game design

– Improved single player experience
– Game balance

slide-66
SLIDE 66

StarCraft AI Competitions

  • Annual StarCraft AI Competition at AIIDE

– Winner plays human professional
– AI currently no match for humans
– Poor high-level strategies

Image source: Flickr

slide-67
SLIDE 67

StarCraft AI Research

StarCraft

Abstract Game

Abstract Game Equilibrium Strategy

High-level StarCraft Strategy

?

CFR?

?

slide-68
SLIDE 68

Arcade Learning Environment (ALE)

  • Framework for developing AI agents for Atari 2600 games

– Simple, yet still challenging domains

  • Goal: One agent that plays many games well

– At the heart of artificial intelligence

  • Can aid game design

– Auto-detect glitches
– Evaluate difficulty

  • Future research ideas:

– Death detection in reinforcement learning

Image source: Wikipedia

slide-69
SLIDE 69

Automated Game Content Generation

  • Procedural methods for creating:

– Levels for a platforming game
– Music for different game contexts
– League schedules in a sports game, etc.

  • Benefits:

– More content for “free”
– Content tailor-made for individual players

  • Techniques:

– Constraint satisfaction + optimization
– I want to learn more!

Image source: Infinite Mario Bros. screenshot Image source: nhl.com
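As a toy illustration of constraint-driven level generation, the sketch below uses plain generate-and-test (a much simpler stand-in for a real constraint solver or optimizer): it samples random rows of ground and gap tiles and keeps the first one whose gaps are all jumpable. The tile encoding, the max_jump constraint, and every parameter value are assumptions made for the example.

```python
import random


def max_gap(level):
    """Length of the longest run of consecutive gap tiles ('.')."""
    longest = run = 0
    for tile in level:
        run = run + 1 if tile == "." else 0
        longest = max(longest, run)
    return longest


def is_playable(level, max_jump):
    """Constraints: solid start and end tiles, and no gap wider than a jump."""
    return level[0] == "#" and level[-1] == "#" and max_gap(level) <= max_jump


def generate_level(length, max_jump, rng, gap_prob=0.3, max_tries=10_000):
    """Sample random levels until one satisfies the constraints."""
    for _ in range(max_tries):
        level = "".join(
            "." if rng.random() < gap_prob else "#" for _ in range(length)
        )
        if is_playable(level, max_jump):
            return level
    raise RuntimeError("no playable level found")


print(generate_level(30, 2, random.Random(1)))
```

Tailoring content to a player could then amount to changing the constraints, for example lowering max_jump for a beginner.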

slide-70
SLIDE 70

Outline of Presentation

  • Computer Poker Primer

– Motivation
– Background

  • New Contributions to Computer Poker

– Research + Hyperborean3p

  • Future Research – AI in Video Games

– StarCraft AI, ALE, automated content generation

  • Teaching Interests

– Game design, AI in video games

slide-71
SLIDE 71

Teaching Interests

  • Games design / programming courses

– IAT 167, 265, 312, 410
– Experience as lab instructor for introductory programming course

  • Establish new courses in AI and games...
slide-72
SLIDE 72

AI in Games Courses

  • Introductory course

– Hands-on experience implementing real AI
– NPC behaviour, simple sports AI, etc.

  • Advanced / graduate course

– Exposure to research in the field:

  • Pathfinding
  • StarCraft AI
  • Interactive story-telling...

Image source: amazon.ca

slide-73
SLIDE 73

Conclusion

  • Computer poker research

– Primary author of Hyperborean3p
– Success in poker competitions

  • Video game research in StarCraft, ALE, automated game content generation

  • Interested in teaching game design and AI in games
slide-74
SLIDE 74

Thanks for Listening!

  • I'm really excited to be here!
  • Contact info:

– Email: rggibson@cs.ualberta.ca
– Website: http://cs.ualberta.ca/~rggibson/
– Twitter: @RichardGGibson

Clip art images used in this presentation can be found at clker.com