Recent Advances in Computer Poker and Future Research for Artificial Intelligence in Video Games
Richard Gibson
SIAT Faculty Search Presentation
February 28, 2013

One Slide Summary
● 2009 – 2013: Computer Poker Research
● Future: AI in Video Games
[Image sources: co-optimus.com, arcadelearningenvironment.org]

Outline of Presentation
● Computer Poker Primer
– Motivation
– Background
● New Contributions to Computer Poker
– Research + Hyperborean3p
● Future Research
– AI in Video Games
– StarCraft AI, ALE, automated content generation
● Teaching Interests
– Game design, AI in video games

Why Poker Research?
● Classic games, such as chess and checkers, are:
– Deterministic
– Binary outcomes (+ draw)
– Perfect Information
[Image sources: spectrum.ieee.org, Wikipedia]

Why Poker Research?
● However, poker is a game with:
– Stochastic elements [Figure: many possible flops; image sources: Wikipedia]
– Varying outcomes [Figure: Pot 1, Pot 2, Pot 3 of different sizes; image source: ebaumsworld.com]
– Imperfect information [Figure: opponents' cards shown as “?”]

Why Poker Research?
● Poker research is applicable in other areas:
– Airport security [Pita et al., AI Magazine 2009]
– Adaptive treatment strategies [Chen and Bowling, NIPS 2012]
– Sequential auctions [?]

Poker Research Background
● Model poker as an extensive-form game.
● Information sets: Sets of states a player cannot distinguish between.
[Figure: game tree. A chance node (c) deals QJ or QK with probability 0.5 each; players 1 and 2 then act in turn, with actions labelled c, b, and f; the leaves carry payoffs between -2 and +2.]

Poker Research Background
● Example: Kuhn Poker
[Animation: each player holds a card the opponent cannot see (“?”). One player bets (“Bet!”); the other weighs “Fold?” against “Call?” and calls (“Call.”). At the showdown one player wins 2 (“Win!”, +2) and the other loses 2 (“Lose.”, -2).]
[Figure: the corresponding game tree, as on the previous slide.]

Poker Research Background
Extensive-Form Game → Strategy Profile
● A strategy profile maps each information set to a probability distribution over actions.
[Figure: the game tree with a probability on every action, e.g. 0.6/0.4 for player 1's first decision and 0.8/0.2, 1/0, 0/1 for player 2's decisions.]
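The mapping described on this slide can be sketched as a plain dictionary. This is only an illustration: the information-set labels (player, private card, betting history) and all probabilities below are hypothetical and not taken from the talk. Note that player 1 holding Q has a single entry whether the opponent holds J or K, because those two states belong to the same information set.

```python
import random

# Hypothetical information-set labels: (player, private card, betting history).
# The probabilities are illustrative only, not a solved strategy.
strategy_profile = {
    (1, "Q", ""):   {"c": 0.6, "b": 0.4},  # same entry whether player 2 holds J or K
    (2, "J", "c"):  {"c": 0.8, "b": 0.2},
    (2, "J", "b"):  {"f": 1.0, "c": 0.0},
    (2, "K", "c"):  {"c": 0.0, "b": 1.0},
    (2, "K", "b"):  {"f": 0.0, "c": 1.0},
    (1, "Q", "cb"): {"f": 0.7, "c": 0.3},
}

def is_valid(profile, tol=1e-9):
    """Check that every information set maps to a probability distribution."""
    return all(
        all(p >= 0 for p in dist.values()) and abs(sum(dist.values()) - 1.0) < tol
        for dist in profile.values()
    )

def sample_action(profile, infoset, rng=random):
    """Draw one action at an information set according to the profile."""
    dist = profile[infoset]
    actions = list(dist)
    return rng.choices(actions, weights=[dist[a] for a in actions])[0]
```

Keying the table by information set rather than by game state is what keeps a player from conditioning on cards they cannot see.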

Poker Research Background
● What type of strategy profile do we want?
– Nash equilibrium
● Example: Rock-Paper-Scissors
[Figure: Rock-Paper-Scissors game tree. Player 1 chooses r, p, or s; player 2 chooses r, p, or s without seeing that choice. Player 1's payoffs: r against r/p/s gives 0, -1, +1; p gives +1, 0, -1; s gives -1, +1, 0.]

Poker Research Background
● A Nash equilibrium strategy profile for Rock-Paper-Scissors: both players choose r, p, and s with probability 1/3 each.
– “No one can change their strategy and do better.”

Poker Research Background
● A Nash equilibrium in a 2-player zero-sum game is a defensive strategy:
– “I can't lose no matter what my opponent does.” (In Rock-Paper-Scissors: no worse than break even in expectation.)
[Figure: player 1 plays 1/3, 1/3, 1/3; player 2's probabilities are unknown (“?”).]
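The “defensive” property can be checked numerically for Rock-Paper-Scissors. The sketch below uses the payoffs from the slide; the function names and the `biased` comparison strategy are made up for illustration. It computes the worst expected payoff a strategy can be held to by any opponent response.

```python
from fractions import Fraction

ACTIONS = ["r", "p", "s"]
# Player 1's payoff for (player 1 action, player 2 action); player 2 gets the negation.
PAYOFF = {
    ("r", "r"): 0, ("r", "p"): -1, ("r", "s"): 1,
    ("p", "r"): 1, ("p", "p"): 0, ("p", "s"): -1,
    ("s", "r"): -1, ("s", "p"): 1, ("s", "s"): 0,
}

def expected_value(sigma1, sigma2):
    """Player 1's expected payoff when both players mix independently."""
    return sum(sigma1[a] * sigma2[b] * PAYOFF[(a, b)] for a in ACTIONS for b in ACTIONS)

def worst_case_value(sigma1):
    """Lowest expected payoff player 2 can hold player 1 to.

    Checking pure responses is enough: a mixed response is an average of them.
    """
    return min(sum(sigma1[a] * PAYOFF[(a, b)] for a in ACTIONS) for b in ACTIONS)

uniform = {a: Fraction(1, 3) for a in ACTIONS}
# Hypothetical non-equilibrium strategy that plays rock too often.
biased = {"r": Fraction(1, 2), "p": Fraction(1, 4), "s": Fraction(1, 4)}

if __name__ == "__main__":
    print("uniform worst case:", worst_case_value(uniform))
    print("biased worst case:", worst_case_value(biased))
```

The uniform strategy guarantees an expected payoff of 0 against every response, while the rock-heavy strategy can be exploited by an opponent who always plays paper.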

Poker Research Background
Extensive-Form Game → ? → Nash Equilibrium Strategy Profile

Poker Research Background
● Use minimax (alpha-beta) search to compute Nash?
– Not directly: minimax search assumes perfect information, whereas here a player cannot tell apart the states within an information set.
[Image source: clker.com]
[Figure: the game tree annotated with strategy probabilities.]

Poker Research Background
● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007].
– Iteration 1: deal cards, “play” poker with Strategy Profile 1.
– Iteration 2: deal cards, “play” poker with Strategy Profile 2.
– ...
– Iteration T: deal cards, “play” poker with Strategy Profile T (T in the billions).
● Average Strategy Profile = (Strategy 1 + Strategy 2 + ... + Strategy T) / T
● As T → ∞, the Average Strategy Profile approaches a Nash Equilibrium Strategy Profile (in 2-player zero-sum games).
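The averaging idea can be illustrated on Rock-Paper-Scissors with regret matching, the update rule CFR applies at each information set. This is a simplified sketch, not full CFR: the game has a single decision per player, there are no card deals, and exact expected values are used instead of sampled play. The starting regrets (a bias towards rock) and the iteration count are arbitrary choices made for the example.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # player 1's payoff, PAYOFF[a1][a2]

def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS

def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run regret-matching self-play and return both players' average strategies."""
    regrets = [list(initial_regrets), list(initial_regrets)]
    strategy_sums = [[0.0] * ACTIONS, [0.0] * ACTIONS]
    for _ in range(iterations):
        sigma = [regret_matching(regrets[0]), regret_matching(regrets[1])]
        # Expected payoff of each pure action against the opponent's current strategy.
        values = [
            [sum(PAYOFF[a][b] * sigma[1][b] for b in range(ACTIONS)) for a in range(ACTIONS)],
            [sum(-PAYOFF[a][b] * sigma[0][a] for a in range(ACTIONS)) for b in range(ACTIONS)],
        ]
        for player in (0, 1):
            expected = sum(sigma[player][a] * values[player][a] for a in range(ACTIONS))
            for a in range(ACTIONS):
                regrets[player][a] += values[player][a] - expected
                strategy_sums[player][a] += sigma[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]

if __name__ == "__main__":
    for player, average in enumerate(self_play(20000), start=1):
        print(f"player {player} average strategy:", [round(p, 3) for p in average])
```

The individual strategies keep cycling between actions; it is the average over iterations that settles near 1/3 for each action, which is why the slide takes the average rather than the final strategy.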

Poker Research Background
Extensive-Form Game → CFR → Nash Equilibrium Strategy Profile

Poker Research Background
● Huge problem (no pun intended):
Texas Hold'em (>10^14) → CFR → Nash Equilibrium Strategy Profile (> 5 million GB)

Poker Research Background
Extensive-Form Game → ? → Nash Equilibrium Strategy Profile

Poker Research Background
● Merge card deals into buckets.
Extensive-Form Game (>10^14) → Abstract Game (≈10^9)

Poker Research Background
Abstract Game (≈10^9) → CFR (deal buckets, “play” abstract “poker” billions of times) → Abstract Game Equilibrium Strategy Profile

Poker Research Background
Abstract Game Equilibrium Strategy (≈100 GB) → Approximate Full Game Equilibrium Strategy
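Bucketing can be sketched as follows. The slides do not say how buckets are formed, so this example assumes a made-up strength score per deal and groups deals into equal-sized buckets by rank; the card labels, scores, and abstract strategy are all hypothetical.

```python
def assign_buckets(strengths, num_buckets):
    """Map each deal to a bucket index by the rank of its strength score."""
    ordered = sorted(strengths, key=strengths.get)
    return {deal: i * num_buckets // len(ordered) for i, deal in enumerate(ordered)}

# Toy "deals": one private card each, with a hypothetical strength score in [0, 1].
strengths = {
    "2": 0.05, "5": 0.20, "7": 0.35, "9": 0.50,
    "J": 0.65, "Q": 0.80, "K": 0.90, "A": 0.99,
}
buckets = assign_buckets(strengths, num_buckets=3)

# A strategy computed in the abstract game is stored per bucket, not per deal
# (illustrative probabilities for a single fold/call decision).
abstract_strategy = {
    0: {"f": 0.9, "c": 0.1},
    1: {"f": 0.4, "c": 0.6},
    2: {"f": 0.0, "c": 1.0},
}

def full_game_strategy(deal):
    """Play the full game by looking up the strategy of the deal's bucket."""
    return abstract_strategy[buckets[deal]]

if __name__ == "__main__":
    print(buckets)
    print("strategy with K:", full_game_strategy("K"))
```

The strategy table shrinks from one entry per deal to one entry per bucket, at the cost of playing every deal in a bucket identically.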

Contribution 1: Domination
[Figure: a decision node for player 2 with actions f (+1) and c (+2).]

Domination
3-or-more Player Abstract Game → CFR → ? (Not equilibrium)

Domination
Annual Computer Poker Competition, 3-Player Limit Texas Hold'em, 2009

Agent            Total Bankroll (mbb/g)
Hyperborean3p     319 ± 2
dpp               171 ± 2
akuma             151 ± 2
CMURingLimit      -37 ± 2
dcu3pl            -63 ± 2
Bluechip         -548 ± 2

3-or-more Player Abstract Game → CFR → ? (Not equilibrium)

Domination
[Figure: the poker game tree from the background slides, shown in four steps: (1) the full tree; (2) the same tree with “Dominated Strategies” highlighted; (3) the tree with those dominated actions removed; (4) the reduced tree with an “Iteratively Dominated Strategy” highlighted.]
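A strategy is strictly dominated when some other strategy does strictly better against every opponent choice, and iteratively dominated when it becomes dominated only after other dominated strategies have been removed. The sketch below illustrates this on a small two-player matrix game rather than on the poker tree from the slides; the payoff matrices are an illustrative example, and only domination by pure strategies is checked.

```python
def iterated_elimination(row_payoff, col_payoff):
    """Repeatedly remove pure strategies strictly dominated by another pure strategy.

    row_payoff[i][j] and col_payoff[i][j] are the row and column player's payoffs
    when the row player picks i and the column player picks j.
    Returns the indices of the surviving rows and columns.
    """
    rows = list(range(len(row_payoff)))
    cols = list(range(len(row_payoff[0])))
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            # r is dominated if some other surviving row beats it in every surviving column.
            if any(all(row_payoff[o][c] > row_payoff[r][c] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            # c is dominated if some other surviving column beats it in every surviving row.
            if any(all(col_payoff[r][o] > col_payoff[r][c] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Rows: Up, Down. Columns: Left, Middle, Right.
row_payoff = [[1, 1, 0],
              [0, 0, 2]]
col_payoff = [[0, 2, 1],
              [3, 1, 0]]

if __name__ == "__main__":
    print(iterated_elimination(row_payoff, col_payoff))
```

Here Right is dominated by Middle from the start; Down is not dominated in the full game but becomes dominated by Up once Right is gone, and Left then falls to Middle, leaving only (Up, Middle).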

Domination
New! [G., submitted to EC 2013]
● 3-or-more Player Abstract Game → CFR → Average Strategy Profile
– As T → ∞: No Iteratively Dominated Strategies
● 3-or-more Player Abstract Game → CFR → “Current” Strategy Profile
– After finite T: No Iteratively Dominated Strategies

Domination 3-Player Limit Texas Hold'em - 2012 New! [G., submitted to EC 2013]

Contribution 2: Strategy Stitching

Strategy Stitching

Game                            Size     “Turn” Deals     Abstract Game Size   “Turn” Buckets
2-player Limit Texas Hold'em    ≈10^14   ≈ 59,000,000     ≈10^9                540,000
3-player Limit Texas Hold'em    ≈10^17   ≈ 59,000,000     ≈10^9                540
