
SLIDE 1

Robust Strategies and Counter-Strategies: From Superhuman to Optimal Play

Mike Johanson
January 14, 2016 Grad Seminar

University of Alberta Computer Poker Research Group
michael.johanson@gmail.com | @mikebjohanson

SLIDES 2-7

Games as a testbed for Artificial Intelligence

Chinook (Checkers):

  • Surpassed humans in 1994
  • Solved (perfect play) in 2007

Deep Blue (Chess):

  • Surpassed humans in 1997

Watson (Jeopardy!):

  • Surpassed humans in 2011

Current challenges (not yet superhuman): Go, Atari 2600 games, General Game Playing, StarCraft, RoboCup, poker, curling (?!) and so on…

SLIDES 8-11

Games as a testbed for Artificial Intelligence

Babbage and Lovelace: Wanted “Games of Purely Intellectual Skill” to demonstrate their Analytical Engine. Chess, Tic-Tac-Toe. Horse racing?

Alan Turing: Wrote a chess program before the first computers, and ran it by hand. Chess as part of the Turing Test.

John von Neumann: Founded Game Theory to study rational decision making. Needing computational power to drive it, he became a pioneer in Computing Science.

Games as a testbed for Artificial Intelligence

SLIDES 12-15

Core idea in this line of research: We aspire to create agents that can achieve their goals in complex real-world domains. Games provide a series of well-defined and tractable domains that humans find challenging. New games introduce new challenges that current approaches can’t handle. This is a gradient we can follow. We can also play against humans, to compare Artificial Intelligence to Human Intelligence.

SLIDES 16-17

John von Neumann pioneered Game Theory. When asked about real life and chess, he said…

“Real life is not like that. Real life consists of bluffing, of little tactics of deception, of asking yourself what is the other man going to think I mean to do. And that is what games are about in my theory.”

SLIDES 18-22

Chess is a 2-player, deterministic, perfect information game, with win / lose / tie outcomes.

Poker:

  • 2-10 players at one table; thousands in tournaments.
  • Stochastic: cards are randomly dealt to the players and the table.
  • Imperfect information: the opponent’s cards are hidden.
  • Maximize winnings by exploiting opponent errors.

SLIDES 23-28

My Research and This Grad Seminar (2008: PhD Start to 2015: PhD End)

Topic: Computing strong strategies in Imperfect Information Games.

Two key milestones in 2-player limit hold’em poker:

  • First computer victory over human poker pros.
  • Game solved (Solving Attempts #1, #2, #3…): the computer is now optimal. >= everyone, forever.

Note: I’ll be very high-level in this talk. This is a summary of 7 papers in my thesis, and 7 more not in my thesis. Ask questions!

SLIDE 29

Superhuman Play: The Abstraction-Solving-Translation Procedure. This is how we beat the pros in 2008. First used in poker by Shi and Littman in 2002. Still the dominant approach in large games.

SLIDES 30-32

Terminology:

Strategy: A policy for playing a game. At every decision, a probability distribution over actions.

Best Response: A strategy that maximizes utility against a specific target strategy.

Nash Equilibrium: A strategy for every player, all mutually best responses to the others. In a 2-player zero-sum game, it’s guaranteed to do no worse than tie.
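These definitions can be made concrete in a tiny zero-sum game. A minimal sketch (my illustration, not part of the talk) using rock-paper-scissors:

```python
# Hypothetical illustration: best response and Nash equilibrium in
# rock-paper-scissors, a tiny 2-player zero-sum game.

# Payoff to the row player; rows/cols are (Rock, Paper, Scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def action_values(opponent_strategy):
    """Expected payoff of each of our actions vs. a fixed opponent strategy."""
    return [sum(PAYOFF[a][b] * p for b, p in enumerate(opponent_strategy))
            for a in range(3)]

def best_response_value(opponent_strategy):
    """A best response maximizes utility against a specific target strategy."""
    return max(action_values(opponent_strategy))

# Against a biased opponent, a best response wins on average (always Paper):
assert best_response_value([0.5, 0.25, 0.25]) > 0

# Against the uniform equilibrium strategy, no response does better than tie:
assert abs(best_response_value([1/3, 1/3, 1/3])) < 1e-12
```

The second assertion is exactly the equilibrium guarantee above: against the equilibrium strategy, even a perfect counter-strategy can do no better than tie.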

SLIDES 33-35

Game (10^14 decisions) -> [AI] -> Strategy -> Evaluation

Solve the game by computing a Nash Equilibrium. (Opponent Modelling comes later.)

Evaluation:

  • EV against humans and other programs.
  • Exploitability by Best Response. Exploitability: expected loss against a best response. Intractable to compute until 2011.

SLIDE 36

The AI Step: Counterfactual Regret Minimization (CFR)

  1. Start with a uniform random strategy.
  2. Repeatedly play it against itself.
     2a. Update: at each decision, use the historically best actions more often (minimizing regret).
  3. The average strategy converges towards a Nash equilibrium.
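The steps above can be sketched with regret matching in a matrix game. This is an illustration only, not the tournament CFR code; one player's regrets are seeded slightly so the dynamics are non-trivial:

```python
# Minimal regret-matching self-play sketch (illustrative only): in
# rock-paper-scissors, the average strategy approaches the uniform
# Nash equilibrium.

# Payoff to a player choosing the row action against the column action.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def current_strategy(regrets):
    """Step 2a: play actions in proportion to positive regret (uniform if none)."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1/3, 1/3, 1/3]

regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # small seed breaks symmetry
strategy_sum = [[0.0] * 3, [0.0] * 3]

for _ in range(50000):                          # step 2: repeated self-play
    strats = [current_strategy(regrets[p]) for p in (0, 1)]
    for p in (0, 1):
        opp = strats[1 - p]
        values = [sum(PAYOFF[a][b] * q for b, q in enumerate(opp))
                  for a in range(3)]
        ev = sum(v * s for v, s in zip(values, strats[p]))
        for a in range(3):
            regrets[p][a] += values[a] - ev     # accumulate regret
            strategy_sum[p][a] += strats[p][a]

# Step 3: the normalized average strategy is close to uniform (1/3, 1/3, 1/3).
avg = [s / 50000 for s in strategy_sum[0]]
assert all(abs(p - 1/3) < 0.05 for p in avg)
```

The current strategy cycles forever; it is the *average* strategy that converges, which is why step 3 averages rather than taking the final iterate.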

SLIDE 37

The AI Step: Counterfactual Regret Minimization (CFR)

  • Memory cost: 2 doubles per action at each decision point (16 bytes).
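A quick back-of-envelope check, assuming the per-action cost above, shows where the storage figure quoted on later slides comes from:

```python
# Back-of-envelope memory cost: 2 doubles (8 bytes each) per action.
BYTES_PER_ACTION = 2 * 8        # one regret entry + one average-strategy entry
ACTIONS = 3.6e13                # actions in 2-player limit hold'em (from slides)

total_bytes = ACTIONS * BYTES_PER_ACTION
terabytes = total_bytes / 2**40 # binary terabytes

assert int(terabytes) == 523    # matches the "523 TB" figure on later slides
```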

SLIDES 38-40

Real Game (10^14 decisions) -> [AI] -> Real Strategy -> Evaluation

Problem: the game has 3.6 × 10^13 actions. At 16 bytes each… 523 TB of storage and ~10,000 CPU-years of runtime. :( :( :(

SLIDE 41

Real Game (10^14 decisions) -> Abstract Game (10^10 decisions)

Workaround: cluster similar decisions together. Lossy.

SLIDE 42

Real Game (10^14 decisions) -> Abstract Game (10^10 decisions)

Using k-means to cluster billions of poker hands into 100k - 1M centroids.
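As a toy stand-in for that computation, here is a minimal 1-D k-means sketch. The hand-strength values are hypothetical; the real abstraction clusters billions of hands over richer features:

```python
import random

# Toy abstraction step: k-means over a 1-D hand-strength feature
# (illustrative only; invented values, not real poker data).

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment: each point joins its nearest centroid's bucket.
        buckets = [[] for _ in range(k)]
        for x in points:
            i = min(range(k), key=lambda c: abs(x - centroids[c]))
            buckets[i].append(x)
        # Update: move each centroid to its bucket's mean.
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    return centroids

# Hand strengths fall into 2 buckets: weak hands and strong hands.
strengths = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
c = sorted(kmeans(strengths, 2))
assert c[0] < 0.2 and c[1] > 0.7   # similar decisions merged together
```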

SLIDE 43

Abstract Game (10^10 decisions) -> [AI] -> Abstract Strategy

Solve the small game.

SLIDE 44

Abstract Strategy -> Real Game: use the small strategy to act in the real game.

NOTE: Not optimal! Lossy abstraction!
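The translation step amounts to a bucket lookup. A hypothetical sketch (the bucket names and probabilities below are invented for illustration):

```python
# Translation sketch: a real-game decision is mapped to its abstract
# bucket, and the abstract strategy's probabilities are used directly.
# Bucket names and numbers are hypothetical.

# Abstract strategy: bucket -> probability over (fold, call, raise).
abstract_strategy = {
    "weak":   {"fold": 0.6, "call": 0.3, "raise": 0.1},
    "strong": {"fold": 0.0, "call": 0.4, "raise": 0.6},
}

def bucket_of(hand_strength):
    """Abstraction: merge all hands into two coarse buckets (lossy!)."""
    return "strong" if hand_strength > 0.5 else "weak"

def act(hand_strength):
    return abstract_strategy[bucket_of(hand_strength)]

# Two different real hands translate to the same abstract decision:
assert act(0.55) == act(0.99)   # information lost by merging decisions
```

The assertion is exactly the "Not optimal!" caveat: distinct real-game situations collapse to one abstract decision, so the translated strategy cannot distinguish them.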

SLIDE 45

Optional: an Opponent Model can use info about the adversary, or adapt online, to increase winnings. In the thesis, not in this talk.

SLIDES 46-48

Intuition:

Using abstraction limits the strategy’s strength: merging decisions together loses information.

Bigger (finer-grained) and better (feature-preserving) abstractions -> better strategies: win more, less exploitable.

Better computers and better algorithms -> can solve bigger abstractions -> better strategies.

SLIDE 49

Abstraction-Solving-Translation was enough to beat top human pros. In retrospect, it was easy: ~8 GB of RAM, a few CPU-days, and fairly small abstractions, too! 2007: narrow loss, 4 GB strategy. 2008: narrow win, 8 GB strategy. In 2011, we discovered that these strategies were VERY exploitable.

SLIDES 50-51

Solving Attempt #1 (2008-2011): The Man-vs-Machine strategies were beatable, but small. At the time, we thought: to be optimal, maybe we just have to solve a big enough abstraction! If we can reduce exploitability to “1 milli-big-blind”, then it’s essentially solved. Close enough - justification later in this talk.

SLIDE 52

In 2011, we wrote a fast algorithm for finding perfect real-game counter-strategies (IJCAI 2011). For the first time, we could measure exploitability! We turned a 10 CPU-year computation into a 76 CPU-day computation: 1 day on the cluster.

SLIDE 53

[Plot: Exploitability (mb/g) vs abstraction size (millions of decisions) for 2007-ACPC, 2007-MVM, 2008-ACPC, 2008-MVM, and 2010-ACPC.]

Looking back at 5 years of progress!

SLIDE 54

[Plot: Exploitability (mb/g) vs abstraction size for several abstraction techniques: IR Public k-Means, IR k-Means, IR Public Perc. E[HS2], PR Perc. E[HS2].]

This was worrying… Flattening out already? We’ll just solve a big enough abstraction!

SLIDES 55-56

…But here’s the overfitting effect:

[Plot: exploitability vs CFR iterations, in the abstract game and the real game. Abstract exploitability keeps falling, while real exploitability levels off.]

SLIDE 57

So: we’re far from solved, and we have a serious problem! But we’re stuck with abstraction. Can a different algorithm avoid overfitting?

SLIDE 58

Solving Attempt #2 (2012): We’ll solve a really big abstraction, but properly, so we don’t overfit.

SLIDES 59-60

We’re solving a 2-player game. If both players use abstraction, we overfit. What if one player uses abstraction, and their opponent doesn’t? By definition, the abstracted player minimizes exploitability!

SLIDES 61-62

CFR-BR (AAAI 2012): Normally, even one unabstracted player would cost 262 TB of memory. But we can do it without that much… The 76-day best response computation does that! Maybe if we run that in a loop… and use sampling tricks to avoid the time cost… it’s feasible!
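The spirit of the idea (a regret-minimizing learner facing an exact best responder) can be shown in a matrix game. A toy analogue, not the actual CFR-BR algorithm:

```python
# Toy analogue of CFR-BR: one player updates with regret matching while
# the opponent always plays an exact best response. The learner's average
# strategy still approaches the equilibrium. (Illustration only.)

# Row player's payoffs in matching pennies; equilibrium is (0.5, 0.5).
PAYOFF = [[1, -1],
          [-1, 1]]

def rm_strategy(regrets):
    pos = [max(r, 0.0) for r in regrets]
    t = sum(pos)
    return [p / t for p in pos] if t > 0 else [0.5, 0.5]

regrets = [0.0, 0.0]
strategy_sum = [0.0, 0.0]
for _ in range(20000):
    s = rm_strategy(regrets)
    # Opponent: exact best response (column minimizing our expected payoff).
    col_values = [sum(PAYOFF[a][b] * s[a] for a in range(2)) for b in range(2)]
    b = min(range(2), key=lambda j: col_values[j])
    ev = col_values[b]
    for a in range(2):
        regrets[a] += PAYOFF[a][b] - ev
        strategy_sum[a] += s[a]

avg = [x / sum(strategy_sum) for x in strategy_sum]
assert all(abs(p - 0.5) < 0.05 for p in avg)   # near the 50/50 equilibrium
```

Because the opponent is always a true best response, the learner's exploitability is measured directly during solving rather than only in an abstract game, which is the property that removes the overfitting.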

SLIDE 63

[Plot: CFR-BR, 2 GB abstraction. Exploitability vs CPU-seconds: CFR plateaus at 289.253 mb/g, while CFR-BR reaches 60.687 mb/g.]

Promising results! CFR-BR has no overfitting, and is far less exploitable! A small abstraction, but it beat all previous strategies!

SLIDE 64

In a big strategy (225 GB to solve), we got closer to optimal than ever before.

[Plot: CFR-BR, 225 GB abstraction. Exploitability vs CPU-seconds for Hyperborean 2011.IRO, CFR-BR Average, and CFR-BR Current, with 53.7929 and 37.170 mb/g marked.]

SLIDE 65

However, CFR-BR lost in actual games. Assuming the opponent is stronger -> too pessimistic!

[Plot: CFR-BR, 2 GB abstraction. One-on-one performance (mbb/g) vs time (CPU-seconds) for CFR and CFR-BR, with CFR-BR losing to CFR by about 17.1 mbb/g.]

SLIDE 66

And still wasn’t getting low enough:

[Plot: CFR-BR, 2 TB abstraction (Canon-Canon-Canon-OCHS-7200 on a UV1000, 1024 ccNUMA cores). Exploitability (mbb/g) over time: 69.7, 49.0, 41.4, 36.0, 32.4, 30.0, 28.7, 27.9, 27.2, 26.6.]

SLIDES 67-69

That last strategy was computed on “Hungabee”, an SGI UV 1000 in GSB: 16 TB of memory, 2048 cores. Water cooling, with the heat dumped to the river.

[Photos: the North Saskatchewan River on a -10C day; program output.]

SLIDE 70

Solving Attempt #3 (2013): CFR-D: We’ll avoid the memory cost by solving game fragments as needed. Watch for this in Neil Burch’s upcoming thesis! ~16 GB instead of 523 TB of storage… Flaw: a massive increase in CPU time required.

SLIDE 71

Finally: Heads-Up Limit Texas Hold’em is Solved. Science, 2015.

SLIDE 72

Real Game (10^14 decisions) -> Abstract Game (10^10 decisions) -> Abstract Strategy -> Evaluation (EV against humans and other programs; exploitability by perfect counter-strategy).

Hm. Abstraction is a dead end for perfection. Was solving the real game directly really infeasible? Old predictions: Memory: 523 TB. CPU: ~10k years.

SLIDES 73-74

In October 2013, our coauthor Oskari Tammelin contacted us with two ideas:

  1. Poker-specific data compression. 523 TB -> 17 TB.
  2. CFR+. A new (at that time theoretically unproven) variant that converges amazingly quickly. Key change: floor regret values at zero.
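That key change can be shown in isolation. A sketch of the cumulative regret update, with plain CFR alongside the CFR+ floor (not the full distributed solver):

```python
# The CFR+ key change in isolation: plain CFR accumulates regret freely,
# while CFR+ floors the cumulative regret at zero after every update.

def update_cfr(cum_regret, instant_regret):
    return cum_regret + instant_regret               # plain CFR

def update_cfr_plus(cum_regret, instant_regret):
    return max(cum_regret + instant_regret, 0.0)     # CFR+: floor at zero

# An action that looked bad for a long time no longer needs many good
# iterations to "dig out" of a large negative regret total:
r, r_plus = 0.0, 0.0
for instant in [-5.0, -5.0, -5.0, 2.0]:
    r = update_cfr(r, instant)
    r_plus = update_cfr_plus(r_plus, instant)

assert r == -13.0     # plain CFR: still deeply negative
assert r_plus == 2.0  # CFR+: immediately positive, so the action is replayed
```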

SLIDES 75-76

The third piece: massive resources from Compute Canada. From our earlier attempts, we had experience with large distributed programs. We used 200 nodes of the “Mammouth” cluster in Quebec, with 24 cores per node: 4800 cores. Each node had 32 GB of RAM and 1 TB of local disk, and handled a set of subgames. Solve with massive parallelism.

SLIDE 77

One last wrinkle: Essentially Solving a Game. Our algorithms converge towards optimal play in the limit. “Solved” means unbeatable; we can only approximate it. So how close is “close enough”?

SLIDES 78-80

One last wrinkle: Essentially Solving a Game

What if a human lifetime of play wasn’t enough for someone to claim to beat our program?

(200 games/hour) * (12 hours/day) * (365 days/year) * (70 years) = 60 million games.

That isn’t enough to discern “1 milli-big-blind” of exploitability with 95% confidence. So that’s our goal.
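The arithmetic behind that claim can be checked roughly. The per-game standard deviation here is my assumption (~5 big blinds/game is a commonly used ballpark for heads-up limit hold'em), not a figure from the talk:

```python
import math

# Rough check: can 60 million games discern 1 mbb/g at 95% confidence?
# Assumed: per-game standard deviation of ~5 bb/game (5000 mbb/game).
games = 200 * 12 * 365 * 70           # a 70-year human lifetime of play
sigma_mbb = 5_000                     # assumed std dev, in milli-big-blinds

std_error = sigma_mbb / math.sqrt(games)
ci_95_half_width = 1.96 * std_error   # 95% confidence half-width

assert games > 60_000_000             # roughly 61 million games
assert ci_95_half_width > 1.0         # wider than 1 mbb/g: not discernible
```

Under this assumption, a lifetime of play yields a confidence interval wider than 1 mbb/g, so a strategy within that bound is indistinguishable from perfect over a human lifetime.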

SLIDES 81-82

[Plot: Hold’em: CFR+ Exploitability over Days. Exploitability (mbb/g) vs time (calendar days), on a log scale, dropping below 1 mbb/g by day 70.]

After 70 days (900 CPU-years), we reached 0.986 mbb/g. Essentially solved.

SLIDE 83

[Strategy visualizations: Game Start; After a Raise.]

Play against it, inspect the strategy, and download the code: http://poker.srv.ualberta.ca

SLIDE 84

Conclusion:

2008: PhD Start -> Solving Attempts #1, #2, #3… -> 2015: PhD End
First computer victory over human poker pros. Game solved. The computer is now optimal.

  • My research spanned the end-to-end task of Abstraction-Solving-Translation.
  • Much easier to surpass humans than to be perfect!
  • A general set of tools: applicable to other games, and outside the games domain entirely.