Reinforcement Learning in Board Games - PowerPoint PPT Presentation



SLIDE 1

Reinforcement Learning in Board Games

GEORGE TUCKER

SLIDE 2

Paper Background

"Reinforcement learning in board games" (Imran Ghory, 2004)

Surveys progress in the last decade
Suggests improvements
Formalizes key game properties
Develops a TD-learning game system

SLIDE 3

Why board games?

Regarded as a sign of intelligence and learning
Chess

Games as simplified models
Battleship

Existing methods of comparison
Rating systems

SLIDE 4

What is reinforcement learning?

After a sequence of actions, the agent receives a reward
Positive or negative

Temporal credit assignment problem
Determine credit for the reward
Temporal Difference methods
TD-lambda
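The TD-lambda idea above can be sketched as a tabular value update with eligibility traces, which spread credit for each prediction error back over earlier states. The trajectory format and hyperparameter values here are illustrative, not from the paper:

```python
# Minimal sketch of a tabular TD(lambda) update with eligibility traces.
# trajectory is a list of (state, reward-on-arrival) pairs.
def td_lambda_update(V, trajectory, alpha=0.1, gamma=1.0, lam=0.7):
    """Update value table V in place and return it."""
    e = {}  # eligibility traces: how much credit each visited state gets
    for t in range(len(trajectory) - 1):
        s, _ = trajectory[t]
        s_next, r = trajectory[t + 1]
        delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)  # TD error
        e[s] = e.get(s, 0.0) + 1.0
        for state in e:
            V[state] = V.get(state, 0.0) + alpha * delta * e[state]
            e[state] *= gamma * lam  # decay credit for older states
    return V
```

After one game ending in a reward of 1, states closer to the reward receive larger updates, which is exactly the temporal credit assignment the slide describes.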

SLIDE 5

History

Basics developed by Arthur Samuel
Checkers

Richard Sutton introduced TD-lambda
Gerald Tesauro created TD-Gammon

Chess and Go
Worse than conventional AI

SLIDE 6

History

Othello
Contradictory results

Substantial growth since then
TD-lambda has the potential to learn game variants

SLIDE 7

Conventional Strategies

Most methods use an evaluation function
Use minimax / alpha-beta search
Hand-designed feature detectors
The evaluation function is a weighted sum

So why TD learning?
Does not need hand-coded features
Generalization
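The weighted-sum evaluation function described above can be sketched as follows; the feature detectors and weights are hypothetical toy examples, not from the paper:

```python
# Minimal sketch of a linear (weighted-sum) board evaluation function,
# the conventional strategy the slide describes.
def evaluate(board, weights, feature_detectors):
    """Score a board as a weighted sum of hand-designed feature values."""
    return sum(w * f(board) for w, f in zip(weights, feature_detectors))

# Two toy hand-designed feature detectors over a generic board dict.
material = lambda board: board.get("my_pieces", 0) - board.get("opp_pieces", 0)
mobility = lambda board: board.get("my_moves", 0) - board.get("opp_moves", 0)

score = evaluate(
    {"my_pieces": 5, "opp_pieces": 3, "my_moves": 10, "opp_moves": 8},
    weights=[1.0, 0.1],
    feature_detectors=[material, mobility],
)  # 1.0 * 2 + 0.1 * 2
```

TD learning's appeal is precisely that the weights (and, with a neural network, the features themselves) need not be hand-tuned as they are here.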

SLIDE 8

Temporal Difference Learning

SLIDE 9

Temporal Difference Learning
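Slides 8 and 9 evidently carried the TD(lambda) equations, which did not survive extraction. For reference, the standard weight-update rule introduced by Sutton (a reconstruction, not necessarily the slides' own notation) is:

```latex
\Delta w_t = \alpha \left( V(s_{t+1}) - V(s_t) \right) \sum_{k=1}^{t} \lambda^{\,t-k} \, \nabla_w V(s_k)
```

where \(\alpha\) is the learning rate and \(\lambda\) controls how far credit for each prediction error propagates back to earlier positions.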

SLIDE 10

Disadvantage

Requires lots of training
Self-play

Short-term pathologies
Randomization

SLIDE 11

TD Algorithm Variants

TD-Leaf
Evaluation function search

TD-Directed
Minimax search

TD-Mu
Fixed opponent
Use evaluation function on opponent’s moves
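The TD-Leaf variant above trains the evaluation function toward the value of the leaf reached by a shallow minimax search, rather than the raw next-position value. A minimal sketch, assuming a hypothetical game interface (`legal_moves`, `apply`, `is_terminal`) and a toy counting game for illustration:

```python
# Sketch of the TD-Leaf target: back up the minimax value of the
# principal-variation leaf instead of the raw next-state evaluation.
def minimax_leaf_value(state, depth, evaluate, game, maximizing=True):
    if depth == 0 or game.is_terminal(state):
        return evaluate(state)
    values = [minimax_leaf_value(game.apply(state, m), depth - 1,
                                 evaluate, game, not maximizing)
              for m in game.legal_moves(state)]
    return max(values) if maximizing else min(values)

# Toy game for illustration only: states are integers, moves add 1 or 2,
# the game ends at 4 or more, and the evaluation is the state itself.
class ToyGame:
    def legal_moves(self, s): return [1, 2]
    def apply(self, s, m): return s + m
    def is_terminal(self, s): return s >= 4

target = minimax_leaf_value(0, 2, float, ToyGame())  # TD-Leaf training target
```

TD-Directed uses the same minimax machinery only for move selection, while the update target remains the plain next-state value.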

SLIDE 12

Current State

Many improvements
Sparse and dubious validation
Hard to check

Tuning weights
Nonlinear combinations
Differentiate between effective and ineffective

Automated evolution method of feature generation
Turian

SLIDE 13

Important Game Properties

Board smoothness
Capabilities tied to smoothness
Based on the board representation

Divergence rate
Measures how a single move changes the board
Backgammon and Chess: low to medium
Othello: high
Forced exploration

State space complexity
Longer training
Possibly the most important factor

SLIDE 14

Importance of State Space Complexity

SLIDE 15

Training Data

Random play
Limited use

Fixed opponent
Game environment and opponent are one

Database play
Speed

Self-play
No outside sources for data
Slow
Learns what works

Hybrid methods

SLIDE 16

Improvement: General

Reward size
Fixed value
Based on end board

Board encoding

When to learn?
Every move?
Random moves?

Repetitive learning
Board inversion
Batch learning

SLIDE 17

Improvement: Neural Network

Functions in Neural Network
Radial Basis Functions

Training algorithm
RPROP

Random weight initialization
Significance

SLIDE 18

Improvement: Self-play

Asymmetry
Game-tree + function approximator

Player handling
Tesauro adds an extra unit
Negate score (zero-sum game)
Reverse colors

Random moves
Algorithm

Informed final board evaluation
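The two player-handling options above, negating the score and reversing colors, coincide for a zero-sum evaluator. A minimal sketch under a toy +1/-1 board encoding (the encoding and evaluator are assumptions, not the paper's representation):

```python
# Sketch of self-play player handling under a zero-sum assumption:
# evaluate every position from the side to move, either by negating
# the score or by reversing piece colors.
# Toy encoding: +1 our piece, -1 opponent piece, 0 empty.
def reverse_colors(board):
    return [-cell for cell in board]

def score_for(player, board, evaluate):
    if player == +1:
        return evaluate(board)
    # Opponent's view: for an evaluator that is odd in this encoding,
    # reversing colors is equivalent to negating the zero-sum score.
    return evaluate(reverse_colors(board))
```

With a linear evaluator such as `sum`, `score_for(-1, b, sum)` equals `-score_for(+1, b, sum)`, which is why a single network can serve both sides in self-play.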

SLIDE 19

Evaluation

Tic-tac-toe and Connect 4
Amenable to TD-learning
Human board encoding is near optimal

Networks across multiple games

A general game player
Plays perfectly near the end game
Plays randomly otherwise
Random-decay handicap: % of moves are random
Common system
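The benchmark opponent above can be sketched as follows; the function names, the endgame-depth cutoff, and the assumed perfect endgame solver are all hypothetical, not from the paper:

```python
# Sketch of the random-decay handicap opponent: perfect play near the
# end of the game, otherwise random with a tunable fraction of moves.
import random

def handicap_move(state, legal_moves, perfect_move, moves_left,
                  random_fraction, endgame_depth=3):
    """Pick a move: perfect in the endgame, else random with given probability."""
    if moves_left <= endgame_depth:
        return perfect_move(state)          # assumed perfect endgame solver
    if random.random() < random_fraction:   # handicap: % of moves are random
        return random.choice(legal_moves)
    return perfect_move(state)
```

Varying `random_fraction` gives a graded scale of opponent strength, which is what makes this a usable common benchmarking system.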

SLIDE 20

Random Initializations

Significant impact on learning

SLIDE 21

Inverted Board

Speeds up initial training
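The inverted-board trick can be read as a data-augmentation step: each self-play position also yields a training example from the opponent's point of view. A minimal sketch under a zero-sum, +1/-1 board encoding (an assumption, not the paper's representation):

```python
# Sketch of inverted-board training data: flip piece colors and negate
# the final reward to get a second example per position (zero-sum).
def inverted_examples(positions, outcome):
    """positions: list of boards; outcome: final reward for player +1."""
    inverted = [[-cell for cell in board] for board in positions]
    originals = [(board, outcome) for board in positions]
    flipped = [(board, -outcome) for board in inverted]
    return originals + flipped
```

Doubling the examples per game in this way is one plausible reason the technique speeds up the early phase of training.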

SLIDE 22

Random Move Selection

More sophisticated techniques are required

SLIDE 23

Reversed Color Evaluation

SLIDE 24

Batch Learning

Similar to control

SLIDE 25

Repetitive learning

No advantage

SLIDE 26

Informed Final Board Evaluation

Extremely significant
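Informed final board evaluation replaces the network's own estimate of a terminal position with the known game result; a minimal sketch (the function names are hypothetical):

```python
# Sketch of informed final board evaluation: at terminal positions,
# train toward the exact game outcome instead of the learned estimate.
def td_target(board, is_final, predict, true_result):
    if is_final:
        return true_result(board)  # exact win/draw/loss score
    return predict(board)          # learned estimate elsewhere
```

Anchoring the end of each game to the true outcome gives every earlier TD update a reliable signal to propagate back from, consistent with the strong effect reported on the slide.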

SLIDE 27

Conclusion

Inverted boards and reversed-color evaluation
Initialization is important
Biased randomization techniques
Batch learning has promise
Informed final board evaluation is important