OOP Inheri -tance En- capsu- lation Learning to Play Tetris via - - PowerPoint PPT Presentation

oop
SMART_READER_LITE
LIVE PREVIEW

OOP Inheri -tance En- capsu- lation Learning to Play Tetris via - - PowerPoint PPT Presentation

Poly- mor- phism Abstra ction Class OOP Inheri -tance En- capsu- lation Learning to Play Tetris via Kuan-Ting Lai 2020/5/25 Deep Reinforcement Learning Deep Reinforcment Learning (DRL) Establishing Communiation between Java and


slide-1
SLIDE 1

Learning to Play Tetris via Deep Reinforcement Learning

Kuan-Ting Lai 2020/5/25

OOP

Class Abstra ction Inheri

  • tance

En- capsu- lation Poly- mor- phism

slide-2
SLIDE 2

Deep Reinforcment Learning (DRL)

slide-3
SLIDE 3

Establishing Communiation between Java and Python

  • Remote Procedure Call (RPC)
  • MsgPack RPC

− MessagePack: It’s like JSON, but fast and small https://msgpack.org/index.html

Action State

slide-4
SLIDE 4

OpenAI Gym

  • https://gym.openai.com/
slide-5
SLIDE 5

Hello Gym Example

import gym env = gym.make("CartPole-v1")

  • bservation = env.reset()

for _ in range(1000): env.render() # your agent here (this takes random actions) action = env.action_space.sample()

  • bservation, reward, done, info = env.step(action)

if done:

  • bservation = env.reset()

env.close()

https://gym.openai.com/docs/

slide-6
SLIDE 6

step()

  • observation (state 𝑇𝑢):

− Observation of the environment. Ex: pixel data from a camera, joint angles and joint velocities of a robot, or the board state in a board game.

  • reward (𝑠

𝑢)

− amount of reward achieved by the previous action.

  • done:

− True indicates the episode has terminated

  • info:

− diagnostic information useful for debugging.

slide-7
SLIDE 7

Creating Tetris DRL Project

slide-8
SLIDE 8

Creating a Maven Project

  • New Project -> Maven

− GroupId: org.aiotlab − ArtifactId: Tetris4DRL

slide-9
SLIDE 9

Adding Dependencies to “pom.xml”

  • Add the dependencies of

msgpack-rpc-java to pom.xml

<dependencies dependencies> <dependency dependency> <groupId groupId>org.msgpack</groupId groupId> <artifactId artifactId>msgpack</artifactId artifactId> <version version>0.6.12</version version> </dependency dependency> <dependency dependency> <groupId groupId>org.jboss.netty</groupId groupId> <artifactId artifactId>netty</artifactId artifactId> <version version>3.2.1.Final</version version> <exclusions exclusions> <exclusion exclusion> <groupId groupId>javax.servlet</groupId groupId> <artifactId artifactId>servlet-api</artifactId artifactId> </exclusion exclusion> <exclusion exclusion> <groupId groupId>commons-logging</groupId groupId> <artifactId artifactId>commons-logging</artifactId artifactId> </exclusion exclusion> </exclusions exclusions> </dependency dependency> <dependency dependency> <groupId groupId>org.slf4j</groupId groupId> <artifactId artifactId>slf4j-api</artifactId artifactId> <version version>1.6.1</version version> </dependency dependency> <dependency dependency> <groupId groupId>org.slf4j</groupId groupId> <artifactId artifactId>slf4j-log4j12</artifactId artifactId> <version version>1.6.1</version version> </dependency dependency> </dependencies dependencies>

slide-10
SLIDE 10

Reimporting Maven

slide-11
SLIDE 11

MsgPack-RPC for Java

slide-12
SLIDE 12

Download and Compile msgpack-rpc-java

  • 1. Install Maven (link)
  • 2. Download msgpack-rpc-java

− https://github.com/msgpack-rpc/msgpack-rpc-java

  • 3. Compile source code

− mvn compile

  • 4. Create jar file

− mvn package

  • 5. Fix a bug “type optional is ambiguous” (link)
  • 6. Copy and add the library to IntelliJ project

− Add “target/msgpack-rpc-0.7.1-SNAPSHOT.jar” to our project

slide-13
SLIDE 13

Fixing Error “type optional is ambiguous”

  • https://github.com/msgpack-rpc/msgpack-rpc-java/pull/12
  • Issue: Maven can't compile the package because of ambiguous
  • ptional type in AnnotationTest
  • Solution: Explicitly Import org.msgpack.annotation.Optional
slide-14
SLIDE 14

Add “msgpack-rpc-x.x.x.jar”

  • Create a folder “lib”
  • Copy and paste “msgpack-rpc-0.7.1.jar” to

the lib/ folder

  • Right click the jar file, and select “Add as

Library…”

slide-15
SLIDE 15

Remove Timer in Board.java

private void private void initBoard(Tetris parent) { setFocusable(true true); /*timer = new Timer(); timer.scheduleAtFixedRate(new ScheduleTask(), INITIAL_DELAY, PERIOD_INTERVAL);*/ … } private void private void newPiece() { … if if (!tryMove(curPiece curPiece, curX curX, curY curY)) { curPiece curPiece.setShape(Tetrominoe.NoShape NoShape); //timer.cancel(); isStarted isStarted = false false; statusbar statusbar.setText("Game over" "Game over"); } }

slide-16
SLIDE 16

Create the RPC Server

public class public class TetrisServer { public static void public static void main(String[] args) { // Create a RPC server EventLoop loop = EventLoop.defaultEventLoop(); Server svr = new new Server(); svr.serve(new new TetrisServer()); try try { svr.listen(server_port); System.out

  • ut.println("Tetris RPC server is listening at "

"Tetris RPC server is listening at " + server_port); loop.join(); } catch catch (IOException exp1) { exp1.printStackTrace(); } catch catch (InterruptedException exp2) { exp2.printStackTrace(); } } }

slide-17
SLIDE 17

public class TetrisServer { static int server_port = 10612; static Tetris game; // OpenAI GYM API public int [] reset() { game.restart(); return game.getBoardState(); } public int [] step(int action_type) { return game.step(action_type); } public int getTotalReward() { return game.getLinesRemoved(); } public boolean isDone() { return game.isGameOver(); } public static void main(String[] args) { game = new Tetris(); game.setVisible(true); // Create a RPC server EventLoop loop = EventLoop.defaultEventLoop(); Server svr = new Server(); svr.serve(new TetrisServer()); try { svr.listen(server_port); System.out.println("Tetris RPC server is listening at " + server_port); loop.join(); } catch (IOException exp1) { exp1.printStackTrace(); } catch (InterruptedException exp2) { exp2.printStackTrace(); } } }

Add OpenAI Gym API

slide-18
SLIDE 18

getBoardState()

public public int int [] getBoardState() { // Get board state int int b_size = boar

  • ard.getBoardSize();

Shape.Tetrominoe[] b_data = boar board.getBoardState(); int int [] state = new int new int[b_size]; for for (int int i=0; i<b_size; i++) { state[i] = b_data[i].ordinal(); } return return state; }

slide-19
SLIDE 19

int [] step(int action_type)

public int [] step(int action_type) { if if (isGameOver()) return getBoardState(); switch(action_type) { case case 0: // Move left move(-1); break; case case 1: // Move right move(1); break; case case 2: // Rotate left rotate(false); break; case case 3: // Rotate right rotate(true); break; case case 4: // Drop dropDown(); break; } board.doGameCycle(); return getBoardState(); }

slide-20
SLIDE 20

MsgPack-RPC for Python

slide-21
SLIDE 21

Install msgpack-rpc

  • pip install msgpack-rpc-python
slide-22
SLIDE 22

import msgpackrpc; import random import time server_ip = "localhost" server_port = 10612 client = msgpackrpc.Client(msgpackrpc.Address(server_ip, server_port)) client.call('reset') while True: action = random.randint(0, 4) state = client.call('step', action) #print(state) reward = client.call('getTotalReward') done = client.call('isDone') if done: print('Game over. Reward {}'.format(reward)) client.call('reset')

Random Play

slide-23
SLIDE 23

# Implement OpenAI Gym interface import gym import msgpackrpc # install msgpack-rpc-python import numpy as np class TetrisGym(gym.Env): board_width = 10 board_height = 22 def __init__(self, ip='localhost', port=10612): self.action_space = gym.spaces.Discrete(5) self.observation_space = gym.spaces.Box( low=0, high=7, shape=(self.board_height*self.board_width, 1), dtype=np.float32); self.client = msgpackrpc.Client(msgpackrpc.Address(ip, port)) # initialize environment settings self.done = False def reset(self): return self.client.call('reset') def step(self, action):

  • bservation = self.client.call('step', action)

reward = self.client.call('getTotalReward') self.done = self.client.call('isDone') info = None return observation, reward, self.done, info;

Inheriting OpenAI Gym

slide-24
SLIDE 24

References

  • https://msgpack.org/index.html