
osu!mania Reinforcement Learning Agent



  1. osu!mania Reinforcement Learning Agent. Iason Chrysomallis (Χρυσομάλλης Ιάσων), ichrysomallis@isc.tuc.gr, 2014030078

  2. Contents: Introduction; osu!mania game; Graphical User Interface customization; Agent's environment; Approach and variable definition; Q-learning; Deep reinforcement learning; Future plans

  3. Introduction. Topic: develop an agent that learns to play the video game osu!mania through reinforcement learning. Two agents: a Q-learning agent and a deep reinforcement learning agent.

  4. osu!mania game. A rhythm game in which notes fall down the screen. Labeled screen elements: (1) single-tap notes, (2) hold notes, (3) judgment bar, (4) player keys, (5) combo, (6) hitburst, (7) overall accuracy, (8) score.

  5. Graphical User Interface customization. The environment is fully customizable; every element can be changed. Each element is painted in a solid color RGB = [X, 100, 100], where X corresponds to the element's identity (see the element numbers in slide 4).

  6. Agent's environment. Screenshots are recorded and translated into game information based on the RGB values above. Only a small fraction of the screen carries relevant information, so only specific boxes are recorded.
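
As an illustration of how a recorded pixel could be translated back into a game element, the sketch below checks the fixed green and blue values and looks up the red value. The slides only state that RGB = [X, 100, 100]; the concrete codes in `ELEMENT_BY_RED` and the function names are hypothetical, chosen here to follow the element numbering.

```python
# Hypothetical red-channel codes. The slides state only that each element
# is painted RGB = [X, 100, 100]; the exact X per element is assumed here.
ELEMENT_BY_RED = {
    1: "single-tap note",
    2: "hold note",
    3: "judgment bar",
}


def classify_pixel(rgb):
    """Return the element name for one pixel, or None if it is not a marker color."""
    red, green, blue = rgb
    # Marker colors always carry G = 100 and B = 100; anything else is background.
    if green != 100 or blue != 100:
        return None
    return ELEMENT_BY_RED.get(red)


def classify_column(pixels):
    """Classify every pixel of a recorded column, top to bottom."""
    return [classify_pixel(p) for p in pixels]
```

Because every marker shares the same green and blue values, the red channel alone identifies an element once the pixel is known to be a marker.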

  7. Approach and variable definition (1). Behavior is identical in every column, so the problem can be narrowed down to learning a single column. Agent's actions: (1) instantaneous key tap, (2) key press (no release), (3) key release, (4) do nothing.

  8. Approach and variable definition (2). Rewards: [figure not captured in transcript]. Epsilon: initial value = 1, decay value = 0.9977, minimum value = 0.01.
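
The epsilon values above can be read as a multiplicative decay schedule with a floor. The sketch below assumes the decay factor is applied once per episode, which the slides do not state; `decayed_epsilon` is an illustrative name.

```python
EPSILON_START = 1.0     # initial value from the slide
EPSILON_DECAY = 0.9977  # decay value from the slide
EPSILON_MIN = 0.01      # minimum value from the slide


def decayed_epsilon(steps):
    """Epsilon after `steps` decay steps (assumed: one multiplication per episode)."""
    return max(EPSILON_MIN, EPSILON_START * EPSILON_DECAY ** steps)
```

Under this assumption epsilon reaches its floor after roughly 2,000 decay steps, since ln(0.01) / ln(0.9977) is approximately 2000.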

  9. Approach and variable definition (3). State: one column of 200 pixels; only the red (R) channel; three possible values per pixel (no note, single-tap note, hold note). Deep reinforcement learning: raw input of the column. Q-learning: only 8 pixels, due to state complexity, taking one pixel every 15 pixels of the recorded column.
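
A minimal sketch of the Q-learning state reduction follows. Sampling every 15th pixel of a 200-pixel column yields 14 samples, so which 8 are kept is an assumption: this sketch keeps the first 8, starting at index 0. The value coding (0 = no note, 1 = single-tap note, 2 = hold note) and the name `q_state` are also illustrative.

```python
COLUMN_HEIGHT = 200  # pixels in the recorded column (from the slide)
STRIDE = 15          # one pixel every 15 pixels (from the slide)
N_SAMPLES = 8        # pixels kept for the Q-table state (from the slide)


def q_state(column):
    """Reduce a 200-value column to a hashable 8-value state.

    Assumption: sampling starts at index 0 and the first 8 samples are kept.
    """
    if len(column) != COLUMN_HEIGHT:
        raise ValueError("expected a column of 200 values")
    return tuple(column[::STRIDE][:N_SAMPLES])


# Example: a single-tap note at pixel 30 and a hold note at pixel 45.
column = [0] * COLUMN_HEIGHT
column[30] = 1
column[45] = 2
state = q_state(column)
```

With three possible values per pixel, 8 pixels give at most 3^8 = 6561 distinct states, which is small enough for a lookup table; the full 200-pixel column would not be.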

  10. Q-learning. Algorithm: [figure not captured in transcript]. Steps: receive the current state; choose an action based on epsilon; execute the action; receive the new state; check whether the song is over; update the Q-table.
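
The steps above can be sketched with the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)]. The slide's own algorithm figure is not available here, so this is the textbook rule rather than a confirmed copy of the author's version; the learning rate `ALPHA`, the discount `GAMMA`, and the function names are assumptions.

```python
import random
from collections import defaultdict

N_ACTIONS = 4  # tap, press, release, do nothing (slide 7)
ALPHA = 0.1    # learning rate: assumed, not given in the slides
GAMMA = 0.95   # discount factor: assumed, not given in the slides

# Q-table: maps a state (hashable tuple) to one value per action.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    values = q_table[state]
    return values.index(max(values))


def update(state, action, reward, next_state, done):
    """Apply one tabular Q-learning update for a single transition."""
    # No future value is bootstrapped once the song is over.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (target - q_table[state][action])
```

The `done` flag corresponds to the "check whether the song is over" step: a terminal transition contributes only its immediate reward.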

  11. Deep reinforcement learning. Neural network model (Keras): [figure not captured in transcript]. Steps: identical to Q-learning apart from the last one. Instead of updating a Q-table, transitions are saved in a temporary memory and the model is trained on a smaller, randomly selected sample (batch).
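
The temporary memory and random batch described above are commonly implemented as an experience replay buffer. The sketch below shows only the storage and sampling side in plain Python; the capacity, the batch size, and the class name `ReplayMemory` are assumptions, and the Keras training call is deliberately left out because the model is not reproduced in the transcript.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity=10_000):  # capacity is an assumed value
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        """Save one transition."""
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a uniformly random batch of distinct stored transitions."""
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: fill the memory, then draw a batch once enough transitions exist.
memory = ReplayMemory(capacity=50)
for step in range(100):
    memory.push((step,), 0, 0.0, (step + 1,), False)
batch = memory.sample(32) if len(memory) >= 32 else []
```

Sampling at random rather than training on consecutive frames reduces the correlation between the transitions in a batch.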

  12. Results: Q-learning agent; DQN agent [plots not captured in transcript]

  13. Future plans: try different combinations of neural network model layers; design the neural network model in TensorFlow; run the agent on a GPU instead of the CPU; make use of a high-end computer.
