LunarLander-v2 using Deep Reinforcement Learning
A project developed for the Autonomous Agents course (PLH513)
Portokalakis Petros
February 2020
Simple Game
8-Dimensional state space
4 actions per state
+100 points for landing
frame when firing main engine
encourage smooth landing)
Objective: approximate the optimal Q-function (which satisfies the Bellman equation)
Neural network:
A 4-layer network works well across a range of hidden-layer sizes
A 5-layer network fails to train the agent at all
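To make the 4-layer shape concrete, here is a minimal sketch of a forward pass in plain Python, assuming "4 layers" means input, two hidden layers, and output. The hidden width of 64, the ReLU activations, the weight initialisation, and all function names are assumptions for illustration only; the slides do not state them.

```python
import random

STATE_DIM = 8   # from the slides: 8-dimensional state space
N_ACTIONS = 4   # from the slides: 4 actions per state
HIDDEN = 64     # assumption: the slides do not give the hidden-layer size


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small uniform random weights and zero biases."""
    bound = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu):
    """Apply one fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def init_q_network(rng, sizes=(STATE_DIM, HIDDEN, HIDDEN, N_ACTIONS)):
    """Build one weight layer per pair of adjacent sizes (3 weight layers for 4 node layers)."""
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def q_values(network, state):
    """Forward pass: ReLU on hidden layers, linear output with one Q-value per action."""
    x = state
    for i, layer in enumerate(network):
        x = dense(x, layer, relu=(i < len(network) - 1))
    return x


rng = random.Random(0)
net = init_q_network(rng)
state = [0.0] * STATE_DIM          # placeholder observation
qs = q_values(net, state)
greedy_action = max(range(N_ACTIONS), key=lambda a: qs[a])
```

The 5-layer variant would correspond to passing a `sizes` tuple with one more hidden entry. A real agent would use a deep learning library with trainable weights; this sketch only shows the shape of the computation.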
Experience replay:
between consecutive samples
pass to the network for the next state
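The experience replay idea can be sketched as a fixed-size buffer that stores transitions and returns uniformly random mini-batches, which is what breaks the correlation between consecutive samples. The class and method names below are hypothetical. The default capacity and batch size are taken from the hyperparameter slide.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=1_000_000, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Uniform sampling without replacement decorrelates the mini-batch.
        return self.rng.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)


buffer = ReplayBuffer(capacity=100, seed=0)
for t in range(150):
    # Dummy transitions: the step index stands in for the state.
    buffer.push(t, t % 4, -0.3, t + 1, False)

batch = buffer.sample(64)
```

Training would normally start only once the buffer holds at least one full batch, since sampling more items than are stored raises a `ValueError`.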
Target network:
Use a network identical to the policy network, but update the target network's weights only every C iterations (C is a hyperparameter)
First pass occurs with the policy network
Second pass occurs with the target network
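A minimal sketch of the second pass, assuming the standard DQN target r + gamma * max Q_target(s', a') with no bootstrapping on terminal transitions. The function names and the stub standing in for the target network are hypothetical. Gamma is the discount factor from the hyperparameter slide.

```python
import copy

GAMMA = 0.99  # discount factor from the hyperparameter slide


def td_targets(batch, target_q, gamma=GAMMA):
    """Second pass: bootstrap from the target network's Q-values for next_state."""
    targets = []
    for _state, _action, reward, next_state, done in batch:
        # Terminal transitions have no future return to bootstrap from.
        future = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + future)
    return targets


def sync_if_due(step, c, policy_weights, target_weights):
    """Return a fresh copy of the policy weights every c steps, else the old target weights."""
    if step % c == 0:
        return copy.deepcopy(policy_weights)
    return target_weights


def stub_target_q(_state):
    """Hypothetical stand-in for the target network: constant Q-values."""
    return [1.0, 2.0, 0.5, -1.0]


batch = [
    ([0.0] * 8, 0, 1.0, [0.0] * 8, False),   # ordinary step
    ([0.0] * 8, 1, 100.0, [0.0] * 8, True),  # terminal step (landing bonus)
]
targets = td_targets(batch, stub_target_q)
```

The first pass (not shown) would evaluate the policy network on each `state` and regress the Q-value of the taken `action` toward the matching entry of `targets`.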
Abstract version of the agent algorithm implemented
Adding a third hidden layer
Hyperparameter              Value
Starting epsilon            1
Minimum epsilon             0.01
Decay factor of epsilon     0.99
Discount factor gamma       0.99
Learning rate               0.001
Batch size                  64
Replay buffer               1000000
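The epsilon values above suggest a multiplicative decay with a floor. The sketch below assumes the decay is applied once per episode and that actions are chosen epsilon-greedily. Neither detail is stated in the slides, and the function names are hypothetical.

```python
import random

# Values from the hyperparameter table.
EPS_START, EPS_MIN, EPS_DECAY = 1.0, 0.01, 0.99


def epsilon_schedule(n_episodes):
    """Epsilon used in each episode, assuming one multiplicative decay per episode."""
    eps = EPS_START
    schedule = []
    for _ in range(n_episodes):
        schedule.append(eps)
        eps = max(EPS_MIN, eps * EPS_DECAY)
    return schedule


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy: random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


schedule = epsilon_schedule(600)
rng = random.Random(0)
action = select_action([0.1, 0.7, 0.2, 0.0], epsilon=0.0, rng=rng)
```

Under these assumptions, epsilon reaches its minimum after roughly 460 episodes, so exploration is almost entirely greedy from that point on.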