Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
Yilun Du1, Karthik Narasimhan2
1 MIT, 2 Princeton
Key Questions
Can we learn physics in a task-agnostic fashion?
Does it help sample efficiency of RL?
Can we transfer physics from one environment to another?
et al. (2018), …)
videos
train a policy
model on target environment.
SpatialNet
[Figure: SpatialNet takes the input z_t and spatial memory h_t and produces the future frame z_{t+1} and spatial memory h_{t+1}.]
additive updates in the ConvLSTM model (Xingjian et al., 2015))
[Figure: Spatial Memory. Labels: input (z_t), gated input (i_t), state (h_t), proposal state (u_t), new state (h_{t+1}), output (o_t), ground truth label; blocks C_e, C_u, C_dyn, C_d.]
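The quantities named in the spatial memory figure can be put together in a rough sketch of one recurrent step. This is only an illustration: the names z_t, h_t, i_t, u_t, o_t and C_e, C_u, C_dyn, C_d come from the figure labels, but the wiring between them, the ReLU nonlinearity, the use of 1x1 convolutions, and the helper names (`conv1x1`, `spatial_memory_step`, `init_params`) are all assumptions made here, not details taken from the slides.

```python
import numpy as np

def conv1x1(x, w):
    # 1x1 convolution: a per-pixel linear map over channels.
    # x: (H, W, C_in), w: (C_in, C_out) -> (H, W, C_out)
    return x @ w

def relu(x):
    return np.maximum(x, 0.0)

def spatial_memory_step(z_t, h_t, params):
    # ASSUMED wiring; only the symbol names come from the figure labels.
    i_t = relu(conv1x1(z_t, params["C_e"]))                    # gated input
    u_t = relu(conv1x1(np.concatenate([i_t, h_t], axis=-1),
                       params["C_u"]))                         # proposal state
    h_next = relu(conv1x1(np.concatenate([u_t, h_t], axis=-1),
                          params["C_dyn"]))                    # new state
    o_t = conv1x1(h_next, params["C_d"])                       # output frame
    return o_t, h_next

def init_params(rng, in_ch, hid_ch, scale=0.1):
    # Hypothetical random initialisation, for the demo only.
    return {
        "C_e": scale * rng.standard_normal((in_ch, hid_ch)),
        "C_u": scale * rng.standard_normal((2 * hid_ch, hid_ch)),
        "C_dyn": scale * rng.standard_normal((2 * hid_ch, hid_ch)),
        "C_d": scale * rng.standard_normal((hid_ch, in_ch)),
    }

# Roll the memory forward over a short sequence of random "frames".
rng = np.random.default_rng(0)
params = init_params(rng, in_ch=3, hid_ch=8)
h = np.zeros((16, 16, 8))            # spatial memory keeps the frame's H x W layout
frames = rng.random((5, 16, 16, 3))
for z in frames:
    pred, h = spatial_memory_step(z, h, params)
```

The point of the sketch is that the state h keeps the same height and width as the frame, so each memory cell refers to a location in the image. A real model would presumably use larger kernels and train the weights against the ground truth next frame.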
containing moving objects of various shapes and sizes
Physics-centric games
actions
a policy
PhysGoal, PhysForage, PhysShooter, Phys3D
Pixel Prediction Accuracy
Model Transfer > Model + Policy Transfer > No Transfer