Policy Continuation with Hindsight Inverse Dynamics
Hao Sun1, Zhizhong Li1, Xiaotong Liu2, Dahua Lin1, Bolei Zhou1
1 The Chinese University of Hong Kong 2 Peking University
sh018@ie.cuhk.edu.hk
Goal-Oriented Reward Sparse Tasks
Start Goal
[Hindsight Experience Replay, M Andrychowicz et al. 2017]
Inverse Dynamics:
State Goal
Hindsight Inverse Dynamics:
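The idea named above can be illustrated with a small sketch. Ordinary inverse dynamics predicts the action that connects a state to the next state. A hindsight variant, in the spirit of Hindsight Experience Replay, treats the state that was actually reached as if it had been the goal, so every transition becomes a supervised example of the form (state, goal) -> action. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `hindsight_inverse_dynamics_pairs` is hypothetical, and goals are assumed to live in the same space as states.

```python
from typing import List, Tuple

State = Tuple[int, int]
Action = Tuple[int, int]


def hindsight_inverse_dynamics_pairs(
    states: List[State], actions: List[Action]
) -> List[Tuple[Tuple[State, State], Action]]:
    """Relabel a trajectory into ((state, hindsight_goal), action) pairs.

    Assumption (hypothetical, for illustration): the hindsight goal of a
    transition is simply the next state that was actually reached.
    """
    if len(states) != len(actions) + 1:
        raise ValueError("expected one more state than actions")
    return [
        ((states[t], states[t + 1]), actions[t])
        for t in range(len(actions))
    ]


# Toy grid trajectory: move right, then up.
states = [(0, 0), (1, 0), (1, 1)]
actions = [(1, 0), (0, 1)]
pairs = hindsight_inverse_dynamics_pairs(states, actions)
for (state, goal), action in pairs:
    print(f"state={state} goal={goal} -> action={action}")
```

Each pair could then serve as a training example for a goal-conditioned policy fitted by supervised learning, even when the original task reward is sparse.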
step 1, step 2: In 1 step ?
step 1, ..., step k: In less than k-1 steps ?
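The question posed on these slides, whether a segment running from step 1 to step k could have been covered in fewer steps, can be sketched as a bounded rollout check. This is one reading of the slide under stated assumptions, not the authors' procedure: `reachable_within`, `greedy_policy`, and `line_step` are hypothetical names, and the environment is a toy one-dimensional line with deterministic moves.

```python
from typing import Callable


def reachable_within(
    policy: Callable[[int, int], int],
    step: Callable[[int, int], int],
    state: int,
    goal: int,
    max_steps: int,
) -> bool:
    """Roll the policy out for at most max_steps; report whether the goal is hit."""
    for _ in range(max_steps):
        if state == goal:
            return True
        state = step(state, policy(state, goal))
    return state == goal


def greedy_policy(state: int, goal: int) -> int:
    """Hypothetical policy: move one unit toward the goal (-1, 0, or +1)."""
    return (goal > state) - (goal < state)


def line_step(state: int, action: int) -> int:
    """Hypothetical deterministic dynamics on a 1-D line."""
    return state + action


k = 4
detour = [0, 1, 0, 1]   # states at steps 1..k, with a wasted back-and-forth
direct = [0, 1, 2, 3]   # states at steps 1..k, already shortest

# A segment from step 1 to step k uses k-1 transitions, so
# "less than k-1 steps" means a budget of at most k-2 steps.
detour_shortenable = reachable_within(greedy_policy, line_step, detour[0], detour[-1], k - 2)
direct_shortenable = reachable_within(greedy_policy, line_step, direct[0], direct[-1], k - 2)
print(detour_shortenable, direct_shortenable)
```

In this toy setting the detour can be shortened while the direct segment cannot, which is the kind of distinction the "In less than k-1 steps ?" check appears to draw.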
East Exhibition Hall B + C #194