CS885 Reinforcement Learning Lecture 15c: June 20, 2018
Semi-Markov Decision Processes [Put] Sec. 11.1-11.3
CS885 Spring 2018 Pascal Poupart 1 University of Waterloo
Hierarchical RL
Hierarchy of goals and actions
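[Figure: hierarchy of goals and actions. Top-level goal: Reach Destination. Subgoals: Reach A, Reach B, Reach C. Maneuvers: Turn, Overtake, Stop, Park. Low-level controls: Brake, Gas, Steering.]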
$$\gamma^{\tau}\, Q(s', a') - Q(s, a)$$
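A minimal sketch of one tabular Q-learning step for a semi-Markov decision process, offered as an illustration rather than the lecture's exact algorithm. It assumes the commonly used form in which the bootstrap term is discounted by the discount factor raised to the sojourn time `tau`, and it assumes a greedy (max) bootstrap. The function name `smdp_q_update`, the dictionary-based table, and the state and action labels are all hypothetical.

```python
from collections import defaultdict


def smdp_q_update(Q, s, a, r, tau, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular SMDP Q-learning step and return the TD error.

    Assumption: r is the reward accumulated (and already discounted)
    over the tau time units spent executing a from s.
    """
    # Greedy bootstrap over the actions available in the next state.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # The bootstrap is discounted by gamma ** tau, not by gamma alone.
    td_error = r + gamma ** tau * best_next - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return td_error


# Hypothetical usage: two actions, one transition that lasted 3 time units.
Q = defaultdict(float)
actions = ["turn", "overtake"]
Q[("B", "turn")] = 1.0
delta = smdp_q_update(Q, "A", "overtake", r=2.0, tau=3, s_next="B",
                      actions=actions, alpha=0.5, gamma=0.5)
```

With `tau = 1` for every transition, this reduces to the ordinary one-step Q-learning update.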
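$$\forall s_{t+k} \in S_{\text{end}}:\quad \Pr(s_{t+k}, k \mid s_t, o) = \sum_{s_{t+1:t+k-1} \notin S_{\text{end}}} \; \prod_{i=1}^{k} \Pr\big(s_{t+i} \mid s_{t+i-1}, \pi(s_{t+i-1})\big)$$

$$R(s_t, o, s_{t+k}, k) = R(s_t, \pi(s_t)) + \gamma \sum_{s_{t+1}} \Pr\big(s_{t+1} \mid s_t, \pi(s_t)\big) \Big[ R(s_{t+1}, \pi(s_{t+1})) + \cdots + \gamma \sum_{s_{t+k}} \Pr\big(s_{t+k} \mid s_{t+k-1}, \pi(s_{t+k-1})\big)\, R(s_{t+k}, \pi(s_{t+k})) \Big]$$

where $\pi$ is the policy followed by option $o$, $S_{\text{end}}$ is the set of states in which the option terminates, and $k$ is the number of steps the option runs.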
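A minimal sketch of how the option transition model could be computed for a small example, assuming the option follows a fixed policy until it first enters a terminating state. It propagates the probability mass that has not yet terminated one step at a time, which is equivalent to summing over all intermediate paths that avoid the terminating states. The function name `option_transition_model`, the nested-dictionary representation `P_pi`, and the three-state chain are hypothetical, and the start state is assumed not to be terminating.

```python
def option_transition_model(P_pi, end_states, start, max_k):
    """Return {k: {s_end: Pr(s_end, k | start, o)}} for k = 1..max_k.

    P_pi[s][s2] is Pr(s2 | s, pi(s)) under the option's policy pi
    (hypothetical nested-dict representation).
    """
    alive = {start: 1.0}  # probability mass that has not terminated yet
    model = {}
    for k in range(1, max_k + 1):
        # Push the surviving mass through one transition of the policy.
        arrived = {}
        for s, mass in alive.items():
            for s2, p in P_pi[s].items():
                arrived[s2] = arrived.get(s2, 0.0) + mass * p
        # Mass landing in a terminating state ends the option at step k.
        model[k] = {s2: m for s2, m in arrived.items() if s2 in end_states}
        # The rest keeps following the option's policy.
        alive = {s2: m for s2, m in arrived.items() if s2 not in end_states}
    return model


# Hypothetical chain A -> B -> C, where C terminates the option.
P_pi = {
    "A": {"A": 0.5, "B": 0.5},
    "B": {"B": 0.5, "C": 0.5},
}
model = option_transition_model(P_pi, end_states={"C"}, start="A", max_k=4)
```

In this chain the option cannot end after a single step, so `model[1]` is empty. `model[2]["C"]` is the probability of the single path A, B, C.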
$$\gamma^{k}\, Q(s', o') - Q(s, o)$$