Introduction Value iteration Decision-theoretic agents Summary
Informatics 2D – Reasoning and Agents
Semester 2, 2019–2020
Alex Lascarides alex@inf.ed.ac.uk
Lecture 30 – Markov Decision Processes 27th March 2020
Informatics UoE Informatics 2D 1
Introduction Value iteration Decision-theoretic agents Summary Sequential decision problems Optimality in sequential decision problems
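[Figure: (a) a grid world with columns 1–4 and rows 1–3, a START square, and terminal squares marked +1 and –1; (b) the transition model: the intended move succeeds with probability 0.8, and the agent slips to either side with probability 0.1 each.]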
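A minimal sketch of a transition model like the one in panel (b): the intended move is taken with probability 0.8 and each perpendicular move with probability 0.1. The grid size follows the figure's axis labels. The blocked square at (2, 2), the (column, row) coordinates, the rule that a blocked move leaves the agent in place, and all function names are illustrative assumptions rather than details taken from the slide.

```python
from collections import defaultdict

# Assumed layout: 4 columns x 3 rows, states are (column, row) pairs,
# with one blocked square at (2, 2). These details are assumptions.
COLS, ROWS = 4, 3
WALL = {(2, 2)}

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

# The two directions at right angles to each intended move.
SIDEWAYS = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}


def step(state, direction):
    """Deterministic move; hitting the edge or the blocked square leaves the agent in place."""
    x, y = state
    dx, dy = MOVES[direction]
    nxt = (x + dx, y + dy)
    inside = 1 <= nxt[0] <= COLS and 1 <= nxt[1] <= ROWS
    if not inside or nxt in WALL:
        return state
    return nxt


def transition(state, action):
    """Return {next_state: probability} for taking `action` in `state`."""
    dist = defaultdict(float)
    dist[step(state, action)] += 0.8      # intended direction
    for side in SIDEWAYS[action]:
        dist[step(state, side)] += 0.1    # slip to either side
    return dict(dist)
```

For example, `transition((1, 1), "up")` puts probability 0.8 on `(1, 2)`, 0.1 on `(2, 1)`, and 0.1 on staying at `(1, 1)`, because the slip to the left runs into the edge of the grid.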
[Figure: (a) an optimal policy for the grid world with terminal squares +1 and –1; (b) optimal policies for four ranges of the non-terminal reward: R(s) < –1.6284; –0.4278 < R(s) < –0.0850; –0.0221 < R(s) < 0; R(s) > 0.]
Stationarity of preferences: $s_0 = s'_0 \wedge [s_0, s_1, s_2, \ldots] \succ [s'_0, s'_1, s'_2, \ldots] \Rightarrow [s_1, s_2, \ldots] \succ [s'_1, s'_2, \ldots]$
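When preferences between state sequences are stationary in this sense, one standard way to assign a utility to a sequence is by discounted rewards. As a worked bound, assuming a discount factor $\gamma$ with $0 \le \gamma < 1$ and an upper bound $R_{\max}$ on the reward of any state (both symbols are assumptions introduced here), the utility of an infinite sequence stays finite:

```latex
U_h([s_0, s_1, s_2, \ldots])
  = \sum_{t=0}^{\infty} \gamma^{t} R(s_t)
  \le \sum_{t=0}^{\infty} \gamma^{t} R_{\max}
  = \frac{R_{\max}}{1 - \gamma}
```

The last step is the sum of a geometric series, which converges only because $\gamma < 1$.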
Introduction Value iteration Decision-theoretic agents Summary Utilities of states The value iteration algorithm
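A minimal sketch of value iteration, assuming the common formulation with a per-state reward R(s), a discount factor gamma, and the Bellman update U(s) ← R(s) + γ · max over actions of the expected utility of the successor state. The function signature, the stopping tolerance, and the two-state toy MDP are hypothetical and serve only to illustrate the idea.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-9):
    """Repeat the Bellman update until no utility changes by more than `tol`.

    states:      iterable of states
    actions:     function state -> list of actions (empty list for terminal states)
    transition:  function (state, action) -> {next_state: probability}
    reward:      function state -> float, the reward R(s)
    """
    states = list(states)
    utility = {s: 0.0 for s in states}
    while True:
        updated = {}
        for s in states:
            # Expected utility of the successor state for each available action.
            expected = [
                sum(p * utility[s2] for s2, p in transition(s, a).items())
                for a in actions(s)
            ]
            best = max(expected) if expected else 0.0
            updated[s] = reward(s) + gamma * best
        delta = max(abs(updated[s] - utility[s]) for s in states)
        utility = updated
        if delta < tol:
            return utility


# Hypothetical two-state MDP: "stay" keeps the state, "go" switches it.
TOY_REWARD = {"a": 0.0, "b": 1.0}


def toy_actions(state):
    return ["stay", "go"]


def toy_transition(state, action):
    if action == "stay":
        return {state: 1.0}
    return {("b" if state == "a" else "a"): 1.0}


toy_utility = value_iteration(
    ["a", "b"], toy_actions, toy_transition, TOY_REWARD.get, gamma=0.5
)
# Converges to roughly U(a) = 1 and U(b) = 2.
```

The toy values can be checked by hand: staying in b forever gives U(b) = 1 + 0.5 · U(b), so U(b) = 2, and the best move from a is to go to b, so U(a) = 0 + 0.5 · 2 = 1.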