CS325 Artificial Intelligence
- Ch. 21 – Reinforcement Learning
Cengiz Günay, Emory Univ. Spring 2013
Rats!
Rat put in a cage with lever. Each lever press sends a signal to
1 Passive RL: Simple Case
2 Active RL
[Figure, two panels: (left) utility estimates vs. number of trials for states (1,1), (1,3), (2,1), (3,3), (4,3); (right) RMS error in utility vs. number of trials]
[Figure: RMS error and policy loss vs. number of trials]
[Figure, two panels: (left) utility estimates vs. number of trials for states (1,1), (1,2), (1,3), (2,3), (3,2), (3,3), (4,3); (right) RMS error and policy loss vs. number of trials]