A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching
Jacob Whitehill (1,2), Javier Movellan (1,2)
(1) University of California, San Diego (UCSD); (2) Machine Perception Technologies (www.mptec.com)
Saturday, December 8, 12
applied control theory to optimize the learning process for “flashcard”-style vocabulary learning.
“cognitive tutor” movement to teach complex skills, e.g.:
Algebra Cognitive Tutor
with more sophisticated graphics and sound.
Wayang Outpost math tutor
more effective if they used richer sensory information.
Learning gains
         Aff.-Sens.   Aff.-Blind
Day 1    0.249        0.389
Day 2    0.407        0.377
D’Mello, et al. 2010
Affect-sensitive tutor was less effective on day 1.
such a framework.
problem.
problem.
intractable.
approximately optimal control policies for automated teaching problems.
reinforcement learning methods have been developed for finding approximately optimal solutions.
ITS for language acquisition using approximate methods from SOC.
Brunskill, Griffiths, and Shafto (2011).
teacher naturally uses affective observations when they are available.
learns, and how she responds to questions asked by the teacher.
parameters.
teacher using simulation.
the manner of Nelson, Tenenbaum and Movellan (2007) for concept learning and Rafferty, et al. (2011) for concept teaching.
[Figure: graphical model with nodes W1, ..., Wn and, at each timestep from 1 to t, nodes Ct, Yt, At1, ..., Atn.]
[Figure: St, plotted on a 0 to 1 scale over the words man, woman, boy, girl, eat, drink, milk, breakfast.]
[Figure: graphical model over St, Ut, Ot, and St+1.]

$$P(s_{t+1} \mid o_{1:t}, u_{1:t}) \propto \int P(s_{t+1} \mid s_t, u_t)\, P(o_t \mid s_t, u_t)\, P(s_t \mid o_{1:t-1}, u_{1:t-1})\, ds_t$$

Posterior belief: P(s_{t+1} | o_{1:t}, u_{1:t})
Student learning dynamics: P(s_{t+1} | s_t, u_t)
Student response likelihood: P(o_t | s_t, u_t)
Prior belief: P(s_t | o_{1:t-1}, u_{1:t-1})
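The belief update above can be sketched for a discrete state space, where the integral over s_t becomes a sum. This is a minimal illustration rather than the authors' implementation: the two-state student ("word unknown" / "word known"), the function name belief_update, and all probability values are hypothetical.

```python
import numpy as np

def belief_update(prior, transition, likelihood):
    """One step of the discrete belief update.

    prior[i]         = P(s_t = i | o_{1:t-1}, u_{1:t-1})   (prior belief)
    likelihood[i]    = P(o_t | s_t = i, u_t)               (student response likelihood)
    transition[i, j] = P(s_{t+1} = j | s_t = i, u_t)       (student learning dynamics)
    Returns P(s_{t+1} | o_{1:t}, u_{1:t}), normalized to sum to 1.
    """
    weighted = prior * likelihood        # weight each s_t by how well it explains o_t
    posterior = weighted @ transition    # sum over s_t, pushing mass through the dynamics
    return posterior / posterior.sum()   # resolve the proportionality constant

# Hypothetical numbers: state 0 = word unknown, state 1 = word known.
prior = np.array([0.5, 0.5])
likelihood = np.array([0.25, 0.9])       # assumed P(correct answer | state)
transition = np.array([[0.7, 0.3],       # assumed chance of learning after one exposure
                       [0.0, 1.0]])      # assumed: a known word stays known

posterior = belief_update(prior, transition, likelihood)
print(posterior)
```

With these assumed numbers, a correct answer shifts the belief toward "word known", and the learning dynamics shift it further.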
[Figure: two panels on a 0 to 1 scale over the words man, woman, boy, girl, eat, drink, milk, breakfast, shown with the St, Ut, Ot, St+1 model.]
$$V(\pi) \doteq E\left[ \sum_{t=1}^{\tau} r(S_t, U_t) \,\middle|\, \pi \right]$$

$$\pi^* \doteq \arg\max_{\pi} V(\pi)$$
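One way to make the definitions of V(π) and π* concrete is to estimate V(π) by Monte Carlo rollouts against a simulated student and take the arg max over a small candidate set. The sketch below rests on several assumptions that are not in the slides: a reward of -1 per timestep (so maximizing V minimizes time to finish), a toy student who learns the taught word with fixed probability p_learn, and policies that see the true state rather than a belief. All function names are hypothetical.

```python
import random

def rollout(policy, rng, n_words=3, p_learn=0.5, horizon=50):
    """Simulate one session and return the sum of r(S_t, U_t), with r = -1 per step."""
    known = [False] * n_words               # S_t: which words the student knows
    total = 0.0
    for t in range(horizon):
        if all(known):                      # session ends once every word is learned
            break
        u = policy(known, t)                # U_t: index of the word to teach
        total += -1.0                       # assumed reward: unit time cost per step
        if rng.random() < p_learn:          # assumed learning dynamics
            known[u] = True
    return total

def teach_unknown(known, t):
    """Always teach the first word that is not yet known."""
    return known.index(False)

def round_robin(known, t):
    """Cycle through the words regardless of what is already known."""
    return t % len(known)

def estimate_value(policy, n_rollouts=2000, seed=0):
    """Monte Carlo estimate of V(pi) = E[sum_t r(S_t, U_t) | pi]."""
    rng = random.Random(seed)
    return sum(rollout(policy, rng) for _ in range(n_rollouts)) / n_rollouts

policies = {"teach_unknown": teach_unknown, "round_robin": round_robin}
values = {name: estimate_value(p) for name, p in policies.items()}
best = max(values, key=values.get)          # arg max over the candidate policies
print(values, best)
```

A search over two hand-written policies is only a stand-in for the approximate optimization the talk refers to; it shows the objective, not the solution method.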
Word        Meaning
duzetuzi    man
fota        woman
nokidono    boy
mininami    girl
pipesu      dog
mekizo      cat
xisaxepe    bird
botazi      rabbit
koto        eat
notesabi    drink
using an image sampled according to P(c | y).
[Figure: Avg time to finish v. teaching strategy. x-axis: OptimizedTeacher, HandCraftedTeacher, RandomWordTeacher; y-axis: Avg time to finish (sec), 500 to 850.]
TimeCost(SOCTeacher) is 24% less than TimeCost(HeuristicTeacher) (p < 0.01).
students were usually engaged in the task.
engagement.
$$P(s_{t+1} \mid o_{1:t}, u_{1:t}) \propto \int P(s_{t+1} \mid s_t, u_t)\, P(o_t \mid s_t, u_t)\, P(z_t \mid \beta_t)\, P(s_t \mid o_{1:t-1}, u_{1:t-1})\, ds_t$$

Affective observation: P(z_t | β_t)
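The affect-sensitive update differs from the affect-blind one only by the extra factor P(z_t | β_t). The sketch below illustrates that structure on a discrete two-state student. It assumes, purely for illustration, that the affective term can be evaluated for each candidate state as a likelihood vector; the slides do not specify how β_t relates to the state, and every number and function name here is hypothetical.

```python
import numpy as np

def belief_update(prior, transition, *likelihoods):
    """Discrete belief update with any number of observation likelihood factors.

    Each likelihoods[k][i] is the probability of the k-th observation given s_t = i.
    transition[i, j] = P(s_{t+1} = j | s_t = i, u_t).
    """
    weighted = prior.copy()
    for lik in likelihoods:                 # multiply in each observation channel
        weighted = weighted * lik
    posterior = weighted @ transition       # sum over s_t through the dynamics
    return posterior / posterior.sum()

def entropy(p):
    """Shannon entropy in nats, used here as one possible measure of uncertainty."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Hypothetical numbers: state 0 = word unknown, state 1 = word known.
prior = np.array([0.5, 0.5])
response_lik = np.array([0.25, 0.9])        # assumed P(o_t | s_t, u_t)
affect_lik = np.array([0.3, 0.8])           # assumed stand-in for P(z_t | beta_t)
transition = np.array([[0.7, 0.3],
                       [0.0, 1.0]])

blind = belief_update(prior, transition, response_lik)
sensitive = belief_update(prior, transition, response_lik, affect_lik)
print(entropy(blind), entropy(sensitive))
```

In this particular example the affect-sensitive belief has lower entropy, but that is a property of the assumed numbers: an extra observation does not have to reduce uncertainty at every step.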
[Figure: Uncertainty in teacher’s belief v. Timestep (t), for t up to 200; curves: Affect-blind, Affect-sensitive.]
[Figure: plot against Timestep (t), for t up to 200, on a 0.1 to 1 scale; curves: Affect-blind, Affect-sensitive.]