Using Inverse Planning for Personalized Feedback
Anna N. Rafferty
Computer Science Department, Carleton College
Rachel A. Jansen Thomas L. Griffiths
Department of Psychology, University of California, Berkeley
Using Data for Personalization
Algorithm
? Provide experience X
about equation solving
diagnosis
Θ = space of possible understandings
[Figure: p(θ | equations) plotted over Algebra skills (θ1) and Algebra skills (θ2)]
Conceptual mal-rules: 1+3x => 4x; 3(2+5x) => 6+5x
Arithmetic: 1+5.9x+3.2x => 1+8.1x
Planning: 3x+5x+4 = 2 => 3x+4 = -5x+2
e.g., Sleeman, 1984; Payne & Squibb, 1990; Koedinger & MacLaren, 1997
θ ∈ Θ: 6-dimensional vector of parameters related to skill
p(θ | equations) ∝ p(θ) p(equations | θ)
Prior: Encode information about what misunderstandings are common
Likelihood: What is the probability of the observed data if the learner has a particular understanding?
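The prior-times-likelihood update can be illustrated with a deliberately simplified sketch. It assumes a single error-probability parameter, a hypothetical Beta-shaped prior that favours low error rates, and a binomial likelihood over solution steps. None of these choices come from the talk, whose model uses a 6-dimensional θ and an MDP-based likelihood.

```python
import numpy as np

def grid_posterior(n_errors, n_steps, prior_a=1.0, prior_b=3.0, grid_size=101):
    """Posterior over one error-probability parameter, evaluated on a grid.

    prior_a and prior_b are hypothetical Beta-prior shape parameters.
    """
    theta = np.linspace(0.0, 1.0, grid_size)
    # Unnormalised prior p(theta): encodes which error rates are common.
    prior = theta ** (prior_a - 1) * (1 - theta) ** (prior_b - 1)
    # Likelihood p(data | theta): n_errors erroneous steps out of n_steps.
    likelihood = theta ** n_errors * (1 - theta) ** (n_steps - n_errors)
    unnormalised = prior * likelihood
    # Normalise so the grid values sum to one.
    return theta, unnormalised / unnormalised.sum()

theta, post = grid_posterior(n_errors=2, n_steps=10)
posterior_mean = float((theta * post).sum())
```

With these assumed settings the exact posterior is Beta(3, 11), so the grid estimate of the mean should be close to 3/14.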
2 + 3x = 6
  Move 2 to right side
3x = 6 + 2
  Combine 6 and 2
3x = 8
  Divide both sides by 3
...
θ affects what actions are considered and transition probabilities for actions.
Assume a noisily optimal policy: p(a | s) ∝ exp(θ_β · Q(s, a))
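A minimal sketch of a noisily optimal (Boltzmann) policy follows. The function name `boltzmann_policy` and the example Q-values are hypothetical; `beta` plays the role of the planning parameter in the exponent.

```python
import numpy as np

def boltzmann_policy(q_values, beta):
    """Return p(a | s) proportional to exp(beta * Q(s, a)) for one state."""
    q = np.asarray(q_values, dtype=float)
    logits = beta * q
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

q = [1.0, 0.5, -2.0]                            # hypothetical Q-values for three actions
near_random = boltzmann_policy(q, beta=0.0)     # beta = 0: every action equally likely
near_optimal = boltzmann_policy(q, beta=10.0)   # large beta: mass concentrates on the best action
```

Small values of `beta` describe a learner who picks steps almost at random. Large values describe one who almost always picks the highest-value step.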
Long term expected value:
$$Q(s, a) = \sum_{s' \in S} p(s' \mid s, a) \left( R(s, a) + \gamma \sum_{a' \in A} p(a' \mid s') \, Q(s', a') \right)$$
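One way to compute such Q-values is repeated substitution of the equation into itself, recomputing the Boltzmann policy from the current Q at each pass. The sketch below does this on a hypothetical two-state toy problem. The arrays `P` and `R`, the iteration count, and the toy MDP are assumptions for illustration, and this simple fixed-point scheme is not guaranteed to converge in general.

```python
import numpy as np

def soft_q_iteration(P, R, gamma, beta, n_iters=500):
    """Iterate the Q equation under a Boltzmann policy.

    P: array (S, A, S) with P[s, a, s2] = p(s2 | s, a)
    R: array (S, A) with R[s, a] = immediate reward
    """
    n_states, n_actions, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        logits = beta * Q
        logits -= logits.max(axis=1, keepdims=True)
        policy = np.exp(logits)
        policy /= policy.sum(axis=1, keepdims=True)   # p(a' | s')
        V = (policy * Q).sum(axis=1)                  # sum over a' of p(a'|s') Q(s', a')
        # Each P[s, a, :] sums to 1, so R(s, a) factors out of the sum over s'.
        Q = R + gamma * (P @ V)
    return Q

# Toy MDP: state 0 = unsolved equation, state 1 = solved (absorbing).
P = np.zeros((2, 2, 2))
P[0, 0, 1] = 1.0   # action 0 in state 0: a correct step that solves the equation
P[0, 1, 0] = 1.0   # action 1 in state 0: an unhelpful step, equation stays unsolved
P[1, :, 1] = 1.0   # the solved state is absorbing
R = np.array([[1.0, -0.1],
              [0.0, 0.0]])
Q = soft_q_iteration(P, R, gamma=0.9, beta=2.0)
```

In this toy problem the correct step has a higher Q-value than the unhelpful one, so a Boltzmann policy with positive `beta` prefers it.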
5 + 9 = 6.0x + 2.0x + 10.0[1 + 1 + 7.0x]
5 + 9 = 6.0x + 2.0x + 10 + 10 + 70.0x
5 + 9 = 6.0x + 2.0x + 20 + 70.0x
14 = 6.0x + 2.0x + 20 + 70.0x
14 = 76.0x + 2.0x + 20.0
14 = 78.0x + 20.0
14 + -20 = 78.0x
-7 = 78.0x
-7/78 = 1x
Arithmetic error parameter
Action planning parameter
Distributive property error parameter
Move term error parameter
. . .
Representation of understanding (Θ)
Model of equation solving as a (parameterized) MDP
Infer posterior probability over Θ (MCMC)
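To make the MCMC step concrete, here is a sketch of random-walk Metropolis-Hastings over a single parameter. It is a stand-in under stated assumptions: a flat prior on (0, 1), a binomial likelihood for error counts, and hypothetical function names. The talk's model instead scores observed solution steps under the parameterized MDP across six dimensions.

```python
import numpy as np

def log_posterior(theta, n_errors, n_steps):
    """Unnormalised log posterior; flat prior on (0, 1) is an assumption."""
    if not 0.0 < theta < 1.0:
        return -np.inf
    return n_errors * np.log(theta) + (n_steps - n_errors) * np.log(1.0 - theta)

def metropolis_hastings(log_post, init, n_samples=5000, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal."""
    rng = np.random.default_rng(seed)
    current, current_lp = init, log_post(init)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = current + step * rng.normal()
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples

samples = metropolis_hastings(lambda t: log_posterior(t, 2, 10), init=0.5)
estimate = samples[1000:].mean()   # discard the first 1000 draws as burn-in
```

A histogram of `samples` approximates the posterior over the parameter, in the same spirit as the per-parameter posterior plots in the slides.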
[Figure: posterior distribution over the Arithmetic Error Parameter; x-axis: Value, y-axis: Probability]
How do we turn this into a feedback activity?
[Figure: posterior distribution for each parameter. Panels: Move, Combine, Divide, Distributive, Planning, Arithmetic; x-axis: Value, y-axis: Probability]
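One hypothetical way to turn per-skill posteriors into a feedback choice is sketched below. It assumes posterior samples have already been mapped to a 0 to 1 proficiency scale for every skill, treats 0.85 as the mastery cutoff, and targets the weakest unmastered skill. The selection rule, the skill names, and the function name are illustrative assumptions, not the procedure used in the study.

```python
import numpy as np

# Hypothetical skill labels, matching the panel titles of the posterior plots.
SKILLS = ["move", "combine", "divide", "distributive", "planning", "arithmetic"]

def choose_feedback_skill(posterior_samples, threshold=0.85):
    """Pick the skill with the lowest posterior-mean proficiency below threshold.

    posterior_samples: array (n_samples, n_skills) of proficiencies in [0, 1].
    Returns None when every skill is at or above the threshold.
    """
    means = posterior_samples.mean(axis=0)
    below = [(float(m), name) for m, name in zip(means, SKILLS) if m < threshold]
    if not below:
        return None
    return min(below)[1]

# Constant "samples" keep the example deterministic.
mixed = np.tile([0.90, 0.95, 0.60, 0.80, 0.90, 0.99], (100, 1))
mastered = np.full((100, 6), 0.90)
target = choose_feedback_skill(mixed)
no_target = choose_feedback_skill(mastered)
```

Here `target` is the skill with the lowest mean among those under the cutoff, and `no_target` is `None` because no skill falls below it.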
Overview of skills and assessment
Text explanation and video from Khan Academy
Targeted practice with fading scaffolding
Session 1: Website Problem Solving and Multiple Choice Test
Session 2: Feedback Activity
Session 3: Website Problem Solving and Multiple Choice Test
[Figure: Accuracy Improvements by Time and Condition. Conditions: Targeted Feedback, Random Feedback; x-axis: Before Feedback, After Feedback; y-axis: Score]
Reliable improvement, but no difference in amount
[Figure: Accuracy Improvements by Time and Level of Skill. Groups: Skill level > 0.85, Skill level < 0.85; x-axis: Before Feedback, After Feedback; y-axis: Score]
[Figure: Accuracy Improvements by Time and Condition for Participants with Some Mastered and Some Unmastered Skills. Conditions: Random Feedback, Targeted Feedback; x-axis: Before Feedback, After Feedback; y-axis: Score]
Reliable difference in amount of improvement by condition.
learners who struggle with only some skills
feedback
Acknowledgements: Thank you to students Jonathan Brodie and Sam Vinitsky for programming contributions. Funding: This work is supported by NSF grant number DRL-1420732.
Contact: Anna Rafferty, arafferty@carleton.edu
[Figure: x-axis: Number of skills with proficiency < 0.85; y-axis: Proportion of participants]
[Diagram: sequence of states s1, s2, s3, ... and actions a1, a2, a3]