Opportunities for Human-AI Collaborative Tools to Advance Development of Motivation Analytics - PowerPoint PPT Presentation


SLIDE 1

Opportunities for Human-AI Collaborative Tools to Advance Development of Motivation Analytics

Steven C. Dang and Kenneth R. Koedinger
10th International Learning Analytics and Knowledge Conference, Workshop on Learning Analytic Services to Support Personalized Learning & Assessment at Scale

SLIDE 2

Crystal Island
Narrative-centered Learning Environment

SLIDE 3

Operationalizing on New Systems

  • Off-task behavior can be indicative of cognitive engagement (Baker et al., 2004)
  • Rowe et al. (2009) operationalized off-task behavior for Crystal Island
  • The narrative contains elements of “seductive detail”
  • Off-task = any student behavior that involves locations or objects not necessary for solving Crystal Island’s science mystery

(Screenshot: Crystal Island narrative-centered learning environment)

SLIDE 4

Accuracy of Construct Operationalization

  • Results raised construct validity questions:
  • Off-task behavior was not related to pre-post learning
  • No relationship to achievement orientation or self-efficacy

(Screenshot: Crystal Island narrative-centered learning environment)

SLIDE 5

(Diagram: Data World Model)

SLIDE 6

(Diagram: Data World Model, with Data Iteration and Model Iteration)

SLIDE 7

Talk Overview

  • 1. The problem of confounding constructs
  • 2. Leveraging behavior-based psychometric scales
  • 3. Common challenges and opportunities

SLIDE 8

Confounding Constructs (Huggins-Manley et al., 2019)

  • Mono-operation bias threat
  • When a single indicator underrepresents a construct because the construct is more complex than a single indicator
  • Student motivations impact many student behaviors

SLIDE 9

Leveraging Behavior-based Scales (under review)

  • Academic Diligence
  • “working assiduously on academic tasks which are beneficial in the long-run but tedious in the moment, especially in comparison to more enjoyable, less effortful diversions” (Galla et al., 2014)
  • Operational Measures:
  • Time-on-task, Problems Completed

SLIDE 10

Leveraging Behavior-based Scales (under review)

  • Academic Diligence
  • “working assiduously on academic tasks which are beneficial in the long-run but tedious in the moment, especially in comparison to more enjoyable, less effortful diversions” (Galla et al., 2014)
  • Operational Measures:
  • Time-on-task, Problems Completed
  • Conflated with knowledge measures

SLIDE 11

(Timeline: 50-minute class period)

SLIDE 12

(Timeline: 50-minute class period, Students 1-5)

SLIDE 13

(Timeline: 50-minute class period, Students 1-5, annotated with Start Speed, Sustained Effort, and Early Finish)

SLIDE 14

12-Measure Behavior-based Scale

  #   Behavior          Scaling    Statistic
  1   Start Speed       Absolute   Mean
  2   Start Speed       Absolute   Variance
  3   Start Speed       Scaled     Mean
  4   Start Speed       Scaled     Variance
  5   Sustained Effort  Absolute   Mean
  6   Sustained Effort  Absolute   Variance
  7   Sustained Effort  Scaled     Mean
  8   Sustained Effort  Scaled     Variance
  9   Early Finish      Absolute   Mean
  10  Early Finish      Absolute   Variance
  11  Early Finish      Scaled     Mean
  12  Early Finish      Scaled     Variance
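A sketch of how such measures could be computed (a hypothetical helper; the talk does not specify the implementation): each behavior's per-class-period values are summarized by mean and variance, both on the absolute scale and scaled relative to the class average, giving four measures per behavior.

```python
import statistics

# Hypothetical sketch: `periods` holds one absolute value per class period
# for one behavior (e.g., start speed) for one student; `class_means` holds
# the class average for the same periods. Each behavior then contributes
# four measures: absolute mean/variance and class-scaled mean/variance.

def scale_measures(periods, class_means):
    scaled = [p / m for p, m in zip(periods, class_means)]  # relative to class
    return (statistics.mean(periods), statistics.variance(periods),
            statistics.mean(scaled), statistics.variance(scaled))

abs_mean, abs_var, sc_mean, sc_var = scale_measures([2, 4, 6], [2, 2, 2])
```

Applying this to Start Speed, Sustained Effort, and Early Finish yields the 12 measures.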

SLIDE 15

Psychometric Validation of the Scale

  • Factor analysis yielded 2 factors
  • Start Speed and Sustained Effort
  • related to Math Interest & Self-efficacy
  • Early Finishing
  • related to Effort Regulation

SLIDE 16

Psychometric Validation of the Scale

  • Factor analysis yielded 2 factors
  • Start Speed and Sustained Effort
  • related to Math Interest & Self-efficacy
  • Early Finishing
  • related to Effort Regulation
  • Goal was to identify less knowledge-dependent measures

SLIDE 17

Combined measure yielded the best predictive model and was also reliable

Final Grade ~ Gender + Ethnicity + SES + Prior Grade + Absenteeism + Diligence + (1 | Class)

(Diagram: alternative factor structures over Start Speed measures 1-4 and Sustained Effort measures 5-8, including the combined model)
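The formula uses lme4-style mixed-model notation; written out under its standard reading (a sketch, not taken from the slides), it is a linear model with fixed effects for the listed covariates and a random intercept per class:

```latex
\mathrm{Grade}_{ij} = \beta_0 + \beta_1\,\mathrm{Gender}_{ij} + \beta_2\,\mathrm{Ethnicity}_{ij} + \beta_3\,\mathrm{SES}_{ij} + \beta_4\,\mathrm{PriorGrade}_{ij} + \beta_5\,\mathrm{Absenteeism}_{ij} + \beta_6\,\mathrm{Diligence}_{ij} + u_j + \varepsilon_{ij}
```

where student i sits in class j, u_j ~ N(0, sigma_u^2) is the class-level random intercept, and epsilon_ij ~ N(0, sigma^2) is the student-level residual.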

SLIDE 18

Combined measure yielded the best predictive model and was also reliable

Final Grade ~ Gender + Ethnicity + SES + Prior Grade + Absenteeism + Diligence + (1 | Class)

(Diagram: alternative factor structures over Start Speed measures 1-4 and Sustained Effort measures 5-8, including the combined model)

SLIDE 19

Common Challenges and Opportunities by Leveraging Behavior-based Scales

  • Defining Models
  • Iterating on Models

SLIDE 20

Defining Models

SLIDE 21

Model Parameter Setting

  • Aleven et al. (2006) derived a model for help-seeking strategies from Self-Regulated Learning theory
  • Defined thresholds for “Familiar-at-all” and “Sense of what to do”
  • Set to values that were “intuitively plausible, given our past experience”
  • Behavior-based Scale ~ Past Experience
  • Developers can utilize data to similarly inform thresholds based on theory-informed expectations
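A minimal sketch of that idea, assuming scale scores are available per student (the percentile choice is a hypothetical stand-in for the theory-informed expectation):

```python
import statistics

# Hypothetical sketch: derive a threshold (e.g., for "Familiar-at-all")
# from observed behavior-based scale scores instead of hand-picking an
# "intuitively plausible" value. Theory supplies which percentile of
# students should fall below the cut; the data supplies the cut point.

def threshold_from_scores(scores, percentile=25):
    cuts = statistics.quantiles(scores, n=100)  # 99 percentile cut points
    return cuts[percentile - 1]

cut = threshold_from_scores(list(range(1, 101)), percentile=25)
```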

SLIDE 22

Deriving Fully Machine Learned Models

  • Baker et al. (2004) derived a wide range of features for input into the algorithm
  • e.g.: P(know), time-on-last-3, help-in-last-8, etc.
  • Linear, quadratic, and interaction terms
  • Mathematical transforms of raw input data are common and valuable data science process tools
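As an illustration of those transforms (hypothetical feature names, not the authors' code), linear, quadratic, and pairwise interaction terms can be generated mechanically from the raw features:

```python
from itertools import combinations

# Hypothetical sketch: expand raw features such as P(know), time-on-last-3,
# and help-in-last-8 into linear, quadratic, and pairwise interaction terms.

def expand_features(row):
    out = dict(row)                                    # linear terms
    for name, value in row.items():
        out[name + "^2"] = value * value               # quadratic terms
    for (a, va), (b, vb) in combinations(row.items(), 2):
        out[a + "*" + b] = va * vb                     # interaction terms
    return out

features = expand_features({"p_know": 0.5, "time_last_3": 12.0})
```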

SLIDE 23

Deriving Fully Machine Learned Models

  • Featuretools: automating feature engineering with deep feature synthesis (Kanter & Veeramachaneni, 2015)

SLIDE 24

Hyperparameter Setting (Kuvalja et al., 2014)

  • Analyzed patterns of children’s self-directed speech for measuring children’s self-regulated learning
  • Required setting hyperparameters of the algorithm
  • (e.g.: minimum number of occurrences, probability-of-observing-a-pattern threshold)
  • Expert knowledge informed priors to set these thresholds
  • Behavior-based Scale ~ Expert Knowledge
  • Given a target for a machine learning problem, autonomous ML algorithms can automatically find optimal values for hyperparameters on a representative sample of data (Kandasamy et al., 2019)
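A minimal stdlib sketch of that kind of automation, with random search standing in for the Bayesian optimization of Kandasamy et al. (the score function and search space here are hypothetical):

```python
import random

# Hypothetical sketch: given a scoring function evaluated on a
# representative data sample, try random hyperparameter settings
# (e.g., minimum number of occurrences) and keep the best.

def random_search(score, space, n_trials=50, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score
```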

SLIDE 25

Common Challenges and Opportunities by Leveraging Behavior-based Scales

  • Defining Models
  • Iterating on Models

SLIDE 26

Accuracy of Construct Operationalization

  • Identified gaming in high and low post-test regardless of pre-test (Baker et al., 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al., 2008)

SLIDE 27

Accuracy of Construct Operationalization

  • Identified gaming in high and low post-test regardless of pre-test (Baker et al., 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al., 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al., 2008)

SLIDE 28

Accuracy of Construct Operationalization

  • Identified gaming in high and low post-test regardless of pre-test (Baker et al., 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al., 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al., 2008)

(Diagram: three quick help requests in sequence)

SLIDE 29

Accuracy of Construct Operationalization

  • Identified gaming in high and low post-test regardless of pre-test (Baker et al., 2004)
  • Hurt and not-hurt gaming behaviors appeared to be differentiable (Baker et al., 2008)
  • Reflection after bottom-out hints is linked to learning (Shih et al., 2008)

(Diagram: three quick help requests in sequence, followed by an unexpectedly slow attempt)

SLIDE 30

Supporting Model Iteration

  • Support qualitative analysis for behavior discovery
  • Text-replay method (Baker & de Carvalho, 2008)
  • Overwhelming quantity of data
  • 1 class of 15 students @ 2x/week = 200k transactions

SLIDE 31

Supporting Model Iteration

  • Leveraging a supervision signal to guide search through data
  • Extending the hurt vs. non-hurt analysis
  • Identify outlier students based on theoretically informed expectations
  • Narrows transactions (15k)
  • Additional work is needed to investigate how to leverage explainable-AI work to support more efficient browsing of sequential behavior data for anomalous patterns
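One simple version of that outlier step (a sketch; the talk does not specify the detection rule) flags students whose behavior-based scale score deviates strongly from the group, so qualitative review can focus on their transactions:

```python
import statistics

# Hypothetical sketch: flag students whose behavior-based scale score is far
# from the group mean, so qualitative review (e.g., text replays) can focus
# on their transactions instead of the full data set.

def outlier_students(scores, z_cut=2.0):
    values = list(scores.values())
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [sid for sid, s in scores.items() if abs(s - mu) / sd > z_cut]
```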

SLIDE 32

Conclusion

  • Behavior-based psychometric scales yield more valid & reliable measurement
  • More valid measurements lead to better initial analytic models
  • Scales allow human experts to embed theoretical expectations into the data, and algorithms can leverage this information to more intelligently tackle many data science tasks
  • Opportunity to investigate how tools can leverage behavior-scale information to support qualitative analysis processes to identify shortcomings in the operationalized construct

SLIDE 33

Acknowledgements

Ken Koedinger, Matt Bernacki, Queenie Kravitz, David Klahr, Audrey Russo, Sharon Carver, Franceska Xhakaj, Ken Holstein, Julian Ramos, Judith Tucker

Questions?