Predicting Performance on MOOC Assessments using Multi-Regression Models



SLIDE 1

Predicting Performance on MOOC Assessments using Multi-Regression Models

Zhiyun Ren, Huzefa Rangwala, Aditya Johri George Mason University 4400 University Drive, Fairfax, Virginia 22030

SLIDE 2

Outline

• Background
• Personal Linear Multi-Regression Models
• Feature selection
• Experiments and discussion
• Conclusion and future work

SLIDE 3

Background

SLIDE 4

Background

SLIDE 5

Overview

• Information we have: MOOC server logs
• What we want to do: predict students' performance

SLIDE 6

Challenge

• Various kinds of participants
• High attrition rate
• Flexible timetable
• Baselines we have tried: linear regression model, mean score
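The mean-score baseline named above can be sketched as follows. This is only an illustration under one assumption: that "mean score" means predicting a student's next score as the average of that student's earlier scores. The slide does not define it, and the function name, the `fallback` parameter, and the sample scores are all hypothetical.

```python
from statistics import mean

def mean_score_baseline(previous_scores, fallback=0.0):
    """Predict the next score as the mean of the student's earlier scores.

    `fallback` (hypothetical) is returned for students with no history yet.
    """
    return mean(previous_scores) if previous_scores else fallback

# Hypothetical score histories, in percent.
history = {"s1": [80, 60], "s2": []}
predictions = {s: mean_score_baseline(h, fallback=50.0) for s, h in history.items()}
```

A baseline like this ignores the server-log features entirely, which is why it is a useful floor for comparing feature-based models.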

SLIDE 7

Personal Linear Multi-Regression Models

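$$\hat{g}_{s,a} = b_s + b_{\ast} + p_s^{T} W f_{sa} = b_s + b_{\ast} + \sum_{d=1}^{l} \Big( p_{s,d} \sum_{k=1}^{n_F} f_{sa,k}\, w_{d,k} \Big)$$

$l$ -- Number of regression models
$n_F$ -- Number of features

$$\underset{(W,\,P,\,B)}{\text{minimize}} \;\; \frac{1}{2} \sum_{i=1}^{n} \big( g_{s,a} - \hat{g}_{s,a} \big)^2 + \lambda \big( \lVert P \rVert_F^2 + \lVert W \rVert_F^2 \big) + \gamma \big( \lVert P \rVert + \lVert W \rVert \big)$$

Symbols that are not legible in the source appear here as placeholders: the subscript of the second bias term ($b_{\ast}$), the summation index $i$ and limit $n$, the weights $\lambda$ and $\gamma$, and the unsubscripted norm in the last term.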
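A minimal sketch of the prediction rule on this slide, assuming the form "student bias plus a membership-weighted sum of l linear regression outputs over n_F features". The function names, the plain-list layout of `W`, and the sample values are assumptions for illustration. Only a single student bias `b_s` and a squared-Frobenius penalty on `P` and `W` are modeled; any further bias or regularization term on the slide is left out.

```python
def plmr_predict(b_s, p_s, W, f_sa):
    """Student bias plus the membership-weighted sum of l regression outputs.

    W holds l rows of n_F coefficients, p_s has length l (memberships),
    and f_sa has length n_F (features for student s on assessment a).
    """
    total = b_s
    for d, weights in enumerate(W):
        model_output = sum(w * f for w, f in zip(weights, f_sa))
        total += p_s[d] * model_output
    return total

def squared_frobenius(M):
    """Sum of squared entries of a matrix stored as a list of rows."""
    return sum(x * x for row in M for x in row)

def plmr_objective(grades, preds, P, W, lam):
    """Half the squared error plus lam * (||P||_F^2 + ||W||_F^2)."""
    loss = 0.5 * sum((g - g_hat) ** 2 for g, g_hat in zip(grades, preds))
    return loss + lam * (squared_frobenius(P) + squared_frobenius(W))

# Hypothetical values: l = 2 regression models, n_F = 2 features.
W = [[1.0, 0.0], [0.0, 2.0]]
p_s = [0.5, 0.5]
f_sa = [2.0, 1.0]
pred = plmr_predict(1.0, p_s, W, f_sa)
objective = plmr_objective([4.0], [pred], [p_s], W, lam=0.1)
```

Setting l = 1 with a fixed membership of 1 reduces this to an ordinary linear regression with a per-student offset, which is one way to see how the multi-regression model generalizes the linear baseline.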

SLIDE 8

Data structure

[Figure panels: (a) Homework and quiz; (b) Video; (c) Study session]

SLIDE 9

Feature selection

• Quiz-related features
• Time-related features
• Interval-based features
• Homework-related features

SLIDE 10

Feature selection

• Video-related features
• Session features

SLIDE 11

Experimental setup

• Differences in student motivation partition the data into two groups.
• Different models are applied to different data types.

SLIDE 12

Experimental protocol

• PreviousHW-based prediction
• PreviousOneHW-based prediction

[Diagram: a homework timeline (HW1, HW2, HW3, HW4, …) shown once for each protocol]
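The two protocols can be sketched as a choice of which earlier homeworks supply training data. This reading is an assumption taken from the protocol names alone: PreviousHW is taken to train on every homework before the target, and PreviousOneHW on only the immediately preceding one. The function name and the 1-based indexing are hypothetical.

```python
def training_homeworks(target, protocol):
    """Return the 1-based homework indices used to train a predictor for `target`.

    Assumed reading of the protocol names, not confirmed by the slide:
    PreviousHW uses HW1..HW(target-1); PreviousOneHW uses HW(target-1) only.
    """
    if target < 2:
        raise ValueError("the first homework has no earlier homework to train on")
    if protocol == "PreviousHW":
        return list(range(1, target))
    if protocol == "PreviousOneHW":
        return [target - 1]
    raise ValueError(f"unknown protocol: {protocol}")

all_previous = training_homeworks(4, "PreviousHW")
one_previous = training_homeworks(4, "PreviousOneHW")
```

Under this reading the two protocols coincide when predicting HW2 and diverge from HW3 onward.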

SLIDE 13

Experimental baseline: KT-IDEM

[Diagram: KT-IDEM network with rows of nodes labeled K, Q, and I, annotated with P(L0), P(T), P(G1) to P(G3), and P(S1) to P(S3); the sequence continues (……)]

Model parameters
• P(L0) = Initial knowledge
• P(T) = Probability of learning
• P(G1…n) = Probability of guess per question
• P(S1…n) = Probability of slip per question

n denotes the total number of questions.
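How these four parameters interact can be sketched with the standard knowledge-tracing update, using a separate guess and slip probability for each question. This is a generic illustration rather than the authors' implementation; the function names, question IDs, and parameter values are hypothetical.

```python
def p_correct(p_known, guess, slip):
    """Probability of a correct answer given the current knowledge estimate."""
    return p_known * (1.0 - slip) + (1.0 - p_known) * guess

def update_knowledge(p_known, correct, guess, slip, p_learn):
    """Bayesian update on the observed answer, then apply the learning step."""
    if correct:
        numerator = p_known * (1.0 - slip)
        denominator = numerator + (1.0 - p_known) * guess
    else:
        numerator = p_known * slip
        denominator = numerator + (1.0 - p_known) * (1.0 - guess)
    posterior = numerator / denominator
    return posterior + (1.0 - posterior) * p_learn

# Hypothetical parameters: P(L0), P(T), and per-question P(G), P(S).
p_L0, p_T = 0.4, 0.1
guess = {"q1": 0.2, "q2": 0.3}
slip = {"q1": 0.1, "q2": 0.05}

p_known = p_L0
for question, correct in [("q1", True), ("q2", False)]:
    predicted = p_correct(p_known, guess[question], slip[question])
    p_known = update_knowledge(p_known, correct, guess[question], slip[question], p_T)
```

Indexing guess and slip by question is what lets an easy question (high guess, low slip) move the knowledge estimate less than a hard one.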

SLIDE 14

Comparative Performance

• Prediction results with a varying number of regression models for the student group with continuous grade values

SLIDE 15

Comparative Performance

• Prediction results with a varying number of regression models for the student group with binary grade values

SLIDE 16

Comparative Performance

• Comparison of accuracy and F1 scores against the baseline approaches
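For reference, accuracy and F1 on binary grades can be computed as in the sketch below. The function name, the choice of `1` as the positive class, and the sample labels are assumptions for illustration; they are not results from the slides.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1

# Hypothetical labels: 1 = full credit, 0 = no credit.
acc, f1 = accuracy_and_f1([1, 0, 1, 1], [1, 1, 0, 1])
```

Reporting F1 alongside accuracy matters when one grade value dominates, since a model that always predicts the majority class can still score high accuracy.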

SLIDE 17

Feature Importance

SLIDE 18

Feature Importance

SLIDE 19

Conclusion and future work

• Prediction algorithm: personalized linear multi-regression model.
• Experimental results: improved performance compared to baseline methods.
• Other contribution: analysis of feature importance.
• Future work: set up an early warning system to help improve students' performance.

SLIDE 20

Thank you!