Knowledge Tracing Machines: Factorization Machines for Knowledge - PowerPoint PPT Presentation

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing Jill-Jênn Vie Hisashi Kashima KJMLW, February 22, 2019 https://arxiv.org/abs/1811.03388

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Practical intro When exercises are too easy/difficult, students get bored/discouraged. To personalize assessment, ⇒ need a model of how people respond to exercises. Example To personalize this presentation, ⇒ need a model of how people respond to my slides.

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Practical intro When exercises are too easy/difficult, students get bored/discouraged. To personalize assessment, → need a model of how people respond to exercises. Example To personalize this presentation, → need a model of how people respond to my slides.

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Practical intro When exercises are too easy/difficult, students get bored/discouraged. To personalize assessment, → need a model of how people respond to exercises. Example To personalize this presentation, → need a model of how people respond to my slides. p(understanding) Practical: 0.9 Theoretical: 0.6

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Theoretical intro Let us assume x is sparse.

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Theoretical intro Let us assume x is sparse. Linear regression y = � w , x � Logistic regression y = σ ( � w , x � ) where σ is sigmoid. Neural network x ( L +1) = σ ( � w , x ( L ) � ) where σ is ReLU. What if σ : x �→ x 2 for example?

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Theoretical intro Let us assume x is sparse. Linear regression y = � w , x � Logistic regression y = σ ( � w , x � ) where σ is sigmoid. Neural network x ( L +1) = σ ( � w , x ( L ) � ) where σ is ReLU. What if σ : x �→ x 2 for example? Polynomial kernel y = σ (1 + � w , x � ) where σ is a monomial. Factorization machine y = � w , x � + || V x || 2 Mathieu Blondel, Masakazu Ishihata, Akinori Fujino, and Naonori Ueda (2016). “Polynomial networks and factorization machines: new insights and efficient training algorithms”. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning-Volume 48 . JMLR. org, pp. 850–858

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Practical intro When exercises are too easy/difficult, students get bored/discouraged. To personalize assessment, → need a model of how people respond to exercises. Example To personalize this presentation, → need a model of how people respond to my slides.

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Practical intro When exercises are too easy/difficult, students get bored/discouraged. To personalize assessment, → need a model of how people respond to exercises. Example To personalize this presentation, → need a model of how people respond to my slides. p(understanding) Practical: 0.9 Theoretical: 0.9

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Students try exercises Math Learning Items 5 – 5 = ? New student ◦

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Students try exercises Math Learning Items 5 – 5 = ? 17 – 3 = ? New student ◦ ◦

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Students try exercises Math Learning Items 5 – 5 = ? 17 – 3 = ? 13 – 7 = ? New student ◦ ◦ ×

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Students try exercises Math Learning Items 5 – 5 = ? 17 – 3 = ? 13 – 7 = ? ◦ ◦ New student × Language Learning

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Students try exercises Math Learning Items 5 – 5 = ? 17 – 3 = ? 13 – 7 = ? ◦ ◦ New student × Language Learning Challenges Users can attempt a same item multiple times Users learn over time People can make mistakes that do not reflect their knowledge

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Predicting student performance: knowledge tracing Data A population of users answering items Events: “User i answered item j correctly/incorrectly” Side information If we know the skills required to solve each item e.g., + , × Class ID, school ID, etc. Goal: classification problem Predict the performance of new users on existing items\ Metric: AUC Method Learn parameters of questions from historical data e.g., difficulty Measure parameters of new students e.g., expertise

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Existing work Original Model Basically AUC Bayesian Knowledge Tracing Hidden Markov Model 0.67 (Corbett and Anderson 1994)

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Existing work Original Model Basically AUC Bayesian Knowledge Tracing Hidden Markov Model 0.67 (Corbett and Anderson 1994) Deep Knowledge Tracing Recurrent Neural Network 0.86 (Piech et al. 2015)

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Existing work Original Fixed Model Basically AUC AUC Bayesian Knowledge Tracing Hidden Markov Model 0.67 0.63 (Corbett and Anderson 1994) Deep Knowledge Tracing Recurrent Neural Network 0.86 0.75 (Piech et al. 2015)

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Existing work Original Fixed Model Basically AUC AUC Bayesian Knowledge Tracing Hidden Markov Model 0.67 0.63 (Corbett and Anderson 1994) Deep Knowledge Tracing Recurrent Neural Network 0.86 0.75 (Piech et al. 2015) Item Response Theory (Rasch 1960) Online Logistic Regression 0.76 (Wilson et al., 2016)

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Existing work Original Fixed Model Basically AUC AUC Bayesian Knowledge Tracing Hidden Markov Model 0.67 0.63 (Corbett and Anderson 1994) Deep Knowledge Tracing Recurrent Neural Network 0.86 0.75 (Piech et al. 2015) Item Response Theory (Rasch 1960) Online Logistic Regression 0.76 (Wilson et al., 2016) PFA ≤ DKT ≤ IRT ≤ KTM �� LogReg LSTM LogReg FM

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Limitations and contributions Several models for knowledge tracing were developed independently In our paper, we prove that our approach is more generic Our contributions Knowledge Tracing Machines unify most existing models Encoding student data to sparse features Then running logistic regression or factorization machines Better models found It is better to estimate a bias per item, not only per skill Side information improves performance more than higher dim.

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Our small dataset user item correct 1 1 1 User 1 answered Item 1 correct 1 2 0 User 1 answered Item 2 incorrect 2 1 0 User 2 answered Item 1 incorrect 2 1 1 User 2 answered Item 1 correct 2 2 ??? User 2 answered Item 2 ??? dummy.csv

0 1 1 0 0 1 0 0 1 1 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0 1 1 0 1 1 0 0 1 0 0 0 0 0 0 1 1 0 1 0 0 1 1 2 0 2 0 0 1 1 0 1 0 0 1 0 0 1 0 0 2 1 1 0 2 ??? 1 1 ??? 2 1 1 3 2 0 3 2 0 2 0 Users 2 2 1 2 2 correct item user 0 0 encode IRT PFA KTM data.csv Items 0 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 Skills 1 0 0 0 0 0 0 0 0 0 2 1 Fails Wins 0 Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Our approach Encode data to sparse features Q 1 Q 2 Q 3 KC 1 KC 2 KC 3 KC 1 KC 2 KC 3 KC 1 KC 2 KC 3 sparse matrix X Run logistic regression or factorization machines ⇒ recover existing models or better models

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Model 1: Item Response Theory Learn abilities θ i for each user i Learn easiness e j for each item j such that: Pr (User i Item j OK) = σ ( θ i + e j ) σ : x �→ 1 / (1 + exp( − x )) logit Pr (User i Item j OK) = θ i + e j Really popular model, used for the PISA assessment Logistic regression Learn w such that logit Pr ( x ) = � w , x � + b

Knowledge Tracing Machines: Factorization Machines for Knowledge - PowerPoint PPT Presentation

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing Jill-Jnn Vie Hisashi Kashima KJMLW, February 22, 2019

Advanced Ray Tracing 1 2/8/2006 Distributed Ray Tracing Distributed ray tracing is an

Computer Graphics - Ray-Tracing II - Hendrik Lensch Computer Graphics WS07/08 Ray Tracing II

1 minute Path tracing Bidirectional path tracing Progressive photon mapping 1 minute

MIT 6.837 - Ray Tracing Ray Tracing MIT EECS 6.837 Most slides are taken from Frdo Durand and

Advanced Ray Tracing Stochastic ray tracing: distribute rays stochastically across pixel

61A Extra Lecture 9 Announcements Pixels (Demo) Ray Tracing Ray Tracing A technique for

Week 4 Video 2 Knowledge Inference: Bayesian Knowledge Tracing Bayesian Knowledge Tracing (BKT)

Kernel Machines Support Vector Machines 1 Kernel Machines Optimal Separating HyperPlanes Soft

Computer Graphics - Ray Tracing I - Hendrik Lensch Computer Graphics WS07/08 Ray Tracing I

Introduction to Path Tracing Marc Sunet Table of contents From Ray Tracing to Path Tracing The

Ray Tracing 1 Ray Tracing Ray Tracing kills two birds with one stone: Solves the Hidden

Tracing with Perf tools Namhyung Kim 2013-11-13 Wed Namhyung Kim Tracing with Perf tools

Knowledge Tracing Machines: Families of models for predicting student performance Jill-Jnn Vie

Tensor Factorization via Matrix Factorization Volodymyr Kuleshov Arun Tejasvi Chaganty Percy

Matrix Factorization and Factorization Machines for Recommender Systems Chih-Jen Lin Department

Knowledge-Based Agents knowledge knowledge representation, knowledge base, types of knowledge

Reading to Learn (RtL) Effective Cross-Content Literacy Instruction in Support of Engaged Content

Knowledge-intensive Processes: An Overview of Contemporary Approaches Claudio Di Ciccio, Andrea

Knowledge Representation Artificial Intelligence Lecture 5 Karim Bouzoubaa Intelligence - K

Uncertainty with logical, procedural and relational languages David Poole Department of Computer

Investigating Collaboration Dynamics in Different Ontology Development Environments Marco

Web of Things Linked Data & Semantic Processing Task Force Osaka Face to Face, May 2017 Dave

Presentation Design Principles Grouping Contrast Proportion Usability Presentation Design

T r ansmission of attac hme nt in thr e e ge ne r ations. Continuity and r e ve r sal

Sambuz

Useful Links

Newsletter

Mail Us

Knowledge Tracing Machines: Factorization Machines for Knowledge - PowerPoint PPT Presentation

Introduction Knowledge Tracing Encoding existing models Knowledge Tracing Machines Results Conclusion Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing Jill-Jnn Vie Hisashi Kashima KJMLW, February 22, 2019

Advanced Ray Tracing 1 2/8/2006 Distributed Ray Tracing Distributed ray tracing is an

Computer Graphics - Ray-Tracing II - Hendrik Lensch Computer Graphics WS07/08 Ray Tracing II

1 minute Path tracing Bidirectional path tracing Progressive photon mapping 1 minute

MIT 6.837 - Ray Tracing Ray Tracing MIT EECS 6.837 Most slides are taken from Frdo Durand and

Advanced Ray Tracing Stochastic ray tracing: distribute rays stochastically across pixel

61A Extra Lecture 9 Announcements Pixels (Demo) Ray Tracing Ray Tracing A technique for

Week 4 Video 2 Knowledge Inference: Bayesian Knowledge Tracing Bayesian Knowledge Tracing (BKT)

Kernel Machines Support Vector Machines 1 Kernel Machines Optimal Separating HyperPlanes Soft

Computer Graphics - Ray Tracing I - Hendrik Lensch Computer Graphics WS07/08 Ray Tracing I

Introduction to Path Tracing Marc Sunet Table of contents From Ray Tracing to Path Tracing The

Ray Tracing 1 Ray Tracing Ray Tracing kills two birds with one stone: Solves the Hidden

Tracing with Perf tools Namhyung Kim 2013-11-13 Wed Namhyung Kim Tracing with Perf tools

Knowledge Tracing Machines: Families of models for predicting student performance Jill-Jnn Vie

Tensor Factorization via Matrix Factorization Volodymyr Kuleshov Arun Tejasvi Chaganty Percy

Matrix Factorization and Factorization Machines for Recommender Systems Chih-Jen Lin Department

Knowledge-Based Agents knowledge knowledge representation, knowledge base, types of knowledge

Reading to Learn (RtL) Effective Cross-Content Literacy Instruction in Support of Engaged Content

Knowledge-intensive Processes: An Overview of Contemporary Approaches Claudio Di Ciccio, Andrea

Knowledge Representation Artificial Intelligence Lecture 5 Karim Bouzoubaa Intelligence - K

Uncertainty with logical, procedural and relational languages David Poole Department of Computer

Investigating Collaboration Dynamics in Different Ontology Development Environments Marco

Web of Things Linked Data &amp; Semantic Processing Task Force Osaka Face to Face, May 2017 Dave

Presentation Design Principles Grouping Contrast Proportion Usability Presentation Design

T r ansmission of attac hme nt in thr e e ge ne r ations. Continuity and r e ve r sal

Sambuz

Useful Links

Newsletter

Mail Us

Web of Things Linked Data & Semantic Processing Task Force Osaka Face to Face, May 2017 Dave