
The 28th ACM International Conference on Information and Knowledge Management (CIKM 2019)
Reporter: Zhenya Huang
Date: 2019.11.04
Anhui Province Key Laboratory of Big Data Analysis and Application

Outline
1. Background
2. Problem Definition
3. Framework
4. Experiment
5. Conclusion & Future work

Background
- Online education systems are becoming more and more popular
  - Abundant learning materials
    - E.g., exercises, courses, videos
  - Personalized learning service
    - Students can learn at their own pace
  - Various platforms
    - MOOC
    - Intelligent Tutoring System
    - Online Judging System

Recommendation
- Recommender systems
  - Suggest suitable exercises instead of leaving students to search on their own
  - Interactive system between agent and student
- Key problem
  - Design an optimal strategy (algorithm) that can recommend the best exercise for each student at the right time
[Figure: the agent sends a recommendation to the student; the student returns feedback to the agent]

Related work
- Traditional recommendation for online learning
  - Basic idea:
    - Try to discover the weaknesses of students
    - Recommend the exercises that students may not have learned well
  - Existing methods
    - Educational psychology
      - Cognitive diagnosis studies
      - Traditional Q-learning algorithm
    - Data-driven algorithms
      - Content-based methods
      - Collaborative filtering
      - Deep neural networks

Related work
- Limitations
  - Single objective
    - Target specific concepts with repeated exercising
  - Recommending non-mastered exercises
    - Always too hard
    - Students lose interest in learning
[Figure: a sequence of recommended exercises labeled Function, Function, Function, Function]
What kinds of objectives should we consider in exercise recommendation?

Exercise Recommendation
- Multiple objectives
  - Review & Explore
    - Review non-mastered concepts vs. seek new knowledge
  - Smoothness
    - Difficulty levels of consecutive recommendations should not vary dramatically
  - Engagement
    - Keep students learning
    - Some exercises are challenging, but some are "gifts"

Exercise Recommendation
- Challenges
  - How to define multiple objectives?
    - Review & Explore
    - Smoothness
    - Engagement
  - How to enable flexible recommendations that consider the above objectives simultaneously?
    - How to track students' learning states
    - How to quantify the objectives
    - Large space of exercise candidates

Problem Definition
- Given:
  - Student: exercising record
  - Exercise: triplet of content, knowledge, and difficulty level
    - Content: c is a word sequence
    - Knowledge (concept): e.g., Function
    - Difficulty level: d is the error rate, i.e., the percentage of students who answer exercise e wrong
- Markov Decision Process (MDP)
  - State s_t: the exercising history of the student
  - Action a_t: recommend an exercise e_{t+1} based on state s_t
  - Reward r(s_t, a_t): considers multiple objectives based on the performance feedback
  - Transition T: a function S × A → S, mapping state s_t to state s_{t+1}
- Goal:
  - Find an optimal policy π : S → A for recommending exercises to students that maximizes the multi-objective rewards.
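A minimal sketch of how the MDP ingredients on this slide could be represented in Python. The names `Exercise`, `Interaction`, `error_rate`, and `transition` are illustrative assumptions, not taken from the slides; the state is simply the student's exercising history, and a transition appends the newest interaction.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Exercise:
    content: Tuple[str, ...]  # c: word sequence
    concept: str              # knowledge concept, e.g. "Function"
    difficulty: float         # d: error rate in [0, 1]


@dataclass(frozen=True)
class Interaction:
    exercise: Exercise
    correct: bool             # performance feedback from the student


# State s_t: the exercising history of the student up to step t.
State = Tuple[Interaction, ...]


def error_rate(answers: List[bool]) -> float:
    """Difficulty d: fraction of students who answered the exercise wrong."""
    if not answers:
        raise ValueError("need at least one answer to estimate difficulty")
    return sum(1 for a in answers if not a) / len(answers)


def transition(state: State, action: Exercise, correct: bool) -> State:
    """T: S x A -> S. Recommending an exercise (the action) and observing
    the student's feedback yields the next state."""
    return state + (Interaction(action, correct),)
```

Keeping the state as an immutable tuple means every step produces a new state object, which makes logged trajectories easy to store for later training.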

DRE framework
- At a glance
  - Deep reinforcement learning (Q-learning) framework
  - Exercise Q-network (EQN)
    - Estimate Q-values, generate exercise recommendations (taking action)
      - Track student learning states
      - Extract exercise semantics
    - Two implementations
      - EQNM with Markov property
      - EQNR with recurrent manner
  - Multi-objective rewards
    - Review & Explore
    - Smoothness
    - Engagement
  - Off-policy training

DRE framework
- Optimization objective
  - Future rewards of state-action pair (s, a)
  - Optimal action-value function
  - Computing the Q-values for all a′ ∈ A is infeasible
    - Would need to estimate and store all state-action pairs (large set of exercise candidates)
    - Would need to update all Q-values (a student practices very few exercises)
- Solution
  - Exercise Q-Network: a network approximator with parameters θ
  - Minimize the objective function to estimate this network.
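For reference, the quantities named on this slide are usually written as follows in deep Q-learning. This is the textbook formulation, not the slide's own equations, which did not survive extraction; the symbol R_t, the discount factor γ, and the target-network parameters θ⁻ are assumed notation.

```latex
\begin{align}
R_t &= \sum_{t'=t}^{\infty} \gamma^{\,t'-t}\, r(s_{t'}, a_{t'}) \\
Q^{*}(s, a) &= \mathbb{E}_{s'}\!\left[\, r(s, a) + \gamma \max_{a' \in A} Q^{*}(s', a') \,\middle|\, s, a \right] \\
L(\theta) &= \mathbb{E}_{(s, a, r, s')}\!\left[ \big( y - Q(s, a; \theta) \big)^{2} \right],
\qquad y = r + \gamma \max_{a' \in A} Q(s', a'; \theta^{-})
\end{align}
```

The maximum over a′ ∈ A in the second line is the step that becomes infeasible to tabulate when the exercise pool is large, which is why a parametric approximator Q(s, a; θ) is fitted instead.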

DRE framework
- Exercise Q-Network
  - Goal: estimate the action Q-value Q(s, a) of taking an action a at state s
    - Implement the network approximator
  - Key points:
    - Learn the semantics of each exercise
      - Exercise Module
    - Learn the student knowledge states at each step
      - EQNM: Markov property
      - EQNR: recurrent manner

Exercise Q-Network
- Exercise Module
  - Goal: learn the semantics of each exercise
  - Combines knowledge, content, and difficulty
[Figure labels: content embedding; knowledge embedding]

Exercise Q-Network
- Two implementations
  - Goal: learn the student knowledge states at each step
  - Estimate the Q-value Q(s, a) of taking an action at step t
  - EQNM: only observes the current state
  - EQNR: considers historical state trajectories
[Figure labels: n-layer fully-connected layers; current state embedding]

Multi-objective rewards
- Review & Explore
  - Intuition: review non-mastered concepts vs. seek new knowledge
  - Review factor: review what the student has not learned well; otherwise a punishment (negative reward)
  - Explore factor: encourage seeking diverse concepts with a stimulation (positive reward)
- Smoothness
  - Intuition: the difficulty levels of two consecutive recommendations should not vary dramatically
  - Negative squared loss
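A minimal sketch of the review & explore and smoothness rewards described on this slide. The slide's formulas did not survive extraction, so the exact conditions, the constants `PUNISHMENT` and `STIMULATION`, and all function names are assumptions chosen to match the stated intuition, not the authors' definitions.

```python
from typing import Set

PUNISHMENT = -1.0   # hypothetical value; the slide only requires it to be < 0
STIMULATION = 1.0   # hypothetical value; the slide only requires it to be > 0


def review_explore_reward(
    prev_concepts: Set[str],
    prev_correct: bool,
    next_concepts: Set[str],
    seen_concepts: Set[str],
) -> float:
    """Punish leaving a concept the student just failed; reward new concepts."""
    # Review factor: the last answer was wrong, yet the recommendation
    # shares no concept with that exercise.
    if not prev_correct and not (next_concepts & prev_concepts):
        return PUNISHMENT
    # Explore factor: the recommendation covers a concept never practiced.
    if next_concepts - seen_concepts:
        return STIMULATION
    return 0.0


def smoothness_reward(prev_difficulty: float, next_difficulty: float) -> float:
    """Negative squared difference between consecutive difficulty levels."""
    return -((next_difficulty - prev_difficulty) ** 2)
```

Under these assumptions the smoothness reward is always non-positive and reaches zero only when two consecutive exercises have the same difficulty.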

Multi-objective rewards
- Engagement
  - Intuition: keep students learning (interested) by avoiding exercises that are too hard or too easy all the time
    - Make some recommendations challenging while others feel like "gifts"
  - Learning goal g
  - Average over the student's N historical performances
- Balance the multi-objective rewards
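One way the engagement reward and the balancing step could look, sketched under stated assumptions: engagement is taken as one minus the gap between the learning goal g and the average of the last N performances, and the three rewards are balanced by a weighted sum. Both forms, and the names `engagement_reward` and `combined_reward`, are illustrative guesses rather than the slide's equations.

```python
from typing import Sequence, Tuple


def engagement_reward(goal: float, performances: Sequence[float], n: int) -> float:
    """1 - |g - mean of the last n performances| (assumed form)."""
    if n <= 0:
        raise ValueError("n must be positive")
    recent = performances[-n:]
    if not recent:
        return 0.0  # no history yet, so no engagement signal
    average = sum(recent) / len(recent)
    return 1.0 - abs(goal - average)


def combined_reward(
    r_review_explore: float,
    r_smoothness: float,
    r_engagement: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted sum of the three objective rewards (weights are hypothetical)."""
    w1, w2, w3 = weights
    return w1 * r_review_explore + w2 * r_smoothness + w3 * r_engagement
```

With a goal of 0.5, a student who alternates between right and wrong answers gets the maximum engagement reward, which matches the idea of mixing challenging exercises with easy ones.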

Off-policy training
- Training with offline logs
  - Learn from another agent's policy
  - Experience replay
  - Two separate networks
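The two mechanisms named here can be sketched with the standard library alone. The sketch assumes that "two separate networks" refers to the usual online/target pair from DQN; `ReplayBuffer`, `sync_target`, and the dictionary stand-in for network parameters are hypothetical names, not part of the DRE code.

```python
import random
from collections import deque
from typing import Deque, Dict, List, Tuple

# (state, action, reward, next_state); concrete types are placeholders.
Transition = Tuple[tuple, int, float, tuple]


class ReplayBuffer:
    """Fixed-size store of logged transitions, sampled uniformly at random."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        # When the buffer is full, the oldest transition is dropped.
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)


def sync_target(online: Dict[str, float], target: Dict[str, float]) -> None:
    """Copy the online network's parameters into the target network."""
    target.clear()
    target.update(online)
```

Sampling from the buffer breaks the temporal correlation of logged student sessions, and syncing the target only occasionally keeps the regression target stable between updates.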

Experiment
- Datasets
  - MATH dataset (high school level)
  - PROGRAM dataset (OJ platform)
- Data analysis
  - Learning session
    - If the interval between two timestamps exceeds 24 (10) hours, split the record into two sessions
  - Longer sessions have larger concept coverage
  - Longer sessions contain more samples with smaller difficulty differences
  - Longer sessions have exercises with medium difficulty on average
- https://base.ustc.edu.cn/data/DRE/
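The session rule above can be sketched as a small helper. It assumes timestamps are given in seconds and sorted in ascending order; `split_sessions` and its parameter names are illustrative, not from the released data tools.

```python
from typing import List


def split_sessions(timestamps: List[float], max_gap_hours: float = 24.0) -> List[List[float]]:
    """Start a new session whenever the gap to the previous record
    exceeds max_gap_hours. Timestamps are seconds, sorted ascending."""
    max_gap_seconds = max_gap_hours * 3600
    sessions: List[List[float]] = []
    for ts in timestamps:
        if sessions and ts - sessions[-1][-1] <= max_gap_seconds:
            sessions[-1].append(ts)   # close enough: same session
        else:
            sessions.append([ts])     # first record or gap too long
    return sessions
```

Passing `max_gap_hours=10.0` gives the alternative threshold mentioned in parentheses on the slide.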

Experiment
- Offline evaluation (point-wise recommendation)
  - We evaluate methods on logged data
    - Static
    - Contains only the student-exercise performance pairs that were recorded
    - Only the students' final scores on exercises are known
  - Ranking problem
    - For a student: rank an exercise list at a particular time
    - Based on performance: from bad to good
  - Data partition: for each sequence, 70% training, 30% testing
  - DRE framework
  - Baselines:
    - Cognitive diagnosis: IRT
    - Recommender system: PMF, FM
    - Deep learning: DKT, DKVMN
    - Reinforcement learning: DQN
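A small sketch of the two mechanical steps in this protocol: the per-sequence 70/30 partition and ranking exercises by performance from bad to good. The function names and the use of a score dictionary are assumptions for illustration; how the scores are produced is left to whichever model is being evaluated.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_sequence(seq: Sequence[T], train_percent: int = 70) -> Tuple[List[T], List[T]]:
    """Chronological split of one student's sequence: the first
    train_percent of records for training, the rest for testing."""
    cut = (len(seq) * train_percent) // 100   # integer math avoids float rounding
    return list(seq[:cut]), list(seq[cut:])


def rank_exercises(scores: Dict[str, float]) -> List[str]:
    """Order exercise ids by performance score, from bad (low) to good (high)."""
    return sorted(scores, key=lambda exercise_id: scores[exercise_id])
```

Splitting by position rather than at random keeps the test records strictly later than the training records within each sequence.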

Experiment
- Offline evaluation (point-wise recommendation)
  - DRER and DREM generate accurate recommendations
  - EQN > DQN: EQN captures the state representations of students well
  - DRER > DREM: EQNR can track long-term dependencies

Experiment
- Online evaluation (sequence-wise recommendation)
  - We evaluate methods in a simulated environment
    - Implement a student simulator
    - Real-time interaction
  - Sequential recommendation scenario
    - For a student: provide the best exercise step by step
    - Evaluate the effectiveness on three rewards (multiple objectives)
  - Preliminaries
    - Student simulator: EERNN (state-of-the-art)
    - Data partition: 50% for training the simulator, 50% for training the DRE framework
