Addressing the Testing Challenge with a Web-Based E-Assessment

Addressing the Testing Challenge with a Web-Based E-Assessment System that Tutors as it Assesses Mingyu Feng, Worcester Polytechnic Institute (WPI) Neil T. Heffernan, Worcester Polytechnic Institute (WPI) Kenneth R. Koedinger, Carnegie Mellon University (CMU)

The “ASSISTment” System
- An e-assessment and e-learning system that does both ASSISTing of students and assessMENT (movie)
- www.assistment.org
- Massachusetts Comprehensive Assessment System “MCAS”
- Web-based system built on the Common Tutoring Object Platform (CTOP) [1]
[1] Nuzzo-Jones, G., Macasek, M.A., Walonoski, J., Rasmussen, K.P., Heffernan, N.T. Common Tutor Object Platform, an e-Learning Software Development Strategy. WPI technical report WPI-CS-TR-06-08.
May 25th, 2006 WWW’06

ASSISTment
- We break multi-step problems into “scaffolding questions”
- “Hint Messages”: given on demand, they give hints about what step to do next
- “Buggy Message”: a context-sensitive feedback message
- “Knowledge Components”: skills, strategies, concepts
- The state reports to teachers on 5 areas
- We seek to report on 100 knowledge components
- How does a student work with the ASSISTment? (movie)
[Screenshot labels: the original question (Demo/movie), covering a. Congruence, b. Perimeter, c. Equation-Solving; the 1st scaffolding question (Congruence); the 2nd scaffolding question (Perimeter); a buggy message; a hint message]

Goal
- Help student learning (this paper’s goal [2][3])
- Assess students’ performance and present results to teachers (the focus of this work)
- Online “Grade book” report
[2] Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan, N.T., Koedinger, K.R., Junker, B., Ritter, S., Knight, A., Aniszczyk, C., Choksey, S., Livak, T., Mercado, E., Turner, T.E., Upalekar, R., Walonoski, J.A., Macasek, M.A., Rasmussen, K.P. (2005). The Assistment Project: Blending Assessment and Assisting. In C.K. Looi, G. McCalla, B. Bredeweg, & J. Breuker (Eds.), Proceedings of the 12th International Conference on Artificial Intelligence in Education, 555-562. Amsterdam: IOS Press.
[3] Razzaq, L., Heffernan, N.T. (in press). Scaffolding vs. hints in the Assistment System. In Ikeda, Ashley & Chan (Eds.), Proceedings of the Eighth International Conference on Intelligent Tutoring Systems. Springer-Verlag: Berlin. pp. 635-644. 2006.

Outline for the talk
- Part I: Using Dynamic Measures
- Part II: Longitudinal Models tracking student learning over time
  - Able to tell which schools provide the most learning to students
  - Can we tell teachers which skills are being learned?

Data Source
- 600+ students of two middle schools
- Used the ASSISTment system every other week from Sep. 2004 to June 2005
- Real MCAS score: test taken in May 2005
- 2 paper-and-pencil tests, administered in Sep. 2004 and March 2005

Part I: Using Dynamic Measures
Research Questions
- Can we predict a student's MCAS score more accurately using the online assistance information (time, performance on scaffolding questions, number of attempts, number of hints)?
- Can this online assessment system predict MCAS scores better than the traditional paper-and-pencil test does?

Part I: Using Dynamic Measures
Approach
- Run forward stepwise linear regression to train regression models using different independent variables
Result
Model | Independent variables entered | # Variables | R² | BIC+ | MAD*
Model I | Paper practice results only | 2 | .588 | -358 | 6.20881
Model II | The single online static metric of percent correct on original questions | 1 | .567 | -343 | 6.21108
Model III | Model II plus all other online measures | 5 | .663 | -423 | 5.44183
+ BIC: Bayesian Information Criterion
* MAD: Mean Absolute Deviance
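The slide does not say how BIC or MAD were computed. As a minimal sketch, the code below assumes the R²-based BIC approximation n·ln(1 − R²) + p·ln(n) and the usual mean of absolute residuals for MAD. Both formulas, the function names, and the sample size of 417 are assumptions; the sample size was picked only because it approximately reproduces the reported BIC values under this formula.

```python
import math


def bic_from_r2(n, r_squared, num_predictors):
    """Assumed BIC approximation: n * ln(1 - R^2) + p * ln(n). Lower is better."""
    return n * math.log(1.0 - r_squared) + num_predictors * math.log(n)


def mean_absolute_deviation(actual, predicted):
    """Average absolute gap between actual and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# (R^2, number of variables) as reported in the Result table on this slide.
REPORTED_MODELS = {
    "Model I": (0.588, 2),
    "Model II": (0.567, 1),
    "Model III": (0.663, 5),
}

# Assumption: the slides never state the number of students in the regression.
ASSUMED_N = 417

if __name__ == "__main__":
    for name, (r2, p) in REPORTED_MODELS.items():
        print(name, round(bic_from_r2(ASSUMED_N, r2, p)))
```

Under these assumptions the rounded values agree with the BIC column (-358, -343, -423), which suggests, but does not confirm, that a formula of this kind was used. The extra predictors in Model III are penalized by the p·ln(n) term, yet its higher R² still gives it the lowest BIC.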

Part I: Using Dynamic Measures
Model III
Order | Variable | Coeff. | Std. Coeff.
1 | PERCENT_CORRECT | 32.976 | .425
2 | AVG_ATTEMPT | -11.209 | -.199
3 | AVG_ITEM_TIME | -.037 | -.143
4 | AVG_HINT_REQUEST | -2.420 | -.121
5 | ORIGINAL_PERCENT_CORRECT | 12.618 | 1.66
What do we see from Model III?
- The more hints, attempts, and time a student needs to solve a problem, the lower the predicted score
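To make the direction of each effect concrete, the sketch below applies the unstandardized Model III coefficients to a change in a student's measures. The slide gives no intercept and no units for the variables, so the code computes only the change in predicted score, not the score itself; the function name and the example inputs are hypothetical.

```python
# Unstandardized coefficients as listed for Model III on this slide.
MODEL_III_COEFFS = {
    "PERCENT_CORRECT": 32.976,
    "AVG_ATTEMPT": -11.209,
    "AVG_ITEM_TIME": -0.037,
    "AVG_HINT_REQUEST": -2.420,
    "ORIGINAL_PERCENT_CORRECT": 12.618,
}


def predicted_score_change(feature_changes):
    """Change in predicted score for the given changes in the measures.

    feature_changes maps a variable name to how much it changes,
    in whatever unit the original model used (not stated on the slide).
    """
    return sum(
        MODEL_III_COEFFS[name] * delta for name, delta in feature_changes.items()
    )


if __name__ == "__main__":
    # Hypothetical student: one more attempt and one more hint per item on average.
    change = predicted_score_change({"AVG_ATTEMPT": 1.0, "AVG_HINT_REQUEST": 1.0})
    print(f"Predicted score change: {change:.3f}")
```

Because a linear model is additive, the intercept cancels out when two students are compared, which is why the comparison is possible without it.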

Part II: Track Learning Longitudinally
- Recall the problems of prediction in the Grade book
  - Based only on static measures (discussed in Part I)
  - Time ignored (addressed in Part II)
- What if we take time into consideration?
Research Questions
- Can our system detect performance improving over time?
- Can we tell the difference in learning rate between students from different schools? Different teachers? (Who cares?)
- Do students differ in how they learn different skills?
Approach: longitudinal data analysis
Note: Unlike Razzaq, Feng et al., which looks at student performance gain over learning opportunity pairs within the ASSISTment system, here “learning” includes students learning in class too.

Longitudinal Data Analysis
What do we get from a longitudinal model?
- Average population trajectory for the specified group
  - Trajectory indicated by two parameters: intercept $\gamma_{00}$ and slope $\gamma_{10}$
  - The average estimated score for a group at time $j$ is $Y_j = \gamma_{00} + \gamma_{10} \cdot \mathit{TIME}_j$
- One trajectory for every single student
  - Each student gets two parameters that vary from the group average: intercept $\gamma_{00} + \zeta_{0i}$ and slope $\gamma_{10} + \zeta_{1i}$
  - The estimated score for student $i$ at time $j$ is $Y_{ij} = (\gamma_{00} + \zeta_{0i}) + (\gamma_{10} + \zeta_{1i}) \cdot \mathit{TIME}_j$
- Students’ initial knowledge is indicated by the intercept, while the slope shows the learning rate
[4] Singer, J. D. & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Occurrence. Oxford University Press, New York.
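The two trajectory formulas on this slide can be sketched directly in code. The function names and all numeric values below are hypothetical and chosen only for illustration; fitting the parameters from data (for example with a mixed-effects package) is outside this sketch.

```python
def group_score(gamma_00, gamma_10, time_j):
    """Average estimated score for the group at time j."""
    return gamma_00 + gamma_10 * time_j


def student_score(gamma_00, gamma_10, zeta_0i, zeta_1i, time_j):
    """Estimated score for student i at time j.

    zeta_0i and zeta_1i are the student's deviations from the group
    intercept and the group slope.
    """
    return (gamma_00 + zeta_0i) + (gamma_10 + zeta_1i) * time_j


if __name__ == "__main__":
    # Hypothetical group: starts at 20 points and gains 1.5 points per month.
    # Hypothetical student: starts 3 points lower but learns 0.5 points/month faster.
    for month in range(0, 9, 4):
        print(
            month,
            group_score(20.0, 1.5, month),
            student_score(20.0, 1.5, -3.0, 0.5, month),
        )
```

With these illustrative numbers the student starts below the group average and overtakes it after month 6, which is the kind of pattern the intercept and slope are meant to separate.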


[Figure: 17 students from one class; % correct (Y axis) over a given month (X axis). Caption fragment: Table 2. Regression Models]


Part II: Track Learning Longitudinally
Result
- Unconditional model (Model A): no predictors
- Growth model (Model B)
  - Estimated initial average PredictedScore = 18
  - Estimated average monthly learning rate = 1.29
  - Observation: students were learning over time
- Add in school/teacher/class (Models D/E/F)
  - Model D shows a statistically significant advantage as measured by BIC
  - Observation: students from different schools differ in both incoming knowledge and learning rate
Model comparison (from the slide diagram)
- Unconditional means model (Model A, no predictor): BIC = 31712, #param = 3
- Unconditional growth model (Model B, TIME): BIC = 31628, #param = 6 (Diff = 84 from Model A)
- Model D (TIME + SCHOOL): BIC = 31616, #param = 8 (Diff = 12 from Model B)
- Model E (TIME + TEACHER) and Model F (TIME + CLASS): BIC = 31668 and BIC = 31672; #param = 70 and #param = 20 (values listed in the order they appear in the slide diagram)
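The BIC comparison on this slide is simple arithmetic: a richer model is preferred when it lowers BIC. The sketch below recomputes the two differences shown on the slide and grades them with a commonly cited rule of thumb (Raftery, 1995). The slide does not state which threshold the authors used, so the grading function is an assumption, as are the function names.

```python
# BIC values as reported on this slide.
REPORTED_BIC = {
    "A (unconditional means)": 31712,
    "B (TIME)": 31628,
    "D (TIME + SCHOOL)": 31616,
}


def bic_improvement(bic_simpler, bic_richer):
    """Positive when the richer model has the lower (better) BIC."""
    return bic_simpler - bic_richer


def evidence_grade(diff):
    """Assumed rule of thumb for a BIC difference in favor of the richer model."""
    if diff > 10:
        return "very strong"
    if diff > 6:
        return "strong"
    if diff > 2:
        return "positive"
    if diff > 0:
        return "weak"
    return "none"


if __name__ == "__main__":
    a_to_b = bic_improvement(
        REPORTED_BIC["A (unconditional means)"], REPORTED_BIC["B (TIME)"]
    )
    b_to_d = bic_improvement(REPORTED_BIC["B (TIME)"], REPORTED_BIC["D (TIME + SCHOOL)"])
    print("A -> B:", a_to_b, evidence_grade(a_to_b))
    print("B -> D:", b_to_d, evidence_grade(b_to_d))
```

Both reported BIC values for Models E and F (31668 and 31672) are higher than Model B's 31628, so under this reading neither teacher nor class improves on the plain growth model once their extra parameters are penalized.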

Part II: Track Learning Longitudinally
The last question
- Can we detect differences in the learning rates of different skills?

[Figure: Growth of 5 Skills over Time for One Student. Y axis: Percent Correct (0 to 80); X axis: Time (Sept to March). Series: Geometry, Algebra, Measurement, Data Analysis, Number Sense]

[Figure: Growth of 5 Skills over Time for One Student, with a linear trend line for each skill. Y axis: Percent Correct (0 to 80); X axis: Time (Sept to March). Series: Geometry, Algebra, Measurement, Data Analysis, Number Sense]
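The per-skill trend lines in this chart can be reproduced with an ordinary least-squares fit of percent correct against month. The sketch below assumes that is how the lines were drawn; the monthly values are hypothetical placeholders, since the student's actual data is not recoverable from the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Months coded 0..6 for Sept..March.
MONTHS = list(range(7))

# Hypothetical percent-correct values for one student, for illustration only.
HYPOTHETICAL_SKILLS = {
    "Geometry": [30, 34, 41, 45, 52, 55, 61],
    "Algebra": [40, 41, 39, 44, 43, 46, 45],
}

if __name__ == "__main__":
    for skill, scores in HYPOTHETICAL_SKILLS.items():
        slope, intercept = fit_line(MONTHS, scores)
        print(f"{skill}: starts near {intercept:.1f}, gains {slope:.2f} points/month")
```

The slope plays the role of the per-skill learning rate and the intercept the role of initial knowledge, mirroring the intercept and slope interpretation used in the longitudinal model.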

Part II: Track Learning Longitudinally
The last question
- Can we detect differences in the learning rates of different skills?
Yes we can! In this paper we showed that we can use the model with 5 skills to predict the students' own data more accurately. More recent studies we have done show that even finer-grained models (98 skills) are better not only at predicting our online data but also at predicting the students' test scores.
[7] Pardos, Z. A., Heffernan, N. T., Anderson, B. & Heffernan, C. (in press). Using Fine-Grained Skill Models to Fit Student Performance with Bayesian Networks. Workshop in Educational Data Mining held at the Eighth International Conference on Intelligent Tutoring Systems. Taiwan. 2006.
[8] Feng, M., Heffernan, N., Mani, M., & Heffernan, C. (in press). Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models. AAAI'06 Workshop on Educational Data Mining, Boston, 2006.

Large Scale: ASSISTment project
- ASSISTments are tagged with skills
