Combing Item Response Theory and Diagnostic Classification Models: A - PDF document

Combing Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University of Georgia Author Note Correspondence concerning this article should be addressed to Laine Bradshaw, Department of Graduate Psychology, James Madison University, MSC 6806, 821 S. Main St., Harrisonburg, VA 22807. Email: laineb@uga.edu. This research was funded by National Science Foundation grants DRL-0822064; SES-0750859; and SES-1030337.

Abstract The Scaling Individuals and Classifying Misconceptions (SICM) model is presented as a combination of an item response theory (IRT) model and a diagnostic classification model (DCM). Common modeling and testing procedures utilize unidimensional IRT to provide an estimate of a student’s overall ability. Recent advances in psychometric s have focused on measuring multiple dimensions to provide more detailed feedback for students, teachers, and other stakeholders. DCMs provide multidimensional feedback by using multiple categorical variables that represent skills underlying a test that students may or may not have mastered. The SICM model combines an IRT model with a DCM that uses categorical variables that represent misconceptions instead of skills. In addition to the type of information common testing procedures provide about an examinee — an overall continuous ability, the SICM model also is able to provide multidimensional, diagnostic feedback in the form of statistical estimates of misconceptions. This additional feedback can be used by stakeholders to tailor instruction for students’ needs. Results of a simulation study demonstrate that the SICM MCMC estimation algorithm yields reasonably accurate estimates under large-scale testing conditions. Results of an empirical data analysis highlight the need to address statistical considerations of the model from the onset of the assessment development process.

Running head: Scaling Ability and Diagnosing Misconceptions 3 Combining Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions The need for more fine-grained feedback from assessments that can be used to understand students’ strengths and weakness has been emphasized at all levels of education. Educational policy (No Child Left Behind, 2001), modern curriculum standards (i.e., Common Core Standards; National Research Council, 2010), and classroom teachers (Huff & Goodman, 2007) have described multidimensional, diagnostic feedback as essential for tailoring instruction to students’ specific needs and making educational progress. In spite of this need, most state- level educational tests have been, and continue to be, designed from a unidimensional Item Response Theory (IRT; e.g., Hambleton, Swaminathan, & Rogers, 1991) perspective. This perspective optimizes the statistical estimation of a single continuous variable representing an overall ability and provides a single, composite score to describe a students’ performance with respect to an entire academic course. A common solution to providing ―diagnostic‖ feedback from IRT-designed state-level tests has been to report summed scores on sub-sections of a test. These subscores, because they are based on a small number of items, often lack of reliability. Decisions based on unreliable subscores counterproductively may misguide instructional strategies and resources (Wainer, Vevea, Camacho, Reeve, Rosa, Nelson, Swygert, & Thissen, 2001). These types of subscores are computed from items selected to be on the test due to their correlation with the other items on the test, so, as expected, they are highly related to the total score and thus do not provide distinct information from or additional information beyond the total score (Haberman, 2005; Harris & Hanson, 1991; Haberman, et al., 2009; Sinharay & Haberman, 2007).

Scaling Ability and Diagnosing Misconceptions 4 The new psychometric model presented in this paper presents a means to providing reliable multidimensional feedback within the framework of prevailing unidimensional IRT methods. The model capitalizes on advances regarding multidimensional measurement models which recently have been at the forefront of psychometric research because they promise detailed feedback for students, teachers, and other stake holders. Diagnostic classification models (DCMs; e.g., Rupp, Templin, & Henson, 2010) provide one approach to the measurement of multiple dimensions. DCMs use categorical latent attributes to represent skills or content components underlying a test that students may or may not have mastered. Using DCMs, the focus of the results of the assessment is shifted to identify which components each student has mastered. Attributes students have mastered can be viewed as areas in which students do not need further instruction. Similarly, attributes that students have not mastered indicate areas in which instruction or remediation should be focused. Thus, the attribute pattern can provide feedback to students and teachers with respect to more fine-grain components of a content area, which can be used to tailor instruction to students’ speci fic needs. DCMs sacrifice fine-grained measurement to provide multidimensional measurements. Instead of finely locating each examinee along a set of continuous traits as a multidimensional IRT model does, DCMs coarsely classify each examinee with respect to each trait (i.e., as masters or non-masters of the trait). This trade-off enables DCMs to provide diagnostic, multidimensional feedback with reasonable data demands (i.e., few number of items and examinee, Bradshaw & Cohen, 2010). DCMs’ low cost in terms of data demands and high benefits in terms of diagnostic information make them very attractive models in an educational setting where time for testing is limited but multidimensional feedback is needed to reflect the realities of the multifaceted nature of the objectives of educational courses. However, given the

Scaling Ability and Diagnosing Misconceptions 5 current reliance of testing on measuring an overall ability, DCMs, while efficient for providing detailed feedback, may not fulfill all the needs of policy-driven assessment systems centered on scaling examinee ability. In this paper, we propose a new nominal response psychometric model, the Scaling Individuals and Classifying Misconceptions (SICM) model, that blends the IRT and DCM frameworks. The SICM model alters traditional DCM practices by defining the attributes for a nominal response DCM as misconceptions that students have instead of as abilities (skills) that students have. The SICM model alters traditional nominal response IRT (NR IRT; Bock 1970) practices by having these categorical misconceptions predict the incorrect item response while a continuous ability predicts the correct response. When coupled together through the SICM model, the IRT and DCM components provide a more thorough description of the traits possessed by students. The SICM model both describes a measure of composite ability measured by the correctness of the responses and identifies distinct errors in understanding manifested through specific incorrect responses. The model therefore serves dual purposes of scaling examinee ability for comparative and accountability purposes and also diagnosing misconceptions for remediation purposes. The following section overviews the measurement of misconceptions through previous assessment development projects. The next section provides the statistical specifications of the SICM model and is followed by an illustration of the model through a contrast with two more familiar models. Then results of a simulation study are provided to establish the efficacy of the model and a real data analysis is given to illustrate the use of the model in practice. Measuring Misconceptions A key feature needed by the SICM model is that the incorrect alternatives for multiple

Combing Item Response Theory and Diagnostic Classification Models: A - PDF document

Combing Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University of Georgia Author Note

Mental Health Adult Pre-Charge Diversion Program Agenda Why Pre-Charge Diversion? Item 1 Item 1

Week 4 Video 4 Knowledge Inference: Item Response Theory Item Response Theory A classic

Diagnostic et Prise Prise en Charge des en Charge des Diagnostic et Echecs De Thrombolyse De

Manufacturing Diagnostic Tool Manufacturing Diagnostic Tool An on board on board low cost

Pathfinder: /| child::element person { item* } (iter, item1) /| child::element closed_auction {

What is Item Response Theory? Nick Shryane Social Statistics Discipline Area University of

DIAGNOSTIC CONFIRMATION C CODE NAACCR Data Item #490 Diagnostic Confirmation Code KENTUCKY

Diagnostic Error Human Expertise and Cognitive Biases Diagnostic Error A recent article by

On- -line diagnostic of a PEM fuel cell stack line diagnostic of a PEM fuel cell stack On based

PARCC Diagnostic Assessments for Mathematics Comprehension: A Diagnostic Classification Model

Rapid Response Jobs are Alaskas Future Rapid Response Rapid Response Rapid Response is a

Fast Item Response Theory (IRT) Analysis by using GPUs Lei Chen lei.chen@liulishuo.com Liulishuo

Cataloging Roundtable Item Status and Copy Buckets Item Status provides a wealth of information!

Planning & Development Committee Item Item 4. Old Spa Booking Office, West Cliff, Whitby

Student Response Systems Student Response Systems Student Response Systems Student Response

Advisory Committee Fund and Budget Proposal January 18, 2018 Agenda Item 8 1 Item 8

ASSESSING THE MEASUREMENT MODEL RELIABILITY AND VALIDITY USING SPSS/AMOS USING SPSS/AMOS

Lecture 9 MIT OpenCourseWare Dynamic Storage Allocation Stack allocation: LIFO (last-in-first-out)

Last In First Out (LIFO) Nunatsiavut Government Submission to Ministerial Advisory Committee

Conduct on Licensed Premises Amending Title 12, Chapter 244 of the Minneapolis Code of

BAYESIAN CHARACTERISATION OF NATURAL VARIATION IN GENE EXPRESSION Madhuchhanda Bhattacharjee

Head Pose Estimation Via Probabilistic High-Dimensional Regression Vincent Drouard 1 Sil` eye Ba 1

1 No duck breeds Distribution of double-cropping rice 2 3 4 Overlay Anatidae flyways

Practices Panel: Transformational Management October 5, 2018 Suzanne Hurley & Caroline

Combing Item Response Theory and Diagnostic Classification Models: A - PDF document

Combing Item Response Theory and Diagnostic Classification Models: A Psychometric Model for Scaling Ability and Diagnosing Misconceptions Laine P. Bradshaw James Madison University Jonathan Templin University of Georgia Author Note

Mental Health Adult Pre-Charge Diversion Program Agenda Why Pre-Charge Diversion? Item 1 Item 1

Week 4 Video 4 Knowledge Inference: Item Response Theory Item Response Theory A classic

Diagnostic et Prise Prise en Charge des en Charge des Diagnostic et Echecs De Thrombolyse De

Manufacturing Diagnostic Tool Manufacturing Diagnostic Tool An on board on board low cost

Pathfinder: /| child::element person { item* } (iter, item1) /| child::element closed_auction {

What is Item Response Theory? Nick Shryane Social Statistics Discipline Area University of

DIAGNOSTIC CONFIRMATION C CODE NAACCR Data Item #490 Diagnostic Confirmation Code KENTUCKY

Diagnostic Error Human Expertise and Cognitive Biases Diagnostic Error A recent article by

On- -line diagnostic of a PEM fuel cell stack line diagnostic of a PEM fuel cell stack On based

PARCC Diagnostic Assessments for Mathematics Comprehension: A Diagnostic Classification Model

Rapid Response Jobs are Alaskas Future Rapid Response Rapid Response Rapid Response is a

Fast Item Response Theory (IRT) Analysis by using GPUs Lei Chen lei.chen@liulishuo.com Liulishuo

Cataloging Roundtable Item Status and Copy Buckets Item Status provides a wealth of information!

Planning &amp; Development Committee Item Item 4. Old Spa Booking Office, West Cliff, Whitby

Student Response Systems Student Response Systems Student Response Systems Student Response

Advisory Committee Fund and Budget Proposal January 18, 2018 Agenda Item 8 1 Item 8

ASSESSING THE MEASUREMENT MODEL RELIABILITY AND VALIDITY USING SPSS/AMOS USING SPSS/AMOS

Lecture 9 MIT OpenCourseWare Dynamic Storage Allocation Stack allocation: LIFO (last-in-first-out)

Last In First Out (LIFO) Nunatsiavut Government Submission to Ministerial Advisory Committee

Conduct on Licensed Premises Amending Title 12, Chapter 244 of the Minneapolis Code of

BAYESIAN CHARACTERISATION OF NATURAL VARIATION IN GENE EXPRESSION Madhuchhanda Bhattacharjee

Head Pose Estimation Via Probabilistic High-Dimensional Regression Vincent Drouard 1 Sil` eye Ba 1

1 No duck breeds Distribution of double-cropping rice 2 3 4 Overlay Anatidae flyways

Practices Panel: Transformational Management October 5, 2018 Suzanne Hurley &amp; Caroline

Planning & Development Committee Item Item 4. Old Spa Booking Office, West Cliff, Whitby

Practices Panel: Transformational Management October 5, 2018 Suzanne Hurley & Caroline