Flexible Latent Trait Metrics An Application of the Filtered - PowerPoint PPT Presentation

Flexible Latent Trait Metrics: An Application of the Filtered Monotonic Polynomial Item Response Model. Leah Feuerstahler, University of California, Berkeley 1/78

Overview Premise: In many applications of item response theory (IRT), reported scores are nonlinear transformations of the IRT θ estimates. Goal: Develop an IRT framework such that θ is the continuous metric on which scores are reported. 2/78

Overview [Figure: two panels plotting response probability (0 to 1), one against θ (−6 to 6) and one against true score T (0 to 80).] 3/78

Overview Premise: In many applications of item response theory (IRT), reported scores are nonlinear transformations of the IRT θ estimates. Goal: Develop an IRT framework such that θ is the continuous metric on which scores are reported.
1 Why
• Why is the IRT θ metric often transformed?
• Why is an IRT for transformed metrics needed?
2 How
• Filtered monotonic polynomial (FMP) item response model
• Item parameter linking
3 Applications
• Functional metric transformations
• Estimated metric transformations
4 Considerations, Limitations, Future Directions
4/78

1 Why 2 How 3 Applications 4 Considerations, Limitations, Future Directions 5/78

Scaling "[T]he process of associating numbers or other ordered indicators with the performance of examinees" (Kolen & Brennan, 2014, p. 371). Scaled scores are often transformations of number-correct scores or IRT θ̂. What are the criteria for selecting a scale? Examples: 1 Facilitates appropriate interpretation by the public 2 Anchored to external indicators 3 Consistent with intuitions about how variables should behave 6/78

Scaling 1 Facilitates appropriate interpretation by the public
• Normalized scores for a representative sample
• z-scores (mean 0, sd 1)
• T-scores (mean 50, sd 10)
• Scores range from 0 to test length, or 0 to 100
• Domain scores (Bock, Thissen, & Zimowski, 1997)
• Optimal scores (Ramsay & Wiberg, 2017)
• Equated number-correct (Stocking, 1996)
• Constant measurement error
• ACT scores (arcsine transformation of number-correct; Kolen, 1988)
• Constant IRT information (Samejima, 1979)
7/78
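As a small illustration of the z-score and T-score conventions listed on this slide, the sketch below standardizes a set of raw scores linearly. The raw scores and the function names `z_scores` and `t_scores` are hypothetical, and this is plain linear standardization; it does not perform the percentile-based normalization that the first bullet refers to.

```python
import statistics

def z_scores(raw):
    """Linear standardization to mean 0, sd 1 (population sd)."""
    mean = statistics.mean(raw)
    sd = statistics.pstdev(raw)
    return [(x - mean) / sd for x in raw]

def t_scores(raw):
    """T-scores: a linear rescaling of z-scores to mean 50, sd 10."""
    return [50.0 + 10.0 * z for z in z_scores(raw)]

# Hypothetical number-correct scores, for illustration only.
raw_scores = [12, 15, 19, 22, 27, 31]
print([round(t, 1) for t in t_scores(raw_scores)])
```

Because both rescalings are linear, they change only the origin and unit of the scale, not its shape.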

Scaling 2 Anchored to external indicators
• Expected number-correct scores (Stocking, 1996)
• Grade-equivalent scores (Schulz & Nicewander, 1997)
• Equating with a different test form
• Linear relationship with other variables (intended use) (Nunnally, 1967, p. 28)
• Dollars are nonlinearly related to quality of life (Jones, 1971)
• Typed words per minute is nonlinearly related to practice/effort (Angoff, 1971, pp. 509-510)
8/78

Scaling 3 Consistent with beliefs about how variables should behave
• Normally distributed ability (Thurstone, 1925)
• Uncorrelated difficulty and discrimination parameters (Lord, 1975)
• Does variability of achievement increase or decrease with grade level?
• With Thurstonian scales, variability usually increases with grade level
• IRT scales often exhibit “scale shrinkage” (Camilli, 1988)
• “Armchair” theorizing can lead to conflicting answers (Yen, 1986, p. 312)
• Interval level measurement “in some sense” (Kolen & Brennan, 2014, p. 374)
9/78

Interval vs. Ordinal Stevens (1946): Nominal, Ordinal, Interval, Ratio. Scale type defined in terms of admissible operations.
Ordinal | Interval
Any monotonic transformation | Only linear transformations
Invariant ordering of observations | Meaningful intervals
Median, Percentiles | Mean, Standard deviation
Hardness of minerals | Temperature
10/78
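The ordinal/interval contrast can be checked numerically. The sketch below uses hypothetical scores and an arbitrary increasing transformation `g` (both are assumptions for illustration) to show that the median carries through a monotonic transformation while the mean does not.

```python
import math
import statistics

# Hypothetical scores (odd count, so the median is one of the observations).
scores = [-1.2, -0.4, 0.1, 0.9, 2.3]

def g(x):
    """An arbitrary strictly increasing, nonlinear transformation."""
    return math.exp(x)

transformed = [g(x) for x in scores]

# Median: transforming then summarizing equals summarizing then transforming.
median_commutes = statistics.median(transformed) == g(statistics.median(scores))

# Mean: the same swap generally fails under a nonlinear transformation.
mean_commutes = math.isclose(statistics.mean(transformed), g(statistics.mean(scores)))

print(median_commutes, mean_commutes)
```

This is why medians and percentiles remain meaningful on an ordinal scale, whereas means and standard deviations presuppose that only linear transformations are admissible.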

Interval vs. Ordinal Interval-level measurement is highly desirable for educational and psychological tests. What is actually MEANT by interval-level measurement?
• Only linear transformations are admissible given the IRT model
• The (Rasch) model fits
• Declaring that scores are equal-interval ‘in some sense’ (Kolen & Brennan, 2014, p. 374)
• Scores are linearly related to the underlying construct (Yen, 1986)
11/78

Where Does the IRT θ Come From? What do item response models assume? Simple case: Mokken’s (1971) monotone homogeneity model (MHM) assumes only 1 Unidimensionality 2 Local independence 3 Monotonicity If the MHM assumptions hold, individuals can be ordered uniquely. 12/78

Where Does the IRT θ Come From? Under the MHM assumptions, any monotonic function of the latent trait implies an equally admissible item response model (Lord, 1975). Suppose an IRT model with item response function (IRF) $P_i(\theta)$. For a continuous monotonic function $h$, where $\theta^{\star} = h^{-1}(\theta)$, another item response model exists such that $P_i(\theta) = P_i^{\star}[h^{-1}(\theta)] = P_i^{\star}(\theta^{\star})$. Any reason to prefer $\theta$ to $\theta^{\star}$? 13/78
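A minimal numerical sketch of this equivalence follows. The 2PL parameters and the transformation h(θ⋆) = θ⋆ + 0.1θ⋆³ are hypothetical choices made only for illustration. The transformed model is defined by composing the original IRF with h, so both models assign the same response probability to the same examinee.

```python
import math

def irf_2pl(theta, a=1.2, b=0.3):
    """2PL item response function on the original theta metric (hypothetical a, b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def h(theta_star):
    """Hypothetical strictly increasing map from theta_star to theta."""
    return theta_star + 0.1 * theta_star ** 3

def irf_star(theta_star):
    """IRF on the transformed metric: P*(theta*) = P(h(theta*))."""
    return irf_2pl(h(theta_star))

def h_inverse(theta, lo=-50.0, hi=50.0, tol=1e-12):
    """Invert h by bisection, which works because h is strictly increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for theta in (-2.0, 0.0, 1.5):
    theta_star = h_inverse(theta)
    # The two printed probabilities agree up to the bisection tolerance.
    print(theta, irf_2pl(theta), irf_star(theta_star))
```

The data cannot distinguish the two models; only an added restriction, such as a chosen IRF shape or trait distribution, singles out one metric.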

Where Does the IRT θ Come From? Under the MHM assumptions, an infinite number of IRT models can fit data equally well. Identification restrictions are needed in practice. Two main solutions:
1 Parametric IRT (PIRT)
• Specify the IRF shape
• (Usually) determines scale up to linear transformations
• Assumes that the chosen IRF shape(s) fits all scale items
2 Nonparametric IRT (NIRT)
• Specify the latent trait distribution (e.g., standard normal)
• Often conditions on (a monotonic transformation of) sum scores
• Nonparametrically estimates the IRF shape
14/78

Nonlinear Transformations of the IRT Metric What does not change? • Ordering of examinees • Percentile rankings • Relative efficiency of item response curves What does change? • Item and test information • Standard errors • Confidence intervals • Reliability 15/78

Item Information Metric transformations can have dramatic effects on information functions. Lord (1974, p. 353): $I_i(\theta^{\star}) = I_i(\theta) \left[ \frac{\partial h(\theta^{\star})}{\partial \theta^{\star}} \right]^2$, where $\theta = h(\theta^{\star})$. The trait level that maximizes $I_i(\theta)$ need not be the corresponding trait level that maximizes $I_i(\theta^{\star})$. 16/78
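The relation attributed to Lord (1974) can be checked numerically. In the sketch below, the 2PL item and the transformation h are hypothetical. Information on the θ⋆ metric is computed two ways: by scaling the θ-metric information with the squared derivative of h, and directly from the definition (dP⋆/dθ⋆)² / [P⋆(1 − P⋆)] for a dichotomous item, using a numerical derivative.

```python
import math

A, B = 1.2, 0.3  # hypothetical 2PL slope and location

def p(theta):
    """2PL response probability on the theta metric."""
    return 1.0 / (1.0 + math.exp(-A * (theta - B)))

def info_theta(theta):
    """2PL item information on the theta metric: a^2 * P * (1 - P)."""
    return A ** 2 * p(theta) * (1.0 - p(theta))

def h(theta_star):
    """Hypothetical increasing map, theta = h(theta_star)."""
    return theta_star + 0.1 * theta_star ** 3

def h_prime(theta_star):
    """Analytic derivative of h."""
    return 1.0 + 0.3 * theta_star ** 2

def info_theta_star(theta_star):
    """Information on the theta_star metric via I(theta) * (dh/dtheta_star)^2."""
    return info_theta(h(theta_star)) * h_prime(theta_star) ** 2

def info_direct(theta_star, eps=1e-6):
    """Information from the definition, with a central-difference derivative."""
    p_star = p(h(theta_star))
    slope = (p(h(theta_star + eps)) - p(h(theta_star - eps))) / (2.0 * eps)
    return slope ** 2 / (p_star * (1.0 - p_star))

for t in (-1.0, 0.0, 0.7, 2.0):
    print(t, info_theta_star(t), info_direct(t))
```

Because the factor h′(θ⋆)² varies with θ⋆, the peak of the information function generally sits at a different examinee location on the two metrics.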

Metric Transformations [Figure: panels A and B plot response probability (0 to 1) against θ and θ*, respectively, each over −3 to 3.] 17/78

Metric Transformations [Figure: panels C and D plot information (0 to 1) against θ and θ*, respectively, each over −3 to 3.] 18/78

Relative Efficiency The relative efficiency of two information functions does not change with metric transformations (Lord, 1974; 1980, p. 89): $\mathrm{RE} = \frac{I_1^{\star}(\theta_n^{\star})}{I_2^{\star}(\theta_n^{\star})} = \frac{I_1(\theta_n)}{I_2(\theta_n)}$. The relative information provided by each item is invariant to monotonic transformations of the latent trait. The maximally informative item for a trait level is invariant to metric transformations. 19/78
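The invariance can be seen in one line. The derivation below is a sketch that assumes a differentiable monotonic transformation with θ = h(θ⋆) and nonzero derivative, and uses the rule that information on the transformed metric equals information on the original metric multiplied by the squared derivative of h. That factor depends on the trait level but not on the item, so it cancels in the ratio.

```latex
\begin{align*}
\mathrm{RE}
  &= \frac{I_1^{\star}(\theta_n^{\star})}{I_2^{\star}(\theta_n^{\star})}
   = \frac{I_1(\theta_n)\,\bigl[\partial h(\theta_n^{\star}) / \partial \theta_n^{\star}\bigr]^2}
          {I_2(\theta_n)\,\bigl[\partial h(\theta_n^{\star}) / \partial \theta_n^{\star}\bigr]^2}
   = \frac{I_1(\theta_n)}{I_2(\theta_n)},
  \qquad \theta_n = h(\theta_n^{\star}).
\end{align*}
```

The same cancellation explains why the most informative item at a given trait level is the same on either metric, even though the absolute information values differ.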

Relative Efficiency [Figure: panels A and B plot response probability (0 to 1) against θ and θ*, respectively, each over −3 to 3.] 20/78

Relative Efficiency [Figure: panels C and D plot information (0 to 2) against θ and θ*, respectively, each over −3 to 3.] 21/78

Why Specify IRT on a Transformed Metric?
• Parsimony (avoid multi-step analyses)
• Many scale transformations (e.g., quadratic) do not enforce monotonicity
• Computerized adaptive testing (CAT)
• Many item selection and termination rules are metric-dependent
• CAT requires computationally efficient methods
• No need to repeatedly solve for transformed quantities
• Statistical properties (e.g., bias) of θ̂ can change with metric transformations (Yi et al., 2001)
• Appropriately account for measurement error when evaluating the relationship between the latent variable and external variables
22/78

Desiderata for a Flexible-Metric IRT • Continuous, invertible metric transformations • Flexible enough to express any continuous monotonic transformation • Model parameters that are readily portable to new contexts • Closed-form derivatives for computing information, standard errors, trait estimates • Reduction to commonly used IRT models (Rasch, 2PL, 3PL, etc.) 23/78

1 Why 2 How 3 Applications 4 Considerations, Limitations, Future Directions 24/78

Filtered Monotonic Polynomial IRT Proposed as a new NIRT model by Liang & Browne (2015). Based on the work of Elphinstone (1983, 1985). $P_i(\theta) = H[m_i(\theta)] = \{1 + \exp[-m_i(\theta)]\}^{-1}$, where $m_i(\theta) = b_{0i} + b_{1i}\theta + b_{2i}\theta^2 + \cdots + b_{2k_i+1,i}\theta^{2k_i+1}$ • $\mathbf{b}_i = (b_{0i}, b_{1i}, \ldots, b_{2k_i+1,i})'$: item parameters/polynomial coefficients • $k_i$: item complexity parameter; higher $k_i$ → greater flexibility • If $k_i = 0$, FMP reduces to 2PL (slope-intercept parameterization) 25/78
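The FMP response function on this slide can be evaluated in a few lines. The sketch below is an illustration, not the authors' implementation: the function name `fmp_irf` and the example coefficients are hypothetical, and the code takes the polynomial coefficients as given without checking that the polynomial is monotonic.

```python
import math

def fmp_irf(theta, b):
    """Logistic function of an odd-degree polynomial m(theta).

    b = [b0, b1, ..., b_{2k+1}] are the polynomial coefficients, so a valid
    coefficient vector has 2k + 2 entries (an even count).
    Monotonicity of m(theta) is assumed, not verified.
    """
    if len(b) < 2 or len(b) % 2 != 0:
        raise ValueError("expected 2k + 2 coefficients for an odd-degree polynomial")
    # Horner's rule: evaluate b0 + b1*theta + ... + b_{2k+1}*theta^(2k+1).
    m = 0.0
    for coef in reversed(b):
        m = m * theta + coef
    return 1.0 / (1.0 + math.exp(-m))

# k = 0: two coefficients, the slope-intercept 2PL.
print(fmp_irf(0.5, [-0.4, 1.3]))

# k = 1: cubic polynomial m(theta) = theta + 0.2 * theta^3 (hypothetical values;
# its derivative 1 + 0.6 * theta^2 is positive, so m is increasing).
print([fmp_irf(t, [0.0, 1.0, 0.0, 0.2]) for t in (-2.0, 0.0, 2.0)])
```

With two coefficients the function is exactly the 2PL in slope-intercept form, which matches the reduction stated for k_i = 0.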

Filtered Monotonic Polynomial IRT With high enough $k_i$, FMP can closely approximate any IRF that meets the MHM assumptions. Closeness of approximation can be characterized by the root integrated mean squared error (RIMSE; Ramsay, 1991): $\mathrm{RIMSE}_i = \sqrt{\int [\hat{P}_i(\theta) - P_i(\theta)]^2 \, g(\theta) \, d\theta}$, where $g(\theta)$ is the standard normal density. 26/78
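RIMSE can be approximated by numerical integration. The sketch below uses a trapezoid rule on a grid from −6 to 6. The integration limits, the grid size, and the two example curves (a four-parameter logistic as the "true" IRF and an unfitted 2PL as the approximation) are hypothetical choices for illustration; nothing here estimates FMP coefficients.

```python
import math

def normal_pdf(x):
    """Standard normal density g(theta)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rimse(p_hat, p_true, lo=-6.0, hi=6.0, n=2001):
    """Trapezoid approximation of sqrt( integral (p_hat - p_true)^2 g(theta) dtheta )."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        theta = lo + j * step
        weight = 0.5 if j in (0, n - 1) else 1.0  # half weight at the endpoints
        total += weight * (p_hat(theta) - p_true(theta)) ** 2 * normal_pdf(theta)
    return math.sqrt(total * step)

def true_irf(theta):
    """Hypothetical four-parameter logistic IRF (lower 0.1, upper 0.9)."""
    return 0.1 + 0.8 / (1.0 + math.exp(-1.5 * theta))

def approx_irf(theta):
    """Hypothetical 2PL curve standing in for an approximation (not a fitted model)."""
    return 1.0 / (1.0 + math.exp(-1.0 * theta))

print(rimse(approx_irf, true_irf))
```

Weighting by the standard normal density means that misfit near the center of the trait distribution counts for more than misfit in the tails.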

Example FMP Approximations $\mathrm{RIMSE}_i = \{.034, .034, .004\}$ for $k_i = \{0, 1, 2\}$ [Figure, panel A, "Four-Parameter Model": response probability (0 to 1) against θ (−3 to 3), showing the true IRF and FMP approximations with k_i = 0, 1, 2.] 27/78
