
Bias reduction in the estimation of Rasch models

David Firth1, d.firth@warwick.ac.uk
Ioannis Kosmidis2, i.kosmidis@ucl.ac.uk
Heather Turner1, ht@heatherturner.net

1Department of Statistics, University of Warwick
2Department of Statistical Science, University College London

Psychoco, 2012

Rasch Models Maximum likelihood estimation Bias reduction Parameterization Application Discussion References

Outline

1. Rasch Models
2. Maximum likelihood estimation
3. Bias reduction
4. Parameterization
5. Application
6. Discussion


Rasch models

Independent Bernoulli responses in a subject-item arrangement:
Yis is the outcome of the sth subject on the ith item;
πis = P(Yis = 1) is the probability that the sth subject succeeds on the ith item (i = 1, …, I; s = 1, …, S).


1PL model

The 1PL Rasch model (a special logistic regression model):

log[πis / (1 − πis)] = ηis = αi + γs   (i = 1, …, I; s = 1, …, S),

where αi and γs are unknown model parameters and ηis is the linear predictor for the 1PL model.
Parameter vector: θ = (α1, …, αI, γ1, …, γS)T.
Parameter interpretation:
αi (or −αi): a measure of the “ease” (or “difficulty”) of the ith item;
γs: the “ability” of the sth subject.
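The model is easy to simulate from, which is useful for checking estimation code later. A minimal sketch (parameter values are hypothetical, and NumPy stands in for the deck's R/gnm setting):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical item "ease" and subject "ability" parameters
alpha = np.array([2.0, 0.0, -2.0])        # I = 3 items
gamma = np.array([-1.0, 0.0, 0.5, 1.5])   # S = 4 subjects

# 1PL: logit(pi_is) = alpha_i + gamma_s
eta = alpha[:, None] + gamma[None, :]     # I x S matrix of linear predictors
pi = 1.0 / (1.0 + np.exp(-eta))           # success probabilities

# Independent Bernoulli responses Y_is
y = rng.binomial(1, pi)
```

Each row of `pi` is increasing in ability, and each column is increasing in item ease, as the additive predictor implies.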



2PL model

The 2PL Rasch model:

log[πis / (1 − πis)] = η̃is = αi + βiγs   (i = 1, …, I; s = 1, …, S),

where βi is a “discrimination” parameter for the ith item and η̃is is the predictor for the 2PL model.
Parameter vector: θ̃ = (α1, …, αI, β1, …, βI, γ1, …, γS)T.
The larger |βi| is, the steeper the Item-Response Function (IRF), the map from γs to πis.

[Figure: 2PL model, 5 subjects and 3 items — IRFs plotted against γ over (−10, 10), with the subject abilities γ1, …, γ5 marked. Item 1: α1 = 2, β1 = 8; Item 2: α2 = 0, β2 = 2; Item 3: α3 = −2, β3 = −1.]

[Figure: 1PL model, 5 subjects and 3 items — IRFs plotted against γ over (−10, 10), with γ1, …, γ5 marked. Item 1: α1 = 2; Item 2: α2 = 0; Item 3: α3 = −2.]
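The steepness claim can be made concrete: the slope of a 2PL IRF is βi πis(1 − πis), which is maximal at γ = −αi/βi where it equals βi/4. A small numeric check, using the item parameter values shown in the figure:

```python
import numpy as np

def irf(gamma, alpha, beta):
    """2PL item-response function: pi = sigmoid(alpha + beta * gamma)."""
    return 1.0 / (1.0 + np.exp(-(alpha + beta * gamma)))

# The three items from the 2PL figure
items = [(2.0, 8.0), (0.0, 2.0), (-2.0, -1.0)]   # (alpha_i, beta_i)

grid = np.linspace(-10, 10, 2001)
for alpha, beta in items:
    p = irf(grid, alpha, beta)
    # Numerical slope; the exact maximum slope of a 2PL IRF is |beta| / 4
    max_slope = np.abs(np.gradient(p, grid)).max()
    print(f"alpha={alpha:+.0f}, beta={beta:+.0f}: max |dpi/dgamma| ~ {max_slope:.3f}")
```

Item 1 (|β| = 8) is by far the steepest; Item 3 has a negative β, so its IRF decreases in γ.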

Maximum likelihood estimation - Advantages

→ ML estimation is straightforward using generic tools (e.g. gnm uses a quasi-Newton-Raphson iteration).
→ Generic inferential procedures (LR tests, likelihood-based confidence intervals).



Maximum likelihood estimation - Issues

Under useful asymptotic frameworks (e.g. information grows with the number of subjects or the number of items):
→ Full maximum likelihood generally delivers inconsistent estimates (Andersen, 1980, Chapter 6).
→ Loss of performance (e.g. coverage) of tests and confidence intervals.
→ (Partial) solutions: conditional likelihoods, integrated likelihoods, modified profile likelihoods → can be hard to apply for 2PL due to nonlinearity.


Maximum likelihood estimation - Issues

As with many models for binomial responses, there is positive probability of boundary ML estimates.
→ Numerical issues in estimation.
→ Problems with asymptotic inference (e.g. Wald-based inference).
→ Add small constants to the responses in the spirit of Haldane (1955)(?)
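The boundary problem and the Haldane-type fix are easiest to see for a single item's success count (hypothetical numbers): the empirical logit is infinite whenever y = 0 or y = m, while adding 1/2 to both successes and failures keeps every estimate finite:

```python
import numpy as np

# Success counts y out of m = 10 subjects for three hypothetical items
m = 10
y = np.array([10, 7, 0])   # items 1 and 3 sit on the boundary

# Raw empirical logit: log(y / (m - y)) is +/- infinity at the boundary
with np.errstate(divide="ignore"):
    raw = np.log(y / (m - y))

# Haldane-style adjustment: add 1/2 to successes and to failures
adj = np.log((y + 0.5) / (m - y + 0.5))

print(raw)   # infinite for the boundary items
print(adj)   # finite for every item
```

The same phenomenon in the Rasch setting sends some item or subject parameters to ±∞; the ad hoc constant adjustment is the analogue of the Haldane correction.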


Bias-reducing adjusted score functions

Firth (1993): an appropriate adjustment A(θ) to the score vector gives estimators with smaller asymptotic bias than ML:

∇θ l(θ) + A(θ) = 0.

Applicable to models where the information on the parameters increases with the number of observations (dim θ is independent of the number of observations).
→ Not the case for Rasch models under useful asymptotic frameworks.
→ But expect less-biased estimators than ML.


Bias-reducing adjusted score functions

→ In binomial/multinomial response GLMs, the reduced-bias estimates have been found to be always finite (Heinze and Schemper 2002; Bull et al. 2002; Zorn 2005; Kosmidis 2009).
→ Easy implementation:
  Iterative bias correction (Kosmidis and Firth 2010)
  Iterated ML fits on pseudo-data (Kosmidis and Firth 2011)



Adjusted score equations for 1PL

Adjusted score equations for 1PL (Firth 1993, logistic regressions):

0 = ∑_{i=1}^{I} ∑_{s=1}^{S} [ yis + his/2 − (1 + his) πis ] zist   (t = 1, …, I + S),

where zist = ∂ηis/∂θt is the (s, t)th element of the S × (I + S) matrix Zi; his is the sth diagonal element of Hi = Zi F^{−1} Zi^T Σi (the “hat value” for the (i, s)th observation); F = ∑_{i=1}^{I} Zi^T Σi Zi (the Fisher information); Σi = diag{vi1, …, viS}, with vis = var(Yis).
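As a concrete illustration of the adjusted score equations above in their simplest setting — a plain Bernoulli logistic regression, where the Zi, Σi and F collapse to an ordinary design matrix, weight vector and X'WX — here is a hedged sketch (hypothetical data, not the authors' implementation) solving them by a Newton-type iteration. The toy data are completely separated, so ordinary ML would diverge while the reduced-bias fit stays finite:

```python
import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-10):
    """Solve the adjusted score equations X' (y + h/2 - (1 + h) * pi) = 0
    (Firth 1993) for a logistic regression, by Fisher-scoring iteration."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ theta))
        v = pi * (1.0 - pi)                     # binomial variances
        F = X.T @ (X * v[:, None])              # Fisher information X' W X
        # Hat values: diagonal of W^{1/2} X F^{-1} X' W^{1/2}
        h = v * np.einsum("ij,jk,ik->i", X, np.linalg.inv(F), X)
        score = X.T @ (y + h / 2.0 - (1.0 + h) * pi)   # adjusted score
        step = np.linalg.solve(F, score)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# A tiny completely separated data set: intercept plus one covariate
X = np.column_stack([np.ones(6), [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
print(firth_logistic(X, y))   # finite estimates despite separation
```

For the 1PL model itself the same scheme applies with the item/subject dummy design (under an identifiability constraint) in place of X.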



Adjusted score equations for 2PL

Adjusted score equations for 2PL (Kosmidis and Firth 2009, GNMs):

0 = ∑_{i=1}^{I} ∑_{s=1}^{S} [ yis + h̃is/2 − (1 + h̃is) πis + cis vis ] z̃ist   (t = 1, …, 2I + S),

where z̃ist = ∂η̃is/∂θ̃t is the (s, t)th element of the S × (2I + S) matrix Z̃i; h̃is is the “hat value” for the (i, s)th observation, computed from F̃ = ∑_{i=1}^{I} Z̃i^T Σi Z̃i; Σi = diag{vi1, …, viS}, with vis = var(Yis) = πis(1 − πis); cis is the asymptotic covariance of the ML estimators of βi and γs (obtained from the components of F̃^{−1}).



Pseudo data

→ If h and h̃ did not depend on the parameters, then the reduced-bias estimator would formally be the ML estimator on binomial pseudo-data.

Model  Pseudo-data
1PL    Responses: y* = y + h/2                Totals: m* = 1 + h
2PL    Responses: ỹ* = y + h̃/2 + cπ(1 − π)    Totals: m̃* = 1 + h̃

For 2PL, algebraic manipulation of the adjusted scores gives an equivalent form that ensures 0 ≤ ỹ* ≤ m̃*:

Responses: ỹ* = y + h̃/2 + cπ1(c>0)    Totals: m̃* = 1 + h̃ + c(π − 1(c<0))

Here, 1E = 1 if E holds and 0 otherwise.


Iterated ML fits on pseudo data

The adjusted score equations can be solved as follows.

Iterated ML fits on pseudo data: at each iteration,
1. Update the values of the pseudo data.
2. Use ML to fit the Rasch model on the current value of the pseudo data.
Repeat until the changes to the estimates are small.

Ingredients: standard ML software, plus routines for extracting the hat values and the Fisher information.
→ gnm and the methods hatvalues and vcov for gnm objects can do this.


Iterated ML fits on pseudo data

tempFit: a gnm object in an identifiable parameterization; pseudoData: a function that evaluates the pseudo data at the supplied fit — y* and m* depend on the parameters only through the “working weights” πis(1 − πis).

## Rescale working weights
tempFit$weights <- with(tempFit, weights / prior.weights)
## Evaluate pseudo data
currentData <- pseudoData(tempFit)
## Fit model at the current pseudo data
tempFit <- update(tempFit, ys/ms ~ ., weights = ms,
                  data = currentData)
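For readers who want the iteration in a self-contained form outside R/gnm, here is a hedged Python/NumPy sketch of the same scheme for the 1PL case, applied to a tiny hypothetical 3-item, 4-subject data set (with γ1 = 0 as the identifiability constraint). Subject 4 succeeds on every item, so ordinary ML would send that ability estimate to +∞, while the iterated pseudo-data fits stay finite:

```python
import numpy as np

def ml_weighted_logistic(X, ystar, mstar, n_iter=100, tol=1e-10):
    """ML fit of a binomial logistic model with responses ystar out of
    totals mstar, via Fisher scoring (IRLS)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ theta))
        w = mstar * pi * (1.0 - pi)
        step = np.linalg.solve(X.T @ (X * w[:, None]),
                               X.T @ (ystar - mstar * pi))
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

def bias_reduced_fit(X, y, n_outer=200, tol=1e-9):
    """Iterated ML fits on pseudo-data y* = y + h/2, m* = 1 + h (1PL case)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_outer):
        pi = 1.0 / (1.0 + np.exp(-X @ theta))
        v = pi * (1.0 - pi)
        Finv = np.linalg.inv(X.T @ (X * v[:, None]))      # inverse Fisher info
        h = v * np.einsum("ij,jk,ik->i", X, Finv, X)      # hat values
        theta_new = ml_weighted_logistic(X, y + h / 2.0, 1.0 + h)
        if np.max(np.abs(theta_new - theta)) < tol:
            break
        theta = theta_new
    return theta_new

# 3 items x 4 subjects; each row of the design picks out alpha_i and gamma_s
I, S = 3, 4
y = np.array([1, 1, 0, 1,   0, 1, 1, 1,   0, 0, 0, 1], dtype=float)
X = np.zeros((I * S, I + S - 1))
for r, (i, s) in enumerate((i, s) for i in range(I) for s in range(S)):
    X[r, i] = 1.0                # item parameter alpha_i
    if s > 0:
        X[r, I + s - 1] = 1.0    # subject parameter gamma_s (gamma_1 = 0)

theta = bias_reduced_fit(X, y)
print(theta[:I], theta[I:])      # finite item and subject estimates
```

The helper names here are illustrative, not from gnm; the fixed point of the outer loop solves the adjusted score equations because, at convergence, the pseudo-data are evaluated at the same parameter value as the fit.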



Identifiability

1PL model:

log[πis / (1 − πis)] = ηis = αi + γs   (i = 1, …, I; s = 1, …, S).

Fix the location of the α’s or the location of the γ’s (only I + S − 1 parameters can be estimated).
The reduced-bias estimator is equivariant to ordinary contrasts (the bias is equivariant in the group of affine transformations).
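The need for a location constraint is easy to check numerically: shifting every αi by a constant c and every γs by −c leaves all the ηis, and hence the likelihood, unchanged (hypothetical parameter values):

```python
import numpy as np

alpha = np.array([2.0, 0.0, -2.0])     # item parameters (hypothetical)
gamma = np.array([-1.0, 0.5, 1.5])     # subject parameters (hypothetical)

def probs(alpha, gamma):
    """1PL success probabilities pi_is = sigmoid(alpha_i + gamma_s)."""
    return 1.0 / (1.0 + np.exp(-(alpha[:, None] + gamma[None, :])))

c = 3.7   # any constant shift
# Same fitted probabilities, hence the same likelihood for any data
print(np.allclose(probs(alpha + c, gamma - c), probs(alpha, gamma)))   # True
```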


Identifiability

2PL model:

log[πis / (1 − πis)] = η̃is = αi + βiγs   (i = 1, …, I; s = 1, …, S).

Fix the location of the α’s and the scale of the β’s, or the location and scale of the γ’s (only 2I + S − 2 parameters can be estimated).


Example: Scaling of legislators

Data: US House of Representatives, 20 roll calls selected by Americans for Democratic Action.
About 300 of 439 members voted on 10 or more of the 20 issues.
Available in gnm as the dataset House2001; data kindly supplied by Jan de Leeuw, used in de Leeuw (2006, CSDA).
The aim here is to place the members on a ‘liberality’ scale.
?House2001 in the gnm package uses an ad hoc (constant) data adjustment to achieve finite estimates for all 300 members.
The method proposed in this talk is rather more principled!

[Figure: estimated ‘liberality’ for each member, with intervals based on quasi standard errors; x-axis: members (1–296), y-axis: estimate.]



This is very much work in progress!
The method described here yields more sensible results than either MLE or constant data-adjustment.
Computationally convenient.
But it is still inconsistent (e.g., as the number of items increases).
The aim of current work is to fully generalize the penalization approach of Firth (1993) to situations like this, where the number of ‘nuisance’ parameters increases with n.

References

Bull, S. B., C. Mak, and C. Greenwood (2002). A modified score function estimator for multinomial logistic regression in small samples. Computational Statistics and Data Analysis 39, 57–74.
Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika 80(1), 27–38.
Firth, D. and R. X. de Menezes (2004). Quasi-variances. Biometrika 91(1), 65–80.
Haldane, J. (1955). The estimation and significance of the logarithm of a ratio of frequencies. Annals of Human Genetics 20, 309–311.
Heinze, G. and M. Schemper (2002). A solution to the problem of separation in logistic regression. Statistics in Medicine 21, 2409–2419.
Kosmidis, I. (2009). On iterative adjustment of responses for the reduction of bias in binary regression models. Technical Report 09-36, CRiSM Working Paper Series.
Kosmidis, I. and D. Firth (2009). Bias reduction in exponential family nonlinear models. Biometrika 96(4), 793–804.
Kosmidis, I. and D. Firth (2010). A generic algorithm for reducing bias in parametric estimation. Electronic Journal of Statistics 4, 1097–1112.
Kosmidis, I. and D. Firth (2011). Multinomial logit bias reduction via the Poisson log-linear model. Biometrika 98(3), 755–759.
Zorn, C. (2005). A solution to separation in binary response models. Political Analysis 13, 157–170.