Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics - PowerPoint PPT Presentation

Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics University of Queensland 1 / 20

Polytomous Response Polytomous response: a response taking one of K > 2 fixed values (response categories). Main types • Ordinal scales: the categories are ordered. e.g. first, second, ... • Interval scales: the categories are ordered with scores attached to categories. e.g. height groups • Nominal scales: no structure at all. e.g. red, green, blue 2 / 20

This Lecture • Modelling ordinal scales • Modelling nominal scales 3 / 20

Models for Ordinal Scales Reduction to binary problems • Assume the categories are 1 , 2 , . . . , K . • Model each cumulative distribution p j ( x ) = P ( Y ≤ j | x ) by a logistic regression model p j ( x ) = logistic ( x ⊤ β j ) . • Equivalently, logit ( p j ( x )) = x ⊤ β j . • This may not guarantee that p j ( x ) ≥ p i ( x ) for j ≥ i . 4 / 20

Proportional odds model • If we further assume that β 1 = . . . = β K − 1 = β , we get the proportional odds model logit ( p j ( x )) = θ j + x ⊤ β . • The model need to satisfy θ 1 ≤ θ 2 ≤ . . . ≤ θ K − 1 . 5 / 20

• If we move from x to x ′ , we have p j ( x ′ ) / (1 − p j ( x ′ ) p j ( x ) / (1 − p j ( x )) = exp( β ⊤ ( x ′ − x )) . • That is, the odds changes by a factor of exp( β ⊤ ( x ′ − x )) independent of the class. • A unit increase in x i changes the odds by a factor of exp( β i ). 6 / 20

Proportional hazards model • We can use other link functions in proportional odds model. • Proportional odds model uses cloglog instead of logit link cloglog ( p j ( x )) = θ j + x ⊤ β . 7 / 20

Example Data Number of pneumonia cases and exposure time to a certain bacteria. exposure.time normal mild severe 5.8 98 0 0 15.0 51 2 1 21.5 34 6 3 27.5 35 5 8 33.5 32 10 9 39.5 23 7 8 46.0 12 6 10 51.5 4 2 5 8 / 20

Fit a proportional odds model > library(VGAM) > # the pneumo dataset is part of the VGAM library > fit.pom = vglm(cbind(normal, mild, severe) ~ log(exposure.time), data=pneumo, cumulative(parallel=T, link= ' logit ' )) 1 = normal, 2 = mild, 3 = severe 9 / 20

Inspect the proportional odds model > summary(fit.pom) Pearson residuals: Min 1Q Median 3Q Max logit(P[Y<=1]) -1.248 -0.07164 0.1441 0.3086 0.7714 logit(P[Y<=2]) -1.044 -0.18415 0.3093 0.3353 0.5048 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept):1 9.6761 1.3241 7.308 2.72e-13 *** (Intercept):2 10.5817 1.3454 7.865 3.69e-15 *** log(exposure.time) -2.5968 0.3811 -6.814 9.50e-12 *** The fitted models are logitP ( Y ≤ 1 | x ) = 9 . 6761 − 2 . 5968 log( exposure . time ) logitP ( Y ≤ 2 | x ) = 10 . 5817 − 2 . 5968 log( exposure . time ) 10 / 20

Warning: Hauck-Donner effect detected in the following estimate(s): ' (Intercept):1 ' Exponentiated coefficients: log(exposure.time) 0.07451115 • Hauck-Donner effect: Wald’s test of significance is misleading. This often happens when the data is separable (for this data set, the log exposure time can be used to perfectly predict whether Y ≤ 1). • Increasing log exposure time by one unit changes all the odds by a factor of 0.07451115. 11 / 20

Fit a proportional hazards model > fit.phm = vglm(cbind(normal, mild, severe) ~ log(exposure.time), data=pneumo, cumulative(parallel=T, link= ' cloglog ' )) 12 / 20

Inspect the proportional hazards model > summary(fit.phm) Pearson residuals: Min 1Q Median 3Q Max cloglog(P[Y<=1]) -0.6916 -0.4561 -0.04129 0.4381 0.5379 cloglog(P[Y<=2]) -1.0037 -0.3217 -0.03335 0.2006 0.7345 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept):1 4.3457 0.6299 6.900 5.22e-12 *** (Intercept):2 4.8283 0.6417 7.524 5.31e-14 *** log(exposure.time) -1.2407 0.1897 -6.540 6.15e-11 *** The fitted models are cloglogP ( Y ≤ 1 | x ) = 4 . 3457 − 1 . 2407 log( exposure . time ) cloglogP ( Y ≤ 2 | x ) = 4 . 8283 − 1 . 2407 log( exposure . time ) 13 / 20

Fitted probabilities and observed probabilities 1.0 data 0.8 logit P(<=normal) cloglog 0.6 0.4 0.2 0.0 2.0 2.5 3.0 3.5 4.0 log(exposure.time) 1.0 data 0.8 logit cloglog P(<=mild) 0.6 0.4 0.2 0.0 2.0 2.5 3.0 3.5 4.0 log(exposure.time) 14 / 20

Models for Nominal scales Multi-class logistic regression • Recall: in binary logistic regression, ln p ( Y = 1 | x , β ) p ( Y = 0 | x , β ) = β ⊤ x , That is, the log odds is linear. 15 / 20

• When the classes are 1 , . . . , K , we want ln p ( Y = i | x , β ) p ( Y = 1 | x , β ) = β ⊤ i x . • This implies exp( β ⊤ i x ) p ( Y = i | x , β 1: K ) = j x ) , j exp( β ⊤ ∑︁ where β 1 = 0 , and β 1: K denotes β 1 , . . . , β K . • Also known as multinomial logistic regression. 16 / 20

Decision boundary • Given x , we predict its class as arg max y p ( y | x , β ). • The set of x in class y is the convex polytope described by the contraints β ⊤ y x ≥ β ⊤ 1 x , . . . β ⊤ y x ≥ β ⊤ K x . • The boundary between different classes must be linear. 17 / 20

Linearly separable data • When the data is linearly separable, MLE diverges (it fails for simple data)! • There are various ways to fix this problem (for example, through regularization, or using objective functions which search for hyperplanes which are optimal in some sense, like support vector machines). Linearly separable data with K classes: there are K vectors β 1 , . . . , β K such that x is in class y iff β ⊤ y x ≥ β ⊤ i x for all i. 18 / 20

Multi-class logistic regression in R > fit.mlr <- vglm(Species ~ ., multinomial, iris) > # compute fitted probabilities > predict(fit.mlr, type= ' response ' ) > # compute probabilities on new data > predict(fit.mlr, newdata=iris, type= ' response ' ) 19 / 20

What You Need to Know • Modelling ordinal scales proportional odds model, proportional hazards model • Modelling nominal scales multiclass logistic regression • Working with polytomous response data in R 20 / 20

Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics - PowerPoint PPT Presentation

Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics University of Queensland 1 / 20 Polytomous Response Polytomous response: a response taking one of K > 2 fixed values (response categories). Main types Ordinal scales:

Data Cleaning Nan Tang, QCRI Big Data Cleaning Nan Tang, QCRI Big Data Cleaning Nan Tang,

Development of Taiwans Bunun Tribe Nan-An Tribe Natural Environment in Nan-An Location

CNBC Matlab Mini-Course Inf and NaN 3/0 returns Inf 0/0 returns NaN David S. Touretzky

A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses Silvia

Maron & Ibben Dron Nan Kojbarok Aurok In Dren Maron & Ibben Dron Nan Kojbarok Aurok In Dren

Repairing Four-Atom Conjecture Ting-Ting Nan Advisor: Nigel Boston SP Coding and Information

Nanosonics Limited (ASX:NAN) Investor Presentation 7 December 2009 Disclaimer Summary

Canadian Human Rights Tribunal Proceeding Update Nishnawbe Aski Nation (NAN) Chiefs-in-Assembly

Proposed Acquisition of Nan Wah Building 13 May 2004 Agenda The Property Impact on A-REIT

Th The e new ew tre rends ds of f nan anoma omater terials ials ap appl plica

Inter-Island PSO Air Services Review of Services and Procurement Options COMHAIRLE NAN EILEAN

Privacy Preserving Distributed ID3 Algorithm Nan Meng University of Hong Kong

The Best-of-Three Voting on Dense Graphs Nan Kang 1 as Rivera 2 Nicol 1 Department of

Dependable Data Repairing with Fixing Rules Jiannan Wang Nan Tang 1 Data is Dirty 2

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

Lecture 10. Modeling Process and Model Diagnostics Nan Ye School of Mathematics and Physics

High Resolution Site Characterization Pragmatic Approaches to Remediation Success: Case

Architectural Space and Form evolution traced to developments in structural engineering and

TACOM ILSC FY20 MDEX COMBAT SUPPORT AND COMBAT SERVICE SUPPORT READINESS AND SUSTAINMENT

DESY testbeam Fred Hartjes NIKHEF Nikhef/Bonn LepCol meeting September 23, 2019 #

Disclosures I have nothing to disclose Updates In CAP/HAP/VAP Rachel Bystritsky, MD University

TRAINING IN THE ILO CLASSIFICATION OF RADIOGRAPHS FOR THE PNEUMOCONIOSIS IN BRAZIL Eduardo

Ensuring patient safety is on our health data agenda A/Prof. Farah Magrabi Leader, Patient

Community-Acquired Pneumonia acquired pneumonia as: a. An ailment that often leads to

Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics - PowerPoint PPT Presentation

Lecture 7. Polytomous Data Nan Ye School of Mathematics and Physics University of Queensland 1 / 20 Polytomous Response Polytomous response: a response taking one of K > 2 fixed values (response categories). Main types Ordinal scales:

Data Cleaning Nan Tang, QCRI Big Data Cleaning Nan Tang, QCRI Big Data Cleaning Nan Tang,

Development of Taiwans Bunun Tribe Nan-An Tribe Natural Environment in Nan-An Location

CNBC Matlab Mini-Course Inf and NaN 3/0 returns Inf 0/0 returns NaN David S. Touretzky

A class of Multidimensional Latent Class IRT models for ordinal polytomous item responses Silvia

Maron &amp; Ibben Dron Nan Kojbarok Aurok In Dren Maron &amp; Ibben Dron Nan Kojbarok Aurok In Dren

Repairing Four-Atom Conjecture Ting-Ting Nan Advisor: Nigel Boston SP Coding and Information

Nanosonics Limited (ASX:NAN) Investor Presentation 7 December 2009 Disclaimer Summary

Canadian Human Rights Tribunal Proceeding Update Nishnawbe Aski Nation (NAN) Chiefs-in-Assembly

Proposed Acquisition of Nan Wah Building 13 May 2004 Agenda The Property Impact on A-REIT

Th The e new ew tre rends ds of f nan anoma omater terials ials ap appl plica

Inter-Island PSO Air Services Review of Services and Procurement Options COMHAIRLE NAN EILEAN

Privacy Preserving Distributed ID3 Algorithm Nan Meng University of Hong Kong

The Best-of-Three Voting on Dense Graphs Nan Kang 1 as Rivera 2 Nicol 1 Department of

Dependable Data Repairing with Fixing Rules Jiannan Wang Nan Tang 1 Data is Dirty 2

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

Lecture 10. Modeling Process and Model Diagnostics Nan Ye School of Mathematics and Physics

High Resolution Site Characterization Pragmatic Approaches to Remediation Success: Case

Architectural Space and Form evolution traced to developments in structural engineering and

TACOM ILSC FY20 MDEX COMBAT SUPPORT AND COMBAT SERVICE SUPPORT READINESS AND SUSTAINMENT

DESY testbeam Fred Hartjes NIKHEF Nikhef/Bonn LepCol meeting September 23, 2019 #

Disclosures I have nothing to disclose Updates In CAP/HAP/VAP Rachel Bystritsky, MD University

TRAINING IN THE ILO CLASSIFICATION OF RADIOGRAPHS FOR THE PNEUMOCONIOSIS IN BRAZIL Eduardo

Ensuring patient safety is on our health data agenda A/Prof. Farah Magrabi Leader, Patient

Community-Acquired Pneumonia acquired pneumonia as: a. An ailment that often leads to

Maron & Ibben Dron Nan Kojbarok Aurok In Dren Maron & Ibben Dron Nan Kojbarok Aurok In Dren