STAT 215 Multiple Logistic Regression, Colin Reimer Dawson, Oberlin (PowerPoint PPT Presentation)
SLIDE 1

STAT 215 Multiple Logistic Regression

Colin Reimer Dawson

Oberlin College

November 16, 2017

SLIDE 2

Outline

  • Multiple Predictors
  • Nested Model Tests
  • Model Selection

SLIDE 3

Logistic Regression With Multiple Predictors

We are combining logistic regression (Ch. 9) with multiple regression (Chs. 3-4). Nothing fundamentally new: all of the “usual” options for predictors are available:

  • Quantitative variables
  • Powers of variables (e.g., second-order models)
  • Other transformations of variables (e.g., log)
  • Interactions (products) of variables
  • Indicator variables for binary predictors
  • Collections of k − 1 indicators for categorical predictors with k levels
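As a sketch of the last option: R builds the k − 1 indicator columns automatically when a factor appears in a model formula. A minimal illustration with a toy factor (not from the slides):

```r
# Sketch: R creates k - 1 indicator columns for a factor with k levels
# (toy data, for illustration only)
g <- factor(c("a", "b", "c", "a", "b"))  # k = 3 levels
X <- model.matrix(~ g)                   # intercept plus k - 1 = 2 indicators
ncol(X)                                  # 3 columns: (Intercept), gb, gc
colnames(X)
```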

SLIDE 4

Two Equivalent Forms of (Multiple) Logistic Regression

Probability Form:

    π = e^(β0 + β1X1 + ··· + βkXk) / (1 + e^(β0 + β1X1 + ··· + βkXk))

Logit Form:

    log(π / (1 − π)) = β0 + β1X1 + ··· + βkXk
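A quick numerical check that the two forms agree, using hypothetical coefficients chosen only for illustration (not fitted to any data):

```r
# Sketch: the probability form is the inverse of the logit form
# (hypothetical coefficients b0, b1 and value x; for illustration only)
b0 <- -2; b1 <- 0.05; x <- 40
logit <- b0 + b1 * x                     # logit form: log(pi / (1 - pi))
pi.hat <- exp(logit) / (1 + exp(logit))  # probability form
pi.hat                                   # 0.5 here, since the logit is 0
```

Base R's `plogis()` computes the same inverse-logit transformation.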

SLIDE 5

Example: Survival in ICU

  • Response: Survive (1 = Lived, 0 = Died)

  • Predictors:
  • Age
  • SysBP (Systolic Blood Pressure)
  • Pulse


SLIDE 6

Simple Logistic Models

library("Stat2Data"); data("ICU")
m1 <- glm(Survive ~ Age, family = "binomial", data = ICU)
plotModel(m1)

[Plot: fitted logistic curve of Survive vs. Age]
SLIDE 7

Simple Logistic Models

m2 <- glm(Survive ~ SysBP, family = "binomial", data = ICU)
plotModel(m2)

[Plot: fitted logistic curve of Survive vs. SysBP]
SLIDE 8

Simple Logistic Models

m3 <- glm(Survive ~ Pulse, family = "binomial", data = ICU)
plotModel(m3)

[Plot: fitted logistic curve of Survive vs. Pulse]
SLIDE 9

Simple Logistic Models

m3 <- glm(Survive ~ Pulse + I(Pulse^2), family = "binomial", data = ICU)
plotModel(m3)

[Plot: fitted curve of Survive vs. Pulse with a quadratic term in the logit]
SLIDE 10

Multiple Predictor Model

full.model <- glm(Survive ~ Age + SysBP, family = "binomial", data = ICU)
summary(full.model)$coefficients %>% round(digits = 3)

            Estimate Std. Error z value Pr(>|z|)
(Intercept)    0.962      1.000   0.962    0.336
Age           -0.028      0.011  -2.637    0.008
SysBP          0.017      0.006   2.873    0.004

How to interpret tests of individual coefficients? Just as in linear regression: is the predictor adding something over the others?

SLIDE 11

Checking For Multicollinearity

The same issues with multicollinearity can arise!

dplyr::select(ICU, Age, SysBP, Pulse) %>% cor() %>% round(digits = 2)

       Age SysBP Pulse
Age   1.00  0.04  0.04
SysBP 0.04  1.00 -0.06
Pulse 0.04 -0.06  1.00

vif(full.model)

     Age    SysBP
1.001818 1.001818

But no worries in this case.

SLIDE 12

Overall and Nested LR Tests

pulse.quad.model <- glm(Survive ~ Age + SysBP + Pulse + I(Pulse^2),
                        family = "binomial", data = ICU)
no.pulse.model <- glm(Survive ~ Age + SysBP, family = "binomial", data = ICU)
anova(no.pulse.model, pulse.quad.model, test = "LRT")

Analysis of Deviance Table

Model 1: Survive ~ Age + SysBP
Model 2: Survive ~ Age + SysBP + Pulse + I(Pulse^2)
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1       197     183.25
2       195     182.57  2  0.68431   0.7102

Test statistic: G = −2(log P(Data | Reduced) − log P(Data | Full))
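The statistic G can also be computed directly as the drop in residual deviance between the nested models. A sketch on simulated data (simulated so it runs without the Stat2Data package; not the ICU data from the slides):

```r
# Sketch: G is the drop in residual deviance between nested logistic models
# (simulated data, for illustration only)
set.seed(1)
n <- 200
x1 <- rnorm(n); x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 * x1))        # x2 is an irrelevant predictor
reduced <- glm(y ~ x1, family = binomial)
full    <- glm(y ~ x1 + x2, family = binomial)
G <- deviance(reduced) - deviance(full)    # = -2 * (logLik reduced - logLik full)
pchisq(G, df = 1, lower.tail = FALSE)      # same p-value anova(..., test = "LRT") reports
```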

SLIDE 13

Overall and Nested LR Tests

xpchisq(0.68431, df = 2, lower.tail = FALSE)

[1] 0.7102381

[Plot: chi-square density with df = 2, upper tail beyond 0.684 shaded; areas ≈ .29 and .71]

SLIDE 14

One vs. Two Curves

Is Sex an important predictor, controlling for BP?

full.model <- glm(Survive ~ SysBP + factor(Sex) + SysBP:factor(Sex),
                  family = 'binomial', data = ICU)
summary(full.model)$coefficients

                      Estimate  Std. Error   z value    Pr(>|z|)
(Intercept)        -1.43930431 1.021041657 -1.409643 0.158645099
SysBP               0.02299392 0.008325432  2.761889 0.005746799
factor(Sex)1        1.45516591 1.525558283  0.953858 0.340155546
SysBP:factor(Sex)1 -0.01301957 0.011964883 -1.088148 0.276529569

reduced.model <- glm(Survive ~ SysBP, family = 'binomial', data = ICU)
anova(reduced.model, full.model, test = "LRT")

Analysis of Deviance Table

Model 1: Survive ~ SysBP
Model 2: Survive ~ SysBP + factor(Sex) + SysBP:factor(Sex)
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1       198     191.34
2       196     189.99  2   1.3421   0.5112

SLIDE 15

One vs. Two Curves

plotModel(full.model)

[Plot: fitted curves of Survive vs. SysBP, one per Sex]

  • Curves are not significantly different

SLIDE 16

One vs. Two Curves

f.hat.full <- makeFun(full.model)
f.hat.reduced <- makeFun(reduced.model)
xyplot(Survive ~ SysBP, data = ICU, groups = factor(Sex))
plotFun(f.hat.full(SysBP, Sex) ~ SysBP, Sex = 0, xlim = c(0, 300), col = 1, add = TRUE)
plotFun(f.hat.full(SysBP, Sex) ~ SysBP, Sex = 1, add = TRUE, col = 2)
plotFun(f.hat.reduced(SysBP) ~ SysBP, add = TRUE, lty = 2)

[Plot: data and fitted curves by Sex, with the pooled one-curve fit dashed]

  • Curves are not significantly different

SLIDE 17

Parallel vs. Non-Parallel logit lines

full.model <- glm(Survive ~ SysBP + factor(Infection) + SysBP:factor(Infection),
                  family = 'binomial', data = ICU)
summary(full.model)$coefficients %>% round(digits = 3)

                         Estimate Std. Error z value Pr(>|z|)
(Intercept)                 1.123      1.195   0.940    0.347
SysBP                       0.005      0.009   0.601    0.548
factor(Infection)1         -2.934      1.589  -1.846    0.065
SysBP:factor(Infection)1    0.018      0.012   1.436    0.151

reduced.model <- glm(Survive ~ SysBP + factor(Infection), family = 'binomial', data = ICU)

SLIDE 18

One vs. Two Curves

plotModel(full.model)

[Plot: fitted curves of Survive vs. SysBP, one per Infection status]

  • Curves do not have significantly different “slopes”

SLIDE 19

Parallel vs. Non-parallel logit lines

[Plot: fitted curves of Survive vs. SysBP by Infection from the no-interaction (parallel-logit) model]

  • Lines are not significantly non-parallel

SLIDE 20

Model Selection Criteria

The usual metrics no longer apply:

  • adj. R²
  • Mallow’s Cp

Instead:

  • Akaike Information Criterion (AIC): Deviance + 2p (lower is better)
  • (Hard or Soft) Prediction Error (only evaluate out-of-sample)
  • Hard: how many cases did the model yield π̂ on the wrong side of 1/2?
  • Soft: sum of the absolute differences between the π̂i and the Yi
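For 0/1 responses the saturated log-likelihood is zero, so the residual deviance equals −2 logLik and R's AIC() matches Deviance + 2p exactly. A sketch on simulated data (not the ICU data from the slides):

```r
# Sketch: AIC = Deviance + 2p for a Bernoulli logistic model
# (simulated data; p counts all coefficients, including the intercept)
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))
m <- glm(y ~ x, family = binomial)
p <- length(coef(m))
AIC(m) - (deviance(m) + 2 * p)          # 0, up to floating-point rounding
# Hard prediction error: share of cases with pi-hat on the wrong side of 1/2
# (computed in-sample here; the slides say to evaluate out-of-sample)
mean((fitted(m) > 0.5) != (y == 1))
```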

SLIDE 21

Stepwise Regression and Cross-Validation Demo
