SLIDE 1

An Introduction to Logistic Regression

Emily Hector

University of Michigan

June 19, 2019

1 / 39

SLIDE 2

Modeling Data

- Types of outcomes: continuous, binary, counts, ...
- Dependence structure of outcomes: independent observations; correlated observations, repeated measures
- Number of covariates, potential confounders: controlling for confounders that could lead to spurious results
- Sample size

These factors will determine the appropriate statistical model to use.

SLIDE 3

What is logistic regression?

- Linear regression is the type of regression we use for a continuous, normally distributed response variable.
- Logistic regression is the type of regression we use for a binary response variable that follows a Bernoulli distribution.

Let us review:

- Bernoulli distribution
- Linear regression

SLIDE 4

Review of Bernoulli Distribution

- Y ∼ Bernoulli(p) takes values in {0, 1}, e.g. a coin toss.
- Y = 1 for a success, Y = 0 for a failure.
- p = probability of success, i.e. p = P(Y = 1); e.g. p = 1/2 = P(heads).
- Mean is p, variance is p(1 − p).

Bernoulli probability mass function (pmf):

f(y; p) = 1 − p for y = 0, p for y = 1, i.e. f(y; p) = p^y (1 − p)^(1−y), y ∈ {0, 1}.

SLIDE 5

Review of Linear Regression

- When do we use linear regression?
  1. Linear relationship between outcome and covariate
  2. Independence of outcomes
  3. Constant variance and normally distributed errors (homoscedasticity)

Model: Y_i = β0 + β1 X_i + ε_i, ε_i ∼ N(0, σ²). Then E(Y_i | X_i) = β0 + β1 X_i, Var(Y_i) = σ².

[Figure: scatterplot of Y against X with fitted regression line]

- How can this model break down?

SLIDE 6

Modeling binary outcomes with linear regression

Fitting a linear regression model on a binary outcome Y:

- Y_i | X_i ∼ Bernoulli(p_Xi),
- E(Y_i) = β0 + β1 X_i = p̂_Xi.

Problems?

- Linear relationship between X and Y?
- Normally distributed errors?
- Constant variance of Y?
- Is p̂ guaranteed to be in [0, 1]?

[Figure: binary Y (0/1) plotted against X with a fitted straight line]

SLIDE 7

Why can't we use linear regression for binary outcomes?

- The relationship between X and Y is not linear.
- The response Y is not normally distributed.
- The variance of a Bernoulli random variable depends on its expected value p_X.
- Fitted values of Y may fall outside [0, 1], since linear models produce fitted values in (−∞, +∞).

SLIDE 8

A regression model for binary data

- Instead of modeling Y directly, model P(Y = 1 | X), i.e. the probability that Y = 1 conditional on covariates.
- Use a function that constrains probabilities between 0 and 1.

SLIDE 9

Logistic regression model

- Let Y be a binary outcome and X a covariate/predictor.
- We are interested in modeling p_x = P(Y = 1 | X = x), i.e. the probability of a success for the covariate value X = x.

Define the logistic regression model as

logit(p_X) = log[p_X / (1 − p_X)] = β0 + β1 X

- log[p_X / (1 − p_X)] is called the logit function.
- p_X = e^(β0 + β1 X) / [1 + e^(β0 + β1 X)].
- lim_{x→−∞} e^x / (1 + e^x) = 0 and lim_{x→∞} e^x / (1 + e^x) = 1, so 0 ≤ p_x ≤ 1.

SLIDE 10

Likelihood equations for logistic regression

- Assume Y_i | X_i ∼ Bernoulli(p_xi) with f(y_i | p_xi) = p_xi^(y_i) (1 − p_xi)^(1 − y_i).
- Binomial likelihood: L(p_x | Y, X) = ∏_{i=1}^N p_xi^(y_i) (1 − p_xi)^(1 − y_i).
- Binomial log-likelihood: ℓ(p_x | Y, X) = Σ_{i=1}^N { y_i log[p_xi / (1 − p_xi)] + log(1 − p_xi) }.
- Logistic regression log-likelihood: ℓ(β | X, Y) = Σ_{i=1}^N { y_i (β0 + β1 x_i) − log(1 + e^(β0 + β1 x_i)) }.
- There is no closed-form solution for the maximum likelihood estimates of the β values.
- Numerical maximization techniques are required.
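Since no closed form exists, the maximum is found numerically. Below is a minimal Python sketch of gradient ascent on the log-likelihood above (illustrative only; R's glm fits the same model via iteratively reweighted least squares, and the toy data here is made up):

```python
import math

def log_likelihood(b0, b1, xs, ys):
    # Logistic regression log-likelihood from the slide:
    # sum_i [ y_i*(b0 + b1*x_i) - log(1 + exp(b0 + b1*x_i)) ]
    return sum(y * (b0 + b1 * x) - math.log1p(math.exp(b0 + b1 * x))
               for x, y in zip(xs, ys))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    # Gradient ascent; the score components are
    # sum_i (y_i - p_i) and sum_i x_i*(y_i - p_i).
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += x * (y - p)
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1
```

On a small non-separable dataset this climbs to a stationary point of the log-likelihood; the fitted intercept and slope play the roles of β0 and β1.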

SLIDE 11

Logistic regression terminology

Let p be the probability of success. Recall that logit(p_X) = log[p_X / (1 − p_X)] = β0 + β1 X.

- p_X / (1 − p_X) is called the odds of success.
- log[p_X / (1 − p_X)] is called the log odds of success.

[Figure: odds and log odds plotted against the probability of success p]

SLIDE 12

Another motivation for logistic regression

- Since p ∈ [0, 1], the log odds log[p / (1 − p)] ranges over (−∞, ∞).
- So while linear regression estimates anything in (−∞, +∞),
- logistic regression estimates a proportion in [0, 1].

SLIDE 13

Review of probabilities and odds

Measure                               Min    Max   Name
P(Y = 1)                              0      1     "probability"
P(Y = 1) / [1 − P(Y = 1)]             0      ∞     "odds"
log{P(Y = 1) / [1 − P(Y = 1)]}        −∞     ∞     "log-odds" or "logit"

- The odds of an event are defined as

odds(Y = 1) = P(Y = 1) / P(Y = 0) = P(Y = 1) / [1 − P(Y = 1)] = p / (1 − p)  ⇒  p = odds(Y = 1) / [1 + odds(Y = 1)].
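The probability-to-odds conversion and its inverse are each a one-liner; a Python sketch (illustrative helper names, not part of the deck's R code):

```python
def odds(p):
    # Odds of success: p / (1 - p).
    return p / (1 - p)

def prob_from_odds(o):
    # Invert the mapping: p = odds / (1 + odds).
    return o / (1 + o)
```

For example, p = 0.5 gives odds of 1, and converting odds back always recovers the original probability.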

SLIDE 14

Review of odds ratio

                    Outcome status
Exposure status     +    −
+                   a    b
−                   c    d

OR = [Odds of being a case given exposed] / [Odds of being a case given unexposed]
   = [a/(a+b) ÷ b/(a+b)] / [c/(c+d) ÷ d/(c+d)]
   = (a/b) / (c/d)
   = ad / (bc).
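The cross-product formula above reduces to one line of code; a Python sketch (the function name is ours):

```python
def odds_ratio(a, b, c, d):
    # 2x2 table: rows are exposure (+, -), columns are outcome (+, -).
    # OR = (a/b) / (c/d) = ad / (bc).
    return (a * d) / (b * c)
```

Note the symmetry: swapping the roles of exposed and unexposed inverts the OR.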

SLIDE 15

Review of odds ratio

- Odds ratios (OR) can be useful for comparisons.
- Suppose we have a trial to see if an intervention T reduces mortality, compared to a placebo, in patients with high cholesterol. The odds ratio is

OR = odds(death | intervention T) / odds(death | placebo)

- The OR describes the benefits of intervention T:
  - OR < 1: the intervention is better than the placebo, since odds(death | intervention T) < odds(death | placebo).
  - OR = 1: there is no difference between the intervention and the placebo.
  - OR > 1: the intervention is worse than the placebo, since odds(death | intervention T) > odds(death | placebo).

SLIDE 16

Interpretation of logistic regression parameters

log[p_X / (1 − p_X)] = β0 + β1 X

- β0 is the log of the odds of success at zero values for all covariates.
- e^(β0) / (1 + e^(β0)) is the probability of success at zero values for all covariates.
- Interpretation of e^(β0) / (1 + e^(β0)) depends on the sampling of the dataset:
  - Population cohort: disease prevalence at X = 0.
  - Case-control: ratio of cases to controls at X = 0.

SLIDE 17

Interpretation of logistic regression parameters

The slope β1 is the increase in the log odds associated with a one-unit increase in X:

β1 = [β0 + β1(X + 1)] − (β0 + β1 X)
   = log[p_{X+1} / (1 − p_{X+1})] − log[p_X / (1 − p_X)]
   = log{ [p_{X+1} / (1 − p_{X+1})] / [p_X / (1 − p_X)] },

and e^(β1) = OR.

- If β1 = 0, there is no association between changes in X and changes in the success probability (OR = 1).
- If β1 > 0, there is a positive association between X and p (OR > 1).
- If β1 < 0, there is a negative association between X and p (OR < 1).

Interpretation of the slope β1 is the same regardless of sampling.

SLIDE 18

Interpreting odds ratios in logistic regression

- OR > 1: positive relationship: as X increases, the probability of Y increases; exposure (X = 1) is associated with higher odds of the outcome.
- OR < 1: negative relationship: as X increases, the probability of Y decreases; exposure (X = 1) is associated with lower odds of the outcome.
- OR = 1: no association; exposure (X = 1) does not affect the odds of the outcome.

In logistic regression, we test null hypotheses of the form H0: β1 = 0, which corresponds to OR = 1.

SLIDE 19

Logistic regression terminology

- The OR is the ratio of the odds for two different success probabilities: [p1 / (1 − p1)] / [p2 / (1 − p2)].
- OR = 1 when p1 = p2.
- Interpretation of odds ratios is difficult!

[Figure: odds ratios (solid lines) and log odds ratios (dashed lines) against the probability of success p1; OR = 1 corresponds to log(OR) = 0]

SLIDE 20

Multiple logistic regression

Consider a multiple logistic regression model:

log[p / (1 − p)] = β0 + β1 X1 + β2 X2

- Let X1 be a continuous variable and X2 an indicator variable (e.g. treatment or group).
- Set β0 = −0.5, β1 = 0.7, β2 = 2.5.
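Under these illustrative coefficients, the success probability for any (X1, X2) pair can be computed directly; a Python sketch (the slides' own values are hard-coded as defaults):

```python
import math

def prob(x1, x2, b0=-0.5, b1=0.7, b2=2.5):
    # p = expit(b0 + b1*x1 + b2*x2) for the two-covariate model above.
    eta = b0 + b1 * x1 + b2 * x2
    return math.exp(eta) / (1 + math.exp(eta))
```

Because β2 = 2.5, switching the indicator X2 from 0 to 1 shifts the log odds up by exactly 2.5 at every value of X1, which is what makes the two curves in the model parallel on the logit scale.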

SLIDE 21

Data example: CHD events

Data from the Western Collaborative Group Study (WCGS). For this example, we are interested in the outcome

Y = 1 if the subject develops coronary heart disease (CHD), Y = 0 if no CHD.

1. How likely is a person to develop CHD?
2. Is hypertension associated with CHD events?
3. Is age associated with CHD events?
4. Does weight confound the association between hypertension and CHD events?
5. Is there a differential effect of CHD events for those with and without hypertension depending on weight?

SLIDE 22

How likely is a person to develop CHD?

- The WCGS was a prospective cohort study of 3524 men aged 39-59, employed in the San Francisco Bay or Los Angeles areas, enrolled in 1960 and 1961.
- Follow-up for CHD incidence was terminated in 1969.
- 3154 men were CHD free at baseline.
- 257 men developed CHD during the study.
- The estimated probability that a person in WCGS develops CHD is 257/3154 = 8.1%.
- This is an unadjusted estimate that does not account for other risk factors.
- How do we use logistic regression to determine factors that increase risk for CHD?

SLIDE 23

Getting ready to use R

Make sure you have the package epitools installed.

# install.packages("epitools")
library(epitools)
data(wcgs)
## Can get information on the dataset:
str(wcgs)
## Define hypertension as systolic BP > 140 or diastolic BP > 90:
wcgs$HT <- as.numeric(wcgs$sbp0 > 140 | wcgs$dbp0 > 90)

SLIDE 24

Is hypertension associated with CHD events?

The OR can be obtained from the 2x2 table:

table_2by2 <- data.frame(
  Hypertensive = c("No", "Yes"),
  "No CHD event" = c(sum(wcgs$chd69 == 0 & wcgs$HT == 0),
                     sum(wcgs$chd69 == 0 & wcgs$HT == 1)),
  "CHD event" = c(sum(wcgs$chd69 == 1 & wcgs$HT == 0),
                  sum(wcgs$chd69 == 1 & wcgs$HT == 1)),
  check.names = FALSE)

Hypertensive   No CHD event   CHD event
No             2312           173
Yes            585            84

OR = (2312 × 84) / (585 × 173) = 1.92.
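The same cross-product can be checked by hand; a Python sketch using the table's counts (variable names are ours):

```python
# 2x2 table from the WCGS example:
# rows No/Yes hypertension, columns No CHD event / CHD event.
no_chd_no_ht, chd_no_ht = 2312, 173
no_chd_ht, chd_ht = 585, 84

# Odds of CHD among hypertensives divided by odds among non-hypertensives.
or_ht = (chd_ht / no_chd_ht) / (chd_no_ht / no_chd_no_ht)
```

This reproduces the slide's OR of about 1.92 before any model is fit.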

SLIDE 25

The OR can also be obtained from the logistic regression model:

logit[P(CHD)] = log{P(CHD) / [1 − P(CHD)]} = β0 + β1 × hypertension.

logit_HT <- glm(chd69 ~ HT, data = wcgs, family = "binomial")
coefficients(summary(logit_HT))
##               Estimate Std. Error    z value      Pr(>|z|)
## (Intercept) -2.5925766 0.07882162 -32.891693 2.889272e-237
## HT           0.6517816 0.14080842   4.628854  3.676954e-06

The OR from logistic regression is the same as from the 2x2 table: exp(β1) = exp(0.6517816) = 1.92.

SLIDE 26

- The effect of HT is significant (p = 3.68 × 10^−6).
- The odds of developing CHD are 1.92 times higher in hypertensives than non-hypertensives; 95% C.I. (1.46, 2.53).

SLIDE 27

Is age associated with CHD events?

logit[P(CHD)] = log{P(CHD) / [1 − P(CHD)]} = β0 + β1 × age.

logit_age <- glm(chd69 ~ age0, data = wcgs, family = "binomial")
coefficients(summary(logit_age))
##                Estimate Std. Error    z value     Pr(>|z|)
## (Intercept) -5.93951594 0.54931839 -10.812520 3.003058e-27
## age0         0.07442256 0.01130234   6.584705 4.557900e-11

- Yes, CHD risk is significantly associated with increased age (p = 4.56 × 10^−11).
- The OR = exp(0.0744) = 1.08; 95% C.I. (1.05, 1.10).
- For a 1-year increase in age, the log odds of a CHD event increase by 0.0744, i.e. the odds of a CHD event are multiplied by exp(0.0744) = 1.08.
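The OR and its 95% Wald interval follow directly from the estimate and standard error in the output above; a Python sketch of the arithmetic:

```python
import math

# Coefficient and standard error for age0 from the fitted model above.
est, se = 0.07442256, 0.01130234

or_age = math.exp(est)
# 95% Wald confidence interval, exponentiated onto the OR scale.
ci_low = math.exp(est - 1.96 * se)
ci_high = math.exp(est + 1.96 * se)
```

Exponentiating the endpoints of the interval on the log-odds scale is what produces the (1.05, 1.10) interval quoted on the slide.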

SLIDE 28

What does the logistic model for age look like?

logit(CHD) = −5.94 + 0.07 × age
P(CHD) = exp(−5.94 + 0.07 × age) / [1 + exp(−5.94 + 0.07 × age)]

library(ggplot2)
wcgs$pred_age <- predict(logit_age, data.frame(age0 = wcgs$age0), type = "resp")
ggplot(wcgs, aes(age0, chd69)) +
  geom_point(position = position_jitter(h = 0.01, w = 0.01),
             shape = 21, alpha = 0.5, size = 1) +
  geom_line(aes(y = pred_age)) +
  ggtitle("Age vs CHD with predicted curve") +
  xlab("Age") + ylab("CHD event status") +
  theme_bw()

SLIDE 29

[Figure: "Age vs CHD with predicted curve" — CHD event status (0/1) against age (40-60) with the fitted logistic curve]

SLIDE 30

Does weight confound the association between hypertension and CHD events?

Recall that the OR for HT was 1.92 (the β value was 0.6518). Fit the model logit(CHD) = β0 + β1 HT + β2 weight.

logit_weight <- glm(chd69 ~ HT + weight0, data = wcgs, family = "binomial")
coefficients(summary(logit_weight))
##                 Estimate Std. Error   z value     Pr(>|z|)
## (Intercept) -3.928507302 0.51403008 -7.642563 2.129397e-14
## HT           0.568375813 0.14480630  3.925077 8.670213e-05
## weight0      0.007898806 0.00297963  2.650935 8.026933e-03

Look at the change in the coefficient for HT between the unadjusted and adjusted models:

- (0.6518 − 0.5684) / 0.6518 = 12.8%.
- Since the change in effect size is > 10%, we would consider weight a confounder.
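The change-in-estimate check is simple arithmetic on the two HT coefficients; a Python sketch (the 10% cutoff is the rule of thumb used on the slide):

```python
# HT coefficients from the unadjusted and weight-adjusted models above.
unadjusted, adjusted = 0.6517816, 0.568375813

relative_change = (unadjusted - adjusted) / unadjusted
is_confounder = relative_change > 0.10  # 10% change-in-estimate rule
```

The relative change of about 12.8% exceeds the cutoff, so weight is flagged as a confounder.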

SLIDE 31

Is there a differential effect of weight on CHD for those with and without HT?

In other words, is there an interaction between weight and hypertension? Fit the model

logit[P(CHD)] = β0 + β1 HT + β2 weight + β3 (HT × weight).

logit_HTweight <- glm(chd69 ~ HT + weight0 + HT:weight0, data = wcgs, family = "binomial")
coefficients(summary(logit_HTweight))
##                Estimate  Std. Error   z value     Pr(>|z|)
## (Intercept) -4.82255032 0.671632476 -7.180341 6.953768e-13
## HT           2.82407466 1.096531902  2.575461 1.001067e-02
## weight0      0.01311598 0.003871862  3.387512 7.052961e-04
## HT:weight0  -0.01279195 0.006184812 -2.068285 3.861323e-02

SLIDE 32

Interaction model interpretation

- The interaction effect is significant (p = 0.0386).
- Odds ratio for a 1 lb. increase in weight for those without hypertension: exp(0.013116) = 1.01.
- Odds ratio for a 1 lb. increase in weight for those with hypertension: exp(0.013116 − 0.012792) ≈ 1.

Plot of the interaction model:

wcgs$pred_interaction <- predict(logit_HTweight,
    data.frame(weight0 = wcgs$weight0, HT = wcgs$HT), type = "resp")
ggplot(wcgs, aes(weight0, chd69, color = as.factor(HT))) +
  geom_point(position = position_jitter(h = 0.01, w = 0.01),
             shape = 21, alpha = 0.5, size = 1) +
  geom_line(aes(y = pred_interaction, group = HT)) +
  scale_colour_manual(name = "HT status", values = c("red", "blue")) +
  ggtitle("Weight vs CHD with predicted curve") +
  xlab("Weight") + ylab("CHD event status") +
  theme_bw()
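The two group-specific per-pound ORs come from adding the interaction coefficient to the main weight effect; a Python sketch of that computation:

```python
import math

# weight0 and HT:weight0 coefficients from the interaction model above.
b_weight, b_interaction = 0.01311598, -0.01279195

# Per-pound OR for weight within each hypertension group.
or_no_ht = math.exp(b_weight)               # HT = 0: weight effect alone
or_ht = math.exp(b_weight + b_interaction)  # HT = 1: weight + interaction
```

The near-cancellation of the two coefficients is why the OR for the hypertensive group is essentially 1: weight carries almost no additional CHD risk once hypertension is present.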

SLIDE 33

Plot of interaction model

[Figure: "Weight vs CHD with predicted curve" — CHD event status against weight (100-300 lbs), with separate predicted curves colored by HT status]

SLIDE 34

Plot of interaction model – interpretation

##                Estimate  Std. Error   z value     Pr(>|z|)
## (Intercept) -4.82255032 0.671632476 -7.180341 6.953768e-13
## HT           2.82407466 1.096531902  2.575461 1.001067e-02
## weight0      0.01311598 0.003871862  3.387512 7.052961e-04
## HT:weight0  -0.01279195 0.006184812 -2.068285 3.861323e-02

- The effect of increasing weight on CHD risk is different between those with and without hypertension.
- For those without hypertension, an increase in weight leads to an increase in CHD risk.
- For those with hypertension, the risk of CHD is nearly constant with respect to weight.

SLIDE 35

Predicted probabilities

- Fit the model and obtain the estimated coefficients.
- Calculate the predicted probability p̂ for each person depending on their characteristics X:

p̂ = exp(β̂0 + β̂1 X) / [1 + exp(β̂0 + β̂1 X)]

[Figure: predicted probability curve against values of X]

SLIDE 36

Predicted probability of CHD by weight

The model is logit[P(CHD)] = β0 + β1 × weight.

logit_weight_noHT <- glm(chd69 ~ weight0, data = wcgs, family = "binomial")
coefficients(summary(logit_weight_noHT))
##                Estimate Std. Error   z value     Pr(>|z|)
## (Intercept) -4.21470593 0.51206319 -8.230832 1.859181e-16
## weight0      0.01042419 0.00291957  3.570455 3.563615e-04

Based on the model, the predicted probability for a person weighing 175 lbs is

P(CHD | 175 lbs) = exp(−4.2147059 + 0.0104242 × 175) / [1 + exp(−4.2147059 + 0.0104242 × 175)] = 0.0839, or 8.4%.
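The prediction on the slide is just the inverse-logit of the linear predictor at weight = 175; a Python sketch using the fitted coefficients (the function name is ours):

```python
import math

# Fitted coefficients from the weight-only model above.
b0, b1 = -4.21470593, 0.01042419

def predicted_prob(weight):
    # p = expit(b0 + b1 * weight).
    eta = b0 + b1 * weight
    return math.exp(eta) / (1 + math.exp(eta))
```

Evaluating at 175 lbs reproduces the 8.4% on the slide, and the curve is monotonically increasing in weight since b1 > 0.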

SLIDE 37

Plot of predicted probability of CHD by weight

[Figure: "Weight vs CHD with predicted curve" — CHD event status against weight (100-300 lbs) with the fitted logistic curve]

SLIDE 38

Alternative models for binary outcomes

The logit function induces a specific shape for the relationship between the covariate X and the probability of success p = P(Y = 1 | X). Alternatives include:

Logit: log[p / (1 − p)] = α + βX.
Probit: Φ^(−1)(p) = α + βX, where Φ is the Normal CDF.
Log-log: −log[−log(p)] = α + βX.

[Figure: logit and probit link functions compared]

SLIDE 39

Summary

- Logistic regression models the log of the odds of an outcome.
- It is used when the outcome is binary.
- We interpret odds ratios (exponentiated coefficients) from logistic regression.
- We can control for confounding factors and assess interactions in logistic regression.
- Many of the concepts that apply to multiple linear regression continue to apply in logistic regression.