Lecture #11: Logistic Regression - Part II (Data Science 1: CS 109A, STAT 121A, AC 209A, E-109A)


slide-1
SLIDE 1

Lecture #11: Logistic Regression - Part II

Data Science 1 CS 109A, STAT 121A, AC 209A, E-109A Pavlos Protopapas Kevin Rader Margo Levine Rahul Dave

slide-2
SLIDE 2

Lecture Outline

Logistic Regression: a Brief Review Classification Boundaries Regularization in Logistic Regression Multinomial Logistic Regression Bayes Theorem and Misclassification Rates ROC Curves

2

slide-3
SLIDE 3

Logistic Regression: a Brief Review

3


slide-6
SLIDE 6

Multiple Logistic Regression

Earlier we saw the general form of simple logistic regression, meaning when there is just one predictor used in the model. What was the model statement (in terms of linear predictors)?

log( P(Y = 1) / (1 − P(Y = 1)) ) = β0 + β1X

Multiple logistic regression is a generalization to multiple predictors. More specifically, we can define a multiple logistic regression model to predict P(Y = 1) as such:

log( P(Y = 1) / (1 − P(Y = 1)) ) = β0 + β1X1 + β2X2 + ... + βpXp

where there are p predictors: X = (X1, X2, ..., Xp).

Note: statisticians are often lazy and use the notation log to mean ln (the text does this). We will write log10 if that is what we mean.

4
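As an illustration of fitting such a model (the data below are simulated for this sketch, not the course's dataset), sklearn can fit a multiple logistic regression; a very large C effectively turns off its default penalty so the fit is close to plain maximum likelihood:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulate two predictors and a binary response from a known logistic model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
log_odds = -1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1]   # beta0, beta1, beta2
p = 1 / (1 + np.exp(-log_odds))
y = rng.binomial(1, p)

# Large C ~ no regularization; coefficients estimate beta0 and (beta1, beta2).
model = LogisticRegression(C=1e9).fit(X, y)
print(model.intercept_, model.coef_)
```

The fitted coefficients are on the log-odds scale, matching the model statement above.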

slide-7
SLIDE 7

Interpreting Multiple Logistic Regression: an Example

Let’s get back to the NFL data. We are attempting to predict whether a play results in a TD based on location (yard line) and whether the play was a pass. The simultaneous effect of these two predictors can be brought into one model. Recall from earlier we had the following estimated models:

log( P(Y = 1) / (1 − P(Y = 1)) ) = −7.425 + 0.0626 · Xyard

log( P(Y = 1) / (1 − P(Y = 1)) ) = −4.061 + 1.106 · Xpass

The results for the multiple logistic regression model are on the next slide.

5
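As a quick check of interpretation, exponentiating a logistic regression coefficient gives an odds ratio. Using the slide's estimates (the 10-yard-line comparison is our own illustration, not from the slides):

```python
import math

# Coefficients from the two single-predictor models on the slide.
beta_yard = 0.0626   # change in log-odds of a TD per yard line
beta_pass = 1.106    # pass vs. non-pass change in log-odds of a TD

# Odds ratio for passes vs. non-passes.
print(math.exp(beta_pass))        # ~3.02: odds of a TD about 3x higher on passes

# Odds ratio for a 10-yard-line difference in field position.
print(math.exp(10 * beta_yard))
```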

slide-8
SLIDE 8

Interpreting Multiple Logistic Regression: an Example

6

slide-9
SLIDE 9

Some questions

  • 1. Write down the complete model. Break this down into the model to predict log-odds of a touchdown based on the yard line for passes and the same model for non-passes. How is this different from the previous model (without interaction)?
  • 2. Estimate the odds ratio of a TD comparing passes to non-passes.
  • 3. Is there any evidence of multicollinearity in this model?
  • 4. Is there any confounding in this problem?

7

slide-10
SLIDE 10

Interactions in Multiple Logistic Regression

Just like in linear regression, interaction terms can be considered in logistic regression. An interaction term is incorporated into the model the same way, and the interpretation is very similar (on the log-odds scale of the response, of course). Write down the model for the NFL data with the two predictors plus the interaction term.

8

slide-11
SLIDE 11

Interpreting Multiple Logistic Regression with Interaction: an Example

9

slide-12
SLIDE 12

Some questions

  • 1. Write down the complete model. Break this down into the model to predict log-odds of a touchdown based on the yard line for passes and the same model for non-passes. How is this different from the previous model (without interaction)?
  • 2. Use this model to estimate the probability of a touchdown for a pass at the 20 yard line. Do the same for a run at the 20 yard line.
  • 3. Use this model to estimate the probability of a touchdown for a pass at the 99 yard line. Do the same for a run at the 99 yard line.
  • 4. Is this a stronger model than the previous one? How would we check?

10

slide-13
SLIDE 13

Classification Boundaries

11

slide-14
SLIDE 14

Classification

Recall that we could attempt to purely classify each observation based on whether the estimated P(Y = 1) from the model was greater than 0.5. When dealing with ‘well-separated’ data, logistic regression can work well in performing classification. We saw a 2-D plot last time which had two predictors, X1 and X2, and depicted the classes as different colors. A similar one is shown on the next slide.

12

slide-15
SLIDE 15

2D Classification in Logistic Regression: an Example

13


slide-18
SLIDE 18

2D Classification in Logistic Regression: an Example

Would a logistic regression model perform well in classifying the observations in this example? What would be a good logistic regression model to classify these points? Based on these predictors, two separate logistic regression models were considered, based on polynomials of different order in X1 and X2 and their interactions. The ‘circles’ represent the boundary for classification. How can the classification boundary be calculated for a logistic regression?

14

slide-19
SLIDE 19

2D Classification in Logistic Regression: an Example

In the previous plot, which classification boundary performs better? How can you tell? How would you make this determination in an actual data example? We could determine the misclassification rates in left-out validation or test set(s).

15


slide-21
SLIDE 21

Regularization in Logistic Regression

16


slide-24
SLIDE 24

Regularization in Linear Regression

Based on the likelihood framework, a loss function can be determined from the likelihood function. We saw in linear regression that maximizing the log-likelihood is equivalent to minimizing the sum of squared errors:

arg min ∑_{i=1}^n (yi − ŷi)² = arg min ∑_{i=1}^n ( yi − (β0 + β1x1i + ... + βpxpi) )²

And a regularization approach adds a penalty factor to this equation, which for Ridge Regression becomes:

arg min [ ∑_{i=1}^n ( yi − (β0 + ∑_{j=1}^p βj xji) )² + λ ∑_{j=1}^p βj² ]

This penalty shrinks the estimates towards zero, and has the analogue of using a Normal prior in the Bayesian paradigm.

17
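The ridge-penalized loss can be checked numerically. A minimal sketch (toy simulated data; `ridge_loss` is an illustrative helper, not course code) that leaves the intercept unpenalized:

```python
import numpy as np

# Toy data from a known linear model.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

def ridge_loss(beta0, beta, lam):
    # Sum of squared errors plus lambda times the sum of squared slopes;
    # the intercept beta0 is not penalized.
    resid = y - (beta0 + X @ beta)
    return np.sum(resid**2) + lam * np.sum(beta**2)

# Larger lambda penalizes large coefficients more heavily.
print(ridge_loss(0.0, np.array([1.0, -2.0, 0.5]), lam=0.0))
print(ridge_loss(0.0, np.array([1.0, -2.0, 0.5]), lam=10.0))
```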


slide-26
SLIDE 26

Loss function in Logistic Regression

A similar approach can be used in logistic regression. Here, maximizing the log-likelihood is equivalent to minimizing the following loss function:

arg min [ − ∑_{i=1}^n ( yi log(p̂i) + (1 − yi) log(1 − p̂i) ) ]

where p̂i = exp(β0 + ∑_{j=1}^p βj xji) / ( 1 + exp(β0 + ∑_{j=1}^p βj xji) ).

Why is this a good loss function to minimize? Where does this come from? The log-likelihood for independent Yi ∼ Bern(pi).

18
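This loss is easy to compute directly. A small sketch with made-up responses and fitted probabilities (illustrative values only):

```python
import numpy as np

def neg_log_likelihood(y, p_hat):
    # The logistic loss: negative Bernoulli log-likelihood at fitted probabilities.
    return -np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

y = np.array([1, 0, 1, 1, 0])
p_hat = np.array([0.9, 0.2, 0.8, 0.6, 0.1])
print(neg_log_likelihood(y, p_hat))   # ~1.168; lower is a better fit
```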

slide-27
SLIDE 27

Regularization in Logistic Regression

A penalty factor can then be added to this loss function and results in a new loss function that penalizes large values of the parameters:

arg min [ − ∑_{i=1}^n ( yi log(p̂i) + (1 − yi) log(1 − p̂i) ) + λ ∑_{j=1}^p βj² ]

The result is just like in linear regression: shrinkage of the parameter estimates towards zero. In practice, the intercept is usually not part of the penalty factor, and is thus not shrunk towards zero. Note: the sklearn package uses a different tuning parameter: instead of λ it uses a constant that is essentially C = 1/λ.

19
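To see the sklearn C = 1/λ convention in action, compare a weakly and a strongly penalized fit on simulated data (a sketch, not course code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated data: only the first two of five predictors matter.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
p = 1 / (1 + np.exp(-(X @ np.array([3.0, -3.0, 0.0, 0.0, 0.0]))))
y = rng.binomial(1, p)

# Small C means large lambda, i.e. strong shrinkage towards zero.
weak = LogisticRegression(C=100.0).fit(X, y)
strong = LogisticRegression(C=0.01).fit(X, y)
print(np.abs(weak.coef_).sum(), np.abs(strong.coef_).sum())
```

The strongly penalized fit has noticeably smaller coefficients, just as the penalty term predicts.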

slide-28
SLIDE 28

Regularization in Logistic Regression: an Example

Let’s see how this plays out in an example in logistic regression.

20

slide-29
SLIDE 29

Regularization in Logistic Regression: an Example

21

slide-30
SLIDE 30

Regularization in Logistic Regression: an Example

22


slide-32
SLIDE 32

Regularization in Logistic Regression: an Example

Just like in linear regression, the shrinkage factor must be chosen. How should we go about doing this?

Through building multiple training and test sets (through k-fold or random subsets), we can select the best shrinkage factor to mimic out-of-sample prediction. How could we measure how well each model fits the test set? We could measure this based on the proposed loss function!

23

slide-33
SLIDE 33

Multinomial Logistic Regression

24

slide-34
SLIDE 34

Logistic Regression for predicting more than 2 Classes

There are several extensions to standard logistic regression when the response variable Y has more than 2 categories. The two most common are :

  • 1. ordinal logistic regression
  • 2. multinomial logistic regression.

Ordinal logistic regression is used when the categories have a specific hierarchy (like class year: Freshman, Sophomore, Junior, Senior; or a 7-point rating scale from strongly disagree to strongly agree). Multinomial logistic regression is used when the categories have no inherent order (like eye color: blue, green, brown, hazel, etc.).

25


slide-36
SLIDE 36

Multinomial Logistic Regression

This is the most common approach to estimating a nominal (not ordinal) categorical variable that has more than 2 classes. The first approach sets one of the categories in the response variable as the reference group, and then fits separate logistic regression models to predict the other cases based off of the reference group. For example, we could attempt to predict a student’s concentration:

y = 1 if Computer Science (CS), 2 if Statistics, 3 otherwise

from predictors x1 = number of psets per week and x2 = time spent in Lamont Library.

26

slide-37
SLIDE 37

Multinomial Logistic Regression (cont.)

We could select the y = 3 case as the reference group (other concentration), and then fit two separate models: a model to predict y = 1 (CS) from y = 3 (others) and a separate model to predict y = 2 (Stat) from y = 3 (others). Ignoring interactions, how many parameters would need to be estimated? How could these models be used to estimate the probability of an individual falling in each concentration?

27

slide-38
SLIDE 38

One vs. Rest (ovr) Logistic Regression (cont.)

The default multiclass logistic regression model is called the ’One vs. Rest’ approach. If there are 3 classes, then 3 separate logistic regressions are fit, where the probability of each category is predicted over the rest of the categories combined. So for the concentration example, 3 models would be fit:

  • 1. a first model would be fit to predict CS from (Stat and Others) combined
  • 2. a second model would be fit to predict Stat from (CS and Others) combined
  • 3. a third model would be fit to predict Others from (CS and Stat) combined

An example to predict play call from the NFL data follows...

28

slide-39
SLIDE 39

OVR Logistic Regression in Python

29
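The Python code from the original slide is not preserved in this transcript. As a stand-in, here is a minimal sketch of the one-vs-rest approach with sklearn, on simulated data (not the NFL play-call data):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Three classes; OvR fits one binary logistic regression per class.
X, y = make_classification(n_samples=300, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)
ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)

print(len(ovr.estimators_))        # 3 fitted binary models, one per class
print(ovr.predict_proba(X[:2]))    # per-class probability estimates, shape (2, 3)
```

Prediction picks the class whose binary model gives the largest estimated probability.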

slide-40
SLIDE 40

Classification for more than 2 Categories

When there are more than 2 categories in the response variable, there is no guarantee that P(Y = k) ≥ 0.5 for any one category. So any classifier based on logistic regression will instead have to select the group with the largest estimated probability. The classification boundaries are then much more difficult to determine. We will not get into the algorithm for drawing these in this class.

30

slide-41
SLIDE 41

Bayes Theorem and Misclassification Rates

31

slide-42
SLIDE 42

Bayes’ Theorem

We defined conditional probability as:

P(B|A) = P(B ∩ A) / P(A)

And using the fact that P(B ∩ A) = P(A|B)P(B), we get Bayes’ Theorem:

P(B|A) = P(A|B)P(B) / P(A)

Another version of Bayes’ Theorem is found by substituting the Law of Total Probability (LOTP) into the denominator:

P(B|A) = P(A|B)P(B) / [ P(A|B)P(B) + P(A|Bᶜ)P(Bᶜ) ]

Where have we seen Bayes’ Theorem before? Why do we care?

32

slide-43
SLIDE 43

Diagnostic Testing

In the diagnostic testing paradigm, one cares about whether the result of a test (like a classification test) matches the truth (the true class that an observation belongs to). The simplest version of this is trying to detect disease (D+ vs. D−) based on a diagnostic test (T+ vs. T−).

Medical examples of this include various screening tests: breast cancer screening through (i) self-examination and (ii) mammography, prostate cancer screening through (iii) PSA tests, and colorectal cancer screening through (iv) colonoscopies. These tests are a little controversial because of the poor predictive probability of the tests.

33

slide-44
SLIDE 44

Diagnostic Testing (cont.)

Bayes’ theorem can be rewritten for diagnostic tests:

P(D+|T+) = P(T+|D+)P(D+) / [ P(T+|D+)P(D+) + P(T+|D−)P(D−) ]

These probability quantities can then be defined as:

▶ Sensitivity: P(T+|D+)
▶ Specificity: P(T−|D−)
▶ Prevalence: P(D+)
▶ Positive Predictive Value: P(D+|T+)
▶ Negative Predictive Value: P(D−|T−)

How do positive and negative predictive values relate? Be careful...

34

slide-45
SLIDE 45

Diagnostic Testing (cont.)

We mentioned that these tests are a little controversial because of their poor predictive probability. When will these tests have poor positive predictive probability? When the disease is not very prevalent, the number of ‘false positives’ will overwhelm the number of true positives. For example, PSA screening for prostate cancer has a sensitivity of about 90% and a specificity of about 97% for some age groups (men in their fifties), but prevalence is about 0.1%. What is the positive predictive probability for this diagnostic test?

35
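Plugging the slide's PSA numbers into Bayes' theorem makes the point concrete:

```python
# PPV via Bayes' theorem for the PSA screening numbers on the slide.
sensitivity = 0.90   # P(T+ | D+)
specificity = 0.97   # P(T- | D-)
prevalence = 0.001   # P(D+)

ppv = (sensitivity * prevalence) / (
    sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
print(round(ppv, 4))   # ~0.0292: under 3% of positive tests are true positives
```

Despite high sensitivity and specificity, the rare disease means false positives dominate.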



slide-48
SLIDE 48

Why do we care?

As data scientists, why do we care about diagnostic testing from the medical world? (Hint: it’s not just because Kevin is a trained biostatistician!) Because classification can be thought of as a diagnostic test. Let Yi = k be the event that observation i truly belongs to category k, and let Ŷi = k be the event that we predict it to be in class k. Then Bayes’ rule states that our Positive Predictive Value for classification is:

P(Yi = k | Ŷi = k) = P(Ŷi = k | Yi = k)P(Yi = k) / [ P(Ŷi = k | Yi = k)P(Yi = k) + P(Ŷi = k | Yi ≠ k)P(Yi ≠ k) ]

Thus the probability of a predicted outcome truly being in a specific group depends on what? The proportion of observations in that class!

36


slide-50
SLIDE 50

Error in Classification

There are 2 major types of error in classification problems based on a binary outcome. They are:

▶ False positives: incorrectly predicting Ŷ = 1 when truly Y = 0.
▶ False negatives: incorrectly predicting Ŷ = 0 when truly Y = 1.

The results of a classification algorithm are often summarized in two ways: a confusion table, sometimes called a contingency table or a 2x2 table (more generally a kxk table), and a receiver operating characteristic (ROC) curve.

37

slide-51
SLIDE 51

Confusion table

When a classification algorithm (like logistic regression) is used, the results can be summarized in a kxk table as such:

                         True Republican Status
                         Yes        No
Predicted    Yes         487        288
Republican   No          218        314

The table above was a classification based on a logistic regression model to predict political party (Dem. vs. Rep.) based on 3 predictors: X1 = whether the respondent believes abortion should be legal, X2 = income (logged), and X3 = years of education. What are the false positive and false negative rates for this classifier?

38
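Using the counts from the confusion table above, the two error rates can be computed directly (a quick sketch, treating ‘Republican’ as the positive class):

```python
# Counts from the confusion table on the slide.
tp, fp = 487, 288   # predicted Republican: truly yes / truly no
fn, tn = 218, 314   # predicted not Republican: truly yes / truly no

fpr = fp / (fp + tn)   # false positive rate = 288 / 602
fnr = fn / (fn + tp)   # false negative rate = 218 / 705
print(round(fpr, 3), round(fnr, 3))   # 0.478 0.309
```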


slide-54
SLIDE 54

Bayes’ Classifier Choice

A classifier’s error rates can be tuned to modify this table. How? The choice of the Bayes’ classifier level will modify the characteristics of this table. If we thought it was more important to predict Republicans correctly (lower false positive rate), what could we do for our Bayes’ classifier level? We could instead classify Y = 1 based on whether P̂(Y = 1) ≥ π, and we could choose π to be some level other than 0.5. Let’s see what the table looks like if π were 0.28 or 0.52 instead (why such strange numbers?).

39
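A sketch of how a threshold π reclassifies the same fitted probabilities (the probabilities below are made up for illustration, not from the party model):

```python
import numpy as np

# Hypothetical fitted probabilities from a logistic regression.
p_hat = np.array([0.15, 0.30, 0.45, 0.55, 0.70, 0.90])

# Lowering pi classifies more observations as positive; raising it, fewer.
for pi in (0.28, 0.50, 0.52):
    y_pred = (p_hat >= pi).astype(int)
    print(pi, int(y_pred.sum()), "predicted positive")
```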

slide-56
SLIDE 56

Other Confusion table

Based on π = 0.28:

                         True Republican Status
                         Yes        No
Predicted    Yes         247        528
Republican   No          80         452

What has improved? What has worsened? Based on π = 0.52:

                         True Republican Status
                         Yes        No
Predicted    Yes         627        148
Republican   No          388        144

Which should we choose? Why?

40

slide-57
SLIDE 57

ROC Curves

41

slide-58
SLIDE 58

ROC Curves

The ROC curve illustrates the trade-off, over all possible thresholds, between the two types of error (or correct classification). The vertical axis displays the true positive rate (sensitivity) and the horizontal axis the false positive rate (1 − specificity). What is the shape of an ideal ROC curve? See the next slide for an example.

42


slide-60
SLIDE 60

ROC Curve Example

43

slide-61
SLIDE 61

ROC Curve for measuring classifier performance

The overall performance of a classifier, calculated over all possible thresholds, is given by the area under the ROC curve (’AUC’). An ideal ROC curve will hug the top left corner, so the larger the AUC, the better the classifier. What is the worst-case scenario for AUC? What is the best case? What is the AUC if we independently just flip a coin to perform classification? This AUC can then be used to compare various approaches to classification: logistic regression, LDA (to come), kNN, etc...

44
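A minimal sketch (simulated data, in-sample for brevity) of computing an ROC curve and AUC with sklearn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve

# Simulate a binary response with real signal from two predictors.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
p = 1 / (1 + np.exp(-(2 * X[:, 0] - X[:, 1])))
y = rng.binomial(1, p)

# Fitted probabilities, then the ROC points over all thresholds.
p_hat = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]
fpr, tpr, thresholds = roc_curve(y, p_hat)
auc = roc_auc_score(y, p_hat)
print(auc)   # AUC: 0.5 is a coin flip, 1.0 is a perfect classifier
```

In practice, the AUC should be computed on held-out data to fairly compare classifiers.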

slide-62
SLIDE 62

ROC Curve for measuring classifier preformance

The overall performance of a classifier, calculated over all possible thresholds, is given by the area under the ROC curve (’AUC’). An ideal ROC curve will hug the top left corner, so the larger the AUC the better the classifier. What is the worst case scenario for AUC? What is the best case? What is AUC if we independently just flip a coin to perform classification? This AUC then can be use to compare various approaches to classification: Logistic regression, LDA (to come), kNN, etc...

44