Advanced Section #5: Generalized Linear Models: Logistic Regression and Beyond



SLIDE 1

Advanced Section #5: Generalized Linear Models: Logistic Regression and Beyond

Marios Mattheakis and Pavlos Protopapas

CS109A Introduction to Data Science
Pavlos Protopapas and Kevin Rader

SLIDE 2

CS109A, PROTOPAPAS, RADER

Outline

  • 1. Generalized Linear Models (GLMs):
       a. Motivation.
       b. Linear Regression Model (recap): the jumping-off point.
       c. Generalizing the Linear Model:
          i. Generalization of the random component (error distribution).
          ii. Generalization of the systematic component (link function).
  • 2. Maximum Likelihood Estimation in this general framework:
       a. Canonical links.
       b. General links.

SLIDE 3

Motivation

Ordinary Least Squares (OLS) regression is a great model, but it cannot describe every situation. OLS assumes:
➢ Normally distributed observations.
➢ An expectation that depends linearly on the predictors.
Many real-world observations violate these assumptions, e.g.:
➢ Binary data: Bernoulli or Binomial distributions.
➢ Positive data: Exponential or Gamma distributions.

SLIDE 4

GLMs formulations: Overview

[Diagram] A regression model combines an error distribution (Normal, Poisson, Bernoulli, and more exponential-family distributions) with a link function to form a Generalized Linear Model.

SLIDE 5

Regression Models

Suppose a dataset with n training points, {(x_i, y_i)}, i = 1, …, n.

In a regression model we are looking for

    y = f(x) + ε

where:
➢ f is some fixed but unknown function.
➢ ε is a random error term.

SLIDE 6

Linear Regression Model

The observations are independently distributed about a linear predictor, with a Normal distribution.

Linear model:

    y_i = x_i^T β + ε_i,   ε_i ~ N(0, σ²)

SLIDE 7

Linear Regression Model

The distribution conditional on the predictors:

    y | x ~ N(x^T β, σ²),   i.e.   E[y | x] = μ = x^T β
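As a quick numerical illustration of this model (a minimal NumPy sketch, not part of the original slides; the variable names are illustrative), we can simulate data from y | x ~ N(x^T β, σ²) and recover β by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the linear model: y | x ~ N(x^T beta, sigma^2)
n, p = 500, 3
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, p))
sigma = 0.3
y = X @ beta_true + rng.normal(scale=sigma, size=n)

# OLS estimate: beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With 500 points and small noise, the estimate lands close to the true coefficients.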

SLIDE 8

GLMs formulation

SLIDE 9

GLMs formulation

This will be a two-step generalization of simple linear regression.

  • 1. Random Component: the observations y_i follow a distribution from the exponential family of distributions.
  • 2. Systematic Component: a link function g connects the expectation to the linear predictor, g(μ_i) = x_i^T β.
SLIDE 10

Exponential Family of Distributions

A wide range of distributions that includes as special cases the Normal, Exponential, Gamma, Poisson, Bernoulli, Binomial, and many others:

    f(y; θ, φ) = exp{ (yθ − b(θ)) / a(φ) + c(y, φ) }

➢ θ: canonical parameter, the parameter of interest.
➢ a(φ): dispersion parameter, a scale parameter related to the variance.
➢ b(θ): cumulant function; it completely characterizes the distribution.
➢ c(y, φ): normalization factor.
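This form can be checked directly in code. The sketch below (illustrative, not from the slides) evaluates the exponential-family density with the Normal choices θ = μ, a(φ) = σ², b(θ) = θ²/2, c(y, φ) = −y²/(2σ²) − ½ log(2πσ²), and compares it with the usual Normal formula:

```python
import math

def expfam_pdf(y, theta, a_phi, b, c):
    """Exponential-family density: exp{(y*theta - b(theta))/a(phi) + c(y)}."""
    return math.exp((y * theta - b(theta)) / a_phi + c(y))

# Normal(mu, sigma^2) in exponential-family form
mu, sigma2 = 1.5, 0.8
b = lambda t: t * t / 2
c = lambda y: -y * y / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2)

y = 0.7
pdf_expfam = expfam_pdf(y, mu, sigma2, b, c)

# Direct Normal density for comparison
pdf_direct = math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
```

The two expressions agree term by term after expanding −(y − μ)²/(2σ²).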

SLIDE 11

Likelihood and Score function

Likelihood:

    L(θ) = ∏_{i=1}^{n} f(y_i; θ, φ)

Log-likelihood (easier and numerically more stable):

    ℓ(θ) = log L(θ) = Σ_{i=1}^{n} log f(y_i; θ, φ)

Score function:

    s(θ) = ∂ℓ(θ)/∂θ

SLIDE 12

Two General Identities

Identity I:    E[s(θ)] = 0

Identity II:   E[s(θ)²] = −E[∂s(θ)/∂θ] ≡ F(θ)

F(θ) is called the Fisher information matrix; E_ν denotes the ν-th moment.
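Both identities can be probed by Monte Carlo for a single Bernoulli observation, where (as derived in the following slides) the score is s(θ) = y − b'(θ) when a(φ) = 1. This is a rough numerical sketch, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# Bernoulli in canonical form: a(phi) = 1, b(theta) = log(1 + e^theta)
theta = 0.4
p = 1 / (1 + np.exp(-theta))   # b'(theta) = E[y]
fisher = p * (1 - p)           # b''(theta) = Fisher information of one observation

# Score of each simulated observation: s(theta) = y - b'(theta)
y = rng.binomial(1, p, size=200_000)
s = y - p

mean_score = s.mean()          # Identity I:  E[s] = 0
var_score = (s ** 2).mean()    # Identity II: E[s^2] = b''(theta)
```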

SLIDE 13

Some derivatives before the proofs

First derivative of the log-likelihood (one observation):

    ∂ℓ/∂θ = [y − b'(θ)] / a(φ)

Second derivative of the log-likelihood:

    ∂²ℓ/∂θ² = −b''(θ) / a(φ)

SLIDE 14

Some useful relations before the proofs

The ν-th moment of an arbitrary function h(y):

    E_ν[h(y)] = ∫ h(y)^ν f(y; θ, φ) dy

Since the observations are assumed independent of each other:

    f(y_1, …, y_n) = ∏_{i=1}^{n} f(y_i)

For a well-defined probability density:

    ∫ f(y; θ, φ) dy = 1

SLIDE 15

Proof of Identity I

Proof:

    E[s(θ)] = ∫ (∂ℓ/∂θ) f dy = ∫ (1/f)(∂f/∂θ) f dy = ∫ (∂f/∂θ) dy = ∂/∂θ ∫ f dy = ∂(1)/∂θ = 0

where the regularity condition allows taking the derivative out of the integral.

SLIDE 16

Proof of Identity II

Proof:

Differentiate E[s(θ)] = ∫ s f dy = 0 once more with respect to θ:

    0 = ∂/∂θ ∫ s f dy = ∫ (∂s/∂θ) f dy + ∫ s (∂f/∂θ) dy

1st term:  ∫ (∂s/∂θ) f dy = E[∂s/∂θ]
2nd term:  ∫ s (∂f/∂θ) dy = ∫ s (s f) dy = E[s²]

Hence E[s²] = −E[∂s/∂θ].

SLIDE 17

Mean & Variance Formulas in the Exponential Family

    E[y] = μ = b'(θ)

    Var[y] = a(φ) b''(θ)

where primes denote derivatives w.r.t. the canonical parameter θ. b(θ) is called the cumulant function of the distribution, since it completely determines the first two moments (cumulants).
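These formulas can be checked by finite differences in the Bernoulli case, where b(θ) = log(1 + e^θ) and a(φ) = 1 (a numerical sketch with illustrative names, not from the slides):

```python
import math

# Bernoulli cumulant function: b(theta) = log(1 + e^theta)
b = lambda t: math.log1p(math.exp(t))

theta, h = -0.3, 1e-5
# Central finite differences approximating b'(theta) and b''(theta)
b1 = (b(theta + h) - b(theta - h)) / (2 * h)
b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2

p = 1 / (1 + math.exp(-theta))
mean_formula = p             # E[y]   = b'(theta) = p
var_formula = p * (1 - p)    # Var[y] = a(phi) b''(theta) = p(1 - p)
```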

SLIDE 18

Some derivatives before the proofs

SLIDE 19

Proof of mean formula

Proof: By Identity I, E[s(θ)] = 0. Using s(θ) = [y − b'(θ)] / a(φ):

    E[(y − b'(θ)) / a(φ)] = 0   ⟹   E[y] = b'(θ)

SLIDE 20

Proof of Variance formula

Proof: By Identity II, E[s(θ)²] = −E[∂s(θ)/∂θ]. With s(θ) = [y − b'(θ)] / a(φ):

    E[(y − b'(θ))²] / a(φ)² = b''(θ) / a(φ)   ⟹   Var[y] = a(φ) b''(θ)

SLIDE 21

Normal Distribution: Example

Probability density of the Normal distribution:

    f(y; μ, σ²) = (1/√(2πσ²)) exp{ −(y − μ)² / (2σ²) }
                = exp{ (yμ − μ²/2)/σ² − y²/(2σ²) − ½ log(2πσ²) }

which is in exponential-family form with θ = μ, b(θ) = θ²/2, a(φ) = σ², and c(y, φ) = −y²/(2σ²) − ½ log(2πσ²).

SLIDE 22

Bernoulli distribution: Example

It is a discrete probability distribution of a random binary variable y ∈ {0, 1}:

    f(y; p) = p^y (1 − p)^{1−y} = exp{ y log(p/(1−p)) + log(1 − p) }

which is in exponential-family form with θ = log(p/(1−p)), b(θ) = log(1 + e^θ), and a(φ) = 1.

SLIDE 23

Second step of GLMs formulation: Link Function

Systematic component:

    g(μ_i) = η_i = x_i^T β

SLIDE 24

Link Function

A link function g is a one-to-one differentiable transformation that makes the transformed expectation linear in the predictors:

    g(μ) = η = x^T β

η = x^T β is called the linear predictor. The link transforms the expectation, NOT the observations: for instance, for the log link, g(μ) = log μ = η (not log y). Since g is one-to-one, we can invert to get

    μ = g⁻¹(η) = g⁻¹(x^T β)
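For the logit link (used below for the Bernoulli case), the transformation and its inverse can be sketched as follows (illustrative code, not from the slides):

```python
import math

# Logit link: eta = g(mu) = log(mu / (1 - mu)); maps (0, 1) onto the real line
def logit(mu):
    return math.log(mu / (1 - mu))

# Inverse link: mu = g^{-1}(eta) = 1 / (1 + e^{-eta}); maps back into (0, 1)
def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))

mu = 0.73
eta = logit(mu)            # the link transforms the expectation, not the data
mu_back = inv_logit(eta)   # one-to-one, so the transformation is invertible
```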

SLIDE 25

Canonical Links

A canonical link makes the linear predictor equal to the canonical parameter:

    η = θ,   i.e.   g(μ) = (b')⁻¹(μ)

A canonical transformation is therefore determined by the cumulant function, so the derivative of the cumulant function, b', must be invertible.

SLIDE 26

Normal and Bernoulli distributions: Examples

Normal distribution: we found earlier that μ = b'(θ) = θ. Hence the canonical link is the identity:

    g(μ) = μ = x^T β

Bernoulli distribution: we found earlier that μ = b'(θ) = 1/(1 + e^{−θ}). Hence the canonical link is the logit:

    g(μ) = log( μ / (1 − μ) ) = x^T β

SLIDE 27

Data Distribution and Canonical Links

Distribution    Canonical link
Normal          Identity:  g(μ) = μ
Poisson         Log:       g(μ) = log μ
Bernoulli       Logit:     g(μ) = log(μ/(1−μ))
Gamma           Inverse:   g(μ) = −1/μ

SLIDE 28

GLMs: A general framework

We found that linear, logistic, and other regression models are special cases of GLMs.

Working in such a general framework is a great advantage: there is a general theory that can then be applied to any specific distribution and regression model. For instance, from the general likelihood we can derive general equations that maximize the likelihood.

SLIDE 29

Maximum Likelihood Estimation (MLE)

SLIDE 30

Maximum Likelihood Estimation (MLE)

Likelihood in the exponential family:

    L(β) = ∏_{i=1}^{n} exp{ (y_i θ_i − b(θ_i)) / a(φ) + c(y_i, φ) }

Log-likelihood in the exponential family:

    ℓ(β) = Σ_{i=1}^{n} [ (y_i θ_i − b(θ_i)) / a(φ) + c(y_i, φ) ]

SLIDE 31

log-likelihood is a strictly concave function

hence it has at most one maximum, which can be found by setting the score function to zero.
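The concavity can be probed numerically for the Bernoulli log-likelihood with the canonical logit link: along any line in coefficient space, second differences of ℓ are non-positive. A rough sketch with simulated data (not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(2)

# Bernoulli log-likelihood with the canonical (logit) link:
# l(beta) = sum_i [ y_i * eta_i - log(1 + e^{eta_i}) ],  eta_i = x_i^T beta
X = rng.normal(size=(300, 2))
beta_true = np.array([0.8, -1.2])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

def loglik(beta):
    eta = X @ beta
    return float(np.sum(y * eta - np.log1p(np.exp(eta))))

# Evaluate l along an arbitrary line; concavity makes second differences <= 0
direction = np.array([1.0, 0.5])
ts = np.linspace(-2, 2, 41)
vals = np.array([loglik(beta_true + t * direction) for t in ts])
second_diffs = vals[2:] - 2 * vals[1:-1] + vals[:-2]
```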

SLIDE 32

MLE for Canonical Links

Normal equations for the MLE (canonical link):

    Σ_{i=1}^{n} (y_i − μ_i) x_i = 0,   i.e.   X^T y = X^T μ

Solving the normal equations, we estimate the coefficients β.
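For the logit link there is no closed form, but the normal equations X^T (y − μ) = 0 can be solved by Newton-Raphson (equivalently, iteratively reweighted least squares). A minimal NumPy sketch on simulated data (illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated logistic-regression data
n, p = 2000, 3
X = rng.normal(size=(n, p))
beta_true = np.array([0.5, -1.0, 1.5])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

# Newton-Raphson on the score equations X^T (y - mu) = 0
beta = np.zeros(p)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ beta))    # b'(theta_i): fitted means
    W = mu * (1 - mu)                   # b''(theta_i): Fisher weights
    grad = X.T @ (y - mu)               # score vector
    H = X.T @ (X * W[:, None])          # Fisher information matrix
    beta = beta + np.linalg.solve(H, grad)

# At the MLE the normal equations hold: the score vanishes
score_at_mle = X.T @ (y - 1 / (1 + np.exp(-X @ beta)))
```

Strict concavity of the log-likelihood guarantees this iteration converges to the unique maximizer.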

SLIDE 33

MLE Examples

Normal distribution (link = identity): μ_i = x_i^T β, so the normal equations

    X^T (y − Xβ) = 0

have the closed-form solution β̂ = (X^T X)⁻¹ X^T y (linear regression).

Bernoulli distribution (link = logit): μ_i = 1/(1 + e^{−x_i^T β}), so

    Σ_{i=1}^{n} ( y_i − 1/(1 + e^{−x_i^T β}) ) x_i = 0

must be solved numerically (logistic regression).

SLIDE 34

MLE for General Links

Sometimes we may use non-canonical links, for instance for algorithmic purposes, such as in Bayesian probit regression.

Generalized estimating equations:

    Σ_{i=1}^{n} [ (y_i − μ_i) / Var[y_i] ] (∂μ_i/∂η_i) x_i = 0

SLIDE 35

Summary

  • Generalized Linear Models:

1. Motivation: OLS cannot describe everything; a good jumping-off point.
2. Formulation:
   ➢ Generalization of the random component (error distribution).
   ➢ Generalization of the systematic component (link function).
3. Normal & Bernoulli distributions: examples.

  • Maximum Likelihood Estimation (MLE)

1. General framework: one theory for many regression models.
2. Normal equations for MLE (canonical links).
   ➢ Linear & logistic regression examples.
3. Generalized estimating equations (general links).

SLIDE 36

Questions?

Office hours for Adv. Sec.: Monday 6:00–7:30 pm, Tuesday 6:30–8:00 pm.

SLIDE 37

General Equations: Proof

Using the chain rule:

    ∂ℓ_i/∂β = (∂ℓ_i/∂θ_i)(∂θ_i/∂μ_i)(∂μ_i/∂η_i)(∂η_i/∂β)
            = [ (y_i − μ_i) / a(φ) ] · [ 1 / b''(θ_i) ] · (∂μ_i/∂η_i) · x_i

hence, using Var[y_i] = a(φ) b''(θ_i), summing over observations and setting the score to zero gives

    Σ_{i=1}^{n} [ (y_i − μ_i) / Var[y_i] ] (∂μ_i/∂η_i) x_i = 0