Introduction to General and Generalized Linear Models Generalized - PowerPoint PPT Presentation

Introduction to General and Generalized Linear Models
Generalized Linear Models - part II
Henrik Madsen, Poul Thyregod
Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby
October 2010
Henrik Madsen, Poul Thyregod (IMM-DTU), Chapman & Hall, October 2010 (slide 1 / 29)

Today (slide 2 / 29)
- The generalized linear model
- Link function
- (Estimation)
- Fitted values
- Residuals
- Likelihood ratio test
- Over-dispersion

The Generalized Linear Model (slide 3 / 29)
Definition (The generalized linear model)
Assume that $Y_1, Y_2, \dots, Y_n$ are mutually independent, and that their density can be described by an exponential dispersion model with the same variance function $V(\mu)$. A generalized linear model for $Y_1, Y_2, \dots, Y_n$ describes an affine hypothesis for $\eta_1, \eta_2, \dots, \eta_n$, where $\eta_i = g(\mu_i)$ is a transformation of the mean values $\mu_1, \mu_2, \dots, \mu_n$. The hypothesis is of the form
\[ H_0 : \boldsymbol{\eta} - \boldsymbol{\eta}_0 \in L, \]
where $L$ is a linear subspace of $\mathbb{R}^n$ of dimension $k$, and where $\boldsymbol{\eta}_0$ denotes a vector of known offset values.

The Generalized Linear Model - Dimension and design matrix (slide 4 / 29)
Definition (Dimension of the generalized linear model)
The dimension $k$ of the subspace $L$ for the generalized linear model is the dimension of the model.
Definition (Design matrix for the generalized linear model)
Consider the linear subspace $L = \operatorname{span}\{\boldsymbol{x}_1, \dots, \boldsymbol{x}_k\}$, i.e. the subspace is spanned by $k$ vectors ($k < n$), such that the hypothesis can be written
\[ \boldsymbol{\eta} - \boldsymbol{\eta}_0 = \boldsymbol{X}\boldsymbol{\beta} \quad \text{with } \boldsymbol{\beta} \in \mathbb{R}^k, \]
where $\boldsymbol{X}$ has full rank. The $n \times k$ matrix $\boldsymbol{X}$ is called the design matrix. The $i$th row of the design matrix is given by the model vector
\[ \boldsymbol{x}_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{ik} \end{pmatrix} \]
for the $i$th observation.
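The relation $\boldsymbol{\eta} = \boldsymbol{\eta}_0 + \boldsymbol{X}\boldsymbol{\beta}$ can be illustrated with a minimal Python sketch. The data, the parameter values and the function name `linear_predictor` are hypothetical and serve only to show how the rows of the design matrix (the model vectors) enter the linear predictor.

```python
def linear_predictor(X, beta, offset=None):
    """Return eta = eta0 + X beta, with X given as a list of rows (model vectors)."""
    n = len(X)
    if offset is None:
        offset = [0.0] * n  # no offset: eta0 = 0
    return [offset[i] + sum(x_ij * b_j for x_ij, b_j in zip(X[i], beta))
            for i in range(n)]

# Hypothetical example: intercept plus one covariate, n = 4 observations, k = 2.
covariate = [0.0, 1.0, 2.0, 3.0]
X = [[1.0, x] for x in covariate]   # the i-th row is the model vector x_i
beta = [0.5, 2.0]

eta = linear_predictor(X, beta)
eta_with_offset = linear_predictor(X, beta, offset=[1.0, 1.0, 1.0, 1.0])
```

Here `X` has $n = 4$ rows and $k = 2$ columns, so the model has dimension 2.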

The Generalized Linear Model - The link function (slide 5 / 29)
Definition (The link function)
The link function $g(\cdot)$ describes the relation between the linear predictor $\eta_i$ and the mean value parameter $\mu_i = \mathrm{E}[Y_i]$. The relation is
\[ \eta_i = g(\mu_i). \]
The inverse mapping $g^{-1}(\cdot)$ thus expresses the mean value $\mu$ as a function of the linear predictor $\eta$:
\[ \mu = g^{-1}(\eta), \]
that is,
\[ \mu_i = g^{-1}(\boldsymbol{x}_i^T \boldsymbol{\beta}) = g^{-1}\Big( \sum_j x_{ij} \beta_j \Big). \]

The Generalized Linear Model - Link functions (slide 6 / 29)
The most commonly used link functions, $\eta = g(\mu)$, are:
Name        | Link function $\eta = g(\mu)$ | $\mu = g^{-1}(\eta)$
identity    | $\mu$                         | $\eta$
logarithm   | $\ln(\mu)$                    | $\exp(\eta)$
logit       | $\ln(\mu/(1-\mu))$            | $\exp(\eta)/[1+\exp(\eta)]$
reciprocal  | $1/\mu$                       | $1/\eta$
power       | $\mu^k$                       | $\eta^{1/k}$
squareroot  | $\sqrt{\mu}$                  | $\eta^2$
probit      | $\Phi^{-1}(\mu)$              | $\Phi(\eta)$
log-log     | $\ln(-\ln(\mu))$              | $\exp(-\exp(\eta))$
cloglog     | $\ln(-\ln(1-\mu))$            | $1-\exp(-\exp(\eta))$
Table: Commonly used link functions.
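The pairs $g$ and $g^{-1}$ in the table can be checked numerically. The following is a minimal sketch in standard-library Python; the names `LINKS`, `power_link` and `round_trip_error` are hypothetical helpers introduced for illustration, and the probit link uses `statistics.NormalDist` for $\Phi$ and $\Phi^{-1}$.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal distribution, used for the probit link

# name -> (g, g_inverse), following the table of commonly used link functions
LINKS = {
    "identity":   (lambda mu: mu, lambda eta: eta),
    "logarithm":  (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit":      (lambda mu: math.log(mu / (1.0 - mu)),
                   lambda eta: math.exp(eta) / (1.0 + math.exp(eta))),
    "reciprocal": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "squareroot": (lambda mu: math.sqrt(mu), lambda eta: eta ** 2),
    "probit":     (_std_normal.inv_cdf, _std_normal.cdf),
    "log-log":    (lambda mu: math.log(-math.log(mu)),
                   lambda eta: math.exp(-math.exp(eta))),
    "cloglog":    (lambda mu: math.log(-math.log(1.0 - mu)),
                   lambda eta: 1.0 - math.exp(-math.exp(eta))),
}

def power_link(k):
    """Return (g, g_inverse) for the power link g(mu) = mu**k, for mu > 0."""
    return (lambda mu: mu ** k, lambda eta: eta ** (1.0 / k))

def round_trip_error(name, mu):
    """Return |g^{-1}(g(mu)) - mu|, which should be close to zero."""
    g, g_inv = LINKS[name]
    return abs(g_inv(g(mu)) - mu)
```

For example, `round_trip_error("logit", 0.3)` should be zero up to rounding error. The value `mu` must lie in the domain of the chosen link, e.g. between 0 and 1 for logit, probit, log-log and cloglog.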

The Generalized Linear Model - The canonical link (slide 7 / 29)
The canonical link is the function which transforms the mean to the canonical location parameter of the exponential dispersion family, i.e. it is the function for which $g(\mu) = \theta$. The canonical link functions for the most widely considered densities are:
Density        | Link: $\eta = g(\mu)$       | Name
Normal         | $\eta = \mu$                | identity
Poisson        | $\eta = \ln(\mu)$           | logarithm
Binomial       | $\eta = \ln[\mu/(1-\mu)]$   | logit
Gamma          | $\eta = 1/\mu$              | reciprocal
Inverse Gauss  | $\eta = 1/\mu^2$            | power ($k = -2$)
Table: Canonical link functions for some widely used densities.
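As an illustration of how a canonical link arises, the Poisson case can be worked out directly. The notation below (canonical parameter $\theta$, cumulant generator $\kappa(\theta)$ and normalising factor $c(y)$) is an assumed notation for the exponential dispersion family and is not spelled out on this slide.

```latex
f(y;\mu) = \frac{\mu^{y} e^{-\mu}}{y!}
         = \frac{1}{y!}\,\exp\{\, y \ln\mu - \mu \,\}
         = c(y)\,\exp\{\, \theta y - \kappa(\theta) \,\},
\qquad
\theta = \ln\mu, \quad \kappa(\theta) = e^{\theta}, \quad c(y) = \frac{1}{y!}.
```

The canonical parameter is therefore $\theta = \ln\mu$, so $g(\mu) = \ln(\mu)$, in agreement with the Poisson row of the table.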

The Generalized Linear Model - Specification of a generalized linear model (slide 8 / 29)
a) Distribution / Variance function: Specification of the distribution, or the variance function $V(\mu)$.
b) Link function: Specification of the link function $g(\cdot)$, which describes a function of the mean value which can be described linearly by the explanatory variables.
c) Linear predictor: Specification of the linear dependency $g(\mu_i) = \eta_i = (\boldsymbol{x}_i)^T \boldsymbol{\beta}$.
d) Precision (optional): If needed, the precision is formulated as known individual weights, $\lambda_i = w_i$, or as a common dispersion parameter, $\lambda = 1/\sigma^2$, or a combination $\lambda_i = w_i/\sigma^2$.

The Generalized Linear Model - Maximum likelihood estimation (slide 9 / 29)
Theorem (Estimation in generalized linear models)
Consider the generalized linear model as defined on slide 3 for the observations $Y_1, \dots, Y_n$, and assume that $Y_1, \dots, Y_n$ are mutually independent with densities that can be described by an exponential dispersion model with the variance function $V(\cdot)$, dispersion parameter $\sigma^2$, and optionally the weights $w_i$. Assume that the linear predictor is parameterized with $\boldsymbol{\beta}$ corresponding to the design matrix $\boldsymbol{X}$. Then the maximum likelihood estimate $\hat{\boldsymbol{\beta}}$ for $\boldsymbol{\beta}$ is found as the solution to
\[ [\boldsymbol{X}(\boldsymbol{\beta})]^T \, \boldsymbol{i}_\mu(\boldsymbol{\mu}) \, (\boldsymbol{y} - \boldsymbol{\mu}) = \boldsymbol{0}, \]
where $\boldsymbol{X}(\boldsymbol{\beta})$ denotes the local design matrix, $\boldsymbol{\mu} = \boldsymbol{\mu}(\boldsymbol{\beta})$, given by
\[ \mu_i(\boldsymbol{\beta}) = g^{-1}(\boldsymbol{x}_i^T \boldsymbol{\beta}), \]
denotes the fitted mean values corresponding to the parameters $\boldsymbol{\beta}$, and $\boldsymbol{i}_\mu(\boldsymbol{\mu})$ is the expected information with respect to $\boldsymbol{\mu}$.
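The estimating equation is usually solved iteratively. The following is a minimal sketch, not the authors' implementation, of iteratively reweighted least squares for one special case: a Poisson model with log link, unit weights and $\sigma^2 = 1$. In that case the equation reduces to $\boldsymbol{X}^T(\boldsymbol{y} - \boldsymbol{\mu}) = \boldsymbol{0}$. The data, the starting values `y + 0.1` and the helper names `solve` and `fit_poisson_log` are assumptions made for illustration.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_poisson_log(X, y, max_iter=50, tol=1e-10):
    """IRLS for a Poisson GLM with log link; returns (beta_hat, mu_hat)."""
    n, k = len(X), len(X[0])
    mu = [yi + 0.1 for yi in y]            # assumed starting values, kept positive
    eta = [math.log(m) for m in mu]
    beta = [0.0] * k
    for _ in range(max_iter):
        # Working response z and working weights W = diag(mu) for the log link.
        z = [eta[i] + (y[i] - mu[i]) / mu[i] for i in range(n)]
        XtWX = [[sum(X[i][a] * mu[i] * X[i][b] for i in range(n))
                 for b in range(k)] for a in range(k)]
        XtWz = [sum(X[i][a] * mu[i] * z[i] for i in range(n)) for a in range(k)]
        new_beta = solve(XtWX, XtWz)
        change = max(abs(nb - b) for nb, b in zip(new_beta, beta))
        beta = new_beta
        eta = [sum(X[i][j] * beta[j] for j in range(k)) for i in range(n)]
        mu = [math.exp(e) for e in eta]
        if change < tol:
            break
    return beta, mu

# Hypothetical count data with an intercept and one covariate.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[1.0, xi] for xi in x]
y = [1.0, 3.0, 4.0, 9.0, 20.0]

beta_hat, mu_hat = fit_poisson_log(X, y)
# Left-hand side of the estimating equation X^T (y - mu) at the solution.
score = [sum(X[i][j] * (y[i] - mu_hat[i]) for i in range(len(y))) for j in range(2)]
```

At convergence every component of `score` should be numerically zero. Because the model contains an intercept, this also means the fitted means sum to the observed total.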

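The Generalized Linear Model - Properties of the ML estimator (slide 10 / 29)
Theorem (Asymptotic distribution of the ML estimator)
Under the hypothesis $\boldsymbol{\eta} = \boldsymbol{X}\boldsymbol{\beta}$ we have asymptotically
\[ \frac{\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}}{\sqrt{\sigma^2}} \in \mathrm{N}_k(\boldsymbol{0}, \boldsymbol{\Sigma}), \]
where the dispersion matrix $\boldsymbol{\Sigma}$ for $\hat{\boldsymbol{\beta}}$ is
\[ \mathrm{D}[\hat{\boldsymbol{\beta}}] = \boldsymbol{\Sigma} = [\boldsymbol{X}^T \boldsymbol{W}(\boldsymbol{\beta}) \boldsymbol{X}]^{-1} \]
with
\[ \boldsymbol{W}(\boldsymbol{\beta}) = \operatorname{diag}\left\{ \frac{w_i}{[g'(\mu_i)]^2 \, V(\mu_i)} \right\}. \]
In the case of the canonical link, the weight matrix $\boldsymbol{W}(\boldsymbol{\beta})$ is
\[ \boldsymbol{W}(\boldsymbol{\beta}) = \operatorname{diag}\{ w_i V(\mu_i) \}. \]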
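The simplification of the weight matrix under the canonical link follows in one step. The derivation below assumes the standard property of exponential dispersion families that $\mathrm{d}\mu/\mathrm{d}\theta = V(\mu)$, which is not stated on this slide.

```latex
g(\mu) = \theta
\;\Longrightarrow\;
g'(\mu) = \frac{\mathrm{d}\theta}{\mathrm{d}\mu} = \frac{1}{V(\mu)},
\qquad\text{so}\qquad
\frac{w_i}{[g'(\mu_i)]^2 \, V(\mu_i)}
  = \frac{w_i \, V(\mu_i)^2}{V(\mu_i)}
  = w_i \, V(\mu_i).
```

For the Poisson model with log link, for example, $V(\mu) = \mu$ and the weights become $w_i \mu_i$.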

The Generalized Linear Model - Linear prediction for the generalized linear model (slide 11 / 29)
Definition (Linear prediction for the generalized linear model)
The linear prediction $\hat{\boldsymbol{\eta}}$ is defined as the values
\[ \hat{\boldsymbol{\eta}} = \boldsymbol{X}\hat{\boldsymbol{\beta}}, \]
where the linear prediction corresponding to the $i$th observation is
\[ \hat{\eta}_i = \sum_{j=1}^{k} x_{ij} \hat{\beta}_j = (\boldsymbol{x}_i)^T \hat{\boldsymbol{\beta}}. \]
The linear predictions $\hat{\boldsymbol{\eta}}$ are approximately normally distributed with
\[ \mathrm{D}[\hat{\boldsymbol{\eta}}] \approx \hat{\sigma}^2 \boldsymbol{X} \boldsymbol{\Sigma} \boldsymbol{X}^T, \]
where $\boldsymbol{\Sigma}$ is the dispersion matrix for $\hat{\boldsymbol{\beta}}$.

The Generalized Linear Model - Fitted values for the generalized linear model (slide 12 / 29)
Definition (Fitted values for the generalized linear model)
The fitted values are defined as the values
\[ \hat{\boldsymbol{\mu}} = \boldsymbol{\mu}(\boldsymbol{X}\hat{\boldsymbol{\beta}}), \]
where the $i$th value is given as
\[ \hat{\mu}_i = g^{-1}(\hat{\eta}_i), \]
with the fitted value $\hat{\eta}_i$ of the linear prediction. The fitted values $\hat{\boldsymbol{\mu}}$ are approximately normally distributed with
\[ \mathrm{D}[\hat{\boldsymbol{\mu}}] \approx \left[ \frac{\partial \boldsymbol{\mu}}{\partial \boldsymbol{\eta}} \right]^2 \hat{\sigma}^2 \boldsymbol{X} \boldsymbol{\Sigma} \boldsymbol{X}^T, \]
where $\boldsymbol{\Sigma}$ is the dispersion matrix for $\hat{\boldsymbol{\beta}}$.
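For a single observation, the approximate variances of the linear prediction and of the fitted value can be computed from $\boldsymbol{\Sigma}$ as sketched below. The example assumes a log link, so that $\partial\mu/\partial\eta = \mu$. The numerical values of `beta_hat` and `Sigma` are hypothetical, as are the function names.

```python
import math

def quad_form(x, S):
    """Return x^T S x for a vector x and a square matrix S (lists of floats)."""
    k = len(x)
    return sum(x[a] * S[a][b] * x[b] for a in range(k) for b in range(k))

def prediction_variances(x_i, beta_hat, Sigma, sigma2_hat=1.0):
    """Approximate Var[eta_hat_i] and Var[mu_hat_i] for a log link."""
    eta_i = sum(x * b for x, b in zip(x_i, beta_hat))
    mu_i = math.exp(eta_i)                  # g^{-1}(eta) for the log link
    var_eta = sigma2_hat * quad_form(x_i, Sigma)
    dmu_deta = mu_i                         # derivative of exp(eta) w.r.t. eta
    var_mu = dmu_deta ** 2 * var_eta
    return eta_i, mu_i, var_eta, var_mu

# Hypothetical estimate and dispersion matrix for a model with k = 2.
beta_hat = [0.0, 0.5]
Sigma = [[0.04, -0.01],
         [-0.01, 0.01]]

eta_i, mu_i, var_eta, var_mu = prediction_variances([1.0, 2.0], beta_hat, Sigma)
```

The quantity `var_eta` is the $i$th diagonal element of $\hat{\sigma}^2 \boldsymbol{X}\boldsymbol{\Sigma}\boldsymbol{X}^T$, and `var_mu` rescales it by the squared derivative of the inverse link.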

The Generalized Linear Model - Residual deviance (slide 13 / 29)
Definition (Residual deviance)
Consider the generalized linear model defined on slide 3. The residual deviance corresponding to this model is
\[ \mathrm{D}(\boldsymbol{y}; \boldsymbol{\mu}(\hat{\boldsymbol{\beta}})) = \sum_{i=1}^{n} w_i \, d(y_i; \hat{\mu}_i), \]
with $d(y_i; \hat{\mu}_i)$ denoting the unit deviance corresponding to the observation $y_i$ and the fitted value $\hat{\mu}_i$, and where $w_i$ denotes the weights (if present). If the model includes a dispersion parameter $\sigma^2$, the scaled residual deviance is
\[ \mathrm{D}^*(\boldsymbol{y}; \boldsymbol{\mu}(\hat{\boldsymbol{\beta}})) = \frac{\mathrm{D}(\boldsymbol{y}; \boldsymbol{\mu}(\hat{\boldsymbol{\beta}}))}{\sigma^2}. \]
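As a concrete instance of the definition, the sketch below computes the residual deviance for a Poisson model, whose unit deviance is $d(y; \mu) = 2[\,y \ln(y/\mu) - (y - \mu)\,]$. The Poisson choice, the function names and the example values are assumptions made for illustration. Other distributions require their own unit deviance.

```python
import math

def poisson_unit_deviance(y, mu):
    """Unit deviance d(y; mu) = 2 [ y ln(y/mu) - (y - mu) ], with y ln(y/mu) := 0 at y = 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def residual_deviance(y, mu_hat, weights=None):
    """Residual deviance D = sum_i w_i d(y_i; mu_hat_i)."""
    if weights is None:
        weights = [1.0] * len(y)  # unit weights when none are given
    return sum(w * poisson_unit_deviance(yi, mi)
               for w, yi, mi in zip(weights, y, mu_hat))

def scaled_residual_deviance(y, mu_hat, sigma2, weights=None):
    """Scaled residual deviance D* = D / sigma^2."""
    return residual_deviance(y, mu_hat, weights) / sigma2

# Hypothetical observations and fitted values.
y = [0.0, 2.0, 5.0]
mu_hat = [1.0, 2.0, 4.0]
D = residual_deviance(y, mu_hat)
D_scaled = scaled_residual_deviance(y, mu_hat, sigma2=2.0)
```

A perfect fit ($\hat{\mu}_i = y_i$ for all $i$) gives a residual deviance of zero.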

The Generalized Linear Model - Residuals (slide 14 / 29)
Residuals represent the difference between the data and the model. In the classical GLM the residuals are $r_i = y_i - \hat{\mu}_i$. These are called response residuals for GLMs. Since the variance of the response is not constant for most GLMs, we need some modification. We will look at:
- Deviance residuals
- Pearson residuals

The Generalized Linear Model - Residuals (slide 15 / 29)
Definition (Deviance residual)
Consider the generalized linear model for the observations $Y_1, \dots, Y_n$. The deviance residual for the $i$th observation is defined as
\[ r_i^D = r_D(y_i; \hat{\mu}_i) = \operatorname{sign}(y_i - \hat{\mu}_i) \sqrt{w_i \, d(y_i, \hat{\mu}_i)}, \]
where $\operatorname{sign}(x)$ denotes the sign function, $\operatorname{sign}(x) = 1$ for $x > 0$ and $\operatorname{sign}(x) = -1$ for $x < 0$, with $w_i$ denoting the weight (if relevant), $d(y; \mu)$ denoting the unit deviance, and $\hat{\mu}_i$ denoting the fitted value corresponding to the $i$th observation.
Assessment of the deviance residuals is in good agreement with the likelihood approach, as the deviance residuals simply express differences in log-likelihood.
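The definition can be illustrated with a short sketch for the Poisson case, using the unit deviance $d(y; \mu) = 2[\,y \ln(y/\mu) - (y - \mu)\,]$. The Poisson choice, the function names and the example values are assumptions for illustration. The sign is set to zero when $y_i = \hat{\mu}_i$, a case the definition leaves open but where the unit deviance is zero anyway.

```python
import math

def poisson_unit_deviance(y, mu):
    """Unit deviance d(y; mu) = 2 [ y ln(y/mu) - (y - mu) ], with y ln(y/mu) := 0 at y = 0."""
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def deviance_residuals(y, mu_hat, weights=None):
    """Deviance residuals r_i = sign(y_i - mu_hat_i) * sqrt(w_i * d(y_i; mu_hat_i))."""
    if weights is None:
        weights = [1.0] * len(y)
    residuals = []
    for w, yi, mi in zip(weights, y, mu_hat):
        d = max(poisson_unit_deviance(yi, mi), 0.0)  # guard against tiny negative rounding
        sign = 1.0 if yi > mi else (-1.0 if yi < mi else 0.0)
        residuals.append(sign * math.sqrt(w * d))
    return residuals

# Hypothetical observations and fitted values.
y = [0.0, 2.0, 5.0]
mu_hat = [1.0, 2.0, 4.0]
r_dev = deviance_residuals(y, mu_hat)
sum_of_squares = sum(r * r for r in r_dev)
```

By construction, the squared deviance residuals sum to the residual deviance $\sum_i w_i \, d(y_i; \hat{\mu}_i)$.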
