Outline: Introduction, Binary model, Example, Fit, Test
Applied Statistics. Lecturer: Serena Arima
Introduction
Until now we have studied:
1. the linear regression model;
2. the Analysis of Variance model (ANOVA);
3. the Analysis of Covariance model (ANCOVA).
In practical applications, however, one often has to cope with phenomena of a discrete or mixed discrete-continuous nature.
Introduction
Suppose we want to explain whether a family owns a car or not, with family income as the sole explanatory variable. We observe n families, and the response variable is defined as
yi = 1 if family i owns a car,
yi = 0 if family i does not own a car,
while xi1 is the income of family i.
Introduction
We estimate the relationship between y and x1 using the linear model
yi = β0 + β1 xi1 + εi = xi'β + εi.
It seems reasonable to make the standard assumption that E[εi|xi] = 0, so that E[yi|xi] = xi'β. This implies that
E[yi|xi] = 1 · Pr(yi = 1|xi) + 0 · Pr(yi = 0|xi) = Pr(yi = 1|xi) = xi'β.
Introduction
We can use the OLS method to estimate the model, obtaining the fitted values
ŷi = β̂0 + β̂1 xi1.
[Figure: "Regression model" scatter plot of car ownership (Family Car) against family income, with the fitted OLS line.]
Introduction
Thus, the linear model implies that xi'β is a probability and should therefore lie between 0 and 1. This is only possible if the xi values are bounded and certain restrictions on β are satisfied, which is usually hard to achieve in practice. In addition, because yi has only two possible outcomes (0 and 1), the error term has only two possible outcomes as well.
Introduction
In particular, the distribution of the error term εi is
P(εi = −xi'β) = P(yi = 0|xi) = 1 − xi'β
P(εi = 1 − xi'β) = P(yi = 1|xi) = xi'β.
Hence, the variance of the error term is
V(εi|xi) = xi'β (1 − xi'β).
The error term is therefore not Normal and it is also heteroskedastic! Moreover, its variance depends upon the model parameters β.
Binary choice model
To overcome these problems, there exists a class of binary choice models designed to model the choice between two discrete alternatives. In general, we have
P(yi = 1|xi) = G(xi, β)
for some function G(·) that takes values in [0, 1]. Usually, one restricts attention to functions of the form
G(xi, β) = F(xi'β)
where F is some distribution function.
Binary choice model
A common choice is the standard Normal distribution function
F(w) = Φ(w) = ∫ from −∞ to w of (1/√(2π)) exp(−t²/2) dt,
leading to the so-called probit model, in which
P(yi = 1|xi) = Φ(xi'β) = Φ(β0 + β1 xi1).
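As a minimal sketch, the probit probability above can be evaluated directly with the standard Normal distribution function; the coefficient values below are hypothetical, chosen only for illustration.

```python
from scipy.stats import norm

def probit_prob(x, beta0, beta1):
    """P(y=1|x) = Phi(beta0 + beta1*x) under the probit model."""
    return norm.cdf(beta0 + beta1 * x)

# Hypothetical coefficients: at x = 0.5 the index is beta0 + beta1*0.5 = 0,
# so the predicted probability is Phi(0) = 0.5.
p = probit_prob(0.5, -1.0, 2.0)
```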
Binary choice model
Another choice is the standard logistic distribution function
F(w) = L(w) = e^w / (1 + e^w),
leading to the so-called logit model, in which
P(yi = 1|xi) = exp(xi'β) / (1 + exp(xi'β)) = exp(β0 + β1 xi1) / (1 + exp(β0 + β1 xi1)).
Binary choice model
This model can also be written as
log[ P(yi = 1|xi) / (1 − P(yi = 1|xi)) ] = xi'β.
The left-hand side is referred to as the log odds ratio. An odds ratio of 3 means that the odds of yi = 1 are 3 times those of yi = 0. Using this equality, the β coefficients can be interpreted as describing the effect upon the odds ratio: for example, if βk = 0.1, a unit increase of xik increases the odds ratio by about 10%.
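The "about 10%" statement above can be checked numerically: under the logit model a unit increase in xik multiplies the odds by exp(βk), and for small βk this multiplier is approximately 1 + βk.

```python
import numpy as np

# A unit increase in x_k multiplies the odds by exp(beta_k).
# For small beta_k, exp(beta_k) - 1 is approximately beta_k,
# e.g. beta_k = 0.1 gives roughly a 10% increase in the odds.
beta_k = 0.1
odds_multiplier = np.exp(beta_k)          # ~1.105
pct_change = 100 * (odds_multiplier - 1)  # ~10.5%
```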
Binary choice model
Another common choice is the uniform distribution over the interval [0, 1], with distribution function
F(w) = 0 for w < 0, F(w) = w for 0 ≤ w ≤ 1, F(w) = 1 for w > 1.
This results in the so-called linear probability model, defined as
Pr(yi = 1|xi) = 0 if xi'β < 0;
Pr(yi = 1|xi) = xi'β if 0 ≤ xi'β ≤ 1;
Pr(yi = 1|xi) = 1 if xi'β > 1.
Binary choice model: interpretation
A main difficulty with these models is the interpretation of the parameters: apart from their signs, the coefficients in these binary choice models are interpreted through the marginal effect of changes in the explanatory variables. For a continuous explanatory variable xik, the marginal effect is defined as the partial derivative of the probability that yi equals one.
Binary choice model: interpretation
For the probit model the marginal effect is
∂Φ(xi'β)/∂xik = φ(xi'β) βk,
where φ denotes the standard Normal density function, that is
φ(w) = (1/√(2π)) exp(−w²/2).
Binary choice model: interpretation
For the logit model the marginal effect is
∂L(xi'β)/∂xik = [ e^(xi'β) / (1 + e^(xi'β))² ] βk.
For the linear probability model the marginal effect is
∂(xi'β)/∂xik = βk (or 0, outside the interval where the probability equals xi'β).
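The two marginal-effect formulas above can be sketched as small functions; note that both effects are largest when the index xi'β is 0 (φ(0) = 1/√(2π) ≈ 0.399 for the probit, 1/4 for the logit).

```python
import numpy as np
from scipy.stats import norm

def probit_marginal_effect(xb, beta_k):
    """d Phi(x'b)/d x_k = phi(x'b) * beta_k."""
    return norm.pdf(xb) * beta_k

def logit_marginal_effect(xb, beta_k):
    """d L(x'b)/d x_k = e^(x'b) / (1 + e^(x'b))^2 * beta_k."""
    return np.exp(xb) / (1 + np.exp(xb))**2 * beta_k
```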
Example 1: probit model
Suppose we have n = 2380 individuals and the following variables, recorded in 1920-1940:
Loan: binary variable, 1 if the bank loan is rejected, 0 if it is allowed;
Income: monthly income of each individual;
Race: race of each individual (0 = white, 1 = black) (R);
LoanPayment: ratio between income and loan payment (LP).
Example 1: probit model
We would like to study whether the rejection of a loan is related to other variables, such as income, race and the income/payment ratio. The response variable is binary and the explanatory variables are both continuous and discrete. Let us try to interpret different models!
Example 0: linear model
We start with a simple linear model. The estimated model is
P(loanRejection = 1|LP) = −0.07991 + 0.60353 LPi.
Increasing the income/loan ratio by 0.1, the probability that the loan is rejected increases by about 0.06.
What is the probability that the loan is rejected when the income/loan ratio is 0.5? The predicted probability is −0.07991 + 0.60353 · 0.5 = 0.22.
What is the probability that the loan is rejected when the income/loan ratio is 0.01? The predicted probability is −0.07991 + 0.60353 · 0.01 = −0.073 (!!!), which is not a valid probability.
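Using the estimated coefficients from the slides, a one-line sketch reproduces both predictions and shows the out-of-range problem of the linear probability model.

```python
# Fitted linear probability model from the slides:
# P(reject = 1 | LP) = -0.07991 + 0.60353 * LP
def lpm_prob(lp):
    return -0.07991 + 0.60353 * lp

p_half = lpm_prob(0.5)   # ~0.22, a sensible probability
p_low = lpm_prob(0.01)   # ~-0.074, an impossible negative "probability"
```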
Example 0: linear model
[Figure: scatter plot of Loan against the Income/Loan ratio, with the fitted linear regression line.]
Example 1: probit model
Model 1: P(loanRejectioni = 1|LP) = Φ(β0 + β1 LPi). The estimated model is
P(loanRejectioni = 1|LP) = Φ(−2.1941 + 2.9679 LPi).
How do we interpret the model?
Example 1: probit model
P(loanRejectioni = 1|LP) = Φ(−2.1941 + 2.9679 LPi)
Step 0: Interpret the sign: increasing the loan-payment/income ratio increases the probability that the bank rejects the loan (β1 = 2.9679 > 0).
Step 1: What is the probability that the loan is rejected when the loan-payment ratio is 0.3?
P(loanRejectioni = 1|LP = 0.3) = Φ(−2.1941 + 2.9679 · 0.3) = Φ(−1.304) ≈ 0.096
Example 1: probit model
Step 2: What is the probability that the loan is rejected when the loan-payment ratio is 0.5?
P(loanRejectioni = 1|LP = 0.5) = Φ(−2.1941 + 2.9679 · 0.5) = 0.2388
Step 3: What is the probability that the loan is rejected when the loan-payment ratio is 0.8, that is, when almost all income is used to pay the loan?
P(loanRejectioni = 1|LP = 0.8) = Φ(−2.1941 + 2.9679 · 0.8) = 0.571
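The probit predictions in these steps can be reproduced with the standard Normal cdf and the estimated coefficients from the slides:

```python
from scipy.stats import norm

# Predicted rejection probabilities from the estimated probit model
# P(reject = 1 | LP) = Phi(-2.1941 + 2.9679 * LP)
def p_reject(lp):
    return norm.cdf(-2.1941 + 2.9679 * lp)

p03 = p_reject(0.3)   # ~0.096
p05 = p_reject(0.5)   # ~0.239
p08 = p_reject(0.8)   # ~0.571
```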
Example 1: probit model
Model 2: let us add the effect of race. The estimated model is
P(loanRejectioni = 1|LP, R) = Φ(−2.25879 + 2.74178 LPi + 0.70816 Ri).
How do we interpret the model?
Example 1: probit model
P(loanRejectioni = 1|LP, R) = Φ(−2.25879 + 2.74178 LPi + 0.70816 Ri)
Step 0: Interpret the signs: increasing the loan-payment/income ratio increases the probability that the bank rejects the loan (β1 = 2.7418 > 0), and the probability also increases if the individual is black (β2 = 0.7082 > 0).
Step 1: For a black applicant with loan-payment ratio equal to 0.3, the probability that the bank rejects the loan is
P(loanRejectioni = 1|LP = 0.3, R = 1) = Φ(−2.26 + 2.74 · 0.3 + 0.71) ≈ 0.233,
and for a white applicant with the same ratio it is
P(loanRejectioni = 1|LP = 0.3, R = 0) = Φ(−2.26 + 2.74 · 0.3) = 0.075.
Example 1: logit model
Model 3: logit(P(loanRejectioni = 1|LP)) = β0 + β1 LPi. The estimated model is
P(loanRejectioni = 1|LP) = exp(−4.0284 + 5.8845 LPi) / (1 + exp(−4.0284 + 5.8845 LPi)).
Step 0: Interpret the sign: increasing the loan-payment/income ratio increases the probability that the bank rejects the loan (β1 = 5.8845 > 0).
Example 1: logit model
Step 1: What is the probability that the loan is rejected when the loan-payment ratio is 0.3?
P(loanRejectioni = 1|LP = 0.3) = exp(−4.0284 + 5.8845 · 0.3) / (1 + exp(−4.0284 + 5.8845 · 0.3)) ≈ 0.094
Step 2: What is the probability that the loan is rejected when the loan-payment ratio is 0.8?
P(loanRejectioni = 1|LP = 0.8) = exp(−4.0284 + 5.8845 · 0.8) / (1 + exp(−4.0284 + 5.8845 · 0.8)) ≈ 0.664
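The logit predictions in these steps can likewise be reproduced from the estimated coefficients on the slides:

```python
import numpy as np

# Predicted rejection probabilities from the estimated logit model
# P(reject = 1 | LP) = exp(xb) / (1 + exp(xb)), xb = -4.0284 + 5.8845 * LP
def p_reject_logit(lp):
    xb = -4.0284 + 5.8845 * lp
    return np.exp(xb) / (1 + np.exp(xb))

p03 = p_reject_logit(0.3)   # ~0.094
p08 = p_reject_logit(0.8)   # ~0.664
```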
Probit and Logit model
[Figure: "Logit and Probit" fitted curves of Loan against the loan payment/income ratio, comparing the probit model and the logit model.]
Model estimation
The likelihood function for the binary response model is defined as
L(β) = ∏ from i=1 to n of P(yi = 1|xi, β)^(yi) P(yi = 0|xi, β)^(1−yi) = ∏ from i=1 to n of F(xi'β)^(yi) (1 − F(xi'β))^(1−yi).
Model estimation
Hence the loglikelihood function is
l(β) = Σ from i=1 to n of yi log F(xi'β) + Σ from i=1 to n of (1 − yi) log(1 − F(xi'β)),
and setting its first derivative to zero gives the likelihood equations
dl(β)/dβ = Σ from i=1 to n of [ (yi − F(xi'β)) / (F(xi'β)(1 − F(xi'β))) ] f(xi'β) xi = 0.
Model estimation
The likelihood cannot be maximized analytically; we need numerical iterative methods, such as:
the Newton-Raphson method;
the Fisher scoring method.
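As a minimal sketch of the Newton-Raphson method for the logit loglikelihood, the loop below iterates β ← β + H⁻¹·score on simulated data (the coefficients and sample size are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta_true = np.array([-0.5, 1.0])                      # hypothetical true values
p = 1 / (1 + np.exp(-X @ beta_true))
y = rng.binomial(1, p)

beta = np.zeros(2)
for _ in range(25):                       # Newton-Raphson iterations
    mu = 1 / (1 + np.exp(-X @ beta))      # F(x'b) for the logit link
    grad = X.T @ (y - mu)                 # score vector dl/db
    W = mu * (1 - mu)                     # logit variance weights
    H = X.T @ (X * W[:, None])            # information matrix (negative Hessian)
    step = np.linalg.solve(H, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:      # stop when the update is negligible
        break
# beta now solves the likelihood equations and should be close to beta_true
```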
Goodness of fit
When the response variable is binary, the accuracy of the model can be judged either in terms of the fit between the estimated probabilities and the observed response frequencies, or in terms of the model's ability to forecast the observed responses. Contrary to the linear regression model, there is no single measure of goodness of fit in binary choice models, and a variety of measures exists.
Goodness of fit
A first goodness-of-fit measure is defined as
pseudo-R² = 1 − 1 / (1 + 2(log L1 − log L0)/n),
where log L1 denotes the maximum loglikelihood value of the model of interest and log L0 denotes the maximum loglikelihood value of the model with only an intercept. Note that pseudo-R² ∈ [0, 1].
Goodness of fit
An alternative measure was suggested by McFadden (1974):
McFadden R² = 1 − log L1 / log L0,
sometimes referred to as the likelihood ratio index. Because the loglikelihood is a sum of log probabilities, it follows that log L0 < log L1 < 0, from which it is straightforward to show that McFadden R² ∈ [0, 1] as well.
Goodness of fit
Note that to compute log L0 it is not necessary to estimate a probit or logit model with an intercept term only. Indeed, the ML estimate of the success probability is p̂ = n1/n, where n1 = Σ from i=1 to n of yi, and
log L0 = n1 log(n1/n) + n0 log(n0/n).
The value of log L1, on the other hand, is given by a computer package.
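The closed-form log L0 and the two fit measures above translate directly into code; the likelihood values passed to `fit_measures` below are hypothetical, for illustration only.

```python
import numpy as np

def null_loglik(y):
    """logL0 for the intercept-only model: n1*log(n1/n) + n0*log(n0/n)."""
    y = np.asarray(y)
    n = len(y)
    n1 = int(np.sum(y))
    n0 = n - n1
    return n1 * np.log(n1 / n) + n0 * np.log(n0 / n)

def fit_measures(logL1, logL0, n):
    """pseudo-R^2 and McFadden R^2 from the two maximized loglikelihoods."""
    pseudo_r2 = 1 - 1 / (1 + 2 * (logL1 - logL0) / n)
    mcfadden_r2 = 1 - logL1 / logL0
    return pseudo_r2, mcfadden_r2
```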
Goodness of fit
An alternative way to evaluate the goodness of fit is to compare correct and incorrect predictions. The predicted values are
ŷi = 1 if xi'β̂ > 0,
ŷi = 0 if xi'β̂ ≤ 0.
Goodness of fit
We can build the following cross-table of observed (yi) and predicted (ŷi) values:

         ŷi = 0   ŷi = 1   total
yi = 0    n00      n01      N0
yi = 1    n10      n11      N1

A hit measure sums the proportions of correct predictions within each outcome class,
HM = n00/N0 + n11/N1.
Values of HM > 1 indicate a good model, since a prediction rule unrelated to yi would give HM close to 1.
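The cross-table and the HM measure above can be sketched as follows (the toy vectors in the comment are hypothetical):

```python
import numpy as np

def hit_measure(y, y_hat):
    """HM = n00/N0 + n11/N1: within-class proportions of correct predictions."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    n00 = np.sum((y == 0) & (y_hat == 0))
    n11 = np.sum((y == 1) & (y_hat == 1))
    N0 = np.sum(y == 0)
    N1 = np.sum(y == 1)
    return n00 / N0 + n11 / N1

# e.g. y = [0, 0, 1, 1], y_hat = [0, 1, 1, 1] gives 1/2 + 2/2 = 1.5 > 1
```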
Specification tests
Although the MLEs have the property of being consistent, there is one important condition for this to hold: the likelihood function has to be correctly specified. Consider the generic model P(yi = 1|xi) = F(xi'β), and suppose we want to test
H0: βk = 0 against H1: βk ≠ 0.
The test statistic is defined as
z = β̂k / SE(β̂k) → N(0, 1) (asymptotic approximation).
Specification tests
On the other hand, suppose we would like to test H0: β1 = β2 = β3 = 0. We compare the maximized loglikelihood of the full model, l1(β̂), with the maximized loglikelihood of the reduced model (with β1 = β2 = β3 = 0), l0(β̂). We use the likelihood ratio test
T = 2(l1 − l0) ∼ χ² with p − k degrees of freedom,
where p is the number of parameters in the full model and k the number of parameters in the reduced model.
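The likelihood ratio test above is a one-liner given the two maximized loglikelihoods; the values used below are hypothetical, for illustration only.

```python
from scipy.stats import chi2

def lr_test(l1, l0, p, k):
    """LR test of the reduced (k-parameter) model against the full
    (p-parameter) model: T = 2*(l1 - l0), chi-square with p - k df."""
    T = 2 * (l1 - l0)
    pvalue = chi2.sf(T, df=p - k)
    return T, pvalue

# Hypothetical maximized loglikelihoods: T = 2*(130.8 - 120.3) = 21.0, df = 3
T, pv = lr_test(l1=-120.3, l0=-130.8, p=4, k=1)
```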
Binary choice model: an underlying latent model
It is possible to derive a binary choice model from underlying behavioural assumptions. This leads to a latent variable representation of the model. Consider the decision of a married woman to have a paid job or not. The utility difference between having a paid job and not having one depends upon the wage, but also on other personal characteristics such as age, education, and whether there are young children in the family.
Binary choice model: an underlying latent model
Thus, for each person i we can write the utility difference between having a job and not having one as a function of observed characteristics xi and unobserved characteristics εi. The utility difference y*i can be defined as
y*i = xi'β + εi.
Because y*i is unobserved, it is referred to as a latent variable.
Binary choice model: an underlying latent model
We assume that a woman chooses to work if the utility difference y*i exceeds a certain threshold level γ, that is,
yi = 1 if y*i > γ.
In the binary choice model, typically γ = 0. Hence,
P(yi = 1) = P(y*i > 0) = P(xi'β + εi > 0) = P(−εi ≤ xi'β) = F(xi'β).
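The latent-variable derivation can be checked by simulation: with standard Normal errors, thresholding y* at 0 reproduces probit probabilities. The coefficients below are hypothetical, for illustration only.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
beta0, beta1 = 0.2, 0.8            # hypothetical coefficients
eps = rng.normal(size=n)           # N(0,1) errors give the probit model

y_star = beta0 + beta1 * x + eps   # latent utility difference
y = (y_star > 0).astype(int)       # observed choice, threshold gamma = 0

# Empirical P(y = 1 | x near 0) should approach Phi(beta0)
mask = np.abs(x) < 0.05
empirical = y[mask].mean()
theoretical = norm.cdf(beta0)
```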