

slide-1
SLIDE 1

BS2247 Introduction to Econometrics Lecture 4: The simple regression model

OLS Unbiasedness, OLS Variances, Units of measurement, Nonlinearities

Dr. Kai Sun

Aston Business School

1 / 26

slide-2
SLIDE 2

Unbiasedness of OLS

SLR for “Simple Linear Regression”

◮ Assumption SLR.1: The population model is linear in parameters, i.e., $y = \beta_0 + \beta_1 x + u$

◮ Assumption SLR.2: Use a random sample of size $n$, $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$, so $y_i = \beta_0 + \beta_1 x_i + u_i$

◮ Assumption SLR.3: There is sample variation in $x$, i.e., $\sum_i (x_i - \bar{x})^2 > 0$ (recall the formula for $\hat{\beta}_1$)

◮ Assumption SLR.4: Zero conditional mean, i.e., $E(u_i \mid x_i) = 0$

2 / 26

slide-6
SLIDE 6

Unbiasedness of $\hat{\beta}_1$

To think about unbiasedness, we need to rewrite our estimators, the $\hat{\beta}$'s, in terms of the population parameters, the $\beta$'s.

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} = \frac{\sum_i (x_i - \bar{x}) y_i}{\sum_i (x_i - \bar{x})^2}$$
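The second equality in the formula for the slope estimator drops $\bar{y}$ from the numerator. A short derivation, using only the definition of $\bar{x}$, sketches why this is allowed:

```latex
\begin{aligned}
\sum_i (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_i (x_i - \bar{x})\, y_i - \bar{y} \sum_i (x_i - \bar{x}) \\
  &= \sum_i (x_i - \bar{x})\, y_i - \bar{y}\,(n\bar{x} - n\bar{x}) \\
  &= \sum_i (x_i - \bar{x})\, y_i
\end{aligned}
```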

3 / 26

slide-7
SLIDE 7

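Unbiasedness of $\hat{\beta}_1$

Plug in $y_i = \beta_0 + \beta_1 x_i + u_i$ and define $SST_x = \sum_i (x_i - \bar{x})^2$:

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{SST_x} = 0 + \beta_1 + \frac{\sum_i u_i (x_i - \bar{x})}{SST_x}$$

using the facts that $\sum_i (x_i - \bar{x}) = 0$ and $\sum_i x_i (x_i - \bar{x}) = \sum_i (x_i - \bar{x})^2 = SST_x$.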
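The decomposition of the slope estimator into the true slope plus a sampling-error term can be checked numerically. The sketch below assumes made-up population parameters (`beta0`, `beta1`) and simulated errors `us`, which are observable only because the data are generated artificially; none of these values come from the lecture.

```python
import random

def slope(xs, ys):
    """OLS slope: sum((x - xbar) * y) / SST_x."""
    xbar = sum(xs) / len(xs)
    sst_x = sum((x - xbar) ** 2 for x in xs)
    return sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x

rng = random.Random(0)                 # fixed seed so the run is reproducible
beta0, beta1 = 1.0, 2.0                # hypothetical population parameters
xs = [rng.uniform(0, 10) for _ in range(30)]
us = [rng.gauss(0, 1) for _ in xs]     # errors, known here only because we simulate them
ys = [beta0 + beta1 * x + u for x, u in zip(xs, us)]

xbar = sum(xs) / len(xs)
sst_x = sum((x - xbar) ** 2 for x in xs)

lhs = slope(xs, ys)                                                 # estimator computed from (x, y)
rhs = beta1 + sum(u * (x - xbar) for x, u in zip(xs, us)) / sst_x   # beta1 + sampling error

print(lhs, rhs)   # the two numbers agree up to floating-point rounding
```

The two expressions coincide in every sample, not just on average; unbiasedness is a separate statement about the expectation of the second term.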

4 / 26

slide-8
SLIDE 8

Unbiasedness of $\hat{\beta}_1$

◮ Take expectations (conditional on the $x_i$) of both sides:

$$E(\hat{\beta}_1 \mid x_i) = \beta_1 + \frac{1}{SST_x} \sum_i E(u_i \mid x_i)(x_i - \bar{x}) = \beta_1$$

(using Assumption SLR.4 that $E(u_i \mid x_i) = 0$)

◮ Finally, $E(\hat{\beta}_1) = E(E(\hat{\beta}_1 \mid x_i)) = E(\beta_1) = \beta_1$ (first equality by the law of iterated expectations).

◮ The fact that $E(\hat{\beta}_1) = \beta_1$ means that $\hat{\beta}_1$ is unbiased for $\beta_1$.
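Unbiasedness is a statement about the average of the estimator over repeated samples, which a small Monte Carlo experiment can illustrate. This is a sketch under assumed values: the true parameters, the error standard deviation, the uniform distribution of $x$, the sample size and the number of replications are all made up for illustration.

```python
import random

def ols(xs, ys):
    """Return (b0, b1) from a simple regression of ys on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sst_x = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x
    return ybar - b1 * xbar, b1

rng = random.Random(42)                 # fixed seed for reproducibility
beta0, beta1, sigma = 1.0, 2.0, 1.0     # hypothetical true values
n, reps = 50, 2000                      # sample size and number of simulated samples

slopes = []
for _ in range(reps):
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]   # errors have mean zero given x
    slopes.append(ols(xs, ys)[1])

mean_slope = sum(slopes) / reps
print(mean_slope)   # close to beta1 = 2.0, although individual estimates vary
```

Any single estimate can be above or below the true slope; only the average across samples settles near it.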

5 / 26

slide-11
SLIDE 11

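Unbiasedness of $\hat{\beta}_0$

Recall that

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = (\beta_0 + \beta_1 \bar{x} + \bar{u}) - \hat{\beta}_1 \bar{x} = \beta_0 + (\beta_1 - \hat{\beta}_1)\bar{x} + \bar{u}$$

◮ Take expectations of both sides (treating the $x_i$ as given):

$$E(\hat{\beta}_0) = \beta_0 + \bar{x} E(\beta_1 - \hat{\beta}_1) + E(\bar{u}) = \beta_0$$

using the facts that $E(\hat{\beta}_1) = \beta_1$ and $E(\bar{u}) = E\left(\frac{1}{n} \sum_i u_i\right) = \frac{1}{n} \sum_i E(u_i) = 0$.

◮ The fact that $E(\hat{\beta}_0) = \beta_0$ means that $\hat{\beta}_0$ is unbiased for $\beta_0$.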

6 / 26

slide-14
SLIDE 14

Variance of OLS

◮ Unbiasedness means that the sampling distribution of our estimate is centered around the true parameter.

◮ We also want to think about how spread out this distribution is.

◮ It is much easier to think about this variance under an additional assumption:

Assumption SLR.5: $Var(u \mid x) = \sigma^2$ (Homoskedasticity: the conditional variance of $u$ is a constant)

7 / 26

slide-17
SLIDE 17

$Var(u \mid x) = E(u^2 \mid x) - [E(u \mid x)]^2$. Since $E(u \mid x) = 0$, $Var(u \mid x) = E(u^2 \mid x) = \sigma^2$.

◮ By the law of iterated expectations, $E(u^2) = E(E(u^2 \mid x)) = E(\sigma^2) = \sigma^2$. $Var(u) = E(u^2) = \sigma^2$ means that the unconditional variance of $u$ is a constant, too; $\sigma^2$ is also called the error variance.

◮ If we take the conditional variance of both sides of $y = \beta_0 + \beta_1 x + u$, then $Var(y \mid x) = Var(\beta_0 + \beta_1 x \mid x) + Var(u \mid x) = Var(u \mid x) = \sigma^2$. Here $Var(\beta_0 + \beta_1 x \mid x) = 0$ because, when we condition on $x$, we can view $(\beta_0 + \beta_1 x)$ as a constant, whose variance is 0.

8 / 26

slide-21
SLIDE 21

Homoskedastic Case

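[Figure: conditional densities $f(y \mid x)$ drawn at $x_1$ and $x_2$ along the regression line $E(y \mid x) = \beta_0 + \beta_1 x$; vertical axis $y$.]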

9 / 26

slide-22
SLIDE 22

Heteroskedastic Case

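[Figure: conditional densities $f(y \mid x)$ drawn at $x_1$, $x_2$ and $x_3$ along the regression line $E(y \mid x) = \beta_0 + \beta_1 x$; horizontal axis $x$.]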

10 / 26

slide-23
SLIDE 23

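Variance of $\hat{\beta}_1$

Recall that $\hat{\beta}_1 = \beta_1 + \frac{\sum_i u_i (x_i - \bar{x})}{SST_x}$. Then

$$
\begin{aligned}
Var(\hat{\beta}_1 \mid x_i) &= \frac{1}{SST_x^2} Var\Big(\sum_i (x_i - \bar{x}) u_i \,\Big|\, x_i\Big) \\
&= \frac{1}{SST_x^2} \sum_i Var\big((x_i - \bar{x}) u_i \mid x_i\big) \\
&= \frac{1}{SST_x^2} \sum_i (x_i - \bar{x})^2 Var(u_i \mid x_i) \\
&= \frac{\sigma^2}{SST_x^2} \sum_i (x_i - \bar{x})^2 \\
&= \frac{\sigma^2}{SST_x^2} SST_x = \frac{\sigma^2}{SST_x}
\end{aligned}
$$

The second equality uses random sampling (SLR.2), under which the $u_i$ are uncorrelated across observations; the fourth uses homoskedasticity (SLR.5).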
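The result $Var(\hat{\beta}_1 \mid x_i) = \sigma^2 / SST_x$ can be compared with the spread of simulated slope estimates when the $x_i$ are held fixed and only the errors are redrawn. The following is a minimal sketch; the parameter values, the grid of $x$ values and the number of replications are assumptions chosen for illustration.

```python
import random

rng = random.Random(7)                       # fixed seed for reproducibility
beta0, beta1, sigma = 1.0, 2.0, 3.0          # hypothetical true values
xs = [float(i) for i in range(1, 21)]        # x held fixed across replications
n = len(xs)
xbar = sum(xs) / n
sst_x = sum((x - xbar) ** 2 for x in xs)

def draw_slope():
    """Draw new homoskedastic errors, rebuild y, and return the OLS slope."""
    ys = [beta0 + beta1 * x + rng.gauss(0, sigma) for x in xs]
    return sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x

reps = 5000
slopes = [draw_slope() for _ in range(reps)]
mean_slope = sum(slopes) / reps
empirical_var = sum((b - mean_slope) ** 2 for b in slopes) / (reps - 1)
theoretical_var = sigma ** 2 / sst_x         # the formula derived above

print(empirical_var, theoretical_var)        # the two should be close, not identical
```

Holding `xs` fixed mirrors the conditioning on $x_i$ in the derivation; redrawing $x$ in each replication would instead estimate the unconditional variance.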

11 / 26

slide-24
SLIDE 24

Variance of $\hat{\beta}_1$

What happens to $Var(\hat{\beta}_1 \mid x_i)$ if (1) $\sigma^2$ increases (2) $SST_x$ increases (3) $n$ increases?
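One way to explore these three questions is to evaluate $\sigma^2 / SST_x$ for a few small, made-up sets of $x$ values. The helper `var_beta1` and the numbers below are illustrative assumptions, not part of the lecture.

```python
def var_beta1(sigma2, xs):
    """Conditional variance of the OLS slope: sigma^2 / SST_x."""
    xbar = sum(xs) / len(xs)
    sst_x = sum((x - xbar) ** 2 for x in xs)
    return sigma2 / sst_x

base = var_beta1(1.0, [1, 2, 3, 4, 5])            # SST_x = 10
more_noise = var_beta1(4.0, [1, 2, 3, 4, 5])      # (1) larger error variance
more_spread = var_beta1(1.0, [2, 4, 6, 8, 10])    # (2) same n, SST_x = 40
more_data = var_beta1(1.0, list(range(1, 11)))    # (3) n = 10, SST_x = 82.5

print(base, more_noise, more_spread, more_data)
```

In this sketch a larger $\sigma^2$ raises the variance, while more spread in $x$ lowers it. A larger $n$ lowers it here because the extra observations add to $SST_x$, which cannot fall when observations are added.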

12 / 26

slide-25
SLIDE 25

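Variance of $\hat{\beta}_0$

Recall that $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. Then

$$
\begin{aligned}
Var(\hat{\beta}_0 \mid x_i) &= Var(\bar{y} \mid x_i) + Var(\hat{\beta}_1 \bar{x} \mid x_i) - 2 Cov(\bar{y}, \hat{\beta}_1 \bar{x} \mid x_i) \\
&= Var(\bar{y} \mid x_i) + (\bar{x})^2 Var(\hat{\beta}_1 \mid x_i) - 2 \bar{x}\, Cov(\bar{y}, \hat{\beta}_1 \mid x_i) \\
&= \frac{\sigma^2}{n} + (\bar{x})^2 \frac{\sigma^2}{SST_x}
\end{aligned}
$$

The last step uses $Cov(\bar{y}, \hat{\beta}_1 \mid x_i) = 0$, which holds because $Cov\big(\bar{u}, \sum_i (x_i - \bar{x}) u_i \mid x_i\big) = \frac{\sigma^2}{n} \sum_i (x_i - \bar{x}) = 0$.

Note that $Var(\bar{y} \mid x_i) = Var\big(\frac{1}{n} \sum_i y_i \mid x_i\big) = \frac{1}{n^2} \sum_i Var(y_i \mid x_i) = \frac{1}{n^2} \sum_i \sigma^2 = \frac{1}{n^2} \cdot n \sigma^2 = \frac{\sigma^2}{n}$.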

13 / 26

slide-27
SLIDE 27

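Variance of $\hat{\beta}_0$

Simplify $Var(\hat{\beta}_0 \mid x_i) = \frac{\sigma^2}{n} + (\bar{x})^2 \frac{\sigma^2}{SST_x}$ and get:

$$Var(\hat{\beta}_0 \mid x_i) = \frac{\sigma^2 \sum_i x_i^2}{n \cdot SST_x}$$

(practice with the summation operator)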
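The simplification is left as practice; one possible route, sketched here, expands $SST_x$ and then puts the two terms over a common denominator:

```latex
\begin{aligned}
SST_x &= \sum_i (x_i - \bar{x})^2 = \sum_i x_i^2 - n\bar{x}^2
  \quad\Longrightarrow\quad SST_x + n\bar{x}^2 = \sum_i x_i^2 \\
\frac{\sigma^2}{n} + \bar{x}^2 \frac{\sigma^2}{SST_x}
  &= \frac{\sigma^2 \left( SST_x + n\bar{x}^2 \right)}{n \cdot SST_x}
   = \frac{\sigma^2 \sum_i x_i^2}{n \cdot SST_x}
\end{aligned}
```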

14 / 26

slide-28
SLIDE 28

$\sigma^2$ is unknown

◮ $\sigma^2$ appears in both $Var(\hat{\beta}_0 \mid x_i)$ and $Var(\hat{\beta}_1 \mid x_i)$, but the problem is that the error variance, $\sigma^2$, is generally unknown: we do not observe the error term, $u$, and hence cannot compute $Var(u)$.

◮ What we observe are the residuals, $\hat{u}_i$.

◮ We can use the residuals to form an estimate of the error variance:

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_i \hat{u}_i^2 = \frac{SSR}{n-2}$$

$n - 2$: degrees of freedom (DF) for SSR. (The $\hat{u}_i$ are restricted by two moment restrictions: $\sum_i \hat{u}_i = 0$ and $\sum_i x_i \hat{u}_i = 0$.)

15 / 26

slide-31
SLIDE 31

Estimated Variances

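Substitute $\hat{\sigma}^2$ for $\sigma^2$, and we get the estimated variances:

$$\widehat{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{SST_x}, \qquad \widehat{Var}(\hat{\beta}_0) = \frac{\hat{\sigma}^2 \sum_i x_i^2}{n \cdot SST_x}$$

and standard errors:

$$se(\hat{\beta}_1) = \sqrt{\widehat{Var}(\hat{\beta}_1)}, \qquad se(\hat{\beta}_0) = \sqrt{\widehat{Var}(\hat{\beta}_0)}$$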
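The steps from residuals to standard errors can be traced on a tiny data set. The five observations below are made up for illustration; the code is a sketch of the formulas in these slides rather than a replacement for a statistics package.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up data for illustration
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

# OLS estimates
xbar = sum(xs) / n
ybar = sum(ys) / n
sst_x = sum((x - xbar) ** 2 for x in xs)
b1 = sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x
b0 = ybar - b1 * xbar

# Residuals and the estimate of the error variance
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
ssr = sum(r ** 2 for r in residuals)
sigma2_hat = ssr / (n - 2)                      # n - 2 degrees of freedom

# Estimated variances and standard errors
var_b1 = sigma2_hat / sst_x
var_b0 = sigma2_hat * sum(x ** 2 for x in xs) / (n * sst_x)
se_b1 = math.sqrt(var_b1)
se_b0 = math.sqrt(var_b0)

print(b0, b1, sigma2_hat, se_b0, se_b1)
```

The residuals sum to zero and are uncorrelated with `xs` by construction, which is why two degrees of freedom are lost.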

16 / 26

slide-33
SLIDE 33

Units of measurement

◮ Suppose we have data on education and wage, and estimated a wage equation using Stata:

$$\text{wage} = \hat{\beta}_0 + \hat{\beta}_1 \text{educ} + \hat{u} = 146.95 + 60.21\, \text{educ} + \hat{u}, \quad R^2 = 0.107$$

◮ Now we would like to interpret the results.

(1) educ (education) is measured as years of schooling
(2) wage is measured as pounds per month

◮ So $\hat{\beta}_1 = 60.21$ would mean that: if one increases his education by 1 year, then on average, his wage will increase by 60.21 pounds per month.

17 / 26

slide-36
SLIDE 36

Changing units of measurement: change in scale

◮ Suppose that we change the units of measurement for educ and wage, such that

(1) $\text{educ}^*$ is measured as months of schooling
(2) $\text{wage}^*$ is measured as hundreds of pounds per month

◮ So for the same wage equation, $\text{wage}^* = \hat{\beta}_0^* + \hat{\beta}_1^* \text{educ}^* + \hat{u}^*$, what would be $\hat{\beta}_0^*$ and $\hat{\beta}_1^*$?

18 / 26

slide-38
SLIDE 38

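◮ First, we can say that

(1) $\text{educ}^* = \text{educ} \times 12$
(2) $\text{wage}^* = \text{wage}/100$

◮ Second, re-write the original equation $\text{wage} = \hat{\beta}_0 + \hat{\beta}_1 \text{educ} + \hat{u}$ as:

$$(\text{wage}/100) \cdot 100 = \hat{\beta}_0 + \hat{\beta}_1 (\text{educ} \cdot 12)/12 + \hat{u}$$

or

$$\text{wage}^* \cdot 100 = \hat{\beta}_0 + \hat{\beta}_1 \text{educ}^*/12 + \hat{u}$$

This equation is essentially the same as the original one, but includes variables with changed units of measurement, $\text{wage}^*$ and $\text{educ}^*$.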

19 / 26

slide-42
SLIDE 42

◮ Third, re-arrange the equation in the second step, i.e., divide both sides by 100, and get

$$\text{wage}^* = (\hat{\beta}_0/100) + (\hat{\beta}_1/1200)\, \text{educ}^* + \hat{u}/100$$

So $\hat{\beta}_0^* = \hat{\beta}_0/100 = 1.4695$, and $\hat{\beta}_1^* = \hat{\beta}_1/1200 = 60.21/1200 \approx 0.05$.

◮ This is an example of change in scale. Can you interpret $\hat{\beta}_0^*$ and $\hat{\beta}_1^*$?
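The change-in-scale result can also be confirmed by re-estimating the regression on rescaled data. The six observations below are hypothetical (they are not the data behind the lecture's wage equation), and `ols` is a small helper written for this sketch.

```python
def ols(xs, ys):
    """Return (b0, b1) from a simple regression of ys on xs."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sst_x = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x
    return ybar - b1 * xbar, b1

educ = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]             # hypothetical years of schooling
wage = [600.0, 800.0, 850.0, 950.0, 1000.0, 1200.0]    # hypothetical pounds per month

b0, b1 = ols(educ, wage)

educ_star = [12 * e for e in educ]      # months of schooling
wage_star = [w / 100 for w in wage]     # hundreds of pounds per month
b0_star, b1_star = ols(educ_star, wage_star)

print(b0_star, b0 / 100)     # intercept is divided by 100
print(b1_star, b1 / 1200)    # slope is divided by 100 * 12
```

Applied to the lecture's estimates, the same rule gives 146.95/100 = 1.4695 and 60.21/1200, which is approximately 0.05.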

20 / 26

slide-45
SLIDE 45

Changing units of measurement: Change in origin

Suppose that we change the units of measurement for educ such that $\text{educ}^*$ is measured as years of schooling after primary school (assume it takes 6 years to finish primary school). We'd like to know how the coefficients would change from the original equation.

◮ First, we can say that $\text{educ}^* = \text{educ} - 6$.

◮ Second, re-write the original equation $\text{wage} = \hat{\beta}_0 + \hat{\beta}_1 \text{educ} + \hat{u}$ as:

$$\text{wage} = \hat{\beta}_0 + \hat{\beta}_1 (\text{educ} - 6 + 6) + \hat{u}$$

or

$$\text{wage} = \hat{\beta}_0 + \hat{\beta}_1 (\text{educ}^* + 6) + \hat{u}$$

21 / 26

slide-48
SLIDE 48

◮ Third, re-arrange the equation in the second step:

$$\text{wage} = (\hat{\beta}_0 + 6\hat{\beta}_1) + \hat{\beta}_1 \text{educ}^* + \hat{u}$$

So $\hat{\beta}_0^* = \hat{\beta}_0 + 6\hat{\beta}_1 = 146.95 + 6 \times 60.21 = 508.21$, and $\hat{\beta}_1^*$ does not change from the original equation.

22 / 26

slide-49
SLIDE 49

Changing units of measurement does not change the goodness-of-fit, $R^2 = SSE/SST$ (the explained sum of squares over the total sum of squares). This is because

(1) When there is a change in scale, SSE and SST change proportionally, and the ratio SSE/SST does not change.
(2) When there is a change in origin, neither SSE nor SST changes, and the ratio SSE/SST does not change.
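Both claims about $R^2$ can be checked by fitting the same regression before and after a change in scale and a change in origin. The data and the helper `fit` below are hypothetical, written only for this sketch; `fit` computes $R^2$ as SSE/SST with SSE the explained sum of squares.

```python
def fit(xs, ys):
    """Return (b0, b1, r2), with r2 = SSE / SST (explained over total sum of squares)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sst_x = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * y for x, y in zip(xs, ys)) / sst_x
    b0 = ybar - b1 * xbar
    sse = sum((b0 + b1 * x - ybar) ** 2 for x in xs)   # explained sum of squares
    sst = sum((y - ybar) ** 2 for y in ys)             # total sum of squares
    return b0, b1, sse / sst

educ = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]             # hypothetical years of schooling
wage = [600.0, 800.0, 850.0, 950.0, 1000.0, 1200.0]    # hypothetical pounds per month

b0, b1, r2 = fit(educ, wage)

# Change in scale: months of schooling, hundreds of pounds
_, _, r2_scale = fit([12 * e for e in educ], [w / 100 for w in wage])

# Change in origin: years of schooling after primary school
b0_origin, b1_origin, r2_origin = fit([e - 6 for e in educ], wage)

print(r2, r2_scale, r2_origin)       # identical up to rounding
print(b0_origin, b0 + 6 * b1)        # intercept shifts by 6 * b1
print(b1_origin, b1)                 # slope is unchanged
```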

23 / 26

slide-51
SLIDE 51

Nonlinearities in simple regression

◮ What if we estimate another wage equation, where wage is logged:

$$\widehat{\log(\text{wage})} = \hat{\beta}_0 + \hat{\beta}_1 \text{educ} = 0.62 + 0.08\, \text{educ}, \quad R^2 = 0.2$$

This is a log-level model.

◮ So $\hat{\beta}_1 = 0.08$ would mean that: if one increases his education by 1 year, then on average, his monthly wage will increase by about 8%.

◮ Advantage of the log-level model: the slope coefficient does not depend on the units of wage (i.e., $100 \cdot \hat{\beta}_1$ is approximately a percentage change).
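The 8% reading is an approximation based on $\log(1 + r) \approx r$ for small $r$. A short calculation, using only the slope 0.08 from the fitted equation, compares it with the exact proportional change implied by the log model when educ rises by one year and everything else is held fixed:

```python
import math

b1 = 0.08                                # slope from the fitted log-level equation
approx_pct = 100 * b1                    # usual approximation: 100 * b1 percent
exact_pct = 100 * (math.exp(b1) - 1)     # exact change implied by the log model

print(approx_pct, round(exact_pct, 2))   # 8.0 versus roughly 8.33
```

The gap is small here and grows with the size of the coefficient, so the approximation is best reserved for small slopes.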

24 / 26

slide-54
SLIDE 54

Nonlinearities in simple regression

◮ Is the regression $\log(\text{wage}) = \beta_0 + \beta_1 \text{educ} + u$ linear?

wage is non-linear in educ (because wage is an exponential function of educ). However, the regression is linear in parameters!

◮ Examples:

Linear in parameters: $y = \beta_0 + \beta_1 \sqrt{x} + u$
Non-linear in parameters: $y = x^{\beta} + u$
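A model that is linear in parameters can be estimated by ordinary simple regression after transforming the regressor. The sketch below generates data from $y = \beta_0 + \beta_1 \sqrt{x}$ with made-up parameter values and no error term, so that the check is exact, and then regresses $y$ on $z = \sqrt{x}$.

```python
import math

def ols(zs, ys):
    """Return (b0, b1) from a simple regression of ys on zs."""
    n = len(zs)
    zbar = sum(zs) / n
    ybar = sum(ys) / n
    sst_z = sum((z - zbar) ** 2 for z in zs)
    b1 = sum((z - zbar) * y for z, y in zip(zs, ys)) / sst_z
    return ybar - b1 * zbar, b1

beta0, beta1 = 3.0, 2.0                              # hypothetical parameters
xs = [1.0, 4.0, 9.0, 16.0, 25.0]
ys = [beta0 + beta1 * math.sqrt(x) for x in xs]      # no error term, for a clean check

zs = [math.sqrt(x) for x in xs]                      # transformed regressor z = sqrt(x)
b0, b1 = ols(zs, ys)

print(b0, b1)   # recovers beta0 and beta1 up to rounding
```

No such transformation of the regressor makes $y = x^{\beta} + u$ linear in $\beta$, which is why the simple OLS formulas do not apply to it directly.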

25 / 26

slide-56
SLIDE 56

Reading

Chapter 2, Introductory Econometrics - A Modern Approach, 4th Edition, J. Wooldridge

26 / 26