SLIDE 1

2005 SAMSI Undergraduate Workshop

Statistical View of Linear Least Squares

Minjung Kyung

mkyung@stat.ncsu.edu

May 22, 2005


SLIDE 2

Introduction to Linear Regression

  • A functional relation between two variables is expressed by a mathematical formula.
    – X denotes the independent variable.
    – Y denotes the dependent variable.
    – A functional relation is of the form Y = f(X).
    – Given a particular value of X, the function f indicates the corresponding value of Y.

SAMSI Linear Least Squares

SLIDE 3

[Figure: Example of Functional Relation (Y plotted against X).]

SLIDE 4

Introduction to Linear Regression

  • A statistical relation between two variables:
    – is not a perfect one
    – in general, the observations for a statistical relation do not fall directly on the curve of relationship

SLIDE 5

[Figure: Scatter Plot, and Scatter Plot and Line of Statistical Relationship (y = 62.37 + 3.57x); Work Hrs plotted against Lot Size.]

SLIDE 6

[Figure: Curvilinear Statistical Relation Example; Prognosis plotted against Days.]

SLIDE 7

Introduction to Linear Regression

  • A regression model is a formal means of expressing the two essential ingredients of a statistical relation:
  • 1. A tendency of the response variable Y to vary with the predictor variable X in a systematic fashion.
  • 2. A scattering of points around the curve of statistical relationship.

SLIDE 8

Simple Linear Regression Model

Yi = β0 + β1Xi + εi

  • Yi is the value of the response variable in the ith trial
  • β0 and β1 are parameters (the regression coefficients)
  • Xi is the value of the predictor variable in the ith trial
  • εi is a random error term
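The model can be made concrete with a short simulation. A minimal sketch: the parameter values below are illustrative only (β0 and β1 are borrowed from the lot-size line shown earlier; σ and n are made up), not estimates from any data in these slides.

```python
import numpy as np

# Simulate n observations from Y_i = beta0 + beta1 * X_i + eps_i,
# with eps_i ~ N(0, sigma^2). Parameter values are illustrative only.
rng = np.random.default_rng(0)
beta0, beta1, sigma, n = 62.37, 3.57, 20.0, 25
X = rng.uniform(20, 120, size=n)        # predictor values
eps = rng.normal(0.0, sigma, size=n)    # random error terms
Y = beta0 + beta1 * X + eps             # response = deterministic part + error
```

Each Yi is a random variable: the deterministic part β0 + β1Xi plus a draw of εi.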

SLIDE 9

Simple Linear Regression Model
Model Assumptions

  • 1. The error terms are normally distributed with mean 0 and variance σ² for all values of i: εi ∼ N(0, σ²)
  • 2. The error terms εi and εj are independent for i ≠ j
  • 3. Although the model explicitly allows for measurement error in Y, measurements made on X are known precisely (there is no measurement error)

SLIDE 10

Simple Linear Regression Model
Important Features of the Simple Linear Regression Model

  • 1. The response Yi is a sum of 2 components: the deterministic term β0 + β1Xi and the random error term εi. Therefore, Yi is a random variable.
  • 2. The response Yi comes from a probability distribution whose mean is E[Yi] = β0 + β1Xi.
  • 3. The response Yi exceeds or falls short of the value of the regression function by the error term amount εi.
  • 4. The responses Yi have the same constant variance as the error term εi: var[Yi] = var[β0 + β1Xi + εi] = σ².

SLIDE 11

  • 5. The responses Yi and Yj are uncorrelated, since the error terms εi and εj are uncorrelated.

In summary, the responses Yi come from normal distributions with mean E[Yi] = β0 + β1Xi and variance σ², the same for all levels of X. Further, any two responses Yi and Yj are uncorrelated:

Yi ∼ independent N(β0 + β1Xi, σ²)

SLIDE 12

Steps for Selecting an Appropriate Regression Model

  • 1. Exploratory data analysis
  • 2. Develop one or more tentative regression models
  • 3. Examine and revise the regression models for their appropriateness for the data at hand (or develop new models)
  • 4. Make inferences on the basis of the selected regression model

SLIDE 13

Estimation of the Regression Function
Method of Least Squares

  • For the observations (Xi, Yi), consider the deviation of Yi from its expected value: Yi − (β0 + β1Xi).
  • Consider the sum of the squared deviations:

    Q = Σⁿᵢ₌₁ (Yi − β0 − β1Xi)²    (1)

  The estimators of β0 and β1 are those values β̂0 and β̂1 that minimize Q for the given sample observations (X1, Y1), (X2, Y2), . . . , (Xn, Yn).

SLIDE 14

Estimation of the Regression Function
Least Squares Estimators

  • The estimators β̂0 and β̂1 that satisfy the least squares criterion can be found in 2 ways:
  • 1. Numerical search procedures
  • 2. Analytical procedures

We will use the analytical approach.

SLIDE 15

Estimation of the Regression Function
Least Squares Estimators

  • The values of β̂0 and β̂1 that minimize Q can be derived by differentiating (1) with respect to β0 and β1 and setting the results equal to 0:

    ∂Q/∂β0 = −2 Σⁿᵢ₌₁ (Yi − β0 − β1Xi) = 0
    ∂Q/∂β1 = −2 Σⁿᵢ₌₁ Xi(Yi − β0 − β1Xi) = 0

  • Simplifying, we get the normal equations:

    Σⁿᵢ₌₁ Yi − nβ̂0 − β̂1 Σⁿᵢ₌₁ Xi = 0

SLIDE 16
    Σⁿᵢ₌₁ XiYi − β̂0 Σⁿᵢ₌₁ Xi − β̂1 Σⁿᵢ₌₁ Xi² = 0

  • The normal equations can be solved simultaneously to get estimates of the parameters β0 and β1:

    β̂1 = Σⁿᵢ₌₁ (Xi − X̄)(Yi − Ȳ) / Σⁿᵢ₌₁ (Xi − X̄)²

    β̂0 = (1/n)(Σⁿᵢ₌₁ Yi − β̂1 Σⁿᵢ₌₁ Xi) = Ȳ − β̂1X̄

  where X̄ and Ȳ are the means of the X and Y observations, respectively.
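The closed-form estimates above translate directly into code. A minimal sketch with hypothetical data (the arrays below are invented for illustration):

```python
import numpy as np

def ls_estimates(X, Y):
    """Closed-form least squares estimates for simple linear regression."""
    Xbar, Ybar = X.mean(), Y.mean()
    # beta1-hat = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2
    b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
    # beta0-hat = Ybar - beta1-hat * Xbar
    b0 = Ybar - b1 * Xbar
    return b0, b1

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical observations
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b0, b1 = ls_estimates(X, Y)
```

For this data the slope estimate is 1.99 and the intercept estimate is 0.05.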

SLIDE 17

Estimation of the Regression Function
Residuals

  • The fitted value for the ith case: Ŷi = β̂0 + β̂1Xi
  • The ith residual is the difference between the observed value Yi and the fitted value Ŷi: ei = Yi − Ŷi = Yi − (β̂0 + β̂1Xi).
  • Model error term: εi = Yi − (β0 + β1Xi)
    – Represents the vertical deviation of Yi from the unknown true regression line.
  • Residual: ei = Yi − Ŷi

SLIDE 18

    – Represents the vertical deviation of Yi from the fitted value Ŷi on the estimated regression line.
    – Residuals are useful for studying whether a given regression model is appropriate for the given data.

SLIDE 19

Properties of the Fitted Regression Line

  • Σⁿᵢ₌₁ ei = 0.
  • Σⁿᵢ₌₁ ei², the sum of the squared residuals, is a minimum.
  • Σⁿᵢ₌₁ Yi = Σⁿᵢ₌₁ Ŷi.
  • Σⁿᵢ₌₁ Xiei = 0.
  • Σⁿᵢ₌₁ Ŷiei = 0.
  • The regression line always goes through the point (X̄, Ȳ).
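These properties can be checked numerically on any fitted line. A minimal sketch, again with invented data:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Fit by the closed-form least squares formulas
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Yhat = b0 + b1 * X                         # fitted values
e = Y - Yhat                               # residuals

# Each listed property should be zero up to floating-point error:
# sum e, sum X*e, sum Yhat*e, sum Y - sum Yhat, and (Xbar, Ybar) on the line
checks = [e.sum(), (X * e).sum(), (Yhat * e).sum(),
          Yhat.sum() - Y.sum(), (b0 + b1 * X.mean()) - Y.mean()]
```

All five checks come out at machine precision, confirming the algebraic identities.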

SLIDE 20

Estimation of σ²

  • A variety of inferences concerning the regression function require an estimate of σ².
    – To get an estimate of σ², first compute the error sum of squares (or residual sum of squares):

      SSE = Σⁿᵢ₌₁ (Yi − Ŷi)² = Σⁿᵢ₌₁ ei².

    – The mean square error (MSE) is computed as MSE = SSE/(n − 2).
    – It can be shown that MSE is an unbiased estimator of σ².
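The SSE and MSE computations can be sketched as follows (the three-point data set is invented purely to keep the arithmetic checkable by hand):

```python
import numpy as np

def mse(X, Y):
    """Unbiased estimate of sigma^2: SSE / (n - 2)."""
    n = len(X)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    e = Y - (b0 + b1 * X)        # residuals
    sse = np.sum(e ** 2)         # error (residual) sum of squares
    return sse / (n - 2)         # divide by n - 2, not n

sigma2_hat = mse(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 4.0]))
```

For this data, SSE = 1/6 and n − 2 = 1, so the estimate is 1/6.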

SLIDE 21

Matrix Approach to Least Squares

  • The regression model Yi = β0 + β1Xi + εi can be written in matrix notation as

    Y = Xβ + ε

  where Y = (Y1, Y2, . . . , Yn)′, β = (β0, β1)′, ε = (ε1, ε2, . . . , εn)′, and X is the n × 2 design matrix whose ith row is (1, Xi).

SLIDE 22

Matrix Approach to Least Squares

  • The normal equations in matrix form are

    X′Xβ = X′Y

  • The model parameters can be estimated as follows:

    β̂ = (X′X)⁻¹X′Y
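The matrix form can be sketched directly; solving the normal equations X′Xβ = X′Y is numerically preferable to forming (X′X)⁻¹ explicitly. The data are the same hypothetical values used earlier:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical predictor values
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])  # design matrix with rows (1, X_i)

# Solve X'X beta = X'Y rather than computing the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
b0, b1 = beta_hat
```

This reproduces the closed-form estimates (slope 1.99, intercept 0.05) exactly, since both solve the same normal equations.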

SLIDE 23

Matrix Approach to Least Squares

  • The residuals are computed using e = Y − Ŷ = Y − Xβ̂, where

    Ŷ = (β̂0 + β̂1X1, . . . , β̂0 + β̂1Xn)′

  • The estimate of σ² is computed as follows:

    σ̂² = e′e/(n − 2)

SLIDE 24

Inferences in Regression Analysis
Inferences Concerning β1

  • Point estimator of β1:

    β̂1 = Σⁿᵢ₌₁ (Xi − X̄)(Yi − Ȳ) / Σⁿᵢ₌₁ (Xi − X̄)²

  • Estimate of the standard error of β̂1:

    SE[β̂1] = √( MSE / Σⁿᵢ₌₁ (Xi − X̄)² )

  • Confidence interval for β1:

    β̂1 ± t(1 − α/2; n − 2) SE[β̂1]
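The standard error and confidence interval can be computed in a few lines. A sketch assuming SciPy is available for the t quantile; the data are invented:

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical data
Y = np.array([2.2, 3.8, 6.1, 8.3, 9.7, 12.2])
n = len(X)

b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)
mse = np.sum(e ** 2) / (n - 2)
se_b1 = np.sqrt(mse / np.sum((X - X.mean()) ** 2))   # SE[beta1-hat]

alpha = 0.05
tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)         # t_{1-alpha/2; n-2}
ci = (b1 - tcrit * se_b1, b1 + tcrit * se_b1)        # CI for beta1
```

Here the interval excludes 0, so H0 : β1 = 0 would be rejected at α = 0.05.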

SLIDE 25

Inferences in Regression Analysis
Inferences Concerning β1

  • To test H0 : β1 = 0 vs. Ha : β1 ≠ 0,
    – Test statistic: t = (β̂1 − 0)/SE[β̂1]
    – Reject H0 at level α if |t| > t(1 − α/2; n − 2); equivalently, reject H0 if the p-value 2P(T ≥ |t|) < α, where T has a t distribution with n − 2 degrees of freedom.

SLIDE 26

Inferences in Regression Analysis
Inferences Concerning β0

  • Point estimator of β0:

    β̂0 = Ȳ − β̂1X̄

  • Estimate of the standard error of β̂0:

    SE[β̂0] = √( MSE [1/n + X̄² / Σⁿᵢ₌₁ (Xi − X̄)²] )

  • Confidence interval for β0:

    β̂0 ± t(1 − α/2; n − 2) SE[β̂0]

SLIDE 27

Inferences in Regression Analysis
Inferences Concerning β0

  • To test H0 : β0 = 0 vs. Ha : β0 ≠ 0,
    – Test statistic: t = (β̂0 − 0)/SE[β̂0]
    – Reject H0 at level α if |t| > t(1 − α/2; n − 2); equivalently, reject H0 if the p-value 2P(T ≥ |t|) < α, where T has a t distribution with n − 2 degrees of freedom.

SLIDE 28

Checking the Model Assumptions

  • If the model is appropriate for the data at hand, the observed residuals ei should reflect the properties assumed for the εi.
  • Residuals can be used to detect departures from the linear regression model:
    – A residual plot against the fitted values can be used to determine whether the error terms have a constant variance.
    – A plot of the residuals in the order of the data can be used to test for correlation between error terms. When the error terms are independent, we expect them to fluctuate in a random pattern around 0.

SLIDE 29

Prototype Residual Plots

[Figure: "Constant Error Variance" (Residual vs. Lot Size) and "Need Curvilinear Regression" (Residual vs. Days).]

SLIDE 30

[Figure: "Increasing Error Variance" (Residual vs. Age) and "Outlier Effect" (Mortality vs. Temperature).]

SLIDE 31

Nonlinear Regression
Nonlinear Regression vs. Linear Regression

  • A regression model is called nonlinear if the derivatives of the model with respect to the model parameters depend on one or more of the parameters.
    – e.g., y = δ + (α − δ)/(1 + exp[β log(x/γ)]) + ε
    – Take the derivative with respect to δ, for example: ∂y/∂δ = 1 − 1/(1 + exp[β log(x/γ)]).
    – The derivative involves other parameters, hence the model is nonlinear.

SLIDE 32

Nonlinear Regression
Nonlinear Regression vs. Linear Regression

  • A regression model is not necessarily nonlinear just because the graphed regression trend is curved.
    – e.g., y = β0 + β1x + β2x² + ε
    – Take derivatives of y with respect to the parameters β0, β1, and β2: ∂y/∂β0 = 1, ∂y/∂β1 = x, ∂y/∂β2 = x².
    – None of these derivatives depends on a model parameter, so the model is linear.
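Because the polynomial model is linear in its parameters, the ordinary linear least squares machinery fits it; only the design matrix changes. A sketch with invented data that roughly follow y = 1 + x²:

```python
import numpy as np

# Hypothetical data, approximately y = 1 + x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.2, 5.1, 10.2, 17.1])

# y = b0 + b1*x + b2*x^2 + eps is LINEAR in (b0, b1, b2):
# each design-matrix column is a fixed function of x
X = np.column_stack([np.ones_like(x), x, x ** 2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

No iterative search is needed; the curved trend is captured by a linear model.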

SLIDE 33

Nonlinear Regression
Fitting Nonlinear Regression Models

  • The general form of a nonlinear regression model is

    y = η(x, β) + ε

  where x is a vector of covariates, β is a vector of unknown parameters, and ε is a N(0, σ²) error term.

  • To estimate the unknown parameters, solve

    min over β of Σⁿᵢ₌₁ [yi − η(xi, β)]².

SLIDE 34

Nonlinear Regression
Fitting Nonlinear Regression Models

  • The variance parameter σ² is estimated by the residual mean square, as in linear regression:

    σ̂² = e′e/(n − 2),  where e = (y1 − η(x1, β̂), . . . , yn − η(xn, β̂))′

SLIDE 35

Nonlinear Regression
Fitting Nonlinear Regression Models

  • There is no explicit formula for the estimates, so iterative procedures are required.
  • One of the disadvantages of nonlinear models is that the fitting process is iterative:
    – To estimate the parameters of the model, you start with a set of user-supplied starting values.
    – Care must be exercised in choosing good starting values.
    – It is thus sensible to start the iterative process with different sets of starting values and to check whether the program arrives at the same parameter estimates.
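The iterative fitting described above can be sketched with SciPy's `curve_fit`, using the slides' exponential model. The synthetic data below are generated around the parameter values reported on the last slide; they are not the slides' actual data.

```python
import numpy as np
from scipy.optimize import curve_fit

def eta(x, b0, b1):
    """The slides' exponential mean function y = beta0 * exp(beta1 * x)."""
    return b0 * np.exp(b1 * x)

# Synthetic data around the fitted values reported on the last slide
rng = np.random.default_rng(1)
x = np.linspace(10, 60, 15)
y = 58.6 * np.exp(-0.04 * x) + rng.normal(0.0, 1.0, size=15)

# Iterative fitting needs user-supplied starting values; in practice,
# rerun with several different p0 and check the estimates agree
popt, pcov = curve_fit(eta, x, y, p0=[50.0, -0.03])
se = np.sqrt(np.diag(pcov))   # asymptotic standard errors
```

Rerunning with a different `p0` (say `[100.0, -0.1]`) and comparing `popt` is a quick check against convergence to a poor local minimum.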

SLIDE 36

  • The software tries to improve the quality of the model's fit to the data by adjusting the values of the parameters successively: one iteration.
  • In the next iteration, the program again attempts to improve the fit by modifying the parameters.
  • Once no further improvement is possible, the fit is considered converged.
  • Note that all results in nonlinear regression are asymptotic: the standard error, for example, is exactly correct only for an infinitely large sample size. For any finite sample size, the reported standard error is an approximation that improves with increasing sample size.

SLIDE 37

Nonlinear Regression
Example of Nonlinear Regression Model

[Figure: Curvilinear Statistical Relation Example; Prognosis plotted against Days.]

  • The formula of this model is y = β0 exp(β1x) + ε.

SLIDE 38

  • The estimated parameters are β̂0 = 58.6066 and β̂1 = −0.0396, with approximate standard errors 1.4845 and 0.0017, respectively.
