Least Squares Method


slide-1
SLIDE 1

1

The objective of the scatter diagram is to measure the strength and direction of the linear relationship. Both can be more easily judged by drawing a straight line through the data.

Least Squares Method

Which line best describes the relationship between X and Y?

slide-2
SLIDE 2

2

We need an objective method of producing a straight line. The best line will be the one that is “closest” to the points on the scatterplot. In other words, the best line is the one that minimises the total distance between itself and all the observed data points. Since we oftentimes use regression to predict values of Y from observed values of X, we choose to measure the distance vertically.

Least Squares Method

slide-3
SLIDE 3

3

We want to find the line that minimises the vertical distance between itself and the observed points on the scatterplot. Here we have two different lines that may describe the relationship between X and Y. To determine which one is better, we can find the vertical distances from each point to the line...

Based on these distances, the line on the right is better than the line on the left at describing the relationship between X and Y. However, there is an infinite number of lines we could draw, so we cannot compare them all this way.
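As a rough sketch of the comparison described above, the snippet below totals the vertical distances for two candidate lines. The data points and both candidate lines are made up for illustration; they are not the ones pictured on the slide.

```python
# Hypothetical data, made up for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def total_vertical_distance(intercept, slope, xs, ys):
    """Sum of |observed y - line's y| over all points."""
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))


# Two hypothetical candidate lines: y = 1 + 1x and y = 2.2 + 0.6x.
line_a = total_vertical_distance(1.0, 1.0, xs, ys)
line_b = total_vertical_distance(2.2, 0.6, xs, ys)

print("Line A total distance:", round(line_a, 2))
print("Line B total distance:", round(line_b, 2))
```

The line with the smaller total sits closer to the points. This only ranks the two lines we happened to try, which is why a formula for the best line is needed.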

Least Squares Method

slide-4
SLIDE 4

4

Recall that the slope-intercept equation for a line is:

y = mx + b

where m is the slope of the line and b is the y-intercept.

If we have determined, using the covariance and the coefficient of correlation, that there is a linear relationship between two variables, can we determine a linear function of the relationship?

Least Squares Method

slide-5
SLIDE 5

5

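Just to make things more difficult for students, we typically rewrite this line as:

$$\hat{y} = b_0 + b_1 x$$

where the slope is

$$b_1 = \frac{s_{xy}}{s_x^2}$$

and the intercept is

$$b_0 = \bar{y} - b_1 \bar{x}$$

Here $\hat{y}$ is read as “y-hat” and is the fitted regression line, $b_0$ is read as “b naught”, $s_{xy}$ is the sample covariance of x and y, and $s_x^2$ is the sample variance of x.

Least Squares Method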
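A minimal sketch of the slope and intercept formulas in Python, assuming the sample (n − 1) versions of the covariance and variance. The function name `least_squares_line` and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = s_xy / s_x^2 and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sample covariance of x and y, and sample variance of x (divide by n - 1).
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    b1 = s_xy / s_x2
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, made up for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```

Because both the covariance and the variance are divided by n − 1, the divisor cancels in the ratio; it is kept here only so the code mirrors the formulas.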

slide-6
SLIDE 6

6

Interpretation of the b0, b1

slide-7
SLIDE 7

7

We can then define the error to be the difference between the coordinates and the prediction line.

The coordinate of one point: $(x_i, y_i)$

Predicted value for a given $x_i$: $\hat{y}_i = b_0 + b_1 x_i$

Error = distance from one point to the line = Coordinate − Prediction = $y_i - \hat{y}_i$

The “best” line minimizes $\sum_i (y_i - \hat{y}_i)^2$, the sum of the squared errors.

“Best” line: least-squares, or regression line

Some of the errors will be positive and some will be negative! The problem is that when we add positive and negative values, they tend to cancel each other out.
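To see the cancelling problem in numbers, here is a small sketch with made-up data and two hypothetical candidate lines. It compares the plain sum of the errors with the sum of the squared errors.

```python
# Hypothetical data, made up for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def errors(b0, b1):
    """Signed errors: observed y minus the line's prediction at each x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_of_squared_errors(b0, b1):
    """Squaring makes every error non-negative, so nothing cancels."""
    return sum(e ** 2 for e in errors(b0, b1))


# Two candidate lines: a flat line at y = 4 and the line y-hat = 2.2 + 0.6x.
for b0, b1 in [(4.0, 0.0), (2.2, 0.6)]:
    signed_total = sum(errors(b0, b1))
    sse = sum_of_squared_errors(b0, b1)
    print(f"b0={b0}, b1={b1}: sum of errors={signed_total:.2f}, SSE={sse:.2f}")
```

For this data the signed errors add up to about zero for both lines, so the plain sum cannot tell them apart. The sum of squared errors can: it is about 6 for the flat line and about 2.4 for the sloped one.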

slide-8
SLIDE 8

8

When we square those error lines, we are literally making squares from those lines. We can visualize this as...

So we want to find the regression line that minimizes the sum of the areas of these error squares. For this regression line, the sum of the areas of the squares would look like this...

“Best” line: least-squares, or regression line

slide-9
SLIDE 9

9

Least Squares Method

Let's determine the best-fit line for the following data:

slide-10
SLIDE 10

10

Least Squares Method

slide-11
SLIDE 11

11

Least Squares Method

slide-12
SLIDE 12

12

Least Squares Method

slide-13
SLIDE 13

13

Least Squares Method

Lines of best fit will pivot around the point which represents the mean of X and the mean of the Y variables!
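One way to see why, sketched here as a short derivation that assumes the intercept formula $b_0 = \bar{y} - b_1 \bar{x}$: substitute $x = \bar{x}$ into the fitted line.

```latex
\begin{align*}
\hat{y}\,\big|_{x=\bar{x}} &= b_0 + b_1 \bar{x} \\
                           &= (\bar{y} - b_1 \bar{x}) + b_1 \bar{x} \\
                           &= \bar{y}
\end{align*}
```

Whatever the slope is, the fitted line predicts $\bar{y}$ at $\bar{x}$. Changing the slope therefore rotates the line about the point $(\bar{x}, \bar{y})$.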

slide-14
SLIDE 14

14

Least Squares Method

The sample variance of x used in the slope formula is:

$$s_x^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1}$$
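As a quick check of the sample variance formula, the sketch below computes it by hand and compares it with `statistics.variance` from the Python standard library, which also divides by n − 1. The function name `sample_variance` and the data are hypothetical.

```python
from statistics import mean, variance


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


# Hypothetical data, made up for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]

print(sample_variance(xs), variance(xs))
```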

slide-15
SLIDE 15

15

Least Squares Method

slide-16
SLIDE 16

16

Least Squares Method

slide-17
SLIDE 17

17

Least Squares Method

slide-18
SLIDE 18

18

Least Squares Method

slide-19
SLIDE 19

19

Least Squares Method

slide-20
SLIDE 20

20

Least Squares Method

slide-21
SLIDE 21

21

Least Squares Method

slide-22
SLIDE 22

22

Least Squares Method

slide-23
SLIDE 23

23

Least Squares Method

slide-24
SLIDE 24

24

Least Squares Method

slide-25
SLIDE 25

25

Least Squares Method

slide-26
SLIDE 26

26

Least Squares Method

slide-27
SLIDE 27

27

Only for medium to strong correlations...

Line of Best Fit

slide-28
SLIDE 28

28

Line of Best Fit

slide-29
SLIDE 29

29

Line of Best Fit

slide-30
SLIDE 30

30

r measures “closeness” of data to the “best” line. How best? In terms of least squared error:

What line?

slide-31
SLIDE 31

31

In a fixed and variable costs model:

$$\hat{y}_i = 9.95 + 2.25 x_i$$

b0 = 9.95? Intercept: the predicted value of y when x = 0.

b1 = 2.25? Slope: the predicted change in y when x increases by 1.

Interpretation of the b0, b1

$$\hat{y}_i = b_0 + b_1 x_i$$
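A small sketch of this interpretation, using the fitted line with intercept 9.95 and slope 2.25 from the slide. The function name `predict` and the x values tried are hypothetical.

```python
B0 = 9.95  # intercept: predicted y when x = 0
B1 = 2.25  # slope: predicted change in y when x increases by 1


def predict(x):
    """Predicted value y-hat = b0 + b1 * x for the fitted line."""
    return B0 + B1 * x


at_zero = predict(0)                  # equals the intercept
one_step = predict(11) - predict(10)  # equals the slope (up to rounding)

print(f"prediction at x=0: {at_zero:.2f}")
print(f"change per unit of x: {one_step:.2f}")
```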

slide-32
SLIDE 32

32

A simple example of a linear equation

A company has fixed costs of $7,000 for plant and equipment and variable costs of $600 for each unit of output. What is total cost at varying levels of output?

Let x = units of output
Let C = total cost

C = fixed cost plus variable cost = 7,000 + 600x

Interpretation of the b0, b1

$$\hat{y}_i = b_0 + b_1 x_i$$
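The cost equation can be sketched as a tiny function. The name `total_cost` and the output levels tried are hypothetical; the 7,000 and 600 come from the example.

```python
FIXED_COST = 7_000     # dollars, plant and equipment (the intercept)
VARIABLE_COST = 600    # dollars per unit of output (the slope)


def total_cost(units):
    """C = fixed cost + variable cost = 7,000 + 600 * x."""
    return FIXED_COST + VARIABLE_COST * units


for units in (0, 1, 10):
    print(units, total_cost(units))
```

At zero output the cost is just the fixed cost, and each extra unit adds 600, which is exactly how the intercept and slope are read.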

slide-33
SLIDE 33

33

b1, the slope, always has the same sign as r, the correlation coefficient, but they measure different things!

The sum of the errors (or residuals), $\sum_i (y_i - \hat{y}_i)$, is always 0 (zero).

The line always passes through the point $(\bar{x}, \bar{y})$.

Interpretation of the b0, b1

$$\hat{y}_i = b_0 + b_1 x_i$$
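These three properties can be checked numerically. The sketch below uses made-up data and the sample (n − 1) formulas; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean

# Hypothetical data, made up for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)

# Sample covariance and sample variances (divide by n - 1).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
s_y2 = sum((y - y_bar) ** 2 for y in ys) / (n - 1)

b1 = s_xy / s_x2                 # slope
b0 = y_bar - b1 * x_bar          # intercept
r = s_xy / sqrt(s_x2 * s_y2)     # correlation coefficient

residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

same_sign = (b1 > 0) == (r > 0)      # slope and r share a sign
residual_sum = sum(residuals)        # zero, up to floating-point rounding
y_hat_at_x_bar = b0 + b1 * x_bar     # equals y_bar

print("same sign:", same_sign)
print("sum of residuals is ~0:", abs(residual_sum) < 1e-9)
print("passes through the means:", abs(y_hat_at_x_bar - y_bar) < 1e-9)
```

The slope and r share a sign because both have the covariance in the numerator and a positive quantity in the denominator.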

slide-34
SLIDE 34

34

34

When we introduced the coefficient of correlation, we pointed out that except for −1, 0, and +1 we cannot precisely interpret its meaning. We can judge the coefficient of correlation only in relation to its proximity to −1, 0, and +1. Fortunately, we have another measure that can be precisely interpreted. It is the coefficient of determination, which is calculated by squaring the coefficient of correlation. For this reason we denote it R².

Coefficient of Determination

slide-35
SLIDE 35

35

The coefficient of determination measures the amount of variation in the dependent variable that is explained by the variation in the independent variable.

Coefficient of Determination

The coefficient of determination is R² = 0.758. This tells us that 75.8% of the variation in electrical costs is explained by the number of tools. The remaining 24.2% is unexplained.
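A sketch of how R² can be computed two ways: by squaring r, and as one minus the share of variation left unexplained by the fitted line (1 − SSE/SST, a standard identity for a simple regression with an intercept, not shown on the slides). The electrical-cost data behind the 0.758 figure is not reproduced in this transcript, so the data here is made up.

```python
from math import sqrt
from statistics import mean

# Hypothetical data, made up for illustration (not the electrical-cost data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
s_y2 = sum((y - y_bar) ** 2 for y in ys) / (n - 1)

b1 = s_xy / s_x2
b0 = y_bar - b1 * x_bar
r = s_xy / sqrt(s_x2 * s_y2)

# Unexplained variation (SSE) and total variation in y (SST).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared_from_r = r ** 2
r_squared_from_fit = 1 - sse / sst

print(round(r_squared_from_r, 3), round(r_squared_from_fit, 3))
```

For this made-up data both routes give about 0.6, so roughly 60% of the variation in y is explained by x and the remaining 40% is unexplained.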

slide-36
SLIDE 36

36

Least Squares Method --- R2

slide-37
SLIDE 37

37

Parameters and Statistics