8.4.3 Linear Regression
Prof. Tesler
Math 283, Fall 2019

Regression
Given n points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), we want to determine a function y = f(x) that is a good fit to the data.
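As a concrete illustration (added here, not from the original slides), the following minimal R sketch simulates points scattered around a hidden line and fits y = f(x) = b0 + b1*x by least squares; the seed and all parameter values are invented.

# Simulate points around a hidden line, then fit a line by least squares.
set.seed(1)                       # for reproducibility (invented seed)
n = 20
x = runif(n, -10, 30)             # x values in the range shown on the plots
y = 25 + 0.6*x + rnorm(n, 0, 5)   # hidden line plus random noise
fit = lm(y ~ x)                   # least squares line y = b0 + b1*x
coefficients(fit)                 # estimated intercept and slope
plot(x, y); abline(fit)           # data with the fitted line overlaid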
[Figure: scatter plot of the sample data in the (x, y) plane.]
[Figure: four example scatter plots showing different shapes of trend in the data.]
[Figure: the same scatter plot of sample data, repeated on three consecutive slides.]
Least squares: with the model y = β0 + β1x + ε, choose the estimates to minimize the sum of squared errors

SSE = Σ_{i=1}^n (y_i − (β0 + β1 x_i))²

Setting ∂SSE/∂β0 = ∂SSE/∂β1 = 0 gives

β̂1 = Σ_i (x_i − x̄)(y_i − ȳ) / Σ_i (x_i − x̄)²,  β̂0 = ȳ − β̂1 x̄
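A short R check (added sketch, not from the slides; data simulated with invented parameters) that the closed-form formulas agree with lm:

# Compute the least squares slope and intercept from the formulas
# above, then compare with lm on the same simulated data.
set.seed(2)
x = 1:15
y = 3 + 2*x + rnorm(15, 0, 2)
b1hat = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0hat = mean(y) - b1hat * mean(x)
c(b0hat, b1hat)              # closed-form least squares estimates
coefficients(lm(y ~ x))      # lm gives the same answer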
[Figure: two fits to the same data.
Left: regression of y on x, model y = β0 + β1x + ε, giving y = 24.9494 + 0.6180x (slope 0.6180).
Right: regression of x on y, model x = α0 + α1y + ε, giving x = −28.2067 + 1.1501y; drawn in the (x, y) plane, this line has slope 1/1.1501 = 0.8695.]
[Figure: four panels comparing fits to the same data.
(a) Regression of y on x: y = 24.9494 + 0.6180x (slope 0.6180).
(b) Regression of x on y: x = −28.2067 + 1.1501y (slope 0.8695 in the xy-plane).
(c) First principal component: slope 0.6934274.
(d) All three lines together; they intersect at the mean point (x̄, ȳ) = (1.685727, 25.99114).]
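The three lines can be computed in R as follows (added sketch, not from the slides; the slide's actual dataset is not reproduced here, so the data are simulated with invented parameters):

# Three lines through the same data: regress y on x, regress x on y,
# and the first principal component. All pass through the mean point.
set.seed(3)
x = rnorm(50, 0, 10)
y = 25 + 0.6*x + rnorm(50, 0, 8)
slope.yx = coefficients(lm(y ~ x))[2]      # regress y on x
slope.xy = 1 / coefficients(lm(x ~ y))[2]  # x on y, slope in the xy-plane
v = prcomp(cbind(x, y))$rotation[, 1]      # first principal direction
slope.pc = v[2] / v[1]
c(slope.yx, slope.xy, slope.pc)            # three generally different slopes
c(mean(x), mean(y))                        # all three lines pass through here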
[Figure: data simulated from y = β0 + β1x + ε, showing the true line, the sample data, the best fit line, and a 95% prediction interval; r² = 0.7683551.]
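In R, prediction intervals come from predict with interval = "prediction" (added sketch, not from the slides; all parameter values are invented):

# Simulate from y = b0 + b1*x + e, fit a line, and get 95% prediction
# intervals for y at new x values.
set.seed(4)
x = seq(2, 25, length.out = 30)
y = 5 + 6*x + rnorm(30, 0, 15)               # true line plus noise
fit = lm(y ~ x)
summary(fit)$r.squared                       # r^2 for this sample
newx = data.frame(x = c(10, 20))
predict(fit, newx, interval = "prediction")  # 95% prediction intervals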
Sampling distribution of the estimates: with independent errors ε_i ~ N(0, σ²),

Var(β̂0) = σ² Σ_i x_i² / (n Σ_i (x_i − x̄)²),  Var(β̂1) = σ² / Σ_i (x_i − x̄)²

Estimated standard errors replace σ² by its estimate s², e.g. SE(β̂1) = s / √(Σ_i (x_i − x̄)²).
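A Monte Carlo check of the slope-variance formula (added sketch, not from the slides; x's, σ, and the true line are invented):

# Refit the line on many simulated datasets with the same x's and
# compare the empirical variance of the slope with the formula.
set.seed(5)
x = 1:20
sigma = 4
slopes = replicate(10000, {
  y = 1 + 2*x + rnorm(20, 0, sigma)          # new errors, same x's
  coefficients(lm(y ~ x))[2]                 # refitted slope
})
var(slopes)                                  # empirical Var(beta1.hat)
sigma^2 / sum((x - mean(x))^2)               # theoretical variance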
Sample size and significance of the correlation coefficient: https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#Sample_size
[Figure: example scatter plots with Pearson correlation coefficients including 1, 0.8, and 0.4, from http://en.wikipedia.org/wiki/File:Correlation_examples2.svg; see also http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient]
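For reference (added sketch, not from the slides), R's built-in cor.test computes the sample correlation and tests ρ = 0, which connects to the sample-size discussion linked above; the data here are simulated with invented parameters:

# Sample Pearson correlation and a test of H0: rho = 0.
set.seed(6)
x = rnorm(100)
y = 0.8*x + rnorm(100, 0, 0.6)
cor(x, y)        # sample Pearson correlation r
cor.test(x, y)   # t-test of rho = 0, with a 95% confidence interval for rho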
Spurious correlations: http://www.tylervigen.com/view_correlation?id=1703 and http://tylervigen.com/view_correlation?id=1759
>> # Generate data with known x
>> # but random errors in y
>> x = (-10:10)';   # column vector
>> err = normrnd(0, 100, size(x));
>> y = 10*(x.^2) - 3*x + 6 + err;
>> # Point estimate (no conf. int.):
>> polyfit(x, y, 2)
ans =
    9.5968       ...   30.5096
>> # Interval estimate (with conf. int.)
>> # Create the design matrix
>> Xdesign = [ones(size(x)), x, x.^2]
Xdesign =
     1   -10   100
     1    -9    81
   ...
     1    10   100
>> [b, bint] = regress(y, Xdesign)
b =
   30.5096
       ...
    9.5968
bint =
  -48.6394  109.6587
       ...    8.0655
    7.9854   11.2082
[Figure: "Fitting a polynomial to data." Scatter of the simulated points, the true curve y = 10x² − 3x + 6 (hidden from the fit), and the best fit quadratic y = β̂2x² + β̂1x + β̂0.]
> # Generate data with known x
> # but random errors in y
> x = -10:10;
> n = length(x);
> err = rnorm(n, 0, 100);
> y = 10*x^2 - 3*x + 6 + err;
> # Fit to y = b0 + b1*x + b2*x^2
> # intercept b0 is implied:
> bestfit = lm(y ~ I(x) + I(x^2));
> coefficients(bestfit)
(Intercept)        I(x)      I(x^2)
 30.5096087         ...   9.5968040
> confint(bestfit)
                 2.5 %     97.5 %
(Intercept) -48.639445 109.658662
I(x)               ...   8.065507
I(x^2)        7.985427  11.208181