SLIDE 1

Least Squares Regression

October 30, 2019


SLIDE 2

Finding the Best Line

We want a line with small residuals, so it might make sense to try to minimize

\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - \hat{y}_i)

...but this will give us very large negative residuals!

Section 8.2 October 30, 2019 2 / 22

SLIDE 3

Finding the Best Line

As with the standard deviation, we will use squares to shift the focus to magnitude:

\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

SLIDE 4

Finding the Best Line

\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2

The values of b_0 and b_1 that minimize this will make up our regression line. This is called the Least Squares Criterion.

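The criterion can be checked numerically. A quick sketch in Python (the lecture itself works in R, and the data below are invented purely for illustration): the least squares line should give a smaller sum of squared residuals than any other candidate line.

```python
def sse(b0, b1, xs, ys):
    # Sum of squared residuals for the candidate line yhat = b0 + b1 * x
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Invented data with a roughly linear trend
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

near_best = sse(0.1, 1.95, xs, ys)   # a line close to the least squares fit
worse = sse(0.0, 1.5, xs, ys)        # a clearly worse slope
print(near_best < worse)             # True
```

Minimizing `sse` over all (b0, b1) pairs is exactly the Least Squares Criterion stated above.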

SLIDE 5

Finding the Best Line

To fit a least squares regression, we require

  • Linearity. The data should show a linear trend.
  • Nearly normal residuals. The residuals should be well-approximated by a normal distribution.
  • Constant variability. As we move along x, the variability around the regression line should stay constant.
  • Independent observations. This will apply to random samples.

SLIDE 6

Finding the Least Squares Line

We want to estimate β0 and β1 in the equation y = β0 + β1x + ε by minimizing

\sum_{i=1}^{n} (y_i - \hat{y}_i)^2

SLIDE 7

Finding the Least Squares Line

This turns out to be remarkably straightforward! The slope can be estimated as

b_1 = \frac{s_y}{s_x} R

and the intercept by

b_0 = \bar{y} - b_1 \bar{x}

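These two formulas are easy to implement directly. A sketch in Python (the lecture uses R; the function name here is my own), verified on data that lie exactly on a line:

```python
import statistics as stats

def fit_line(xs, ys):
    """Least squares fit via b1 = (sy/sx) * R and b0 = ybar - b1 * xbar."""
    n = len(xs)
    xbar, ybar = stats.mean(xs), stats.mean(ys)
    sx, sy = stats.stdev(xs), stats.stdev(ys)
    # Pearson correlation coefficient R, computed from its definition
    r = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b1 = (sy / sx) * r
    b0 = ybar - b1 * xbar
    return b0, b1

# Data lying exactly on y = 1 + 2x should recover b0 = 1, b1 = 2
b0, b1 = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```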

SLIDE 8

Example

The faithful dataset in R has two measurements taken for the Old Faithful Geyser in Yellowstone National Park:

  • eruptions: the length of each eruption
  • waiting: the time between eruptions

Each is measured in minutes.

SLIDE 9

Example

We want to see if we can use the wait time to predict eruption duration.

SLIDE 10

Example

The sample statistics for these data are

            waiting        eruptions
  mean      x̄ = 70.90     ȳ = 3.49
  sd        sx = 13.60     sy = 1.14
  R = 0.90

Find the linear regression line and interpret the parameter estimates.

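Plugging the summary statistics into the two formulas works out as follows (a Python sketch mirroring the hand calculation; the numbers come straight from the table above):

```python
xbar, ybar = 70.90, 3.49    # means of waiting (x) and eruptions (y)
sx, sy = 13.60, 1.14        # sample standard deviations
R = 0.90                    # correlation coefficient

b1 = (sy / sx) * R          # slope
b0 = ybar - b1 * xbar       # intercept

# b1 ≈ 0.0754: each extra minute of waiting predicts about 0.075 more
# minutes of eruption. b0 ≈ -1.86, which matches R's lm() output up to
# rounding of the summary statistics.
```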

SLIDE 11

Example

SLIDE 12

Hypothesis Testing in Linear Regression

Whenever we estimate a parameter, we want to use a hypothesis test to think about our confidence in that estimate. For βi (i = 0, 1):

  H0 : βi = 0
  HA : βi ≠ 0

We will do this using a one-sample t-test.

SLIDE 13

Example

If we use R to get the coefficients for our faithful data, we get

                 Estimate     Std. Error   t value   Pr(>|t|)
  (Intercept)   -1.874016     0.160143     -11.70    <2e-16
  waiting        0.075628     0.002219      34.09    <2e-16

What does this tell us about our parameters?

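Each t value in the table is just the estimate divided by its standard error. Recomputing from the printed table entries (a quick sketch; the tiny discrepancy from the printed 34.09 arises because R divides before rounding):

```python
est_b0, se_b0 = -1.874016, 0.160143   # intercept row of the R output
est_b1, se_b1 = 0.075628, 0.002219    # waiting row of the R output

t_b0 = est_b0 / se_b0   # ≈ -11.70
t_b1 = est_b1 / se_b1   # ≈ 34.08 (table prints 34.09)
```

Both t values are enormous in magnitude, which is why both p-values fall below 2e-16: we reject H0 : βi = 0 for both coefficients.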

SLIDE 14

Extrapolation

When we make predictions, we simply plug in values of x to estimate values of y. However, this has limitations! We don’t know how the data outside of our limited window will behave.

SLIDE 15

Extrapolation

Applying a model estimate for values outside of the data’s range for x is called extrapolation. The linear model is only an approximation. We don’t know anything about the relationship outside of the scope of our data. Extrapolation assumes that the linear relationship holds in places where it has not been analyzed.

SLIDE 16

Extrapolation

SLIDE 17

Example

In these data, waiting times range from 43 minutes to 96 minutes. Let's predict

  • eruption time for a 50 minute wait.
  • eruption time for a 10 minute wait.

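With the fitted coefficients from the R output, the two predictions work out as follows (a Python sketch):

```python
b0, b1 = -1.874016, 0.075628    # fitted intercept and slope from the R output

def predict(waiting):
    return b0 + b1 * waiting

yhat_50 = predict(50)   # ≈ 1.91 minutes; 50 lies inside the observed range (43 to 96)
yhat_10 = predict(10)   # ≈ -1.12 minutes: a negative eruption length!
```

The second prediction is extrapolation: 10 minutes is far below any observed waiting time, and the model happily returns an impossible negative duration.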

SLIDE 18

Using R² to Describe Strength of Fit

We’ve evaluated the strength of a linear relationship between two variables using the correlation coefficient R. However, it is also common to use R². This helps describe how closely the data cluster around a linear fit.

SLIDE 19

Using R² to Describe Strength of Fit

Suppose R² = 0.62 for a linear model. Then we would say:

  About 62% of the data’s variability is accounted for using the linear model.

And yes, R² is the square of the correlation coefficient R!

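For the geyser model itself, the summary statistics gave R = 0.90, so the corresponding check is one line of arithmetic:

```python
R = 0.90      # correlation between waiting and eruptions
R2 = R ** 2   # 0.81: about 81% of the variability in eruption length
              # is accounted for by the linear model on waiting time
```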

SLIDE 20

Example

Interpret the R² value for this model. What else can we learn from the R output?

SLIDE 21

Regression Example

This is the residual plot for the geyser regression. Do you see any problems?

SLIDE 22

Regression Example

This is a histogram of the residuals. Do they look normally distributed?
