
Inference About a Future Value of Y



  1. ST 380 Probability and Statistics for the Physical Sciences (Simple Linear Regression: Prediction). Inference About a Future Value of Y. A regression model may be fitted to learn about the association of $Y$ and $x$, represented by $\beta_0$ and especially $\beta_1$. However, sometimes the intent is to make inferences about the likely values of $Y$ under new conditions. For example, we might want to learn about the distribution of $Y$ when pH = 7.5, which is not one of the values in the data set.

  2. In the regression model, when $x$ has some new value $x^*$, $E(Y) = \beta_0 + \beta_1 x^*$, so the natural estimator of $E(Y)$ is $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x^*$. We can show that $E(\hat{Y}) = \beta_0 + \beta_1 x^* = E(Y)$, so $\hat{Y}$ is an unbiased estimator of $E(Y)$. To construct confidence intervals for $E(Y)$, we need the standard error of $\hat{Y}$; the formula is known, but using software is simpler.
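
For reference, the slides do not print the standard error formula. The usual textbook form for simple linear regression is sketched below; the symbols $n$ (sample size), $\bar{x}$ (mean of the observed $x$ values), $S_{xx}$, and $s$ (residual standard error) are notation assumed here, not taken from the slides.

```latex
\[
  \widehat{\mathrm{se}}(\hat{Y})
    = s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}},
  \qquad
  S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 .
\]
```

The standard error is smallest when $x^* = \bar{x}$ and grows as $x^*$ moves away from the center of the observed data.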

  3. In R:

         # Fit the simple linear regression of Percent on pH
         arsenicLm <- lm(Percent ~ pH, arsenic)
         # Estimate E(Y) at pH = 7.5, with its standard error and confidence interval
         predict(arsenicLm, data.frame(pH = 7.5),
                 se.fit = TRUE, interval = "confidence")

     Output:

         $fit
                fit      lwr      upr
         1 55.01145 50.67454 59.34837

         $se.fit
         [1] 2.045806

         $df
         [1] 16

         $residual.scale
         [1] 6.125584

  4. In the R output, fit is $\hat{Y}$, and se.fit is its estimated standard error. lwr and upr are the endpoints of the confidence interval for $E(Y)$, by default the 95% confidence interval.
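
As a check on how these pieces fit together, the interval endpoints can be rebuilt from fit, se.fit, and df alone. This is a minimal sketch that assumes the usual t-based interval, $\hat{Y} \pm t_{0.975,\,df} \times$ se.fit; the numbers are copied from the output above, and the variable names are illustrative.

```r
# Values copied from the predict() output on the previous slide
fit    <- 55.01145   # point estimate of E(Y) at pH = 7.5
se_fit <- 2.045806   # estimated standard error of the fit
df     <- 16         # residual degrees of freedom

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df)

# Confidence interval for E(Y): fit +/- t_crit * se_fit
conf_int <- c(lwr = fit - t_crit * se_fit,
              upr = fit + t_crit * se_fit)
print(round(conf_int, 3))
```

The result should agree with lwr and upr in the output up to rounding.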

  5. Predicting the Future Value of Y. Note: $E(Y)$ is the expected value of $Y$ when $x = x^*$; in the example, it is the capability of the process to remove arsenic from water with a pH of $x^* = 7.5$. Sometimes we need to predict the observed value of $Y$ in a future experiment with $x = x^*$. Since $Y = E(Y) + \epsilon$ and $E(\epsilon) = 0$, the best predictor of $Y$ is still $\hat{Y}$.

  6. But $V(Y - \hat{Y}) = V\{[Y - E(Y)] + [E(Y) - \hat{Y}]\} = V[Y - E(Y)] + V[E(Y) - \hat{Y}] = \sigma^2 + V[\hat{Y}]$. The variances add because the future error $Y - E(Y) = \epsilon$ is independent of $\hat{Y}$, which depends only on the data already observed. The prediction interval for $Y$ is also centered at $\hat{Y}$, but is wider than the confidence interval.

  7. In R, the same predict() method is used, but with an option to make the interval appropriately wider:

         predict(arsenicLm, data.frame(pH = 7.5), interval = "prediction")

     Output:

                fit      lwr      upr
         1 55.01145 41.32072 68.70218

     Note that the prediction interval has a width of 27.4, whereas the confidence interval has a width of 8.7.
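
The wider interval follows from the variance decomposition $\sigma^2 + V[\hat{Y}]$. The sketch below assumes that $\sigma$ is estimated by residual.scale and $V[\hat{Y}]$ by se.fit squared, both copied from the earlier confidence-interval output; the variable names are illustrative.

```r
# Values copied from the earlier predict(..., se.fit = TRUE) output
fit    <- 55.01145   # point prediction at pH = 7.5
se_fit <- 2.045806   # estimated standard error of the fit
s      <- 6.125584   # residual.scale, the estimate of sigma
df     <- 16         # residual degrees of freedom

# Standard error of prediction: sqrt(sigma^2 + V[Y-hat]), with estimates plugged in
se_pred <- sqrt(s^2 + se_fit^2)

# 95% prediction interval: fit +/- t_crit * se_pred
t_crit   <- qt(0.975, df)
pred_int <- c(lwr = fit - t_crit * se_pred,
              upr = fit + t_crit * se_pred)
print(round(pred_int, 3))

# Width of the prediction interval
print(round(unname(pred_int["upr"] - pred_int["lwr"]), 1))
```

The endpoints should match the lwr and upr values reported with interval = "prediction" up to rounding.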
