Theory of Regression Analysis with Applications
T Padma Ragaleena


SLIDE 1

Theory of Regression Analysis with Applications

T Padma Ragaleena

National Institute of Science Education and Research Bhubaneswar

20 November 2019


SLIDE 2

Multiple-linear regression model


SLIDE 3

Regression model

Response | Regressor 1 | Regressor 2 | ... | Regressor k
y        | x1          | x2          | ... | xk
y1       | x11         | x12         | ... | x1k
y2       | x21         | x22         | ... | x2k
...      | ...         | ...         | ... | ...
yn       | xn1         | xn2         | ... | xnk

$Y = X\beta + \epsilon$ where $\epsilon \sim N(0, \sigma^2 I)$

We also assume: $\mathrm{cov}(\epsilon_i, \epsilon_j) = 0$ for all $i \neq j$.

Y is a random vector; the $x_i$'s are not random and they are known with negligible error.

We assume the existence of at least an approximate linear relationship between the response variable and the regressors.
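As a concrete illustration (not from the slides), here is a minimal R sketch that simulates data satisfying these assumptions and fits the model with lm(); all names are illustrative.

set.seed(1)
n <- 100; k <- 3
X <- matrix(rnorm(n * k), n, k)   # regressors, treated as fixed
beta <- c(2, 1, -1, 0.5)          # true intercept and slopes
eps <- rnorm(n, sd = 1)           # epsilon ~ N(0, sigma^2 I), uncorrelated
y <- beta[1] + X %*% beta[-1] + eps
fit <- lm(y ~ X)                  # multiple linear regression
coef(fit)                         # estimates should be close to beta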


SLIDE 4

Why are we assuming a distribution for ε? To get p-values and confidence intervals for quantities of interest (hypothesis testing).

Why did we choose the normal distribution? It describes random errors in real-world processes reasonably well, and there is a well-developed mathematical theory behind the normal distribution.

Are non-normal distributions useful? Yes: in financial models, errors are assumed to come from a heavy-tailed distribution; the normal distribution is not suitable there.


SLIDE 5

Least Square Estimates

How do we estimate β? $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ measures the amount of deviation of the predicted values from the true values. One way to get a "good estimate" for β is to minimize the SSE. So we minimize S(β) = (y − Xβ)′(y − Xβ) with respect to β and call the minimizing vector the Least Square Estimate (LSE) for the model, denoted by β̂.

In order to find the β which minimizes S(β), we use the following property of Hilbert spaces:

Closest point theorem: Let M be a closed convex subset of a Hilbert space H and let x ∈ H. Then there exists a unique y0 ∈ M such that ||x − y0|| ≤ ||x − m|| for all m ∈ M. Moreover, when M is a closed subspace, y0 − x ∈ M⊥.

Using this theorem, we get:

$\hat{\beta} = (X'X)^{-1}X'Y$ = Least Square Estimate
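A quick numerical check (an illustrative sketch, not from the slides) that the closed form matches lm():

set.seed(1)
n <- 100; X <- matrix(rnorm(n * 3), n, 3)
y <- 2 + X %*% c(1, -1, 0.5) + rnorm(n)
X1 <- cbind(1, X)                              # design matrix with intercept column
beta_hat <- solve(t(X1) %*% X1, t(X1) %*% y)   # (X'X)^{-1} X'y
cbind(beta_hat, coef(lm(y ~ X)))               # the two columns agree up to rounding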


SLIDE 6

Least square estimates

In Hilbert spaces, y0 is called the projection of x onto the subspace M. Similarly, H = X(X′X)⁻¹X′ is called the projection (hat) matrix because ŷ = Hy.

For Hilbert spaces, we know that the projection map defined as P(x) = y0 is idempotent. Here also, H is idempotent, i.e. H² = H.
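A small sketch (illustrative names) verifying both properties numerically:

set.seed(1)
X1 <- cbind(1, matrix(rnorm(60), 20, 3))
y  <- rnorm(20)
H  <- X1 %*% solve(t(X1) %*% X1) %*% t(X1)   # hat / projection matrix
max(abs(H %*% H - H))                        # ~ 0: H is idempotent
max(abs(H %*% y - fitted(lm(y ~ X1 - 1))))   # ~ 0: Hy gives the fitted values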


SLIDE 7

Properties of least square estimates (LSE)

The LSE is an unbiased estimate of β, and β̂ is the maximum likelihood estimator for β. Least square estimators are Best Linear Unbiased Estimators, i.e. BLUE (Gauss-Markov theorem).

Gauss-Markov theorem: Let Y = Xβ + ε be a regression model such that each ε_i follows a distribution with mean 0 and variance σ², and cov(ε_i, ε_j) = 0 for i ≠ j. Then the LSE are Best Linear Unbiased Estimators.

Observe that no normality is assumed for the errors. "β̂ is best" means Var(a′β̂) ≤ Var(a′β̃) for all a ∈ R^p, where β̃ is any other linear unbiased estimate.
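An empirical sanity check of unbiasedness (an illustrative simulation, not from the slides):

set.seed(2)
n <- 60; x <- rnorm(n)
est <- replicate(2000, {
  y <- 1 + 2 * x + rnorm(n)   # mean-zero, constant-variance errors
  coef(lm(y ~ x))
})
rowMeans(est)                 # approximately (1, 2): the LSE is unbiased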


SLIDE 8

Coefficient of determination

$\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 + \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$, i.e. SST = SSRes + SSR.

SST measures the total variation of the y_i's around ȳ. SSRes measures the variation that could not be explained by the model. SSR is the variation that can be explained by the model.

Then $R^2 = 1 - \frac{SSRes}{SST} \in [0, 1]$ gives the proportion of variation in the y_i that could be explained by the model.
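A short sketch (illustrative data) checking the decomposition and R² by hand:

set.seed(3)
x <- rnorm(40); y <- 1 + 0.8 * x + rnorm(40)
fit   <- lm(y ~ x)
SST   <- sum((y - mean(y))^2)
SSRes <- sum(resid(fit)^2)
SSR   <- sum((fitted(fit) - mean(y))^2)
all.equal(SST, SSRes + SSR)   # TRUE: the decomposition holds
1 - SSRes / SST               # R^2 by hand ...
summary(fit)$r.squared        # ... matches lm's value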


SLIDE 9

Coefficient of determination

Consider the data containing the temperature (x variable) and the log of the light intensity (y variable) of 47 stars in the star cluster CYG OB1 (the CYGOB1 data from the HSAUR package):

data("CYGOB1")
model1 <- lm(logli ~ logst, data = CYGOB1)
summary(model1)$r.squared
0.04427374

Regression captures only 4.4% of the variation. This is not a good model.


SLIDE 10

Tests of significance

H0 : βj = 0 for all j, against H1 : at least one βj ≠ 0, tests whether there exists any linear relationship between the response and the predictors.

Test statistic: under the null hypothesis,

$F^* = \frac{SSR/k}{SSRes/(n-p)} \sim F_{k,\,n-p}$

Under a level of significance α, we have enough evidence to reject H0 in favour of H1 if F* ≥ F_{α; k, n−p}, or reject the null hypothesis in favour of H1 if the

p-value ≤ α
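For illustration (using the LifeCycleSavings model that appears later in this deck), the overall F statistic reported by summary() can be turned into a p-value directly:

g  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
fs <- summary(g)$fstatistic                   # F*, k, and n - p
fs
pf(fs[1], fs[2], fs[3], lower.tail = FALSE)   # p-value for H0: all slopes are zero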


SLIDE 11

Tests of significance

Once we know that the previous null hypothesis is rejected, our next aim is to find out which coefficients βj are non-zero.

H0 : βj = 0 against H1 : βj ≠ 0.

Test statistic: under the null hypothesis,

$t^* = \frac{\hat{\beta}_j}{\sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_j)}} \sim t_{n-k-1}$

Under a level of significance α, we have enough evidence to reject H0 in favour of H1 if |t*| ≥ t_{α/2; n−k−1}, or reject the null hypothesis in favour of H1 if the

p-value ≤ α
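In R these per-coefficient t tests come with the fit (illustrated again on the LifeCycleSavings model):

g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
round(summary(g)$coefficients, 4)   # Estimate, Std. Error, t value, Pr(>|t|)
confint(g)                          # 95% confidence intervals for each beta_j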


SLIDE 12

Tests of significance

A more general hypothesis is to test r linearly independent hypotheses, i.e. H0 : a_{i0}β0 + a_{i1}β1 + ... + a_{ik}βk = b_i for all i = 1, 2, ..., r. In other words, the hypothesis we want to test is H0 : Aβ = b, where A is a known r × p matrix of full row rank.

Test statistic: under the null hypothesis,

$F^* = \frac{(A\hat{\beta} - b)'\,(A(X'X)^{-1}A')^{-1}\,(A\hat{\beta} - b)}{r\,\hat{\sigma}^2} \sim F_{r,\,n-p}$

Under a level of significance α, we have enough evidence to reject H0 in favour of H1 if F* ≥ F_{α; r, n−p}, or reject the null hypothesis in favour of H1 if the

p-value ≤ α
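One way to run such a test in R (a sketch, assuming the car package) is linearHypothesis(), here testing the r = 2 restrictions that the two population-structure coefficients are both zero:

library(car)   # provides linearHypothesis()
g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
linearHypothesis(g, c("pop15 = 0", "pop75 = 0"))   # F test of A beta = b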


SLIDE 13

Regression Diagnostics

Our aim is to check whether our model satisfies the regression assumptions; a few remedies are suggested when the assumptions are violated. The validity of these assumptions is needed for the results to be meaningful: if they are violated, the results can be incorrect or misleading. So the underlying assumptions have to be verified before attempting regression modeling.


SLIDE 14

Residuals

Residuals e_i = y_i − ŷ_i can be thought of as realizations of the error terms. Thus any departure from the assumptions on the errors should show up in the residuals. We can show that e = (I − H)ε, hence Var(e) = σ²(I − H). So even though the errors ε_i are assumed to be uncorrelated and independent, the residuals e_i are correlated and hence dependent.
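A quick numerical check (illustrative data) that the residuals are a linear image of the data through I − H:

set.seed(4)
n <- 30; x <- rnorm(n); y <- 1 + x + rnorm(n)
X1 <- cbind(1, x)
H  <- X1 %*% solve(t(X1) %*% X1) %*% t(X1)
max(abs(resid(lm(y ~ x)) - (diag(n) - H) %*% y))   # ~ 0: e = (I - H) y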


SLIDE 15

Normality assumption

The Q-Q plot is a graphical tool used to assess normality. It plots the theoretical quantiles (horizontal axis) against the sample quantiles (vertical axis). Using the residual values e_i, an empirical distribution is constructed, from which we get the sample quantiles.

If X is a discrete random variable, then ξ_p is called the pth quantile of X if P(X ≤ ξ_p) ≥ p and P(X ≥ ξ_p) ≥ 1 − p. If X is a continuous random variable, then the pth quantile is the unique ξ_p such that P(X ≤ ξ_p) = p.


SLIDE 16

Q-Q plot

Here we want to check whether the residuals e_i come from a normal distribution. Given the residual values, we can estimate the cdf from which these points have come as

$\hat{F}(x) = \frac{1}{n}\sum_{i=1}^{n} I(e_i \le x)$

If e_(1) ≤ e_(2) ≤ ... ≤ e_(n), then e_(i) will be the (i/n)th quantile.

Plot $\hat{F}^{-1}(i/n) = e_{(i)}$ against $\Phi^{-1}(i/n)$.

If the normality assumption is followed, then the plot has to be an approximate y = x line.
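A hand-rolled Q-Q plot next to the built-in one (a sketch; the (i − 0.5)/n plotting positions avoid Φ⁻¹(1) = ∞):

g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
e <- sort(rstandard(g))                 # ordered standardized residuals
n <- length(e)
theo <- qnorm((seq_len(n) - 0.5) / n)   # theoretical normal quantiles
plot(theo, e, xlab = "Theoretical quantiles", ylab = "Sample quantiles")
abline(0, 1)                            # points near y = x => normality plausible
qqnorm(rstandard(g)); qqline(rstandard(g))   # built-in equivalent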


SLIDE 17

Normal Q-Q plot


SLIDE 18

Non-normal Q-Q plot


SLIDE 19

Data Example

Consider the "LifeCycleSavings" data set in R. This is a model proposed by Franco Modigliani to estimate the savings ratio of a country.

g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

"sr" is the savings ratio; "pop15" is the percentage of people under 15; "pop75" is the percentage of people over 75; "dpi" is per capita disposable income; "ddpi" is the percentage growth rate of dpi.


SLIDE 20

Q-Q plot


SLIDE 21

Kolmogorov-Smirnov Test

We should not rely on graphical tools alone to draw conclusions. A formal test of the normality assumption is the Kolmogorov-Smirnov test.

Suppose X1, X2, ..., Xn are assumed to come from a known continuous distribution P. We want to test the null hypothesis H0: the samples come from P, against H1: they do not come from P. Let Fexp be the cdf associated with the null hypothesis, and let the empirical distribution function Fobs be given by

$F_{obs}(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$

The test statistic is $D = \sup_x |F_{exp}(x) - F_{obs}(x)|$.
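Presumably the call behind the output on the next slide is something like the following sketch, testing the studentized residuals against a standard normal:

g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
ks.test(as.numeric(rstudent(g)), "pnorm")   # D and the two-sided p-value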


SLIDE 22

Kolmogorov-Smirnov test

One-sample Kolmogorov-Smirnov test
data: as.numeric(rstudent(g))
D = 0.067991, p-value = 0.9628
alternative hypothesis: two-sided

The very high p-value gives no evidence against the normality assumption. Other tests, such as the Anderson-Darling test and the Shapiro-Wilk test, also exist to check the normality assumption.


SLIDE 23

Constant Variance assumption

Fitted values vs Residuals for data from standard normal distribution


SLIDE 24

Constant Variance assumption

Fitted values vs Residuals for data from normal distribution with non-constant variance


SLIDE 25

Constant Variance assumption

Fitted values vs Residuals for LifeCycleSavings dataset
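This plot can be reproduced with a couple of lines (a sketch):

g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
plot(fitted(g), resid(g), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)   # roughly constant vertical spread supports constant variance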


SLIDE 26

Linearity Assumption

We can check the linearity assumption using the lack-of-fit test. In order to apply this test we should make sure that all the other assumptions hold and only linearity is being questioned.

Requirement: more than one observation of the response at (at least some) values of x, i.e. x_i ⇒ y_{i1}, ..., y_{i n_i}.

$\sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij} - \hat{y}_i)^2 = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 + \sum_{i=1}^{m} n_i(\bar{y}_i - \hat{y}_i)^2$

i.e. SSRes = SSPE + SSLOF.

If the true regression function is linear:

$\frac{SSLOF/(m-2)}{SSPE/(n-m)} \sim F_{m-2,\,n-m}$


SLIDE 27

Box-Cox transformation

The Box-Cox transformation is used to correct the normality assumption when it is violated. Consider the "gala" data in R (from the faraway package).


SLIDE 28

Box-Cox transformation

Find the λ that maximizes the (profile) likelihood.
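A sketch of how such a plot is produced; the model formula is an assumption based on the standard gala example from the faraway book:

library(faraway); library(MASS)   # gala data; boxcox()
gfit <- lm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent, data = gala)
bc <- boxcox(gfit)                # plots the profile log-likelihood over lambda
bc$x[which.max(bc$y)]             # maximizing lambda, near 1/3 for these data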


SLIDE 29

Box-Cox transformation

After applying a cube root transformation


SLIDE 30

Box-Cox method

One-sample Kolmogorov-Smirnov test
data: as.numeric(rstudent(gfit3))
D = 0.093249, p-value = 0.935
alternative hypothesis: two-sided

Hence our transformation of the response was very useful: the high p-value gives no evidence against normality.


SLIDE 31

Variance stabilizing transformations

One of the common reasons for the violation of constant variance is that the response variable follows a distribution in which the variance is a function of the mean, i.e. σ² = ω(µ).

AIM: we wish to find a function f such that Var(f(Y)) is roughly constant, i.e. we "transform" the response variable. By a first-order Taylor expansion,

f(Y) ≈ f(µ) + (Y − µ)f′(µ)  ⇒  [f(Y) − f(µ)]² ≈ (Y − µ)²[f′(µ)]²

hence Var(f(Y)) ≈ Var(Y) × [f′(µ)]² = ω(µ)[f′(µ)]². Choosing f so that f′(µ) ∝ 1/√ω(µ) makes this approximately constant.
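For example (an illustrative simulation): for Poisson data ω(µ) = µ, so f(y) = √y has f′(µ) = 1/(2√µ) and should roughly stabilize the variance:

set.seed(6)
mu   <- c(2, 10, 50)
raw  <- sapply(mu, function(m) var(rpois(1e5, m)))        # grows with mu
stab <- sapply(mu, function(m) var(sqrt(rpois(1e5, m))))  # ~ 0.25 throughout
rbind(raw, stab)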


SLIDE 32

Multicollinearity

The problem of multicollinearity is said to exist when two or more regressor variables are strongly correlated; in other words, when the columns of X exhibit near linear dependencies. In the case of perfect multicollinearity, X′X is not invertible, and the least squares estimate is no longer unique.


SLIDE 33

Least Square Estimates

How do we estimate β? $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ measures the amount of deviation of the predicted values from the true values. One way to get a "good estimate" for β is to minimize the SSE. So we minimize S(β) = (y − Xβ)′(y − Xβ) with respect to β and call the minimizing vector the Least Square Estimate (LSE) for the model, denoted by β̂.

In order to find the β which minimizes S(β), we use the following property of Hilbert spaces:

Closest point theorem: Let M be a closed convex subset of a Hilbert space H and let x ∈ H. Then there exists a unique y0 ∈ M such that ||x − y0|| ≤ ||x − m|| for all m ∈ M. Moreover, when M is a closed subspace, y0 − x ∈ M⊥.

Using this theorem, we get:

$\hat{\beta} = (X'X)^{-1}X'Y$ = Least Square Estimate


SLIDE 34

Problem with multi-collinearity

We can show that Var(β̂j) = C_{jj}σ², where C_{jj} is the jth diagonal element of (X′X)⁻¹. In this case,

$C_{jj} = \frac{1}{1 - R_j^2}$

where R_j² is the coefficient of determination when we regress x_j on the remaining predictors. If multicollinearity exists, Var(β̂j) → ∞ as R_j² → 1. This would mean that our estimates are unreliable.


SLIDE 35

Variance Inflation Factors (VIF)

A VIF exists for each of the predictors in a multiple regression model. The VIF for the jth predictor is given by

$VIF_j = \frac{1}{1 - R_j^2}$

Rule of thumb: VIF > 4 warrants further investigation; VIF > 10 indicates serious multicollinearity.

The following are the VIF values when BP (blood pressure) is regressed on BSA (body surface area) and weight:

data$Weight  data$BSA
   4.276401  4.276401

We see some evidence of multicollinearity, but we need more evidence to confirm.
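A sketch of the computation; the 'bloodpress' data frame and its columns are hypothetical stand-ins matching the slide's example, and car::vif does the work:

library(car)                                      # provides vif()
fit <- lm(BP ~ Weight + BSA, data = bloodpress)   # 'bloodpress' is hypothetical
vif(fit)
# With two predictors both VIFs coincide: 1 / (1 - r^2), r = cor(Weight, BSA)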


SLIDE 36

Ridge Regression

The ridge coefficients minimize a penalized residual sum of squares:

$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{k} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{k} \beta_j^2$

is minimized with respect to β. This is equivalent to minimizing $\sum_{i=1}^{n}\big(y_i - \beta_0 - \sum_{j=1}^{k} x_{ij}\beta_j\big)^2$ subject to $\sum_{j=1}^{k} \beta_j^2 < s$.

$\hat{\beta}_{ridge} = (X'X + \lambda I)^{-1}X'Y$
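A minimal sketch of the closed form on simulated, centered and scaled data (λ = 2 is arbitrary; scaling conventions differ from MASS::lm.ridge, so coefficients need not match it exactly):

set.seed(7)
n <- 50
X <- scale(matrix(rnorm(n * 3), n, 3))   # centered and scaled regressors
y <- X %*% c(1, 2, 0) + rnorm(n)
lambda <- 2
solve(t(X) %*% X + lambda * diag(3), t(X) %*% (y - mean(y)))   # ridge estimate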


SLIDE 37

Data example

Consider the "meatspec" data in R from the faraway package.

modified HKB estimator is 2.363535e-08
modified L-W estimator is 0.907997
smallest value of GCV at 3.25e-08

So the value of λ obtained is 3.25e-08.
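Presumably this output comes from MASS::lm.ridge followed by select(); a sketch (the λ grid is an assumption):

library(MASS); library(faraway)   # lm.ridge()/select(); meatspec data
rfit <- lm.ridge(fat ~ ., data = meatspec, lambda = seq(0, 1e-7, length.out = 41))
select(rfit)   # prints the HKB, L-W, and smallest-GCV choices of lambda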


SLIDE 38

Ridge trace
