Linear Models Overview: Topic Introduction & Justification (PowerPoint PPT presentation)
1. Linear Models Overview: Topic Introduction & Justification

Outline
• Introduction & Model Assumptions
• Matrix Representation
• Coefficient Estimation
• Least Squares
• Properties
• Analysis of Variance
• Performance Coefficients
• Statistical Inferences
• Prediction Errors

We are discussing linear models for the following reasons:
• In many practical situations, the assumptions underlying linear models do apply
• The theory and methods are very rich
• These simple models often work reasonably well on difficult problems
• They demonstrate the broad range of information that can be obtained from a model; we would like similar information from nonlinear models
• Linear models form the core of many nonlinear models
• "Many" relationships in data are simple (monotonic)

J. McNames, Portland State University, ECE 4/557, Linear Models, Ver. 1.27

Notation
• The data set is of fixed size n
• Consider the i-th set of input-output points in the data set
• Input variables
  – The inputs of the i-th set can be collected into a vector x_i
  – x_i is a column vector with p − 1 elements, x_i ∈ R^(p−1)
  – Thus, there are p − 1 model inputs
  – Individual inputs of a vector of inputs will be written as x_{i,j}
• Notation for vectors is ambiguous
  – Sometimes x_i will represent a column vector of all the model inputs in the i-th input-output pair in the data set
  – Other times x_i will represent the i-th point (scalar) in the input vector x
  – This may also be denoted x_{·,i}
  – The distinction will be clear from context and the use of boldface for vectors and matrices

Notation Continued
• Output variables
  – Assume a single model output
  – This does not reduce the generality: we could create a separate linear model for each output
  – There is a more computationally efficient method (see me if interested)
  – y_i: output of the i-th set of input-output points
• We will also use ambiguous notation for outputs
  – Sometimes y is a vector of all the outputs in the data set
  – Other times y is a scalar output due to a single input vector x
  – We will try to keep this clear through context and the use of boldface for vectors and matrices
• Note that in this set of notes, random variables are not represented with capital letters (e.g., y, ε)
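The notation above can be made concrete with a small numerical sketch. This is an illustrative NumPy example; the sizes and variable names are hypothetical, not from the notes:

```python
import numpy as np

# Illustrative (made-up) sizes: n = 5 input-output pairs,
# p - 1 = 3 model inputs per pair.
n, p_minus_1 = 5, 3
rng = np.random.default_rng(42)

# Row i of X holds the inputs of the i-th pair, so X[i] plays the role
# of the vector x_i (p - 1 elements) and X[i, j] the role of x_{i,j}.
X = rng.standard_normal((n, p_minus_1))
y = rng.standard_normal(n)  # y[i] plays the role of the scalar output y_i

print(X[1].shape)  # (3,)  -- each x_i lies in R^{p-1}
print(y.shape)     # (5,)  -- one scalar output per input-output pair
```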

2. Statistical Model for the Data: Assumptions

Statistical model:
    y = ω_0 + Σ_{j=1}^{p−1} ω_j x_j + ε

With linear models, we make the following key assumptions:
• The output variable is related to the input variables by the linear relationship shown above
• All of the inputs x_{i,j} are known exactly (non-random)
  – Fortunately, the optimal estimator for the random-inputs case is the same
• ε is a random error term
  – ε_i has zero mean: E[ε_i] = 0
  – ε has constant variance: E[ε_i²] = E[ε_k²] = σ_ε² for all i, k
  – ε_i and ε_k are uncorrelated for i ≠ k
  – We will sometimes assume ε_i ∼ N(0, σ_ε²) for all i

Linear Model Comments

Statistical model:    y = ω_0 + Σ_{j=1}^{p−1} ω_j x_j + ε
Regression model:     ŷ = w_0 + Σ_{j=1}^{p−1} w_j x_j

• The p model parameters (ω_j) are unknown
• Our goal: estimate y given an input vector x and a data set
• The noise term ε is independent of x and therefore unpredictable
• E[y] = ω_0 + Σ_{j=1}^{p−1} ω_j x_j
• Our estimate (the regression model) will be of the same form
• If w_j = ω_j, then ŷ = E[y]
• E[y] is the optimal model in the sense of minimum MSE
• Our goal is therefore equivalent to estimating the model parameters ω_j

Where Does ε Come From?

[Diagram: an observed process with observed input variables x_1, …, x_{p−1}, unobserved variables z_1, …, z_{n_z}, and observed output y]

• In general, y = f(x_1, …, x_{p−1}, z_1, …, z_{n_z})
• We only have access to x_1, …, x_{p−1}
• For this topic, we are assuming that the output has the following relationship:
      y = ω_0 + Σ_{j=1}^{p−1} ω_j x_j + f_z(z_1, …, z_{n_z})
• Here ε = f_z(z_1, …, z_{n_z}) accounts for the effect of all these unknown variables
• If ε is of the form ε = Σ_{j=1}^{n_z} α_j z_j, the central limit theorem applies and ε ∼ N(0, σ_ε²)

Linear Model Performance Limits

    y = ω_0 + Σ_{j=1}^{p−1} ω_j x_j + ε        ŷ = w_0 + Σ_{j=1}^{p−1} w_j x_j

• Even in the best-case scenario w_j = ω_j, our model will not be perfect: y − E[y; x] = y − ŷ = ε
• The random component of y is not predictable
• ε represents the net contribution of the unmeasured effects on the output y
• ε is unpredictable given the data set and the model inputs
• Since ε is the only random variable in the expression for y, it is easy to show that var(y) = σ_ε²
• Recall that var(c + X) = var(X), where X is a random variable and c is a constant
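The two facts E[y] = ω_0 + Σ_j ω_j x_j and var(y) = σ_ε² can be checked numerically for a fixed input vector. This is a minimal sketch with made-up parameter values (not from the notes):

```python
import numpy as np

# Hypothetical parameters omega and a fixed (non-random) input x.
rng = np.random.default_rng(0)
w0 = -1.0
w = np.array([2.0, 0.5])
x = np.array([0.3, -0.7])
sigma_eps = 0.4

# Draw many realizations of y = omega_0 + sum_j omega_j x_j + eps,
# with eps ~ N(0, sigma_eps^2).
eps = rng.normal(0.0, sigma_eps, size=200_000)
y = w0 + w @ x + eps

print(round(y.mean(), 2))  # -0.75  (= -1 + 2*0.3 + 0.5*(-0.7) = E[y])
print(round(y.var(), 2))   # 0.16   (= sigma_eps^2, since var(c + X) = var(X))
```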

3. Geometric Interpretation

• The process parameter ω_i represents how much influence the (scalar) model input x_i has on the process output y:
      ∂y/∂x_i = ω_i
• This is useful information that many nonlinear models lack
• Geometrically, ŷ = f(x) represents a hyperplane
• This is also called the response surface
• For a single input, the response surface is a line

Example 1: Linear Surface

[Figure: "1-Dimensional Response Surface Example" — scatter plot of the data set with the least-squares approximation and the true model drawn as lines; Output y versus Input x]

Example 1: MATLAB Code

function [] = LinearSurface();
close all;
FigureSet(1,4.5,2.8);            % FigureSet/AxisSet appear to be course helper functions (not built-in MATLAB)
N = 20;
x = rand(N,1);
y = 2*x - 1 + randn(N,1);        % true model plus unit-variance noise
xh = 0:0.01:1;
A = [ones(N,1) x];               % design matrix [1 x]
b = y;
w = pinv(A)*b;                   % least-squares solution
yh = w(1) + w(2)*xh;             % y_hat
yt = -1 + 2*xh;                  % y_true
h = plot(x,y,'ko',xh,yh,'b',xh,yt,'r');
set(h(1),'MarkerSize',2);
set(h(1),'MarkerFaceColor','k');
set(h(1),'MarkerEdgeColor','k');
set(h,'LineWidth',1.5);
xlabel('Input x');
ylabel('Output y');
title('1-Dimensional Response Surface Example');
set(gca,'Box','Off');
grid on;
AxisSet(8);
legend('Data Set','Approximation','True Model',2);
print -depsc LinearSurface;

Practical Application

• Often the assumptions we make will be invalid
• People (engineers, researchers, etc.) use these models profusely anyway
• In most ways, linear models are robust
• They usually still generate reasonable fits even if the assumptions are incorrect
• Nonlinear models make fewer assumptions, but have other problems
• There is no perfect method for modeling
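For readers without MATLAB, the least-squares fit in the example above can be transcribed to NumPy. This is a sketch, not the course code; the sample size is enlarged (hypothetically) so the recovered weights sit visibly close to the true values:

```python
import numpy as np

# Fit the line y = w0 + w1*x by least squares via the pseudoinverse,
# mirroring w = pinv(A)*b in the MATLAB listing above.
rng = np.random.default_rng(7)
N = 5000                               # more points than the slide's 20 (illustrative choice)
x = rng.random(N)
y = 2*x - 1 + rng.standard_normal(N)   # true model: y = 2x - 1 + eps

A = np.column_stack([np.ones(N), x])   # design matrix [1 x]
w = np.linalg.pinv(A) @ y              # least-squares weights [w0, w1]
print(np.round(w, 1))                  # close to the true values [-1, 2]
```

The same weights come from `np.linalg.lstsq(A, y, rcond=None)`, which is the usual (and numerically preferred) route in NumPy; `pinv` is used here only to match the MATLAB code.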

4. Process Diagram & Model Diagram

[Diagram: the statistical model, with inputs x_1, …, x_{p−1} weighted by ω_1, …, ω_{p−1}, a bias ω_0, and additive noise ε summed to produce y; the regression model is drawn the same way with weights w_0, …, w_{p−1} summed to produce ŷ]

• Pseudo block-diagrams like these are often used to show the flow of computation in the statistical model or regression model
• The diagram above illustrates how the output of the statistical model is generated
• We can draw a similar diagram for the regression model
• We want w_j ≈ ω_j so that ŷ ≈ E[y; x]
• We cannot estimate ε from knowledge of x
• Our primary goal is to estimate y
• An equivalent goal is to estimate the parameters ω_i, if the statistical model is accurate

Nominal Data & Classification

• Linear models can be used with nominal data as well as interval data
• Use a simple transformation
• Example:
      x_i = 1 if male, 0 if female
• They can also be used for classification
• For example, if the data set outputs are encoded as
      y = 1 if success, 0 if failure
  then the decision rule for predicted outputs during application of the model could be:
      If ŷ < 0.5, declare failure. Otherwise, declare success.

Efficient Nominal Data Encoding

• Suppose the process output (or input) is one of four colors: red, blue, yellow, or green
• The best way to encode this is to declare a model output for each category:

      Category   y_1   y_2   y_3   y_4
      Red        +1    −1    −1    −1
      Blue       −1    +1    −1    −1
      Yellow     −1    −1    +1    −1
      Green      −1    −1    −1    +1

• This example requires constructing 4 models
• For new inputs, the output of the model will not be binary
• For example, y_1 = 0.6, y_2 = −1.1, y_3 = 0.8, y_4 = 0
• How do you choose what the final nominal output is?
• This encoding yields additional information about the inputs
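The four-output encoding above can be sketched end to end. This is an illustrative NumPy example: the toy data set, and the argmax decision rule used to pick the final nominal output, are my assumptions, not from the notes:

```python
import numpy as np

# Four categories, one +/-1 target column per category, one linear
# model per column (all four fit at once by least squares).
categories = ["Red", "Blue", "Yellow", "Green"]
rng = np.random.default_rng(3)

# Toy inputs: 50 two-dimensional points per category, clustered around
# four well-separated centers (made-up data for illustration).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((50, 2)) for c in centers])
labels = np.repeat(np.arange(4), 50)

# Targets as in the table: +1 for the point's own category, -1 otherwise.
Y = np.where(labels[:, None] == np.arange(4), 1.0, -1.0)

A = np.column_stack([np.ones(len(X)), X])   # design matrix [1 x1 x2]
W, *_ = np.linalg.lstsq(A, Y, rcond=None)   # columns of W: the 4 models

# For a new input the four outputs are not binary; one common decision
# rule (an assumption here) is to declare the category with the largest output.
x_new = np.array([1.0, 3.9, 0.2])           # [1, x1, x2], near the "Blue" center
print(categories[int(np.argmax(x_new @ W))])  # Blue
```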
