Topic 5: Non-Linear Relationships and Non-Linear Least Squares - PowerPoint PPT Presentation



SLIDE 1

Non-linear Relationships

Many relationships between variables are non-linear. (Examples) OLS may not work (recall A.1); it may be biased and inconsistent. In other situations we may still be able to use OLS, either by approximating the non-linear relationship or by appropriately transforming the population model.


SLIDE 2

• The models we’ve worked with so far have been linear in the parameters.
• They’ve been of the form: y = Xβ + ε
• Many models based on economic theory are actually non-linear in the parameters.
• In general:

y = f(θ; X) + ε

where f is non-linear.
• Note the linear model is a special case.

SLIDE 3

Transforming a non-linear population model

Cobb-Douglas production function:

Y = A·K^(β2)·L^(β3)·ε

By taking logs, the Cobb-Douglas production function can be rewritten as:

log Y = β1 + β2 log K + β3 log L + log(ε)

where β1 = log A. This model now satisfies A.1 (linear in the parameters); however, it is not advisable to estimate it by OLS in most cases. Santos Silva and Tenreyro (2006)¹: if log(ε) is heteroskedastic (it likely is), X and ε are not independent!

¹ Santos Silva, J.M.C. and Tenreyro, S. (2006). “The Log of Gravity.” The Review of Economics and Statistics.

SLIDE 4

“It may be surprising that the pattern of heteroscedasticity … can affect the consistency of an estimator, rather than just its efficiency. The reason is that the nonlinear transformation … changes the properties of the error term in a nontrivial way.”

Approximations

Some mathematical properties may be exploited in order to approximate the function f(θ; X):
• Polynomials
• Logarithms
• Dummy variables

SLIDE 5

Polynomial Regression Model

One way to characterize the non-linear relationship between y and x is to say that the marginal effect of x on y depends on the value of x itself.
• Just include powers of the regressors on the right-hand side
• Not a violation of A.2
• e.g. y = β0 + β1x + β2x² + β3x³ + ⋯ + ε
• Take the derivative: dy/dx = β1 + 2β2x + 3β3x² + ⋯
• Choosing β approximates the non-linear function f
• The validity of the approximation is based on a Taylor-series expansion
• The appropriate order of the polynomial may be determined through a series of t-tests
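A minimal sketch of the idea (not from the slides; the data-generating function, sample size, and noise level are hypothetical choices for illustration): a polynomial approximation is fitted by ordinary least squares on powers of the regressor, and the marginal effect is read off the derivative.

```python
import numpy as np

# Approximate an unknown non-linear relationship by OLS on powers of x.
# The data-generating process (y = exp(0.5*x) + noise) is hypothetical.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-2, 2, n)
y = np.exp(0.5 * x) + rng.normal(scale=0.1, size=n)

# Cubic polynomial design matrix: [1, x, x^2, x^3]
X = np.column_stack([x**p for p in range(4)])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Marginal effect of x on y at a point x0, from the derivative of the polynomial
x0 = 1.0
marginal_effect = beta_hat[1] + 2 * beta_hat[2] * x0 + 3 * beta_hat[3] * x0**2
```

Here the estimated marginal effect at x0 = 1 should be close to the true derivative 0.5·exp(0.5x0), and t-tests on the highest-order coefficient can guide the choice of polynomial order.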
SLIDE 6

Logarithms

Can take the logarithm of the LHS and/or RHS variables.
• The βs have approximate percentage-change interpretations
• log-lin, lin-log, log-log

For example: log(wage) = β0 + β1 educ + β2 female + ⋯ + ε
• Take the derivative w.r.t. educ
• A one-unit change in educ leads to a multiplicative change of exp(β1) in wage
• approximately a 100β1% change (approximation based on the Taylor-series expansion of exp(x))
• females make 100[exp(β2) − 1]% more than males
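The exact versus approximate percentage-change interpretation can be made concrete with a quick computation (the coefficient value 0.08 is hypothetical):

```python
import math

# Log-lin model log(wage) = b0 + b1*educ + ...: for a hypothetical b1 = 0.08,
# one more unit of educ multiplies wage by exp(b1).
b1 = 0.08
exact_pct = 100 * (math.exp(b1) - 1)   # exact percentage change, about 8.33%
approx_pct = 100 * b1                  # first-order approximation exp(b1) ~ 1 + b1
```

The gap between the exact and approximate figures grows with the size of the coefficient, which is why the 100β% shorthand is safest for small coefficients.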

SLIDE 7

Dummy variables – Splines

There may be a “break” in the model so that it is “piecewise” linear.
• Example: wage before and after age = 18.
• “Knots” and dummy variables
• [pictures and notes]
• Nothing in the unrestricted estimators ensures the two functions join at the knot
• Use RLS (restricted least squares) to impose continuity
• Multiple knots can be introduced
• The location of the knots can be arbitrary, leading to nonparametric kernel regression
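A sketch of a single-knot linear spline for the wage/age example (all numbers are simulated; continuity at the knot is imposed by construction through a hinge regressor rather than by restricted least squares):

```python
import numpy as np

# Piecewise-linear ("linear spline") wage equation with one knot at age = 18.
# The simulated slopes (0.2 before the knot, 1.2 after) are hypothetical.
rng = np.random.default_rng(1)
age = rng.uniform(14, 30, 300)
wage = 5 + 0.2 * age + 1.0 * np.maximum(age - 18, 0) + rng.normal(scale=0.5, size=300)

# Regressors: intercept, age, and the hinge (age - 18)_+ ; the hinge lets the
# slope change at the knot while keeping the two segments joined there.
X = np.column_stack([np.ones_like(age), age, np.maximum(age - 18, 0)])
b, *_ = np.linalg.lstsq(X, wage, rcond=None)
# b[1] is the slope before age 18; b[1] + b[2] is the slope after.
```

Using a plain dummy for age > 18 instead of the hinge term would estimate two disconnected segments, which is the case where restrictions are needed to make them join.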

SLIDE 8

Non-linear population models

There are many situations where transformation/approximation of the non-linear model is not desirable/possible, and the non-linear population model should be estimated directly.

• CES production function:

Y_i = γ[δ K_i^(−ρ) + (1 − δ) L_i^(−ρ)]^(−v/ρ) exp(ε_i)

or, ln(Y_i) = ln(γ) − (v/ρ) ln[δ K_i^(−ρ) + (1 − δ) L_i^(−ρ)] + ε_i

• Linear Expenditure System (Stone, 1954):

Max U(q) = Σ_i β_i ln(q_i − γ_i)   (Stone-Geary / Klein-Rubin)

s.t. Σ_i p_i q_i = M

SLIDE 9

Yields the following system of demand equations:

p_i q_i = γ_i p_i + β_i (M − Σ_k γ_k p_k) ;  i = 1, 2, …, n

The β_i's are the marginal budget shares, so we require that 0 < β_i < 1 ; i = 1, 2, …, n.

• Box-Cox transform (often applied to positive-valued variables)
• “Limited dependent variables”
  • y must be positive (or negative)
  • y is a dummy
  • y is an integer
SLIDE 10

In general, suppose we have a single non-linear equation:

y_i = f(x_i1, x_i2, …, x_ik; θ_1, θ_2, …, θ_p) + ε_i

• We can still consider a “least squares” approach.
• The non-linear least squares (NLLS) estimator is the vector θ̂ that minimizes the quantity:

S(X, θ) = Σ_i [y_i − f_i(X, θ)]²

• Clearly the usual LS estimator is just a special case of this.
• To obtain the estimator, we differentiate S with respect to each element of θ, set up the p first-order conditions, and solve.
• Difficulty: usually the first-order conditions are themselves non-linear in the unknowns (the parameters).
• This means there is (generally) no exact, closed-form solution; we can't write down an explicit formula for the estimators of the parameters.

SLIDE 11

Example

y_i = θ1 + θ2 x_i2 + θ3 x_i3 + (θ2 θ3) x_i4 + ε_i

S = Σ_i [y_i − θ1 − θ2 x_i2 − θ3 x_i3 − (θ2 θ3) x_i4]²

∂S/∂θ1 = −2 Σ_i [y_i − θ1 − θ2 x_i2 − θ3 x_i3 − (θ2 θ3) x_i4]

∂S/∂θ2 = −2 Σ_i [(θ3 x_i4 + x_i2)(y_i − θ1 − θ2 x_i2 − θ3 x_i3 − θ2 θ3 x_i4)]

∂S/∂θ3 = −2 Σ_i [(θ2 x_i4 + x_i3)(y_i − θ1 − θ2 x_i2 − θ3 x_i3 − θ2 θ3 x_i4)]
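A numerical sketch of estimating this model (simulated data; the true parameter values, starting values, and the use of Gauss-Newton, a common variant of Newton's method for least-squares problems that drops second-derivative terms, are assumptions for illustration, not from the slides):

```python
import numpy as np

# Gauss-Newton estimation of y = t1 + t2*x2 + t3*x3 + (t2*t3)*x4 + e.
rng = np.random.default_rng(2)
n = 500
x2, x3, x4 = rng.normal(size=(3, n))
t_true = np.array([1.0, 0.5, -0.8])          # hypothetical true values
y = (t_true[0] + t_true[1] * x2 + t_true[2] * x3
     + t_true[1] * t_true[2] * x4 + rng.normal(scale=0.1, size=n))

t = np.array([0.0, 0.1, 0.1])                # starting values
for _ in range(50):
    resid = y - (t[0] + t[1]*x2 + t[2]*x3 + t[1]*t[2]*x4)
    J = np.column_stack([np.ones(n),         # d f / d t1
                         x2 + t[2] * x4,     # d f / d t2
                         x3 + t[1] * x4])    # d f / d t3
    step = np.linalg.solve(J.T @ J, J.T @ resid)
    t = t + step
    if np.max(np.abs(step)) < 1e-10:
        break
```

The Jacobian columns are exactly the bracketed terms in the first-order conditions above, and setting `step` to zero reproduces those conditions.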

SLIDE 12

Setting these 3 equations to zero, we can't solve analytically for the estimators of the three parameters.

• In situations such as this, we need to use a numerical algorithm to obtain a solution to the first-order conditions.
• There are lots of methods for doing this; one possibility is Newton's algorithm (the Newton-Raphson algorithm).

Methods of Descent

θ̃ = θ0 + s·d(θ0)

θ0 = initial (vector) value
s = step-length (positive scalar)
d(·) = direction vector

SLIDE 13

• Usually, d(·) depends on the gradient vector at θ0.
• It may also depend on the change in the gradient (the Hessian matrix) at θ0.
• Some specific algorithms in the “family” make the step-length a function of the Hessian.
• One very useful, specific member of the family of “descent methods” is the Newton-Raphson algorithm:

Suppose we want to minimize some function f(θ). Approximate the function using a Taylor-series expansion about θ̃, the vector value that minimizes f(θ):

f(θ) ≅ f(θ̃) + (θ − θ̃)′ (∂f/∂θ)|θ̃ + (1/2!) (θ − θ̃)′ [∂²f/∂θ∂θ′]|θ̃ (θ − θ̃)

SLIDE 14

Or:

f(θ) ≅ f(θ̃) + (θ − θ̃)′ g(θ̃) + (1/2!)(θ − θ̃)′ H(θ̃)(θ − θ̃)

where g(·) is the gradient vector and H(·) is the Hessian matrix. So,

∂f(θ)/∂θ ≅ 0 + g(θ̃) + H(θ̃)(θ − θ̃)

However, g(θ̃) = 0, as θ̃ locates a minimum.

So, (θ − θ̃) ≅ H⁻¹(θ̃) (∂f(θ)/∂θ) ;

or, θ̃ ≅ θ − H⁻¹(θ̃) g(θ)

SLIDE 15

This suggests a numerical algorithm: set θ = θ0 to begin, and then iterate:

θ1 = θ0 − H⁻¹(θ1) g(θ0)
θ2 = θ1 − H⁻¹(θ2) g(θ1)
⋮
θ_{n+1} = θ_n − H⁻¹(θ_{n+1}) g(θ_n)

or, approximately:

θ_{n+1} = θ_n − H⁻¹(θ_n) g(θ_n)

SLIDE 16

Stop if |(θ_{n+1}^(i) − θ_n^(i)) / θ_n^(i)| < ε^(i) ;  i = 1, 2, …, p

Note:
1. s = 1.
2. d(θ_n) = −H⁻¹(θ_n) g(θ_n).
3. The algorithm fails if H ever becomes singular at any iteration.
4. It achieves a minimum of f(·) if H is positive definite.
5. The algorithm may locate only a local minimum.
6. The algorithm may oscillate.

The algorithm can be given a nice geometric interpretation for scalar θ.
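A compact sketch of the iteration with the relative-change stopping rule above (the objective function, starting values, and tolerance are hypothetical choices for illustration):

```python
import numpy as np

# Newton-Raphson with the relative-change stopping rule, applied to a
# hypothetical two-parameter objective f(t) = (t1 - 1)^4 + (t2 + 2)^2,
# whose minimum is at (1, -2).
def grad(t):
    return np.array([4 * (t[0] - 1)**3, 2 * (t[1] + 2)])

def hess(t):
    return np.diag([12 * (t[0] - 1)**2, 2.0])

t = np.array([3.0, 1.0])                        # theta_0, the initial value
for _ in range(200):
    # one Newton step with s = 1: t - H^{-1}(t_n) g(t_n)
    t_new = t - np.linalg.solve(hess(t), grad(t))
    if np.all(np.abs((t_new - t) / t) < 1e-8):  # stop rule, per parameter
        t = t_new
        break
    t = t_new
```

Swapping in the gradient and Hessian of S(θ) from the NLLS problem gives the estimator directly; notes 3-6 above describe what can go wrong (singular H, local minima, oscillation).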

SLIDE 17

To find an extremum of f(·), solve

∂f(θ)/∂θ = g(θ) = 0.

[Figure: g(θ) plotted against θ, with θ0, θ1, and θmin marked.]

SLIDE 18

[Figure: g(θ) plotted against θ, with θ0, θ1, θ2, θmin, and θmax marked.]

SLIDE 19

[Figure: g(θ) against θ, with θ0, θ1, and θmin marked. The tangent to g at θ0 has slope g(θ0)/(θ0 − θ1) = H(θ0), so:]

θ1 = θ0 − H⁻¹(θ0) g(θ0)

and in general: θ_{n+1} = θ_n − H⁻¹(θ_n) g(θ_n)

SLIDE 20

If f(θ) is quadratic in θ, then the algorithm converges in one iteration. If the function is quadratic, then its gradient is linear:

[Figure: the linear gradient g(θ) plotted against θ, with θ0 and θmin marked; one Newton step from θ0 lands exactly at θmin.]

SLIDE 21

In general, different choices of θ0 may lead to different solutions, or no solution at all.

[Figure: a gradient g(θ) with multiple roots plotted against θ, with θ0 and two local minima (θmin) and two local maxima (θmax) marked.]

SLIDE 22

The Hessian is singular:

[Figure: g(θ) against θ, with θ0 at a point where g has zero slope, so the Newton step is undefined.]

SLIDE 23

The algorithm “cycles”:

[Figure: g(θ) against θ, with the iterates θ0 and θ1 jumping back and forth between the same two points.]

SLIDE 24

Example (where we actually know the answer)

f(θ) = 3θ⁴ − 4θ³ + 1 ;  locate the minimum.

Analytically:

g(θ) = 12θ³ − 12θ² = 12θ²(θ − 1)
H(θ) = 36θ² − 24θ = 12θ(3θ − 2)

Turning points at θ = 0, 1.

H(0) = 0 : saddlepoint
H(1) = 12 : minimum

Algorithm:

θ_{n+1} = θ_n − H⁻¹(θ_n) g(θ_n)

SLIDE 25

θ0 = 2 (say)

θ1 = 2 − (48/96) = 1.5

θ2 = 1.5 − (13.5/45) = 1.2

θ3 = 1.2 − (3.456/23.040) = 1.05

etc.

Try: θ0 = −2 ;  θ0 = 0.5
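The iterations above can be checked in Python (a straightforward transcription of the example's g, H, and update rule):

```python
# Check of the worked example: minimize f(t) = 3t^4 - 4t^3 + 1 by
# Newton-Raphson, t_{n+1} = t_n - g(t_n)/H(t_n).
def g(t):   # gradient: 12t^3 - 12t^2 = 12t^2(t - 1)
    return 12 * t**3 - 12 * t**2

def H(t):   # Hessian: 36t^2 - 24t = 12t(3t - 2)
    return 36 * t**2 - 24 * t

t = 2.0
path = [t]
for _ in range(25):
    t = t - g(t) / H(t)
    path.append(t)
# path begins 2, 1.5, 1.2, 1.05, ... and converges to the minimum at t = 1
```

Re-running with t = 0.5 shows one of the failure modes from the notes: the first step lands exactly on t = 0, where H(0) = 0 and the next step is undefined.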