

SLIDE 1

Introduction Preliminaries Linear Models Bayes Regress Model Comparison Summary References

Linear Models for Regression

Henrik I Christensen

Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu

Henrik I Christensen (RIM@GT) Linear Regression 1 / 39

SLIDE 2

Outline

1. Introduction
2. Preliminaries
3. Linear Basis Function Models
4. Bayesian Linear Regression
5. Bayesian Model Comparison
6. Summary

SLIDE 3

Introduction

The objective of regression is to enable prediction of a value t based on modelling over a dataset X. Consider a set of observations over a space D: how can we generate estimates for the future?

Battery time? Time to completion? Position of doors?

SLIDE 4

Introduction (2)

Example from Chapter 1: polynomial curve fitting over observed pairs (x, t).

[Figure: data points (x, t) with a fitted polynomial curve]

y(x, w) = w0 + w1 x + w2 x² + … + wm x^m = Σ_{i=0}^m wi x^i

SLIDE 5

Introduction (3)

In general the functions can go beyond simple polynomials. The "components" are termed basis functions, i.e.

y(x, w) = Σ_{i=0}^m wi φi(x) = wᵀφ(x)
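The model y(x, w) = wᵀφ(x) is straightforward to evaluate numerically. Below is a minimal sketch (not from the slides) using NumPy, with a polynomial basis as the illustrative choice; the weight values are made up for the example.

```python
import numpy as np

def design_matrix(x, basis_fns):
    """Stack basis-function outputs into the N x M design matrix Phi."""
    return np.column_stack([phi(x) for phi in basis_fns])

# Polynomial basis phi_i(x) = x^i for i = 0..3 (illustrative choice).
basis = [lambda x, i=i: x ** i for i in range(4)]

x = np.linspace(-1.0, 1.0, 5)
Phi = design_matrix(x, basis)

# With weights w, the prediction is y(x, w) = Phi @ w, i.e. w^T phi(x) per row.
w = np.array([1.0, -2.0, 0.5, 0.0])   # assumed example weights
y = Phi @ w
```

Swapping in a different `basis` list (Gaussians, sigmoids) leaves the rest of the code unchanged, which is the point of the linear-in-w formulation.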


SLIDE 7

Loss Function

For optimization we need a penalty / loss function L(t, y(x)). The expected loss is then

E[L] = ∫∫ L(t, y(x)) p(x, t) dx dt

For the squared loss function we have

E[L] = ∫∫ {y(x) − t}² p(x, t) dx dt

Goal: choose y(x) to minimize the expected loss E[L].

SLIDE 8

Loss Function (2)

Derivation of the extrema:

δE[L]/δy(x) = 2 ∫ {y(x) − t} p(x, t) dt = 0

This implies that

y(x) = ∫ t p(x, t) dt / p(x) = ∫ t p(t|x) dt = E[t|x]

SLIDE 9

Loss Function - Interpretation

[Figure: the regression function y(x) passes through the conditional mean; at an input x0, the prediction y(x0) is the mean of the conditional density p(t|x0)]

SLIDE 10

Alternative

Consider a small rewrite:

{y(x) − t}² = {y(x) − E[t|x] + E[t|x] − t}²

The expected loss is then

E[L] = ∫ {y(x) − E[t|x]}² p(x) dx + ∫∫ {E[t|x] − t}² p(x, t) dx dt

Only the first term depends on y(x); the second is the intrinsic noise in the data.


SLIDE 12

Polynomial Basis Functions

Basic definition: φi(x) = x^i

Global functions: a small change in x affects all of them.

[Figure: polynomial basis functions over the interval [−1, 1]]

SLIDE 13

Gaussian Basis Functions

Basic definition: φi(x) = exp(−(x − µi)² / (2s²))

Related to Gaussian mixtures, with local impact; not required to have a probabilistic interpretation. µi controls position and s controls scale.

[Figure: Gaussian basis functions with centres spread over [−1, 1]]
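A quick sketch of evaluating the Gaussian basis in NumPy; the centres and scale below are illustrative choices, not values from the slides.

```python
import numpy as np

def gaussian_basis(x, mus, s):
    """phi_i(x) = exp(-(x - mu_i)^2 / (2 s^2)) for each centre mu_i."""
    x = np.asarray(x)[:, None]                        # shape (N, 1)
    return np.exp(-(x - mus) ** 2 / (2.0 * s ** 2))   # shape (N, M)

mus = np.linspace(-1.0, 1.0, 5)   # centres mu_i (assumed example)
s = 0.4                            # common scale s (assumed example)
Phi = gaussian_basis(np.array([-1.0, 0.0, 1.0]), mus, s)
```

Each column responds only near its own centre, which is the "local impact" property the slide contrasts with global polynomial terms.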

SLIDE 14

Sigmoid Basis Functions

Basic definition: φi(x) = σ((x − µi)/s), where σ(a) = 1 / (1 + e^(−a))

µi controls location and s controls slope.

[Figure: sigmoid basis functions over [−1, 1]]

SLIDE 15

Maximum Likelihood & Least Squares

Assume observations from a deterministic function contaminated by Gaussian noise:

t = y(x, w) + ε,  p(ε|β) = N(ε|0, β⁻¹)

The problem at hand is then

p(t|x, w, β) = N(t|y(x, w), β⁻¹)

From a series of observations we have the likelihood

p(t|X, w, β) = Π_{i=1}^N N(ti|wᵀφ(xi), β⁻¹)

SLIDE 16

Maximum Likelihood & Least Squares (2)

This results in

ln p(t|w, β) = (N/2) ln β − (N/2) ln(2π) − β ED(w)

where

ED(w) = (1/2) Σ_{i=1}^N {ti − wᵀφ(xi)}²

is the sum of squared errors.

SLIDE 17

Maximum Likelihood & Least Squares (3)

Computing the extrema yields:

wML = (ΦᵀΦ)⁻¹ Φᵀ t

where

Φ = [ φ0(x1) φ1(x1) … φM−1(x1)
      φ0(x2) φ1(x2) … φM−1(x2)
      …
      φ0(xN) φ1(xN) … φM−1(xN) ]
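The closed-form wML can be computed directly; in practice `numpy.linalg.lstsq` is preferred over forming (ΦᵀΦ)⁻¹ explicitly, since it is numerically more stable. The synthetic line and noise level below are assumed example values, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: t = 2 - 3x plus Gaussian noise (assumed example).
x = np.linspace(0.0, 1.0, 50)
t = 2.0 - 3.0 * x + rng.normal(0.0, 0.1, size=x.shape)

# Design matrix Phi for the basis {1, x}.
Phi = np.column_stack([np.ones_like(x), x])

# w_ML = (Phi^T Phi)^(-1) Phi^T t, computed via a stable least-squares solve.
w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)
```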

SLIDE 18

Line Estimation

Least squares minimization:

Line equation: y = ax + b

Error in fit: Σi (yi − a xi − b)²

Solution, in terms of sample means:

[ x̄²  x̄ ] [ a ]   [ x̄y ]
[ x̄   1 ] [ b ] = [ ȳ  ]

Minimizes vertical errors. Non-robust!
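The normal equations above can be checked on noise-free data, where the fit must recover the generating line exactly. A minimal sketch (the data points are an assumed example):

```python
import numpy as np

# Noise-free points on y = 2x + 1, so the fit should recover a = 2, b = 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Sample means used in the normal equations.
xm, ym = x.mean(), y.mean()
x2m, xym = (x * x).mean(), (x * y).mean()

# [[mean(x^2), mean(x)], [mean(x), 1]] [a, b]^T = [mean(xy), mean(y)]^T
A = np.array([[x2m, xm], [xm, 1.0]])
rhs = np.array([xym, ym])
a, b = np.linalg.solve(A, rhs)
```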

SLIDE 19

LSQ on Lasers

Line model: ri cos(φi − θ) = ρ
Error model: di = ri cos(φi − θ) − ρ
Optimize: argmin_(ρ,θ) Σi (ri cos(φi − θ) − ρ)²

Error model derived in Deriche et al. (1992). Well suited for "clean-up" of Hough lines.

SLIDE 20

Total Least Squares

Line equation: ax + by + c = 0

Error in fit: Σi (a xi + b yi + c)², where a² + b² = 1.

Solution: (a, b) is an eigenvector of the scatter matrix of the centred points,

[ x̄² − x̄x̄   x̄y − x̄ȳ ] [ a ]       [ a ]
[ x̄y − x̄ȳ   ȳ² − ȳȳ ] [ b ]  = µ [ b ]

where µ is a scale factor (the eigenvalue), and

c = −a x̄ − b ȳ
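The eigenvector solution can be sketched as follows. The vertical-line data is an assumed example, chosen because the y = ax + b parameterization fails on it while ax + by + c = 0 handles it cleanly:

```python
import numpy as np

# Points on the vertical line x = 2, where y = ax + b has no finite fit
# but ax + by + c = 0 works with (a, b) = (1, 0), c = -2.
x = np.array([2.0, 2.0, 2.0, 2.0])
y = np.array([0.0, 1.0, 2.0, 3.0])

xm, ym = x.mean(), y.mean()
# Scatter matrix of centred coordinates; the eigenvector for the SMALLEST
# eigenvalue is the line normal (a, b), automatically unit length.
S = np.array([
    [np.mean(x * x) - xm * xm, np.mean(x * y) - xm * ym],
    [np.mean(x * y) - xm * ym, np.mean(y * y) - ym * ym],
])
vals, vecs = np.linalg.eigh(S)     # eigh returns eigenvalues in ascending order
a, b = vecs[:, 0]                  # normal direction of the fitted line
c = -a * xm - b * ym
```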

SLIDE 21

Line Representations

The line representation is crucial, and often a redundant model is adopted: line parameters vs end-points. This matters for fusion of segments; end-points are less stable.

SLIDE 22

Sequential Adaptation

In some cases one-at-a-time estimation is more suitable. Also known as gradient descent:

w^(τ+1) = w^(τ) − η∇En = w^(τ) + η(tn − w^(τ)ᵀφ(xn))φ(xn)

Known as least-mean-squares (LMS). An issue is how to choose η.
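A minimal sketch of the LMS update on a simulated stream; the target function, noise level, and step size η are assumed example values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stream of samples from t = 1 + 2x plus a little noise (assumed example).
def sample():
    x = rng.uniform(-1.0, 1.0)
    return np.array([1.0, x]), 1.0 + 2.0 * x + rng.normal(0.0, 0.05)

w = np.zeros(2)
eta = 0.1   # step size: too large diverges, too small converges slowly
for _ in range(2000):
    phi, t = sample()
    # LMS update: w <- w + eta * (t_n - w^T phi_n) * phi_n
    w = w + eta * (t - w @ phi) * phi
```

After enough samples, w drifts to roughly (1, 2); the residual jitter around that point is the price paid for a fixed η.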

SLIDE 23

Regularized Least Squares

As seen in Lecture 2, some control of the parameters might be useful. Consider the error function ED(w) + λEW(w), which generates

(1/2) Σ_{i=1}^N {ti − wᵀφ(xi)}² + (λ/2) wᵀw

which is minimized by

w = (λI + ΦᵀΦ)⁻¹ Φᵀ t
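The regularized solution in code; the ninth-degree polynomial fit, the sin target, and the λ value are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ill-conditioned polynomial fit where regularization helps (sketch).
x = np.linspace(0.0, 1.0, 20)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, size=x.shape)
Phi = np.vander(x, 10, increasing=True)   # basis {1, x, ..., x^9}

lam = 1e-3
M = Phi.shape[1]
# w = (lambda I + Phi^T Phi)^(-1) Phi^T t
w = np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)
```

Compared with the unregularized solution, λ shrinks every coefficient toward zero, trading a little bias for much better conditioning.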


SLIDE 25

Bayesian Linear Regression

Define a conjugate prior over w:

p(w) = N(w|m0, S0)

Given the likelihood function, regular Bayesian analysis lets us derive

p(w|t) = N(w|mN, SN)

where

mN = SN(S0⁻¹ m0 + βΦᵀt)
SN⁻¹ = S0⁻¹ + βΦᵀΦ

SLIDE 26

Bayesian Linear Regression (2)

A common choice is p(w) = N(w|0, α⁻¹I), so that

mN = βSNΦᵀt
SN⁻¹ = αI + βΦᵀΦ
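Computing mN and SN for the zero-mean prior is a few lines of NumPy; α, β and the data-generating line below are assumed example values.

```python
import numpy as np

rng = np.random.default_rng(3)

alpha, beta = 2.0, 25.0   # prior and noise precision (assumed values)

# Data from t = -0.3 + 0.5x plus noise of std 1/sqrt(beta) (assumed example).
x = rng.uniform(-1.0, 1.0, 30)
t = -0.3 + 0.5 * x + rng.normal(0.0, beta ** -0.5, size=x.shape)

Phi = np.column_stack([np.ones_like(x), x])

# Posterior: S_N^(-1) = alpha I + beta Phi^T Phi,  m_N = beta S_N Phi^T t
SN_inv = alpha * np.eye(2) + beta * Phi.T @ Phi
SN = np.linalg.inv(SN_inv)
mN = beta * SN @ Phi.T @ t

# Predictive mean at a new input x* = 0.5 is m_N^T phi(x*).
phi_star = np.array([1.0, 0.5])
t_star = mN @ phi_star
```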

SLIDE 27

Example - No Data


SLIDE 28

Example - 1 Data Point


SLIDE 29

Example - 2 Data Points


SLIDE 30

Example - 20 Data Points



SLIDE 32

Bayesian Model Comparison

How does one select an appropriate model? Assume for a minute we want to compare a set of models Mi, i ∈ {1, …, L}, for a dataset D. We could compute

p(Mi|D) ∝ p(D|Mi) p(Mi)

Bayes factor: the ratio of evidence for two models,

p(D|Mi) / p(D|Mj)

SLIDE 33

The Mixture Distribution Approach

We could use all the models:

p(t|x, D) = Σ_{i=1}^L p(t|x, Mi, D) p(Mi|D)

Or simply go with the most probable/best model.
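A toy sketch of the difference between averaging and selecting; the posterior model probabilities and per-model predictions below are made-up numbers, purely for illustration.

```python
import numpy as np

# Hypothetical posterior model probabilities p(M_i | D) for L = 3 models.
post = np.array([0.7, 0.2, 0.1])

# Each model's predictive mean at the same test input (assumed values).
preds = np.array([1.0, 1.4, 0.6])

# Model averaging: mixture of the models, weighted by p(M_i | D).
t_bma = post @ preds

# Model selection alternative: just use the most probable model.
t_map = preds[np.argmax(post)]
```

Averaging keeps information from the weaker models; selection is simpler and close to averaging when one posterior probability dominates.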

SLIDE 34

Model Evidence

We can compute the model evidence

p(D|Mi) = ∫ p(D|w, Mi) p(w|Mi) dw

This allows computation of model fit integrated over the parameter range.

SLIDE 35

Evaluation of Parameters

Evaluation of the posterior over parameters:

p(w|D, Mi) = p(D|w, Mi) p(w|Mi) / p(D|Mi)

There is a need to understand: how good is a model?

SLIDE 36

Model Comparison

Consider evaluation of a model with parameters w:

p(D) = ∫ p(D|w) p(w) dw ≈ p(D|wMAP) σposterior/σprior

Then

ln p(D) ≈ ln p(D|wMAP) + ln(σposterior/σprior)

The second term penalizes models whose posterior is much narrower than their prior, i.e. models that need finely tuned parameters.
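A tiny numeric illustration of the approximation, with all numbers hypothetical: a posterior 100× narrower than the prior costs ln 100 ≈ 4.6 nats of evidence relative to the best-fit likelihood.

```python
import math

# Hypothetical one-parameter model (all values assumed for illustration).
log_best_fit = -10.0       # ln p(D | w_MAP)
sigma_posterior = 0.1      # width of the posterior peak
sigma_prior = 10.0         # width of the prior

# ln p(D) ~ ln p(D | w_MAP) + ln(sigma_posterior / sigma_prior)
occam_factor = math.log(sigma_posterior / sigma_prior)
log_evidence = log_best_fit + occam_factor
```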

SLIDE 37

Model Comparison as Kullback-Leibler

From earlier we have comparison of distributions:

KL = ∫ p(D|M1) ln [p(D|M1) / p(D|M2)] dD

Enables comparison of two different models.


SLIDE 39

Summary

Brief intro to linear methods for estimation of models: prediction of values and models. Needed for adaptive selection of models (black-box/grey-box), evaluation of sensor models, …

Consideration of batch and recursive estimation methods, with significant discussion of methods for evaluation of models and parameters. Thus far, purely a discussion of linear models.

SLIDE 40

References

Deriche, R., Vaillant, R., & Faugeras, O. (1992). From Noisy Edge Points to 3D Reconstruction of a Scene: A Robust Approach and Its Uncertainty Analysis. In: Series in Machine Perception and Artificial Intelligence, Vol. 2, pp. 71–79. World Scientific.
