Linear Fitting CS3220 - Summer 2008 Jonathan Kaldor (based on Sp07 Slides)

slide-1
SLIDE 1

Linear Fitting

CS3220 - Summer 2008 Jonathan Kaldor (based on Sp07 Slides)

slide-2
SLIDE 2

From N to M

  • We have been talking about solving linear systems of n equations in n variables

  • In other words, Ax = b where A is n x n
  • Usually: a single solution

A x = b

Square system

slide-3
SLIDE 3

From N to M

  • What happens if the number of equations is not equal to the number of unknowns?
  • General case: m linear equations in n unknowns
  • Still expressible as a matrix times a vector...
  • ...but no longer a square matrix
slide-4
SLIDE 4

Rectangular Systems

Overdetermined (m > n): A x = b

Underdetermined (m < n): A x = b

slide-5
SLIDE 5

Rectangular Systems

  • Still well-defined set of matrix equations
  • May be full rank but have many solutions (or no exact solutions)
  • Our focus: m > n (overdetermined systems)

slide-6
SLIDE 6

Example

4 2 2 10 2 4 x y = 1 3

slide-7
SLIDE 7

Overdetermined Systems

  • When full rank, extra equations are either not necessary or unsatisfiable
  • Can we even talk about what a solution to this problem is?
  • Want “best” answer for some definition of “best”
slide-8
SLIDE 8

Examples

  • Model Fitting
slide-9
SLIDE 9

Examples

  • Model Fitting
slide-10
SLIDE 10

Examples

  • Model Fitting
  • We’ve fit our data points exactly
  • But do we need to?
  • Error in experimental results
  • Fewer dimensions in model
  • High degree polynomial overfitting data
slide-11
SLIDE 11

Examples

  • Model Fitting
slide-12
SLIDE 12

Examples

  • Model Fitting

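$$
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \\ 1 & x_4 \\ \vdots & \vdots \end{bmatrix}
\begin{bmatrix} b \\ a \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ \vdots \end{bmatrix}
$$

Find best equation ax + b to match data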

slide-13
SLIDE 13

Examples

  • Model Fitting

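$$
\begin{bmatrix} 1 & x_1 & x_1^2 \\ 1 & x_2 & x_2^2 \\ 1 & x_3 & x_3^2 \\ 1 & x_4 & x_4^2 \\ \vdots & \vdots & \vdots \end{bmatrix}
\begin{bmatrix} c \\ b \\ a \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ \vdots \end{bmatrix}
$$

Find best equation ax^2 + bx + c to match data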
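
The matrices in these model-fitting examples have one row per data point and one column per unknown coefficient. As a rough sketch of how such a matrix could be assembled (the helper name `design_matrix` and the sample x values are made up for illustration, not taken from the slides):

```python
def design_matrix(xs, degree):
    """Build rows [1, x, x^2, ..., x^degree], one row per sample.

    Column k holds x^k, so the unknowns are ordered from the constant
    term upward: (b, a) for a line, (c, b, a) for a quadratic.
    """
    return [[x ** k for k in range(degree + 1)] for x in xs]


xs = [0.0, 1.0, 2.0, 3.0]        # hypothetical sample locations
A_line = design_matrix(xs, 1)    # 4 x 2 matrix for fitting ax + b
A_quad = design_matrix(xs, 2)    # 4 x 3 matrix for fitting ax^2 + bx + c
```

With four data points, both matrices have more rows than columns, so both systems are overdetermined.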

slide-14
SLIDE 14

Examples

  • Hugely applicable in sciences
  • Fitting model to experimental results
  • Economics
  • Predicting economic performance from economic indicators
  • NBA - predicting future performance of draft picks

slide-15
SLIDE 15

Back to the Problem

  • So this is an important problem
  • ... but we still don’t know what the answer will look like!
  • In the square system, solved Ax = b, i.e. Ax - b = 0
  • In the rectangular system, Ax - b is not necessarily 0, but instead we can minimize the distance between Ax and b

slide-16
SLIDE 16

Vector Distances

  • How long is this vector?

(3,4)

slide-17
SLIDE 17

Vector Distances

  • How long is this vector?

(3,4) 5?

slide-18
SLIDE 18

Vector Distances

  • How long is this vector?

(3,4) 5? 7?

slide-19
SLIDE 19

Vector Distances

  • How long is this vector?

(3,4) 5? 7? Something else?

slide-20
SLIDE 20

Vector Distances

  • Distances (called ‘norms’, denoted with ‖‖)
  • We require four properties:

‖0‖ = 0
‖x‖ > 0 if x ≠ 0
‖c x‖ = |c| ‖x‖
‖x + y‖ ≤ ‖x‖ + ‖y‖

  • Last property: triangle inequality
slide-21
SLIDE 21

Vector Distances

  • Common vector norms: p-norms
  • ‖x‖_p = (∑ |x_i|^p)^(1/p)
  • Common cases:
  • p = 1 (Manhattan distance)
  • p = 2 (Euclidean distance)
  • p = infinity (Chebyshev norm)
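
The three common p-norms give three different answers to "how long is the vector (3,4)?". A minimal sketch, assuming a hypothetical helper `p_norm` that treats p = infinity as the largest absolute entry:

```python
import math


def p_norm(x, p):
    """Return the p-norm of the list x; p = math.inf gives the max-abs entry."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


v = [3.0, 4.0]
manhattan = p_norm(v, 1)         # |3| + |4| = 7
euclidean = p_norm(v, 2)         # sqrt(9 + 16) = 5
chebyshev = p_norm(v, math.inf)  # max(|3|, |4|) = 4
```

So both 5 and 7 are valid lengths for (3,4), depending on the norm chosen, and the infinity norm gives 4.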
slide-22
SLIDE 22

Vector Distances

  • Denote particular p-norm with subscript
  • ‖x‖_1, ‖x‖_2, etc...
  • Note alternate form of 2-norm
  • sqrt(x^T x)
slide-23
SLIDE 23

Back to the Problem (Again)

  • Rectangular systems solved with respect to the 2-norm
  • x* = argmin_x ‖Ax - b‖_2

= argmin_x sqrt(∑_i (A(i,:)*x - b(i))^2)
= argmin_x ∑_i (A(i,:)*x - b(i))^2

  • We say x* is the least-squares solution to the rectangular system Ax = b, with residual r = Ax* - b

slide-24
SLIDE 24

Least Squares

  • Why the 2-norm?
  • Intuitive
  • Sometimes is the ‘proper’ measure
  • Easy to solve
  • Of the 3 reasons, third is most important
slide-25
SLIDE 25

2x1 Least Squares

  • Take a 2x1 example

$$
\begin{bmatrix} 2 \\ 1 \end{bmatrix} x = \begin{bmatrix} 3 \\ 3 \end{bmatrix}
$$

slide-26
SLIDE 26

2x1 Least Squares

(2,1) (3,3)

slide-27
SLIDE 27

2x1 Least Squares

(2,1) (3,3)

slide-28
SLIDE 28

2x1 Least Squares

  • Given line and point p, find closest point on line to p

  • Perpendicular from p to the line
  • a.k.a. orthogonal projection
slide-29
SLIDE 29

Review of Orthogonality

  • We say two vectors are orthogonal if their dot product is equal to 0
  • x^T y = ‖x‖_2 ‖y‖_2 cos Θ
  • If x ≠ 0 and y ≠ 0, above is zero iff cos Θ = 0, i.e. Θ = ±π/2, i.e. they are perpendicular

slide-30
SLIDE 30

Review of Orthogonality

  • We say two vectors are orthonormal if they are orthogonal and ‖v1‖_2 = ‖v2‖_2 = 1
  • Can extend to say sets of n vectors are orthonormal with respect to each other
  • We say a matrix Q is orthogonal if its columns are all orthonormal with respect to each other
  • Q^T Q = I
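
As a small numerical check of this definition, the sketch below uses a 2x2 rotation matrix as an example of an orthogonal Q and forms Q^T Q entry by entry from dot products of its columns. The angle and the helper `dot` are arbitrary choices for illustration:

```python
import math


def dot(u, v):
    """Dot product u^T v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


theta = 0.3  # arbitrary rotation angle, chosen only for illustration
q1 = [math.cos(theta), math.sin(theta)]    # first column of Q
q2 = [-math.sin(theta), math.cos(theta)]   # second column of Q
columns = [q1, q2]

# Entry (i, j) of Q^T Q is the dot product of column i with column j.
QtQ = [[dot(ci, cj) for cj in columns] for ci in columns]
```

Up to rounding error, `QtQ` is the 2x2 identity: the diagonal entries are the squared column lengths (1) and the off-diagonal entries are the dot products between different columns (0).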
slide-31
SLIDE 31

Perpendicular Residual

  • In our 2x1 case, the residual ax - b is orthogonal to the vector a
  • Leads to a^T r = 0

a^T (ax - b) = 0
a^T a x - a^T b = 0
a^T a x = a^T b
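
Applied to the 2x1 example with a = (2,1) and b = (3,3), the last equation gives x = a^T b / a^T a = 9/5. A short sketch that carries out this arithmetic and checks that the residual is orthogonal to a (the helper `dot` is an illustrative name):

```python
def dot(u, v):
    """Dot product u^T v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


a = [2.0, 1.0]
b = [3.0, 3.0]

x = dot(a, b) / dot(a, a)                  # (2*3 + 1*3) / (2*2 + 1*1) = 9/5
r = [ai * x - bi for ai, bi in zip(a, b)]  # residual ax - b, roughly (0.6, -1.2)
orthogonality = dot(a, r)                  # zero up to rounding error
```

The closest point on the line is ax = (3.6, 1.8), and the residual (0.6, -1.2) is perpendicular to (2, 1).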

slide-32
SLIDE 32

3x2 Case

  • Trust your geometric intuition
  • 3x2 case: closest point on plane
  • Holds in higher dimensions (but best not to try and picture it!)

slide-33
SLIDE 33

3x2 Case

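  • Find closest point on 2D plane defined by vectors a1 and a2 (3D vectors)
  • Residual must be orthogonal to both a1 and a2
  • Two equations: a1^T r = 0, a2^T r = 0
  • Rewrite as

$$
\begin{bmatrix} a_1^T \\ a_2^T \end{bmatrix}
\begin{bmatrix} a_1 & a_2 \end{bmatrix} x
=
\begin{bmatrix} a_1^T \\ a_2^T \end{bmatrix} b
$$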

slide-34
SLIDE 34

General Case

  • Extend this to m equations in n variables
  • Our residual must be orthogonal to each column in A
  • Results in n equations, each of the form A(:,i)^T r = 0
  • Can rewrite as A^T A x = A^T b
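
The rewriting step can be spelled out by stacking the n orthogonality conditions into one matrix equation. This is a sketch of the algebra, with r = Ax - b as defined earlier:

```latex
\begin{bmatrix} A(:,1)^T r \\ \vdots \\ A(:,n)^T r \end{bmatrix} = 0
\;\Longleftrightarrow\; A^T r = 0
\;\Longleftrightarrow\; A^T (A x - b) = 0
\;\Longleftrightarrow\; A^T A x = A^T b
```

The rows of A^T are the columns of A, which is why the n separate dot products collapse into the single product A^T r.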
slide-35
SLIDE 35

Normal Equations

  • This is known as the system of normal equations
  • The solution to A^T A x = A^T b, x*, is the solution to the least squares problem Ax = b
  • Convert rectangular system into square system, solve using standard techniques (note: can use Cholesky, since A^T A is symmetric positive definite when A has full column rank)
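
The whole pipeline can be sketched in plain Python: form A^T A and A^T b, factor A^T A = L L^T with Cholesky, then do one forward and one backward substitution. The helper names and the four data points are made up for illustration, and the code assumes A has full column rank:

```python
import math


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cholesky_solve(S, y):
    """Solve S x = y for symmetric positive definite S via S = L L^T."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    z = [0.0] * n
    for i in range(n):                       # forward substitution: L z = y
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution: L^T x = z
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def least_squares(A, b):
    """Solve the normal equations A^T A x = A^T b."""
    At = transpose(A)
    return cholesky_solve(matmul(At, A), matvec(At, b))


# Hypothetical data: fit d = a*x + b_coef to four points.
xs = [0.0, 1.0, 2.0, 3.0]
d = [1.0, 3.0, 4.0, 6.0]
A = [[1.0, x] for x in xs]                   # columns: constant term, x
b_coef, a_coef = least_squares(A, d)         # intercept about 1.1, slope about 1.6
residual = [ri - di for ri, di in zip(matvec(A, [b_coef, a_coef]), d)]
```

For this data the normal equations are [4 6; 6 14] x = [14; 29], which gives the line 1.6x + 1.1. The residual is orthogonal to both columns of A, as the derivation requires.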

slide-36
SLIDE 36

Outliers

slide-37
SLIDE 37

Outliers

slide-38
SLIDE 38

Outliers

  • Where do they come from?
  • Error in measurements
  • User error
  • Why do they have such an effect?
  • Least squares: errors are squared, so a point far from the fit dominates the sum
slide-39
SLIDE 39

Outliers

  • How can we handle them?
  • Toss out worst-fitting points (but need to make sure they really are outliers first!)

  • Measure error differently
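
One way to see what "measuring error differently" does is to take the simplest possible model, a single constant c fit to the data. Under the 2-norm the best constant is the mean, and under the 1-norm it is the median. The sketch below uses made-up data with one outlier; the function names are illustrative:

```python
import statistics


def sum_squared_error(c, data):
    return sum((c - d) ** 2 for d in data)


def sum_absolute_error(c, data):
    return sum(abs(c - d) for d in data)


data = [1.0, 1.0, 1.0, 1.0, 100.0]  # made-up data: four good points, one outlier

c_2norm = statistics.mean(data)     # minimizes the sum of squared errors
c_1norm = statistics.median(data)   # minimizes the sum of absolute errors
```

Here the least-squares constant is pulled up to 20.8 by the single outlier, while the 1-norm constant stays at 1.0 with the bulk of the data.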
slide-40
SLIDE 40

Solving Least Squares in MATLAB

  • Remember \ ?
  • Solves rectangular as well as square systems
  • A \ b will solve the rectangular system Ax = b in the least squares sense
  • Can also specify multiple right hand sides: A \ B solves AX = B for each column of B