

SLIDE 1

Least Squares and Data Fitting

SLIDE 2

Data fitting

How do we best fit a set of data points?

SLIDE 3

Linear Least Squares

1) Fitting with a line

Given $m$ data points $\{(t_1, y_1), \dots, (t_m, y_m)\}$, we want to find the function

$$y = x_0 + x_1 t$$

that best fits the data (or better, we want to find the coefficients $x_0$, $x_1$). Thinking geometrically, we can ask: "what is the line that most nearly passes through all the points?"

SLIDE 4

Given $m$ data points $\{(t_1, y_1), \dots, (t_m, y_m)\}$, we want to find $x_0$ and $x_1$ such that
$$y_j = x_0 + x_1 t_j \quad \forall j \in [1, m]$$

Or in matrix form:

$$\underbrace{\begin{pmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_m \end{pmatrix}}_{m \times 2} \underbrace{\begin{pmatrix} x_0 \\ x_1 \end{pmatrix}}_{2 \times 1} = \underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}}_{m \times 1} \qquad\Longleftrightarrow\qquad \boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{b}$$

Note that this system of linear equations has more equations than unknowns: an OVERDETERMINED SYSTEM.

We want to find the appropriate linear combination of the columns of $\boldsymbol{A}$ that makes up the vector $\boldsymbol{b}$. If a solution exists that satisfies $\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{b}$, then $\boldsymbol{b} \in \operatorname{span}(\boldsymbol{A})$.
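To make the setup concrete, here is a minimal NumPy sketch that assembles this overdetermined system; the data values are hypothetical, standing in for the points plotted on the slide:

```python
import numpy as np

# Hypothetical data points (t_i, y_i), i = 1..m
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 1.9, 3.2, 3.8, 5.1])   # y-values stacked into the vector b

m = len(t)
# Overdetermined m x 2 system for the line y = x0 + x1 t:
# the first column multiplies x0, the second multiplies x1.
A = np.column_stack([np.ones(m), t])

print(A.shape)   # (5, 2): more equations (5) than unknowns (2)
```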

SLIDE 5

Linear Least Squares

  • In most cases, $\boldsymbol{b} \notin \operatorname{span}(\boldsymbol{A})$ and $\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{b}$ does not have an exact solution!
  • Therefore, an overdetermined system is better expressed as
$$\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$$

SLIDE 6

Linear Least Squares

  • Least Squares: find the solution $\boldsymbol{x}$ that minimizes the residual
$$\boldsymbol{r} = \boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}$$
  • Let's define the function $\varphi$ as the square of the 2-norm of the residual:
$$\varphi(\boldsymbol{x}) = \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$$

SLIDE 7

Linear Least Squares (continued)

  • Then the least squares problem becomes
$$\min_{\boldsymbol{x}} \varphi(\boldsymbol{x})$$
  • Suppose $\varphi: \mathbb{R}^n \rightarrow \mathbb{R}$ is a smooth function. Then $\varphi(\boldsymbol{x})$ reaches a (local) maximum or minimum at a point $\boldsymbol{x}^* \in \mathbb{R}^n$ only if $\nabla \varphi(\boldsymbol{x}^*) = 0$.

SLIDE 8

How to find the minimizer?

  • To minimize the 2-norm of the residual vector:
$$\min_{\boldsymbol{x}} \varphi(\boldsymbol{x}) = \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$$

$$\varphi(\boldsymbol{x}) = (\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x})^T (\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x})$$
$$\nabla \varphi(\boldsymbol{x}) = 2\,(\boldsymbol{A}^T \boldsymbol{b} - \boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x})$$

First order necessary condition:
$$\nabla \varphi(\boldsymbol{x}) = 0 \;\rightarrow\; \boldsymbol{A}^T \boldsymbol{b} - \boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{0} \;\rightarrow\; \boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$$

Second order sufficient condition: $\nabla^2 \varphi(\boldsymbol{x}) = 2\,\boldsymbol{A}^T\boldsymbol{A}$, and $2\,\boldsymbol{A}^T\boldsymbol{A}$ is a positive semi-definite matrix, so the solution is a minimum.

Normal Equations: solve a linear system of equations.

SLIDE 9

Linear Least Squares (another approach)

  • Find $\boldsymbol{z} = \boldsymbol{A}\,\boldsymbol{x}$ which is closest to the vector $\boldsymbol{b}$
  • What is the vector $\boldsymbol{z} = \boldsymbol{A}\,\boldsymbol{x} \in \operatorname{span}(\boldsymbol{A})$ that is closest to the vector $\boldsymbol{b}$ in the Euclidean norm?

When $\boldsymbol{r} = \boldsymbol{b} - \boldsymbol{z} = \boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}$ is orthogonal to all columns of $\boldsymbol{A}$, then $\boldsymbol{z}$ is closest to $\boldsymbol{b}$:
$$\boldsymbol{A}^T \boldsymbol{r} = \boldsymbol{A}^T(\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}) = \boldsymbol{0} \;\;\Rightarrow\;\; \boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$$

SLIDE 10

Summary:

  • $\boldsymbol{A}$ is an $m \times n$ matrix, where $m > n$.
  • $m$ is the number of data points. $n$ is the number of parameters of the "best fit" function.
  • The Linear Least Squares problem $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$ always has a solution.
  • The Linear Least Squares solution $\boldsymbol{x}$ minimizes the square of the 2-norm of the residual:
$$\min_{\boldsymbol{x}} \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$$
  • One method to solve the minimization problem is to solve the system of Normal Equations: $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$
  • Let's see some examples and discuss the limitations of this method.
SLIDE 11

Example:

Solve: $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$

(The data set and the fitted-line plot for this example were shown as an image on the slide.)

SLIDE 12

Data fitting - not always a line fit!

  • The fit does not need to be a line! For example, here we are fitting the data using a quadratic curve.

Linear Least Squares: the problem is linear in its coefficients!

SLIDE 13

Another example

We want to find the coefficients of the quadratic function that best fits the data points:

$$y = x_0 + x_1 t + x_2 t^2$$

We would not want our "fit" curve to pass through the data points exactly, since we are looking to model the general trend, not to capture the noise.

SLIDE 14

Data fitting

Each row of the system comes from one data point $(t_j, y_j)$:

$$\begin{pmatrix} 1 & t_1 & t_1^2 \\ \vdots & \vdots & \vdots \\ 1 & t_m & t_m^2 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}$$

Solve: $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$

SLIDE 15

Which function is not suitable for linear least squares?

A) $y = a + b\,t + c\,t^2 + d\,t^3$
B) $y = t\,(a + b\,t + c\,t^2 + d\,t^3)$
C) $y = a \sin(t) + b \cos(t)$
D) $y = a \sin(t) + t \cos(b\,t)$
E) $y = a\,e^{-2t} + b\,e^{2t}$

SLIDE 16

Computational Cost

$$\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$$

  • Compute $\boldsymbol{A}^T\boldsymbol{A}$: $O(m n^2)$
  • Factorize $\boldsymbol{A}^T\boldsymbol{A}$: LU $\rightarrow O\!\left(\tfrac{2}{3} n^3\right)$, Cholesky $\rightarrow O\!\left(\tfrac{1}{3} n^3\right)$
  • Solve: $O(n^2)$
  • Since $m > n$, the overall cost is $O(m n^2)$
SLIDE 17

Short questions

Given the data in the table below, which of the plots shows the line of best fit in terms of least squares?

(The data table and the four candidate plots A, B, C, D were shown as images on the slide.)

SLIDE 18

Short questions

Given the data in the table below, and the least squares model
$$y = c_0 + c_1 \sin(\pi t) + c_2 \sin(\pi t / 2) + c_3 \sin(\pi t / 4)$$
written in matrix form as $\boldsymbol{A}\,\boldsymbol{c} \cong \boldsymbol{b}$, determine the entry $A_{23}$ of the matrix $\boldsymbol{A}$. Note that indices start with 1. (The data table was shown as an image on the slide.)

A) $-1.0$ B) $1.0$ C) $-0.7$ D) $0.7$ E) $0.0$

SLIDE 19

Solving Linear Least Squares with SVD

SLIDE 20

What we have learned so far...

$\boldsymbol{A}$ is an $m \times n$ matrix where $m > n$ (more points to fit than coefficients to be determined). Normal Equations: $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$

  • The solution of $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$ is unique if and only if $\operatorname{rank}(\boldsymbol{A}) = n$ ($\boldsymbol{A}$ is full column rank).
  • $\operatorname{rank}(\boldsymbol{A}) = n$ $\rightarrow$ columns of $\boldsymbol{A}$ are linearly independent $\rightarrow$ $n$ non-zero singular values $\rightarrow$ $\boldsymbol{A}^T\boldsymbol{A}$ has only positive eigenvalues $\rightarrow$ $\boldsymbol{A}^T\boldsymbol{A}$ is a symmetric positive definite matrix $\rightarrow$ $\boldsymbol{A}^T\boldsymbol{A}$ is invertible, and
$$\boldsymbol{x} = (\boldsymbol{A}^T\boldsymbol{A})^{-1}\boldsymbol{A}^T\, \boldsymbol{b}$$
  • If $\operatorname{rank}(\boldsymbol{A}) < n$, then $\boldsymbol{A}$ is rank-deficient, and the solution of the linear least squares problem is not unique.
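A small sketch of checking full column rank (and hence uniqueness) numerically, on a hypothetical matrix built to be rank-deficient:

```python
import numpy as np

# Hypothetical matrix whose third column duplicates the second
A = np.column_stack([np.ones(4), np.arange(4.0), np.arange(4.0)])

print(np.linalg.matrix_rank(A))            # 2 < 3 columns: rank-deficient
print(np.linalg.svd(A, compute_uv=False))  # one singular value is (near) zero
```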

SLIDE 21

Condition number for Normal Equations

Finding the least squares solution of $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$ (where $\boldsymbol{A}$ is a full rank matrix) using the Normal Equations $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$ has some advantages: we are solving a square system of linear equations with a symmetric matrix (and hence it is possible to use decompositions such as the Cholesky factorization).

However, the normal equations tend to worsen the conditioning of the matrix:
$$\operatorname{cond}(\boldsymbol{A}^T\boldsymbol{A}) = (\operatorname{cond}(\boldsymbol{A}))^2$$

How can we solve the least squares problem without squaring the condition number of the matrix?
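A quick NumPy illustration of the squaring effect, using 2-norm condition numbers; the matrix here is an arbitrary ill-conditioned example with nearly collinear columns:

```python
import numpy as np

# A deliberately ill-conditioned tall matrix: t varies over a tiny interval,
# so the columns [1, t, t^2] are nearly linearly dependent.
t = np.linspace(1.0, 1.1, 50)
A = np.column_stack([np.ones_like(t), t, t**2])

cond_A = np.linalg.cond(A)           # 2-norm condition number of A
cond_AtA = np.linalg.cond(A.T @ A)   # condition number of the normal-equations matrix

print(f"cond(A)     = {cond_A:.3e}")
print(f"cond(A^T A) = {cond_AtA:.3e}")   # approximately cond(A)**2
print(f"cond(A)**2  = {cond_A**2:.3e}")
```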

SLIDE 22

SVD to solve linear least squares problems

We want to find the least squares solution of $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$, where $\boldsymbol{A} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^T$,

or better expressed in reduced form: $\boldsymbol{A} = \boldsymbol{U}_R\, \boldsymbol{\Sigma}_R\, \boldsymbol{V}^T$

$\boldsymbol{A}$ is an $m \times n$ rectangular matrix where $m > n$, and hence the reduced SVD is given by:

$$\boldsymbol{A} = \begin{pmatrix} \vdots & & \vdots \\ \boldsymbol{u}_1 & \cdots & \boldsymbol{u}_n \\ \vdots & & \vdots \end{pmatrix} \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix} \begin{pmatrix} \cdots & \boldsymbol{v}_1^T & \cdots \\ & \vdots & \\ \cdots & \boldsymbol{v}_n^T & \cdots \end{pmatrix}$$

SLIDE 23

Recall Reduced SVD

$$\underbrace{\boldsymbol{A}}_{m \times n} = \underbrace{\boldsymbol{U}_R}_{m \times n}\, \underbrace{\boldsymbol{\Sigma}_R}_{n \times n}\, \underbrace{\boldsymbol{V}^T}_{n \times n} \qquad (m > n)$$

SLIDE 25

SVD to solve linear least squares problems

We want to find the least squares solution of $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$, where $\boldsymbol{A} = \boldsymbol{U}_R\, \boldsymbol{\Sigma}_R\, \boldsymbol{V}^T$ (reduced SVD). Substituting into the normal equations $\boldsymbol{A}^T\boldsymbol{A}\,\boldsymbol{x} = \boldsymbol{A}^T \boldsymbol{b}$:

$$(\boldsymbol{U}_R \boldsymbol{\Sigma}_R \boldsymbol{V}^T)^T (\boldsymbol{U}_R \boldsymbol{\Sigma}_R \boldsymbol{V}^T)\,\boldsymbol{x} = (\boldsymbol{U}_R \boldsymbol{\Sigma}_R \boldsymbol{V}^T)^T\, \boldsymbol{b}$$
$$\boldsymbol{V} \boldsymbol{\Sigma}_R \boldsymbol{U}_R^T\, \boldsymbol{U}_R \boldsymbol{\Sigma}_R \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{V} \boldsymbol{\Sigma}_R \boldsymbol{U}_R^T\, \boldsymbol{b}$$
$$\boldsymbol{V} \boldsymbol{\Sigma}_R \boldsymbol{\Sigma}_R \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{V} \boldsymbol{\Sigma}_R \boldsymbol{U}_R^T\, \boldsymbol{b} \qquad (\text{since } \boldsymbol{U}_R^T \boldsymbol{U}_R = \boldsymbol{I})$$
$$\boldsymbol{\Sigma}_R^2\, \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{\Sigma}_R\, \boldsymbol{U}_R^T \boldsymbol{b} \qquad (\text{multiplying on the left by } \boldsymbol{V}^T \text{ and using } \boldsymbol{V}^T\boldsymbol{V} = \boldsymbol{I})$$

When can we take the inverse of the singular value matrix $\boldsymbol{\Sigma}_R$?

SLIDE 26

$$\boldsymbol{\Sigma}_R^2\, \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{\Sigma}_R\, \boldsymbol{U}_R^T \boldsymbol{b}$$

1) Full rank matrix ($\sigma_j \neq 0 \;\; \forall j$, i.e. $\operatorname{rank}(\boldsymbol{A}) = n$):

$$\boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{\Sigma}_R^{-1}\, \boldsymbol{U}_R^T \boldsymbol{b} \qquad\Rightarrow\qquad \underbrace{\boldsymbol{x}}_{n \times 1} = \underbrace{\boldsymbol{V}}_{n \times n}\, \underbrace{\boldsymbol{\Sigma}_R^{-1}}_{n \times n}\, \underbrace{\boldsymbol{U}_R^T}_{n \times m}\, \underbrace{\boldsymbol{b}}_{m \times 1}$$

Unique solution!

2) Rank deficient matrix ($\operatorname{rank}(\boldsymbol{A}) = r < n$):

The solution is not unique! Find the solution $\boldsymbol{x}$ such that
$$\min_{\boldsymbol{x}} \varphi(\boldsymbol{x}) = \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$$
which also satisfies $\boldsymbol{\Sigma}_R^2\, \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{\Sigma}_R\, \boldsymbol{U}_R^T \boldsymbol{b}$, and which has minimal norm:
$$\min_{\boldsymbol{x}} \|\boldsymbol{x}\|_2$$

SLIDE 27

2) Rank deficient matrix (continued)

We want to find the solution $\boldsymbol{x}$ that satisfies $\boldsymbol{\Sigma}_R^2\, \boldsymbol{V}^T \boldsymbol{x} = \boldsymbol{\Sigma}_R\, \boldsymbol{U}_R^T \boldsymbol{b}$ and also satisfies $\min_{\boldsymbol{x}} \|\boldsymbol{x}\|_2$.

Change of variables: set $\boldsymbol{z} = \boldsymbol{V}^T \boldsymbol{x}$ and then solve $\boldsymbol{\Sigma}_R\, \boldsymbol{z} = \boldsymbol{U}_R^T \boldsymbol{b}$ for the variable $\boldsymbol{z}$:

$$\begin{pmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_r \\ z_{r+1} \\ \vdots \\ z_n \end{pmatrix} = \begin{pmatrix} \boldsymbol{u}_1^T \boldsymbol{b} \\ \vdots \\ \boldsymbol{u}_r^T \boldsymbol{b} \\ \boldsymbol{u}_{r+1}^T \boldsymbol{b} \\ \vdots \\ \boldsymbol{u}_n^T \boldsymbol{b} \end{pmatrix} \qquad\Rightarrow\qquad z_j = \frac{\boldsymbol{u}_j^T \boldsymbol{b}}{\sigma_j}, \quad j = 1, 2, \dots, r$$

What do we do when $j > r$? Which choice of $z_j$ will minimize $\|\boldsymbol{x}\|_2 = \|\boldsymbol{V}\,\boldsymbol{z}\|_2 = \|\boldsymbol{z}\|_2$? Set
$$z_j = 0, \quad j = r+1, \dots, n$$

Then evaluate:
$$\boldsymbol{x} = \boldsymbol{V}\,\boldsymbol{z} = \begin{pmatrix} \vdots & & \vdots \\ \boldsymbol{v}_1 & \cdots & \boldsymbol{v}_r \\ \vdots & & \vdots \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_r \end{pmatrix} \qquad\Rightarrow\qquad \boldsymbol{x} = \sum_{j=1}^{r} z_j\, \boldsymbol{v}_j = \sum_{j=1}^{r} \frac{\boldsymbol{u}_j^T \boldsymbol{b}}{\sigma_j}\, \boldsymbol{v}_j$$

SLIDE 28

Solving Least Squares Problem with SVD (summary)

  • Find $\boldsymbol{x}$ that satisfies $\min_{\boldsymbol{x}} \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$
  • Equivalently, find $\boldsymbol{z}$ that satisfies $\min_{\boldsymbol{z}} \|\boldsymbol{\Sigma}_R\, \boldsymbol{z} - \boldsymbol{U}_R^T \boldsymbol{b}\|_2^2$
  • Propose the $\boldsymbol{z}$ that is a solution of $\boldsymbol{\Sigma}_R\, \boldsymbol{z} = \boldsymbol{U}_R^T \boldsymbol{b}$:
  • Evaluate $\boldsymbol{U}_R^T\, \boldsymbol{b}$ (cost: $O(mn)$)
  • Set:
$$z_j = \begin{cases} \dfrac{\boldsymbol{u}_j^T \boldsymbol{b}}{\sigma_j} & \text{if } \sigma_j \neq 0 \\[4pt] 0 & \text{otherwise} \end{cases} \qquad j = 1, \dots, n \qquad (\text{cost: } O(n))$$
  • Then compute $\boldsymbol{x} = \boldsymbol{V}\,\boldsymbol{z}$ (cost: $O(n^2)$)

Cost of the SVD itself: $O(m n^2)$

SLIDE 29

Solving Least Squares Problem with SVD (summary)

  • If $\sigma_j \neq 0$ for all $j = 1, \dots, n$, then the solution $\boldsymbol{x} = \boldsymbol{V}\, \boldsymbol{\Sigma}_R^{-1}\, \boldsymbol{U}_R^T\, \boldsymbol{b}$ is unique (and not a "choice").
  • If at least one of the singular values is zero, then the proposed solution $\boldsymbol{z}$ is the one with the smallest 2-norm ($\|\boldsymbol{z}\|_2$ is minimal) that minimizes the 2-norm of the residual $\|\boldsymbol{\Sigma}_R\, \boldsymbol{z} - \boldsymbol{U}_R^T \boldsymbol{b}\|_2$.
  • Since $\|\boldsymbol{x}\|_2 = \|\boldsymbol{V}\,\boldsymbol{z}\|_2 = \|\boldsymbol{z}\|_2$ ($\boldsymbol{V}$ is orthogonal), the solution $\boldsymbol{x}$ is also the one with the smallest 2-norm ($\|\boldsymbol{x}\|_2$ is minimal) among all possible $\boldsymbol{x}$ for which $\|\boldsymbol{A}\,\boldsymbol{x} - \boldsymbol{b}\|_2$ is minimal.

SLIDE 30

Solving Least Squares Problem with SVD (summary)

Solve $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$, i.e. $\boldsymbol{U}_R\, \boldsymbol{\Sigma}_R\, \boldsymbol{V}^T \boldsymbol{x} \cong \boldsymbol{b}$:
$$\boldsymbol{x} = \boldsymbol{V}\, \boldsymbol{\Sigma}_R^{+}\, \boldsymbol{U}_R^T\, \boldsymbol{b}$$
where $\boldsymbol{\Sigma}_R^{+}$ inverts the non-zero singular values and leaves the zero entries as zero.
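This $\boldsymbol{V}\, \boldsymbol{\Sigma}_R^{+}\, \boldsymbol{U}_R^T$ object is the Moore-Penrose pseudoinverse $\boldsymbol{A}^{+}$, which NumPy exposes directly; a quick sketch of the equivalence, reusing the rank-deficient example from above:

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])    # same rank-deficient example as before
b = np.array([1.0, 2.0, 2.0, 4.0])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_plus = np.array([1.0 / si if si > 1e-12 * s[0] else 0.0 for si in s])
x_svd = Vt.T @ (s_plus * (U.T @ b))     # x = V Sigma_R^+ U_R^T b

x_pinv = np.linalg.pinv(A) @ b          # the same solution via the pseudoinverse
print(np.allclose(x_svd, x_pinv))       # True
```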

SLIDE 31

Example:

Consider solving the least squares problem $\boldsymbol{A}\,\boldsymbol{x} \cong \boldsymbol{b}$, where the singular value decomposition $\boldsymbol{A} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^T$ is given. Determine $\|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2$.

(The matrices of the decomposition were shown as an image on the slide.)

SLIDE 32

Example

Suppose you already have $\boldsymbol{A} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^T$ calculated. What is the cost of solving $\min_{\boldsymbol{x}} \|\boldsymbol{b} - \boldsymbol{A}\,\boldsymbol{x}\|_2^2$?

A) $O(n)$ B) $O(n^2)$ C) $O(mn)$ D) $O(m)$ E) $O(m^2)$