 
Statistical Geometry Processing, Winter Semester 2011/2012
Least-Squares Fitting
Approximation

Common Situation:
• We have many data points, they might be noisy
• Example: Scanned data
• Want to approximate the data with a smooth curve / surface

What we need:
• Criterion – what is a good approximation?
• Methods to compute this approximation
Approximation Techniques

Agenda:
• Least-squares approximation (and why/when this makes sense)
• Total least-squares linear approximation (get rid of the coordinate system)
• Iteratively reweighted least-squares (for nasty noise distributions)
Least-Squares

We assume the following scenario:
• Given: Function values $y_i$ at positions $x_i$ (1D → 1D for now).
• Independent variables $x_i$ known exactly.
• Dependent variables $y_i$ with some error.
• Error Gaussian, i.i.d.
    – normally distributed
    – independent
    – same distribution at every point
• We know the class of functions
Situation

[Figure: sample points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$]

Situation:
• Original sample points taken at $x_i$ from the original $f$.
• Unknown Gaussian i.i.d. noise added to each $y_i$.
• Want to estimate the reconstructed function $\tilde f$.
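The scenario above can be simulated with a few lines of Python. This is only a sketch: the function `sample_noisy`, the line used as the "original" `f`, the sample positions, and the noise level are all hypothetical choices for illustration, not part of the slides.

```python
import random

def sample_noisy(f, xs, sigma, seed=0):
    """Evaluate f at the exactly known positions xs and add
    i.i.d. Gaussian noise N(0, sigma) to every function value."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [f(x) + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical "original" function and sample positions.
f = lambda x: 1.0 + 2.0 * x
xs = [i / 10.0 for i in range(11)]

# Noisy dependent variables y_i; the x_i stay exact.
ys = sample_noisy(f, xs, sigma=0.1)
```

Only the `ys` carry an error here; the `xs` are passed through unchanged, matching the assumption that the independent variables are known exactly.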
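Summary

Statistical model yields least-squares criterion:

$$\arg\min_{\tilde f} \sum_{i=1}^{n} \left(\tilde f(x_i) - y_i\right)^2$$

Linear function space leads to quadratic objective:

$$\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x) \quad\Rightarrow\quad \arg\min_{\lambda} \sum_{i=1}^{n} \left(\sum_{j=1}^{k} \lambda_j b_j(x_i) - y_i\right)^2$$

Critical point: linear system

$$\begin{pmatrix} \langle b_1, b_1\rangle & \cdots & \langle b_1, b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k, b_1\rangle & \cdots & \langle b_k, b_k\rangle \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_k \end{pmatrix} = \begin{pmatrix} \langle y, b_1\rangle \\ \vdots \\ \langle y, b_k\rangle \end{pmatrix}$$

with:

$$\langle b_i, b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\, b_j(x_t), \qquad \langle y, b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\, y_t$$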
Maximum Likelihood Estimation

Goal:
• Maximize the probability that the data originated from the reconstructed curve $\tilde f$.
• “Maximum likelihood estimation”

Gaussian normal distribution:

$$p_{\mu,\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
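Maximum Likelihood Estimation

$$\begin{aligned}
\arg\max_{\tilde f} \prod_{i=1}^{n} N_{0,\sigma}\!\left(\tilde f(x_i) - y_i\right)
&= \arg\max_{\tilde f} \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(\tilde f(x_i) - y_i)^2}{2\sigma^2}\right) \\
&= \arg\max_{\tilde f} \ln \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(\tilde f(x_i) - y_i)^2}{2\sigma^2}\right) \\
&= \arg\max_{\tilde f} \sum_{i=1}^{n} \left[\ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{(\tilde f(x_i) - y_i)^2}{2\sigma^2}\right] \\
&= \arg\min_{\tilde f} \sum_{i=1}^{n} \frac{(\tilde f(x_i) - y_i)^2}{2\sigma^2} \\
&= \arg\min_{\tilde f} \sum_{i=1}^{n} \left(\tilde f(x_i) - y_i\right)^2
\end{aligned}$$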
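The equivalence derived above can be checked numerically on a toy problem. The sketch below is an illustration under assumed data: the sample values, the one-parameter family $\tilde f(x) = a\,x$, the candidate slopes, and the value of `sigma` are all hypothetical. It picks the candidate with the largest Gaussian log-likelihood and the candidate with the smallest sum of squared errors, which should coincide.

```python
import math

def log_likelihood(residuals, sigma):
    """Sum of log N(0, sigma) densities evaluated at the residuals."""
    log_norm = -math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(log_norm - r * r / (2.0 * sigma * sigma) for r in residuals)

def sum_of_squares(residuals):
    """Least-squares criterion: sum of squared residuals."""
    return sum(r * r for r in residuals)

# Hypothetical toy data and candidate functions f~(x) = a * x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.9, 4.2, 5.8, 8.1]
slopes = [1.0 + 0.05 * s for s in range(41)]  # candidates from 1.0 to 3.0

def residuals(a):
    return [a * x - y for x, y in zip(xs, ys)]

# Maximum likelihood choice vs. least-squares choice among the candidates.
best_ml = max(slopes, key=lambda a: log_likelihood(residuals(a), sigma=0.5))
best_ls = min(slopes, key=lambda a: sum_of_squares(residuals(a)))
```

Because the log-likelihood is a constant minus the squared error scaled by $1/(2\sigma^2)$, changing `sigma` shifts and rescales the objective but does not change which candidate wins.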
Least-Squares Approximation

This shows:
• Maximum likelihood estimate minimizes sum of squared errors

Next: Compute optimal coefficients
• Linear ansatz: $\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x)$
• Determine optimal $\lambda_i$
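Maximum Likelihood Estimation

Notation:

$$\boldsymbol\lambda := \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_k \end{pmatrix},\quad \mathbf b(x) := \begin{pmatrix} b_1(x) \\ \vdots \\ b_k(x) \end{pmatrix},\quad \mathbf b_i := \begin{pmatrix} b_i(x_1) \\ \vdots \\ b_i(x_n) \end{pmatrix},\quad \mathbf y := \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$

($\boldsymbol\lambda$ and $\mathbf b(x)$ have $k$ entries; $\mathbf b_i$ and $\mathbf y$ have $n$ entries.)

$$\begin{aligned}
\arg\min_{\tilde f} \sum_{i=1}^{n} \left(\tilde f(x_i) - y_i\right)^2
&= \arg\min_{\boldsymbol\lambda} \sum_{i=1}^{n} \left(\sum_{j=1}^{k} \lambda_j b_j(x_i) - y_i\right)^2 \\
&= \arg\min_{\boldsymbol\lambda} \sum_{i=1}^{n} \left(\boldsymbol\lambda^T \mathbf b(x_i) - y_i\right)^2 \\
&= \arg\min_{\boldsymbol\lambda} \left[\boldsymbol\lambda^T \left(\sum_{i=1}^{n} \mathbf b(x_i)\, \mathbf b^T(x_i)\right) \boldsymbol\lambda - 2 \sum_{i=1}^{n} y_i\, \boldsymbol\lambda^T \mathbf b(x_i) + \sum_{i=1}^{n} y_i^2\right]
\end{aligned}$$

The three terms have the form $x^T A x + b x + c$: a quadratic optimization problem.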
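Critical Point

With $\boldsymbol\lambda$, $\mathbf b(x)$ ($k$ entries each) and $\mathbf b_i$, $\mathbf y$ ($n$ entries each) as defined before:

$$\nabla_{\boldsymbol\lambda} \left[\boldsymbol\lambda^T \left(\sum_{i=1}^{n} \mathbf b(x_i)\, \mathbf b^T(x_i)\right) \boldsymbol\lambda - 2 \sum_{i=1}^{n} y_i\, \boldsymbol\lambda^T \mathbf b(x_i) + \sum_{i=1}^{n} y_i^2\right]
= 2 \left(\sum_{i=1}^{n} \mathbf b(x_i)\, \mathbf b^T(x_i)\right) \boldsymbol\lambda - 2 \begin{pmatrix} \mathbf y^T \mathbf b_1 \\ \vdots \\ \mathbf y^T \mathbf b_k \end{pmatrix}$$

Setting the gradient to zero, we obtain a linear system of equations:

$$\left(\sum_{i=1}^{n} \mathbf b(x_i)\, \mathbf b^T(x_i)\right) \boldsymbol\lambda = \begin{pmatrix} \mathbf y^T \mathbf b_1 \\ \vdots \\ \mathbf y^T \mathbf b_k \end{pmatrix}$$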
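Critical Point

This can also be written as:

$$\begin{pmatrix} \langle b_1, b_1\rangle & \cdots & \langle b_1, b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k, b_1\rangle & \cdots & \langle b_k, b_k\rangle \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_k \end{pmatrix} = \begin{pmatrix} \langle y, b_1\rangle \\ \vdots \\ \langle y, b_k\rangle \end{pmatrix}$$

with:

$$\langle b_i, b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\, b_j(x_t), \qquad \langle y, b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\, y_t$$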
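A minimal sketch of this linear system in plain Python follows. The helper names `gram_system` and `solve`, the monomial basis, and the sample data are assumptions made for illustration; the small Gaussian elimination stands in for whatever linear solver one would normally use.

```python
def gram_system(basis, xs, ys):
    """Build the matrix of <b_i, b_j> and the right-hand side <y, b_i>."""
    # Each basis function sampled at all x_t gives the n-entry vector b_i.
    cols = [[b(x) for x in xs] for b in basis]
    A = [[sum(p * q for p, q in zip(bi, bj)) for bj in cols] for bi in cols]
    rhs = [sum(p * y for p, y in zip(bi, ys)) for bi in cols]
    return A, rhs

def solve(A, rhs):
    """Gaussian elimination with partial pivoting (adequate for small k)."""
    k = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            factor = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= factor * M[c][j]
    lam = [0.0] * k
    for c in reversed(range(k)):  # back substitution
        s = sum(M[c][j] * lam[j] for j in range(c + 1, k))
        lam[c] = (M[c][k] - s) / M[c][c]
    return lam

# Hypothetical example: quadratic basis {1, x, x^2}, noise-free samples
# of 1 - 2x + 0.5x^2, so the fit should recover these coefficients.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 - 2.0 * x + 0.5 * x * x for x in xs]
lam = solve(*gram_system(basis, xs, ys))
```

The matrix has only $k \times k$ entries regardless of the number of samples $n$; the data enters solely through the sums over $t$.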
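Summary (again)

Statistical model yields least-squares criterion:

$$\arg\min_{\tilde f} \sum_{i=1}^{n} \left(\tilde f(x_i) - y_i\right)^2$$

Linear function space leads to quadratic objective:

$$\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x) \quad\Rightarrow\quad \arg\min_{\lambda} \sum_{i=1}^{n} \left(\sum_{j=1}^{k} \lambda_j b_j(x_i) - y_i\right)^2$$

Critical point: linear system

$$\begin{pmatrix} \langle b_1, b_1\rangle & \cdots & \langle b_1, b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k, b_1\rangle & \cdots & \langle b_k, b_k\rangle \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_k \end{pmatrix} = \begin{pmatrix} \langle y, b_1\rangle \\ \vdots \\ \langle y, b_k\rangle \end{pmatrix}$$

with:

$$\langle b_i, b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\, b_j(x_t), \qquad \langle y, b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\, y_t$$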
Variants

Weighted least squares:
• Applies when the noise has a different standard deviation $\sigma_i$ at each data point
• This gives a weighted least squares problem
• Noisier points have smaller influence
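Same procedure as prev. slides...

$$\begin{aligned}
\arg\max_{\tilde f} \prod_{i=1}^{n} N_{0,\sigma_i}\!\left(\tilde f(x_i) - y_i\right)
&= \arg\max_{\tilde f} \prod_{i=1}^{n} \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(-\frac{(\tilde f(x_i) - y_i)^2}{2\sigma_i^2}\right) \\
&= \arg\max_{\tilde f} \log \prod_{i=1}^{n} \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(-\frac{(\tilde f(x_i) - y_i)^2}{2\sigma_i^2}\right) \\
&= \arg\max_{\tilde f} \sum_{i=1}^{n} \left[\log\frac{1}{\sigma_i\sqrt{2\pi}} - \frac{(\tilde f(x_i) - y_i)^2}{2\sigma_i^2}\right] \\
&= \arg\min_{\tilde f} \sum_{i=1}^{n} \frac{(\tilde f(x_i) - y_i)^2}{2\sigma_i^2} \\
&= \arg\min_{\tilde f} \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \left(\tilde f(x_i) - y_i\right)^2
\end{aligned}$$

The factors $1/\sigma_i^2$ are the weights.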
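The effect of the weights $1/\sigma_i^2$ can be illustrated with a line fit. The sketch below assumes the basis $\{1, x\}$ and solves the weighted 2×2 normal equations in closed form; the function name `weighted_line_fit` and the data (points on $y = 2x + 1$ with one corrupted value) are hypothetical.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Fit f~(x) = intercept + slope * x, weighting each squared
    residual by 1 / sigma_i^2."""
    w = [1.0 / (s * s) for s in sigmas]
    # Weighted inner products of the basis {1, x} and of the data.
    s_w = sum(w)
    s_x = sum(wi * x for wi, x in zip(w, xs))
    s_xx = sum(wi * x * x for wi, x in zip(w, xs))
    s_y = sum(wi * y for wi, y in zip(w, ys))
    s_xy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    # Solve [[s_w, s_x], [s_x, s_xx]] (intercept, slope)^T = (s_y, s_xy)^T.
    det = s_w * s_xx - s_x * s_x
    intercept = (s_xx * s_y - s_x * s_xy) / det
    slope = (s_w * s_xy - s_x * s_y) / det
    return intercept, slope

# Hypothetical data: y = 2x + 1, but the last value is off by +10.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 19.0]

# Equal sigmas reduce to ordinary least squares.
_, slope_u = weighted_line_fit(xs, ys, [1.0] * 5)
# Declaring the last point much noisier reduces its influence.
_, slope_w = weighted_line_fit(xs, ys, [0.1, 0.1, 0.1, 0.1, 10.0])
```

With equal standard deviations the corrupted point pulls the slope far away from 2, while the large `sigma` assigned to it in the second call keeps the weighted slope close to 2.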