


1. Quiz
Let $q_1, \ldots, q_n$ be orthonormal vectors in $\mathbb{R}^m$. Let $V = \operatorname{Span}\{q_1, \ldots, q_n\}$.
◮ What does "orthonormal" mean?
◮ Show: There is a matrix $M$ such that, for any vector $b$ in $\mathbb{R}^m$, the coordinate representation of $b^{\|V}$ in terms of $q_1, \ldots, q_n$ can be written as $Mb$. Be sure to explain.

2. Projection onto columns of a column-orthogonal matrix
Suppose $q_1, \ldots, q_n$ are orthonormal vectors. The projection of $b$ onto $q_j$ is $b^{\|q_j} = \sigma_j q_j$ where
$$\sigma_j = \frac{\langle q_j, b \rangle}{\langle q_j, q_j \rangle} = \langle q_j, b \rangle$$
The vector $[\sigma_1, \ldots, \sigma_n]$ can be written using the dot-product definition of matrix-vector multiplication:
$$\begin{bmatrix} \sigma_1 \\ \vdots \\ \sigma_n \end{bmatrix} = \begin{bmatrix} q_1 \cdot b \\ \vdots \\ q_n \cdot b \end{bmatrix} = \begin{bmatrix} q_1^T \\ \vdots \\ q_n^T \end{bmatrix} b$$
and the linear combination can be written as
$$\sigma_1 q_1 + \cdots + \sigma_n q_n = \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix} \begin{bmatrix} \sigma_1 \\ \vdots \\ \sigma_n \end{bmatrix}$$
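A minimal numpy sketch of this computation (the matrix $Q$, vector $b$, and variable names are illustrative, not from the slides; the columns of $Q$ are assumed orthonormal):

```python
import numpy as np

# Columns of Q are two orthonormal vectors in R^3.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b = np.array([3.0, 4.0, 5.0])

sigma = Q.T @ b      # coordinates sigma_j = <q_j, b>
b_par = Q @ sigma    # sigma_1 q_1 + ... + sigma_n q_n, the projection of b onto Span{q_1, ..., q_n}
print(b_par)         # [3. 4. 0.]
```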

3. Towards QR factorization
Orthogonalization of the columns of a matrix $A$ gives us a representation of $A$ as a product of
◮ a matrix with mutually orthogonal columns
◮ an invertible triangular matrix
$$\begin{bmatrix} v_1 & v_2 & v_3 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} v_1^* & v_2^* & v_3^* & \cdots & v_n^* \end{bmatrix} \begin{bmatrix} 1 & \alpha_{12} & \alpha_{13} & \cdots & \alpha_{1n} \\ & 1 & \alpha_{23} & \cdots & \alpha_{2n} \\ & & 1 & \cdots & \alpha_{3n} \\ & & & \ddots & \vdots \\ & & & 1 & \alpha_{n-1,n} \\ & & & & 1 \end{bmatrix}$$
Suppose the columns $v_1, \ldots, v_n$ are linearly independent. Then $v_1^*, \ldots, v_n^*$ are nonzero.
◮ Normalize $v_1^*, \ldots, v_n^*$ (the resulting matrix is called $Q$).
◮ To compensate, scale the rows of the triangular matrix (the resulting matrix is $R$).
The result is the QR factorization: $Q$ is a column-orthogonal matrix and $R$ is an upper-triangular matrix.

4. Towards QR factorization
Orthogonalization of the columns of a matrix $A$ gives us a representation of $A$ as a product of
◮ a matrix with mutually orthogonal columns
◮ an invertible triangular matrix
$$\begin{bmatrix} v_1 & v_2 & v_3 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} q_1 & q_2 & q_3 & \cdots & q_n \end{bmatrix} \begin{bmatrix} \|v_1^*\| & \beta_{12} & \beta_{13} & \cdots & \beta_{1n} \\ & \|v_2^*\| & \beta_{23} & \cdots & \beta_{2n} \\ & & \|v_3^*\| & \cdots & \beta_{3n} \\ & & & \ddots & \vdots \\ & & & & \beta_{n-1,n} \\ & & & & \|v_n^*\| \end{bmatrix}$$
Suppose the columns $v_1, \ldots, v_n$ are linearly independent. Then $v_1^*, \ldots, v_n^*$ are nonzero.
◮ Normalize $v_1^*, \ldots, v_n^*$ (the resulting matrix is called $Q$).
◮ To compensate, scale the rows of the triangular matrix (the resulting matrix is $R$).
The result is the QR factorization: $Q$ is a column-orthogonal matrix and $R$ is an upper-triangular matrix.
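A minimal Gram-Schmidt sketch of this construction, assuming the columns of $A$ are linearly independent (the function name and test matrix are illustrative, not from the slides):

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR via classical Gram-Schmidt; assumes the columns of A are linearly independent."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v_star = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # coefficient beta_ij
            v_star -= R[i, j] * Q[:, i]   # subtract the projection onto q_i
        R[j, j] = np.linalg.norm(v_star)  # ||v_j^*|| goes on the diagonal
        Q[:, j] = v_star / R[j, j]        # normalize v_j^*
    return Q, R

A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(Q @ R, A))   # True: A = QR
```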

5. Using the QR factorization to solve a matrix equation $Ax = b$
First suppose $A$ is square and its columns are linearly independent. Then $A$ is invertible. It follows that there is a solution (because we can write $x = A^{-1} b$).
QR Solver Algorithm to find the solution in this case:
Find $Q$, $R$ such that $A = QR$, where $Q$ is column-orthogonal and $R$ is triangular.
Compute the vector $c = Q^T b$.
Solve $Rx = c$ using backward substitution, and return the solution.
Why is this correct?
◮ Let $\hat{x}$ be the solution returned by the algorithm.
◮ We have $R\hat{x} = Q^T b$.
◮ Multiply both sides by $Q$: $Q(R\hat{x}) = Q(Q^T b)$.
◮ Use associativity: $(QR)\hat{x} = (QQ^T) b$.
◮ Substitute $A$ for $QR$: $A\hat{x} = (QQ^T) b$.
◮ Since $Q$ and $Q^T$ are inverses (here $Q$ is square), $QQ^T$ is the identity matrix: $A\hat{x} = \mathbb{1} b$.
Thus $A\hat{x} = b$.
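A sketch of this algorithm in Python, assuming a square $A$ with linearly independent columns (the helper names and test system are illustrative; numpy's built-in QR is used for the factorization step):

```python
import numpy as np

def back_substitution(R, c):
    """Solve Rx = c for upper-triangular R."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x

def qr_solve(A, b):
    """QR Solver Algorithm: factor A = QR, form c = Q^T b, back-substitute on Rx = c."""
    Q, R = np.linalg.qr(A)   # Q column-orthogonal, R upper triangular
    c = Q.T @ b
    return back_substitution(R, c)

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(qr_solve(A, b))        # [0.8 1.4], the solution of Ax = b
```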

6. Solving $Ax = b$
What if the columns of $A$ are not independent? Let $v_1, v_2, v_3, v_4$ be the columns of $A$. Suppose $v_1, v_2, v_3, v_4$ are linearly dependent. Then there is a basis consisting of a subset, say $v_1, v_2, v_4$:
$$\left\{ \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} : x_1, x_2, x_3, x_4 \in \mathbb{R} \right\} = \left\{ \begin{bmatrix} v_1 & v_2 & v_4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_4 \end{bmatrix} : x_1, x_2, x_4 \in \mathbb{R} \right\}$$
Therefore: if there is a solution to $Ax = b$ then there is a solution to $A'x' = b$, where the columns of $A'$ are a subset of the columns of $A$ forming a basis (and $x'$ consists of the corresponding variables). So solve $A'x' = b$ instead.
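One way to pick such a subset is to keep each column whose component orthogonal to the previously kept columns is nonzero. This is a hypothetical helper sketched here under that assumption, not an algorithm from the slides; the tolerance handling is illustrative:

```python
import numpy as np

def basis_columns(A, tol=1e-12):
    """Return indices of a subset of columns of A forming a basis for Col A."""
    kept = []    # indices of columns kept so far
    basis = []   # orthonormal vectors spanning the kept columns
    for j in range(A.shape[1]):
        v = A[:, j].copy()
        for q in basis:
            v -= (q @ A[:, j]) * q        # remove the projection onto the kept columns
        if np.linalg.norm(v) > tol:       # a nonzero component remains: keep this column
            basis.append(v / np.linalg.norm(v))
            kept.append(j)
    return kept

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])           # third column = first + second
print(basis_columns(A))                   # [0, 1]
```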

7. The least squares problem
Suppose $A$ is an $m \times n$ matrix and its columns are linearly independent. Since each column is an $m$-vector, the dimension of the column space is at most $m$, so $n \le m$. What if $n < m$? How can we solve the matrix equation $Ax = b$? For example:
$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 & 10 \\ 11 & 12 & 13 & 14 & 15 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = b \qquad \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b$$
Remark: There might not be a solution:
◮ Define $f : \mathbb{R}^n \rightarrow \mathbb{R}^m$ by $f(x) = Ax$.
◮ The dimension of $\operatorname{Im} f$ is $n$.
◮ The dimension of the co-domain is $m$.
◮ Thus $f$ is not onto.
Goal: An algorithm that, given a matrix $A$ whose columns are linearly independent and given $b$, finds the vector $\hat{x}$ minimizing $\|b - A\hat{x}\|$.
Solution: The same algorithm as we used for square $A$ (a sketch follows below).
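A minimal numpy sketch of this least-squares computation for a tall system ($n < m$); the matrix and right-hand side are illustrative, not from the slides:

```python
import numpy as np

# Tall system: generally no exact solution, so we minimize ||b - A x||.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

Q, R = np.linalg.qr(A)                 # reduced QR: Q is 3x2, R is 2x2
x_hat = np.linalg.solve(R, Q.T @ b)    # equivalent to back substitution on R x = Q^T b
print(x_hat)                           # least-squares solution
print(np.linalg.norm(b - A @ x_hat))   # the residual norm being minimized
```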

8. The least squares problem
Recall the High-Dimensional Fire Engine Lemma: The point in a vector space $V$ closest to $b$ is $b^{\|V}$, and the distance is $\|b^{\perp V}\|$.
Given the equation $Ax = b$, let $V$ be the column space of $A$. We need to show that the QR Solver Algorithm returns a vector $\hat{x}$ such that $A\hat{x} = b^{\|V}$.

9. Projection onto columns of a column-orthogonal matrix (recall)
Suppose $q_1, \ldots, q_n$ are orthonormal vectors. The projection of $b$ onto $q_j$ is $b^{\|q_j} = \sigma_j q_j$ where $\sigma_j = \langle q_j, b \rangle$. The vector $[\sigma_1, \ldots, \sigma_n]$ is $Q^T b$ by the dot-product definition of matrix-vector multiplication, and the linear combination $\sigma_1 q_1 + \cdots + \sigma_n q_n$ is $\begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix} \begin{bmatrix} \sigma_1 & \cdots & \sigma_n \end{bmatrix}^T$.

10. QR Solver Algorithm for $Ax \approx b$
Summary:
◮ $QQ^T b = b^{\|}$
Proposed algorithm:
Find $Q$, $R$ such that $A = QR$, where $Q$ is column-orthogonal and $R$ is triangular.
Compute the vector $c = Q^T b$.
Solve $Rx = c$ using backward substitution, and return the solution $\hat{x}$.
Goal: To show that the solution $\hat{x}$ returned is the vector that minimizes $\|b - A\hat{x}\|$.
◮ Every vector of the form $Ax$ is in $\operatorname{Col} A$ ($= \operatorname{Col} Q$).
◮ By the High-Dimensional Fire Engine Lemma, the vector in $\operatorname{Col} A$ closest to $b$ is $b^{\|}$, the projection of $b$ onto $\operatorname{Col} A$.
◮ The solution $\hat{x}$ satisfies $R\hat{x} = Q^T b$.
◮ Multiply by $Q$: $QR\hat{x} = QQ^T b = b^{\|}$. Therefore $A\hat{x} = b^{\|}$.
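A small numerical check of this argument, verifying that $A\hat{x}$ equals the projection $QQ^T b$ (the matrix and vector are illustrative, not from the slides):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

Q, R = np.linalg.qr(A)
x_hat = np.linalg.solve(R, Q.T @ b)        # solve R x = Q^T b

b_parallel = Q @ (Q.T @ b)                 # projection of b onto Col A = Col Q
print(np.allclose(A @ x_hat, b_parallel))  # True: A x_hat is the closest point to b in Col A
```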

11. Least squares when columns are linearly dependent?
This comes up, e.g., in ranking sports teams. It requires a more sophisticated algorithm; we'll see it soon.

12. The Normal Equations
Let $A$ be a matrix with linearly independent columns, and let $QR$ be its QR factorization. We have given one algorithm for solving the least-squares problem $Ax \approx b$:
Find $Q$, $R$ such that $A = QR$, where $Q$ is column-orthogonal and $R$ is triangular.
Compute the vector $c = Q^T b$.
Solve $Rx = c$ using backward substitution, and return the solution $\hat{x}$.
However, there are other ways to find the solution. It is not hard to show that
◮ $A^T A$ is an invertible matrix
◮ The solution to the matrix-vector equation $(A^T A)x = A^T b$ is the solution to the least-squares problem $Ax \approx b$
◮ One can use another method (e.g. Gaussian elimination) to solve $(A^T A)x = A^T b$
The linear equations making up $A^T A x = A^T b$ are called the normal equations.
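A brief sketch comparing the two approaches on illustrative data (any $A$ with linearly independent columns will do):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([2.0, 3.0, 5.0, 6.0])

# Normal equations: (A^T A) x = A^T b, solved here by a generic linear solver.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# QR-based solution for comparison.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

print(np.allclose(x_normal, x_qr))   # True: both give the least-squares solution
```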

13. Application of least squares: linear regression
Finding the line that best fits some two-dimensional data. Data on age versus brain mass from the Bureau of Made-up Numbers:

age | brain mass
45  | 4 lbs.
55  | 3.8
65  | 3.75
75  | 3.5
85  | 3.3

Let $f(x)$ be the function that predicts brain mass for someone of age $x$.
Hypothesis: after age 45, brain mass decreases linearly with age, i.e. $f(x) = mx + b$ for some numbers $m$, $b$.
Goal: find $m$, $b$ so as to minimize the sum of squares of prediction errors.
The observations are $(x_1, y_1) = (45, 4)$, $(x_2, y_2) = (55, 3.8)$, $(x_3, y_3) = (65, 3.75)$, $(x_4, y_4) = (75, 3.5)$, $(x_5, y_5) = (85, 3.3)$.
The prediction error on the $i$th observation is $|f(x_i) - y_i|$. The sum of squares of prediction errors is $\sum_i (f(x_i) - y_i)^2$.
For each observation, measure the difference between the predicted and observed $y$-value. In this application, this difference is measured in pounds. Measuring the distance from the point to the line wouldn't make sense.
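A minimal sketch setting this up as a least-squares problem $A \, [m, b]^T \approx y$ with the data above, solved via QR as in the QR Solver Algorithm (variable names are illustrative):

```python
import numpy as np

ages = np.array([45.0, 55.0, 65.0, 75.0, 85.0])
mass = np.array([4.0, 3.8, 3.75, 3.5, 3.3])

# Each row of A encodes one observation: f(x_i) = m * x_i + b.
A = np.column_stack([ages, np.ones_like(ages)])

Q, R = np.linalg.qr(A)
m_hat, b_hat = np.linalg.solve(R, Q.T @ mass)
print(m_hat, b_hat)   # slope and intercept minimizing the sum of squared prediction errors
```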

