Compression, inversion and sparse approximate PCA of dense kernel matrices - PowerPoint PPT Presentation



SLIDE 1

Compression, inversion and sparse approximate PCA of dense kernel matrices in near linear computational complexity

Florian Schäfer ICERM 2017

  • F. Schäfer, T.J. Sullivan, H. Owhadi

Sparse factorisation of dense Kernel matrices June 8th 2017 1 / 130

SLIDE 2

Compression, inversion and approximate PCA of dense kernel matrices in near linear computational complexity Florian Schäfer, T.J. Sullivan, Houman Owhadi http://arxiv.org/abs/1706.02205

SLIDE 3

Outline

1

A numerical experiment

2

Disintegration of measure and Gaussian elimination

3

Near-linear complexity algorithms using the theory of Gamblets

4

Further numerical results

SLIDE 4

A numerical experiment

{xi}i∈I ⊂ [0, 1]², with #I = N = 16641

SLIDE 5

A numerical experiment

{xi}i∈I ⊂ [0, 1]², with #I = N = 16641. Define K(r) as the Matérn kernel with smoothness parameter ν = 1 and lengthscale l = 0.4.

SLIDE 6

A numerical experiment

{xi}i∈I ⊂ [0, 1]², with #I = N = 16641. Define K(r) as the Matérn kernel with smoothness parameter ν = 1 and lengthscale l = 0.4. Set

Γi,j := K(|xi − xj|).
SLIDE 7

A numerical experiment

Γ, interpreted as a covariance matrix, describes a Gaussian field with second order smoothness.

SLIDE 8

A numerical experiment

Γ, interpreted as a covariance matrix, describes a Gaussian field with second order smoothness. Alternatively, K can be seen as the Green's function of a fourth order elliptic PDE, on the whole space.
SLIDE 9

A numerical experiment

Γ, interpreted as a covariance matrix, describes a Gaussian field with fourth order smoothness. Alternatively, K can be seen as the Green's function of a fourth order elliptic PDE, on the whole space.

Matrices of this kind appear in both statistics and scientific computing.

SLIDE 10

A numerical experiment

Γ, interpreted as a covariance matrix, describes a Gaussian field with fourth order smoothness. Alternatively, K can be seen as the Green's function of a fourth order elliptic PDE, on the whole space.

Matrices of this kind appear in both statistics and scientific computing. We need to apply the matrix and its inverse, and compute its determinant.

SLIDE 11

A numerical experiment

Γ, interpreted as a covariance matrix, describes a Gaussian field with fourth order smoothness. Alternatively, K can be seen as the Green's function of a fourth order elliptic PDE, on the whole space.

Matrices of this kind appear in both statistics and scientific computing. We need to apply the matrix and its inverse, and compute its determinant. Γ is dense, and hence has O(N²) storage cost. Direct inversion via Gaussian elimination has O(N³) complexity in time.

SLIDE 12

A numerical experiment

Can we be more efficient?

SLIDE 13

A numerical experiment

Can we be more efficient? Many existing methods: quadrature formulas, subsampling, randomised approximations, low rank approximations, fast multipole methods, hierarchical matrices, wavelet methods, inducing points, covariance tapering, ...

SLIDE 14

A numerical experiment

Can we be more efficient? Many existing methods: quadrature formulas, subsampling, randomised approximations, low rank approximations, fast multipole methods, hierarchical matrices, wavelet methods, inducing points, covariance tapering, ... We provide a simple algorithm, with rigorous error bounds and near-linear complexity.

SLIDE 15

A numerical experiment

Even writing down the matrix has O(N²) complexity.

SLIDE 16

A numerical experiment

Even writing down the matrix has O(N²) complexity. Therefore, we subsample Γ:

Γ̃i,j := Γi,j, for (i, j) ∈ S; 0, else.

SLIDE 17

A numerical experiment

Even writing down the matrix has O(N²) complexity. Therefore, we subsample Γ:

Γ̃i,j := Γi,j, for (i, j) ∈ S; 0, else.

#S = 5528749 ≈ 0.0189 N². We have thrown away all but 2 percent of the entries, without even touching them!

SLIDE 18

A numerical experiment

Even writing down the matrix has O(N²) complexity. Therefore, we subsample Γ:

Γ̃i,j := Γi,j, for (i, j) ∈ S; 0, else.

#S = 5528749 ≈ 0.0189 N². We have thrown away all but 2 percent of the entries, without even touching them! We will see later: S does not depend on the entries of Γ.
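The subsampling step above can be sketched numerically. This is a hedged toy example: a small random point set with an exponential kernel stands in for the talk's N = 16641 Matérn setup, and the purely distance-based pattern here is only illustrative.

```python
import numpy as np

# a tiny stand-in for the subsampling step: keep only entries of the
# kernel matrix whose index pair lies in a sparsity pattern
rng = np.random.default_rng(0)
pts = rng.random((200, 2))                       # {x_i} in [0, 1]^2
d = np.linalg.norm(pts[:, None] - pts[None], axis=-1)
gamma = np.exp(-d)                               # exponential kernel (stand-in)
mask = d <= 0.2                                  # toy distance-based pattern S
gamma_tilde = np.where(mask, gamma, 0.0)         # entries outside S are dropped
print(mask.mean())                               # fraction of entries kept
```

Crucially, the pattern only depends on the point locations, so the discarded entries never need to be computed.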

SLIDE 19

A numerical experiment

We have compressed Γ to 2 percent of its original size.

SLIDE 20

A numerical experiment

We have compressed Γ to 2 percent of its original size. How much information have we retained?

SLIDE 21

A numerical experiment

We have compressed Γ to 2 percent of its original size. How much information have we retained? Consider relative error in operator norm:

SLIDE 22

A numerical experiment

We have compressed Γ to 2 percent of its original size. How much information have we retained? Consider the relative error in operator norm:

‖Γ − Γ̃‖ / ‖Γ‖ = 0.9662

SLIDE 23

A numerical experiment

We have compressed Γ to 2 percent of its original size. How much information have we retained? Consider the relative error in operator norm:

‖Γ − Γ̃‖ / ‖Γ‖ = 0.9662

Γ̃ is a very bad approximation of Γ.

SLIDE 24

A numerical experiment

Can we obtain a better approximation of Γ, using only Γ̃?

SLIDE 25

A numerical experiment

Can we obtain a better approximation of Γ, using only Γ̃? Consider L = ICHOL(Γ̃), the incomplete Cholesky factorisation of Γ̃, ignoring all fill-in.

SLIDE 26

A numerical experiment

Can we obtain a better approximation of Γ, using only Γ̃? Consider L = ICHOL(Γ̃), the incomplete Cholesky factorisation of Γ̃, ignoring all fill-in.

‖Γ − L Lᵀ‖ / ‖Γ‖ = 3.0676e−04

SLIDE 27

A numerical experiment

Can we obtain a better approximation of Γ, using only Γ̃? Consider L = ICHOL(Γ̃), the incomplete Cholesky factorisation of Γ̃, ignoring all fill-in.

‖Γ − L Lᵀ‖ / ‖Γ‖ = 3.0676e−04
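An incomplete Cholesky factorisation that discards all fill-in can be sketched as follows. This is a minimal dense sketch under stated assumptions: the function name `ichol`, the boolean `pattern` argument, and the tiny demo matrix are introduced here for illustration and are not the talk's implementation.

```python
import numpy as np

def ichol(A, pattern):
    """Incomplete Cholesky of an SPD matrix A, keeping only entries in
    `pattern` (a boolean lower-triangular mask, diagonal included);
    all fill-in outside the pattern is discarded."""
    n = A.shape[0]
    L = np.tril(np.where(pattern, A, 0.0))
    for k in range(n):
        L[k, k] = np.sqrt(L[k, k])
        L[k + 1:, k] /= L[k, k]
        for j in range(k + 1, n):
            if L[j, k] != 0.0:
                # right-looking update, restricted to the pattern
                L[j:, j] -= np.where(pattern[j:, j], L[j:, k] * L[j, k], 0.0)
    return L

# with the full lower-triangular pattern, ichol reduces to the exact
# Cholesky factorisation of this tiny SPD matrix
A = np.array([[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]])
L = ichol(A, np.tril(np.ones((3, 3), dtype=bool)))
print(np.allclose(L @ L.T, A))  # True
```

With a genuinely sparse pattern such as the multiscale S above, L Lᵀ only approximates the matrix, which is exactly the trade-off the experiment quantifies.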

SLIDE 28

A numerical experiment

Decompose {xi}i∈I into a nested hierarchy as: {xi}i∈I(1) ⊂ {xi}i∈I(2) ⊂ {xi}i∈I(3) ⊂ · · · ⊂ {xi}i∈I(q) = {xi}i∈I (1.1)

SLIDE 29

A numerical experiment

We define J(k) := I(k) \ I(k−1) and define the sparsity pattern:

S := { (i, j) ∈ I × I : i ∈ J(k), j ∈ J(l), dist(xi, xj) ≤ 2 · 2^(−min(k,l)) }.

SLIDE 30

A numerical experiment

We define J(k) := I(k) \ I(k−1) and define the sparsity pattern:

S := { (i, j) ∈ I × I : i ∈ J(k), j ∈ J(l), dist(xi, xj) ≤ 2 · 2^(−min(k,l)) }.

We order the elements of I from coarse to fine, that is from J(1) to J(q).
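The multiscale pattern can be sketched in a few lines. The nested hierarchy used below (regular grids on [0, 1]² with spacing 2^(−k)) is an assumption made here for illustration; `sparsity_pattern` and the level bookkeeping are names introduced in this sketch.

```python
import numpy as np

def sparsity_pattern(points, levels):
    """points: (N, 2) array; levels[i] is the level k at which point i
    first appears (i.e. i is in J(k)). Returns the boolean pattern S:
    (i, j) in S iff dist(x_i, x_j) <= 2 * 2^(-min(k, l))."""
    d = np.linalg.norm(points[:, None] - points[None], axis=-1)
    lev = np.minimum(levels[:, None], levels[None])
    return d <= 2.0 * 2.0 ** (-lev)

# nested hierarchy of regular grids on [0, 1]^2 (illustrative choice):
# level k has spacing 2^(-k); a point belongs to the level where it
# first appears
pts, lev, seen = [], [], set()
for k in range(1, 4):
    h = 2.0 ** (-k)
    for x in np.arange(0.0, 1.0 + h / 2, h):
        for y in np.arange(0.0, 1.0 + h / 2, h):
            key = (round(x, 9), round(y, 9))
            if key not in seen:
                seen.add(key)
                pts.append((x, y))
                lev.append(k)
pts, lev = np.array(pts), np.array(lev)
S = sparsity_pattern(pts, lev)
print(S.mean())  # fraction of retained entries
```

Coarse pairs interact over long ranges while fine pairs interact only locally, which is what keeps #S near-linear in N.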

SLIDE 31

A numerical experiment

L provides a good approximation of Γ at only 2 percent of the storage cost.

SLIDE 32

A numerical experiment

L provides a good approximation of Γ at only 2 percent of the storage cost. Can be computed in near linear complexity in time and space.

SLIDE 33

A numerical experiment

L provides a good approximation of Γ at only 2 percent of the storage cost. Can be computed in near linear complexity in time and space. Allows for approximate evaluation of Γ, Γ−1, and det (Γ) in near-linear time.

SLIDE 34

A numerical experiment

L provides a good approximation of Γ at only 2 percent of the storage cost. Can be computed in near linear complexity in time and space. Allows for approximate evaluation of Γ, Γ−1, and det (Γ) in near-linear time. Allows for sampling of X ∼ N (0, Γ) in near-linear time.
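The sampling claim rests on the standard fact that X = L z with z ~ N(0, Id) has covariance L Lᵀ. A hedged sketch, with the exact Cholesky factor of a small random SPD matrix standing in for the sparse approximate factor:

```python
import numpy as np

# sample from N(0, Gamma) using a factor L with L L^T = Gamma
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
gamma = A @ A.T + 4 * np.eye(4)                  # random SPD stand-in for Gamma
L = np.linalg.cholesky(gamma)
z = rng.standard_normal((4, 50000))              # i.i.d. standard normal draws
X = L @ z                                        # samples of N(0, L L^T)
emp = X @ X.T / z.shape[1]                       # empirical covariance
print(np.abs(emp - gamma).max())                 # small sampling error
```

Because applying a sparse triangular L costs only O(#S) operations, each sample is cheap once the factor is available.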

SLIDE 35

A numerical experiment

In this work, we

SLIDE 36

A numerical experiment

In this work, we prove that this phenomenon holds whenever the covariance function K is the Green's function of an elliptic boundary value problem.

SLIDE 37

A numerical experiment

In this work, we prove that this phenomenon holds whenever the covariance function K is the Green's function of an elliptic boundary value problem; prove that it leads to an algorithm with computational complexity of O(N log²(N) (log(1/ε) + log²(N))^(4d+1)) in time and O(N log(N) log^d(N/ε)) in space, for an approximation error of ε.
SLIDE 38

A numerical experiment

In this work, we prove that this phenomenon holds whenever the covariance function K is the Green's function of an elliptic boundary value problem; prove that it leads to an algorithm with computational complexity of O(N log²(N) (log(1/ε) + log²(N))^(4d+1)) in time and O(N log(N) log^d(N/ε)) in space.

show that even though the Matérn family is not covered rigorously by our theoretical results, we get good approximation results, in particular in the interior of the domain.

SLIDE 39

A numerical experiment

In this work, we prove that this phenomenon holds whenever the covariance function K is the Green's function of an elliptic boundary value problem; prove that it leads to an algorithm with computational complexity of O(N log²(N) (log(1/ε) + log²(N))^(4d+1)) in time and O(N log(N) log^d(N/ε)) in space.

show that even though the Matérn family is not covered rigorously by our theoretical results, we get good approximation results, in particular in the interior of the domain. show that as a byproduct of our algorithm we obtain a sparse approximate PCA with near optimal approximation property.

SLIDE 40

Disintegration of Gaussian Measures and the Screening Effect

Let X be a centered Gaussian vector with covariance Θ.

SLIDE 41

Disintegration of Gaussian Measures and the Screening Effect

Let X be a centered Gaussian vector with covariance Θ. Assume we want to compute E [f (X)] for some function f.

SLIDE 42

Disintegration of Gaussian Measures and the Screening Effect

Let X be a centered Gaussian vector with covariance Θ. Assume we want to compute E [f (X)] for some function f. Use Monte Carlo, but for Θ large, each sample is expensive.

SLIDE 43

Disintegration of Gaussian Measures and the Screening Effect

Let X be a centered Gaussian vector with covariance Θ. Assume we want to compute E[f(X)] for some function f. Use Monte Carlo; but for Θ large, each sample is expensive. Idea: use disintegration of measure: E[f(X)] = E[ E[f(X) | Y] ].

SLIDE 44

Disintegration of Gaussian Measures and the Screening Effect

Let X be a centered Gaussian vector with covariance Θ. Assume we want to compute E[f(X)] for some function f. Use Monte Carlo; but for Θ large, each sample is expensive. Idea: use disintegration of measure: E[f(X)] = E[ E[f(X) | Y] ]. Choose Y such that Y and E[f(X) | Y] can be sampled cheaply.

SLIDE 45

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).
SLIDE 46

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).

Corresponds to a prior on the space H¹(0, 1) of mean square differentiable functions.

SLIDE 47

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).

Corresponds to a prior on the space H¹(0, 1) of mean square differentiable functions. Assume the xi are ordered in increasing order and x⌊N/2⌋ ≈ 1/2.

SLIDE 48

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).

Corresponds to a prior on the space H¹(0, 1) of mean square differentiable functions. Assume the xi are ordered in increasing order and x⌊N/2⌋ ≈ 1/2. We then have, for i < ⌊N/2⌋ < j:

Cov[Xi, Xj | X⌊N/2⌋] ≈ 0.
SLIDE 49

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).

Corresponds to a prior on the space H¹(0, 1) of mean square differentiable functions. Assume the xi are ordered in increasing order and x⌊N/2⌋ ≈ 1/2. We then have, for i < ⌊N/2⌋ < j:

Cov[Xi, Xj | X⌊N/2⌋] ≈ 0.
SLIDE 50

Disintegration of Gaussian Measures and the Screening Effect

Consider X ∈ R^N, {xi}1≤i≤N ⊂ [0, 1] and Θi,j := exp(−|xi − xj|).

Corresponds to a prior on the space H¹(0, 1) of mean square differentiable functions. Assume the xi are ordered in increasing order and x⌊N/2⌋ ≈ 1/2. We then have, for i < ⌊N/2⌋ < j:

Cov[Xi, Xj | X⌊N/2⌋] ≈ 0.
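For this Markovian kernel the screening is in fact exact: conditioning on the middle point annihilates the covariance across it, which a short computation confirms. The grid of 101 points below is an illustrative choice.

```python
import numpy as np

# screening effect for the kernel exp(-|x_i - x_j|) on [0, 1]:
# covariance across the midpoint vanishes after conditioning on it
x = np.linspace(0.0, 1.0, 101)                   # x_50 = 0.5
theta = np.exp(-np.abs(x[:, None] - x[None]))
m = 50
# conditional covariance: Theta_ij - Theta_im Theta_mj / Theta_mm
cond = theta - np.outer(theta[:, m], theta[m]) / theta[m, m]
left, right = np.arange(0, m), np.arange(m + 1, 101)
print(np.abs(cond[np.ix_(left, right)]).max())   # ~ 0 up to round-off
```

For non-Markovian kernels the cross-covariance does not vanish exactly, but it decays rapidly, which is the effect the factorisation exploits.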
SLIDE 51

Disintegration of Gaussian Measures and the Screening Effect

For two observation sites xi, xj, the covariance conditional on the observation sites in between is small.
SLIDE 52

Disintegration of Gaussian Measures and the Screening Effect

For two observation sites xi, xj, the covariance conditional on the observation sites in between is small.

Known as screening effect in the spatial statistics community. Analysed by Stein (2002). Used, among others, by Banerjee et al. (2008) and Katzfuss (2015) for efficient approximation of Gaussian processes.

SLIDE 53

Disintegration of Gaussian Measures and the Screening Effect

For two observation sites xi, xj, the covariance conditional on the observation sites in between is small.

Known as screening effect in the spatial statistics community. Analysed by Stein (2002). Used, among others, by Banerjee et al. (2008) and Katzfuss (2015) for efficient approximation of Gaussian processes. Let us take Y = X⌊N/2⌋. Then Y is cheap to sample, and the covariance matrix of X|Y has only 2(N/2)² non-negligible entries.

SLIDE 54

Disintegration of Gaussian Measures and the Screening Effect

For two observation sites xi, xj, the covariance conditional on the observation sites in between is small.

Known as screening effect in the spatial statistics community. Analysed by Stein (2002). Used, among others, by Banerjee et al. (2008) and Katzfuss (2015) for efficient approximation of Gaussian processes. Let us take Y = X⌊N/2⌋. Then Y is cheap to sample, and the covariance matrix of X|Y has only 2(N/2)² non-negligible entries. When using Cholesky decomposition, this yields a factor 4 improvement in computational speed.

SLIDE 55

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Look at a single step of Block Cholesky decomposition:

SLIDE 56

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Look at a single step of block Cholesky decomposition. This corresponds to:

[ Θ11  Θ12 ]   [ Id          0  ] [ Θ11  0                    ] [ Id  Θ11⁻¹ Θ12 ]
[ Θ21  Θ22 ] = [ Θ21 Θ11⁻¹  Id ] [ 0    Θ22 − Θ21 Θ11⁻¹ Θ12 ] [ 0   Id        ]

SLIDE 57

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Look at a single step of block Cholesky decomposition. This corresponds to:

[ Θ11  Θ12 ]   [ Id          0  ] [ Θ11  0                    ] [ Id  Θ11⁻¹ Θ12 ]
[ Θ21  Θ22 ] = [ Θ21 Θ11⁻¹  Id ] [ 0    Θ22 − Θ21 Θ11⁻¹ Θ12 ] [ 0   Id        ]

Note that Θ21 Θ11⁻¹ b = E[X2 | X1 = b], and Θ22 − Θ21 Θ11⁻¹ Θ12 = Cov[X2 | X1].

SLIDE 58

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Look at a single step of block Cholesky decomposition. This corresponds to:

[ Θ11  Θ12 ]   [ Id          0  ] [ Θ11  0                    ] [ Id  Θ11⁻¹ Θ12 ]
[ Θ21  Θ22 ] = [ Θ21 Θ11⁻¹  Id ] [ 0    Θ22 − Θ21 Θ11⁻¹ Θ12 ] [ 0   Id        ]

Note that Θ21 Θ11⁻¹ b = E[X2 | X1 = b], and Θ22 − Θ21 Θ11⁻¹ Θ12 = Cov[X2 | X1].

(Block-)Cholesky decomposition is computationally equivalent to the disintegration of Gaussian measures.
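This correspondence can be checked numerically; a hedged sketch on a random SPD matrix with a 2×2 block partition (all variable names here are introduced for illustration):

```python
import numpy as np

# one step of block Cholesky: the Schur complement of Theta11 is the
# conditional covariance Cov[X2 | X1], and the three factors reassemble Theta
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
theta = A @ A.T + 6 * np.eye(6)                  # random SPD matrix
t11, t12 = theta[:3, :3], theta[:3, 3:]
t21, t22 = theta[3:, :3], theta[3:, 3:]
w = t21 @ np.linalg.inv(t11)                     # E[X2 | X1 = b] = w @ b
schur = t22 - w @ t12                            # Cov[X2 | X1]
lower = np.block([[np.eye(3), np.zeros((3, 3))], [w, np.eye(3)]])
middle = np.block([[t11, np.zeros((3, 3))], [np.zeros((3, 3)), schur]])
err = np.abs(lower @ middle @ lower.T - theta).max()
print(err)                                       # ~ machine precision
```

Iterating this step block by block is exactly Gaussian elimination, which is why statements about conditional covariances translate into statements about Cholesky factors.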

SLIDE 59

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Look at a single step of block Cholesky decomposition. This corresponds to:

[ Θ11  Θ12 ]   [ Id          0  ] [ Θ11  0                    ] [ Id  Θ11⁻¹ Θ12 ]
[ Θ21  Θ22 ] = [ Θ21 Θ11⁻¹  Id ] [ 0    Θ22 − Θ21 Θ11⁻¹ Θ12 ] [ 0   Id        ]

Note that Θ21 Θ11⁻¹ b = E[X2 | X1 = b], and Θ22 − Θ21 Θ11⁻¹ Θ12 = Cov[X2 | X1].

(Block-)Cholesky decomposition is computationally equivalent to the disintegration of Gaussian measures. Follows immediately from well known formulas, but rarely used in the literature. One example: Bickson (2008).

SLIDE 60

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

This suggests choosing a bisective elimination ordering:

SLIDE 61

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

This suggests choosing a bisective elimination ordering: Let's start computing the Cholesky decomposition.

SLIDE 62

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

This suggests choosing a bisective elimination ordering: Let's start computing the Cholesky decomposition. We observe a fade-out of entries!

SLIDE 63

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

What about higher dimensional examples?

SLIDE 64

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

How about higher dimensional examples? In 2d, use quadsection:

SLIDE 65

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We know that the result of the factorisation is sparse, but can we compute it efficiently?

SLIDE 66

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We know that the result of the factorisation is sparse, but can we compute it efficiently? Key observation: The entry (i, j) is used for the first time with the min (i, j)-th pivot.

SLIDE 67

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We know that the result of the factorisation is sparse, but can we compute it efficiently? Key observation: the entry (i, j) is used for the first time with the min(i, j)-th pivot. If we know the entries will be negligible until we use them, we don't have to update them, nor know them in the first place.

SLIDE 68

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We know that the result of the factorisation is sparse, but can we compute it efficiently? Key observation: the entry (i, j) is used for the first time with the min(i, j)-th pivot. If we know the entries will be negligible until we use them, we don't have to update them, nor know them in the first place.

SLIDE 69

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Bisective/quadsective ordering is the reverse of nested dissection. Indeed, for P the order-reversing permutation matrix, we have:

Θ⁻¹ = (L Lᵀ)⁻¹ = L⁻ᵀ L⁻¹  ⟹  P Θ⁻¹ P = P L⁻ᵀ P P L⁻¹ P = (P L⁻ᵀ P)(P L⁻ᵀ P)ᵀ,

where P L⁻ᵀ P is again lower triangular. But we also have L⁻¹ = Lᵀ Θ⁻¹. For a sparse elimination ordering of Θ, the reverse ordering leads to a sparse factorisation of Θ⁻¹.
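A quick numerical check of this reversal identity; the random SPD matrix below is a stand-in chosen for illustration.

```python
import numpy as np

# if Theta = L L^T with L lower triangular and P reverses the ordering,
# then P Theta^{-1} P = C C^T with C = P L^{-T} P again lower triangular
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
theta = A @ A.T + 5 * np.eye(5)
L = np.linalg.cholesky(theta)
P = np.eye(5)[::-1]                              # order-reversing permutation
C = P @ np.linalg.inv(L).T @ P
print(np.allclose(C, np.tril(C)))                # True: C is lower triangular
print(np.allclose(C @ C.T, P @ np.linalg.inv(theta) @ P))  # True
```

Reversing an upper triangular matrix with P on both sides flips it to lower triangular, which is all the identity needs.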

SLIDE 70

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We obtain a very simple algorithm:

SLIDE 71

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We obtain a very simple algorithm: Given a positive definite matrix Θ and a graph G such that Θ⁻¹ is sparse according to G.

SLIDE 72

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We obtain a very simple algorithm: Given a positive definite matrix Θ and a graph G such that Θ⁻¹ is sparse according to G. Obtain the inverse nested dissection ordering for G.

SLIDE 73

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We obtain a very simple algorithm: Given a positive definite matrix Θ and a graph G such that Θ⁻¹ is sparse according to G. Obtain the inverse nested dissection ordering for G. Set entries (i, j) that are separated after pivot number min(i, j) to zero.

SLIDE 74

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

We obtain a very simple algorithm: Given a positive definite matrix Θ and a graph G such that Θ⁻¹ is sparse according to G. Obtain the inverse nested dissection ordering for G. Set entries (i, j) that are separated after pivot number min(i, j) to zero. Compute the incomplete Cholesky factorisation.
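The ordering step above can be sketched in one dimension; `bisection_order` is a name introduced here, and the breadth-first bisection below is an illustrative stand-in for a general reverse nested dissection ordering.

```python
import numpy as np
from collections import deque

def bisection_order(n):
    """Breadth-first bisection of 0..n-1: midpoints of ever finer
    halves, i.e. the reverse of a nested dissection elimination
    ordering in one dimension (coarse to fine)."""
    order, queue = [], deque([(0, n)])
    while queue:
        lo, hi = queue.popleft()
        if lo >= hi:
            continue
        mid = (lo + hi) // 2
        order.append(mid)          # eliminate the separator first
        queue.append((lo, mid))
        queue.append((mid + 1, hi))
    return order

# reorder a small 1-d exponential-kernel matrix coarse-to-fine
x = np.linspace(0.0, 1.0, 33)
theta = np.exp(-np.abs(x[:, None] - x[None]))
perm = bisection_order(33)
theta_p = theta[np.ix_(perm, perm)]  # input to the incomplete Cholesky step
print(perm[:7])
```

The incomplete factorisation is then run on the reordered matrix with the multiscale sparsity pattern, exactly as in the experiment at the start of the talk.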

SLIDE 75

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Remaining problems with our approach:

SLIDE 76

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Remaining problems with our approach: Nested dissection does not lead to near-linear complexity algorithms.

SLIDE 77

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Remaining problems with our approach: Nested dissection does not lead to near-linear complexity algorithms. The precision matrix will not be exactly sparse. How is it localised?

SLIDE 78

Sparse Factorisation of Dense Matrices: fade-out instead of fill-in

Remaining problems with our approach: Nested dissection does not lead to near-linear complexity algorithms Precision matrix will not be exactly sparse. How is it localised? The answer can be found in the recent literature on numerical homogenisation:

SLIDE 79

Sparse factorisation of dense matrices using gamblets

“Gamblet” bases have been introduced as part of the game theoretical approach to numerical PDE (Owhadi (2017), Owhadi and Scovel (2017) ).

SLIDE 80

Sparse factorisation of dense matrices using gamblets

“Gamblet” bases have been introduced as part of the game theoretical approach to numerical PDE (Owhadi (2017), Owhadi and Scovel (2017)). Assume our covariance matrix is

Θi,j = ∫_[0,1]² φ(q)i(x) G(x, y) φ(q)j(y) dx dy

for φ(q)i := ✶[(i−1)hq, ihq] and G the Green's function of a second order elliptic PDE.
SLIDE 81

Sparse factorisation of dense matrices using gamblets

“Gamblet” bases have been introduced as part of the game theoretical approach to numerical PDE (Owhadi (2017), Owhadi and Scovel (2017)). Assume our covariance matrix is

Θi,j = ∫_[0,1]² φ(q)i(x) G(x, y) φ(q)j(y) dx dy

for φ(q)i := ✶[(i−1)hq, ihq] and G the Green's function of a second order elliptic PDE.

Corresponds to Xi(ω) = ∫₀¹ φ(q)i(x) u(x, ω) dx, with u(ω) the solution of an elliptic SPDE with Gaussian forcing.

SLIDE 82

Sparse factorisation of dense matrices using gamblets

Similar to our case, only with ✶[(i−1)hq, ihq] instead of Dirac measures.

SLIDE 83

Sparse factorisation of dense matrices using gamblets

Similar to our case, only with ✶[(i−1)hq, ihq] instead of Dirac measures. For φ(k)i := ✶[(i−1)hk, ihk], Owhadi and Scovel (2017) show:

SLIDE 84

Sparse factorisation of dense matrices using gamblets

Similiar to our case, only with ✶[(i−1)hq,ihq] instead of dirac mesure. For φ(k)

i

:= ✶[(i−1)hk,ihk], Owhadi and Scovel (2017) shows: ψ(k)

i

:= E

  • u|

1

0 u (x) φ(k) j

(x) dx = δi,j

  • is exponentially localised,
  • n a scale of hk:
  • F. Schäfer, T.J. Sullivan, H. Owhadi

Sparse factorisation of dense Kernel matrices June 8th 2017 84 / 130

slide-85
SLIDE 85

Sparse factorisation of dense matrices using gamblets

Similar to our case, only with 𝟙_{[(i−1)h_q, i h_q]} instead of Dirac measures. For φ^{(k)}_i := 𝟙_{[(i−1)h_k, i h_k]}, Owhadi and Scovel (2017) show that

ψ^{(k)}_i := E[ u | ∫₀¹ u(x) φ^{(k)}_j(x) dx = δ_{i,j} ]

is exponentially localised, on a scale of h_k.

Main idea: an estimate on the exponential decay of a conditional expectation implies exponential decay of the Cholesky factors.
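This conditional-expectation construction is easy to reproduce numerically. The following sketch is illustrative only (grid size, kernel, and all names are assumptions, not the talk's code): it discretises a 1D field with an exponential kernel as a stand-in for the Green's function, uses block-indicator measurement functionals, and checks that the resulting ψ_i is localised around block i.

```python
import numpy as np

# 1D grid and an exponential kernel as a stand-in for the covariance of u.
N, n_blocks = 256, 16
x = np.linspace(0, 1, N)
Theta = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)

# Measurement functionals phi_j: normalised indicators of dyadic blocks,
# so (Phi @ u)[j] is the average of u over block j.
Phi = np.zeros((n_blocks, N))
for j in range(n_blocks):
    Phi[j, j * (N // n_blocks):(j + 1) * (N // n_blocks)] = n_blocks / N

# psi_i = E[u | Phi u = e_i]: conditional expectation of the Gaussian
# vector u ~ N(0, Theta) given unit measurement on block i, zero elsewhere.
i = n_blocks // 2
psi_i = Theta @ Phi.T @ np.linalg.solve(Phi @ Theta @ Phi.T, np.eye(n_blocks)[i])

# By construction the measurements of psi_i reproduce the unit vector ...
assert np.allclose(Phi @ psi_i, np.eye(n_blocks)[i], atol=1e-8)
# ... and psi_i is localised: much larger on block i than far away from it.
assert abs(psi_i[i * N // n_blocks + N // (2 * n_blocks)]) > 5 * abs(psi_i[0])
```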


slide-87
SLIDE 87

Sparse factorisation of dense matrices using gamblets

Transform to a multiresolution basis to obtain the block matrix

Γ^{k,l}_{i,j} = ∫_{[0,1]²} φ^{(k),χ}_i(x) G(x, y) φ^{(l),χ}_j(y) dx dy,

where the ( φ^{(k),χ}_j )_{j∈J^{(k)}} are chosen as Haar basis functions.

slide-91
SLIDE 91

Sparse factorisation of dense matrices using gamblets

Then the results of Owhadi (2017) and Owhadi and Scovel (2017) imply that

χ^{(k)}_i := E[ u | ∫₀¹ u(x) φ^{(l),χ}_j(x) dx = δ_{i,j} δ_{k,l}, ∀ l ≤ k ]

is exponentially localised, on a scale of h_k:

|χ^{(k)}_i(x)| ≤ C exp( −(γ/h_k) |x − x^{(k)}_i| ).

Furthermore, the stiffness matrices decay exponentially on each level:

B^{(k)}_{i,j} := ∫₀¹ χ^{(k)}_i(x) (G⁻¹ χ^{(k)}_j)(x) dx,   |B^{(k)}_{i,j}| ≤ exp( −γ |x_i − x_j| ).

Finally, we have, for a constant κ:

cond( B^{(k)} ) ≤ κ, ∀ k.


slide-94
SLIDE 94

Sparse factorisation of dense matrices using gamblets

The above properties will allow us to show localisation of the (block) Cholesky factors. Consider the two-scale case:

[ Γ11  Γ12 ]   [ Id          0  ] [ Γ11  0                      ] [ Id  Γ11⁻¹ Γ12 ]
[ Γ21  Γ22 ] = [ Γ21 Γ11⁻¹  Id  ] [ 0    Γ22 − Γ21 Γ11⁻¹ Γ12    ] [ 0   Id        ]

( Γ21 Γ11⁻¹ )_{i,j} = E[ ∫ u φ^{(2),χ}_i dx | ∫ u φ^{(1),χ}_m dx = δ_{j,m} ] = ∫ φ^{(2),χ}_i χ^{(1)}_j dx

Γ22 − Γ21 Γ11⁻¹ Γ12 = Cov[ ∫ u φ^{(2),χ} dx | ∫ u φ^{(1),χ} dx ] = ( B^{(2)} )⁻¹
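The block identity above can be verified directly in a few lines. This is a generic sketch on a random SPD matrix standing in for Γ in the multiresolution basis (all names and sizes are placeholders):

```python
import numpy as np

# Random SPD matrix standing in for Gamma, partitioned into 2x2 blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
Gamma = A @ A.T + 8 * np.eye(8)
G11, G12 = Gamma[:4, :4], Gamma[:4, 4:]
G21, G22 = Gamma[4:, :4], Gamma[4:, 4:]

# Schur complement: the conditional covariance of the second block of
# variables given the first (the (B^(2))^-1 of the slides).
S = G22 - G21 @ np.linalg.solve(G11, G12)

Z, I = np.zeros((4, 4)), np.eye(4)
L = np.block([[I, Z], [G21 @ np.linalg.inv(G11), I]])
D = np.block([[G11, Z], [Z, S]])

# The two-scale block factorisation: Gamma = L D L^T.
assert np.allclose(L @ D @ L.T, Gamma)
```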

slide-99
SLIDE 99

Sparse factorisation of dense matrices using gamblets

( Γ21 Γ11⁻¹ )_{i,j} = ∫ φ^{(2),χ}_i χ^{(1)}_j dx ≤ C exp( −(γ/h) |x^{(2)}_i − x^{(1)}_j| )

Fact: inverses (Demko (1984), Jaffard (1990)) and Cholesky factors (Benzi and Tůma (2000), Krishtal et al. (2015)) of well-conditioned and banded/exponentially localised matrices are exponentially localised. Therefore:

( B^{(2)} )⁻¹_{i,j} ≤ C exp( −(γ/h₂) |x^{(2)}_i − x^{(2)}_j| ).

The argument can be extended to multiple scales and yields exponentially decaying (block-)Cholesky factors. These factors can be approximated by (block-)Cholesky decomposition in O( N log²(N) ( log(1/ε) + log²(N) )^{4d+1} ) time and O( N log(N) log^d(N/ε) ) space, for an approximation error of ε.

slide-101
SLIDE 101

Sparse factorisation of dense matrices using gamblets

How about φ^{(q)}_i = δ_{x^{(q)}_i}, i.e. pointwise sampling? In Owhadi and Scovel (2017), analogous results for pointwise samples are obtained using averaging.


slide-106
SLIDE 106

Sparse factorisation of dense matrices using gamblets

We are left with a simple algorithm: Let Γ be Θ expressed in the multiresolution basis. Throw away all entries outside of Sρ, defined as

Sρ := { (i, j) ∈ I × I : i ∈ J^{(k)}, j ∈ J^{(l)}, dist( x^{(k)}_i, x^{(l)}_j ) ≤ ρ h_{min(k,l)} }.

Compute the incomplete (block-)Cholesky decomposition of Γ restricted to Sρ. The factorisation can be done in O( N poly( ρ log N ) ), and the error decays exponentially with ρ.
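A minimal sketch of such a pattern-restricted factorisation follows. The function name and the dense double loop are illustrative only; the real algorithm exploits the multiscale structure of Sρ to reach near-linear complexity, and the pattern must contain the diagonal.

```python
import numpy as np

def incomplete_cholesky(G, S):
    """Left-looking Cholesky in which only entries (i, j) in S are computed."""
    n = G.shape[0]
    L = np.zeros_like(G)
    for j in range(n):
        for i in range(j, n):
            if (i, j) not in S:
                continue  # entry outside the sparsity pattern stays zero
            s = G[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    return L

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 30))
G = A @ A.T + 30 * np.eye(30)

# Sanity check: with the full lower-triangular pattern, the incomplete
# factorisation reduces to the exact Cholesky decomposition.
S_full = {(i, j) for i in range(30) for j in range(i + 1)}
assert np.allclose(incomplete_cholesky(G, S_full), np.linalg.cholesky(G))
```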


slide-110
SLIDE 110

Sparse factorisation of dense matrices using gamblets

We are left with two closely related problems: In order to satisfy the conditions of the proof of bounded condition numbers given in Owhadi and Scovel (2017), the multiresolution basis needs to satisfy the vanishing-moment condition

∫_{τ^{(k)}_i} p φ^{(k),χ}_i dx = 0, ∀ p ∈ P_{s−1}( τ^{(k)}_i ),

for a τ^{(k)}_i of diameter ≈ h_k, where 2s is the order of the elliptic operator.

Therefore, the multiresolution basis depends on the operator. Also, averaging over large regions is required for the coarse basis functions, which leads to O(N²) complexity of the basis transform.


slide-115
SLIDE 115

Sparse factorisation of dense matrices using gamblets

Can we get rid of the vanishing-moment condition? The conditions in Owhadi and Scovel (2017) are (roughly speaking):

(1/C) H^k ≤ λ_min( Θ|_{Φ^{(k)}} ),   λ_max( Θ|_{⊥Φ^{(k−1)}} ) ≤ C H^{k−1}.

Moving to finer scales, the discrete space contains more and more oscillatory functions (small eigenvalues). But in the orthogonal complement of a larger space, the low modes are “projected out”. The balance of these two effects leads to bounded condition numbers.


slide-118
SLIDE 118

Sparse factorisation of dense matrices using gamblets

Gamblets are more robust! We can replace the conditions with (roughly speaking):

(1/C) H^k ≤ λ_min( Θ|_{Φ^{(k)}} ),   max_{φ∈Φ^k, ‖φ‖=1}  min_{ϕ∈Φ^{k−1}: ‖ϕ‖≤C} (φ − ϕ)ᵀ Θ (φ − ϕ) ≤ C H^{k−1}.

The gamblets find the optimal orthogonalisation themselves!


slide-119
SLIDE 119

Sparse factorisation of dense matrices using gamblets

We can use subsampling as an aggregation scheme!


slide-121
SLIDE 121

Sparse factorisation of dense matrices using gamblets

Our algorithm now consists of three steps:

1

Reorder the variables hierarchically.

2

Obtain the entries in S₂ (or, more generally, Sρ); set the other entries to zero.

3

Compute the incomplete Cholesky decomposition.

At this point, the theoretical guarantees require replacing step three with an incomplete block factorisation. All numerical evidence indicates that this is not necessary.
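Step 1 can be sketched for a dyadic 1D grid as follows (an illustrative helper, not the paper's code): visit the grid points coarsest level first, so that each level J^(k) adds the midpoints of the previous one.

```python
def hierarchical_order(q):
    """Coarse-to-fine ordering of the 2**q + 1 dyadic grid points on [0, 1]."""
    n = 2 ** q
    order, seen = [], set()
    for k in range(q + 1):
        step = n // 2 ** k
        for i in range(0, n + 1, step):  # grid points belonging to level k
            if i not in seen:
                order.append(i)
                seen.add(i)
    return order

# Coarse points come first; finer levels fill in the midpoints.
assert hierarchical_order(2) == [0, 4, 2, 1, 3]
```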


slide-123
SLIDE 123

Two additional results

As observed in Owhadi (2017) and Hou and Zhang (2017), gamblets provide a near-optimal sparse PCA. We obtain a PCA with the same approximation property by keeping only the first k columns of L.

By reversing the elimination ordering, we obtain a near-linear complexity Cholesky factorisation of the sparse/exponentially decaying inverse of Θ.
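A rough numerical sketch of the low-rank claim (the greedy ordering, kernel, and threshold below are illustrative assumptions, not the paper's construction): factor Θ in a coarse-to-fine ordering and keep the first k columns of L; the resulting rank-k approximation error is already small relative to ‖Θ‖.

```python
import numpy as np

# Kernel matrix on a 1D grid (exponential kernel as a simple example).
n, k = 64, 10
x = np.linspace(0, 1, n)
Theta = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.5)

# Greedy maximin ordering: repeatedly pick the point farthest from those
# already chosen -- a simple stand-in for the hierarchical ordering.
order = [0]
d = np.abs(x - x[0])
for _ in range(n - 1):
    j = int(np.argmax(d))
    order.append(j)
    d = np.minimum(d, np.abs(x - x[j]))

Tp = Theta[np.ix_(order, order)]
L = np.linalg.cholesky(Tp)

# Keeping the first k columns of L gives a rank-k approximation of Theta
# whose error is small, since the leading columns carry the coarse scales.
err = np.linalg.norm(Tp - L[:, :k] @ L[:, :k].T, 2)
assert err < 0.1 * np.linalg.norm(Theta, 2)
```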


slide-124
SLIDE 124

Problems at the boundary

Figure: ν = 1, l = 0.4


slide-125
SLIDE 125

Problems at the boundary


slide-126
SLIDE 126

Decay of approximation error


slide-127
SLIDE 127

Sparse approximate PCA

Figure: Near optimal sparse PCA: First panel: ν = 1, l = 0.2, δx = 0.2 and ρ = 6. Second panel: ν = 2, l = 0.2 and δx = 0.2 and ρ = 8.


slide-128
SLIDE 128

Perturbation of the Mesh

δx    ‖Γρ − Γ‖    ‖Γρ − Γ‖/‖Γ‖   ‖Γρ − Γ‖_Fro   ‖Γρ − Γ‖_Fro/‖Γ‖_Fro   #S          #S/N²
0.2   4.336e-03   1.560e-06      1.669e-02      1.026e-06              2.125e+07   7.675e-02
0.4   4.495e-03   1.617e-06      1.706e-02      1.063e-06              2.128e+07   7.683e-02
2.0   4.551e-03   1.638e-06      1.820e-02      1.077e-06              2.127e+07   7.682e-02
4.0   8.158e-03   2.940e-06      2.976e-02      1.933e-06              2.119e+07   7.652e-02

Table: Compression and accuracy for q = 7, l = 0.2, ρ = 5, ν = 1 and different values of δx.


slide-129
SLIDE 129

Data on low dimensional manifold

δz    ‖Γρ − Γ‖    ‖Γρ − Γ‖/‖Γ‖   ‖Γρ − Γ‖_Fro   ‖Γρ − Γ‖_Fro/‖Γ‖_Fro   #S          #S/N²
0.0   5.049e-03   1.560e-06      1.885e-02      1.026e-06              2.126e+07   7.677e-02
0.1   6.341e-02   1.648e-06      1.232e-01      1.077e-06              2.083e+07   7.521e-02
0.2   1.204e-01   1.749e-06      2.203e-01      1.126e-06              1.976e+07   7.137e-02
0.4   1.954e-01   3.550e-06      5.098e-01      2.197e-06              1.722e+07   6.218e-02

Table: Compression and accuracy for q = 7, l = 0.2, ρ = 5, ν = 1, δx = 2 and different values of δz.


slide-130
SLIDE 130

Fractional Operators

ν     ‖Γρ − Γ‖    ‖Γρ − Γ‖/‖Γ‖   ‖Γρ − Γ‖_Fro   ‖Γρ − Γ‖_Fro/‖Γ‖_Fro   #S          #S/N²
1.0   1.266e-03   4.556e-07      4.987e-03      2.995e-07              2.776e+07   1.003e-01
1.1   1.813e-03   6.423e-07      6.216e-03      4.190e-07              2.776e+07   1.003e-01
1.3   3.235e-03   1.129e-06      1.039e-02      7.312e-07              2.776e+07   1.003e-01
1.5   5.245e-03   1.811e-06      1.652e-02      1.166e-06              2.776e+07   1.003e-01
1.6   6.800e-03   2.333e-06      2.148e-02      1.498e-06              2.776e+07   1.003e-01
1.8   9.891e-03   3.362e-06      3.088e-02      2.147e-06              2.776e+07   1.003e-01
2.0   1.238e-02   4.180e-06      3.892e-02      2.662e-06              2.776e+07   1.003e-01

Table: Compression and accuracy for q = 7, l = 0.2, ρ = 6, δx = 0.2 and different values of ν.
