  1. Sparse methods and high dimensional parametric PDEs. Albert Cohen, Laboratoire Jacques-Louis Lions, Université Pierre et Marie Curie, Paris. Toulouse, 22-04-2014.

  2. What is sparsity? A small dimensional phenomenon in a high dimensional context. Simple example: a vector $x = (x_1, \dots, x_N) \in \mathbb{R}^N$ representing a signal, image or function, discretized with $N \gg 1$. The vector $x$ is sparse if only a few of its coordinates are non-zero.

  3. How to quantify this? The set of $k$-sparse vectors is $\Sigma_k := \{x \in \mathbb{R}^N \; ; \; \#\{i \; ; \; x_i \neq 0\} \le k\}$. As $k$ gets smaller, $x \in \Sigma_k$ gets sparser. More realistic: a vector is quasi-sparse if a few numerically significant coordinates concentrate most of the information. How to measure this notion of concentration? Remarks: a vector in $\Sigma_k$ is characterized by $k$ non-zero values and their $k$ positions. These are intrinsically nonlinear concepts: $x, y \in \Sigma_k$ does not imply $x + y \in \Sigma_k$. Sparsity is often hidden, and revealed through an appropriate representation (change of basis).
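A minimal sketch of these definitions in Python (the helper names `is_k_sparse` and `decreasing_rearrangement` are ours, chosen for illustration):

```python
import numpy as np

def is_k_sparse(x, k):
    """Membership in Sigma_k: at most k coordinates are non-zero."""
    return np.count_nonzero(x) <= k

def decreasing_rearrangement(x):
    """Sort |x_i| in decreasing order; fast decay of this sequence is
    the quantitative expression of quasi-sparsity."""
    return np.sort(np.abs(x))[::-1]

x = np.array([0.0, 3.0, 0.0, -1.5, 0.0, 0.2])
y = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(is_k_sparse(x, 3))            # True: 3 non-zero entries
print(is_k_sparse(x + y, 3))        # False: Sigma_k is not a linear space
print(decreasing_rearrangement(x))  # [3.  1.5 0.2 0.  0.  0. ]
```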

  4. Agenda: 1. Sparsity and wavelet representations (90-00). 2. Sparsity in PDEs and images, compressed sensing (00-10). 3. High dimensional parametric PDEs (10- ). References:
- R. DeVore, "Nonlinear approximation", Acta Numerica, 1998.
- A. Cohen, "Numerical Analysis of Wavelet Methods", Studies in Mathematics and its Applications, Elsevier, 2003.
- A. Cohen, W. Dahmen, I. Daubechies and R. DeVore, "Harmonic analysis of the space BV", Revista Matemática Iberoamericana, 2003.
- A. Cohen, W. Dahmen and R. DeVore, "Compressed sensing and best k-term approximation", Journal of the AMS, 2009.
- A. Cohen, R. DeVore and C. Schwab, "Analytic regularity and polynomial approximation of parametric and stochastic PDEs", Analysis and Applications, 2011.

  5. Multiscale representations in wavelet bases: the Haar system. [Figure: the first Haar basis functions and the successive terms of the expansion.]
$$f = \langle f, e_0 \rangle e_0 + \langle f, e_1 \rangle e_1 + \langle f, e_2 \rangle e_2 + \langle f, e_3 \rangle e_3 + \cdots = \sum_\lambda f_\lambda \psi_\lambda, \quad f_\lambda := \langle f, \psi_\lambda \rangle,$$
where $\psi_\lambda(x) := 2^{j/2} \psi(2^j x - k)$, $\lambda = (j,k)$, $j \ge 0$, $k \in \mathbb{Z}$, $|\lambda| = j = j(\lambda)$. More general wavelets are constructed from similar multiscale approximation processes, using smoother functions such as splines, finite elements... In dimension $d$: $\psi_\lambda(x) := 2^{dj/2} \psi(2^j x - k)$, $k \in \mathbb{Z}^d$.
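A minimal sketch of the rescaled Haar wavelets in Python, assuming the standard mother wavelet $\psi = \mathbf{1}_{[0,1/2)} - \mathbf{1}_{[1/2,1)}$ (the function names are ours):

```python
import numpy as np

def haar_psi(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def psi_lambda(x, j, k):
    """psi_{j,k}(x) = 2^{j/2} psi(2^j x - k): the factor 2^{j/2}
    keeps the L^2 norm equal to 1 at every scale j."""
    return 2.0 ** (j / 2) * haar_psi(2.0 ** j * np.asarray(x) - k)

# Normalization check: the mean of psi_{j,k}^2 over [0,1] is ~1.
x = np.linspace(0.0, 1.0, 100001)
print(np.mean(psi_lambda(x, 2, 1) ** 2))  # ~1.0
```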

  6. Discrete signals: fast decomposition/reconstruction algorithms. 1D array $(f_0, \dots, f_N)$ ⇒ two half arrays: averages $\frac{f_{2k} + f_{2k+1}}{2}$ and differences $\frac{f_{2k} - f_{2k+1}}{2}$ ⇒ iterate on the half array of averages... Multiscale processing of 2D data: separable algorithm. [Diagram: the array $c_J$ is split into a coarse array $c_{J-1}$ and detail subbands $a_{J-1}$, $b_{J-1}$, $d_{J-1}$.] Image $f(k,l)$ ⇒ process lines ⇒ process columns ⇒ iterate...
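A minimal sketch of this 1D scheme, using the unnormalized averages/differences written above (our function names; production code would rather use a wavelet library such as PyWavelets):

```python
import numpy as np

def haar_decompose(f, levels):
    """Fast Haar decomposition: at each level, replace the current
    array by its half-length averages, storing the differences as
    details. Assumes len(f) is divisible by 2**levels."""
    f = np.asarray(f, dtype=float)
    details = []
    for _ in range(levels):
        averages = (f[0::2] + f[1::2]) / 2
        differences = (f[0::2] - f[1::2]) / 2
        details.append(differences)
        f = averages  # iterate on the half array of averages
    return f, details

def haar_reconstruct(averages, details):
    """Invert the scheme: f_{2k} = a_k + d_k, f_{2k+1} = a_k - d_k."""
    f = averages
    for d in reversed(details):
        out = np.empty(2 * len(f))
        out[0::2] = f + d
        out[1::2] = f - d
        f = out
    return f

signal = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0])
coarse, details = haar_decompose(signal, 3)
print(np.allclose(haar_reconstruct(coarse, details), signal))  # True
```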

  7. [Figures: a 512×512 digital image and its multiscale decomposition.] Multiscale decompositions of natural images are sparse: a few numerically significant coefficients concentrate most of the energy and information.

  8. Application to image compression. Basic idea: encode with more precision the few numerically significant coefficients ⇒ resolution is locally adapted. Example: the 1% largest coefficients encoded. Compression standard JPEG 2000: same basic principles, based on smoother wavelets, good quality at compression ratio 1/40.
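A minimal sketch of the basic idea on an array of coefficients, keeping only the 1% largest magnitudes (this ignores the quantization and entropy coding that an actual codec like JPEG 2000 performs):

```python
import numpy as np

def keep_largest(coeffs, fraction=0.01):
    """Zero out all but the given fraction of largest-magnitude
    coefficients -- the crude core of transform-based compression."""
    flat = np.abs(coeffs).ravel()
    k = max(1, int(fraction * flat.size))
    threshold = np.partition(flat, -k)[-k]  # k-th largest magnitude
    return np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

rng = np.random.default_rng(0)
coeffs = rng.laplace(size=(512, 512))  # stand-in for wavelet coefficients
compressed = keep_largest(coeffs, 0.01)
print(np.count_nonzero(compressed) / coeffs.size)  # ~0.01
```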

  9. Application to image denoising. [Figures: a noisy digital image and its multiscale decomposition.] Natural strategy: thresholding, i.e. set to zero the coefficients which are smaller than the noise level.
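A minimal sketch of this thresholding strategy, assuming additive Gaussian noise of known standard deviation `sigma` (the multiple `3 * sigma` is our illustrative choice of threshold):

```python
import numpy as np

def hard_threshold(coeffs, sigma, factor=3.0):
    """Set to zero every coefficient below the noise level: for a
    sparse signal, entries smaller than a few sigma are likely noise."""
    t = factor * sigma
    return np.where(np.abs(coeffs) > t, coeffs, 0.0)

rng = np.random.default_rng(1)
clean = np.zeros(1000)
clean[:10] = 5.0                  # sparse signal: 10 large coefficients
noisy = clean + 0.5 * rng.standard_normal(1000)
denoised = hard_threshold(noisy, sigma=0.5)
print(np.count_nonzero(denoised))  # ~10 surviving coefficients
```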

  10. Two other applications. Statistical learning: given a set of data $(x_i, y_i)$, $i = 1, 2, \dots, m$, drawn independently according to a probability law, build a function $f$ such that $|f(x) - y|$ is small on average ($E(|f(x) - y|^2)$ as small as possible). Difficulty: build the adaptive grid from uncertain data, and update it as more and more samples are received. Adaptive numerical simulation of PDEs: computing on a non-uniform grid is justified for solutions which display isolated singularities (shocks). Difficulty: the solution $f$ is unknown. Build the grid or set of wavelet coefficients which is best adapted to the solution, using a-posteriori information gained throughout the numerical computation.

  11. Measuring sparsity in a representation $f = \sum f_\lambda \psi_\lambda$. Intuition: the number of coefficients above a threshold $\eta$ should not grow too fast as $\eta \to 0$. Weak spaces: $(f_\lambda) \in w\ell^p$ if and only if $\mathrm{Card}\{\lambda \; \mathrm{s.t.} \; |f_\lambda| > \eta\} \le C \eta^{-p}$, or equivalently, the decreasing rearrangement $(f_n)_{n>0}$ of $(|f_\lambda|)$ satisfies $f_n \le C n^{-1/p}$. The $w\ell^p$ quasi-norm can be defined by $\|(f_\lambda)\|_{w\ell^p} := \sup_{n>0} n^{1/p} f_n$. Obviously $\ell^p \subset w\ell^p$. The representation is sparser as $p \to 0$. If $p < 2$ and $(\psi_\lambda)$ is (any) orthonormal basis of a Hilbert space $H$, an equivalent statement is in terms of best $N$-term approximation: with $f_N = \sum_{N \text{ largest } |f_\lambda|} f_\lambda \psi_\lambda$,
$$\|f - f_N\|_H = \Big( \sum_{n > N} |f_n|^2 \Big)^{1/2} \le \|(f_\lambda)\|_{w\ell^p} \Big( \sum_{n > N} n^{-2/p} \Big)^{1/2} \le C \|(f_\lambda)\|_{w\ell^p} N^{-s}, \quad s = \frac{1}{p} - \frac{1}{2}.$$
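A minimal numeric illustration in Python: for the model sequence $f_n = n^{-1/p}$, which lies in $w\ell^p$, the tail of squares indeed decays at the rate $N^{-s}$ with $s = 1/p - 1/2$:

```python
import numpy as np

p = 0.5                      # strong sparsity: s = 1/p - 1/2 = 1.5
s = 1.0 / p - 0.5
n = np.arange(1, 10**6 + 1)
f = n ** (-1.0 / p)          # decreasing rearrangement, in weak ell^p

for N in (10, 100, 1000):
    tail = np.sqrt(np.sum(f[N:] ** 2))  # ||f - f_N||_H for an orthonormal basis
    print(N, tail * N ** s)             # roughly constant <=> error ~ N^{-s}
```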

  12. Older observation by Stechkin. For the strong $\ell^p$ space one has, with $s = \frac{1}{p} - \frac{1}{2}$:
$$(f_\lambda)_{\lambda \in \Lambda} \in \ell^p \;\Rightarrow\; \|f - f_N\|_H \le \|(f_\lambda)\|_{\ell^p} (N+1)^{-s}.$$
Proof: using the decreasing rearrangement, we combine
$$\|f - f_N\|_H = \Big( \sum_{n > N} f_n^2 \Big)^{1/2} = \Big( \sum_{n > N} f_n^{2-p} f_n^p \Big)^{1/2} \le f_{N+1}^{1 - p/2} \Big( \sum_{n > N} f_n^p \Big)^{1/2} \le f_{N+1}^{1 - p/2} \|(f_\lambda)\|_{\ell^p}^{p/2}$$
and
$$(N+1) f_{N+1}^p \le \sum_{n=1}^{N+1} f_n^p \le \|(f_\lambda)\|_{\ell^p}^p,$$
so that $f_{N+1} \le \|(f_\lambda)\|_{\ell^p} (N+1)^{-1/p}$; substituting this bound into the first estimate gives the claimed $(N+1)^{-s}$ rate. Note that a large value of $s$ corresponds to a value $p < 1$ (non-convex spaces). For concrete choices of bases (such as wavelets), a relevant question is therefore: what smoothness properties of $f$ ensure that the coefficient sequence $(f_\lambda)$ belongs to $\ell^p$ or $w\ell^p$ for small values of $p$?

  13. Central problems in approximation theory.
- $X$ normed space.
- $(\Sigma_N)_{N \ge 0} \subset X$ approximation subspaces: $g \in \Sigma_N$ described by $N$ (or $O(N)$) parameters.
- Best approximation error $\sigma_N(f) := \inf_{g \in \Sigma_N} \|f - g\|_X$.
Problem 1: characterise those functions $f \in X$ having a certain rate of approximation: $f \in X^r \Leftrightarrow \sigma_N(f) \lesssim N^{-r}$. Here $A \lesssim B$ means that $A \le CB$, where the constant $C$ is independent of the parameters defining $A$ and $B$.

  14. Examples. Linear approximation: $\Sigma_N$ linear space of dimension $N$ (or $O(N)$).
- $\Sigma_N := \Pi_N$, polynomials of degree $N$ in dimension 1.
- $\Sigma_N := \{f \in C^r([0,1]) \; ; \; f|_{[\frac{k}{N}, \frac{k+1}{N}]} \in \Pi_m, \; k = 0, \dots, N-1\}$ with $0 \le r \le m$ fixed: splines with uniform knots.
- $\Sigma_N := \mathrm{Vect}(e_1, \dots, e_N)$ with $(e_k)_{k>0}$ a functional basis.
Nonlinear approximation: $\Sigma_N + \Sigma_N \neq \Sigma_N$.
- $\Sigma_N := \{\frac{p}{q} \; ; \; p, q \in \Pi_N\}$: rational fractions.
- $\Sigma_N := \{f \in C^r([0,1]) \; ; \; f|_{[x_k, x_{k+1}]} \in \Pi_m, \; 0 = x_0 < \dots < x_N = 1\}$ with $0 \le r \le m$ fixed: free-knot splines.
- $\Sigma_N := \{\sum_{\lambda \in E} d_\lambda \psi_\lambda \; ; \; \#(E) \le N\}$: set of all $N$-term combinations of a basis $(\psi_\lambda)$.

  15. Central problem in computational approximation. Problem 2: practical realization of $f \mapsto f_N \in \Sigma_N$ such that $\|f - f_N\|_X \lesssim \sigma_N(f)$. If the $\Sigma_N$ are linear spaces and $P_N : X \to \Sigma_N$ are uniformly bounded projectors, $\|P_N\|_{X \to X} \le C$, then $f_N := P_N f$ is a good choice, since for all $g \in \Sigma_N$,
$$\|f - f_N\|_X \le \|f - g\|_X + \|g - f_N\|_X = \|f - g\|_X + \|P_N(g - f)\|_X \le (1 + C) \|g - f\|_X,$$
and therefore $\|f - f_N\|_X \le (1 + C) \sigma_N(f)$. What about nonlinear spaces?

  16. A basic example. Approximation of $f \in C([0,1])$ by piecewise constant functions on a partition $I_1, \dots, I_N$, defining
$$f_N(x) = a_k := |I_k|^{-1} \int_{I_k} f, \quad \text{if } x \in I_k.$$
Local error: $\|f - a_k\|_{L^\infty(I_k)} \le \max_{x, y \in I_k} |f(x) - f(y)|$. Linear case: $I_k = [\frac{k}{N}, \frac{k+1}{N}]$, uniform partition.
$$f' \in L^\infty \;\Leftrightarrow\; \|f - f_N\|_{L^\infty} \le C N^{-1} \quad (C = \sup |f'|).$$
[Figure: a function approximated on the uniform partition of mesh size $1/N$.]

  17. Nonlinear case: $I_k$ free partition. If $f' \in L^1$, choose the partition so as to equilibrate the total variation: $\int_{I_k} |f'| = N^{-1} \int_0^1 |f'|$. Then
$$f' \in L^1 \;\Leftrightarrow\; \|f - f_N\|_{L^\infty} \le C N^{-1} \quad \Big(C = \int_0^1 |f'|\Big).$$
[Figure: the adapted partition, refined where $f$ varies fast.] The approximation rate is governed by different smoothness spaces! Example: $f(t) = t^\alpha$ with $0 < \alpha < 1$; then $f'(t) = \alpha t^{\alpha - 1}$ is in $L^1$, not in $L^\infty$. The nonlinear approximation rate $N^{-1}$ outperforms the linear approximation rate $N^{-\alpha}$, as illustrated in the sketch below.
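A minimal numeric check of this example, assuming $\alpha = 1/2$: equilibrating $\int_{I_k} |f'| = N^{-1} \int_0^1 |f'|$ for $f(t) = t^\alpha$ gives the knots $x_k = (k/N)^{1/\alpha}$, and the sup-norm errors printed below decay like $N^{-\alpha}$ (uniform) versus $N^{-1}$ (adapted):

```python
import numpy as np

alpha = 0.5
f = lambda t: t ** alpha
t = np.linspace(0.0, 1.0, 200001)   # fine grid for measuring the sup error

def sup_error(knots):
    """Sup-norm error of a piecewise constant approximation; on each
    cell we take the midrange value, the best constant in sup norm for
    a monotone f (the slide's cell average gives the same rates)."""
    err = 0.0
    for a, b in zip(knots[:-1], knots[1:]):
        cell = t[(t >= a) & (t <= b)]
        c = (f(a) + f(b)) / 2
        err = max(err, np.max(np.abs(f(cell) - c)))
    return err

for N in (10, 40, 160):
    uniform = np.linspace(0.0, 1.0, N + 1)             # linear: error ~ N^{-alpha}
    adapted = (np.arange(N + 1) / N) ** (1.0 / alpha)  # equilibrated: error ~ N^{-1}
    print(N, sup_error(uniform), sup_error(adapted))
```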
