SLIDE 1

Polynomial, sparse and low-rank approximations

Anthony Nouy

Centrale Nantes, Laboratoire de Mathématiques Jean Leray
RICAM Special Semester on "Multivariate Algorithms and their Foundations in Number Theory", Linz, December 14, 2018
Tutorial on Uncertainty Quantification - Efficient Methods for PDEs with Random Coefficients

Anthony Nouy 1 / 59

SLIDE 2

Uncertainty quantification

We consider a (numerical or experimental) model depending on a set of random parameters $X = (X_1, \ldots, X_d)$ that describe the uncertainties on the model, and some output variable of interest
$$Y = u(X).$$
Forward problems: evaluation of statistics, probabilities of events, sensitivity indices...
$$\mathbb{E}(h(Y)) = \mathbb{E}(h \circ u(X)) = \int h(u(x_1,\ldots,x_d))\, p(x_1,\ldots,x_d)\, dx_1 \cdots dx_d.$$
Inverse problems: from (partial) observations of $Y$, estimate the distribution $\mu$ of $X$, with $d\mu(x_1,\ldots,x_d) = p(x_1,\ldots,x_d)\, dx_1 \cdots dx_d$.

Solving forward and inverse problems requires evaluating the model for many instances of $X$. This is usually unaffordable when one evaluation requires a costly numerical simulation (or experiment).

SLIDE 3

Approximation for uncertainty quantification

In practice, we rely on approximations of the map $x \mapsto u(x)$, used as predictive surrogate models (reduced order models, metamodels) which are easy to operate with (evaluation, integration, differentiation...). This requires:

- approximation formats (model classes) that exploit some specific features of the functions (e.g. regularity, low effective dimension, sparsity, low rank...), possibly deduced from some knowledge of the model,
- algorithms for constructing approximations from the available information: samples (black box), the model's equations (white or grey box)...

SLIDE 4

Approximation for uncertainty quantification

An approximation $\tilde Y = \tilde u(X)$ of $Y = u(X)$ can be directly used for obtaining approximate solutions to forward and inverse problems, with a control of errors on quantities of interest, e.g.
$$|\mathbb{E}(Y) - \mathbb{E}(\tilde Y)| \le \int |u(x) - \tilde u(x)|\, d\mu(x) = \|u - \tilde u\|_{L^1_\mu},$$
but also to design variance reduction methods for Monte-Carlo estimation, e.g. as a control variate:
$$\mathbb{E}(Y) \approx \mathbb{E}(\tilde Y) + \frac{1}{N}\sum_{k=1}^N \big(u(X_k) - \tilde u(X_k)\big) := \hat I_N, \qquad \mathbb{E}(|\hat I_N - \mathbb{E}(Y)|^2) = \mathbb{V}(\hat I_N) \le \frac{1}{N}\|u - \tilde u\|^2_{L^2_\mu}.$$
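As a concrete illustration of the control variate estimator above, here is a minimal sketch. The model $u(x) = e^x$ with $X \sim U(-1,1)$ and the degree-2 Taylor surrogate $\tilde u$ are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model u and surrogate u_tilde (illustration only)
u = lambda x: np.exp(x)
u_tilde = lambda x: 1 + x + x**2 / 2   # degree-2 Taylor surrogate of exp

N = 10_000
X = rng.uniform(-1.0, 1.0, N)

# E(u_tilde(X)) is known in closed form for this surrogate:
# (1/2) * integral of (1 + x + x^2/2) over [-1, 1] = 1 + 1/6
E_tilde = 1 + 1 / 6

I_plain = u(X).mean()                        # plain Monte-Carlo estimator
I_cv = E_tilde + (u(X) - u_tilde(X)).mean()  # control-variate estimator I_N

exact = np.sinh(1.0)  # E(u(X)) = (e - 1/e)/2
print(abs(I_plain - exact), abs(I_cv - exact))
```

The variance of the residual $u - \tilde u$ is much smaller than that of $u(X)$, so the control-variate estimate is typically an order of magnitude more accurate at the same sample size.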

SLIDE 5

Approximation

The goal is to approximate a function $u$ from a space $M$ by a function $u_n$ from a subset $M_n$ (model class) described by $n$ (or $O(n)$) parameters. We distinguish linear approximation, where the $M_n$ are linear spaces, from nonlinear approximation, where the $M_n$ are nonlinear sets. The quality of an approximation $u_n \in M_n$ can be assessed by $d(u, u_n)$, where $d$ is a metric on $M$, and the quality of the model class by the best approximation error
$$e_n(u)_M = \inf_{v \in M_n} d(u, v).$$

SLIDE 6

Approximation

Given a function $u$ and a family of model classes $(M_n)_{n\ge1}$, the fundamental problems are to determine if and how fast $e_n(u)_M$ tends to $0$, and to provide algorithms which produce approximations $u_n \in M_n$ such that $d(u, u_n) \le C\, e_n(u)_M$ with $C$ independent of $n$, or $d(u, u_n) \le C(n)\, e_n(u)_M$ with $C(n)\, e_n(u)_M \to 0$ as $n \to \infty$.

SLIDE 7

Worst-case and mean squared errors

For functions defined on a parameter space $\mathcal{X}$ (equipped with a measure $\mu$) and with values in some Banach space $V$, a classical setting is to consider functions from the Bochner space
$$M = L^p_\mu(\mathcal{X}; V) = V \otimes L^p_\mu(\mathcal{X})$$
equipped with the metric $d(u,v) = \|u - v\|_{L^p_\mu(\mathcal{X};V)}$.

Two typical cases are $p = \infty$ (worst-case setting),
$$\|u - v\|_{L^\infty_\mu(\mathcal{X};V)} = \operatorname{ess\,sup}_{x \in \mathcal{X}} \|u(x) - v(x)\|_V,$$
and $p = 2$ (mean-squared setting),
$$\|u - v\|^2_{L^2_\mu(\mathcal{X};V)} = \int_{\mathcal{X}} \|u(x) - v(x)\|_V^2 \, d\mu(x) = \mathbb{E}(\|u(X) - v(X)\|_V^2),$$
where $X \sim \mu$. Noting that $\|u - v\|_{L^2_\mu(\mathcal{X};V)} \le \|u - v\|_{L^\infty_\mu(\mathcal{X};V)}$, approximation results in $L^2$ can be deduced from stronger results in $L^\infty$.

SLIDE 8

Model classes for vector-valued functions

For the approximation of a function $u \in L^p_\mu(\mathcal{X}; V)$, typical model classes are:

$M_n = V \otimes S_n$, where $S_n$ is an $n$-dimensional subspace of $L^p_\mu(\mathcal{X})$ (e.g. polynomials, wavelets...), which results in an approximation
$$u_n(x) = \sum_{i=1}^n v_i \varphi_i(x)$$
with an explicit expression as a function of $x$.

$M_n = L^p_\mu(\mathcal{X}; V_n) = V_n \otimes L^p_\mu(\mathcal{X})$, where $V_n$ is a low-dimensional subspace of $V$, which results in an approximation $u_n(x) = \sum_{i=1}^n v_i \varphi_i(x)$ which is not explicit in terms of $x$. When $u(x)$ is the solution of a parameter-dependent equation, the approximation $u_n(x) \in V_n$ is obtained by some projection of $u(x)$ onto $V_n$ that exploits the model's equations. This corresponds to projection-based model order reduction methods.

SLIDE 9

Computing an approximation

An approximation $u_n$ in a certain model class $M_n$ can be obtained by interpolation of $u$ at a set of points $\Gamma_n$. For a linear space $M_n = V \otimes S_n$ and a set of points $\Gamma_n \subset \mathcal{X}$ unisolvent for $S_n$, the interpolation $u_n$ satisfies
$$u_n(x) = u(x) \quad \forall x \in \Gamma_n, \qquad \text{and} \qquad \|u - u_n\|_{L^p} \le (1 + L^{(p)}_n)\, e_n(u)_{L^p},$$
where $L^{(p)}_n$ is the norm of the interpolation operator from $L^p_\mu(\mathcal{X})$ to $S_n$, which depends on the quality of the set of points $\Gamma_n$ for $S_n$. For $p = \infty$, $L^{(\infty)}_n$ is the Lebesgue constant
$$L^{(\infty)}_n = \sup_{x \in \mathcal{X}} \sum_{i=1}^n |\ell_i(x)|,$$
where $\{\ell_i\}_{i=1}^n$ is a basis of $S_n$ with the interpolation property $\ell_i(t_j) = \delta_{ij}$ for $t_j \in \Gamma_n$.
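The Lebesgue constant can be estimated numerically from the Lagrange basis. A minimal sketch (the grid resolution, degree $n = 10$, and the comparison of equispaced vs. Chebyshev points are illustrative choices):

```python
import numpy as np

def lebesgue_constant(pts, grid):
    """Max over a fine grid of sum_i |l_i(x)|, l_i = Lagrange basis at pts."""
    L = np.ones((len(pts), len(grid)))
    for i, ti in enumerate(pts):
        for j, tj in enumerate(pts):
            if i != j:
                L[i] *= (grid - tj) / (ti - tj)
    return np.abs(L).sum(axis=0).max()

n = 10  # polynomial degree
grid = np.linspace(-1, 1, 5001)
equi = np.linspace(-1, 1, n + 1)                              # equispaced points
cheb = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * (n + 1)))  # Chebyshev points

Lc_e = lebesgue_constant(equi, grid)
Lc_c = lebesgue_constant(cheb, grid)
print(Lc_e, Lc_c)  # equispaced grows fast; Chebyshev stays close to (2/pi) log(n+1) + 1
```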

SLIDE 10

Computing an approximation

An approximation can also be obtained by minimization of an empirical risk functional,
$$\min_{v \in M_n} \frac{1}{m} \sum_{k=1}^m \ell(u(x^k), v(x^k)) \approx \min_{v \in M_n} \mathbb{E}(\ell(u(X), v(X))),$$
where the $x^k$ are samples of $X$ and the risk $\mathbb{E}(\ell(u(X), v(X)))$ provides some "distance" $d(u,v)$ between $u$ and $v$. A better performance can be obtained by solving
$$\min_{v \in M_n} \frac{1}{m} \sum_{k=1}^m w_k\, \ell(u(x^k), v(x^k)),$$
where the $x^k$ are samples in $\mathcal{X}$ drawn from a suitable distribution $d\nu(x) = \rho(x)\, d\mu(x)$ on $\mathcal{X}$, and the weights are $w_k = \rho(x^k)^{-1}$.

SLIDE 11

Computing an approximation

A (weighted) least-squares projection of $u \in L^2_\mu(\mathcal{X}; V)$ is the solution of
$$\min_{v \in M_n} \frac{1}{m} \sum_{k=1}^m \rho(x^k)^{-1} \|u(x^k) - v(x^k)\|_V^2,$$
where the $x^k$ are samples in $\mathcal{X}$ drawn from a certain distribution $d\nu(x) = \rho(x)\, d\mu(x)$ on $\mathcal{X}$. For $M_n = V \otimes S_n$ with $S_n$ an $n$-dimensional subspace of $L^2_\mu(\mathcal{X})$ with orthonormal basis $\{\varphi_i\}_{i=1}^n$, the quality of the least-squares projection depends on how far the empirical Gram matrix
$$G_{ij} = \frac{1}{m} \sum_{k=1}^m w_k\, \varphi_i(x^k)\, \varphi_j(x^k)$$
is from the identity. An optimal weighted least-squares method [Cohen and Migliorati 2017] is obtained with
$$\rho(x) = \frac{1}{n} \sum_{i=1}^n \varphi_i(x)^2.$$
Then for $m \ge n \epsilon^{-2} \log(2 n \eta^{-1})$, this ensures that
$$\mathbb{P}(\|G - I\| > \epsilon) \le \eta \quad \text{and (in particular)} \quad \mathbb{E}(\|u - u_n\|^2_{L^2}) \le C\, e_n(u)^2_{L^2} + \|u\|^2 \eta,$$
with $C = 1 + \frac{1}{1 - \epsilon}\frac{n}{m}$.
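A minimal numerical sketch of this optimal weighted least-squares method. The concrete choices (assumed for illustration) are $S_n$ = polynomials of degree $\le 5$ on $[-1,1]$, uniform $\mu$, target $u(x) = e^x$, and rejection sampling from $d\nu = \rho\, d\mu$ using $\sup \rho = n$:

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)
n = 6  # dimension of the polynomial space S_n (degree <= 5)

def basis(x):
    # Orthonormal Legendre basis w.r.t. dmu(x) = dx/2 on [-1,1]: phi_i = sqrt(2i+1) P_i
    return legendre.legvander(x, n - 1) * np.sqrt(2 * np.arange(n) + 1)

def rho(x):
    # Optimal sampling density rho(x) = (1/n) sum_i phi_i(x)^2, with sup rho = n
    return (basis(x) ** 2).sum(axis=1) / n

def sample_nu(m):
    # Rejection sampling from dnu = rho dmu (acceptance probability rho/n)
    out = np.empty(0)
    while out.size < m:
        x = rng.uniform(-1, 1, 4 * m)
        out = np.concatenate([out, x[rng.uniform(0, 1, 4 * m) < rho(x) / n]])
    return out[:m]

m = 400
x = sample_nu(m)
w = 1.0 / rho(x)                    # weights w_k = rho(x_k)^{-1}
Phi = basis(x)
G = (Phi * w[:, None]).T @ Phi / m  # empirical Gram matrix, close to identity
print("||G - I|| =", np.linalg.norm(G - np.eye(n), 2))

# Weighted least-squares projection of u(x) = exp(x)
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * Phi, sw * np.exp(x), rcond=None)
t = np.linspace(-1, 1, 201)
err = np.max(np.abs(basis(t) @ coef - np.exp(t)))
print("max error of degree-5 fit:", err)
```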

SLIDE 12

Computing an approximation

Given the model's equations $A(x)\, u(x) = f(x)$, with $A(x) : V \to W$ and $f(x) \in W$, an approximation $u_n$ can be obtained through a Galerkin projection(1) of $u$, e.g. defined by
$$\min_{v \in M_n} \int_{\mathcal{X}} \|A(x) v(x) - f(x)\|_W^2 \, d\mu(x) \quad \text{or} \quad \min_{v \in M_n} \sup_{x \in \mathcal{X}} \|A(x) v(x) - f(x)\|_W.$$
If $A(x)$ is a linear operator such that $\alpha \|v\|_V \le \|A(x) v\|_W \le \beta \|v\|_V$, then
$$\|u - u_n\|_{L^p_\mu(\mathcal{X};V)} \le \frac{\beta}{\alpha} \inf_{v \in M_n} \|u - v\|_{L^p_\mu(\mathcal{X};V)}.$$

(1) coined stochastic Galerkin projection

SLIDE 13

Outline

1. Polynomial approximation
2. Sparse approximation
3. Projection based model reduction
4. (Other) model classes for high-dimensional approximation

SLIDE 14

Polynomial approximation

Outline

1. Polynomial approximation
2. Sparse approximation
3. Projection based model reduction
4. (Other) model classes for high-dimensional approximation

SLIDE 15

Polynomial approximation

Polynomial spaces

Let $\mathcal{X} = \mathcal{X}_1 \times \ldots \times \mathcal{X}_d \subset \mathbb{R}^d$. For each dimension $k$, we consider a family of univariate polynomials $\{\psi^k_n\}_{n \ge 0}$ with $\psi^k_n \in \mathbb{P}_n(\mathcal{X}_k)$. Then we define the tensorized basis
$$\psi_\alpha(x) = \psi^1_{\alpha_1}(x_1) \cdots \psi^d_{\alpha_d}(x_d),$$
where $\alpha$ is a multi-index in $\mathbb{N}^d$. For a set $\Lambda \subset \mathbb{N}^d$, we consider the space of polynomials
$$\mathbb{P}_\Lambda(\mathcal{X}) = \operatorname{span}\{\psi_\alpha : \alpha \in \Lambda\}.$$
In general, the polynomial space $\mathbb{P}_\Lambda(\mathcal{X})$ depends on the chosen univariate polynomial bases, except for downward closed sets $\Lambda$, i.e. such that
$$\alpha \in \Lambda \text{ and } \beta \le \alpha \implies \beta \in \Lambda.$$

SLIDE 16

Polynomial approximation

Polynomial interpolation

Let $\Gamma_k = (t^k_i)_{i \ge 0}$ be a sequence of points in $\mathcal{X}_k$ such that the set $(t^k_i)_{i=0}^n$ is unisolvent for $\mathbb{P}_n(\mathcal{X}_k)$, which means that for any $a \in \mathbb{R}^{n+1}$ there exists a unique polynomial $v \in \mathbb{P}_n(\mathcal{X}_k)$ such that $v(t^k_i) = a_i$ for all $0 \le i \le n$, therefore allowing to define the interpolation operator $I^k_n : \mathbb{R}^{\mathcal{X}_k} \to \mathbb{P}_n(\mathcal{X}_k)$.

Then for any downward closed set $\Lambda \subset \mathbb{N}^d$, the set
$$\Gamma_\Lambda = \{t_\alpha = (t^1_{\alpha_1}, \ldots, t^d_{\alpha_d}) : \alpha \in \Lambda\}$$
is unisolvent for $\mathbb{P}_\Lambda(\mathcal{X})$ and uniquely defines an interpolation operator (oblique projection) $I_\Lambda : \mathbb{R}^{\mathcal{X}} \to \mathbb{P}_\Lambda(\mathcal{X})$, whose norm can be bounded using upper bounds of the norms of the one-dimensional interpolation operators.

SLIDE 17

Polynomial approximation

Orthogonal polynomials

When using least-squares or Galerkin projection methods in $L^2_\mu(\mathcal{X})$, the use of orthonormal bases improves the properties of numerical methods.

Consider a product measure $\mu = \mu_1 \otimes \cdots \otimes \mu_d$ with support $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_d$. Let $\{\psi^k_n\}_{n \ge 0}$ be an orthonormal polynomial basis of $L^2_{\mu_k}(\mathcal{X}_k)$, with $\psi^k_n \in \mathbb{P}_n(\mathcal{X}_k)$, such that
$$\int_{\mathcal{X}_k} \psi^k_n(x_k)\, \psi^k_m(x_k)\, d\mu_k(x_k) = \delta_{nm}.$$
Then the tensorized polynomial basis $\{\psi_\alpha(x) = \psi^1_{\alpha_1}(x_1) \cdots \psi^d_{\alpha_d}(x_d)\}_{\alpha \in \mathbb{N}^d}$ constitutes an orthonormal basis of $L^2_\mu(\mathcal{X})$.

Classical examples of univariate orthonormal polynomials are Legendre polynomials for $\mu_k \sim U(-1,1)$ and Hermite polynomials for $\mu_k \sim N(0,1)$.
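These orthonormality relations can be checked numerically with Gauss quadrature. A sketch (degrees and quadrature orders are arbitrary choices):

```python
import numpy as np
from numpy.polynomial import legendre, hermite_e
from math import factorial

# Legendre: phi_n = sqrt(2n+1) P_n is orthonormal w.r.t. mu ~ U(-1,1)
x, w = legendre.leggauss(20)
w = w / 2                                   # dx/2: uniform probability measure
phi = legendre.legvander(x, 4) * np.sqrt(2 * np.arange(5) + 1)
G_leg = (phi * w[:, None]).T @ phi

# Hermite: psi_n = He_n / sqrt(n!) is orthonormal w.r.t. mu ~ N(0,1)
y, v = hermite_e.hermegauss(20)
v = v / np.sqrt(2 * np.pi)                  # normalize to the standard Gaussian
psi = hermite_e.hermevander(y, 4) / np.sqrt([factorial(k) for k in range(5)])
G_her = (psi * v[:, None]).T @ psi

print(np.allclose(G_leg, np.eye(5)), np.allclose(G_her, np.eye(5)))
```

By the product structure, Gram matrices of the tensorized basis are Kronecker products of the univariate ones, so orthonormality carries over to any dimension $d$.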

SLIDE 18

Polynomial approximation

Polynomial approximations

Consider $\mathcal{X} = [-1,1]^d \subset \mathbb{R}^d$ and the space $\mathbb{P}_\Lambda(\mathcal{X})$ of polynomials with partial degree bounded by $p$, where $\Lambda = \{\alpha : \max_k \alpha_k \le p\}$, with dimension $n = \#\Lambda = (p+1)^d$. Assume that $u : \mathcal{X} \to V$ is analytic and can be analytically extended to $\{z \in \mathbb{C}^d : |z_k| \le \tau\} \supset \mathcal{X}$; then
$$e_n(u)_{L^\infty(\mathcal{X})} \lesssim e^{-c_\tau n^{1/d}}.$$
The convergence rate deteriorates with the dimension $d$ (curse of dimensionality). The key to circumventing the curse of dimensionality is to exploit some sparsity.

SLIDE 19

Polynomial approximation

Sparse polynomial spaces

Polynomials with bounded total degree: $\Lambda = \{\alpha : \sum_k \alpha_k \le p\}$ with $\#\Lambda = \frac{(d+p)!}{d!\,p!}$.

Hyperbolic cross sets: $\Lambda = \{\alpha : \prod_k (\alpha_k + 1) \le p\}$ with $\#\Lambda \approx p \log(1+p)^d$.
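A brute-force enumeration of these index sets (for small $d$ and $p$; the values $d = 4$, $p = 5$ are illustrative) makes the cardinalities concrete:

```python
from itertools import product
from math import prod

def tensor_set(d, p):        # partial degree: max_k alpha_k <= p
    return [a for a in product(range(p + 1), repeat=d)]

def total_degree_set(d, p):  # total degree: sum_k alpha_k <= p
    return [a for a in product(range(p + 1), repeat=d) if sum(a) <= p]

def hyperbolic_set(d, p):    # hyperbolic cross: prod_k (alpha_k + 1) <= p
    return [a for a in product(range(p), repeat=d)
            if prod(k + 1 for k in a) <= p]

d, p = 4, 5
print(len(tensor_set(d, p)),        # (p+1)^d = 1296
      len(total_degree_set(d, p)),  # (d+p)!/(d!p!) = 126
      len(hyperbolic_set(d, p)))    # much smaller still
```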

SLIDE 20

Polynomial approximation

Sparse polynomial spaces

Additive polynomial functions: for $\Lambda = \{\alpha : \max_k \alpha_k \le p \text{ and } \#\{k : \alpha_k \ne 0\} \le 1\}$, the space $\mathbb{P}_\Lambda(\mathcal{X})$ corresponds to additive functions
$$\sum_{i=1}^d u_i(x_i)$$
with univariate polynomial functions $u_i$ of degree $p$.

Polynomial functions with low-order interactions: for $\Lambda = \{\alpha : \max_k \alpha_k \le p \text{ and } \#\{k : \alpha_k \ne 0\} \le m\}$, the space $\mathbb{P}_\Lambda(\mathcal{X})$ corresponds to functions with interactions of order $m$,
$$\sum_{i_1, \ldots, i_m} u_{i_1,\ldots,i_m}(x_{i_1}, \ldots, x_{i_m}),$$
with $m$-variate polynomial functions $u_{i_1,\ldots,i_m}$ of degree $p$.

SLIDE 21

Sparse approximation

Outline

1. Polynomial approximation
2. Sparse approximation
3. Projection based model reduction
4. (Other) model classes for high-dimensional approximation

SLIDE 22

Sparse approximation

Best n-term approximation

Let $u \in M = L^p_\mu(\mathcal{X}; V)$ and let $\{\psi_\alpha\}_{\alpha \in F}$ be a basis of $L^p_\mu(\mathcal{X})$, such that
$$u(x) = \sum_{\alpha \in F} u_\alpha \psi_\alpha(x).$$
For a subset $\Lambda \subset F$, let
$$M_\Lambda = \Big\{ v(x) = \sum_{\alpha \in \Lambda} v_\alpha \psi_\alpha(x) : v_\alpha \in V \Big\}.$$
Then we consider the nonlinear model class
$$M_n = \{v \in M_\Lambda : \Lambda \subset F,\ \#\Lambda = n\} = \bigcup_{\#\Lambda = n} M_\Lambda$$
of functions that admit a representation with at most $n$ nonzero coefficients in the basis $\{\psi_\alpha\}_{\alpha \in F}$.

SLIDE 23

Sparse approximation

Best n-term approximation

A best approximation of $u$ in $M_n$ is called a best $n$-term approximation of $u$ relative to the given basis. A best $n$-term approximation $u_n$ is a solution of
$$\min_{v \in M_n} \|u - v\|_{L^p_\mu(\mathcal{X};V)} = \min_{\#\Lambda = n} \min_{v \in M_\Lambda} \|u - v\|_{L^p_\mu(\mathcal{X};V)} := e_n(u)_{L^p},$$
where the minimum is taken over all subsets $\Lambda$ of cardinality $n$. This notion can be extended to more general dictionaries of functions.

SLIDE 24

Sparse approximation

Best n-term approximation

Assuming that the functions $\psi_\alpha$ are normalized in $L^p_\mu(\mathcal{X})$,
$$\min_{v \in M_\Lambda} \|u - v\|_{L^p_\mu(\mathcal{X};V)} \le \Big\| \sum_{\alpha \notin \Lambda} u_\alpha \psi_\alpha \Big\|_{L^p_\mu(\mathcal{X};V)} \le \sum_{\alpha \notin \Lambda} \|u_\alpha\|_V.$$
Therefore, by choosing a set $\Lambda_n$ corresponding to the $n$ largest terms $\|u_\alpha\|_V$, we obtain a bound of the best $n$-term approximation error:
$$e_n(u)_{L^p} \le \sum_{\alpha \notin \Lambda_n} \|u_\alpha\|_V.$$
If the sequence $c = (\|u_\alpha\|_V)_\alpha \in \ell^r$ with $r < 1$, Stechkin's lemma yields
$$e_n(u)_{L^p} \le C n^{-s}, \qquad s = \frac{1}{r} - 1,$$
with $C = \|c\|_{\ell^r} = \big(\sum_\alpha |c_\alpha|^r\big)^{1/r}$.
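Stechkin's bound can be illustrated numerically on a hypothetical non-increasing sequence $c_k = k^{-2}$ (so $c \in \ell^r$ for any $r > 1/2$; the choice $r = 0.6$ is arbitrary):

```python
import numpy as np

# Hypothetical coefficient sequence, already sorted in decreasing order
k = np.arange(1, 200001)
c = k ** -2.0
r = 0.6
s = 1 / r - 1  # Stechkin rate for the l^1-type tail bound

tails = np.cumsum(c[::-1])[::-1]    # tails[n-1] = sum of the terms beyond the n largest
C = (c ** r).sum() ** (1 / r)       # C = ||c||_{l^r}
n = np.array([10, 100, 1000])
print(tails[n - 1])                 # actual tails (here they decay like 1/n)
print(C * n ** -s)                  # Stechkin bound C n^{-s}, s = 2/3: always above
```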

SLIDE 25

Sparse approximation

Best n-term approximation

Assuming that $\{\psi_\alpha\}$ is an orthonormal basis of $L^2_\mu(\mathcal{X})$,
$$\min_{v \in M_\Lambda} \|u - v\|^2_{L^2_\mu(\mathcal{X};V)} = \Big\| \sum_{\alpha \notin \Lambda} u_\alpha \psi_\alpha \Big\|^2_{L^2_\mu(\mathcal{X};V)} = \sum_{\alpha \notin \Lambda} \|u_\alpha\|_V^2.$$
Therefore, by choosing a set $\Lambda_n$ corresponding to the $n$ largest terms $\|u_\alpha\|_V$, we obtain the best $n$-term approximation error:
$$e_n(u)^2_{L^2} = \sum_{\alpha \notin \Lambda_n} \|u_\alpha\|_V^2.$$
If the sequence $c = (\|u_\alpha\|_V)_\alpha \in \ell^r$ with $r < 1$, Stechkin's lemma yields
$$e_n(u)_{L^2} \le C n^{-s}, \qquad s = \frac{1}{r} - \frac{1}{2},$$
with $C = \|c^2\|^{1/2}_{\ell^{r/2}}$.

SLIDE 26

Sparse approximation

Parameter-dependent equations

Consider the parameter-dependent equation
$$-\nabla \cdot (a(x) \nabla u(x)) = f \ \text{ in } D \subset \mathbb{R}^m, \qquad u(x) = 0 \ \text{ on } \partial D,$$
with the uniform ellipticity assumption $0 < \gamma \le a(x) \le \beta < \infty$, and a particular parametrization
$$a(x) = a_0 + \sum_{i=1}^d a_i x_i, \qquad x \in \mathcal{X} = [-1,1]^d, \quad d \in \mathbb{N} \cup \{+\infty\}.$$
Consider the Taylor expansion of $u$ at $0$:
$$u(x) = \sum_{\alpha \in F} u_\alpha x^\alpha, \qquad u_\alpha = \frac{1}{\alpha!} \partial^\alpha u(0).$$
SLIDE 27

Sparse approximation

Parameter-dependent equations

Bounds on $\|u_\alpha\|_V$ can be obtained by complex analysis. The solution admits an analytic extension to the complex polydisc $\{z \in \mathbb{C}^d : |z_k| \le 1\}$. If $\rho = (\rho_i)_{i \ge 1}$ is any sequence such that
$$\sum_{i \ge 1} \rho_i |a_i| \le a_0 - \zeta \quad \text{for some } 0 < \zeta < \gamma,$$
the solution admits an analytic extension $u(z)$ to an even larger polydisc $\{z \in \mathbb{C}^d : |z_k| \le \rho_k\}$ with $\rho_k > 1$, and
$$\|u_\alpha\|_V \le \delta(\alpha), \qquad \delta(\alpha) = C_\zeta \prod_{i \ge 1} \rho_i^{-\alpha_i}.$$

SLIDE 28

Sparse approximation

Parameter-dependent equations

Assuming that $(\|a_i\|_{L^\infty(D)})_{i \ge 1} \in \ell^r$, we can design a sequence $\rho$ such that $(\delta(\alpha))_{\alpha \in F} \in \ell^r$. Therefore, if $(\|a_i\|_{L^\infty(D)})_{i \ge 1} \in \ell^r$ for some $r < 1$, then $(\|u_\alpha\|_V)_{\alpha \in F} \in \ell^r$ and the best $n$-term approximation in the canonical basis $\{x^\alpha\}_\alpha$ satisfies
$$e_n(u)_{L^\infty} \le C n^{-s}, \qquad s = \frac{1}{r} - 1.$$
We observe an algebraic convergence rate independent of the number of parameters, which may even be infinite! This result remains valid in the more general case of parameter-dependent operator equations $A(x) u(x) = f$, where $A(x) : V \to W$ is such that $A(x) = A_0 + \sum_i A_i x_i$ and $(\|A_i\|_{V \to W})_{i \ge 1} \in \ell^r$.

The same performance is obtained by imposing the sets $\Lambda$ to be downward closed.

SLIDE 29

Sparse approximation

More general parameter-dependent equations

For different types of models (different parametrizations, nonlinearity), the solution may not admit an analytic extension to a complex polydisc containing X, so that Taylor expansion may not converge. However, by using a Legendre polynomial basis (or rescaled Legendre basis), it is possible to exploit the fact that the solution admits an analytic extension on a smaller complex domain (contained in a polyellipse).

SLIDE 30

Sparse approximation

Index sets based on estimates of coefficients

Assuming that we know an upper bound of the coefficients,
$$\|u_\alpha\|_V \le \delta(\alpha), \quad (1)$$
a subset $\Lambda^\delta_n$ is obtained by retaining the $n$ largest values $\delta(\alpha)$. The resulting set is close to optimal if the bound (1) is sharp. Upper bounds $\delta(\alpha)$ can be obtained from a priori analysis (a priori definition of the sequence $\Lambda^\delta_n$) or from a posteriori analysis (adaptive construction).

Assuming that there exists $\gamma \ge 1$ such that $\gamma^{-1}\delta(\alpha) \le \|u_\alpha\|_V \le \delta(\alpha)$, we have
$$\|u - u_{\Lambda^\delta_n}\|^2_{L^2_\mu(\mathcal{X};V)} = \sum_{\alpha \notin \Lambda^\delta_n} \|u_\alpha\|_V^2 \le \sum_{\alpha \notin \Lambda^\delta_n} \delta(\alpha)^2 = \min_{\#\Lambda_n = n} \sum_{\alpha \notin \Lambda_n} \delta(\alpha)^2 \le \gamma^2 \min_{\#\Lambda_n = n} \sum_{\alpha \notin \Lambda_n} \|u_\alpha\|_V^2,$$
and therefore
$$\|u - u_{\Lambda^\delta_n}\|_{L^2_\mu(\mathcal{X};V)} \le \gamma\, e_n(u)_{L^2} \quad \text{(quasi-optimality)}.$$

SLIDE 31

Sparse approximation

Index sets based on estimates of coefficients

In practice, we can define a sequence of subsets $\Lambda_p = \{\alpha : \delta(\alpha) \ge \epsilon(p)\}$ with $(\epsilon(p))_{p \ge 0}$ a decreasing sequence. Assume that
$$\|u_\alpha\|_V \le C \prod_k \rho_k^{-\alpha_k} = C\, e^{-\sum_k \omega_k \alpha_k} := \delta(\alpha), \qquad \omega_k = \log \rho_k.$$
Taking $\epsilon(p) = C e^{-p}$, we obtain
$$\Lambda_p = \Big\{\alpha : \sum_k \omega_k \alpha_k \le p\Big\},$$
which corresponds to polynomials with bounded weighted total degree.

SLIDE 32

Sparse approximation

Index sets based on estimates of coefficients

Assume that
$$\|u_\alpha\|_V \le C \prod_k (1 + \alpha_k)^{-\omega_k} := \delta(\alpha).$$
Taking $\epsilon(p) = C p^{-1}$, we obtain
$$\Lambda_p = \Big\{\alpha : \prod_k (1 + \alpha_k)^{\omega_k} \le p\Big\},$$
which is an anisotropic hyperbolic cross set.

SLIDE 33

Sparse approximation

Adaptive constructions of index sets

Adaptive algorithms for sparse approximation construct an increasing sequence of subsets (Λn)n≥1 in F and a sequence of approximations un ∈ MΛn computed through interpolation, regression or other projection methods. The sequence of subsets is defined by Λn = Λn−1 ∪ An where An is a subset of a candidate set Nn. The definition of Nn requires a strategy for the exploration of the set F. The definition of An requires a selection strategy, usually based on error estimates.

SLIDE 34

Sparse approximation

Adaptive constructions of index sets

For a given downward closed set $\Lambda$, a natural neighborhood is given by the margin of $\Lambda$,
$$\mathcal{M}(\Lambda) = \{\alpha \in F \setminus \Lambda : \exists \beta \in \Lambda \text{ s.t. } \|\alpha - \beta\|_1 = 1\},$$
or the reduced margin of $\Lambda$,
$$\mathcal{M}_r(\Lambda) = \{\alpha \in F \setminus \Lambda : \alpha - e_k \in \Lambda \text{ for all } k \text{ s.t. } \alpha_k \ge 1\}.$$
[Figures: a set $\Lambda$ and its margin $\mathcal{M}(\Lambda)$; a set $\Lambda$ and its reduced margin $\mathcal{M}_r(\Lambda)$]

For a downward closed set $\Lambda$, an interesting property of the reduced margin $\mathcal{M}_r(\Lambda)$ is that for any subset $A \subset \mathcal{M}_r(\Lambda)$, $\Lambda \cup A$ is downward closed.
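The margin operations can be sketched in a few lines (the example set $\Lambda \subset \mathbb{N}^2$ is arbitrary):

```python
def is_downward_closed(Lam):
    # beta <= alpha and alpha in Lam must imply beta in Lam;
    # checking all alpha - e_k is sufficient
    Lam = set(Lam)
    return all(tuple(a[j] - (j == k) for j in range(len(a))) in Lam
               for a in Lam for k in range(len(a)) if a[k] >= 1)

def reduced_margin(Lam):
    # alpha not in Lam such that alpha - e_k in Lam for all k with alpha_k >= 1
    Lam = set(Lam)
    d = len(next(iter(Lam)))
    cand = {tuple(a[j] + (j == k) for j in range(d)) for a in Lam for k in range(d)}
    return {a for a in cand - Lam
            if all(tuple(a[j] - (j == k) for j in range(d)) in Lam
                   for k in range(d) if a[k] >= 1)}

Lam = {(0, 0), (1, 0), (0, 1), (2, 0)}   # a downward closed set in N^2
Mr = reduced_margin(Lam)
print(sorted(Mr))
# Lam union any element of its reduced margin stays downward closed:
print(all(is_downward_closed(Lam | {a}) for a in Mr))
```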

SLIDE 35

Projection based model reduction

Outline

1. Polynomial approximation
2. Sparse approximation
3. Projection based model reduction
4. (Other) model classes for high-dimensional approximation

SLIDE 36

Projection based model reduction

Parameter-dependent equations

We consider the case of models described by parameter-dependent equations F(u(x); x) = 0, x ∈ X, where the solution u(x) is in a high-dimensional space V (e.g. a finite element approximation space for PDEs). The complexity limits the number of evaluations of u(x). However, for many problems, the solution manifold M = {u(x) : x ∈ X} has a low effective dimension, i.e. it can be well approximated by a low dimensional subspace Vn of V .

SLIDE 37

Projection based model reduction

Parameter-dependent equations

This is exploited by projection-based model reduction methods, which consist in projecting the solution $u(x)$ onto a suitable subspace $V_n$, resulting in an approximation
$$u_n(x) = \sum_{i=1}^n v_i \varphi_i(x),$$
where the $v_i \in V$ form a basis of $V_n$ and $\varphi_i : \mathcal{X} \to \mathbb{R}$. This can be interpreted as a rank-$n$ approximation of $u$, seen as an element of $V \otimes \mathbb{R}^{\mathcal{X}}$. For $u \in L^p_\mu(\mathcal{X}; V)$, this is equivalent to considering the model classes
$$M_n = L^p_\mu(\mathcal{X}; V_n) = V_n \otimes L^p_\mu(\mathcal{X}).$$

SLIDE 38

Projection based model reduction

Measuring the quality of subspaces

Consider a Banach space $V$ equipped with a norm $\|\cdot\|_V$. For a given instance $x \in \mathcal{X}$, the quality of a subspace $V_n$ is measured through the best approximation error
$$d(u(x), V_n) = \inf_{v \in V_n} \|u(x) - v\|_V.$$
When we are interested in controlling the worst-case error, the map $u$ is seen as an element of $L^\infty(\mathcal{X}; V)$ and the quality of $V_n$ is measured by
$$\inf_{v \in L^\infty(\mathcal{X};V_n)} \|u - v\|_{L^\infty(\mathcal{X};V)} = \sup_{x \in \mathcal{X}} d(u(x), V_n) = \sup_{f \in \mathcal{M}} d(f, V_n).$$
When $\mathcal{X}$ is equipped with a measure and we are interested in controlling a mean-squared error, the map is seen as an element of $L^2_\mu(\mathcal{X}; V)$ and the quality of $V_n$ is measured by
$$\inf_{v \in L^2_\mu(\mathcal{X};V_n)} \|u - v\|^2_{L^2_\mu(\mathcal{X};V)} = \int_{\mathcal{X}} d(u(x), V_n)^2\, d\mu(x) = \int_{\mathcal{M}} d(f, V_n)^2\, d\nu(f),$$
where $\nu = u_\#\mu$ is the push-forward measure of $\mu$ through the solution map $u$.

SLIDE 39

Projection based model reduction

Optimal subspaces in the worst case setting

Optimal spaces $V_n$ for the worst-case error are solutions of
$$\inf_{\dim(V_n)=n}\ \inf_{v \in L^\infty(\mathcal{X};V_n)} \|u - v\|_{L^\infty(\mathcal{X};V)} = \inf_{\dim(V_n)=n}\ \sup_{f \in \mathcal{M}} d(f, V_n) := d_n(\mathcal{M})_V.$$
$d_n(\mathcal{M})_V$ is the Kolmogorov $n$-width of the set $\mathcal{M}$ in $V$, which measures how well $\mathcal{M}$ can be approximated by $n$-dimensional subspaces. It quantifies the ideal performance of linear approximation methods, since for any approximation of $u$ of the form $u_n(x) = \sum_{i=1}^n v_i \varphi_i(x)$,
$$\|u - u_n\|_{L^\infty(\mathcal{X};V)} \ge d_n(\mathcal{M})_V.$$
Upper bounds for $d_n(\mathcal{M})_V$ can be obtained by constructing particular approximations $u_n(x)$ (e.g. polynomial approximations).

SLIDE 40

Projection based model reduction

Optimal subspaces in the mean-squared setting

Optimal spaces $V_n$ in the mean-squared sense are solutions of
$$\inf_{\dim(V_n)=n}\ \inf_{v \in L^2_\mu(\mathcal{X};V_n)} \|u - v\|^2_{L^2_\mu(\mathcal{X};V)} = \inf_{\dim(V_n)=n} \int_{\mathcal{X}} d(u(x), V_n)^2\, d\mu(x) := e_n(u)^2_{L^2}.$$
$e_n(u)_{L^2}$ is another notion of linear $n$-width of the manifold $\mathcal{M}$ equipped with the measure $\nu = u_\#\mu$. If $V$ is a Hilbert space and $\mu$ is a probability measure,
$$e_n(u)^2_{L^2} = \inf_{\dim(V_n)=n} \int_{\mathcal{X}} \|u(x) - P_{V_n} u(x)\|_V^2\, d\mu(x) = \inf_{\dim(V_n)=n} \mathbb{E}(\|u(X) - P_{V_n} u(X)\|_V^2),$$
and optimal spaces $V_n$ are the $n$-dimensional principal subspaces of the $V$-valued random variable $u(X)$. This corresponds to principal component analysis, and the optimal approximation $u_n(x) = P_{V_n} u(x)$ is the truncated Karhunen-Loève decomposition of $u(X)$.

SLIDE 41

Projection based model reduction

n-widths for parameter-dependent equations

Consider the parameter-dependent equation
$$-\nabla \cdot (a(x) \nabla u(x)) = f \ \text{ in } D \subset \mathbb{R}^m, \qquad u(x) = 0 \ \text{ on } \partial D,$$
with the assumption $0 < \gamma \le a(x) \le \beta < \infty$ for all $x \in \mathcal{X}$. The problem admits a unique solution $u(x) \in H^1_0(D) = V$, and
$$\|u(x)\|_V \le \frac{1}{\gamma} \|f\|_{H^{-1}(D)}.$$
Therefore the solution manifold $\mathcal{M}$ is a bounded subset of $V$. This says nothing about the convergence of $d_n(\mathcal{M})_V$. If $f \in H^{s-1}(D)$, $a(x) \in C^s(D)$ and $D$ is sufficiently regular, then $\mathcal{M}$ is a bounded subset of $H^{s+1}(D)$, hence compact in $V$ when $s \ge 1$, and
$$d_n(\mathcal{M})_V \lesssim n^{-s/m}.$$
This performance is achieved by generic approximation spaces $V_n$ such as splines on uniform meshes. Finer assumptions are required to reveal the interest of projection-based model reduction methods.

SLIDE 42

Projection based model reduction

n-widths for parameter-dependent equations

Consider a particular parametrization
$$a(x) = a_0 + \sum_{i=1}^d a_i x_i, \qquad x_i \in [-1,1].$$
From results on best $n$-term approximations using polynomial bases, we obtain bounds on the $n$-widths of $\mathcal{M}$. If $d < \infty$, we have exponential convergence of $d_n(\mathcal{M})_V$, with a deterioration of the convergence rate when $d$ increases. If $d = \infty$ and $(\|a_i\|_\infty)_{i \ge 1} \in \ell^r$ for some $r < 1$, then
$$d_n(\mathcal{M})_V \lesssim n^{-s}, \qquad s = \frac{1}{r} - 1.$$

SLIDE 43

Projection based model reduction

n-widths for parameter-dependent equations

More general results have been obtained for parameter-dependent equations $F(u(a); a) = 0$, $u(a) \in V$, where $a$ belongs to some compact set $\mathcal{A}$ of a complex Banach space $A$ (e.g. $L^\infty(D)$). If $u : a \in \mathcal{A} \mapsto u(a) \in \mathcal{M}$ is holomorphic, then
$$d_n(\mathcal{A})_A \lesssim n^{-s} \implies d_n(\mathcal{M})_V \lesssim n^{-r} \ \text{ with } r < s - 1.$$
For details, see [Cohen & DeVore 2015].

SLIDE 44

Projection based model reduction

Practical construction of subspaces in the mean-squared setting

Optimal subspaces $V_n$ are usually out of reach, but suboptimal constructions can be proposed. In the mean-squared setting, empirical principal component analysis (or Proper Orthogonal Decomposition) defines subspaces $V_n$ as solutions of
$$\min_{\dim(V_n)=n} \frac{1}{m} \sum_{i=1}^m \|u(x^i) - P_{V_n} u(x^i)\|_V^2,$$
where the $u(x^i)$ are samples of $u(X)$. The resulting spaces $V_n$ are nested subspaces contained in $\operatorname{span}\{u(x^1), \ldots, u(x^m)\}$.

Proper Generalized Decomposition (or Generalized Spectral Decomposition) defines spaces $V_n$ as solutions of
$$\min_{\dim(V_n)=n}\ \inf_{v \in L^2_\mu(\mathcal{X};V_n)} \int_{\mathcal{X}} \Delta(u(x), v(x))\, \mu(dx).$$
Assuming that $\Delta(u,v) \sim \|u - v\|_V^2$, the resulting spaces $V_n$ are such that
$$\mathbb{E}(\|u(X) - P_{V_n} u(X)\|_V^2) \lesssim e_n(u)^2_{L^2}.$$
Constructive algorithms are obtained by imposing a nestedness property $V_{n-1} \subset V_n$. See [Nouy 2017].
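Empirical POD reduces to an SVD of the snapshot matrix (for the Euclidean norm on a discretized $V$). A sketch on a hypothetical solution map whose manifold is exactly 3-dimensional, so that the singular values drop to zero after the third:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter-dependent field u(x)(t) = sum_j x_j sin((j+1) pi t),
# discretized on a grid: the solution manifold spans a 3-dimensional subspace
t = np.linspace(0, 1, 100)
m = 50
X = rng.uniform(-1, 1, (m, 3))
U = np.stack([sum(x[j] * np.sin((j + 1) * np.pi * t) for j in range(3)) for x in X],
             axis=1)  # snapshot matrix, one column per sample

# POD basis = leading left singular vectors of the snapshot matrix
W, s, _ = np.linalg.svd(U, full_matrices=False)
Vn = W[:, :3]
residual = np.linalg.norm(U - Vn @ (Vn.T @ U)) / np.linalg.norm(U)
print(s[:5], residual)
```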

SLIDE 45

Projection based model reduction

Practical construction of subspaces in the worst-case setting

In the worst-case setting, a greedy algorithm defines spaces $V_n = \operatorname{span}\{u(x^1), \ldots, u(x^n)\}$ with adaptively chosen samples
$$x^{n+1} = \arg\max_{x \in \mathcal{X}} \|u(x) - P_{V_n} u(x)\|_V.$$
The quality of $V_n$ is assessed by
$$\sigma_n = \sup_{f \in \mathcal{M}} \|f - P_{V_n} f\|_V.$$
If $d_n(\mathcal{M})_V \lesssim n^{-s}$, then $\sigma_n \lesssim n^{-s}$. If $d_n(\mathcal{M})_V \lesssim e^{-a n^\alpha}$, then $\sigma_n \lesssim e^{-b n^\alpha}$. See [DeVore et al 2013].
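A sketch of the greedy selection on a hypothetical snapshot family $u(x)(t) = e^{-xt}$ over a discrete training set (the exact projection error is used here in place of an error estimator, an illustrative simplification):

```python
import numpy as np

t = np.linspace(0, 1, 200)
xs = np.linspace(0.5, 2.0, 100)             # discrete training set of parameters
M = np.stack([np.exp(-x * t) for x in xs])  # rows = snapshots u(x)

def proj(Vn, F):
    # Orthogonal projection of the rows of F onto span of the rows of Vn
    Q, _ = np.linalg.qr(Vn.T)
    return (Q @ (Q.T @ F.T)).T

Vn = M[[0]]                                  # start from one snapshot
errs = []
for _ in range(5):
    res = np.linalg.norm(M - proj(Vn, M), axis=1)
    errs.append(res.max())                   # worst-case error over the training set
    Vn = np.vstack([Vn, M[res.argmax()]])    # greedy: add the worst-approximated snapshot
print(errs)
```

The family $\{e^{-xt}\}$ has rapidly decaying $n$-widths, so the worst-case errors drop quickly with each greedy step.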

SLIDE 46

Projection based model reduction

Practical construction of subspaces in the worst-case setting

In practice, samples are chosen such that
$$x^{n+1} = \arg\max_{x \in \mathcal{X}_N} \Delta(u(x), u_n(x)),$$
where $\mathcal{X}_N$ is a discrete (training) set in $\mathcal{X}$, $u_n(x)$ is some projection of $u(x)$ onto $V_n$ (typically a Galerkin projection), and $\Delta(u(x), u_n(x))$ is an estimator of the error $\|u(x) - u_n(x)\|$. This is the basic idea of reduced basis methods. An algorithm using a random selection of training sets $\mathcal{X}_N$ is analyzed in [Cohen et al 2018]. Any projection $u_n(x)$ of $u(x)$ onto $V_n = \operatorname{span}\{u(x^1), \ldots, u(x^n)\}$ interpolates the solution map $u$ at the points $\{x^1, \ldots, x^n\}$. For parameter-dependent equations $A(x) u(x) = f(x)$ with $A(x) : V \to W$, a Galerkin projection can be defined by
$$u_n(x) = \arg\min_{v \in V_n} \|A(x) v - f(x)\|_W.$$
If $A(x)$ is linear and $A(x)$ and $f(x)$ depend polynomially on $x$, then $u_n(x)$ is a rational interpolation of $u(x)$.

SLIDE 47

(Other) model classes for high-dimensional approximation

Outline

1. Polynomial approximation
2. Sparse approximation
3. Projection based model reduction
4. (Other) model classes for high-dimensional approximation

SLIDE 48

(Other) model classes for high-dimensional approximation

Model classes for high-dimensional approximation

Standard model classes include:

Linear models: $a_1 x_1 + \ldots + a_d x_d$.

Polynomial models: $\sum_{\alpha \in \Lambda} a_\alpha x^\alpha$, where $\Lambda \subset \mathbb{N}^d$ is a set of multi-indices, either fixed (linear approximation) or free (nonlinear approximation).

Other model classes include more general expansions $\sum_{i=1}^n a_i \psi_i(x)$, where the $\psi_i$ are either fixed (linear approximation) or freely selected in a dictionary of functions (nonlinear approximation).

SLIDE 49

(Other) model classes for high-dimensional approximation

Model classes for high-dimensional approximation

Additive models: $u_1(x_1) + \ldots + u_d(x_d)$, or more generally $\sum_{\alpha \in T} u_\alpha(x_\alpha)$, where $T \subset 2^{\{1,\ldots,d\}}$ is either fixed (linear approximation) or a free parameter (nonlinear approximation).

Multiplicative models: $u_1(x_1) \cdots u_d(x_d)$, or more generally $\prod_{\alpha \in T} u_\alpha(x_\alpha)$, where $T \subset 2^{\{1,\ldots,d\}}$ is either fixed or a free parameter.

SLIDE 50

(Other) model classes for high-dimensional approximation

Composition of functions

Composition of functions:
$$f(g(x)) = f(g_1(x), \ldots, g_m(x)),$$
where $g$ is a map from $\mathbb{R}^d$ to $\mathbb{R}^m$ and $f : \mathbb{R}^m \to \mathbb{R}$ has a low-dimensional parametrization.

Linear transformations (ridge functions): $f(Wx)$ with $W \in \mathbb{R}^{m \times d}$. A typical example is the perceptron $a\, \sigma(w^T x + b)$. For large $m$, this requires specific models for $f$, e.g.
$$f(g_1(x), \ldots, g_m(x)) = f_1(g_1(x)) + \ldots + f_m(g_m(x)).$$
A sum of $m$ perceptrons is a shallow neural network (with one hidden layer of width $m$):
$$\sum_{i=1}^m a_i\, \sigma(w_i^T x + b_i).$$
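A shallow network of this form is a few lines of linear algebra (all sizes and the choice $\sigma = \tanh$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def shallow(x, W, b, a, sigma=np.tanh):
    # sum_i a_i sigma(w_i^T x + b_i): one hidden layer of width m
    return sigma(x @ W.T + b) @ a

d, m = 3, 8
W = rng.normal(size=(m, d))   # hidden weights w_i
b = rng.normal(size=m)        # hidden biases b_i
a = rng.normal(size=m)        # output weights a_i

x = rng.normal(size=(5, d))   # batch of 5 input points
print(shallow(x, W, b, a).shape)
```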

SLIDE 51

(Other) model classes for high-dimensional approximation

More compositions... deep neural networks

Deep neural networks: further compositions
$$g_L \circ g_{L-1} \circ \ldots \circ g_2 \circ g_1(x).$$
Deep convolutional networks:
$$f_{1,2,3,4}\big(f_{1,2}(f_1(x_1), f_2(x_2)),\ f_{3,4}(f_3(x_3), f_4(x_4))\big)$$
[Tree: $\{1,2,3,4\}$ with children $\{1,2\} \to \{1\},\{2\}$ and $\{3,4\} \to \{3\},\{4\}$]

Deep recurrent networks:
$$f_{1,2,3,4}\big(f_{1,2,3}(f_{1,2}(f_1(x_1), f_2(x_2)),\ f_3(x_3)),\ f_4(x_4)\big)$$
[Tree: $\{1,2,3,4\} \to \{1,2,3\}, \{4\}$; $\{1,2,3\} \to \{1,2\}, \{3\}$; $\{1,2\} \to \{1\}, \{2\}$]

SLIDE 52

(Other) model classes for high-dimensional approximation

Low rank tensor formats

A multivariate function $v(x_1, \ldots, x_d)$ is identified with an element of a tensor product space $H_1 \otimes \ldots \otimes H_d$, where $H_\nu$ is a vector space of functions of the variable $x_\nu$.

Function with rank one (elementary tensor): $v(x) = u_1(x_1) \cdots u_d(x_d)$.

Function with canonical rank $r$:
$$v(x) = \sum_{k=1}^r u^k_1(x_1) \cdots u^k_d(x_d).$$

SLIDE 53

(Other) model classes for high-dimensional approximation

Low rank tensor formats

For a subset of variables $\alpha \subset \{1, \ldots, d\} := D$, $v(x)$ can be identified with a bivariate function $v(x_\alpha, x_{\alpha^c})$, where $x_\alpha$ and $x_{\alpha^c}$ are complementary groups of variables. The canonical rank of this bivariate function is called the $\alpha$-rank of $v$, denoted $\operatorname{rank}_\alpha(v)$; it is the minimal integer $r_\alpha$ such that
$$v(x) = \sum_{k=1}^{r_\alpha} v^\alpha_k(x_\alpha)\, w^{\alpha^c}_k(x_{\alpha^c}).$$
For $T \subset 2^D$ a collection of subsets of $D$, a tensor format is defined by
$$\mathcal{T}^T_r = \{v : \operatorname{rank}_\alpha(v) \le r_\alpha,\ \alpha \in T\}.$$
Tree-based formats correspond to a tree-structured $T$.

[Trees: Tucker format ($\{1,\ldots,5\}$ with leaves $\{1\},\ldots,\{5\}$); hierarchical Tucker format (balanced binary tree); tensor train format (linear tree)]
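The $\alpha$-rank of a discretized function can be computed as the rank of a matricization. A sketch on a hypothetical rank-2 function of three variables (grid size and the factor functions are arbitrary choices):

```python
import numpy as np

# Discretize v(x1,x2,x3) = sin(x1)cos(x2)exp(x3) + cos(x1)sin(x2)tanh(x3)
# on a tensor grid: a tensor with canonical rank 2
g = np.linspace(-1, 1, 20)
def outer3(f1, f2, f3):
    return np.einsum('i,j,k->ijk', f1(g), f2(g), f3(g))
V = outer3(np.sin, np.cos, np.exp) + outer3(np.cos, np.sin, np.tanh)

def alpha_rank(T, alpha, tol=1e-10):
    # Matricization: group the modes in alpha as rows, the complement as columns,
    # then count singular values above tol
    d = T.ndim
    comp = [k for k in range(d) if k not in alpha]
    M = np.transpose(T, list(alpha) + comp).reshape(
        int(np.prod([T.shape[k] for k in alpha])), -1)
    return int((np.linalg.svd(M, compute_uv=False) > tol).sum())

print([alpha_rank(V, [0]), alpha_rank(V, [1]), alpha_rank(V, [0, 1])])
```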

SLIDE 54

(Other) model classes for high-dimensional approximation

Tree-based tensor formats as deep networks

A tensor $v$ in $\mathcal{T}^T_r$ admits a parametrization with parameters $\{f_\alpha\}_{\alpha \in T}$ forming a tree network of low-dimensional multilinear functions (tensors):
$$v(x) = f_{1,2,3,4,5}\big(f_{1,2,3}\big(f_1(x_1),\ f_{2,3}(f_2(x_2), f_3(x_3))\big),\ f_{4,5}(f_4(x_4), f_5(x_5))\big),$$
where for $1 \le \nu \le d$, $f_\nu : \mathcal{X}_\nu \to \mathbb{R}^{r_\nu}$, and for any node $\alpha$ with children $\beta_1, \ldots, \beta_s$, $f_\alpha : \mathbb{R}^{r_{\beta_1}} \times \ldots \times \mathbb{R}^{r_{\beta_s}} \to \mathbb{R}^{r_\alpha}$ is a multilinear function, identified with a tensor in $\mathbb{R}^{r_\alpha \times r_{\beta_1} \times \ldots \times r_{\beta_s}}$. This corresponds to a deep network with a particular architecture and multilinear functions: a very specific structure allowing the design of stable algorithms for constructing approximations in this format.

SLIDE 55

(Other) model classes for high-dimensional approximation

Conclusions

A lot remains to be done for nonlinear approximation tools:

- characterize classes of functions for which these approximation tools achieve a certain performance (e.g. algebraic or exponential rates of convergence),
- find problems that involve these classes of functions,
- provide algorithms (interpolation, regression, Galerkin...) that achieve (almost) the ideal performance.

SLIDE 56

(Other) model classes for high-dimensional approximation

References I

Polynomial and Sparse approximation

Ivo Babuska, Fabio Nobile, and Raul Tempone. A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Review, 52(2):317–355, 2010.

G. Migliorati, F. Nobile, E. von Schwerin, and R. Tempone. Analysis of discrete L2 projection on polynomial spaces with random evaluations. Foundations of Computational Mathematics, 14(3):419–456, 2014.

A. Chkifa, A. Cohen, and C. Schwab. High-dimensional adaptive sparse polynomial interpolation and applications to parametric PDEs. Foundations of Computational Mathematics, 14(4):601–633, 2014.

Albert Cohen and Ronald DeVore. Approximation of high-dimensional parametric PDEs. Acta Numerica, 24:1–159, 2015.

A. Cohen and G. Migliorati. Optimal weighted least-squares methods. SMAI Journal of Computational Mathematics, 3:181–203, 2017.

SLIDE 57

(Other) model classes for high-dimensional approximation

References II

Y. Maday, N. C. Nguyen, A. T. Patera, and G. S. H. Pau. A general multipurpose interpolation procedure: the magic points. Communications on Pure and Applied Analysis, 8(1):383–404, 2009.

Projection-based model order reduction - Reduced Basis methods

A. Nouy. Low-Rank Tensor Methods for Model Order Reduction, pages 857–882. Springer International Publishing, Cham, 2017.

Alfio Quarteroni, Andrea Manzoni, and Federico Negri. Reduced Basis Methods for Partial Differential Equations: An Introduction, volume 92. Springer, 2015.

Jan S. Hesthaven, Gianluigi Rozza, and Benjamin Stamm. Certified Reduced Basis Methods for Parametrized Partial Differential Equations. Springer Briefs in Mathematics. Springer, Switzerland, 1st edition, 2015.

R. DeVore, G. Petrova, and P. Wojtaszczyk. Greedy algorithms for reduced bases in Banach spaces. Constructive Approximation, 37(3):455–466, 2013.

Albert Cohen and Ronald DeVore. Kolmogorov widths under holomorphic mappings. IMA Journal of Numerical Analysis, dru066, 2015.

SLIDE 58

(Other) model classes for high-dimensional approximation

References III

Albert Cohen, Wolfgang Dahmen, and Ronald DeVore. Reduced Basis Greedy Selection Using Random Training Sets. arXiv e-prints, arXiv:1810.09344, October 2018.

Y. Maday and O. Mula. A generalized empirical interpolation method: application of reduced basis techniques to data assimilation. In Franco Brezzi, Piero Colli Franzone, Ugo Gianazza, and Gianni Gilardi, editors, Analysis and Numerics of Partial Differential Equations, volume 4 of Springer INdAM Series, pages 221–235. Springer Milan, 2013.

Low-rank tensors

W. Hackbusch. Tensor Spaces and Numerical Tensor Calculus, volume 42 of Springer Series in Computational Mathematics. Springer, Heidelberg, 2012.

B. Khoromskij. Tensor-structured numerical methods in scientific computing: survey on recent advances. Chemometrics and Intelligent Laboratory Systems, 110(1):1–19, 2012.

A. Nouy. Low-rank methods for high-dimensional approximation and model order reduction. In P. Benner, A. Cohen, M. Ohlberger, and K. Willcox, editors, Model Reduction and Approximation: Theory and Algorithms. SIAM, Philadelphia, PA, 2017.

SLIDE 59

(Other) model classes for high-dimensional approximation

References IV

Markus Bachmayr, Reinhold Schneider, and André Uschmajew. Tensor networks and hierarchical tensors for the solution of high-dimensional partial differential equations. Foundations of Computational Mathematics, pages 1–50, 2016.

B. B. Khoromskij and C. Schwab. Tensor-structured Galerkin approximation of parametric and stochastic elliptic PDEs. SIAM Journal on Scientific Computing, 33(1):364–385, 2011.

A. Nouy. Higher-order principal component analysis for the approximation of tensors in tree-based low-rank formats. Numerische Mathematik, 2019. arXiv preprint available.

E. Grelier, A. Nouy, and M. Chevreuil. Learning with tree-based tensor formats. arXiv e-prints, arXiv:1811.04455, November 2018.
