High Dimensional Approximation - Background and Sources
Wolfgang Dahmen
Seminar: USC, High Dimensional Approximation, Feb 13, 2008
Outline
1. Learning Theory: Regression; Basic Concepts; Remedies
2. Specific Problem Areas: Climatology - An Example; Finance; Electronic Structure Calculation; Stochastic Multiscale Modeling; An Instance of “Manifold Learning”
3. Methodological Aspects: Summary of Key Issues; Compressed Sensing; Greedy Techniques
Learning Theory - Regression Problem
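$\rho$ unknown measure on $Z := X \times Y$
$$f_\rho(x) := \int_Y y \, d\rho(y|x) = E(y|x)$$
[Figure: axes X and Y; points x, x']
$$\mathcal{E}(f) := \int_Z (y - f(x))^2 \, d\rho, \qquad \mathcal{E}(f) = \mathcal{E}(f_\rho) + \|f - f_\rho\|^2_{L_2(X,\rho_X)}$$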
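The risk decomposition above can be checked by hand on a small discrete measure. The following is a minimal sketch, assuming a hypothetical toy measure `rho` on a three-point X (the probabilities and the competitor `f` are illustrative, not from the slides); exact rational arithmetic makes both sides of the identity agree exactly.

```python
from fractions import Fraction as F

# Hypothetical toy measure rho on Z = X x Y, stored as {(x, y): probability}.
rho = {
    (0, 1): F(1, 8), (0, 3): F(1, 8),
    (1, 0): F(1, 4), (1, 4): F(1, 4),
    (2, 5): F(1, 4),
}

def marginal_x(rho):
    """Marginal rho_X(x) = sum over y of rho(x, y)."""
    m = {}
    for (x, _), p in rho.items():
        m[x] = m.get(x, 0) + p
    return m

def regression_function(rho):
    """f_rho(x) = E(y | x) = sum_y y * rho(x, y) / rho_X(x)."""
    m = marginal_x(rho)
    num = {}
    for (x, y), p in rho.items():
        num[x] = num.get(x, 0) + y * p
    return {x: num[x] / m[x] for x in m}

def risk(f, rho):
    """E(f) = integral over Z of (y - f(x))^2 d rho."""
    return sum(p * (y - f[x]) ** 2 for (x, y), p in rho.items())

def dist_sq(f, g, rho):
    """Squared L2(X, rho_X) distance between f and g."""
    m = marginal_x(rho)
    return sum(m[x] * (f[x] - g[x]) ** 2 for x in m)

f_rho = regression_function(rho)
f = {0: F(1), 1: F(3), 2: F(4)}  # arbitrary competitor (illustrative)
lhs = risk(f, rho)
rhs = risk(f_rho, rho) + dist_sq(f, f_rho, rho)
print(lhs, rhs)
```

In this toy case $f_\rho$ takes the values 2, 2, 5, and any other `f` pays exactly its squared $L_2(X,\rho_X)$ distance to $f_\rho$ on top of the unavoidable risk $\mathcal{E}(f_\rho)$.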
Concepts - Obstructions
Relevant concepts:
- Nonparametric estimation, concentration inequalities, nonlinear approximation
Solution strategies:
- Adaptive partitioning
- Complexity regularization (model selection)
dim X large - Curse of dimensionality: Are there ways around it?
Ameliorating the curse of dimensionality
- Dimensionwise decompositions - ANOVA-type schemes
- Kernel methods, neural networks
- Sparse grids, hyperbolic cross approximation
- Kronecker-product approximation
- Dimension reduction - “learning” embedded manifolds
- Recovery schemes: greedy algorithms; procedural recovery (sparse occupancy trees, Sprecher’s algorithm)
A higher level of difficulty:
- Learning on Banach spaces (dim X = ∞)
- Learning implicitly given functions, e.g. solutions of stochastic PDEs
Dynamical System Input
$$\frac{\partial \psi}{\partial t} + D(\psi, x) = P(\psi, x)$$
- $\psi$: 3D prognostic dependent variable, e.g. temperature, pressure, moisture, etc.
- $x$: 3D independent variable, e.g. latitude, longitude, height
- $D$: model dynamics, PDE of motion, thermodynamics, balance laws, etc.
- $P$: model physics, long and short range atmospheric radiation, turbulence, convection, clouds, interactions with land, chemistry, etc.
P is very complicated, even in simplified parametrized versions, which are based on solving deterministic equations.
Alternative: Learning
Instead of computing the forcing terms by solving deterministic equations, which takes most of the computing time, one tries to “learn” P from acquired data.
Problem: Given $Z = \{z_i = (x_i, y_i) \in X \times Y \subset \mathbb{R}^d \times \mathbb{R}^{d'} : i = 1, \dots, N\}$, find $f : \mathbb{R}^d \to \mathbb{R}^{d'}$ with $f(x_i) = y_i$, $i = 1, \dots, N$.
Possible strategy: Sparse occupancy trees
Question: Are there reasonable error bounds? Concentration of measure phenomenon
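To make the idea of storing only occupied cells concrete, here is a simplified sketch in the spirit of sparse occupancy trees. It is an assumption-laden illustration, not the algorithm from the talk: the class name, the restriction to scalar outputs (d' = 1), the domain [0, 1)^d, and the plain cell averaging are all choices made here. The point is that memory grows with the number of samples times the depth, not with 2^(d · level).

```python
from collections import defaultdict

def cell_key(x, level):
    """Index of the dyadic cube of side 2**-level in [0, 1)^d that contains x."""
    n = 2 ** level
    return tuple(min(int(xi * n), n - 1) for xi in x)

class SparseOccupancyTree:
    """Hypothetical simplified variant: piecewise constant averages on occupied cells."""

    def __init__(self, points, values, max_level):
        self.max_level = max_level
        # cells[level][key] = [sum of y, count]; unoccupied cells are never created
        self.cells = [defaultdict(lambda: [0.0, 0]) for _ in range(max_level + 1)]
        for x, y in zip(points, values):
            for level in range(max_level + 1):
                acc = self.cells[level][cell_key(x, level)]
                acc[0] += y
                acc[1] += 1

    def num_cells(self):
        return sum(len(level_cells) for level_cells in self.cells)

    def predict(self, x):
        """Average of the data in the finest occupied cell that contains x."""
        for level in range(self.max_level, -1, -1):
            key = cell_key(x, level)
            if key in self.cells[level]:  # membership test does not create a cell
                total, count = self.cells[level][key]
                return total / count
        raise ValueError("no data stored")

# Illustrative data in d = 5
points = [(0.1,) * 5, (0.9,) * 5, (0.12,) * 5]
values = [1.0, 3.0, 2.0]
tree = SparseOccupancyTree(points, values, max_level=3)
print(tree.predict((0.05,) * 5), tree.predict((0.95,) * 5), tree.num_cells())
```

A query far from all data falls back to coarser cells, so the predictor is defined everywhere while only 7 cells are stored here, compared with 8^5 cells on the full finest grid.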
Finance - high dimensional integration
In the US, mortgages last 30 years and may be repaid each month, which gives 12 × 30 = 360 repayment possibilities.
Computation of a 360-dimensional expected value:
$$\int_0^1 \cdots \int_0^1 f(x_1, \dots, x_{360}) \, dx_1 \cdots dx_{360}$$
Note: A quadrature rule with $k$ nodes in $[0, 1]$ requires $k^{360}$ point evaluations...
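The size of $k^{360}$ is easy to underestimate, so a short calculation may help. The sketch below counts the nodes of a tensor-product rule and, for contrast, estimates a 360-dimensional integral by plain Monte Carlo sampling. Monte Carlo is not discussed on this slide; it is used here only as a familiar dimension-robust baseline, and the integrand (the coordinate average, with exact integral 1/2) is a hypothetical test case.

```python
import random

d = 360  # 12 months x 30 years

def tensor_grid_evaluations(k, d):
    """Number of nodes of a tensor-product rule with k nodes per coordinate."""
    return k ** d

def monte_carlo_mean(f, d, n_samples, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(d)]
        total += f(x)
    return total / n_samples

def integrand(x):
    """Hypothetical test integrand: average of the coordinates, exact integral 1/2."""
    return sum(x) / len(x)

n_nodes = tensor_grid_evaluations(2, d)  # already with only 2 nodes per coordinate
estimate = monte_carlo_mean(integrand, d, n_samples=2000)
print(len(str(n_nodes)), "decimal digits in 2**360")
print("Monte Carlo estimate with 2000 samples:", estimate)
```

Even the coarsest nontrivial rule (k = 2) needs a number of evaluations with 109 decimal digits, whereas the sampling cost of the Monte Carlo estimate does not depend exponentially on d.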
Electronic Structure Calculation
Goal: Numerical simulation of molecular phenomena in chemistry, molecular biology, semiconductor devices, material sciences...
“Ab-initio” calculations based on first principles in quantum mechanics (ignoring relativistic effects and using the Born-Oppenheimer approximate model)
Quantum mechanical postulates:
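- System of $N$ identical (non-relativistic) particles with spin $s_i$, described by a state function $\psi(x_1, s_1; \dots; x_N, s_N)$, $\psi : \mathbb{R}^{3N} \otimes S^N \to \mathbb{C}$, $\langle \psi, \psi \rangle = 1$
- $\psi$ satisfies the (stationary) Schrödinger equation with Hamiltonian $H$:
$$H\psi = E_0 \psi, \qquad E_0 = \min_{\langle \psi, \psi \rangle = 1} \langle H\psi, \psi \rangle$$
- Born-Oppenheimer:
$$H = \sum_{i=1}^{N} \Big( -\frac{1}{2}\Delta_i - \sum_{j=1}^{M} \frac{z_j}{|x_i - R_j|} + \frac{1}{2} \sum_{j \ne i} \frac{1}{|x_i - x_j|} \Big)$$
where $z_j$ = charge of the $j$th nucleus at position $R_j$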
Typical Applications
- Simulation of porous media flow, contamination prediction, well protection
- Understanding heterogeneous materials like concrete
Classical diffusion equation:
$$-\operatorname{div}(A \nabla u) = f \ \text{in } D \subset \mathbb{R}^d, \qquad u|_{\partial D} = 0, \qquad (d = 2, 3) \qquad (1)$$
$A = A(x)$ describes the diffusivity of the material.
Problem: In heterogeneous porous media the small scales of the material make it impossible to describe all details by $A$ and to resolve them numerically.
Stochastic Model
Idea: view $A$ as a random field ($A = aI$ scalar) about which (coarsely sampled) measurements provide uncertain information:
$$a = a(\cdot, \omega), \qquad \omega \mapsto a(\cdot, \omega) \in L_\infty(D) =: X, \qquad \omega \in \Omega,$$
where $(\Omega, \Sigma, \rho)$ is a probability space on the data space $X$.
Proposition: When $a(\cdot, \omega)$ stays bounded away from zero $\rho$-a.s., then (1) is well posed, i.e. there exists a unique $u(\cdot, \omega) : \Omega \to H^1_0(D)$ which is a weak solution of (1).
Transformation into Parameter Dependent PDE
Typical goal: determine $\bar{u} = E_\Omega(u)$
Possible strategy, Ansatz:
$$a(x, \omega) = E_\Omega(a)(x) + \sum_{m=1}^{\infty} a_m(x) \, y_m(\omega)$$
Specification of $a_m(x)$, $y_m(\omega)$ via the “Karhunen-Loève expansion” ... lots of stochastic assumptions ...
$$-\operatorname{div}\big(a_M(x; y_1, \dots, y_M) \nabla u_M(x)\big) = f(x), \quad x \in D, \qquad u|_{\partial D} = 0 \qquad (2)$$
Issues and Objectives
- Solve (2) by numerical methods
- Balance discretization error and truncation error due to $M$
- Compute function $u(x; y_1, \dots, y_M)$ of $d + M$ variables
- Number of variables $M$ in $y$ becomes a discretization parameter
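A one-dimensional toy version of (2) shows how the parameters y enter as extra variables. The sketch below is an illustration under stated assumptions, not the setting of the talk: it takes d = 1 on D = (0, 1), a constant mean coefficient 1, hypothetical modes a_m(x) = sin(m π x) / (2 m²), parameters y_m drawn uniformly from [-1, 1], a standard finite difference discretization, and a plain sample average for the mean of the solution at x = 1/2.

```python
import math
import random

def diffusion_coefficient(x, y):
    """a_M(x; y) = 1 + sum_m a_m(x) y_m with hypothetical modes sin(m pi x) / (2 m^2).
    For |y_m| <= 1 this stays above 1 - pi^2 / 12 > 0."""
    return 1.0 + sum(ym * math.sin((m + 1) * math.pi * x) / (2.0 * (m + 1) ** 2)
                     for m, ym in enumerate(y))

def solve_diffusion(y, n=100):
    """Finite differences for -(a u')' = 1 on (0, 1) with u(0) = u(1) = 0.
    Returns the nodal values u_0, ..., u_n on a uniform grid."""
    h = 1.0 / n
    a_mid = [diffusion_coefficient((i + 0.5) * h, y) for i in range(n)]  # a at midpoints
    lower = [-a_mid[i - 1] for i in range(1, n)]
    diag = [a_mid[i - 1] + a_mid[i] for i in range(1, n)]
    upper = [-a_mid[i] for i in range(1, n)]
    rhs = [h * h for _ in range(1, n)]
    # Thomas algorithm for the tridiagonal system
    for j in range(1, n - 1):
        w = lower[j] / diag[j - 1]
        diag[j] -= w * upper[j - 1]
        rhs[j] -= w * rhs[j - 1]
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / diag[-1]
    for j in range(n - 3, -1, -1):
        u[j] = (rhs[j] - upper[j] * u[j + 1]) / diag[j]
    return [0.0] + u + [0.0]

def mean_midpoint_value(num_params, n_samples, seed=0):
    """Sample average of u(1/2; y) over y uniform in [-1, 1]^M."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = [rng.uniform(-1.0, 1.0) for _ in range(num_params)]
        total += solve_diffusion(y)[50]
    return total / n_samples

u_unperturbed = solve_diffusion([])  # M = 0: a = 1, exact solution x (1 - x) / 2
mean_u = mean_midpoint_value(num_params=4, n_samples=200)
print(u_unperturbed[50], mean_u)
```

Each sample requires one PDE solve, and the solution is a function of the 1 + M variables (x, y_1, ..., y_M); increasing M reduces the truncation error of the expansion but raises the dimension of the parameter domain.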
An Instance of “Manifold Learning”
Optimal control, shape optimization, parameter dependent PDEs:
$$F(u; y) = f, \qquad u = u(\cdot, y) \in H, \qquad y \in Y$$
Manifold:
$$\mathcal{M} := \{u(\cdot; y) : y \in Y\} \subset H$$
Analogously: replace $H$ by $H_N$, so that $\mathcal{M} \subset \mathbb{R}^N$:
$$u_N(x; y) = \sum_{i=1}^{N} u_i(y) \phi_i(x), \qquad u_N(\cdot; y) \leftrightarrow (u_1(y), \dots, u_N(y)) \in \mathbb{R}^N$$
Objective: Assess this manifold with complexity $\ll N$
Reduced Order Methods: Maday, Patera, ...
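What "complexity much smaller than N" can mean is easiest to see on a toy family. The sketch below uses a hypothetical two-parameter family of coefficient vectors in R^50 and a simple greedy snapshot selection with Gram-Schmidt orthogonalization. This is an illustrative stand-in chosen here, using the true projection error on a finite set of snapshots; it should not be read as the reduced basis method of the cited authors.

```python
import math

def snapshot(y, n=50):
    """Coefficient vector (u_1(y), ..., u_N(y)) of a hypothetical two-parameter family."""
    xs = [(i + 1) / (n + 1) for i in range(n)]
    return [y[0] * math.sin(math.pi * x) + y[1] * math.sin(2 * math.pi * x) for x in xs]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual(u, basis):
    """u minus its orthogonal projection onto span(basis); basis must be orthonormal."""
    r = list(u)
    for q in basis:
        c = dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return r

def greedy_basis(snapshots, tol=1e-10):
    """Repeatedly add the snapshot that is worst approximated by the current span."""
    basis, errors = [], []
    for _ in range(len(snapshots[0])):
        residuals = [residual(u, basis) for u in snapshots]
        norms = [math.sqrt(dot(r, r)) for r in residuals]
        worst = max(range(len(snapshots)), key=norms.__getitem__)
        errors.append(norms[worst])
        if norms[worst] <= tol:
            break
        basis.append([ri / norms[worst] for ri in residuals[worst]])
    return basis, errors

params = [(y1, y2) for y1 in (0.0, 0.5, 1.0) for y2 in (0.0, 0.5, 1.0)]
snapshots = [snapshot(y) for y in params]
basis, errors = greedy_basis(snapshots)
print(len(basis), errors)
```

Although every snapshot lives in R^50, two basis vectors capture the whole sampled family up to rounding error, which is the kind of low intrinsic complexity the objective above aims to exploit.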
Summary of Key Issues
Enemies:
- Complexity of neighborhood search, strong dependence on particular norm
- Curse of dimensionality, exponential dependence on $d$
Remedies?
- Greedy techniques with problem adapted dictionaries
- Dimension reduction techniques (compressed sensing)
- Sparsity preserving recovery techniques, anisotropy
A New Paradigm in Signal Processing
Classical model:
- bandlimited signals
- sampling at Nyquist rate, $x_i = x(t_i)$
Compressed Sensing (CS)
- Sparsity model:
  $x \in \mathbb{R}^N$, $x = \Psi z$, $\Psi \in \mathbb{R}^{N \times N}$, $\#\operatorname{supp} z = k \ll N$
- Change notion of sampling:
  $x \to \phi_j \cdot x$, $j = 1, \dots, n$, $n \ll N$
- Rate $\sim$ information content: $k \sim n$
Goal: Minimize a priori the number of measurements of complex signals $x \in \mathbb{R}^N$ while retaining “essential” information
A Simple Effect
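$$y_\ell = \frac{1}{2\pi} \int_0^{2\pi} f(t) e^{-i\ell t} \, dt \approx \frac{1}{N} \sum_{j=0}^{N-1} \underbrace{f(2\pi j/N)}_{=: x_j} \, \underbrace{e^{-i\ell 2\pi j/N}}_{=: \phi_{\ell,j}} = (\Phi x)_\ell$$
Exact reconstruction:
$$f = \operatorname{argmin} \{ \|g'\|_1 : \hat{g}(\ell) = \hat{f}(\ell), \ |\ell| \le k \}$$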
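The discretized Fourier coefficients above are linear measurements of the sample vector x, which a few lines of code make explicit. This is a minimal sketch with a hypothetical test signal f(t) = 3 + 2 cos(2t), for which the N-point rule reproduces the exact coefficients y_0 = 3 and y_{±2} = 1 (up to rounding).

```python
import cmath
import math

def fourier_coefficient(f, ell, n_samples):
    """Approximate y_ell = (1 / 2 pi) * integral_0^{2 pi} f(t) e^{-i ell t} dt
    by (1 / N) * sum_j f(2 pi j / N) e^{-i ell 2 pi j / N}."""
    total = 0j
    for j in range(n_samples):
        t = 2.0 * math.pi * j / n_samples
        total += f(t) * cmath.exp(-1j * ell * t)
    return total / n_samples

def signal(t):
    """Hypothetical test signal: a trigonometric polynomial of degree 2."""
    return 3.0 + 2.0 * math.cos(2.0 * t)

coeffs = {ell: fourier_coefficient(signal, ell, n_samples=16) for ell in range(-3, 4)}
for ell in sorted(coeffs):
    print(ell, round(abs(coeffs[ell]), 12))
```

Only three of the seven computed coefficients are nonzero, so this signal is sparse in the Fourier domain; each y_ell is an inner product of the samples x_j with a fixed, data-independent vector.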
Key Task
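Question: How to design data-independent linear functionals $\phi_i$, $i = 1, \dots, n \ll N$, such that one can still recover “substantial” information on $x$ from $\phi_i \cdot x$, $i = 1, \dots, n$?
Formally: $y_i = \phi_i \cdot x$, i.e. $y = \Phi x$, $\Phi \in \mathbb{R}^{n \times N}$
[Figure: the $n \times N$ matrix $\Phi$ applied to a vector $x$ with support $T$, $\#T = k$]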
Decoding Concepts
Main Issue: Recovery of sparsity
- $\ell_1$-minimization (Donoho, Candes/Romberg/Tao, ...): $x^* = \operatorname{argmin}_{\Phi z = y} \|z\|_{\ell_1}$
- Greedy algorithms (Gilbert/Tropp, Cohen/D/DeVore, Temlyakov, ...)
Goal: Use such concepts in other contexts (Guermond/Popov)
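As an illustration of a greedy decoder, here is a sketch of orthogonal matching pursuit (OMP), one standard greedy scheme, recovering a 2-sparse vector from y = Φx. The choices are assumptions made for this example: Φ = [I | H/√n] with a Sylvester Hadamard matrix H (n = 16, N = 32), picked because its mutual coherence 1/√n = 1/4 is known and small enough that OMP provably recovers every 2-sparse vector. The slides do not prescribe this matrix, and with n = N/2 it is far from the regime n ≪ N.

```python
import math

def hadamard_entry(i, j):
    """Entry of the Sylvester Hadamard matrix: (-1)^(number of common 1-bits)."""
    return -1.0 if bin(i & j).count("1") % 2 else 1.0

def sensing_matrix(n):
    """Columns of Phi = [I | H / sqrt(n)] as a list of N = 2n unit vectors (n a power of 2)."""
    spikes = [[1.0 if r == c else 0.0 for r in range(n)] for c in range(n)]
    walsh = [[hadamard_entry(r, c) / math.sqrt(n) for r in range(n)] for c in range(n)]
    return spikes + walsh

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def measure(columns, x):
    """y = Phi x for Phi given by its columns."""
    n = len(columns[0])
    return [sum(col[r] * xc for col, xc in zip(columns, x)) for r in range(n)]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            w = M[r][c] / M[c][c]
            M[r] = [a - w * pivot for a, pivot in zip(M[r], M[c])]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        x[r] = (M[r][m] - sum(M[r][c] * x[c] for c in range(r + 1, m))) / M[r][r]
    return x

def omp(columns, y, k):
    """k greedy steps: pick the column most correlated with the residual,
    then refit all chosen coefficients by least squares (normal equations)."""
    support, coeffs, r = [], [], y[:]
    for _ in range(k):
        j = max(range(len(columns)), key=lambda c: abs(dot(columns[c], r)))
        support.append(j)
        gram = [[dot(columns[a], columns[b]) for b in support] for a in support]
        coeffs = solve(gram, [dot(columns[a], y) for a in support])
        r = y[:]
        for a, ca in zip(support, coeffs):
            r = [ri - ca * ci for ri, ci in zip(r, columns[a])]
    x = [0.0] * len(columns)
    for a, ca in zip(support, coeffs):
        x[a] = ca
    return x, support

columns = sensing_matrix(16)
x_true = [0.0] * 32
x_true[3], x_true[20] = 2.0, -1.5           # hypothetical 2-sparse signal
y = measure(columns, x_true)                 # 16 measurements of a vector in R^32
x_rec, support = omp(columns, y, k=2)
print(support, max(abs(a - b) for a, b in zip(x_rec, x_true)))
```

The decoder never sees which entries of x are nonzero; it identifies the support from the correlations with the residual and then solves a 2 × 2 least squares problem.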
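Greedy algorithms - Curse of dimensionality
$H$ Hilbert space, $\mathcal{D} \subset H$ dictionary, $\|g\| = 1$, $g \in \mathcal{D}$, $f \in H$
- $r_0 = f$, $f_0 = 0$
- given $f_{k-1}$ determine $g_k := \operatorname{argmax}_{g \in \mathcal{D}} \langle r_{k-1}, g \rangle$ and set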