
SLIDE 1

Doctoral Program Computational Mathematics
Numerical Analysis and Symbolic Computation

A Stochastic Convergence Analysis for Tikhonov-Regularization with Sparsity Constraints

Daniel Gerth, Ronny Ramlau
Sparse Tomo Days, Lyngby, Denmark, 28.03.14

SLIDE 2

Overview
  • Introduction
  • Bayesian approach
  • Convergence theorem
  • Convergence rates
  • Numerical examples


SLIDE 4

Introduction

We study the solution of the linear ill-posed problem $Ax = y$ with $A \in \mathcal{L}(X, Y)$, where $X$ and $Y$ are Hilbert spaces:
  • we seek solutions $x$ that are sparse with respect to a given orthonormal basis (ONB)
  • the observed data are assumed to be noisy

Basic deterministic model:
$$\|Ax - y^\delta\|^2 + \hat\alpha\,\Phi_{w,p}(x) \to \min_x \qquad (1)$$
with penalty $\Phi_{w,p}(x) = \sum_{\lambda\in\Lambda} w_\lambda |\langle x, \psi_\lambda\rangle|^p$ for an ONB $\{\psi_\lambda\}$.
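To make the penalty concrete, here is a minimal sketch (our own illustration, not from the talk; the array names are assumptions) evaluating $\Phi_{w,p}$ on a vector of basis coefficients:

```python
import numpy as np

def sparsity_penalty(coeffs, weights, p):
    """Phi_{w,p}(x) = sum over lambda of w_lambda * |<x, psi_lambda>|^p,
    evaluated on the array of basis coefficients <x, psi_lambda>."""
    return np.sum(weights * np.abs(coeffs) ** p)
```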

SLIDE 5-6

Introduction

noise modelling

Two different approaches:

  deterministic                         stochastic
  worst-case error                      stochastic information
  $\|y^\delta - y\| \le \delta$         e.g. $y^\sigma \sim N(y, \sigma^2)$, $E(\|y^\sigma - y\|) = f(\sigma)$, ...
  "easy" analysis                       "hard" analysis
  "fast" algorithms                     "slow" algorithms
  $\delta$ hard to get                  parameter $\sigma$ easy to get

We want to combine the advantages and find links between both branches.

Question: Can we prove convergence (rates) for sparsity regularization if we use an explicit stochastic noise model instead of the worst-case error?

SLIDE 7

Introduction

stochastic noise model

  • based on discretization; the computation requires discretization anyway
  • done via projections $P_m : Y \to \mathbb{R}^m$, $y \mapsto \bar y$ (e.g. point evaluation), and $T_n : X \to \mathbb{R}^n$, $\bar x = T_n x = \{\langle x, \psi_i\rangle\}_{i=1,\dots,n}$, where $\{\psi_i\}_{i=1}^\infty$ is an ONB in $X$
  • each component of $\bar y$ carries stochastic noise: $y^\sigma = \bar y + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I_m)$

Define $\bar A := P_m A T_n^*$; then we want to find $\bar x$ such that
$$\bar A \bar x = y^\sigma. \qquad (2)$$
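A sketch of this data model in code; the forward matrix below is a placeholder that only fixes the shapes, and all names and sizes are our assumptions ($\sigma$ and $m$ are borrowed from the talk's later examples):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, sigma = 2500, 2500, 0.01

A_bar = np.eye(m, n)                  # placeholder for P_m A T_n^* (an m x n matrix)
x_bar = rng.random(n)                 # coefficient vector T_n x
y_bar = A_bar @ x_bar                 # exact discrete data
eps = rng.normal(0.0, sigma, size=m)  # componentwise noise, eps ~ N(0, sigma^2 I_m)
y_sigma = y_bar + eps                 # observed data y^sigma
```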


SLIDE 9-10

Bayesian approach

We use Bayes' formula to characterize the solution. In this framework, every quantity is treated as a random variable in a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$:
$$\pi_{post}(x|y^\sigma) = \frac{\pi_\varepsilon(y^\sigma|x)\,\pi_{pr}(x)}{\pi_{y^\sigma}(y^\sigma)}.$$

  • $\pi_{post}(x|y^\sigma)$ — posterior density
  • $\pi_\varepsilon(y^\sigma|x)$ — likelihood function
  • $\pi_{pr}(x)$ — prior distribution
  • $\pi_{y^\sigma}(y^\sigma)$ — data distribution (irrelevant)

Gaussian error model: $\pi_\varepsilon \propto \exp\big(-\tfrac{1}{2\sigma^2}\|\bar A x - y^\sigma\|^2\big)$. Now we need a prior.

SLIDE 11

Bayesian approach

Besov spaces

We are looking for sparse reconstructions with respect to a basis in $X$.

  • Our choice: a Besov-space $B^s_{p,p}(\mathbb{R}^d)$ prior

Reasons:
  • "easy" characterization via the coefficients of a wavelet expansion
  • sparsity-promoting properties known; connection to TV regularization
  • discretization invariance (Lassas, Saksman, Siltanen '09), avoiding the following phenomena:
    - solutions diverge as $m \to \infty$
    - solutions diverge as $n \to \infty$
    - representation of a-priori knowledge is incompatible with discretization (this is the case, e.g., for a TV prior)

SLIDE 12

Bayesian approach

We consider a wavelet basis suitable for multiresolution analysis. Let $\{\psi_\lambda : \lambda \in \Lambda\}$ denote the set of all wavelets $\psi$, also including the scaling functions, where $\Lambda$ is an appropriate, possibly infinite index set. Write $|\lambda| = j$ for the scale of $\psi_\lambda$. Then $x \in B^s_{p,p}(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$, $s < \tilde s$, if
$$\|x\|_{B^s_{p,p}(\mathbb{R}^d)} := \Big(\sum_{\lambda\in\Lambda} 2^{\varsigma p |\lambda|}\,|\langle x, \psi_\lambda\rangle|^p\Big)^{1/p} < \infty$$
with $\varsigma = s + d\big(\tfrac12 - \tfrac1p\big) \ge 0$. We focus on $1 \le p \le 2$.

SLIDE 13

Bayesian approach

Besov-space random variables

Definition (adapted from Lassas/Saksman/Siltanen, 2009)

Let $1 \le p < \infty$ and $s \in \mathbb{R}$. Let $X$ be the random function
$$X(t) = \sum_{\lambda\in\Lambda} 2^{-\varsigma|\lambda|}\,X^\alpha_\lambda\,\psi_\lambda(t), \quad t \in \mathbb{R}^d,$$
where the coefficients $(X^\alpha_\lambda)_{\lambda\in\Lambda}$ are independent identically distributed real-valued random variables with probability density function
$$\pi_{X^\alpha_\lambda}(\tau) = c^\alpha_p \exp\Big(-\frac{\alpha|\tau|^p}{2}\Big), \quad c^\alpha_p = \Big(\frac{\alpha}{2}\Big)^{1/p}\frac{p}{2\,\Gamma(1/p)}, \quad \tau \in \mathbb{R}.$$
Then we say $X$ is distributed according to a $B^s_{p,p}$-prior,
$$X \propto \exp\Big(-\frac{\alpha}{2}\,\|X\|^p_{B^s_{p,p}(\mathbb{R}^d)}\Big).$$
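This coefficient density can be sampled exactly: if $\pi(\tau) \propto \exp(-\alpha|\tau|^p/2)$, then $|X|^p$ is Gamma-distributed with shape $1/p$ and scale $2/\alpha$, and the sign is symmetric. A sketch (our own, not from the talk):

```python
import numpy as np

def sample_besov_coeffs(alpha, p, size, rng):
    """Draw i.i.d. samples X^alpha_lambda with density proportional to
    exp(-alpha*|tau|^p/2): |X|^p ~ Gamma(shape=1/p, scale=2/alpha), random sign."""
    g = rng.gamma(shape=1.0 / p, scale=2.0 / alpha, size=size)
    return rng.choice([-1.0, 1.0], size=size) * g ** (1.0 / p)

rng = np.random.default_rng(0)
xs = sample_besov_coeffs(alpha=2.0, p=1.0, size=100_000, rng=rng)
print(np.mean(np.abs(xs)), 2.0 / (2.0 * 1.0))  # sanity check: E|X|^p = 2/(alpha*p)
```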

SLIDE 14-15

Bayesian approach

"Problem": $\mathbb{P}\big(X \in B^s_{p,p}(\mathbb{R}^d)\big) = 0$

Theorem (adapted from Lassas/Saksman/Siltanen, 2009)

Let $X$ be as before, $2 < \alpha < \infty$, and take $r \in \mathbb{R}$. Then the following three conditions are equivalent:
(i) $\|X\|_{B^r_{p,p}(\mathbb{R}^d)} < \infty$ almost surely,
(ii) $E \exp\big(\|X\|^p_{B^r_{p,p}(\mathbb{R}^d)}\big) < \infty$,
(iii) $r < s - \frac{d}{p}$.

Same result as [LSS 2009], but here $\mathbb{R}^d$ is considered instead of $\mathbb{T}^d$.

SLIDE 16-18

Bayesian approach

How to avoid this phenomenon? Two models:

"finite model" (MI)
  • consider the discretization levels $m$ and $n$ fixed, with finite index set $\Lambda_n$; then
$$X_n(t) := \sum_{\lambda\in\Lambda_n} 2^{-\varsigma|\lambda|}\,X^\alpha_\lambda\,\psi_\lambda(t) \;\Rightarrow\; \|X_n\|^p_{B^s_{p,p}(\mathbb{R}^d)} = \sum_{\lambda\in\Lambda_n} |X^\alpha_\lambda|^p < \infty$$
and
$$\mathbb{P}\big(\|X_n\|_{B^s_{p,p}(\mathbb{R}^d)} > \varrho\big) = \frac{\Gamma\big(\tfrac{n}{p}, \tfrac{\alpha\varrho^p}{2}\big)}{\Gamma\big(\tfrac{n}{p}\big)} \le \frac{1}{\varrho^p}\,\frac{2n}{\alpha p}.$$

"infinite model" (MII)
  • define $X(t)$ in $B^r_{p,p}(\mathbb{R}^d)$ with $s < r - \frac{d}{p}$; then
$$E\big(\|X\|_{B^s_{p,p}(\mathbb{R}^d)}\big) = \Big[\frac{2}{\alpha p}\Big(c^1_\lambda + c^2_\lambda \sum_{j=0}^\infty 2^{-j((r-s)p-d)}\Big)\Big]^{1/p} < \infty$$
and, by Markov's inequality,
$$\mathbb{P}\big(\|X\|_{B^s_{p,p}(\mathbb{R}^d)} > \varrho\big) \le \frac{1}{\varrho}\,E\big(\|X\|_{B^s_{p,p}(\mathbb{R}^d)}\big).$$
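The (MI) tail is exactly the regularized upper incomplete gamma function, since $\|X_n\|^p$ is a sum of $n$ i.i.d. Gamma$(1/p,\,\text{rate}\ \alpha/2)$ variables. A quick check against Monte Carlo (the test values are our choices):

```python
import numpy as np
from scipy.special import gammaincc  # regularized upper incomplete gamma

def tail_prob_MI(n, p, alpha, rho):
    """P(||X_n||_{B^s_{p,p}} > rho) = Gamma(n/p, alpha*rho^p/2) / Gamma(n/p)."""
    return gammaincc(n / p, alpha * rho ** p / 2.0)

rng = np.random.default_rng(1)
n, p, alpha, rho = 100, 1.0, 2.0, 110.0
S = rng.gamma(1.0 / p, 2.0 / alpha, size=(20_000, n)).sum(axis=1)  # samples of ||X_n||^p
print(tail_prob_MI(n, p, alpha, rho), np.mean(S > rho ** p))       # should roughly agree
```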

SLIDE 19-20

Bayesian approach

Recall
$$\pi_{post}(x|y^\sigma) = \frac{\pi_{pr}(x)\,\pi_\varepsilon(y^\sigma|x)}{\pi_{y^\sigma}(y^\sigma)},$$
with $\pi_\varepsilon(y^\sigma|x)$ the Gaussian noise model and $\pi_{pr}(x)$ the Besov-space prior. Hence
$$\pi_{post}(x|y^\sigma) \propto \exp\Big(-\frac{1}{2\sigma^2}\|\bar Ax - y^\sigma\|^2\Big) \cdot \exp\Big(-\frac{\alpha}{2}\|x\|^p_{B^s_{p,p}(\mathbb{R}^d)}\Big).$$

We are interested in the maximum a-posteriori (MAP) solution
$$x^{map}_\alpha = \operatorname*{argmax}_{x\in\mathbb{R}^n} \pi_{post}(x|y^\sigma),$$
or equivalently
$$x^{map}_\alpha = \operatorname*{argmin}_{x\in\mathbb{R}^n} \|\bar Ax - y^\sigma\|^2 + \underbrace{\alpha\sigma^2}_{\hat\alpha}\,\|x\|^p_{B^s_{p,p}(\mathbb{R}^d)}. \qquad (3)$$

This is the same functional as in the deterministic case, but we only know $E(\|\bar y - y^\sigma\|) = f(\sigma)$.
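In wavelet coordinates, (3) is a least-squares term plus a weighted $\ell^p$ penalty with weights $2^{\varsigma p|\lambda|}$ (cf. the Besov norm on slide 12). A sketch of the objective (function and argument names are ours):

```python
import numpy as np

def map_objective(A_bar, y_sigma, coeffs, levels, alpha, sigma, p, varsigma):
    """Tikhonov functional (3): ||A_bar x - y^sigma||^2 + alpha*sigma^2*||x||^p_Besov,
    with the Besov norm written as a weighted l^p sum over wavelet coefficients;
    levels[i] is the wavelet level |lambda| of the i-th coefficient."""
    residual = A_bar @ coeffs - y_sigma
    besov_p = np.sum(2.0 ** (varsigma * p * levels) * np.abs(coeffs) ** p)
    return residual @ residual + alpha * sigma ** 2 * besov_p
```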

SLIDE 21-22

Bayesian approach

The stochastic setting requires a different measure of convergence; we use the Ky Fan metric.

Definition

Let $x_1$ and $x_2$ be random variables in a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with values in a metric space $(\chi, d_\chi)$. The distance between $x_1$ and $x_2$ in the Ky Fan metric is defined as
$$\rho_K(x_1, x_2) := \inf\{\epsilon > 0 : \mathbb{P}(d_\chi(x_1(\omega), x_2(\omega)) > \epsilon) < \epsilon\}.$$

  • allows the combination of deterministic and stochastic quantities
  • it is a metric for convergence in probability
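The Ky Fan distance of two simulated random variables can be estimated directly from samples of their distance. A simple estimator (our own sketch) scans the sorted sample values for the smallest $\epsilon$ with empirical $P(d > \epsilon) < \epsilon$:

```python
import numpy as np

def ky_fan_empirical(distances):
    """Estimate rho_K = inf{eps > 0 : P(d > eps) < eps} from i.i.d. samples of d."""
    d = np.sort(np.asarray(distances))
    N = len(d)
    for k, eps in enumerate(d):
        if (N - k - 1) / N < eps:   # empirical tail P(d > d[k]), assuming distinct values
            return eps
    return d[-1]

# example: distances of a Gaussian vector (sigma = 0.01, m = 2500) to its mean
rng = np.random.default_rng(0)
samples = np.linalg.norm(rng.normal(0.0, 0.01, size=(10_000, 2500)), axis=1)
print(ky_fan_empirical(samples))
```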

SLIDE 23-24

Bayesian approach

Ky Fan error estimate

Theorem (Neubauer, Pikkarainen, 2008)

Let $y^\sigma$ be a random variable with values in $\mathbb{R}^m$. Assume that the distribution of $y^\sigma$ is $N(\bar y, \sigma^2 I)$ with $\sigma > 0$. Then it holds in $(\mathbb{R}^m, \|\cdot\|)$ that
$$\rho_K(y^\sigma, \bar y) \le \min\Big\{1,\; \sqrt2\,\sigma\,\sqrt{m - \ln^-\big(\sigma^2\,2\pi m^2\,e^{2/m}\big)}\Big\},$$
where $f^-(h) := \min\{0, f(h)\}$.

In practice the $\ln$-term is mostly inactive; then
$$\rho_K(y^\sigma, \bar y) \le \min\{1,\, \sqrt2\,\sigma\sqrt m\},$$
cf. $E(\|y^\sigma - \bar y\|^2) = \sigma^2 m$.
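A direct implementation of this bound, using our reconstruction $e^{2/m}$ of the exponent (which should be checked against the original paper); for $\sigma = 0.01$, $m = 2500$ the $\ln$-term is indeed inactive and the bound matches the simplified form:

```python
import numpy as np

def ky_fan_gaussian_bound(sigma, m):
    """min(1, sqrt(2)*sigma*sqrt(m - ln_minus(sigma^2 * 2*pi*m^2 * e^(2/m)))),
    where ln_minus(h) = min(0, ln(h))."""
    ln_minus = min(0.0, np.log(sigma ** 2 * 2.0 * np.pi * m ** 2) + 2.0 / m)
    return min(1.0, np.sqrt(2.0) * sigma * np.sqrt(m - ln_minus))

print(ky_fan_gaussian_bound(0.01, 2500))    # ln-term inactive here
print(np.sqrt(2.0) * 0.01 * np.sqrt(2500))  # simplified bound sqrt(2)*sigma*sqrt(m)
```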


SLIDE 26-27

Convergence theorem

Let $x^\dagger$ be the unique solution of the equation $\bar Ax = \bar y$ with minimum value of $\Phi(\cdot)$.

Theorem (adapted from Hofinger, '06)

Let $\alpha, \sigma > 0$ and let (3) have a unique minimizer $x^{map}_\alpha$. If $\alpha = \alpha(\sigma)$ is chosen such that $\hat\alpha = \alpha\sigma^2 \to 0$ and $\frac{|\ln\sigma|}{\alpha} \to 0$ as $\sigma \to 0$, then
$$\lim_{\sigma\to 0} \rho_K(x^{map}_\alpha, x^\dagger) = 0.$$

Uniqueness of the minimizer holds, e.g., for
  • $p > 1$,
  • $A$ injective,
  • $A$ injective on any finite linear subspace.
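For instance (our example, not from the talk), $\alpha(\sigma) = 1/\sigma$ is admissible: $\hat\alpha = \alpha\sigma^2 = \sigma \to 0$ and $|\ln\sigma|/\alpha = \sigma|\ln\sigma| \to 0$. A quick numerical check:

```python
import numpy as np

for sigma in [1e-1, 1e-2, 1e-4, 1e-8]:
    alpha = 1.0 / sigma  # candidate parameter choice rule
    print(sigma, alpha * sigma ** 2, abs(np.log(sigma)) / alpha)  # both columns -> 0
```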

SLIDE 28-29

Convergence theorem

Discussion

[Figure: minimal discretization level $m_{min}$ plotted against $\sigma$, for $\sigma$ ranging over $10^{-10},\dots,10^{-1}$.]

  • as long as $\sigma^2\,2\pi m^2\,e^{2/m} > 1$, $\alpha \to \infty$ is sufficient
  • $\frac{1}{\alpha}$ corresponds to the variance of the prior
  • main idea of the proof: use the Ky Fan metric and split $\Omega = \Omega_{det}(\sigma) \cup \Omega_{unbound}(\sigma)$

The condition $\alpha \to \infty$ looks strange from a Bayesian perspective. To explain the discrepancy, it has to be interpreted relative to $\sigma$.

SLIDE 30

Convergence theorem

almost sure convergence

  • convergence in probability implies convergence a.s. of subsequences
  • we can identify such subsequences

Theorem (D.G.)

Let $m, n$ be fixed and let $\{\sigma_k\}_{k=1}^\infty$ be such that
$$\sum_{k=1}^\infty \rho_K(\bar y, y^{\sigma_k}) = \sum_{k=1}^\infty \sqrt2\,\sigma_k\,\sqrt{m - \ln^-\big(\sigma_k^2\,2\pi m^2\,e^{2/m}\big)} < \infty$$
(for instance $\sigma_k = 2^{-k}$); then $x^{map}_{\alpha(\sigma_k)} \stackrel{a.s.}{\to} x^\dagger$.

Convergence a.s. allows no quantitative estimates.


SLIDE 32

Convergence rates

deterministic convergence rate, DDD '04

Assume $A$ fulfils, for all $h \in L^2$,
$$A_l^2 \sum_\lambda 2^{-2|\lambda|\beta}\,|\langle h, \psi_\lambda\rangle|^2 \;\le\; \|Ah\|^2 \;\le\; A_u^2 \sum_\lambda 2^{-2|\lambda|\beta}\,|\langle h, \psi_\lambda\rangle|^2 \qquad (4)$$
and $\|x^\dagger\|_{B^s_{p,p}(\mathbb{R}^d)} \le \varrho$, $\varrho > 0$. Then
$$\sup\big\{\|x^{map}_\alpha - x\| : x \in X,\, y \in Y,\, \|Ax - y\| \le \delta,\, \|x\|_{B^s_{p,p}(\mathbb{R}^d)} \le \varrho\big\} < C\,\Big(\frac{\delta + \delta'}{A_l}\Big)^{\frac{\varsigma}{\beta+\varsigma}}\,(\varrho + \varrho')^{\frac{\beta}{\beta+\varsigma}}$$
with $\delta' = (\delta^2 + \hat\alpha\varrho^p)^{1/2}$ and $\varrho' = \big(\varrho^p + \frac{\delta^2}{\hat\alpha}\big)^{1/p}$.

SLIDE 33-34

Convergence rates

using Ky Fan

In particular, if $\|\bar Ax^\dagger - y^\sigma\| \le \delta$ and $\|x^\dagger\|_{B^s_{p,p}(\mathbb{R}^d)} \le \varrho$, then $\|x^{map}_\alpha - x^\dagger\| < C(\delta + \delta')^\eta\,(\varrho + \varrho')^{\eta'}$ (with $\eta = \frac{\varsigma}{\beta+\varsigma}$, $\eta' = \frac{\beta}{\beta+\varsigma}$ from the previous slide), or
$$\mathbb{P}\big(\{\omega \in \Omega : \|x^{map}_{\hat\alpha}(\omega) - x^\dagger(\omega)\| > C(\delta + \delta')^\eta(\varrho + \varrho')^{\eta'}\}\big) \le \mathbb{P}\big(\{\omega : \|\bar Ax^\dagger(\omega) - y^\sigma(\omega)\| > \delta\}\big) + \mathbb{P}\big(\{\omega : \|T^*x^\dagger(\omega)\|_{B^s_{p,p}(\mathbb{R}^d)} \ge \varrho\}\big). \qquad (5)$$

Compare with the definition of the Ky Fan metric:
$$\rho_K(x^{map}_{\hat\alpha}, x^\dagger) := \inf\{\epsilon > 0 : \mathbb{P}(\|x^{map}_{\hat\alpha} - x^\dagger\| > \epsilon) < \epsilon\}$$
$\Rightarrow$ balance the terms in (5), using $\delta = \sqrt2\,\sigma\sqrt{m - \ln^-(\sigma^2\,2\pi m^2\,e^{2/m})}$.

SLIDE 35

Convergence rates

Convergence rates, simplified

Theorem (D.G.)

Let all previous assumptions hold. Then there exists an explicit parameter choice rule $\alpha = \alpha(\sigma, \varrho, \beta, \varsigma, p, m, n)$, depending also on the choice of model (MI) or (MII), such that $x^{map}_\alpha \to x^\dagger$ and
$$\rho_K(x^{map}_\alpha, x^\dagger) = O\big(f(\alpha, \sigma, \varrho, \beta, \varsigma, p, m, n)\big);$$
both $f$ and $\alpha$ are known explicitly.

SLIDE 36

Convergence rates

Theorem (D.G.)

Let $A$ fulfil (4) and assume that we have an a-priori estimate $\|x^\dagger\|_{B^s_{p,p}(\mathbb{R}^d)} \le \varrho$ for some $\varrho > 0$. Set $a_m := \ln\big(\frac{2^m}{2\pi m^2}\big)$. Then, as $\sigma \to 0$, $x^{map}_\alpha$ converges to the unique solution $x^\dagger$ with the parameter choice $\alpha = \alpha(\sigma, \varrho, \beta, \varsigma, p, m, n)$ fulfilling
$$f(\alpha) := \min\Bigg\{1,\; 2\bigg[\frac{\sqrt2}{A_l}\,\sigma\,\sqrt{a_m - 2\ln\sigma + \frac{\alpha\varrho^p}{2}}\bigg]^{\frac{\varsigma}{\beta+\varsigma}} \bigg[\varrho^p + \frac{2}{\alpha}\,(a_m - 2\ln\sigma)\bigg]^{\frac1p\,\frac{\beta}{\beta+\varsigma}}\Bigg\} - \frac{\Gamma\big(\frac m2, m\big)}{\Gamma\big(\frac m2\big)} - \mathbb{P}\big(\|x_\cdot\|_{B^s_{p,p}} > \varrho\big) = 0,$$
and
$$\rho_K(x^{map}_\alpha, x^\dagger) = O\Bigg(\Big[\sigma\sqrt{1 + |\ln\sigma| + \alpha\varrho^p}\Big]^{\frac{\varsigma}{\beta+\varsigma}} \Big[\varrho^p + \frac{1 + |\ln\sigma|}{\alpha}\Big]^{\frac1p\,\frac{\beta}{\beta+\varsigma}}\Bigg),$$
where $\mathbb{P}(\|x_n\|_{B^s_{p,p}} > \varrho) = \frac{\Gamma(\frac np, \frac{\alpha\varrho^p}{2})}{\Gamma(\frac np)}$ for model (MI), or $\mathbb{P}(\|x\|_{B^s_{p,p}} > \varrho) \le \frac{E\|x\|_{B^s_{p,p}}}{\varrho}$ for model (MII).
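A sketch of how one might solve $f(\alpha) = 0$ numerically for model (MI), assembled from our reconstruction of the theorem above; $A_l = 1$ and all test values are assumptions, so treat this as illustrative only, not the authors' code:

```python
import numpy as np
from scipy.special import gammaincc

def f(alpha, sigma=0.01, rho=2.16, beta=1.0, vs=0.5, p=1.0, m=2500, n=2500, A_l=1.0):
    a_m = m * np.log(2.0) - np.log(2.0 * np.pi * m ** 2)  # a_m = ln(2^m/(2*pi*m^2))
    t = a_m - 2.0 * np.log(sigma)
    rate = 2.0 * ((np.sqrt(2.0) / A_l) * sigma
                  * np.sqrt(t + alpha * rho ** p / 2.0)) ** (vs / (beta + vs)) \
               * (rho ** p + (2.0 / alpha) * t) ** (beta / (p * (beta + vs)))
    return (min(1.0, rate)
            - gammaincc(m / 2.0, m)                       # Gamma(m/2, m)/Gamma(m/2)
            - gammaincc(n / p, alpha * rho ** p / 2.0))   # prior tail, model (MI)

# locate sign changes of f on a logarithmic grid of alpha
alphas = np.logspace(-2, 6, 200)
vals = np.array([f(a) for a in alphas])
sign_change = np.where(np.sign(vals[:-1]) != np.sign(vals[1:]))[0]
print(alphas[sign_change])  # bracket(s) for a root alpha*, refinable e.g. by bisection
```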


SLIDE 38

Numerical examples

We consider a convolution problem
$$[Ax](s) = [k * x](s) = \int_{\mathbb{R}^d} k(s - t)\,x(t)\,dt, \quad s \in \mathbb{R}^d, \qquad (6)$$
using a kernel with Fourier transform
$$\hat k(\xi) = \frac{c_{\kappa,\beta}}{(1 + \kappa|\xi|^2)^{\beta/2}}, \quad \xi \in \mathbb{R}^d,$$
with $c_{\kappa,\beta}$ such that $\|k\|_{L^1(\mathbb{R}^d)} < 1$; thus (4) is fulfilled with the chosen $\beta$. We take $p = 1$, $d = 1$.
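A sketch of this operator on a periodic grid, with the kernel specified through its Fourier transform; the grid size, $\kappa$, and the normalization (which enforces $\|A\| < 1$ on the grid, standing in for the $L^1$ condition) are our choices:

```python
import numpy as np

n = 2500
kappa, beta = 1.0, 1.0
xi = np.fft.fftfreq(n, d=1.0 / n)                 # integer frequencies
k_hat = (1.0 + kappa * xi ** 2) ** (-beta / 2.0)  # k_hat(xi) = c/(1 + kappa*xi^2)^(beta/2)
k_hat *= 0.99 / np.abs(k_hat).max()               # scale so that ||A|| = max|k_hat| < 1

def A(x):
    """Convolution x -> k * x as multiplication in Fourier space."""
    return np.real(np.fft.ifft(k_hat * np.fft.fft(x)))

A_adjoint = A  # k_hat is real and even, so A is self-adjoint on this grid
```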

SLIDE 39

Numerical examples

Iteration [Daubechies, Defrise, De Mol 2004]:

With $x^0 = 0$,
$$x^{k+1} = S_{w,p}\big(x^k + A^*(y^\sigma - Ax^k)\big), \quad k = 0, 1, 2, \dots,$$
where $S_{w,p}(h) := \sum_{\lambda\in\Lambda} S_{w_\lambda,p}(\langle h, \psi_\lambda\rangle)\,\psi_\lambda$ is defined component-wise, for $p = 1$ via
$$S_{\omega,1}(\xi) := \begin{cases} \xi - \frac{\omega}{2} & \text{if } \xi \ge \frac{\omega}{2}, \\ 0 & \text{if } |\xi| < \frac{\omega}{2}, \\ \xi + \frac{\omega}{2} & \text{if } \xi \le -\frac{\omega}{2}. \end{cases}$$
The iteration converges since $\|A\| < 1$.
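A compact implementation of this thresholded Landweber iteration, written directly in the coefficient domain (we identify $x$ with its wavelet coefficients; in the talk the thresholding acts on $\langle h, \psi_\lambda\rangle$). The weight value and iteration count are our choices:

```python
import numpy as np

def soft_threshold(xi, omega):
    """Componentwise S_{omega,1}: shrink by omega/2, zero on (-omega/2, omega/2)."""
    return np.sign(xi) * np.maximum(np.abs(xi) - omega / 2.0, 0.0)

def ista(A, A_adjoint, y_sigma, w, n_iter=500):
    """x^{k+1} = S_{w,1}(x^k + A^*(y^sigma - A x^k)), starting from x^0 = 0;
    converges for ||A|| < 1 (Daubechies/Defrise/De Mol 2004)."""
    x = np.zeros_like(y_sigma)
    for _ in range(n_iter):
        x = soft_threshold(x + A_adjoint(y_sigma - A(x)), w)
    return x

# usage with the convolution sketch above, w = alpha_hat as on slide 41:
# x_rec = ista(A, A_adjoint, y_sigma, w=0.004585)
```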

SLIDE 40

Numerical examples

Parameter choice rule illustrated

$\sigma = 0.01$, $m = 2500$, $\varsigma = 0.5$, $\beta = 1$, $\varrho = 2.16$; model (MI) with $s = 1$, model (MII) with $s = 1$, $r = 2$.

[Figure: left- and right-hand sides of the parameter choice equation plotted against $\alpha$, for models (MI) and (MII).]

SLIDE 41

Numerical examples

example of a solution

Figure: signal $x$, measurements $y^\sigma = \bar Ax + \varepsilon$, and true vs. regularized solution on a grid of 2500 points. (MI), $\sigma = 0.01$, exact $\varrho$, $s = 1$, $\beta = 1$; $\alpha = 45.85$ according to our parameter choice rule $\Rightarrow \hat\alpha = \alpha\sigma^2 = 0.004585$.

SLIDE 42

Numerical examples

Comparison of (MI) and (MII), $m, n$ fixed, $\sigma \to 0$; all plots averaged over 20 individual simulations; model (MI) with $s = 1$, model (MII) with $s = 1$, $r = 2$.

Figure: $\alpha$ plotted against $\sigma$, $n = m = 2500$, $\beta = 1$, exact $\varrho$.

SLIDE 43

Numerical examples

Comparison of (MI) and (MII), $m, n$ fixed, $\sigma \to 0$ (continued).

Figure: $\alpha \cdot \sigma^2$ plotted against $\sigma$, $n = m = 2500$, $\beta = 1$, exact $\varrho$.

SLIDE 44

Numerical examples

Comparison of (MI) and (MII), $m, n$ fixed, $\sigma \to 0$ (continued).

Figure: number of recovered nonzero coefficients plotted against $\sigma$, $n = m = 2500$, $\beta = 1$, exact $\varrho$.

SLIDE 45

Numerical examples

Comparison of (MI) and (MII), $m, n$ fixed, $\sigma \to 0$ (continued).

Figure: predicted and observed convergence rates plotted against $\sigma$, $n = m = 2500$, $\beta = 1$, exact $\varrho$.

SLIDE 46

Numerical examples

Comparison of (MI) and (MII), $\sigma$ fixed, $m, n$ variable; model (MI) with $s = 1$, model (MII) with $s = 1$, $r = 2$.

Figure: $\alpha$ plotted against $n$ (up to $2.5 \times 10^5$), $\sigma = 0.01$, $\beta = 1$, exact $\varrho$.

SLIDE 47

Numerical examples

Comparison of (MI) and (MII), $\sigma$ fixed, $m, n$ variable (continued).

Figure: number of recovered nonzeros plotted against $n$, $\sigma = 0.01$, $\beta = 1$, exact $\varrho$.

SLIDE 48

Numerical examples

Comparison of (MI) and (MII), $\sigma$ fixed, $m, n$ variable (continued).

Figure: reconstruction error plotted against $n$, $\sigma = 0.01$, $\beta = 1$, exact $\varrho$.

SLIDE 49

Numerical examples

A 2D convolution example

$\sigma = 0.1$, $\beta = 1$, $\alpha = 130.5$, $\hat\alpha = 1.3$.

Figure: true solution — measurements — recovered solution.

Exactly the 68 original coefficients (out of 65536) were reconstructed.

SLIDE 50

References

  • D. Gerth, R. Ramlau, A stochastic convergence analysis for Tikhonov-regularization with sparsity constraints, submitted.
  • M. Lassas, E. Saksman, S. Siltanen, Discretization-invariant Bayesian inversion and Besov space priors, Inverse Probl. Imaging, 2009.
  • I. Daubechies, M. Defrise, C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Comm. Pure Appl. Math. 57, 2004.
  • A. Hofinger, H.K. Pikkarainen, Convergence rates for linear inverse problems in the presence of additive normal noise, Stoch. Anal. Appl. 27:2, 2009.
  • J. Kaipio, E. Somersalo, Statistical and Computational Inverse Problems, Springer-Verlag, New York, 2005.

Thank you for your attention! Are there questions?