Parallel preconditioners for problems arising from whole-microwave - - PowerPoint PPT Presentation

parallel preconditioners for problems arising from whole
SMART_READER_LITE
LIVE PREVIEW

Parallel preconditioners for problems arising from whole-microwave - - PowerPoint PPT Presentation

Parallel preconditioners for problems arising from whole-microwave system modeling for brain imaging Pierre-Henri Tournier 1 Pierre Jolivet 2 Marcella Bonazzoli 3 Victorita Dolean 3 , 4 eric Hecht 1 eric Nataf 1 Fr ed Fr ed 1 LJLL,


slide-1
SLIDE 1

Parallel preconditioners for problems arising from whole-microwave system modeling for brain imaging

Pierre-Henri Tournier 1 Pierre Jolivet 2 Marcella Bonazzoli 3 Victorita Dolean 3,4 Fr´ ed´ eric Hecht 1 Fr´ ed´ eric Nataf 1

1LJLL, Universit´

e Pierre et Marie Curie, INRIA ´ equipe ALPINES, Paris

2IRIT-CNRS, Toulouse 3LJAD, Universit´

e de Nice Sophia Antipolis, Nice

  • 4Dept. of Mathematics and Statistics, University of Strathclyde, Glasgow, UK

Numerical methods for wave propagation and applications

August 31, 2017

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 1/ 26

slide-2
SLIDE 2

Motivation

2 types of cerebro vascular accidents (strokes): ischemic (85 %) hemorrhagic (15 %) The correct treatment depends on the type of stroke: = ⇒ restore blood flow = ⇒ lower blood pressure

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 2/ 26

slide-3
SLIDE 3

Motivation

In order to differentiate between ischemic and hemorrhagic stroke, CT scan or MRI is typically used. Microwave tomography is a novel and promising imaging technique, especially for medical and brain imaging. CT scan MRI microwave tomography resolution excellent excellent good fast ✗ ✗ ✓ mobile ∼ ✗ ✓ cost ∼ 300 000 e ∼ 1 000 000 e < 100 000 e safe ✗ ✓ ✓ monitoring ✗ ✗ ✓ Diagnosing a stroke at the earliest possible stage is crucial for all following therapeutic decisions. Monitoring: Clinicians wish to have an image every fifteen minutes.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 3/ 26

slide-4
SLIDE 4

Motivation

EMTensor GmbH, Vienna, Austria. First-generation prototype: cylindrical chamber composed of 5 rings

  • f 32 antennas (ceramic-loaded waveguides).

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 4/ 26

slide-5
SLIDE 5

The direct problem

We consider in Ω a linear, isotropic, non-magnetic, dispersive, dissipative dielectric material. The direct problem consists in finding the electromagnetic field distribution in the whole chamber, given a known material and transmitted signal.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 5/ 26

slide-6
SLIDE 6

The direct problem

For each of the 5 × 32 antennas, the as- sociated electric field Ej is the solution of Maxwell’s equations: (1)

            

∇ × (∇ × Ej) − µ0

  • ω2ε + iωσ
  • Ej = 0

in Ω, Ej × n = 0

  • n Γmetal,

(∇ × Ej) × n + iβ(Ej × n) × n = gj

  • n Γj,

(∇ × Ej) × n + iβ(Ej × n) × n = 0

  • n Γi , i = j,

where µ0 is the permeability of free space, ω is the incident angular frequency , β is the wavenumber of the waveguide, ε > 0 is the dielectric permittivity, σ > 0 is the conductivity and gj corresponds to the excitation.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 6/ 26

slide-7
SLIDE 7

The direct problem

Spatial discretization using N´ ed´ elec edge finite elements yields a large sparse linear system Au = fj for each transmitting antenna j. We need a robust and efficient solver for second order time-harmonic Maxwell’s equations with heterogeneous coefficients. = ⇒ Use domain decomposition methods to produce parallel preconditioners for the GMRES algorithm.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 7/ 26

slide-8
SLIDE 8

Overlapping domain decomposition methods

Consider the linear system: Au = f ∈ Cn.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 8/ 26

slide-9
SLIDE 9

Overlapping domain decomposition methods

Ω2 Ω1

Consider the linear system: Au = f ∈ Cn. Given a decomposition of 1; n, (N1, N2), define:

◮ the restriction operator Ri from 1; n into Ni, ◮ RT i

as the extension by 0 from Ni into 1; n. Then solve concurrently: um+1

1

= um

1 + A−1 11 R1(f − Aum)

um+1

2

= um

2 + A−1 22 R2(f − Aum)

where ui = Riu and Aij := RiART

j .

[Schwarz 1870]

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 8/ 26

slide-10
SLIDE 10

Overlapping domain decomposition methods

Duplicated unknowns coupled via a partition of unity: I =

N

  • i=1

RT

i DiRi.

To solve Au = f Schwarz methods can be viewed as preconditioners for a fixed point algorithm: un+1 = un + M−1(f − Aun).

◮ M−1 RAS := N

  • i=1

RT

i DiA−1 i

Ri with Ai = RiART

i ◮ M−1 ORAS := N

  • i=1

RT

i DiB−1 i

Ri Optimized transmission conditions [B. Despr´ es 1991] for Helmholtz

1 2

1

1 2

1

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 9/ 26

slide-11
SLIDE 11

HPDDM

HPDDM is an efficient parallel implementation of domain decomposition methods by Pierre Jolivet and Fr´ ed´ eric Nataf

◮ header-only library written in C++11 with MPI and OpenMP ◮ interfaced with the open source finite element software

FreeFem++ (Fr´ ed´ eric Hecht)

512 1,024 2,048 4,096 500 200 50 10

(54) (61) (73) (94)

# of subdomains Time to solution (in seconds) Setup Solve Ideal

Strong scalability test for Maxwell 3D with edge ele- ments of degree 2 - 119M d.o.f. - Curie (TGCC, CEA)

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 10/ 26

slide-12
SLIDE 12

Comparison with experiments

The experimental measurements obtained from the antennas are the reflection and transmission coefficients. Their numerical counterparts are computed as Sij =

  • Γi Ej · E0

i dγ

  • Γi |E0

i |2dγ , for i, j = 1, ..., 160,

where E0

i is the fundamental mode of the waveguide i.

  • 10

10 20 30 40 50 60 70

  • 200
  • 150
  • 100
  • 50

50 100 150 200 magnitude (dB) angle (degree) ring 3 - empty simulation experiment

  • 200
  • 150
  • 100
  • 50

50 100 150 200

  • 200
  • 150
  • 100
  • 50

50 100 150 200 phase (degree) angle (degree) ring 3 - empty simulation experiment

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 11/ 26

slide-13
SLIDE 13

The inverse problem

The inverse problem consists in recovering ε and σ such that for each transmitting antenna j, the solution Ej to the associated Maxwell’s problem matches the measurements:

  • Γi Ej · E0

i dγ

  • Γi |E0

i |2dγ = Smes ij

for each receiving antenna i. Difficulties:

◮ inverse problems are ill-posed ◮ noise in the experimental data ◮ solving the inverse problem means solving the direct problem

multiple times = ⇒ time-consuming

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 12/ 26

slide-14
SLIDE 14

The inverse problem

Let κ := µ0(ω2ε + iωσ) be the unknown parameter of our inverse

  • problem. Solving the inverse problem corresponds to minimizing the

following cost functional: J(κ) =1 2

160

  • j=1

160

  • i=1
  • Sij(κ) − Smes

ij

  • 2

=1 2

160

  • j=1

160

  • i=1
  • Γi Ej(κ) · E0

i dγ

  • Γi |E0

i |2dγ

− Smes

ij

  • 2

. Sij(κ) depends on the solution Ej(κ) to

          

∇ × (∇ × Ej) − κEj = 0 in Ω, Ej × n = 0

  • n Γmetal,

(∇ × Ej) × n + iβ(Ej × n) × n = gj

  • n Γj,

(∇ × Ej) × n + iβ(Ej × n) × n = 0

  • n Γi , i = j.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 13/ 26

slide-15
SLIDE 15

The inverse problem

For j = 1, ..., 160, we introduce the adjoint problem

            

∇ × (∇ × Fj) − κFj = 0 in Ω, Fj × n = 0

  • n Γmetal,

(∇ × Fj) × n + iβ(Fj × n) × n = (Sij(κ) − Smes

ij

)

  • Γi |E0

i |2dγ

E0

i

  • n Γi ,

i = 1, ..., 160. We have DJ(κ, δκ) =

160

  • j=1

δκ Ej · Fjdx

  • .

We can then compute the gradient to use in a gradient-based

  • ptimization algorithm.

Here we use a limited-memory BFGS algorithm.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 14/ 26

slide-16
SLIDE 16

Numerical experiment - hemorrhagic stroke

◮ Brain model from X-ray and MRI data (362 × 434 × 362) ◮ simulated hemorrhagic stroke of ellipsoidal shape ◮ f = 1GHz ◮ waveguides (ceramic) : ǫr = 59 ◮ matching liquid : ǫr = 44 + 20i ◮ 10% multiplicative white Gaussian noise on synthetic data

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 15/ 26

slide-17
SLIDE 17

Numerical experiment - hemorrhagic stroke

Idea: reconstruct the permittivity slice by slice, by taking into account the transmitting antennas corresponding to only one ring and truncating the computational domain.

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 16/ 26

slide-18
SLIDE 18

Numerical experiment - hemorrhagic stroke

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 17/ 26

slide-19
SLIDE 19

Numerical experiment - hemorrhagic stroke

Idea: reconstruct the permittivity slice by slice, by taking into account the transmitting antennas corresponding to only one ring and truncating the computational domain. = ⇒ Reconstructed images corresponding to one ring obtained in less than 2 minutes.

64 128 256 512 1 024 2 048 4 096 0.5 1 2 4 8 16 # of MPI processes Time in minutes Linear speedup

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 18/ 26

slide-20
SLIDE 20

Numerical experiment - hemorrhagic stroke

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 19/ 26

slide-21
SLIDE 21

Block iterative methods and recycling

Subspace recycling

Keep information between restarts or when solving sequences of linear systems: Aixi = bi ∀i = 1, 2, . . .

Block methods

Treat multiple right-hand sides simultaneously for faster convergence: AX = B B ∈ Cn×p

More about HPDDM

◮ open-source, https://github.com/hpddm/hpddm ◮ usable in C++, C, Python, or Fortran ◮ implementation of (pseudo-)Block GMRES/GCRO-DR ◮ support for left/right/variable preconditioning ◮ also has (pseudo-)Block CG and Breakdown-Free BCG

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 20/ 26

slide-22
SLIDE 22

GCRO-DR

Generalized Conjugate Residual method with inner Orthogonalization and Deflated Restarting

Proposed by [Parks et al. 2007] Closely related to GMRES-DR by [Morgan 2002]

Main idea

  • 1. end of GMRES cycle: compute Ritz eigenpairs
  • 2. next restart: use 1. to generate k vectors for Arnoldi basis
  • 3. perform extra orthogonalizations with k vectors

Overhead: ◮ persistent storage between cycles/solves

◮ one additional synchronization per cycle ◮ small dense (generalized) eigenvalue problem

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 21/ 26

slide-23
SLIDE 23

Why use block methods?

Numerical aspects

enlarged Krylov subspace = ⇒ faster convergence

Performance

◮ higher arithmetic intensity ◮ fewer synchronizations with more data

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 22/ 26

slide-24
SLIDE 24

Implementation details

Block Arnoldi

◮ block orthogonalization ◮ tall-and-skinny QR (default to CholQR V HV = LLH)

About CholQR: ◮ V = QR with R = LH, Q = L−HV

◮ BLAS 3 ◮ one reduction ◮ can rank-reveal

Pseudo-block methods

p subspaces, computation and communication steps fused

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 23/ 26

slide-25
SLIDE 25

Block methods for Medimax

Dissipative nature of brain tissues = ⇒ the one-level preconditioner with classical (pseudo-block) GMRES already works quite well. = ⇒ Try block methods for a harder test case:

◮ non-dissipative plastic cylinder (diameter 12

cm) immersed in the imaging chamber and surrounded by matching liquid.

◮ fine discretization with degree 2 edge

elements: 89 million unknowns.

◮ we solve the direct problem for 32

transmitting antennas (second ring) = ⇒ 32 RHSs

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 24/ 26

slide-26
SLIDE 26

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-27
SLIDE 27

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-28
SLIDE 28

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-29
SLIDE 29

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-30
SLIDE 30

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-31
SLIDE 31

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain ◮ alternative #1 to #8 =

⇒ 158× fewer iterations

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-32
SLIDE 32

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain ◮ alternative #1 to #8 =

⇒ 158× fewer iterations

◮ GCRO-DR always performs fewer iterations than GMRES

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-33
SLIDE 33

Block methods for Medimax

alternative p solve # of it. per RHS eff.

GMRES

1 3,078.4 20,068 627 −

GCRO-DR

1 1,836.9 10,701 334 1.7

pseudo-BGMRES

32 1,577.9 653 − 2.0

BGMRES

32 724.8 158 − 4.2

pseudo-BGCRO-DR

8 1,357.8 1,508 377 2.3

pseudo-BGCRO-DR

32 1,376.1 469 − 2.2

BGCRO-DR

8 677.6 524 131 4.5

BGCRO-DR

32 992.3 127 − 3.1

◮ (m, k) = (50, 10) for solving 32 RHSs ◮ 2,048 subdomains and 2 threads per subdomain ◮ alternative #1 to #8 =

⇒ 158× fewer iterations

◮ GCRO-DR always performs fewer iterations than GMRES ◮ working on all 32 RHSs is costly (#5/#7 vs. #6/#8)

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 25/ 26

slide-34
SLIDE 34

Conclusion and perspectives

Conclusion: This work shows the feasibility of a microwave imaging technique for stroke diagnosis and monitoring, using parallel computing. Current work and perspectives:

◮ experiment with subspace recycling techniques between iterations

during the optimization process when solving the inverse problem

◮ choose a good coarse space for a two-level scalable preconditioner

for Maxwell’s equations (joint work with Ivan Graham and Euan Spence)

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 26/ 26

slide-35
SLIDE 35

Conclusion and perspectives

Conclusion: This work shows the feasibility of a microwave imaging technique for stroke diagnosis and monitoring, using parallel computing. Current work and perspectives:

◮ experiment with subspace recycling techniques between iterations

during the optimization process when solving the inverse problem

◮ choose a good coarse space for a two-level scalable preconditioner

for Maxwell’s equations (joint work with Ivan Graham and Euan Spence)

Thank you for your attention !

Pierre-Henri Tournier Parallel preconditioners for problems arising from microwave system modeling for brain imaging 26/ 26