

SLIDE 1

MULTISCALE BASIS OPTIMIZATION

FOR DARCY FLOW

James M. Rath, work with Todd Arbogast, in the Center for Subsurface Modeling / ICES at the University of Texas at Austin

SLIDE 2

Using Newton's method to solve linear systems, or: How I learned to stop worrying and love nonlinear problems

Joint work with Todd Arbogast @ CSM / ICES / UT-Austin

SLIDE 3

Problem: solving linear systems

Solving large, sparse linear systems often requires the use of iterative solvers. Storage and speed are the bogeymen.
SLIDE 4

That's the facts, Jack

Solving large, sparse linear systems often requires the use of iterative solvers. Storage and speed are the bogeymen.

However, generally speaking:

  • Solvers get only linear convergence rates toward the solution from the initial guess.
  • Large condition numbers mean solvers behave poorly: they require more and more iterations.

SLIDE 5

What we're after

Solving large, sparse linear systems often requires the use of iterative solvers. Storage and speed are the bogeymen.

However, generally speaking:

  • Solvers get only linear convergence rates toward the solution from the initial guess.
  • Large condition numbers mean solvers behave poorly: they require more and more iterations.

We want to do better on both these counts.

SLIDE 6

Dr. Obvious strikes again

Solving linear systems requires nonlinear operations, namely, division.

  10x = 17  ⇒  x = 17/10

SLIDE 7

Newton's method is da bomb

Solving linear systems requires nonlinear operations, namely, division.

  10x = 17  ⇒  x = 17/10

Nonlinear solvers (Newton's method and its ilk):

  • Get fast quadratic convergence to a solution, and
  • As applied to discretizations of nonlinear elliptic PDE, are insensitive to mesh size.

SLIDE 8

Hope beyond hope

Solving linear systems requires nonlinear operations, namely, division.

  10x = 17  ⇒  x = 17/10

Nonlinear solvers (Newton's method and its ilk):

  • Get fast quadratic convergence to a solution, and
  • As applied to discretizations of nonlinear elliptic PDE, are insensitive to mesh size.

We want to carry over these properties to solving linear systems.

SLIDE 9

Oopsie!

A naive application of Newton's method to solving a linear system results in a one-step procedure.

SLIDE 10

Darn!

A naive application of Newton's method to solving a linear system results in a one-step procedure.

  Solving:             Au = f
  Objective function:  F(u) = f − Au
  Jacobian:            F′(u) = −A
  Newton step:         uᵢ₊₁ = uᵢ − (−A)⁻¹(f − Auᵢ) = uᵢ + (u − uᵢ) = u
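The one-step collapse is easy to verify numerically. A minimal sketch, assuming NumPy; the 3 × 3 matrix is the example system used later in the talk:

```python
import numpy as np

# The 3x3 example system A u = f from the talk.
A = np.array([[10., -6., 4.],
              [-6., 17., 0.],
              [ 4.,  0., 9.]])
f = np.array([10., 5., -1.])

u = np.zeros(3)                    # any initial guess works
F = f - A @ u                      # objective function F(u) = f - Au
u = u - np.linalg.solve(-A, F)     # Newton step with Jacobian F'(u) = -A

# Because F is affine, the Newton model is exact: one step lands on the
# solution (up to round-off).
print(np.allclose(A @ u, f))       # -> True
```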

SLIDE 11

But even worse

To solve your linear system ...

  Solving:      Au = f
  Newton step:  uᵢ₊₁ = uᵢ − (−A)⁻¹(f − Auᵢ)

... you must solve your linear system.

SLIDE 12

But even worse

To solve your linear system ...

  Solving:      Au = f
  Newton step:  uᵢ₊₁ = uᵢ − (−A)⁻¹(f − Auᵢ)

... you must solve your linear system. And that's no fun!

Especially if it's a 10⁶ × 10⁶ sparse, ill-conditioned system you want to solve.

SLIDE 13

If at first you don't succeed ...

To solve your linear system ...

  Solving:      Au = f
  Newton step:  uᵢ₊₁ = uᵢ − (−A)⁻¹(f − Auᵢ)

... you must solve your linear system.

We have to try harder to find a nonlinear piece to attack, but it's not obvious where to begin or what will be successful.
SLIDE 14

Try, try again

Let's examine a 3 × 3 linear system just to keep things simple.

  A u = f

  ⎡ 10  −6   4 ⎤ ⎡ x ⎤   ⎡ 10 ⎤
  ⎢ −6  17   0 ⎥ ⎢ y ⎥ = ⎢  5 ⎥
  ⎣  4   0   9 ⎦ ⎣ z ⎦   ⎣ −1 ⎦

SLIDE 15

Hmmmm ...

Let's examine a 3 × 3 linear system just to keep things simple.

  A u = f

  ⎡ 10  −6   4 ⎤ ⎡ ρ cos θ sin φ ⎤   ⎡ 10 ⎤
  ⎢ −6  17   0 ⎥ ⎢ ρ sin θ sin φ ⎥ = ⎢  5 ⎥
  ⎣  4   0   9 ⎦ ⎣ ρ cos φ       ⎦   ⎣ −1 ⎦

But let's use polar coordinates to represent the unknown.

SLIDE 16

Rearranging ...

Let's examine a 3 × 3 linear system just to keep things simple.

  A Uσρ = f

  ⎡ 10  −6   4 ⎤ ⎡ cos θ sin φ ⎤     ⎡ 10 ⎤
  ⎢ −6  17   0 ⎥ ⎢ sin θ sin φ ⎥ ρ = ⎢  5 ⎥
  ⎣  4   0   9 ⎦ ⎣ cos φ       ⎦     ⎣ −1 ⎦

But let's use polar coordinates to represent the unknown. And separate direction (or shape) from magnitude.

SLIDE 17

Rearranging ...

Let's examine a 3 × 3 linear system just to keep things simple.

  A Uσρ = f

  ⎡ 10  −6   4 ⎤ ⎡ cos θ sin φ ⎤     ⎡ 10 ⎤
  ⎢ −6  17   0 ⎥ ⎢ sin θ sin φ ⎥ ρ = ⎢  5 ⎥
  ⎣  4   0   9 ⎦ ⎣ cos φ       ⎦     ⎣ −1 ⎦

But let's use polar coordinates to represent the unknown. And separate direction (or shape) from magnitude. σ = (θ, φ)

SLIDE 18

A nonlinear problem

Objective function: r(σ, ρ) = f − AUσρ

SLIDE 19

Split: some linear, some nonlinear

Objective function: r(σ) = f − AUσρ(σ)

Determine ρ as the "best" magnitude for a fixed σ: AUσρ = f

SLIDE 20

Split: some linear, some nonlinear

Objective function: r(σ) = f − AUσρ(σ)

Determine ρ as the "best" magnitude for a fixed σ:

  UσᵀAUσ ρ = Uσᵀf

Where:

  • "Best" = best in least-squares sense (in the energy or A-norm).
SLIDE 21

Split: some linear, some nonlinear

Objective function: r(σ) = f − AUσρ(σ)

Determine ρ as the "best" magnitude for a fixed σ:

  UσᵀAUσ ρ = Uσᵀf

Where:

  • "Best" = best in least-squares sense (in the energy or A-norm).
  • The system UσᵀAUσ is a smaller/coarser linear system to solve.
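In code, the coarse solve for ρ is a single scalar equation. A sketch assuming NumPy, with the talk's 3 × 3 system; the angles θ, φ are arbitrary illustrative values:

```python
import numpy as np

A = np.array([[10., -6., 4.],
              [-6., 17., 0.],
              [ 4.,  0., 9.]])
f = np.array([10., 5., -1.])

def U(theta, phi):
    # Unit "shape" vector from the polar angles.
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])

theta, phi = 0.4, 2.0              # some fixed shape
Us = U(theta, phi)

# Coarse problem: Us^T A Us rho = Us^T f  (here a 1x1 system).
rho = (Us @ f) / (Us @ A @ Us)

# Galerkin condition: the residual of the best magnitude is
# orthogonal to the shape.
r = f - A @ (Us * rho)
print(abs(Us @ r) < 1e-12)         # -> True
```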

SLIDE 22

Algorithm à la Newton

  1. Choose a shape σ. (Fix for now.)
SLIDE 23

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf

This is an "easy" coarsened problem.

SLIDE 24

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ

SLIDE 25

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate Jacobian r′(σ).
SLIDE 26

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate Jacobian r′(σ).

Oops, oh yeah ...

SLIDE 27

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate Jacobian r′(σ).
  5. Calculate Newton step:  δσ = −(r′)†r
SLIDE 28

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate Jacobian r′(σ).
  5. Calculate Newton step:  δσ = −(r′)†r
  6. Update shape σ:  σ ← σ + δσ

SLIDE 29

Algorithm à la Newton

  1. Choose a shape σ.
  2. Solve for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate Jacobian r′(σ).
  5. Calculate Newton step:  δσ = −(r′)†r
  6. Update shape σ:  σ ← σ + δσ
  7. Repeat as necessary.
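The seven steps above can be sketched end to end. This is only an illustration, not the talk's implementation: the Jacobian is approximated by finite differences (exactly the expense the later slides avoid), the step uses a pseudoinverse, and the initial shape is chosen close enough for plain Newton to converge:

```python
import numpy as np

A = np.array([[10., -6., 4.],
              [-6., 17., 0.],
              [ 4.,  0., 9.]])
f = np.array([10., 5., -1.])

def U(sigma):
    theta, phi = sigma
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])

def residual(sigma):
    Us = U(sigma)
    rho = (Us @ f) / (Us @ A @ Us)     # step 2: 1x1 coarse solve
    return f - A @ (Us * rho)          # step 3: objective/residual

sigma = np.array([0.4, 2.0])           # step 1: an initial shape
for _ in range(20):                    # step 7: repeat as necessary
    r = residual(sigma)
    if np.linalg.norm(r) < 1e-12:
        break
    # Step 4: Jacobian r'(sigma), here by forward differences.
    h = 1e-7
    J = np.column_stack([(residual(sigma + h * e) - r) / h
                         for e in np.eye(2)])
    # Steps 5-6: delta sigma = -(r')^dagger r, then update the shape.
    sigma = sigma - np.linalg.pinv(J) @ r

Us = U(sigma)
u = Us * ((Us @ f) / (Us @ A @ Us))    # recover u = U_sigma rho
print(np.allclose(A @ u, f, atol=1e-6))
```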
SLIDE 30

Bummer

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r
SLIDE 31

Bummer

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r
SLIDE 32

Calculus ... yuck!

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r

Jacobians require calculus, and who wants to do calculus?

SLIDE 33

Linear algebra is my bag, baby

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r

Jacobians require calculus, and who wants to do calculus? Blech! I wanna do linear algebra ...

SLIDE 34

Jacobians are expensive

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r

Jacobians require calculus, and who wants to do calculus? Blech! I wanna do linear algebra ... (Jacobians are expensive to compute, anyway.)

SLIDE 35

To be lazy, one must do work ...

  • Calculating Jacobian r′(σ)
  • Solving the linear system (r′)†r

Jacobians require calculus, and who wants to do calculus? Blech! I wanna do linear algebra ... We'll use calculus to avoid calculus. (And save the day!)

SLIDE 36

Chain rule to the rescue!

S'pose instead of computing the Newton step:  δσ = −(r′)†r
SLIDE 37

Chain rule to the rescue!

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ

SLIDE 38

Chain rule to the rescue!

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ = −r′(r′)†r

SLIDE 39

A-ha!

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ = −r′(r′)†r

The operation r′(r′)† is something familiar: the projection onto the range of r′!

SLIDE 40

A-ha!

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ = −r′(r′)†r

The operation r′(r′)† is something familiar: the projection onto the range of r′!

The range of r′ is the tangent space to the manifold of all possible residual vectors. This is an ellipsoid! The normal, it turns out, is easy to compute.
SLIDE 41

Et voilà!

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ = −r′(r′)†r

The operation r′(r′)† is something familiar: the projection onto the range of r′!

The range of r′ is the tangent space to the manifold of all possible residual vectors. This is an ellipsoid! The normal, it turns out, is easy to compute.

A projection is a linear algebra sorta thing. And it's a projection onto a low-dimensional space (1-D here). So it's "easy"!

SLIDE 42

How now brown cow?

S'pose instead of computing the Newton step:  δσ = −(r′)†r

We compute the effect that the Newton step would have on the residual:  δr = r′ δσ = −r′(r′)†r

The operation r′(r′)† is something familiar: the projection onto the range of r′!

The range of r′ is the tangent space to the manifold of all possible residual vectors. This is an ellipsoid! The normal, it turns out, is easy to compute.

A projection is a linear algebra sorta thing. And it's a projection onto a low-dimensional space (1-D here). So it's "easy"! So how do we use this?

SLIDE 43

Algorithm v. 2.0

  1. Choose a shape σ.
SLIDE 44

Algorithm v. 2.0

  1. Choose a shape σ.
  2. Solve coarse problem for ρ:  UσᵀAUσ ρ = Uσᵀf
SLIDE 45

Algorithm v. 2.0

  1. Choose a shape σ.
  2. Solve coarse problem for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ

SLIDE 46

Algorithm v. 2.0

  1. Choose a shape σ.
  2. Solve coarse problem for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate residual change (Newton step):  δr = −P_tan r

SLIDE 47

Algorithm v. 2.0

  1. Choose a shape σ.
  2. Solve coarse problem for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate residual change (Newton step):  δr = −P_tan r
  5. Impute new shape from updated residual r + δr.

(We leave out these details, but take it on faith that it's easy, too.)

SLIDE 48

Algorithm v. 2.0

  1. Choose a shape σ.
  2. Solve coarse problem for ρ:  UσᵀAUσ ρ = Uσᵀf
  3. Calculate objective/residual:  r(σ) = f − AUσρ
  4. Calculate residual change (Newton step):  δr = −P_tan r
  5. Impute new shape from updated residual r + δr.
  6. Repeat as necessary.
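Step 4 above can be checked numerically: r′(r′)† is an orthogonal projector, so the updated residual r + δr = (I − P_tan)r can only shrink. A sketch assuming NumPy; the Jacobian is formed by finite differences purely for illustration, and step 5's shape imputation is left out, as in the talk:

```python
import numpy as np

A = np.array([[10., -6., 4.],
              [-6., 17., 0.],
              [ 4.,  0., 9.]])
f = np.array([10., 5., -1.])

def U(sigma):
    theta, phi = sigma
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])

def residual(sigma):
    Us = U(sigma)
    rho = (Us @ f) / (Us @ A @ Us)
    return f - A @ (Us * rho)

sigma = np.array([0.4, 2.0])
r = residual(sigma)

# Tangent space of the residual manifold = range of r'(sigma).
h = 1e-7
J = np.column_stack([(residual(sigma + h * e) - r) / h
                     for e in np.eye(2)])

P_tan = J @ np.linalg.pinv(J)      # projection onto the range of r'
dr = -P_tan @ r                    # step 4: the residual change

# P_tan is idempotent, and the projected residual cannot grow.
print(np.allclose(P_tan @ P_tan, P_tan))
print(np.linalg.norm(r + dr) <= np.linalg.norm(r))
```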
SLIDE 49

You got chocolate in my peanut butter!

We went from a linear problem to a nonlinear one, but ...

SLIDE 50

What have we done?!

We went from a linear problem to a nonlinear one, but ... We have traded solving a large, ill-conditioned linear problem Au = f for

  • solving a much smaller, better conditioned linear problem UσᵀAUσ ρ = Uσᵀf, and
  • solving a small non-linear system (for the shape σ).
SLIDE 51

An application (the I&A in SIAM)

Steady single-phase flow through a porous medium can be described by:

  −∇ · a∇p = f

This PDE can be discretized in a number of ways. We leave the details of this and our coarsening procedure to another talk.
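For concreteness, here is one such discretization in one dimension: a cell-centered finite-difference scheme with harmonic averaging of a at cell faces. This is an assumption for illustration only; the talk's actual discretization and coarsening are left to another talk, and the coefficient field below is made up:

```python
import numpy as np

# Solve -(a p')' = f on (0,1), p(0) = p(1) = 0, cell-centered finite
# differences; a is sampled at cell centers and harmonically averaged at
# interior faces (a standard choice for heterogeneous coefficients).
n = 50
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
a = np.exp(2.0 * np.sin(8 * np.pi * x))   # an illustrative heterogeneous field
f = np.ones(n)

a_face = np.empty(n + 1)
a_face[1:-1] = 2.0 * a[:-1] * a[1:] / (a[:-1] + a[1:])
a_face[0], a_face[-1] = a[0], a[-1]       # crude Dirichlet-face treatment

A = np.zeros((n, n))
for i in range(n):
    A[i, i] = (a_face[i] + a_face[i + 1]) / h**2
    if i > 0:
        A[i, i - 1] = -a_face[i] / h**2
    if i < n - 1:
        A[i, i + 1] = -a_face[i + 1] / h**2

p = np.linalg.solve(A, f)   # the sparse, possibly ill-conditioned system
```

The condition number of A grows with both the resolution 1/h and the coefficient contrast amax/amin, which is exactly the regime the following slides study.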

SLIDE 52

Challenges

Steady single-phase flow can be described by:

  −∇ · a∇p = f

The coefficient a depends on the permeability. The permeability is often geostatistically generated at high resolution. It can be very heterogeneous. Together these conditions make for an ill-conditioned and computationally expensive problem.

SLIDE 53

What’s “heterogeneous”?

Seismics from a USGS survey in the northern Gulf of Mexico

SLIDE 54

What’s “heterogeneous”?

Simulated fields from the SPE CSP10

SLIDE 55

Newton's method is da bomb

To reiterate our desired features, Newton's method:

  • Gets fast quadratic convergence to a solution, and
  • As applied to discretizations of nonlinear elliptic PDE, is insensitive to mesh size.

SLIDE 56

Newton's method is da bomb

To reiterate our desired features, Newton's method:

  • Gets fast quadratic convergence to a solution, and
  • As applied to discretizations of linear elliptic PDE, is insensitive to mesh size.

And a new property:

  • Is insensitive to heterogeneity in coefficients (the a in −∇ · a∇p = f).

SLIDE 57

Some easy examples: quadratic convergence

We apply this method to solving the flow problem on the unit square with constant permeability. There are sources at the bottom-left and top-right of the domain (a quarter five-spot) and no gravity. A 2 × 2 coarse grid is used with a 6 × 6 subgrid (corresponding to a 12 × 12 fine grid).

[Plot: log₁₀ of the residual vs. iteration number, iterations 1 to 4.]

Note the axis scales: we get quadratic convergence — on a linear problem!

SLIDE 58

Some easy examples: resolution independence I

[Plot: # of Newton iterations vs. resolution.]

Solve the flow problem with a fixed coarse grid, but let the underlying grid get finer and finer. Only a constant number of Newton iterations is needed, regardless of resolution.
SLIDE 59

Some easy examples: resolution independence II

[Plot: # of Newton iterations vs. resolution.]

Solve the flow problem with the coarse grid spacing at a fixed ratio to the fine grid spacing. As the underlying grid gets finer and finer, only a constant number of Newton iterations is needed, regardless of resolution.

SLIDE 60

Kick it up a notch: a more heterogeneous permeability

The above graphic plots the variation from a statistically generated permeability field. Red areas indicate low permeability; blue areas indicate high permeability. The permeabilities span about five orders of magnitude (10⁵).

SLIDE 61

Resolution independence with heterogeneous coefficients

[Plots: maximum, median, and minimum # of Newton iterations vs. resolution (1/h), for a fixed coarse grid and for a fixed coarse/fine ratio.]

Let the grid get finer and finer (let the resolution increase). At each resolution, grab several statistical subsamples of the permeability field. Roughly a constant number of Newton iterations is needed.

SLIDE 62

Heterogeneity independence

[Plot: # of Newton iterations vs. log₁₀(amax/amin), for 1/h = 8, 12, 16, 20, 24, 32, 40, 48, 56.]

Take a subsample of the heterogeneous permeability field and rescale it so that amax/amin gets large.

SLIDE 63

A channel/barrier permeability field

In the above diagram, gray represents a permeability of 1 and red represents either a high or low permeability. When high, we have a channel; when low, we have a barrier.

SLIDE 64

Heterogeneity independence for channel/barrier flow

[Plot: # of Newton iterations vs. log₁₀(perm jump), for 1/h = 8, 12, 16, 20, 24.]

The horizontal axis shows the base-10 log of the permeability of the barrier/channel. Points on the left are for a barrier; points in the center are constant permeability everywhere; points on the right are for a channel.