

SLIDE 1

Null Space Gradient Flows for Constrained Optimization with Applications to Shape Optimization

Florian Feppon, Grégoire Allaire, Charles Dapogny, Julien Cortial, Felipe Bordeu
SIAM CSE – Spokane – February 26, 2019

SLIDE 2

Shape optimization

Multiphysics, non-parametric shape and topology optimization, in 2D...
SLIDE 3

Shape optimization

And in 3D. . .

SLIDE 4

Shape optimization

And in 3D... Nonlinear, non-convex, infinite-dimensional optimization problems featuring multiple and arbitrary constraints!

SLIDE 5

Outline

  • 1. Gradient flows for equality and inequality constrained optimization
  • 2. Demonstration on topology optimization test cases
SLIDE 6

1. Constrained optimization

min_{x ∈ X} J(x)   s.t.   g(x) = 0,   h(x) ≤ 0,

with J : X → R, g : X → R^p and h : X → R^q Fréchet differentiable. The set X can be
  ◮ a finite dimensional vector space, X = R^n

SLIDE 7

1. Constrained optimization

min_{x ∈ X} J(x)   s.t.   g(x) = 0,   h(x) ≤ 0,

with J : X → R, g : X → R^p and h : X → R^q Fréchet differentiable. The set X can be
  ◮ a finite dimensional vector space, X = R^n
  ◮ a Hilbert space equipped with a scalar product a(·, ·), X = V

SLIDE 8

1. Constrained optimization

min_{x ∈ X} J(x)   s.t.   g(x) = 0,   h(x) ≤ 0,

with J : X → R, g : X → R^p and h : X → R^q Fréchet differentiable. The set X can be
  ◮ a finite dimensional vector space, X = R^n
  ◮ a Hilbert space equipped with a scalar product a(·, ·), X = V
  ◮ a “manifold”, as in topology optimization: X = {Ω ⊂ D | Ω Lipschitz}
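To fix ideas in the finite dimensional case X = R^n, here is a minimal sketch (in Python, with hypothetical problem data, not taken from the talk) of the objects the algorithm manipulates: the objective, the constraints, and their derivatives.

```python
import numpy as np

# Hypothetical problem data for X = R^n; any Frechet differentiable
# J, g, h with the signatures below would do.
def J(x):  return float(x @ x)                  # objective J : R^n -> R
def DJ(x): return 2.0 * x                       # gradient of J (Euclidean scalar product)

def g(x):  return np.array([x.sum() - 1.0])     # equality constraints g : R^n -> R^p
def Dg(x): return np.ones((1, x.size))          # p-by-n Jacobian of g

def h(x):  return -x                            # inequality constraints h : R^n -> R^q
def Dh(x): return -np.eye(x.size)               # q-by-n Jacobian of h
```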

SLIDE 9

1. A generic optimization algorithm

From the current guess xn, how do we select the descent direction ξn, given the objective J and the constraints g, h?

SLIDE 10

1. A generic optimization algorithm

From the current guess xn, how do we select the descent direction ξn, given the objective J and the constraints g, h?
  ◮ Many iterative methods in the literature:
    ◮ Penalty methods (such as the Augmented Lagrangian Method)
    ◮ Linearization methods: SLP, SQP, MMA, MFD

SLIDE 11

1. A generic optimization algorithm

From the current guess xn, how do we select the descent direction ξn, given the objective J and the constraints g, h?
  ◮ Many iterative methods in the literature:
    ◮ Penalty methods (such as the Augmented Lagrangian Method)
    ◮ Linearization methods: SLP, SQP, MMA, MFD

These methods suffer from:
  ◮ the need for tuning unintuitive parameters.

SLIDE 12

1. A generic optimization algorithm

From the current guess xn, how do we select the descent direction ξn, given the objective J and the constraints g, h?
  ◮ Many iterative methods in the literature:
    ◮ Penalty methods (such as the Augmented Lagrangian Method)
    ◮ Linearization methods: SLP, SQP, MMA, MFD

These methods suffer from:
  ◮ the need for tuning unintuitive parameters.
  ◮ “inconsistencies” when ∆t → 0: the SLP, SQP and MFD subproblems may have no solution if ∆t is too small, and the Augmented Lagrangian method does not guarantee a decrease of the objective when the constraints are satisfied.

SLIDE 13

1. A generic optimization algorithm

Dynamical systems approaches:

SLIDE 14

1. A generic optimization algorithm

Dynamical systems approaches:
  ◮ For unconstrained optimization, the celebrated gradient flow ẋ = −∇J(x): J(x(t)) necessarily decreases!

SLIDE 15

1. A generic optimization algorithm

Dynamical systems approaches:
  ◮ For unconstrained optimization, the celebrated gradient flow ẋ = −∇J(x): J(x(t)) necessarily decreases!
  ◮ For equality constrained optimization, the projected gradient flow (Tanabe (1980)):
      ẋ = −(I − Dg^T (Dg Dg^T)^{−1} Dg) ∇J(x)
    (a gradient flow on X = {x ∈ V | g(x) = 0})

SLIDE 16

1. A generic optimization algorithm

Dynamical systems approaches:
  ◮ For unconstrained optimization, the celebrated gradient flow ẋ = −∇J(x): J(x(t)) necessarily decreases!
  ◮ For equality constrained optimization, the projected gradient flow (Tanabe (1980)):
      ẋ = −(I − Dg^T (Dg Dg^T)^{−1} Dg) ∇J(x)
    (a gradient flow on X = {x ∈ V | g(x) = 0})
    Then Yamashita (1980) added a Gauss–Newton direction:
      ẋ = −αJ (I − Dg^T (Dg Dg^T)^{−1} Dg) ∇J(x) − αC Dg^T (Dg Dg^T)^{−1} g(x)

SLIDE 17

1. A generic optimization algorithm

Dynamical systems approaches:
  ◮ For unconstrained optimization, the celebrated gradient flow ẋ = −∇J(x): J(x(t)) necessarily decreases!
  ◮ For equality constrained optimization, the projected gradient flow (Tanabe (1980)):
      ẋ = −(I − Dg^T (Dg Dg^T)^{−1} Dg) ∇J(x)
    (a gradient flow on X = {x ∈ V | g(x) = 0})
    Then Yamashita (1980) added a Gauss–Newton direction:
      ẋ = −αJ (I − Dg^T (Dg Dg^T)^{−1} Dg) ∇J(x) − αC Dg^T (Dg Dg^T)^{−1} g(x)
    which yields g(x(t)) = g(x(0)) e^{−αC t}, while J(x(t)) is guaranteed to decrease once g(x(t)) = 0.
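An explicit Euler discretization of Yamashita's flow fits in a few lines. The sketch below is an illustration under simplifying assumptions (X = R^n with the Euclidean scalar product, Dg(x) of full rank, problem data as in the earlier snippet), not the authors' implementation:

```python
import numpy as np

def yamashita_step(x, DJ, g, Dg, alphaJ=1.0, alphaC=1.0, dt=0.01):
    """One explicit Euler step of Yamashita's equality-constrained flow."""
    G = Dg(x)                                             # p-by-n Jacobian, assumed full rank
    gram = G @ G.T                                        # Dg Dg^T
    xiJ = DJ(x) - G.T @ np.linalg.solve(gram, G @ DJ(x))  # null space projection of grad J
    xiC = G.T @ np.linalg.solve(gram, g(x))               # Gauss-Newton range space direction
    return x - dt * (alphaJ * xiJ + alphaC * xiC)
```

Each step combines a move in the null space of Dg (decreasing J without perturbing the constraints to first order) with a Gauss–Newton move in its range (restoring g = 0 at rate αC).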

SLIDE 18

1. A generic optimization algorithm

For both equality constraints g(x) = 0 and inequality constraints h(x) ≤ 0, consider Ĩ(x) the set of violated (or saturated) constraints, stacked together with the equality constraints:

  Ĩ(x) = {i ∈ {1, . . . , q} | hi(x) ≥ 0},
  C_Ĩ(x) = ( g(x) | (hi(x))_{i∈Ĩ(x)} )^T.

SLIDE 19

1. A generic optimization algorithm

For both equality constraints g(x) = 0 and inequality constraints h(x) ≤ 0, consider Ĩ(x) the set of violated (or saturated) constraints, stacked together with the equality constraints:

  Ĩ(x) = {i ∈ {1, . . . , q} | hi(x) ≥ 0},
  C_Ĩ(x) = ( g(x) | (hi(x))_{i∈Ĩ(x)} )^T.

We propose
  ẋ = −αJ ξJ(x(t)) − αC ξC(x(t)),
with
  −ξJ(x) := the “best” descent direction with respect to the constraints Ĩ(x),
  −ξC(x) := the Gauss–Newton direction −DC_Ĩ^T (DC_Ĩ DC_Ĩ^T)^{−1} C_Ĩ(x), where Ĩ = Ĩ(x).

SLIDE 20

1. A generic optimization algorithm

min_{(x1,x2) ∈ R^2} J(x1, x2) = x1^2 + (x2 + 3)^2
s.t.  h1(x1, x2) = −x1^2 + x2 ≤ 0,
      h2(x1, x2) = −x1 − x2 − 2 ≤ 0.
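This toy problem translates directly into code; a minimal sketch with the Euclidean scalar product on R^2 (reused below to test the flow):

```python
import numpy as np

# Toy problem of this slide: min x1^2 + (x2+3)^2 s.t. h1 <= 0, h2 <= 0.
def J(x):  return float(x[0]**2 + (x[1] + 3.0)**2)
def DJ(x): return np.array([2.0 * x[0], 2.0 * (x[1] + 3.0)])

def h(x):  return np.array([-x[0]**2 + x[1],        # h1(x1,x2) = -x1^2 + x2
                            -x[0] - x[1] - 2.0])    # h2(x1,x2) = -x1 - x2 - 2
def Dh(x): return np.array([[-2.0 * x[0], 1.0],
                            [-1.0, -1.0]])          # 2-by-2 Jacobian of h
```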

SLIDE 21

1. A generic optimization algorithm

We propose:
  ẋ = −αJ ξJ(x(t)) − αC ξC(x(t)),
with
  ξJ(x) := (I − DC_Î^T (DC_Î DC_Î^T)^{−1} DC_Î) (∇J(x)), where Î = Î(x),
  ξC(x) := DC_Ĩ^T (DC_Ĩ DC_Ĩ^T)^{−1} C_Ĩ(x), where Ĩ = Ĩ(x).

SLIDE 22

1. A generic optimization algorithm

We propose:
  ẋ = −αJ ξJ(x(t)) − αC ξC(x(t)),
with
  ξJ(x) := (I − DC_Î^T (DC_Î DC_Î^T)^{−1} DC_Î) (∇J(x)), where Î = Î(x),
  ξC(x) := DC_Ĩ^T (DC_Ĩ DC_Ĩ^T)^{−1} C_Ĩ(x), where Ĩ = Ĩ(x).

Î(x) ⊂ Ĩ(x) is a subset of the active or violated constraints which can be computed by means of a dual subproblem.
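For illustration, here is one discretized step of the proposed flow for purely inequality-constrained problems, with the simplification Î(x) = Ĩ(x), i.e. skipping the dual subproblem described on the next slides (this may over-restrict ξJ in degenerate cases, but keeps the structure visible):

```python
import numpy as np

def nullspace_step(x, DJ, h, Dh, alphaJ=1.0, alphaC=1.0, dt=0.01):
    hx = h(x)
    I = hx >= 0.0                                 # violated or saturated set, I~(x)
    if not I.any():                               # no active constraint: plain gradient flow
        return x - dt * alphaJ * DJ(x)
    C, DC = hx[I], Dh(x)[I]                       # C_I(x) and its Jacobian DC_I(x)
    gram = DC @ DC.T
    xiJ = DJ(x) - DC.T @ np.linalg.solve(gram, DC @ DJ(x))  # project grad J on ker DC_I
    xiC = DC.T @ np.linalg.solve(gram, C)                   # Gauss-Newton correction
    return x - dt * (alphaJ * xiJ + alphaC * xiC)
```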

SLIDE 23

1. A generic optimization algorithm

The best descent direction −ξJ(x) must be proportional to

  ξ* = arg min_{ξ ∈ V} DJ(x)ξ   s.t.   Dg(x)ξ = 0,   Dh_Ĩ(x)(x)ξ ≤ 0,   ||ξ||_V ≤ 1,

where h_Ĩ(x)(x) = (hi(x))_{i∈Ĩ(x)}.

SLIDE 24

1. A generic optimization algorithm

Proposition
Let (λ*(x), µ*(x)) ∈ R^p × R^{Card(Ĩ(x))} be the solution of the following dual minimization problem:

  (λ*(x), µ*(x)) := arg min_{λ ∈ R^p, µ ∈ R^{Card(Ĩ(x))}, µ ≥ 0} ||∇J(x) + Dg(x)^T λ + Dh_Ĩ(x)(x)^T µ||_V.

Then, unless x is a KKT point, the best descent direction ξ*(x) is given by

  ξ*(x) = − (∇J(x) + Dg(x)^T λ*(x) + Dh_Ĩ(x)(x)^T µ*(x)) / ||∇J(x) + Dg(x)^T λ*(x) + Dh_Ĩ(x)(x)^T µ*(x)||_V.
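When V = R^n with the Euclidean scalar product, this dual problem is a bound-constrained linear least squares problem in (λ, µ) and can be solved off the shelf; a sketch (function name hypothetical):

```python
import numpy as np
from scipy.optimize import lsq_linear

def dual_multipliers(gradJ, Dg, Dh_I):
    """Solve min || gradJ + Dg^T lam + Dh_I^T mu ||_2 over lam free, mu >= 0."""
    p, q = Dg.shape[0], Dh_I.shape[0]
    A = np.hstack([Dg.T, Dh_I.T])              # unknowns z = (lam, mu); residual A z + gradJ
    lower = np.concatenate([np.full(p, -np.inf), np.zeros(q)])
    res = lsq_linear(A, -gradJ, bounds=(lower, np.full(p + q, np.inf)))
    return res.x[:p], res.x[p:]                # (lam*, mu*); then I^(x) = {i : mu*_i > 0}
```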

SLIDE 25

1. A generic optimization algorithm

Proposition
Let Î(x) be the set obtained by collecting the nonzero components of µ*(x):

  Î(x) := {i ∈ Ĩ(x) | µ*_i(x) > 0}.

Then ξ*(x) is explicitly given by:

  ξ*(x) = − Π_Î(x)(∇J(x)) / ||Π_Î(x)(∇J(x))||_V,

with Π_Î(x)(∇J(x)) = (I − DC_Î^T (DC_Î DC_Î^T)^{−1} DC_Î) (∇J(x)), where Î = Î(x).

SLIDE 26

1. A generic optimization algorithm

We can prove:
  • 1. Constraints are asymptotically satisfied:
       g(x(t)) = e^{−αC t} g(x(0))  and  h_Ĩ(x(t)) ≤ e^{−αC t} h(x(0)).
  • 2. J decreases as soon as the violation C_Ĩ(x(t)) is sufficiently small.
  • 3. All stationary points x* of the ODE are KKT points.
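These properties can be illustrated numerically on the toy problem of slide 20, reusing the J, DJ, h, Dh and nullspace_step sketches above (with the caveat that the simplified step skips the dual selection of Î, which the full method uses precisely to guarantee point 3):

```python
import numpy as np

x = np.array([2.0, 3.0])                    # feasible start; h2 activates along the way
for n in range(4000):
    x = nullspace_step(x, DJ, h, Dh, alphaJ=1.0, alphaC=1.0, dt=0.005)
print(x, h(x))  # expect x near the KKT point (0.5, -2.5), with h2 active and h <= 0
```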
SLIDE 27

2. Applications to shape optimization

What is truly required from the user:

  • 1. Specification of the objective and constraints J, g, h

SLIDE 28

2. Applications to shape optimization

What is truly required from the user:

  • 1. Specification of the objective and constraints J, g, h
  • 2. Fréchet derivatives DJ(x), Dg(x), Dh(x) given as linear operators
SLIDE 29

2. Applications to shape optimization

What is truly required from the user:

  • 1. Specification of the objective and constraints J, g, h
  • 2. Fréchet derivatives DJ(x), Dg(x), Dh(x) given as linear operators
  • 3. A scalar product a for identifying these derivatives
SLIDE 30

2. Applications to shape optimization

What is truly required from the user:

  • 1. Specification of the objective and constraints J, g, h
  • 2. Fréchet derivatives DJ(x), Dg(x), Dh(x) given as linear operators
  • 3. A scalar product a for identifying these derivatives
  • 4. A typical length scale ∆t (e.g. the mesh size)
SLIDE 31

2. Applications to shape optimization

What is truly required from the user:

  • 1. Specification of the objective and constraints J, g, h
  • 2. Fréchet derivatives DJ(x), Dg(x), Dh(x) given as linear operators
  • 3. A scalar product a for identifying these derivatives
  • 4. A typical length scale ∆t (e.g. the mesh size)
  • 5. αJ and αC for tuning the relative magnitudes of ξJ and ξC, i.e. the speed at which violated constraints become satisfied.
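Collected as code, these five ingredients are the entire interface a solver needs. A hypothetical sketch (illustrative names, not the authors' API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class NullSpaceProblem:
    # 1. Objective and constraints
    J: Callable
    g: Callable
    h: Callable
    # 2. Frechet derivatives, here returned as gradient / Jacobian arrays
    DJ: Callable
    Dg: Callable
    Dh: Callable
    # 3. Scalar product a(., .) identifying derivatives with gradients
    inner: Callable = lambda u, v: float(u @ v)
    # 4. Typical length scale (e.g. the mesh size)
    dt: float = 1e-2
    # 5. Relative magnitudes of xi_J and xi_C
    alphaJ: float = 1.0
    alphaC: float = 1.0
```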

SLIDE 32

2. Applications to shape optimization

A multiple load case: nine loads g0, . . . , g8 are applied on boundary regions Γ0, . . . , Γ8, and the structure is clamped on ΓD. For each load case i, the displacement ui solves the linear elasticity system:

  −div(Ae(ui)) = 0   in Ω
  Ae(ui)n = 0        on the free boundary Γ
  Ae(ui)n = gi       on Γi
  Ae(ui)n = 0        on Γj for j ≠ i
  ui = 0             on ΓD.
SLIDE 33

2. Applications to shape optimization

Volume minimization subject to multiple load rigidity constraints:

  min_Ω ∫_Ω dx   s.t.   ∫_Ω Ae(ui) : e(ui) dx ≤ C   for each of the nine load cases i.

SLIDE 34

Demonstration on shape optimization test cases

(a) One load (only g4 is considered). (b) Three loads (only g0, g4 and g8 are considered). (c) All nine loads.

SLIDE 35

2. Applications to shape optimization

Figure: Single load case. (a) J(Ω) = Vol(Ω). (b) Constraint C4.

SLIDE 36

2. Applications to shape optimization

Figure: Three load case. (a) J(Ω) = Vol(Ω). (b) Constraints C0, C4, C8.

SLIDE 37

2. Applications to shape optimization

Figure: Nine load case. (a) J(Ω) = Vol(Ω). (b) Constraints C0, . . . , C8.

SLIDE 38

2. Applications to shape optimization

Heat exchange maximization subject to a maximal pressure drop and a non-penetration constraint:

  max  ∫_{Ωf,cold} ρ cp v · ∇T dx − ∫_{Ωf,hot} ρ cp v · ∇T dx
  s.t. ∫_{∂Ωf,in} p ds − ∫_{∂Ωf,out} p ds ≤ DP0,
       d(Ωf,hot, Ωf,cold) ≥ dmin.

SLIDE 39

References

◮ Feppon, F., Allaire, G., Bordeu, F., Cortial, J., and Dapogny, C. Shape optimization of a coupled thermal fluid-structure problem in a level set mesh evolution framework. SeMA Journal (2019).
◮ Feppon, F., Allaire, G., and Dapogny, C. Null space gradient flows for constrained optimization with applications to shape optimization. HAL preprint hal-01972915 (2019).
◮ Feppon, F., Allaire, G., and Dapogny, C. A variational formulation for computing shape derivatives of geometric constraints along rays. HAL preprint hal-01879571 (2019).

SLIDE 40

Many thanks!

SLIDE 41

Constrained optimization

◮ For a vector space X = V, a sequence of updates is of the form xn+1 = xn − ∆t ξn, where −ξn is the current descent direction.
◮ For a manifold M, this becomes xn+1 = ρ_{xn}(−∆t ξn), where ρ_{xn} is a retraction mapping the tangent space T_{xn}M back onto M.
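A one-line sketch of this update rule, with the retraction passed as an optional callable for the manifold case (hypothetical interface):

```python
def update(x, xi, dt, retract=None):
    # Vector space: x_{n+1} = x_n - dt * xi_n; manifold: x_{n+1} = rho_x(-dt * xi_n).
    return x - dt * xi if retract is None else retract(x, -dt * xi)
```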

SLIDE 42

1. A generic optimization algorithm

Warning: ∇J(x) and the transposes ^T must be computed with respect to the scalar product a of the Hilbert space V (or of the tangent space T_{xn}M). In practice this means solving

  ∀ξ ∈ V, a(∇J(x), ξ) = DJ(x)ξ,
  ∀ξ ∈ V, a(∇gi(x), ξ) = Dgi(x)ξ,
  ∀ξ ∈ V, a(∇hi(x), ξ) = Dhi(x)ξ.

Then
  Dg^T(x) = [∇g1(x) · · · ∇gp(x)],
  Dh^T(x) = [∇h1(x) · · · ∇hq(x)].
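In a discretized setting, this identification amounts to one linear solve with the matrix of the scalar product. A minimal sketch, assuming a(u, v) = u^T A v for a symmetric positive definite matrix A (in shape optimization, typically a mass or H^1 stiffness matrix):

```python
import numpy as np

def riesz_gradient(A, dJ):
    # Return grad J such that a(grad J, xi) = DJ(x) xi for all xi,
    # i.e. solve A @ gradJ = dJ, where dJ is the vector representing DJ(x).
    return np.linalg.solve(A, dJ)
```

With A = I this reduces to the usual Euclidean gradient; a smoother scalar product yields a regularized descent direction, which is what makes the method usable in the infinite dimensional setting.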