NLO: April 18, 2013 Dr. Thomas M. Surowiec Humboldt University of - - PowerPoint PPT Presentation

SLIDE 1

Step Size Strategies and Algorithms The Wolfe-Powell Rule

NLO: April 18, 2013

  • Dr. Thomas M. Surowiec

Humboldt University of Berlin Department of Mathematics

Summer 2013

  • Dr. Thomas M. Surowiec

BMS Course NLO, Summer 2013

SLIDE 2

Armijo Rule

Definition 1.1 Let $\sigma \in (0, 1)$ be fixed. The Armijo rule is a condition on the step size $\alpha > 0$ which ensures sufficient descent in the sense that

$$f(x + \alpha d) \le f(x) + \sigma \alpha \nabla f(x)^T d.$$

Discussion on the board.
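
As a minimal sketch of how the Armijo rule can be tested numerically, the following Python function evaluates the inequality for a given trial step. The quadratic test function, the starting point, and the names `dot` and `armijo_holds` are illustrative assumptions and not part of the lecture.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def armijo_holds(f, grad_f, x, d, alpha, sigma):
    """Return True if f(x + alpha*d) <= f(x) + sigma*alpha*grad_f(x)^T d."""
    x_trial = [xi + alpha * di for xi, di in zip(x, d)]
    return f(x_trial) <= f(x) + sigma * alpha * dot(grad_f(x), d)


# Hypothetical test problem: f(x) = x_1^2 + 10 x_2^2.
def f(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 20.0 * x[1]]


x = [1.0, 1.0]
d = [-g for g in grad_f(x)]  # steepest descent direction d = -grad f(x)

full_step_ok = armijo_holds(f, grad_f, x, d, alpha=1.0, sigma=0.1)
small_step_ok = armijo_holds(f, grad_f, x, d, alpha=0.05, sigma=0.1)
```

For this example the full step α = 1 overshoots and fails the test, whereas α = 0.05 passes it.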

SLIDE 4

Armijo Step-Size Strategy

Algorithm 2 Armijo Step Size Strategy

Input: descent direction $d$, $l := 0$, $\alpha^{(0)} = 1$, $0 < \nu_1 \le \nu_2 < 1$

while Armijo rule not fulfilled do
    Determine $\alpha^{(l+1)} \in [\nu_1 \alpha^{(l)}, \nu_2 \alpha^{(l)}]$
    Set $l := l + 1$.
end while
Set $\alpha_k := \alpha^{(l)}$.
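
Algorithm 2 can be sketched in Python as follows. The sketch assumes the simplest admissible choice ν1 = ν2 = ν, so that every reduction multiplies the step by the same factor. The function name, the safeguard `max_reductions`, and the quadratic test problem are assumptions added for illustration.

```python
def armijo_step_size(f, grad_f, x, d, sigma=0.1, nu=0.5, alpha0=1.0,
                     max_reductions=60):
    """Backtrack from alpha0 until the Armijo rule holds.

    Each reduction uses alpha_new = nu * alpha, i.e. the special case
    nu1 = nu2 = nu of the interval [nu1*alpha, nu2*alpha].
    Returns the accepted step size and the number of reductions l.
    """
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad_f(x), d))  # grad f(x)^T d
    if slope >= 0.0:
        raise ValueError("d is not a descent direction")
    alpha = alpha0
    for l in range(max_reductions + 1):
        x_trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_trial) <= fx + sigma * alpha * slope:
            return alpha, l
        alpha *= nu  # reduce the trial step
    raise RuntimeError("no admissible step size found")


# Hypothetical test problem: f(x) = x_1^2 + 10 x_2^2, steepest descent.
def f(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 20.0 * x[1]]


x = [1.0, 1.0]
d = [-g for g in grad_f(x)]
alpha, reductions = armijo_step_size(f, grad_f, x, d)
```

On this problem the trial steps 1, 0.5, 0.25 and 0.125 are rejected, and α = 0.0625 is accepted after four reductions.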

SLIDE 5

Analysis of the Armijo Step-Size Strategy

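Lemma 1.1 Let $f \in C^{1,1}(\mathbb{R}^n, \mathbb{R})$ and let $L$ be a Lipschitz constant for $\nabla f$. Let $\{x_k\}$ be generated by Algorithm 1 (cf. Lecture Notes Algorithm 1) using Algorithm 2 for the choice of step size. Let $\{M_k\}$ be a sequence of symmetric positive definite matrices such that:

1. $d_k = -M_k \nabla f(x_k)$
2. There exist $\lambda_1, \lambda_2$ such that $0 < \lambda_1 \le \lambda_2 < +\infty$ with $\lambda_1 \le \lambda_s^{(k)} \le \lambda_g^{(k)} \le \lambda_2$, $\forall k \in \mathbb{N}$.

Then $\alpha_k$ fulfills

$$\alpha_k \ge \alpha^* = \frac{2 \nu_1 \lambda_1 (1 - \sigma)}{L \kappa^*}, \quad \forall k \in \mathbb{N},$$

with $\kappa^* = \lambda_2 / \lambda_1$. Furthermore, in every iteration of the algorithm (cf. Lecture Notes Algorithm 1) there will be at most

$$m \le \log\left(\frac{2 \lambda_1 (1 - \sigma)}{L \kappa^*}\right) / \log(\nu_2), \quad m \in \mathbb{N},$$

step size reductions necessary.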
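
The lower bound on the step size can be checked numerically on a small example. The sketch below assumes the simplest setting $M_k = I$ (so $d_k = -\nabla f(x_k)$ and $\lambda_1 = \lambda_2 = 1$), a fixed reduction factor ν1 = ν2 = 0.5, and a hypothetical quadratic whose gradient has Lipschitz constant L = 10. All names and parameter values are illustrative assumptions.

```python
def f(x):
    # Hypothetical quadratic test function: f(x) = 0.5 * (x_1^2 + 10 x_2^2)
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad_f(x):
    return [x[0], 10.0 * x[1]]


L = 10.0               # Lipschitz constant of grad f (largest Hessian eigenvalue)
lam1 = lam2 = 1.0      # M_k = I, so lambda_1 = lambda_2 = 1
sigma, nu1 = 0.1, 0.5  # fixed reduction factor nu1 = nu2 = 0.5
kappa = lam2 / lam1
alpha_star = 2.0 * nu1 * lam1 * (1.0 - sigma) / (L * kappa)

x = [1.0, 1.0]
alphas = []
for k in range(50):
    g = grad_f(x)
    d = [-gi for gi in g]  # d_k = -M_k grad f(x_k) with M_k = I
    slope = sum(gi * di for gi, di in zip(g, d))
    if slope == 0.0:
        break  # stationary point reached
    alpha = 1.0
    while f([xi + alpha * di for xi, di in zip(x, d)]) > f(x) + sigma * alpha * slope:
        alpha *= nu1  # Armijo backtracking
    alphas.append(alpha)
    x = [xi + alpha * di for xi, di in zip(x, d)]

smallest_alpha = min(alphas)
```

Here $\alpha^* = 0.09$, and every accepted step size in the run stays above this value.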

SLIDE 6

How Important is the Lipschitz Assumption?

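Without the assumption that $\nabla f : \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz continuous, there might be subsequences of step sizes $\{\alpha_{k(l)}\}$ such that $\alpha_{k(l)} \to 0$. In that case the main assumption of Lemma 1.1 is violated, and the uniform lower bound $\alpha_k \ge \alpha^*$ is no longer guaranteed. Discussion on the board.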
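
One hypothetical example of this effect, not taken from the lecture, is $f(x) = |x|^{3/2}$ on $\mathbb{R}$: it is continuously differentiable, but $f'$ is not Lipschitz continuous near $x = 0$. For steepest descent, any step that decreases $f$ must satisfy $\alpha_k < \tfrac{4}{3}\sqrt{|x_k|}$, so the accepted step sizes are forced toward zero as the iterates approach the minimizer. The sketch below runs a few iterations with Armijo backtracking; the parameter values are assumptions.

```python
import math


def f(x):
    # Hypothetical example: f is C^1, but f' is not Lipschitz near x = 0.
    return abs(x) ** 1.5


def df(x):
    return 1.5 * math.copysign(math.sqrt(abs(x)), x)


sigma, nu = 0.1, 0.5
x = 1.0
alphas = []
for k in range(8):
    g = df(x)
    if g == 0.0:
        break  # landed exactly on the minimizer
    d = -g  # steepest descent direction
    alpha = 1.0
    while f(x + alpha * d) > f(x) + sigma * alpha * g * d:
        alpha *= nu  # Armijo backtracking with a fixed factor
    alphas.append(alpha)
    x += alpha * d
```

The first step accepts α = 1, but every later step needs at least one reduction, and the list `alphas` shows the accepted steps shrinking together with $|x_k|$.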

SLIDE 7

Convergence without Sufficient Descent

Theorem 1.1 Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuously differentiable and let $\{M_k\}$ fulfill the assumptions of Lemma 1.1. It holds that either $\{f(x_k)\}$ is unbounded from below or

$$\lim_{k \to \infty} \nabla f(x_k) = 0,$$

and thus every accumulation point of $\{x_k\}$ is a stationary point of $f$ in $\mathbb{R}^n$. In particular, it holds true that if $\{f(x_k)\}$ is bounded from below and $\lim_{l \to +\infty} x_{k(l)} = x^*$, then $\nabla f(x^*) = 0$.

Proof. This will be a homework question. Refer back to the proofs of the previous three results going back to the lecture notes from April 16.

Note: there is no guarantee that a unique accumulation point exists!

SLIDE 8

Alternatives and Extensions

Another possibility for choosing $\alpha$ that can be analyzed as above is: Let $r > 0$ be a scaling factor and $\beta \in (0, 1)$ a reduction factor, and set $\alpha = \max\{ r\beta^l : l = 0, 1, 2, \ldots \}$ such that the Armijo rule is satisfied.

Both variants are known as backtracking strategies. The drawback is that one immediately chooses a reduction after the first trial step.
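
A minimal Python sketch of this variant follows. Because $r\beta^l$ decreases with $l$, the first exponent that passes the Armijo test yields the maximum. The function name, the cap `max_l`, and the one-dimensional test function are assumptions made for illustration.

```python
def scaled_backtracking(f, grad_f, x, d, r=2.0, beta=0.5, sigma=0.1, max_l=60):
    """Return the largest r * beta**l (l = 0, 1, 2, ...) satisfying the Armijo rule."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad_f(x), d))  # grad f(x)^T d
    for l in range(max_l + 1):
        alpha = r * beta ** l
        x_trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_trial) <= fx + sigma * alpha * slope:
            return alpha
    raise RuntimeError("no admissible step size found")


# Hypothetical test problem: f(x) = 0.25 * x^2 in one dimension.
def f(x):
    return 0.25 * x[0] ** 2


def grad_f(x):
    return [0.5 * x[0]]


x = [4.0]
d = [-g for g in grad_f(x)]

alpha_scaled = scaled_backtracking(f, grad_f, x, d, r=2.0)
alpha_unit = scaled_backtracking(f, grad_f, x, d, r=1.0)
```

With r = 2 the first trial step α = 2 is accepted and lands exactly on the minimizer, whereas r = 1 returns α = 1. This illustrates how the scaling factor allows steps longer than one.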

SLIDE 9

Armijo-Goldstein Rule

A more flexible strategy: Given $\sigma, \mu$ with $0 < \sigma < 1/2 < \mu < 1$, one tests

$$f(x + \alpha d) \le f(x) + \sigma \alpha \nabla f(x)^T d, \quad (1)$$
$$f(x + \alpha d) \ge f(x) + \mu \alpha \nabla f(x)^T d. \quad (2)$$

Condition (1) ensures that $\alpha$ is not too large, whereas condition (2) ensures that $\alpha$ is not too small. Discussion of an implementation strategy on the board.
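
The implementation strategy discussed on the board is not recorded here. One possible approach, given purely as an assumption, is a bracketing search: a step that violates (1) becomes an upper bound, a step that violates (2) becomes a lower bound, and the next trial is either a doubling (no upper bound yet) or the midpoint of the bracket. The names and the test function below are hypothetical.

```python
def armijo_goldstein(f, grad_f, x, d, sigma=0.25, mu=0.75, alpha0=1.0,
                     max_iter=100):
    """Search for a step size satisfying both conditions (1) and (2)."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad_f(x), d))  # < 0 for descent
    lo, hi = 0.0, None  # lo: known too-small step, hi: known too-large step
    alpha = alpha0
    for _ in range(max_iter):
        phi = f([xi + alpha * di for xi, di in zip(x, d)])
        if phi > fx + sigma * alpha * slope:   # (1) violated: step too large
            hi = alpha
        elif phi < fx + mu * alpha * slope:    # (2) violated: step too small
            lo = alpha
        else:
            return alpha
        # Expand while no upper bound is known, otherwise bisect the bracket.
        alpha = 2.0 * lo if hi is None else 0.5 * (lo + hi)
    raise RuntimeError("no admissible step size found")


# Hypothetical test problem: f(x) = x^2 in one dimension, x = 1, d = -f'(x).
def f(x):
    return x[0] ** 2


def grad_f(x):
    return [2.0 * x[0]]


x = [1.0]
d = [-2.0]
alpha_from_large = armijo_goldstein(f, grad_f, x, d, alpha0=1.0)
alpha_from_small = armijo_goldstein(f, grad_f, x, d, alpha0=0.1)
```

For this test problem the two conditions reduce to $1 - \mu \le \alpha \le 1 - \sigma$, that is $\alpha \in [0.25, 0.75]$. Starting from α = 1 the search shrinks the step, starting from α = 0.1 it enlarges the step, and both runs end inside the admissible interval.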

SLIDE 10

Using Polynomial Models

Apart from simple backtracking methods, there are other strategies for minimizing $\varphi(\alpha) = f(x + \alpha d)$. Here, we use the given data to define a model (function) that locally approximates $\varphi$. Derivation and discussion on the board.
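
The derivation itself is left to the board. One common choice, assumed here for illustration, is quadratic interpolation: after a trial step $\alpha_0$ has been rejected, the known data $\varphi(0)$, $\varphi'(0) = \nabla f(x)^T d$ and $\varphi(\alpha_0)$ determine a quadratic model $q$, and its minimizer $\alpha_1$ serves as the next trial step.

```latex
% Quadratic model matching \varphi(0), \varphi'(0) and \varphi(\alpha_0)
\begin{align*}
q(\alpha) &= \varphi(0) + \varphi'(0)\,\alpha
  + \frac{\varphi(\alpha_0) - \varphi(0) - \varphi'(0)\,\alpha_0}{\alpha_0^2}\,\alpha^2, \\
q'(\alpha_1) = 0 \quad\Longrightarrow\quad
\alpha_1 &= -\frac{\varphi'(0)\,\alpha_0^2}
  {2\bigl(\varphi(\alpha_0) - \varphi(0) - \varphi'(0)\,\alpha_0\bigr)}.
\end{align*}
```

If $\alpha_0$ violates the Armijo rule with parameter $\sigma$, then $\varphi(\alpha_0) - \varphi(0) - \varphi'(0)\alpha_0 > (1 - \sigma)\alpha_0 |\varphi'(0)| > 0$, so the model is strictly convex and $0 < \alpha_1 < \alpha_0 / (2(1 - \sigma))$.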

SLIDE 11

Intro to the Wolfe-Powell Rule

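Definition 2.1 Let $\sigma \in (0, \tfrac{1}{2})$ and $\rho \in [\sigma, 1)$ be fixed. The following relations are known as the Wolfe-Powell conditions: For $x, d \in \mathbb{R}^n$ with $\nabla f(x)^T d < 0$, determine a step size $\alpha > 0$ such that

$$f(x + \alpha d) \le f(x) + \sigma \alpha \nabla f(x)^T d, \quad (3)$$
$$\nabla f(x + \alpha d)^T d \ge \rho \nabla f(x)^T d. \quad (4)$$

Discussion on the board.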
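
As a minimal sketch, the two Wolfe-Powell conditions can be evaluated separately for a trial step, which shows which of them rejects a given α. The function name `wolfe_powell`, the parameter values, and the one-dimensional test function are assumptions made for illustration.

```python
def wolfe_powell(f, grad_f, x, d, alpha, sigma=0.1, rho=0.9):
    """Return (condition_3_holds, condition_4_holds) for the trial step alpha."""
    slope0 = sum(gi * di for gi, di in zip(grad_f(x), d))  # grad f(x)^T d
    x_trial = [xi + alpha * di for xi, di in zip(x, d)]
    slope_trial = sum(gi * di for gi, di in zip(grad_f(x_trial), d))
    sufficient_decrease = f(x_trial) <= f(x) + sigma * alpha * slope0  # (3)
    curvature = slope_trial >= rho * slope0                            # (4)
    return sufficient_decrease, curvature


# Hypothetical test problem: f(x) = x^2 in one dimension, x = 1, d = -f'(x).
def f(x):
    return x[0] ** 2


def grad_f(x):
    return [2.0 * x[0]]


x = [1.0]
d = [-2.0]

too_short = wolfe_powell(f, grad_f, x, d, alpha=0.01)
accepted = wolfe_powell(f, grad_f, x, d, alpha=0.5)
too_long = wolfe_powell(f, grad_f, x, d, alpha=1.0)
```

In this example the very short step α = 0.01 satisfies (3) but fails (4), the step α = 0.5 satisfies both, and the long step α = 1 fails (3) while satisfying (4).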
