SLIDE 1

Mutation Rates of the (1+1)-EA on Pseudo-Boolean Functions of Bounded Epistasis

Andrew M Sutton, Darrell Whitley, Adele Howe

SLIDE 2

Introduction

Want to maximize $f : \{0,1\}^n \to \mathbb{R}^+$

(1+1)-EA(ρ)
    Choose $x \in \{0,1\}^n$ uniformly at random
    while stopping criteria not met do
        y ← x
        Flip each bit of y independently with prob. ρ
        if f(y) ≥ f(x) then x ← y

Question: how do we choose the mutation rate ρ?
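The loop on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the evaluation budget `max_evals` stands in for the unspecified stopping criterion, and `onemax` is a hypothetical test function chosen only to make the example runnable.

```python
import random

def one_plus_one_ea(f, n, rho, max_evals=10_000, rng=None):
    """Maximize f over {0,1}^n with the (1+1)-EA at mutation rate rho."""
    rng = rng or random.Random()
    # Choose x uniformly at random.
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = f(x)
    # Assumption: a fixed evaluation budget plays the role of the stopping criterion.
    for _ in range(max_evals):
        # Copy x and flip each bit independently with probability rho.
        y = [b ^ 1 if rng.random() < rho else b for b in x]
        fy = f(y)
        if fy >= fx:  # accept the offspring when it is at least as fit
            x, fx = y, fy
    return x, fx

def onemax(x):
    """Hypothetical test function: number of ones in x."""
    return sum(x)

best, value = one_plus_one_ea(onemax, n=20, rho=1 / 20, max_evals=5000,
                              rng=random.Random(0))
print(value)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when comparing different values of `rho` on the same starting point.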

SLIDE 3

Motivation

Early experimental evidence suggested 0.001 ≤ ρ ≤ 0.01: De Jong (1975), Grefenstette (1986), Schaffer (1989)

Droste et al. (1998): linear functions
    ρ = 1/n ⟹ O(n log n) expected convergence

Jansen and Wegener (2000): PathToJump
    ρ = 1/n ⟹ superpolynomial runtime w.h.p.
    ρ = (log n)/n ⟹ polytime convergence

Doerr et al. (2010): monotone functions
    changing ρ by a constant factor ⟹ exponential performance gap

SLIDE 4

Expected offspring fitness under rate ρ

We will study the class of functions $f : \{0,1\}^n \to \mathbb{R}^+$ whose epistasis is bounded by a constant k.

What is the “best” mutation rate?
    Maximizes probability of improvement (difficult to know in general)
    Maximizes the expected fitness of the offspring

Let $x \in \{0,1\}^n$. We define $M_x(\rho)$: the expected fitness of the offspring of x under rate ρ.
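To make the definition concrete, the expectation can be written out as a sum over every possible offspring y, weighted by the probability that mutation at rate ρ turns x into y. The sketch below does exactly that by brute force. The function name `offspring_expectation_bruteforce` is an assumption for illustration, and the enumeration costs 2^n evaluations, so it is only usable for very small n.

```python
from itertools import product

def offspring_expectation_bruteforce(f, x, rho):
    """Exact M_x(rho) as the sum over all y of P(y | x, rho) * f(y)."""
    n = len(x)
    total = 0.0
    for y in product((0, 1), repeat=n):
        d = sum(a != b for a, b in zip(x, y))  # Hamming distance between x and y
        # Exactly d bits flip and n - d bits stay put.
        total += rho**d * (1 - rho) ** (n - d) * f(y)
    return total

# With rho = 0 the offspring is always x itself, so the value is f(x).
print(offspring_expectation_bruteforce(sum, (1, 1, 0), 0.0))
```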

SLIDE 5

Expected offspring fitness under rate ρ

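How does one compute $M_x(\rho)$?

Appeal to a basis function decomposition of the fitness function

Standard and alternative bases

[Figure: f expressed in the standard basis $e_x, e_y$ and in an alternative basis $\varphi_i, \varphi_j, \varphi_k$]

    Standard basis: $e_y(x) = \delta_{xy}$, $f(x) = \sum_y v_y e_y(x)$
    Alternative basis: $f(x) = \sum_i a_i \varphi_i(x)$

Decomposition provides information about the relationship between the fitness function and the mutation operator.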

SLIDE 6

Appealing to Walsh decomposition

We can write the fitness function as a Walsh polynomial

$$f(x) = \sum_i w_i \psi_i(x)$$

Expected fitness of a point drawn uniformly at random at Hamming distance r from x (Sutton et al. 2011)

$$S^r_x = \sum_i \gamma_{i,r} w_i \psi_i(x) \qquad (\ast)$$

From this we can obtain the expected fitness under rate ρ

$$M_x(\rho) = \sum_{r=0}^{n} \binom{n}{r} \rho^r (1-\rho)^{n-r} S^r_x$$

$$M_x(\rho) = A_0 + A_1\rho + A_2\rho^2 + \cdots + A_n\rho^n$$

where each $A_m$ is a linear combination of terms from (∗).
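The binomial mixture of sphere averages can be checked on small instances with the sketch below. It computes each $S^r_x$ by direct enumeration of the Hamming sphere rather than through the Walsh expression (∗), since the coefficients $\gamma_{i,r}$ are not spelled out on the slide. The function names are assumptions for illustration, and the cost is exponential in n.

```python
from itertools import combinations
from math import comb

def sphere_average(f, x, r):
    """S^r_x: mean fitness over all points at Hamming distance exactly r from x."""
    n = len(x)
    total = 0.0
    for flips in combinations(range(n), r):
        y = list(x)
        for j in flips:
            y[j] ^= 1  # flip the chosen bit
        total += f(y)
    return total / comb(n, r)

def offspring_expectation_from_spheres(f, x, rho):
    """M_x(rho) as the sum over r of C(n, r) * rho^r * (1 - rho)^(n - r) * S^r_x."""
    n = len(x)
    return sum(
        comb(n, r) * rho**r * (1 - rho) ** (n - r) * sphere_average(f, x, r)
        for r in range(n + 1)
    )

print(offspring_expectation_from_spheres(sum, [1, 1, 0], 0.25))
```

The weight `comb(n, r) * rho**r * (1 - rho)**(n - r)` is the probability that exactly r bits flip, so the sum is an average of sphere averages.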

SLIDE 7

The expected offspring fitness polynomial

For any pseudo-Boolean function, $M_x(\rho)$ is a polynomial in ρ of degree at most n.

[Figure: plot of $M_x(\rho)$ against ρ on [0, 1], with the maximizing rate marked "opt"]

The maximum over the interval [0, 1] gives the best ρ in terms of maximizing expected fitness.

We are interested in exploring some properties of this polynomial.

SLIDE 8

Degeneracy

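Let $\rho^\star \in \arg\max_{\rho \in [0,1]} M_x(\rho)$.

    $\rho^\star \neq 0$: there exists a mutation rate that produces an expected improvement over f(x).
    $\rho^\star = 0$: no mutation rate can produce an expected improvement over f(x).

Suppose we insist on flipping ℓ > 0 bits in expectation. Then ρ = ℓ/n, and in the degenerate case ($\rho^\star = 0$)

$$\left(1 - \frac{\ell}{n}\right)^n f(x) \le M_x\!\left(\frac{\ell}{n}\right) < f(x),$$

and, since $(1 - \ell/n)^n \to e^{-\ell}$ as n grows,

$$e^{-\ell} f(x) \le M_x\!\left(\frac{\ell}{n}\right) < f(x).$$

We can conclude that ρ = 1/n minimizes the expected loss in fitness. In this case, the expected fitness of the offspring is bounded below by f(x)/e.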
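One way to see where the lower bound comes from is sketched below. It is a plausible reading rather than necessarily the authors' own argument, and the Hamming distance notation $d(x,y)$ is introduced here for illustration. Because f is nonnegative, dropping every term except the "no bit flips" term can only decrease the sum.

```latex
% Assumed notation: d(x, y) is the Hamming distance between x and y.
\begin{align*}
M_x(\rho)
  &= \sum_{y \in \{0,1\}^n} \rho^{d(x,y)} (1-\rho)^{n-d(x,y)} f(y) \\
  &\ge (1-\rho)^n f(x)
  && \text{(keep only the term } y = x\text{; all other terms are } \ge 0\text{)}, \\
\left(1 - \tfrac{\ell}{n}\right)^n
  &\longrightarrow e^{-\ell}
  && \text{as } n \to \infty \text{, with } \rho = \ell/n .
\end{align*}
```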

SLIDE 9

Linear functions

Linear functions:
    Expressed as a sum of linear terms (epistasis is k = 1)
    Walsh coefficients of order higher than 1 vanish

$$M_x(\rho) = A_0 + A_1\rho + A_2\rho^2 + \cdots + A_n\rho^n$$

where $A_m$ is a linear combination of terms from the Walsh series expansion of f.

Proposition. If f is a linear function, then

$$A_m = \begin{cases} f(x) & \text{if } m = 0; \\ 2\left(\bar{f} - f(x)\right) & \text{if } m = 1; \\ 0 & \text{otherwise.} \end{cases}$$
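The proposition can also be checked directly, without Walsh analysis, using linearity of expectation. The derivation below assumes the parameterization $f(x) = c_0 + \sum_j c_j x_j$ (the coefficients $c_j$ are notation introduced here) and takes $\bar{f}$ to be the mean of f over $\{0,1\}^n$.

```latex
% Assumed notation: f(x) = c_0 + \sum_j c_j x_j; y is the offspring of x under rate \rho.
\begin{align*}
\mathbb{E}[y_j] &= (1-\rho)\,x_j + \rho\,(1 - x_j) = x_j + \rho\,(1 - 2x_j), \\
M_x(\rho) &= c_0 + \sum_{j=1}^{n} c_j\,\mathbb{E}[y_j]
           = f(x) + \rho \sum_{j=1}^{n} c_j (1 - 2x_j), \\
\bar{f} &= c_0 + \tfrac{1}{2}\sum_{j=1}^{n} c_j
  \;\Longrightarrow\;
  2\bigl(\bar{f} - f(x)\bigr) = \sum_{j=1}^{n} c_j (1 - 2x_j).
\end{align*}
```

Substituting the last line into the second gives $A_0 = f(x)$, $A_1 = 2(\bar{f} - f(x))$, and no higher-order terms.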

SLIDE 10

Linear functions

When f is linear,

$$M_x(\rho) = f(x) + 2\left(\bar{f} - f(x)\right)\rho$$

[Figure: three plots of $M_x(\rho)$ against ρ on [0, 1], one each for $f(x) < \bar{f}$, $f(x) = \bar{f}$, and $f(x) > \bar{f}$]

Degenerate when $f(x) > \bar{f}$; in this case 1/n maximizes the expected offspring fitness subject to ℓ > 0 bits flipping in expectation.

This illustrates a problem with using expectation: consider when $f(x) = \bar{f}$.
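The three regimes can be tabulated with a few lines of Python. This is an illustrative sketch: the parameterization `c0 + sum(c[j] * x[j])` and the function name are assumptions, and OneMax on four bits is used only as a convenient linear function with $\bar{f} = 2$.

```python
def linear_offspring_expectation(c0, c, x, rho):
    """M_x(rho) = f(x) + 2 * (f_bar - f(x)) * rho for f(x) = c0 + sum_j c[j] * x[j]."""
    fx = c0 + sum(cj * xj for cj, xj in zip(c, x))
    f_bar = c0 + sum(c) / 2  # mean of f over {0,1}^n
    return fx + 2 * (f_bar - fx) * rho

c = [1.0, 1.0, 1.0, 1.0]  # OneMax on 4 bits, so f_bar = 2
for x in ([0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 1, 1]):
    # One row per regime: f(x) below, equal to, and above f_bar.
    print(x, [linear_offspring_expectation(0.0, c, x, r) for r in (0.0, 0.25, 1.0)])
```

In the middle row the slope is zero, so every rate yields the same expected fitness and the expectation alone cannot distinguish between rates.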

SLIDE 11

Epistatically bounded functions

$f : \{0,1\}^n \to \mathbb{R}^+$ where the epistasis is bounded by k = O(1).

Max-k-Sat
    Boolean formula over a set V of n variables and m clauses consisting of exactly k literals

    $$\bigwedge_{i=1}^{m} (\ell_{i,1} \lor \ell_{i,2} \lor \cdots \lor \ell_{i,k}), \quad \text{where } \ell_{i,j} \in \{v, \lnot v : v \in V\}$$

    $f : \{0,1\}^n \to \{0, \ldots, m\}$ counts clauses satisfied under x.

NK-landscapes

    $$f(x) = \frac{1}{n} \sum_{j=1}^{n} g_j\left(x[j], x[b^{(j)}_1], \ldots, x[b^{(j)}_K]\right), \quad \text{where } g_j : \{0,1\}^{K+1} \to [0,1]$$
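Both problem classes are easy to express as fitness functions. The sketch below is one possible encoding, not the authors' setup: clauses use signed integers in the DIMACS style (an assumption), and the NK generator draws neighbors and table entries uniformly at random, which is only one way to build an unrestricted instance. Note that each $g_j$ reads K + 1 bits.

```python
import random

def maxsat_fitness(clauses):
    """Build f(x) = number of satisfied clauses.

    Assumed encoding: literal v > 0 means x[v-1] == 1, literal -v means x[v-1] == 0.
    """
    def f(x):
        return sum(
            any((x[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
            for clause in clauses
        )
    return f

def random_nk_landscape(n, K, rng):
    """Build f(x) = (1/n) * sum_j g_j(x[j], K neighbors of j), with random tables g_j."""
    neighbors = [rng.sample([i for i in range(n) if i != j], K) for j in range(n)]
    tables = [[rng.random() for _ in range(2 ** (K + 1))] for _ in range(n)]

    def f(x):
        total = 0.0
        for j in range(n):
            idx = x[j]
            for b in neighbors[j]:
                idx = (idx << 1) | x[b]  # pack the K + 1 relevant bits into a table index
            total += tables[j][idx]
        return total / n
    return f

sat = maxsat_fitness([(1, -2), (2, 3)])
nk = random_nk_landscape(6, 2, random.Random(0))
print(sat([1, 1, 1]), nk([0, 1, 0, 1, 1, 0]))
```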

SLIDE 12

Epistatically bounded functions

Proposition. When f is epistatically bounded by k, $A_m = 0$ for m > k.

Thus for any k-bounded pseudo-Boolean function, $M_x(\rho)$ is a polynomial in ρ of degree at most k.
E.g., for Max-2-Sat, the mutation polynomial is quadratic.

Furthermore, $A_m$ depends only on $S^r_x$ for r ≤ m.

Corollary. If $S^r_x < f(x)$ for all 0 < r ≤ k, then $M_x(0)$ is maximal.
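The vanishing of the high-order coefficients can be observed numerically. The sketch below relies on a standard identity that is not stated on the slides: under independent bit flips at rate ρ, the expected value of $\psi_i$ at the offspring is $(1-2\rho)^{|i|}\psi_i(x)$, where $|i|$ is the order of the Walsh function. Expanding the power gives $A_m = (-2)^m \sum_i \binom{|i|}{m} w_i \psi_i(x)$. The convention $\psi_i(x) = (-1)^{\mathrm{popcount}(i \wedge x)}$ and the function names are assumptions, and the Walsh transform here is a naive one that costs $4^n$ operations.

```python
from math import comb

def walsh_coefficients(f, n):
    """w_i = 2^-n * sum_x f(x) * psi_i(x), with psi_i(x) = (-1)^popcount(i & x)."""
    points = list(range(2**n))
    values = [f([(p >> j) & 1 for j in range(n)]) for p in points]
    return [
        sum(v * (-1) ** bin(i & p).count("1") for p, v in zip(points, values)) / 2**n
        for i in points
    ]

def mutation_polynomial(w, x_bits):
    """Coefficients [A_0, ..., A_n] of M_x(rho) from the Walsh coefficients w."""
    n = len(x_bits)
    x = sum(b << j for j, b in enumerate(x_bits))
    A = [0.0] * (n + 1)
    for i, wi in enumerate(w):
        order = bin(i).count("1")                  # |i|, the order of psi_i
        term = wi * (-1) ** bin(i & x).count("1")  # w_i * psi_i(x)
        for m in range(order + 1):
            # Expand (1 - 2*rho)^order = sum_m C(order, m) * (-2)^m * rho^m.
            A[m] += term * comb(order, m) * (-2) ** m
    return A

# f(x) = x[0] * x[1] has epistasis 2, so A_3 should vanish.
w = walsh_coefficients(lambda x: x[0] * x[1], 3)
print(mutation_polynomial(w, [1, 1, 0]))
```

A Walsh function of order p contributes only to $A_0, \ldots, A_p$, so when all coefficients of order above k are zero, nothing is ever added to $A_m$ for m > k.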

SLIDE 13

Numerical results

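Practically speaking, does solving $M_x(\rho)$ for the best mutation rate provide any insight?

$$\frac{d}{d\rho} M_x(\rho) = A_1 + 2A_2\rho + 3A_3\rho^2 + \cdots + nA_n\rho^{n-1},$$

$$\frac{d^2}{d\rho^2} M_x(\rho) = 2A_2 + 6A_3\rho + 12A_4\rho^2 + \cdots + n(n-1)A_n\rho^{n-2}.$$

Numerically finding $\rho^\star$
    Find the stationary points of $M_x(\rho)$ by numerically solving for the real roots of $\frac{d}{d\rho} M_x(\rho)$.
    Second derivative test for concavity
    $\rho^\star$ is the point with the largest $M_x$ value among these stationary points together with the endpoints {0, 1}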
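For k ≤ 3 (which covers Max-3-Sat) the derivative is at most quadratic, so its roots have a closed form and no numerical root finder is needed. The sketch below takes that shortcut, which is an assumption of this example rather than the procedure on the slide. It also skips the second derivative test and instead compares $M_x$ at every stationary point in [0, 1] and at both endpoints, which selects the same maximizer. The name `best_rate` is hypothetical.

```python
from math import sqrt

def best_rate(A):
    """Maximize M(rho) = sum_m A[m] * rho**m over [0, 1], assuming degree <= 3."""
    assert len(A) <= 4, "this sketch only handles polynomials of degree <= 3"
    A = list(A) + [0.0] * (4 - len(A))

    def M(r):
        return sum(a * r**m for m, a in enumerate(A))

    # Derivative: A1 + 2*A2*rho + 3*A3*rho^2, written as c + b*rho + a*rho^2.
    a, b, c = 3 * A[3], 2 * A[2], A[1]
    candidates = [0.0, 1.0]  # always include the endpoints
    if a != 0:
        disc = b * b - 4 * a * c
        if disc >= 0:
            candidates += [(-b + s * sqrt(disc)) / (2 * a) for s in (1, -1)]
    elif b != 0:
        candidates.append(-c / b)
    # Keep only candidates inside the feasible interval, then pick the best one.
    candidates = [r for r in candidates if 0.0 <= r <= 1.0]
    return max(candidates, key=M)

print(best_rate([1.0, 1.0, -2.0]))  # M(rho) = 1 + rho - 2*rho^2
```

For higher-degree polynomials a general root finder would be needed in place of the quadratic formula.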

SLIDE 14

Numerical results

Unrestricted NK-landscape, n = 100, k = 2

[Plot: mutation rate ρ (axis ticks 0.001 to 0.5) against generation (axis ticks 1 to 500) for three settings: optimal, 1/n, 0.001]

SLIDE 15

Numerical results

Unrestricted NK-landscape, n = 100, k = 2

[Plot: fitness (axis ticks 0.50 to 0.65) against generation (axis ticks 1 to 500) for three settings: optimal, 1/n, 0.001]

SLIDE 16

Numerical results

Max-3-Sat, n = 100

[Plot: mutation rate ρ (axis ticks 0.001 to 0.5) against generation (axis ticks 1 to 500) for three settings: optimal, 1/n, 0.001]

SLIDE 17

Numerical results

Max-3-Sat, n = 100

[Plot: fitness (axis ticks 380 to 420) against generation (axis ticks 1 to 500) for three settings: optimal, 1/n, 0.001]

SLIDE 18

Numerical results (high epistasis)

NK-landscape, N=10, K=9

[Plot: fitness (axis ticks 0.50 to 0.75) against generation (axis ticks 1 to 500) for three settings: optimal, standard, hardwired]

[Plot: rho (axis ticks 0.001 to 0.5) against generation (axis ticks 1 to 500) for three settings: optimal, standard, hardwired]

SLIDE 19

Research directions

Analyzing $w_i$ for certain specific problems (e.g., Max-k-Sat).
Provide more precise statements about the expected fitness in general.
Working with higher moments of the random variable distribution
Connection to runtime analysis... might be possible if we can discover bounds for the higher moments of the distribution
Generalization to broader problem classes (Fourier decomposition)

SLIDE 20

Conclusion

As long as epistasis is bounded by k,
    it is possible to efficiently compute the expected fitness of a mutation for any rate.
    it is possible to efficiently find the rate that results in the highest possible expected fitness.

For strings with fitness higher than expectation in spheres up to radius k, 1/n yields maximal expected fitness of the offspring while imposing the constraint that ℓ > 0 bits are flipped in expectation.
