Tutorial: Theory of RaSH Made Easy (Benjamin Doerr)

SLIDE 1

Benjamin Doerr

Max-Planck-Institut für Informatik Saarbrücken

Tutorial: Theory of RaSH Made Easy

SLIDE 2

Outline: Some Theory of RaSH

  • Part I: Drift Analysis

– Motivation: Explains daily life
– A simple and powerful drift theorem
– 4 Applications:
  • Coupon collector
  • RLS and (1+1) EA optimize OneMax
  • RLS and (1+1) EA optimize linear functions
  • Finding minimum spanning trees
– Summary & Outlook

  • Part II: Random walk arguments [if time permits]
SLIDE 3

Drift Analysis: Motivation

Life in the Saarland is easy...
  – Get salary on day 0: M0 = 1000 € (6 559.57 ₣)
  – Day 1: Spend half of it in the pub: M1 = ½ M0 = 500
  – Day 2: Spend half of your money: M2 = ½ M1 = 250
  – …
  – Day t: Spend half of your money: Mt = ½ Mt-1
  – Question: When are you broke (MT < 1)?
  – Answer: T = ⌊log2(M0) + 1⌋ = 10
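The closed form can be sanity-checked in a few lines. (Python is my choice here, not part of the slides.)

```python
import math

M = 1000.0                 # M0: salary on day 0
t = 0
while M >= 1:              # broke means Mt < 1
    M /= 2                 # day t: spend half of your money
    t += 1

T_formula = math.floor(math.log2(1000) + 1)   # the slide's closed form
print(t, T_formula)        # both are 10
```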

SLIDE 4

Drift Analysis: Motivation +randomness

Life in the Saarland is easy... and chaotic
  – Get salary on day 0: M0 = 1000 € (6 559.57 ₣)
  – Day 1: Expect to spend half of it: E(M1) = ½ M0 = 500
  – Day 2: Expect to spend half of your money: E(M2) = ½ M1
  – …
  – Day t: Expect to spend half of your money: E(Mt) = ½ Mt-1, hence E(Mt) = (1/2)^t M0
  – Question: When do you expect to be broke?
  – Ideal answer: E(T) = ⌊log2(M0) + 1⌋ = 10
  – Warning: You are hoping that E(min{t | Mt < 1}) = min{t | E(Mt) < 1} = 10, which need not hold. In truth, E(T) = 10.95 is possible.
  – Solution: Drift theorem (next slide)
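The gap between the two quantities can be seen empirically. The slides do not fix a concrete random model, so the sketch below assumes each remaining euro is spent independently with probability ½ (which gives E(Mt | Mt-1) = ½ Mt-1); the observed mean comes out above the "ideal" answer 10 rather than equal to it.

```python
import random

def days_until_broke(m0=1000, rng=random):
    """Each remaining euro is spent independently w.p. 1/2 (an assumed
    model), so E(M_t | M_{t-1}) = M_{t-1} / 2.  Returns min{t : M_t < 1}."""
    m, t = m0, 0
    while m >= 1:
        m = sum(1 for _ in range(m) if rng.random() < 0.5)  # ~ Binomial(m, 1/2)
        t += 1
    return t

random.seed(1)
runs = [days_until_broke() for _ in range(1000)]
mean = sum(runs) / len(runs)
print(mean)                # close to 11, not the "ideal" 10
```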

SLIDE 5

Drift Analysis: The Theorem

  • A ‘new’ drift theorem (BD, Leslie Goldberg, Daniel Johannsen):
  • Some history:

  – Doob (1953), Tweedie (1976), Hajek (1982): Fundamental work, mathematical.
  – Early EA works (‘Dortmund’ 1995-): Use direct methods, coupon collector, Chernoff bounds, ... [could have been done with drift]
  – Expected weight decrease method: ‘Drift-thinking’, but technical effort necessary to cope with not using drift analysis [should have been done with drift]
  – He & Yao (2001-04): First explicit use of drift analysis in EA theory.
  – Now: Many drift theorems and applications [BD: the above is the coolest ☺]

Drift Theorem (multiplicative drift). Let X0, X1, ... be random variables taking values in {0} ∪ [1, ∞). Suppose there is a δ > 0 such that for all t and all x, E(Xt | Xt-1 = x) ≤ (1 – δ) x. Let T := min{t | Xt = 0}. Then
  – E(T) ≤ (1/δ)(ln(X0) + 1);
  – for all c > 0 and n ∈ N, Pr(T > (1/δ)(ln(X0) + c ln n)) ≤ n^(–c).

SLIDE 6

Drift Analysis: 4 Applications

  • A ‘new’ drift theorem (BD, Leslie Goldberg, Daniel Johannsen):
  • 4 Applications:

  – Coupon collector
  – OneMax
  – Linear functions
  – Minimum spanning trees
Making the Expected Weight Decrease Method obsolete

[Drift theorem as stated on Slide 5]

SLIDE 7

Application 1: Coupon Collector

  • Coupon Collector Problem:

  – There are n different types of coupons: T1, …, Tn
  – Round 0: You start with no coupon
  – Each round t, you obtain a random coupon Ct: Pr(Ct = Tk) = 1/n for all t and k
  – After how many rounds do you have all [types of] coupons?

  • Analysis:

  – Xt := number of missing coupon types after round t
  – X0 = n. Question: Smallest T such that XT = 0.
  – If Xt-1 = x, then the chance to get a new coupon in round t is x/n. Hence E(Xt | Xt-1 = x) = x – x/n = (1 – 1/n) x. [δ = 1/n]
  – Drift Thm gives:
    • E(T) ≤ (1/δ)(ln X0 + 1) = n (ln(n) + 1)
    • For all c > 0, Pr(T > (c+1) n ln(n)) ≤ n^(–c)

Best possible
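The drift bound n(ln n + 1) can be compared with the exact coupon-collector expectation n·Hn (a quick check; n = 50 is my choice):

```python
import math

n = 50
# Exact expectation: with k types collected, a new type needs n/(n-k)
# rounds in expectation, so E(T) = sum_k n/(n-k) = n * H_n.
exact = sum(n / (n - k) for k in range(n))
bound = n * (math.log(n) + 1)          # the drift-theorem bound
print(exact, bound)                    # ≈ 224.96 vs ≈ 245.60
```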

SLIDE 8

Application 2: RLS optimizes OneMax

  • One of the simplest randomized search heuristics (RaSH): Randomized Local Search (RLS), here used to maximize f: {0,1}^n → R
  • Question: How long does it take to find the maximum of a simple function like OneMax = f: {0,1}^n → R; x ↦ x1 + x2 + … + xn (number of ‘ones’ in x)?
  • Remark: Of course, x = (1, 1, …, 1) is the maximum, and no-one needs an algorithm to find this out. Aim: Start understanding RaSH via simple examples.

RLS:
  1. Pick x ∈ {0,1}^n uniformly at random      % random start point
  2. Pick k ∈ {1, …, n} uniformly at random
  3. y := x; yk := 1 – xk                      % mutation: flip a random bit
  4. if f(y) ≥ f(x), then x := y               % selection: keep the fitter
  5. if not happy, go to 2.                    % repeat or terminate
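The RLS pseudocode translates directly into Python (a sketch; n and the budget, sized generously against the n(ln n + 1) bound, are my choices):

```python
import math
import random

def rls(f, n, budget, rng):
    """Randomized Local Search, steps 1-5 of the slide."""
    x = [rng.randint(0, 1) for _ in range(n)]   # 1. random start point
    fx = f(x)
    for t in range(1, budget + 1):
        k = rng.randrange(n)                    # 2. random position
        y = x[:]
        y[k] = 1 - y[k]                         # 3. mutation: flip bit k
        fy = f(y)
        if fy >= fx:                            # 4. selection: keep the fitter
            x, fx = y, fy
        if fx == n:                             # 5. "happy": optimum reached
            return t
    return None

random.seed(0)
n = 100
budget = 20 * n * (int(math.log(n)) + 1)        # generous vs. n(ln n + 1)
t = rls(sum, n, budget, random)                 # OneMax is just sum(x)
print(t)
```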

SLIDE 9

Application 2: RLS optimizes OneMax

  • Question: How long does it take to find the maximum of a simple function like OneMax = f: {0,1}^n → R; x ↦ x1 + x2 + … + xn (number of ‘ones’ in x)?
  • Analysis (same as for coupon collector):
    – Xt: number of zeroes after iteration t (= “fopt – f(x)”). Trivially, X0 ≤ n
    – If Xt-1 = k, then with probability k/n, we flip a ‘zero’ into a ‘one’ (Xt = k – 1). Otherwise, y is worse than x and thus Xt = k
    – Hence, E(Xt | Xt-1 = k) = k – k/n = (1 – 1/n) k
    – Drift Thm gives: maximum found after n (ln n + 1) iterations (in expectation)

(RLS pseudocode as on Slide 8.)

SLIDE 10

Application 2a: (1+1)-EA optimizes OneMax

  • One of the simplest evolutionary algorithms (EAs): the (1+1)-EA, again used to maximize f: {0,1}^n → R

  • ‘(1+1)’: population size 1; generate 1 offspring; perform ‘plus’-selection: choose the new population from parents and offspring

  • Cannot get stuck in local optima (“always converges”).
  • Question: Time to maximize OneMax = f: {0,1}n → R; x ↦ x1 + … + xn?

(1+1)-EA:
  1. Pick x ∈ {0,1}^n uniformly at random                          % random start point
  2. y := x
  3. For each i ∈ {1, …, n}: with probability 1/n set yi := 1 – xi  % mutation: flip each bit w.p. 1/n
  4. if f(y) ≥ f(x), then x := y                                   % selection: keep the fitter
  5. if not happy, go to 2.                                        % repeat or terminate
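The (1+1)-EA pseudocode translates just as directly (a sketch; n and the budget, a generous multiple of e n (ln n + 1), are my choices):

```python
import math
import random

def opo_ea(f, n, budget, rng):
    """(1+1)-EA with standard bit mutation and plus-selection."""
    x = [rng.randint(0, 1) for _ in range(n)]                  # 1. random start point
    fx = f(x)
    for t in range(1, budget + 1):
        y = [1 - b if rng.random() < 1 / n else b for b in x]  # 2.-3. flip each bit w.p. 1/n
        fy = f(y)
        if fy >= fx:                                           # 4. selection: keep the fitter
            x, fx = y, fy
        if fx == n:                                            # 5. optimum reached
            return t
    return None

random.seed(0)
n = 100
budget = 10 * math.ceil(math.e * n * (math.log(n) + 1))        # generous vs. e n (ln n + 1)
t = opo_ea(sum, n, budget, random)                             # OneMax is just sum(x)
print(t)
```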

SLIDE 11

Application 2a: (1+1)-EA optimizes OneMax

  • Question: Time to maximize OneMax = f: {0,1}^n → R; x ↦ x1 + … + xn?
  • Analysis:
    – Xt: number of zeroes after iteration t.
    – If Xt-1 = k, then the probability that exactly one of the missing bits is flipped is (1 – 1/n)^(n–1) (1/n) k ≥ (1/e) (k/n).
    – Hence, E(Xt | Xt-1 = k) ≤ (k – 1)(k/(en)) + k(1 – k/(en)) = k (1 – 1/(en))
    – Drift Thm: Expected optimization time at most e n (ln n + 1)

((1+1)-EA pseudocode as on Slide 10.)

SLIDE 12

Application 3: RLS optimizes Linear Functions

  • Question: How long does it take to find the maximum of an arbitrary linear function f: {0,1}^n → R; x ↦ a1x1 + a2x2 + … + anxn (wlog 0 < a1 ≤ a2 ≤ … ≤ an)?
  • Analysis (same as for OneMax):
    – Xt: number of zeroes after iteration t. Trivially, X0 ≤ n
    – If Xt-1 = k, then with probability k/n, we flip a ‘zero’ into a ‘one’ (Xt = k – 1). Otherwise, y is worse than x and thus Xt = k
    – Message: You can use Xt different from “fopt – f(xt)”!
    – Why not Xt = “fopt – f(xt)”?

(RLS pseudocode as on Slide 8.)

Drift Thm: E(T) ≤ (1/δ)(ln X0 + 1), and X0 can be large!

SLIDE 13

Application 3a: (1+1)-EA optimizes Linear Functions

  • Maximize f: {0,1}^n → R; x ↦ a1x1 + a2x2 + … + anxn (wlog 0 < a1 ≤ a2 ≤ … ≤ an)!
  • A classical, difficult problem:
    – Droste, Jansen, Wegener (2002): Expected optimization time E(T) = O(n log n)
    – He, Yao (2001-04): E(T) = O(n log n) via drift analysis
    – Jägersküpper (2008): E(T) ≲ 2.02 e n ln(n) via average drift analysis
    – D., Johannsen, Winzen (2010): e n ln(n) ≲ E(T) ≲ 1.39 e n ln(n)
    – D., Goldberg (2010+): O(n log n) w.h.p. for any c/n mutation probability

((1+1)-EA pseudocode as on Slide 10.)

SLIDE 14

Application 3a: (1+1)-EA optimizes Linear Functions

  • Maximize f: {0,1}^n → R; x ↦ a1x1 + a2x2 + … + anxn (wlog 0 < a1 ≤ a2 ≤ … ≤ an)!
  • Analysis (sketched, from [DJW10]):
    – Xt: x1 + … + x⌊n/2⌋ + (5/4) x⌊n/2+1⌋ + … + (5/4) xn for the x after iteration t
    – Compute: If Xt-1 = k, then E(Xt) ≤ (1 – 0.01/n) k. [less than 1 page]
    – Drift Thm: Optimization time is O(n log n) with high probability.

((1+1)-EA pseudocode as on Slide 10.)

New! [DJW10; previous works: only in expectation]

SLIDE 15

Application 4: (1+1)-EA optimizes MST

  • Minimum Spanning Tree (MST) problem:

  – Input: Undirected connected graph G = (V, E), edge weights (we) in N
  – Task: Compute a connected spanning subgraph T = (V, E’) of G with minimal weight w(T) = ∑e∈E’ we

  • RaSH for combinatorial optimization problems – new aspects

  – How to represent the solutions? E.g. bit-strings, permutations, …
  – What is a good mutation operator for this representation?
  – Possibly: Use a clever fitness function f.

[Figure: example graph with edge weights 1, 1, 6, 7, 4, 5, 8]

SLIDE 16

  • Minimum Spanning Tree (MST) problem:

  – Input: Undirected connected graph G = (V, E), edge weights (we) in N
  – Task: Compute a connected spanning subgraph T = (V, E’) of G with minimal weight w(T) = ∑e∈E’ we

  • Here: Mostly standard

  – Representation: Bit-string x of length m = |E|, xe = 1 iff e ∈ T
  – Mutation: Standard bit mutation (flip each bit w.p. 1/m)
  – Fitness function (to be minimized): w(T) + c_penalty (#components of T – 1)

Application 4: (1+1)-EA optimizes MST

[Figure: the example graph with two chosen subgraphs: a spanning tree with f(T) = 1+4+5+1 = 11, and a disconnected subgraph with f(T) = 1+6+8+5+c_penalty = HUGE]

SLIDE 17

  • Theorem [Neumann, Wegener (2004)]: The expected optimization time of the (1+1) EA searching for an MST is O(m^2 log(m wmax)).
  • Proof: expected weight decrease method.
  • Drift Theorem: The same bound holds w.h.p., with a simpler proof.

Application 4: (1+1)-EA optimizes MST

((1+1)-EA pseudocode as on Slide 10, over bit-strings of length m with mutation probability 1/m; here f(x) = w(T) + c_penalty (#components – 1) is minimized.)


SLIDE 18

  • Analysis (1): After O(m log m) iterations, T is connected w.h.p.:
    – Xt = #components – 1 after iteration t
    – If Xt-1 = k > 0, then there are at least k edges not in T, each of whose addition decreases Xt
    – Hence E(Xt | Xt-1 = k) ≤ (1 – 1/(em)) k as before. Done with the Drift Thm, since X0 ≤ m.

Application 4: (1+1)-EA optimizes MST

((1+1)-EA pseudocode as on Slide 17: minimize f(x) = w(T) + c_penalty (#components – 1).)

[Figure: example subgraph with three components, i.e. Xt = 2]

SLIDE 19

  • Analysis (2): Let T be already connected. Then it stays connected, and after O(m^2 log(m wmax)) iterations, w.h.p. w(T) is minimal.
    – Xt = w(T) – wopt for the T after iteration t
    – If Xt-1 = D > 0, then there are e1, …, ek in T and e’1, …, e’k in E\T such that T’ = T – {e1, …, ek} + {e’1, …, e’k} is an MST. Hence D = ∑i (w(ei) – w(e’i)), and for all i, Ti = T – ei + e’i is a spanning tree with w(Ti) < w(T)
    – With probability ≥ 1/(e m^2), one iteration flips exactly the two edges ei and e’i. These are disjoint events that are “accepted”.
    – E(Xt | Xt-1 = D) ≤ D – ∑i (1/(e m^2)) (w(ei) – w(e’i)) = (1 – 1/(e m^2)) D
    – Done with the drift theorem, since X0 ≤ ∑e∈E w(e) ≤ m wmax.

Application 4: (1+1)-EA optimizes MST

[Figure: example graph; the current tree uses edges of weight 6, 8, 1, 5, and exchange pairs (e1, e’1), (e2, e’2) lead to the MST; Xt = 6+8+1+5 – 11 = 9]
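The whole setup of Slides 15-19 can be exercised end to end on a toy instance. The slides' example graph topology is not fully recoverable, so the 5-vertex graph below is my own construction; its MST weight 11 happens to match the example value f(T) = 1+4+5+1 on Slide 16. Note that here f is minimized, so selection keeps y when f(y) ≤ f(x):

```python
import random

# Hypothetical 5-vertex instance (my own construction, not the slides' graph):
# (u, v, weight).  Its MST is {(0,1), (1,2), (2,3), (3,4)} with weight 11.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 4), (3, 4, 5), (4, 0, 6), (0, 2, 7), (1, 3, 8)]
V = 5
m = len(edges)
C_PENALTY = sum(w for _, _, w in edges) + 1   # exceeds any spanning-subgraph weight

def components(bits):
    """Connected components of the chosen edge set, via union-find."""
    parent = list(range(V))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    comp = V
    for b, (u, v, _) in zip(bits, edges):
        ru, rv = find(u), find(v)
        if b and ru != rv:
            parent[ru] = rv
            comp -= 1
    return comp

def fitness(bits):   # to be MINIMIZED: w(T) + c_penalty * (#components - 1)
    return sum(w for b, (_, _, w) in zip(bits, edges) if b) \
        + C_PENALTY * (components(bits) - 1)

random.seed(0)
x = [random.randint(0, 1) for _ in range(m)]
fx = fitness(x)
for _ in range(30_000):                       # generous vs. O(m^2 log(m w_max))
    y = [1 - b if random.random() < 1 / m else b for b in x]
    fy = fitness(y)
    if fy <= fx:                              # minimization: keep the lighter
        x, fx = y, fy
print(fx)
```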
SLIDE 20

  • Cool Drift Theorem:
  • Works well when “progress is proportional to distance from optimum”
    – Natural: The more loose ends, the easier mutation finds and fixes one.
    – Examples: Coupon collector, linear functions, MST
    – Further examples: single-criterion SSSP, Eulerian cycles, …
  • New: Also gives bounds with high probability (w.h.p.)

Summary Drift Analysis

[Drift theorem as stated on Slide 5]

SLIDE 21

  • Things I didn’t tell you: It’s not always easy to define the Xt in a useful way.
  • Major challenges:
    – A really simple proof of the fact that the (1+1) EA optimizes a linear function in time O(n log n). [watch out for Angelika Steger (ETH Zurich)]
    – Prove or disprove that the (1+1) EA finds an MST in time O(m^2 log(m)) instead of O(m^2 log(m wmax)).
    – Prove or disprove that the (1+1) EA solves the single-criterion formulation of the single-source shortest path problem in time O(n^3 log(n)) instead of O(n^3 log(n wmax)).

Outlook Drift Analysis (1)

SLIDE 22

  • Robustness of our knowledge

– Some proofs fail if the mutation probability is 10/n instead of 1/n [Carola’s talk]

  • Drift theorems for “progress not proportional to distance from optimum”

– Classical “additive” drift theorem: Drift independent from distance – Daniel’s talk: something with integrals

  • Additive drift theorems with “w.h.p.”?

  – In general: no.
  – With restrictions?

Outlook Drift Analysis (2)

SLIDE 23

  • Warm-up: Classical coupon collector analysis

Part II: Random Walk Arguments

[Figure: Markov chain for the coupon collector. States “k coupons” for k = 0 (START), 1, …, n. From state k: stay with probability k/n, move to k+1 coupons with probability (n–k)/n; state “all coupons” is absorbing.]

SLIDE 24

  • Tk: expected time when k-th coupon found

Coupon Collector “Forward”

[Markov chain as on Slide 23, annotated forward:]
  – T0 = 0
  – T1 = 1 (1 round for the first coupon)
  – T2 = 2 + 1/(n–1) (n/(n–1) further rounds)
  – T3 = 3 + 1/(n–1) + 2/(n–2) (n/(n–2) further rounds)

SLIDE 25

  • Tk = expected time to finish, when k coupons found

Coupon Collector “Backwards”

[Markov chain as on Slide 23, annotated backwards:]
  – Tn = 0
  – Tn-1 = n (n rounds)
  – Tn-2 = n(1 + 1/2) (n/2 further rounds)
  – Tn-3 = n(1 + 1/2 + 1/3) (n/3 further rounds)

SLIDE 26

  • Tk = expected time to finish, when k coupons found
  • Assume you have k coupons. Within one round

  – with probability k/n, nothing interesting happens
  – with probability (n–k)/n, you get a new coupon ☺

  • Observation: Tk = 1 + (k/n) Tk + [(n-k)/n] Tk+1

  – Yields Tk = n/(n–k) + Tk+1
  – Thus T0 = T0 – Tn = ∑k=1..n (Tk-1 – Tk) = ∑k=1..n n/(n–k+1) = n(1 + ½ + … + 1/n)

Coupon Collector “grab the middle”

[Diagram: states “k coupons” and “k+1 coupons” with the transition probabilities from Slide 23]

[linear algebra, no thinking]
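The recurrence Tk = n/(n–k) + Tk+1 can be unrolled numerically and compared with n(1 + ½ + … + 1/n) (a quick check; n = 100 is my choice):

```python
n = 100
T = [0.0] * (n + 1)                 # T[k]: expected remaining rounds with k coupons
for k in range(n - 1, -1, -1):
    T[k] = n / (n - k) + T[k + 1]   # the recurrence from the slide

harmonic = sum(1 / i for i in range(1, n + 1))
print(T[0], n * harmonic)           # both ≈ 518.74
```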

SLIDE 27

  • Random walk on a path of length n.

– Random walk: In each round, go to a random neighbor – Question: How long does it take to go from “0” to “n”?

  • Tk = expected time to reach “n”, when standing on “k”

  – Tn = 0
  – Tk = 1 + ½ Tk-1 + ½ Tk+1, for k = 1, …, n–1
  – T0 = 1 + T1

  • View the Tk as variables: n+1 variables, n+1 independent equations.

– The unique solution tells us what we want. [Little exercise ☺]

Same Trick for Random Walks…

[Figure: path with nodes 0, 1, 2, …, n]
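The "little exercise" can be done mechanically: set up the n+1 equations and solve (a sketch with a naive Gauss-Jordan solver; n = 20 is my choice). The unique solution turns out to be Tk = n^2 – k^2, so the walk needs n^2 expected steps from "0".

```python
n = 20
# Unknowns T_0 .. T_{n-1}  (T_n = 0).  Equations from the slide:
#   T_0 - T_1 = 1   and   T_k - T_{k-1}/2 - T_{k+1}/2 = 1   (k = 1, ..., n-1)
A = [[0.0] * n for _ in range(n)]
b = [1.0] * n
A[0][0], A[0][1] = 1.0, -1.0
for k in range(1, n):
    A[k][k] = 1.0
    A[k][k - 1] = -0.5
    if k + 1 < n:
        A[k][k + 1] = -0.5          # T_n = 0 drops out of the last equation

# Naive Gauss-Jordan elimination (fine for this small, well-behaved system)
for i in range(n):
    piv = A[i][i]
    A[i] = [a / piv for a in A[i]]
    b[i] /= piv
    for j in range(n):
        if j != i and A[j][i] != 0.0:
            f = A[j][i]
            A[j] = [a - f * c for a, c in zip(A[j], A[i])]
            b[j] -= f * b[i]

print(b[0])                         # T_0 = n^2 = 400 (up to rounding)
```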

SLIDE 28

An Evolutionary Example

  • Maximize LEADINGONES = f: {0,1}^n → R; x ↦ max{i | 0 ≤ i ≤ n, ∀j ≤ i: xj = 1}, the number of leading ones of x!
  • Observation: Only one “loose end”!
    – No matter whether f(x) = 1 or f(x) = n–17, we have to flip one particular bit to make progress (namely the 2nd resp. the (n–16)th).
    – It is unlikely that drift as before works.

((1+1)-EA pseudocode as on Slide 10.)

SLIDE 29

An Evolutionary Example

  • Maximize LEADINGONES = f: {0,1}^n → R; x ↦ max{i | 0 ≤ i ≤ n, ∀j ≤ i: xj = 1}!
  • Another observation: If my current solution has f(x) = k, then
    – the first k bits are one,
    – the (k+1)st bit is zero,
    – all other bits are 0 or 1 independently with probability ½.
  • In a sense, there are only n+1 different states, described by the f-value!

((1+1)-EA pseudocode as on Slide 10.)

SLIDE 30

An Evolutionary Example

  • State k: k leading ones, then a zero, then n–k–1 random bits.
  • Transitions:
    – Never from k to an i < k (not accepted).
    – From k to some i > k: flip none of bits 1, …, k; flip bit k+1; don’t care about the rest. Probability (1 – 1/n)^k (1/n) =: qk. Details: the rest remains random.
    – From k to k+1 with prob. (1/2) qk [bit k+2 must be zero]
    – From k to k+2 with prob. (1/4) qk [bit k+2 is one, bit k+3 is zero]
    – From k to k+3 with prob. (1/8) qk
    – …
    – From k to k with prob. 1 – qk

SLIDE 31

An Evolutionary Example

  • State k: k leading ones, then a zero, then n-k-1 random bits.
  • Transitions:

  – Never from k to an i < k (not accepted).
  – From k to some i > k with prob. (1 – 1/n)^k (1/n) =: qk; from k to k+i with prob. 2^(–i) qk [cheating for k+i = n, but allowed]
  – From k to k with prob. 1 – qk

[Diagram: from state k, self-loop with prob. 1 – qk; arrows to k+1, k+2, k+3, … with prob. qk/2, qk/4, qk/8, …]

SLIDE 32

The Optimization Time

  • State k: k leading ones, then a zero, then n-k-1 random bits.
  • Equations:

  – Tn = 0
  – Tk = 1 + (1 – qk) Tk + ∑i 2^(–i) qk Tk+i = [some nasty term]
  – Expected optimization time of the EA: T = ∑k=0..n 2^(–k) Tk = n^2 [(1 – 1/n)^(–n+1) – 1 + 1/n] / 2 ≈ ½ (e–1) n^2 ≈ 0.85 n^2

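The "nasty term" need not be written out: the equations can be solved backwards numerically and compared with the closed form on this slide (a sketch; I take the start state k with probability 2^(–(k+1)), and the boundary "cheating" is harmless since Tn = 0):

```python
def expected_time(n):
    """Solve T_k = 1/q_k + sum_{i>k} 2^{-(i-k)} T_i backwards, then average
    over the random start state (state k with probability 2^{-(k+1)})."""
    q = [(1 - 1 / n) ** k / n for k in range(n)]
    T = [0.0] * (n + 1)                       # T[n] = 0
    for k in range(n - 1, -1, -1):
        T[k] = 1 / q[k] + sum(2.0 ** (k - i) * T[i] for i in range(k + 1, n))
    return sum(2.0 ** -(k + 1) * T[k] for k in range(n))

n = 50
closed = n * n * ((1 - 1 / n) ** (-n + 1) - 1 + 1 / n) / 2
print(expected_time(n), closed)               # the two agree; ≈ 0.86 n^2
```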

SLIDE 33

The Right Mutation Probability

  • Usually “flip each bit with probability 1/n”. Why?

– Do the same with general mutation probability p

  • Equations:

  – Tn = 0
  – Tk = 1 + (1 – qk) Tk + ∑i 2^(–i) qk Tk+i = [nasty term depending on qk = p(1–p)^k]
  – Expected optimization time of the EA: T = ∑k=0..n 2^(–k) Tk = p^(–2) [(1 – p)^(–n+1) – 1 + p] / 2 ≈ 0.77 n^2 for p ≈ 1.59/n

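The closed form can be minimized numerically over p (a sketch; n = 1000 and the grid of candidate constants c with p = c/n are my choices):

```python
n = 1000

def expected_time(p):
    # Closed form from the slide: p^{-2} [(1 - p)^{-n+1} - 1 + p] / 2
    return ((1 - p) ** (-n + 1) - 1 + p) / (2 * p * p)

# Grid search over static mutation probabilities p = c/n
best_c = min((c / 1000 for c in range(500, 3001)),
             key=lambda c: expected_time(c / n))
print(best_c)                                  # ≈ 1.59, i.e. p ≈ 1.59/n
```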

SLIDE 34

Fitness Dependent Mutation Probability

  • Artificial Immune Systems: Mutation prob. p may depend on the fitness.

– Here: pk when we have exactly k leading ones.

  • Equations:

  – Tn = 0
  – Tk = 1 + (1 – qk) Tk + ∑i 2^(–i) qk Tk+i = [nasty term, depends on qk = pk (1–pk)^k]
  – Looks complicated, but is easy. Expected optimization time of the AIS: T = ½ ∑k=0..n–1 (1/qk) [easy to find the optimal pk]
  – T = ¼ e n^2 for pk = 1/(k+1)

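With pk = 1/(k+1), the sum T = ½ ∑k 1/qk can be evaluated directly and compared against ¼ e n^2 (a quick check; n = 500 is my choice):

```python
import math

n = 500
# q_k = p_k (1 - p_k)^k with the fitness-dependent choice p_k = 1/(k+1)
q = [(1 / (k + 1)) * (k / (k + 1)) ** k for k in range(n)]
T = 0.5 * sum(1 / qk for qk in q)              # slide: T = 1/2 sum_k 1/q_k
ratio = T / (math.e * n * n / 4)
print(ratio)                                   # ≈ 1.0: T ≈ (e/4) n^2
```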

SLIDE 35

Summary Random Walk Thinking

  • Technique:

  – Identify reasonable “states”
  – Compute the transition probabilities
  – Set up linear equations
  – Solve!

  • Results for f = LEADINGONES:

  – Expected optimization time with mutation prob. 1/n: ½ (e–1) n^2 ≈ 0.85 n^2
  – Optimal static mutation probability ≈ 1.59/n yields ≈ 0.77 n^2
  – AIS with the (quite natural) mutation probability 1/f(x) yields ¼ e n^2 ≈ 0.68 n^2
  – All this is Süntje Böttcher’s Master’s thesis [co-supervised with Frank Neumann]

Merci beaucoup!