Runtime Analysis of Convex Evolutionary Search



SLIDE 1

Runtime Analysis of Convex Evolutionary Search

Alberto Moraglio & Dirk Sudholt

University of Birmingham & University of Sheffield

SLIDE 2

Research Goal

  • Aim: identify matches between “topographic features” of fitness landscapes and “behavioural features” of evolutionary algorithms that alone explain/lead to good performance
  • Features: general, representation-independent
  • Performance: optimisation in polynomial time
  • Potential Benefits:
    – understanding the fundamental causes of good performance
    – general run-time analysis results for a class of algorithms on a class of landscapes

SLIDE 3

Abstract Convex Evolutionary Search

SLIDE 4

Example of Geometric Crossover

  • Geometric crossover: offspring are in the segment between parents.

[Figure: parent strings A and B with offspring X on the Hamming segment between them; any such offspring satisfies H(A,X) + H(X,B) = H(A,B).]
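As an illustration (not from the slides; the helper names are mine), the following Python sketch draws offspring with standard uniform crossover, which is geometric under the Hamming distance, and checks the segment property H(A,X) + H(X,B) = H(A,B):

```python
import random

def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def uniform_crossover(a, b):
    """Uniform crossover: each bit is copied from one parent chosen at random."""
    return tuple(random.choice(pair) for pair in zip(a, b))

random.seed(0)
n = 10
A = tuple(random.randint(0, 1) for _ in range(n))
B = tuple(random.randint(0, 1) for _ in range(n))

# Every offspring of a geometric crossover lies on the segment [A, B]:
# H(A, X) + H(X, B) == H(A, B).
for _ in range(1000):
    X = uniform_crossover(A, B)
    assert hamming(A, X) + hamming(X, B) == hamming(A, B)
print("all offspring lie on the Hamming segment between A and B")
```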

SLIDE 5

Abstract Convex Evolutionary Search

The convexity of the search (each offspring population lies in the convex hull of its parent population) holds across representations for any EA with geometric crossover & selection


SLIDE 6

Abstract Concave Landscape

  • NFL: averaged over all fitness landscapes, convex search performs as random search. On what landscapes does it work better than random search?

  • Rephrased: what topographic feature of the landscape is a good match for the convex behavioural feature of the search?

  • Intuition says: (approximately) concave landscapes
SLIDE 7

Concave Fitness Landscapes

Concave landscapes can be defined in a representation-independent way


SLIDE 8

Generalised Concave Landscapes

– Traditional notion does generalise to combinatorial spaces (but caution needed!)
– Average concave landscapes: for all x, y and z ~ Unif([x,y]): E[f(z)] >= (f(x)+f(y))/2; e.g., OneMax is average affine
– Quasi-concave landscapes: for all x, y and z in [x,y]: f(z) >= min(f(x), f(y)); e.g., LeadingOnes is quasi-concave
– Adding an ε-bounded perturbation function, we obtain approximately concave landscapes: E[f(z)] >= (f(x)+f(y))/2 − ε and f(z) >= min(f(x), f(y)) − ε
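A small brute-force Python check (my own illustration; the landscape and helper names are not from the slides) of these definitions on short bit strings: OneMax satisfies the average-affine equality and LeadingOnes the quasi-concavity inequality.

```python
from itertools import product

def onemax(x):
    return sum(x)

def leadingones(x):
    lo = 0
    for bit in x:
        if bit != 1:
            break
        lo += 1
    return lo

def segment(x, y):
    """Hamming segment [x, y]: points agreeing with x and y where they agree,
    taking either value where they differ (listed with equal multiplicity)."""
    return [tuple(a if c == 0 else b for a, b, c in zip(x, y, mask))
            for mask in product((0, 1), repeat=len(x))]

n = 5
points = list(product((0, 1), repeat=n))
for x in points:
    for y in points:
        seg = segment(x, y)
        # OneMax is average affine: E[f(z)] over uniform z in [x,y] equals (f(x)+f(y))/2.
        assert sum(onemax(z) for z in seg) / len(seg) == (onemax(x) + onemax(y)) / 2
        # LeadingOnes is quasi-concave: every z in [x,y] has f(z) >= min(f(x), f(y)).
        assert all(leadingones(z) >= min(leadingones(x), leadingones(y)) for z in seg)
print("OneMax is average affine and LeadingOnes is quasi-concave for n =", n)
```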

SLIDE 9

Theorem [FOGA 2011]

  • On (average/quasi) concave landscapes, convex evolutionary search produces steady improvements: the fitness of the next population is never less than the (average/worst) fitness of the current population (even without selection).

  • This result degrades gracefully as the landscape becomes less concave (for increasing ε).

  • This is a one-step result: it does not imply convergence nor good performance.


SLIDE 10

Research question

  • Is a general run-time analysis of evolutionary algorithms on concave landscapes across representations possible?
  • Does convex search on concave landscapes have exponentially better run-time than random search?
  • Refinement needed:
    – Algorithm
    – Landscape
    – Performance

SLIDE 11

Algorithm, Landscape & Performance

SLIDE 12

Abstract Convex Search Algorithm

  • Initialise Population Uniformly at Random
  • Until Population has converged to the same individual:
    – Rank individuals on fitness
    – If there are at least two fitness values in the population, remove all individuals with the worst fitness
    – Apply k times Convex Hull Uniform Recombination to the remaining individuals to create the next population
  • Return individual in the last population
  • Parameter: population size k

This algorithm is formally well-defined for any metric and representation.
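A minimal Python sketch of this loop (my own illustration, not the authors' code). The recombination operator is passed in as a function; the binary convex hull version is given on the next slide.

```python
import random

def convex_search(fitness, n, k, recombine, rng=random.Random(0)):
    """Abstract convex search on bit strings of length n with population size k."""
    # Initialise the population uniformly at random.
    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(k)]
    while len(set(pop)) > 1:  # until the population has converged to one individual
        # Rank on fitness; if more than one fitness value is present,
        # remove all individuals with the worst fitness.
        fits = [fitness(x) for x in pop]
        worst = min(fits)
        if len(set(fits)) > 1:
            pop = [x for x, f in zip(pop, fits) if f > worst]
        # Create the next population with k applications of the recombination operator.
        pop = [recombine(pop, rng) for _ in range(k)]
    return pop[0]
```

With the convex hull uniform recombination sketched on the next slide and a sufficiently large k, a run on a poly quasi-concave landscape such as LeadingOnes is expected (per the runtime sketch on slide 22) to finish in about q generations.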

SLIDE 13

Binary Convex Hull Recombination

  • The specific convexity on binary strings can be obtained by plugging the Hamming distance into the general notions of Abstract Convexity
  • Convex Sets = Schemata
  • Convex Hull = Smallest Schema matching a set of Binary Strings

  • Uniform Convex Hull Recombination works position-wise:
    – If all parents have 1 (or 0), the offspring has 1 (or 0, respectively)
    – If there is at least one 1 and at least one 0 among the parents, the offspring has 1 or 0 with probability 0.5
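A Python sketch of this operator (mine; the function name is hypothetical), which samples an offspring uniformly from the smallest schema matching the parents:

```python
import random

def convex_hull_uniform_recombination(parents, rng=random.Random()):
    """Sample one offspring uniformly from the convex hull (smallest matching schema)
    of a list of equal-length parent bit strings."""
    offspring = []
    for bits in zip(*parents):  # one position across all parents
        if all(b == 1 for b in bits):
            offspring.append(1)                  # all parents have 1: offspring has 1
        elif all(b == 0 for b in bits):
            offspring.append(0)                  # all parents have 0: offspring has 0
        else:
            offspring.append(rng.randint(0, 1))  # mixed position: 1 or 0 with prob. 0.5
    return tuple(offspring)
```

This signature matches the recombine argument of the loop sketched after slide 12, e.g. convex_search(leadingones, n=20, k=100, recombine=convex_hull_uniform_recombination).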

SLIDE 14

Abstract Quasi-Concave Landscape (Properties)

  • A landscape f is quasi-concave iff for all x, y and z in [x,y]: f(z) >= min(f(x), f(y))
  • If f is quasi-concave, then for all {x_i} and z in co({x_i}): f(z) >= min{f(x_i)}
  • Level set L_a: {all x in S: f(x) >= a}
  • A landscape f is quasi-concave iff all level sets are convex sets
  • A landscape f is quasi-concave iff it is a “Tower of Hanoi” of nested convex sets
SLIDE 15

Polynomial Quasi-Concave Landscape

  • All fitness levels are convex sets
  • The number q of fitness levels is polynomial in n (problem size, n = log(|S|))
  • The ratio between the areas of successive fitness levels, |FL(i+1)|/|FL(i)|, is 1/poly(n)
  • Parameters: q and r = min(|FL(i+1)|/|FL(i)|)
  • Example: LeadingOnes is a poly QC landscape; Needle is a QC landscape but not poly QC
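For example, a brute-force Python check (mine, not from the slides) of these parameters for LeadingOnes on small n:

```python
from itertools import product

def leadingones(x):
    lo = 0
    for bit in x:
        if bit != 1:
            break
        lo += 1
    return lo

n = 8
points = list(product((0, 1), repeat=n))
values = sorted(set(leadingones(x) for x in points))  # fitness values 0..n
q = len(values) - 1                                   # number of improving steps: q = n
# Size of each fitness level FL(i) = {x : f(x) = value_i} and successive ratios.
fl_sizes = [sum(1 for x in points if leadingones(x) == a) for a in values]
ratios = [fl_sizes[i + 1] / fl_sizes[i] for i in range(q)]
r = min(ratios)
print(f"q = {q}, r = {r}")  # expected: q = n = 8, r = 0.5
```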

SLIDE 16

Performance

  • The algorithm does not converge to the optimum in all runs

  • We are interested in:
    – An upper bound on the runtime when it converges (RT)
    – A lower bound on the probability of convergence (PC)
  • Multi-restart version:
    – Repeat convex search until the optimum is first met
    – Expected run-time: RT/PC

  • Performance as a function of: n, k, q, r, and of the underlying space (S,d)

SLIDE 17

Pure Adaptive Search

SLIDE 18

Pure Adaptive Search

  • Pick an initial point X_0 uniformly at random.
  • Generate X_(i+1) uniformly at random on the level set S_i = {x: x in S and f(x) >= f(X_i)} (improving set).
  • If the optimum is found, stop. Otherwise repeat from the previous step.
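An illustrative (and deliberately naive) Python sketch of PAS on a bit-string landscape, sampling the improving set by explicit enumeration, which is only feasible for tiny n; the names are mine.

```python
import random
from itertools import product

def leadingones(x):
    lo = 0
    for bit in x:
        if bit != 1:
            break
        lo += 1
    return lo

def pure_adaptive_search(fitness, n, rng=random.Random(0)):
    """Pure Adaptive Search: each step samples uniformly from the improving level set."""
    space = list(product((0, 1), repeat=n))
    optimum = max(fitness(x) for x in space)
    x = rng.choice(space)  # initial point uniformly at random
    samples = 1
    while fitness(x) < optimum:
        level_set = [y for y in space if fitness(y) >= fitness(x)]  # S_i (improving set)
        x = rng.choice(level_set)
        samples += 1
    return x, samples

x, samples = pure_adaptive_search(leadingones, n=12)
print(f"optimum {x} reached after {samples} samples")
```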

SLIDE 19

PAS remarks

  • Studied since the 1980s in the field of Global Optimisation (mostly on continuous domains).
  • It is an ideal algorithm, in general not implementable efficiently.
  • Like PRS, the performance of PAS does not depend on the structure of S but only on the distribution of f.
  • On almost all functions it is exponentially better than Pure Random Search.
  • The result above also holds for relaxations of PAS that are closer to implementable algorithms, e.g., Hesitant Adaptive Search.

SLIDE 20

PRS vs. PAS (on poly QC landscapes)

  • L_0 ⊇ L_1 ⊇ … ⊇ L_q (nested level sets)
  • The shape of the level sets does not matter
  • HittingProb(PRS) = Pr(L_0) * Pr(L_1|L_0) * … * Pr(L_q|L_(q-1)) = r^q
  • r = 1/poly(n), q = poly(n), so r^q = 1/exp(n) and RT(PRS) = 1/r^q = exp(n)
  • RT(PAS) = 1/Pr(L_1|L_0) + … + 1/Pr(L_q|L_(q-1)) <= q * 1/r = poly(n) * poly(n) = poly(n)
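To make the gap concrete, a quick numeric illustration (my own, using the LeadingOnes parameters q = n and r = 1/2 from the later slides):

```python
# Expected sample counts under the sketch above, for LeadingOnes-like parameters.
for n in (10, 20, 40):
    q, r = n, 0.5
    rt_prs = 1 / r**q      # Pure Random Search: about 1/r^q samples, exponential in n
    rt_pas = q * (1 / r)   # Pure Adaptive Search: about q/r samples, polynomial in n
    print(f"n={n:3d}  RT(PRS) ~ {rt_prs:.0f}   RT(PAS) ~ {rt_pas:.0f}")
```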
SLIDE 21

Runtime of Convex Search (Sketch)

SLIDE 22

RT of Convex Search

  • On poly QC landscapes
  • Initial population: k points uniformly at random on S (= L_0)
  • Effect of selection: k*r points uniformly at random on L_1
  • Effect of recombination: when co(sel(Pop)) = L_1 (i.e., the convex hull of the k*r points sampled at random in L_1 covers L_1), the k offspring are uniformly at random on L_1
  • And so forth
  • The worst individual in the population conquers a new fitness level at each iteration, because selection increases the fitness by one level and recombination on concave landscapes keeps the minimum fitness of the parents. So RT = q * k.

SLIDE 23

Success Probability

  • Each iteration can be seen as an iteration of PAS: k points are sampled uniformly at random in the improving set (w.r.t. the worst individual in the population)
  • We assumed that in a typical run:
    – 1. (at least) the expected number of points (k*r) are selected
    – 2. the convex hull of the selected points sampled at random in L_i covers L_i
  • For continuous spaces event 2 has probability 0. For combinatorial spaces this event has positive (and decent) probability.
  • E.g., for the Hamming space the worst-case probability of covering any convex set by m points sampled uniformly at random is the covering probability for the entire Hamming space: this happens when, in each dimension, the m binary strings have at least one zero and at least one one (see the sketch below).
  • Success probability: the probability that events 1 and 2 occur at each generation (q times).
  • For a population size k large enough, we get a good probability of success (e.g., > 0.5), so in expectation fewer than 2 restarts are needed to reach the optimum.
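A small Python illustration (mine) of this covering probability for the entire Hamming space: a position is covered if the m sampled strings contain both a zero and a one there, so under this reading the probability is (1 − 2·2^(−m))^n, which the simulation below matches.

```python
import random

def covering_probability_estimate(n, m, trials=10000, rng=random.Random(0)):
    """Estimate Pr[m uniform n-bit strings span the whole Hamming space],
    i.e., every position sees at least one 0 and at least one 1."""
    covered = 0
    for _ in range(trials):
        strings = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]
        if all(0 < sum(col) < m for col in zip(*strings)):
            covered += 1
    return covered / trials

n, m = 20, 10
exact = (1 - 2 * 0.5 ** m) ** n  # closed form: (1 - 2^-(m-1))^n
print(f"exact = {exact:.4f}, simulated = {covering_probability_estimate(n, m):.4f}")
```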

SLIDE 24

Result Specialisation

  • The only space-dependent parameter in this reasoning is the covering probability (probability of event 2)
  • We derived an expression of the success probability as a function of the covering probability, valid for any space
  • We can determine the population size k for the pair quasi-concave landscape & convex search when specialised to a new space as soon as we know the covering probability for that space

SLIDE 25

Result Specialisation

  • Boolean space & Hamming distance:
    – with k ≥ 4 log(2(q + 2)n)/r, Num Gen = 2q
    – on polynomial QC landscapes q and 1/r are poly(n), so k and Num Gen are poly(n)
    – for LeadingOnes: q = n, r = 1/2, so RT = n log n (better than any unary unbiased black-box algorithm)
  • Integer vectors with Hamming distance and Manhattan distance (as a function of the cardinality of the alphabet)
    – Easy to determine the covering probability for product spaces
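As a sanity check of the population-size bound (my own arithmetic; I assume the logarithm is natural, which the slide does not specify):

```python
import math

def population_size(q, r, n):
    """Population size from the slide's bound k >= 4*log(2*(q+2)*n)/r (log assumed natural)."""
    return math.ceil(4 * math.log(2 * (q + 2) * n) / r)

# LeadingOnes: q = n fitness levels, r = 1/2.
for n in (100, 1000):
    q, r = n, 0.5
    k = population_size(q, r, n)
    print(f"n={n}: k >= {k}, generations = 2q = {2 * q}, evaluations ~ {k * 2 * q}")
```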

SLIDE 26

Conclusions

SLIDE 27

Summary

  • Put together a geometric theory of representations and runtime analysis
  • Identified a class of landscapes (quasi-concave landscapes) on which a simplified EA (convex search) performs well across representations
  • Link with PAS: elicited the reason for the exponential speed-up of convex search over PRS on poly QC landscapes
  • Suggested a more systematic approach to runtime analysis to obtain results that hold for classes of search algorithms on classes of problems
  • Representation-independent: run-time of a rep-independent algorithm on a rep-independent class of landscapes
  • Specialisation: we only need the covering probability for a new space
  • Corollary: n log n runtime for convex search on LeadingOnes
SLIDE 28

Future work

  • Analysis for other spaces (calculation of probability of success):
    – Permutations
    – Continuous
    – A systematic approach (convexity on graphs)
  • Analysis with other operators (less simplified algorithm):
    – Standard truncation selection
    – Standard uniform crossover
    – Add mutation
  • Analysis of other landscapes (broader scope of landscape/robustness):
    – ε-approximated poly concavity
    – Average concavity (encompassing OneMax)
    – What is the most general poly landscape class for a given search algorithm?
  • Analysis of non-toy combinatorial problems (do real problems fit this framework?):
    – There is some hope: big valley hypothesis
    – How can they be shown analytically to fit the class of concave landscapes?
    – For NP-Hard problems: approximation, parameterized complexity