Black Box Search By Unbiased Variation
Per Kristian Lehre and Carsten Witt
CERCIA, University of Birmingham, UK / DTU Informatics, Copenhagen, Denmark
ThRaSH, March 24th 2010
State of the Art in Runtime Analysis of RSHs
OneMax
  (1+1) EA   O(n log n)               [Mühlenbein, 1992]
  (1+λ) EA   O(λn + n log n)          [Jansen et al., 2005]
  (µ+1) EA   O(µn + n log n)          [Witt, 2006]
  1-ANT      O(n^2) w.h.p.            [Neumann and Witt, 2006]
  (µ+1) IA   O(µn + n log n)          [Zarges, 2009]

Linear Functions
  (1+1) EA   Θ(n log n)               [Droste et al., 2002], [He and Yao, 2003]
  cGA        Θ(n^{2+ε}), ε > 0 const. [Droste, 2006]
Max. Matching
  (1+1) EA   e^{Ω(n)}, PRAS           [Giel and Wegener, 2003]

Sorting
  (1+1) EA   Θ(n^2 log n)             [Scharnow et al., 2002]

SS Shortest Path
  (1+1) EA      O(n^3 log(n w_max))   [Baswana et al., 2009]
  MO (1+1) EA   O(n^3)                [Scharnow et al., 2002]

MST
  (1+1) EA   Θ(m^2 log(n w_max))      [Neumann and Wegener, 2007]
  (1+λ) EA   O(nλ log(n w_max)), λ = ⌈m^2/n⌉  [Neumann and Wegener, 2007]
  1-ANT      O(mn log(n w_max))       [Neumann and Witt, 2008]
Max. Clique (rand. planar)
  (1+1) EA      Θ(n^5)                [Storch, 2006]
  (16n+1) RLS   Θ(n^{5/3})            [Storch, 2006]

Eulerian Cycle
  (1+1) EA   Θ(m^2 log m)             [Doerr et al., 2007]

Partition
  (1+1) EA   PRAS, avg.               [Witt, 2005]

Vertex Cover
  (1+1) EA   e^{Ω(n)}, arb. bad approx.  [Friedrich et al., 2007], [Oliveto et al., 2007a]

Set Cover
  (1+1) EA   e^{Ω(n)}, arb. bad approx.  [Friedrich et al., 2007]
  SEMO       Pol. O(log n)-approx.    [Friedrich et al., 2007]

Intersection of p ≥ 3 matroids
  (1+1) EA   1/p-approximation in O(|E|^{p+2} log(|E| w_max))  [Reichel and Skutella, 2008]

UIO/FSM conf.
  (1+1) EA   e^{Ω(n)}                 [Lehre and Yao, 2007]
See survey [Oliveto et al., 2007b].
Motivation - A Theory of Randomised Search Heuristics
Computational Complexity
◮ Classification of problems according to inherent difficulty.
◮ Common limits on the efficiency of all algorithms.
◮ Assuming a particular model of computation.
Computational Complexity of Search Problems
◮ Polynomial-time Local Search [Johnson et al., 1988].
◮ Black-Box Complexity [Droste et al., 2006].
Black Box Complexity

Function class F
(Photo: E. Gerhard, 1846.)

The algorithm A queries search points x1, x2, x3, ..., xt and receives the fitness values f(x1), f(x2), f(x3), ..., f(xt) from an unknown f ∈ F.

◮ Black box complexity on function class F:

T_F := min_A max_{f∈F} T_{A,f}

[Droste et al., 2006]
Results with old Model
◮ Very general model with few restrictions on resources.
◮ Example: Needle has BB complexity (2^n + 1)/2.
◮ Some NP-hard problems have polynomial BB complexity.
◮ Artificially low BB complexity on example functions, e.g.
  ◮ n/log(2n + 1) − 1 on OneMax
  ◮ n/2 − o(n) on LeadingOnes
Refined Black Box Model

Function class F

The algorithm A observes only the fitness values f(x0), f(x1), ..., f(x6); each new search point x(t) is produced by applying a variation operator to one previously queried point (in the animation, the chosen parent indices are 0, 0, 2, 3, 0, 2 for x1, ..., x6).

◮ Unbiased black box complexity on function class F:

T_F := min_A max_{f∈F} T_{A,f}
Unbiased Variation Operators¹

Encoding of solution by bitstring x = x1x2x3x4x5.

(Animation: an unbiased operator may permute the bit positions, e.g. x3x1x2x4x5, without changing the encoded solution; flipping a bit toggles an element in or out of the solution, e.g. x2 = 1 ⇒ blue in, and flipping x4 toggles orange in/out.)

¹Figure by Dake, available under a Creative Commons Attribution-Share Alike 2.5 Generic license.
Unbiased Variation Operators p(y | x)

For any bitstrings x, y, z and permutation σ, we require
1) p(y | x) = p(y ⊕ z | x ⊕ z)
2) p(y | x) = p(y_{σ(1)} y_{σ(2)} ··· y_{σ(n)} | x_{σ(1)} x_{σ(2)} ··· x_{σ(n)})

→ We consider unary operators, but higher arities are possible.
[Droste and Wiesmann, 2000, Rowe et al., 2007]
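As a concrete illustration (not spelled out on the slides), the standard bit mutation of the (1+1) EA is a unary unbiased operator: the probability of producing y from x depends only on the Hamming distance between them, so both conditions hold. A minimal Python sketch:

```python
import random

def standard_bit_mutation(x, p=None):
    """Flip each bit of x independently with probability p (default 1/n).

    The induced distribution p(y | x) depends only on the Hamming
    distance between x and y, so it satisfies both unbiasedness
    conditions: XOR-invariance 1) and permutation-invariance 2).
    """
    n = len(x)
    if p is None:
        p = 1.0 / n
    return [bit ^ (random.random() < p) for bit in x]
```

By contrast, an operator that flips only the first bit would violate condition 2), since it treats bit positions asymmetrically.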
Unbiased Variation Operators

Conditions 1) and 2) together imply Hamming-invariance: p(y | x) depends only on the Hamming distance between x and y.
Unbiased Black-Box Algorithm Scheme
1: t ← 0. 2: Choose x(t) uniformly at random from {0, 1}n. 3: repeat 4:
t ← t + 1.
5:
Compute f(x(t − 1)).
6:
I(t) ← (f(x(0)), ..., f(x(t − 1))).
7:
Depending on I(t), choose a prob. distr. ps on {0, ..., t − 1}.
8:
Randomly choose an index j according to ps.
9:
Depending on I(t), choose an unbiased variation op. pv(· | x(j)).
10:
Randomly choose a bitstring x(t) according to pv.
11: until termination condition met.
→ (µ + , λ) EA, simulated annealing, metropolis, RLS, any population size, any selection mechanism, steady state EAs, cellular EAs, ranked based mutation ...
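The scheme above can be sketched in Python; `choose_index`, `choose_operator`, and `done` are illustrative caller-supplied stand-ins for p_s, p_v, and the termination condition, not names from the slides:

```python
import random

def unbiased_black_box(f, n, choose_index, choose_operator, done, max_t=10**5):
    """Run the unbiased black-box scheme: x(0) is uniform at random;
    each later x(t) is produced by a variation operator applied to a
    previously queried point, chosen only from the fitness history I(t)."""
    xs = [[random.randint(0, 1) for _ in range(n)]]   # line 2: x(0) uniform
    history = []                                      # I(t): fitness values only
    for t in range(1, max_t + 1):
        history.append(f(xs[t - 1]))                  # lines 5-6
        j = choose_index(history)                     # lines 7-8: j ~ p_s
        mutate = choose_operator(history)             # line 9: unbiased op p_v
        xs.append(mutate(xs[j]))                      # line 10: x(t) ~ p_v(. | x(j))
        if done(history):                             # line 11
            return t
    return max_t
```

Choosing the index of the best point seen so far and standard bit mutation as the operator recovers the (1+1) EA; other choices of p_s and p_v yield the algorithms listed above.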
Simple Unimodal Functions

Algorithm    LeadingOnes
(1+1) EA     Θ(n^2)
(1+λ) EA     Θ(n^2 + λn)
(µ+1) EA     Θ(n^2 + µn log n)
BB           Ω(n)

Theorem
The expected runtime of any black box algorithm with unary, unbiased variation on LeadingOnes is Ω(n^2).

Proof idea
◮ Potential between n/2 and 3n/4.
◮ # 0-bits flipped hypergeometrically distributed.
◮ Lower bound by polynomial drift.
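For reference, LeadingOnes counts the consecutive 1-bits at the start of the bitstring; the slides assume this standard definition without spelling it out:

```python
def leading_ones(x):
    """LeadingOnes(x) = number of leading 1-bits of x; the value is
    maximal (= n) only at the all-ones string, so the function is
    unimodal."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count
```

Note that bits after the first 0 never affect the fitness, which is what makes progress slow: information about them only arrives one leading block at a time.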
Escaping from Local Optima

(Figure: plot of Jump(x) against the number of 1-bits |x|, with a gap of length m before the optimum.)
Theorem
For any m ≤ n(1 − ε)/2 with 0 < ε < 1, the expected runtime of any black box algorithm with unary, unbiased variation is at least
◮ 2^{cm} with probability 1 − 2^{−Ω(m)}, and
◮ (n/(rm))^{cm} with probability 1 − 2^{−Ω(m ln(n/(rm)))}.

→ These bounds are lower than the Θ(n^m) bound for the (1+1) EA!
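The slides show Jump only as a plot; the standard definition from the runtime-analysis literature [Droste et al., 2002], under which the plot's gap of length m sits just below the optimum, can be sketched as:

```python
def jump(x, m):
    """Jump_m(x): fitness rises with the number of 1-bits up to n - m,
    then drops into a gap of length m; the unique optimum is the
    all-ones string with fitness n + m (standard definition, assumed
    here since the slide gives only the plot)."""
    n = len(x)
    ones = sum(x)
    if ones <= n - m or ones == n:
        return m + ones
    return n - ones
```

Inside the gap the fitness decreases towards the optimum, so an elitist hill-climber must flip m specific bits in one step to escape.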
Proof idea
◮ Simplified drift in gaps:
  1. Expectation of hypergeometric distribution.
  2. Chvátal's bound.
General Pseudo-boolean Functions

Algorithm    OneMax
(1+1) EA     Θ(n log n)
(1+λ) EA     O(λn + n log n)
(µ+1) EA     O(µn + n log n)
BB           Ω(n/log n)

Theorem
The expected runtime of any black box search algorithm with unbiased, unary variation on any pseudo-boolean function with a single global optimum is Ω(n log n).

Proof idea
◮ Expected multiplicative weight decrease.
◮ Chvátal's bound.
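The Ω(n log n) lower bound is matched by the (1+1) EA on OneMax; a rough empirical sketch (not part of the slides) compares its query count against n ln n:

```python
import math
import random

def one_max_queries(n, rng):
    """Fitness evaluations used by a (1+1) EA with standard bit
    mutation to maximise OneMax (illustrative sketch)."""
    x = [rng.randint(0, 1) for _ in range(n)]
    queries = 1
    while sum(x) < n:
        y = [b ^ (rng.random() < 1.0 / n) for b in x]   # unbiased mutation
        queries += 1
        if sum(y) >= sum(x):                            # elitist selection
            x = y
    return queries

if __name__ == "__main__":
    rng = random.Random(42)
    n = 64
    runs = [one_max_queries(n, rng) for _ in range(20)]
    print(f"n={n}: avg queries {sum(runs) / len(runs):.0f}, "
          f"n ln n = {n * math.log(n):.0f}")
```

The averages should be of the same order as n ln n, consistent with the Θ(n log n) row in the table above.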
Summary and Conclusion
◮ Refined black box model.
◮ Proofs are (relatively) easy!
◮ Comprises EAs never previously analysed.
◮ Ω(n log n) on general functions.
◮ Some bounds coincide with the runtime of the (1+1) EA.
◮ Future work: k-ary variation operators for k > 1.
References I
Baswana, S., Biswas, S., Doerr, B., Friedrich, T., Kurur, P. P., and Neumann, F. (2009). Computing single source shortest paths using single-objective fitness. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 59–66, New York, NY, USA. ACM.
Doerr, B., Klein, C., and Storch, T. (2007). Faster evolutionary algorithms by superior graph representation. In Proceedings of the 1st IEEE Symposium on Foundations of Computational Intelligence (FOCI'2007), pages 245–250.

Droste, S. (2006). A rigorous analysis of the compact genetic algorithm for linear functions. Natural Computing, 5(3):257–283.

Droste, S., Jansen, T., and Wegener, I. (2002). On the analysis of the (1+1) Evolutionary Algorithm. Theoretical Computer Science, 276:51–81.

Droste, S., Jansen, T., and Wegener, I. (2006). Upper and lower bounds for randomized search heuristics in black-box optimization. Theory of Computing Systems, 39(4):525–544.
References II
Droste, S. and Wiesmann, D. (2000). Metric based evolutionary algorithms. In Genetic Programming, European Conference, Edinburgh, Scotland, UK, April 15-16, 2000, Proceedings, volume 1802 of Lecture Notes in Computer Science, pages 29–43. Springer.

Friedrich, T., Hebbinghaus, N., Neumann, F., He, J., and Witt, C. (2007). Approximating covering problems by randomized search heuristics using multi-objective models. In Proceedings of the 9th annual conference on Genetic and evolutionary computation (GECCO'2007), pages 797–804, New York, NY, USA. ACM Press.

Giel, O. and Wegener, I. (2003). Evolutionary algorithms and the maximum matching problem. In Proceedings of the 20th Annual Symposium on Theoretical Aspects of Computer Science (STACS 2003), pages 415–426.

He, J. and Yao, X. (2003). Towards an analytic framework for analysing the computation time of evolutionary algorithms. Artificial Intelligence, 145(1-2):59–97.

Jansen, T., Jong, K. A. D., and Wegener, I. (2005). On the choice of the offspring population size in evolutionary algorithms. Evolutionary Computation, 13(4):413–440.
References III
Johnson, D. S., Papadimitriou, C. H., and Yannakakis, M. (1988). How easy is local search? Journal of Computer and System Sciences, 37(1):79–100.

Lehre, P. K. and Yao, X. (2007). Runtime analysis of (1+1) EA on computing unique input output sequences. In Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC'2007), pages 1882–1889. IEEE Press.

Mühlenbein, H. (1992). How genetic algorithms really work I. Mutation and Hillclimbing. In Proceedings of the Parallel Problem Solving from Nature 2, (PPSN-II), pages 15–26. Elsevier.

Neumann, F. and Wegener, I. (2007). Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, 378(1):32–40.

Neumann, F. and Witt, C. (2006). Runtime analysis of a simple ant colony optimization algorithm. In Proceedings of The 17th International Symposium on Algorithms and Computation (ISAAC'2006), number 4288 in LNCS, pages 618–627.
References IV
Neumann, F. and Witt, C. (2008). Ant colony optimization and the minimum spanning tree problem. In Proceedings of Learning and Intelligent Optimization (LION'2008), pages 153–166.

Oliveto, P. S., He, J., and Yao, X. (2007a). Evolutionary algorithms and the vertex cover problem. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC'2007).

Oliveto, P. S., He, J., and Yao, X. (2007b). Time complexity of evolutionary algorithms for combinatorial optimization: A decade of results. International Journal of Automation and Computing, 4(1):100–106.

Reichel, J. and Skutella, M. (2008). Evolutionary algorithms and matroid optimization problems. Algorithmica.

Rowe, J. E., Vose, M. D., and Wright, A. H. (2007). Neighborhood graphs and symmetric genetic operators. In FOGA, pages 110–122.
References V
Scharnow, J., Tinnefeld, K., and Wegener, I. (2002). Fitness landscapes based on sorting and shortest paths problems. In Proceedings of 7th Conf. on Parallel Problem Solving from Nature (PPSN-VII), number 2439 in LNCS, pages 54–63.

Storch, T. (2006). How randomized search heuristics find maximum cliques in planar graphs. In Proceedings of the 8th annual conference on Genetic and evolutionary computation (GECCO'2006), pages 567–574, New York, NY, USA. ACM Press.

Witt, C. (2005). Worst-case and average-case approximations by simple randomized search heuristics. In Proceedings of the 22nd Annual Symposium on Theoretical Aspects of Computer Science (STACS'05), number 3404 in LNCS, pages 44–56.

Witt, C. (2006). Runtime Analysis of the (µ + 1) EA on Simple Pseudo-Boolean Functions. Evolutionary Computation, 14(1):65–86.

Zarges, C. (2009). On the utility of the population size for inversely fitness proportional mutation rates. In FOGA '09: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pages 39–46, New York, NY, USA. ACM.