Ants, Mutants and Beyond: Combining formal and stochastic techniques



SLIDE 1

Nature-inspired Computing Search Based Software Engineering Genetic Improvement Automatic Improvement References References

“Ants, Mutants and Beyond”

Combining formal and stochastic techniques to improve software

jerry.swan@york.ac.uk
douglas carson@keysight.com
zoltan.kocsis@cs.stir.ac.uk
October 12, 2015

SLIDE 2

Overview

1. Nature-inspired Computing.
2. Search Based Software Engineering (SBSE).
3. Genetic Improvement (GI).
4. Automatic Improvement (AIP).
5. Industry perspective on SBSE: Douglas Carson (Keysight).
6. Hylas and Antbox: Zoltan A. Kocsis (U. Stirling).

SLIDE 3

Nature-inspired Computing

Some classes of problem are well understood and have efficient solution methods (e.g. the simplex algorithm for problems with linear objectives and constraints).

In many cases, however, we only have an operational description of the problem, and calculating the derivatives needed to drive a ‘conventional’ optimization method such as Newton’s method may not be easy (or even possible).

In such cases, it is common to turn to one of a variety of ‘nature-inspired’ metaheuristic search methods: Genetic Algorithms, Ant-Colony Optimization, etc.

SLIDE 4

General problem-solving with Metaheuristics

Metaheuristics are stochastic (a form of ‘generate-and-test’) and require only a few problem-specific ingredients:

  • Solution representation: e.g. a list of cities for the TSP.
  • Objective function: measures the quality of a solution (e.g. tour length).
  • Search operators: some means of mutating or (re)combining solutions to produce a new solution (e.g. swapping two cities).
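As a minimal sketch of how these three ingredients fit together, the following Python hill-climber solves a toy TSP instance. The city coordinates, function names and iteration count are illustrative assumptions, not taken from the talk, and hill-climbing stands in for any metaheuristic.

```python
import math
import random

# Hypothetical city coordinates, chosen only for illustration.
CITIES = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2)]

def tour_length(tour):
    """Objective function: total length of the closed tour."""
    return sum(
        math.dist(CITIES[tour[i]], CITIES[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def swap_two_cities(tour, rng):
    """Search operator: return a copy of the tour with two positions swapped."""
    i, j = rng.sample(range(len(tour)), 2)
    neighbour = list(tour)
    neighbour[i], neighbour[j] = neighbour[j], neighbour[i]
    return neighbour

def hill_climb(iterations=1000, seed=0):
    """Generate-and-test: keep a mutated tour only if it is no worse."""
    rng = random.Random(seed)
    current = list(range(len(CITIES)))  # Solution representation: list of city indices.
    rng.shuffle(current)
    for _ in range(iterations):
        candidate = swap_two_cities(current, rng)
        if tour_length(candidate) <= tour_length(current):
            current = candidate
    return current

best = hill_climb()
```

Only `tour_length` and `swap_two_cities` know anything about the TSP; the search loop itself is problem-independent.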

SLIDE 5

Genetic Algorithms/Genetic Programming - I

Idea: generate/breed solutions and apply ‘survival of the fittest’.

John H. Holland (1929-2015); John Koza.
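The breed-and-select idea can be sketched as a very small genetic algorithm. The example below is an illustration only: the OneMax fitness function (count the 1s in a bit string), the operator choices and all parameter values are assumptions made for the sketch, not details from the talk.

```python
import random

def one_max(bits):
    """Fitness: number of 1s in the bit string (a standard toy objective)."""
    return sum(bits)

def tournament(population, rng, k=2):
    """'Survival of the fittest': the best of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=one_max)

def crossover(a, b, rng):
    """Breed: one-point crossover of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(length=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=one_max)  # carry the best individual over unchanged
        offspring = [
            mutate(crossover(tournament(population, rng), tournament(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + offspring
    return max(population, key=one_max)

best = evolve()
```

Because the best individual is carried over unchanged each generation, the best fitness in the population never decreases.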

SLIDE 6

Genetic Algorithms/Genetic Programming - II

SLIDE 7

Search Based Software Engineering

There is a recent trend to tackle problems in software engineering using metaheuristics:

Requirements selection

  • Objective function: cost, value, ...
  • Representation: bitvector of requirements.

Test case prioritization

  • Objective function: rate of coverage, time, faults, ...
  • Representation: permutation of the test suite.

Program synthesis

  • Objective function: correctness, speed, power, memory, ...
  • Representation: proof trees; expression trees; source code.
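To make the first of these formulations concrete, here is a hedged sketch of requirements selection with a bitvector representation. The requirement names, costs, values and budget are invented for illustration, and the budget constraint is one assumed way of combining cost and value into a single objective.

```python
from itertools import product

# Hypothetical requirements: (name, cost, value). The figures are illustrative only.
REQUIREMENTS = [("login", 3, 5), ("search", 4, 6), ("export", 2, 2), ("reports", 5, 7)]
BUDGET = 9

def cost(selection):
    """Total cost of the requirements whose bit is set."""
    return sum(c for bit, (_, c, _) in zip(selection, REQUIREMENTS) if bit)

def value(selection):
    """Total value of the requirements whose bit is set."""
    return sum(v for bit, (_, _, v) in zip(selection, REQUIREMENTS) if bit)

def objective(selection):
    """Value of the selection, or -1 if it exceeds the budget."""
    return value(selection) if cost(selection) <= BUDGET else -1

# Representation: one bit per requirement. The space is tiny here, so we enumerate it;
# a metaheuristic would sample it instead.
best = max(product([0, 1], repeat=len(REQUIREMENTS)), key=objective)
```

With realistic numbers of requirements the 2^n bitvectors cannot be enumerated, which is where a search operator such as bit-flip mutation comes in.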

SLIDE 8

Pragmatics of Program Synthesis

Scalability remains an issue for program synthesis:

  • We don’t yet know how to generate sizeable algorithms from scratch.
  • Generative approaches such as GP still work best at the scale of expressions ...
  • ... but human ingenuity already provides a vast repertoire of (abstract) algorithms and (concrete) programs ...

SLIDE 9

Templar - Improving algorithms

The ‘Template Method’ Design Pattern¹ divides an algorithm into a fixed skeleton and some variants. The fixed parts orchestrate the behaviour of the variants.

Example: Quicksort performance depends on the pivot function, so we can treat it as a variant:

    DoubleArray qsort(DoubleArray arr) {
      double pivot = pivotFn(arr); // Implementation of pivotFn can be varied generatively
      return qsort(arr.filter(< pivot)) ++
             arr.filter(== pivot) ++
             qsort(arr.filter(> pivot));
    }

Expressing algorithms as templates allows us to learn good implementations for the variant parts.
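A runnable Python rendering of the same idea is sketched below. It adds the base case that the slide's pseudocode leaves implicit; the two pivot functions are illustrative hand-written variants, not ones produced by Templar.

```python
def quicksort(arr, pivot_fn):
    """Fixed skeleton: three-way partition around whatever pivot_fn chooses."""
    if len(arr) <= 1:
        return list(arr)
    pivot = pivot_fn(arr)  # variation point
    return (
        quicksort([x for x in arr if x < pivot], pivot_fn)
        + [x for x in arr if x == pivot]
        + quicksort([x for x in arr if x > pivot], pivot_fn)
    )

# Two interchangeable variants; pivot_fn must return an element of arr
# so that every recursive call receives a strictly smaller list.
def first_element(arr):
    return arr[0]

def middle_element(arr):
    return arr[len(arr) // 2]

data = [5.0, 2.0, 9.0, 2.0, 7.0]
sorted_with_first = quicksort(data, first_element)
sorted_with_middle = quicksort(data, middle_element)
```

Both calls return the same sorted list; only the amount of work done differs, which is what makes the pivot function a safe target for automated variation.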

¹ [Gamma, Helm, et al. 1995].

SLIDE 10

‘Template Method Hyper-heuristics’³

Templar² is a Java™ framework designed to make the generation of customized algorithms as simple as possible. Ingredients:

  • A list of variation points describing the parts of the algorithm to be automatically generated.
  • An algorithm template expressing the algorithm skeleton. The template produces a customized version of the algorithm from automatically-generated implementations of the variation points.
  • An objective function to evaluate the customized algorithm.
  • An algorithm factory that searches the space of variation points to produce an optimized version of the algorithm.

² [Swan and Burles 2015]. ³ [Woodward and Swan 2014].
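The four ingredients can be mimicked in a few lines of Python. This is not Templar's API: every name below is hypothetical, the variation point is filled from a fixed hand-written menu rather than generated code, the factory searches exhaustively, and the objective counts elements partitioned as a stand-in for a measured cost.

```python
import random

def make_quicksort(pivot_fn):
    """Algorithm template: builds a quicksort from one pivot implementation.
    The returned function yields (sorted_list, number_of_elements_partitioned)."""
    def sort(arr):
        if len(arr) <= 1:
            return list(arr), 0
        pivot = pivot_fn(arr)
        lo, lo_cost = sort([x for x in arr if x < pivot])
        hi, hi_cost = sort([x for x in arr if x > pivot])
        return lo + [x for x in arr if x == pivot] + hi, len(arr) + lo_cost + hi_cost
    return sort

# Variation point: candidate pivot implementations (a hand-written menu in this sketch).
PIVOT_CANDIDATES = {
    "first": lambda arr: arr[0],
    "middle": lambda arr: arr[len(arr) // 2],
    "median_of_three": lambda arr: sorted([arr[0], arr[len(arr) // 2], arr[-1]])[1],
}

def objective(sort, inputs):
    """Objective function: total partitioning cost over the training inputs."""
    return sum(sort(arr)[1] for arr in inputs)

def algorithm_factory(inputs):
    """Search the variation-point space and return the best customization found."""
    name = min(
        PIVOT_CANDIDATES,
        key=lambda n: objective(make_quicksort(PIVOT_CANDIDATES[n]), inputs),
    )
    return name, make_quicksort(PIVOT_CANDIDATES[name])

rng = random.Random(0)
training = [list(range(50)), [rng.random() for _ in range(50)]]
best_name, best_sort = algorithm_factory(training)
```

Replacing the cost counter with an energy measurement, and the fixed menu with generated pivot functions, gives the shape of the experiment reported on the following slides.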

SLIDE 11

Hyper-quicksort: Optimizing for energy consumption

[Figure: energy consumption in Joules (log2 scale) against input array size (log2 scale, 8 to 2048) for the Mid, Sedgewick, Random and Hyper-quicksort pivot strategies.]

SLIDE 12

Hyper-Quicksort - Results

Array size   Middle index   Sedgewick   Random index   Hyper-quicksort
8                   0.191       0.163          0.446             0.094
16                  0.296       0.345          0.410             0.173
32                  0.651       0.757          0.967             0.410
64                  1.366       1.145          1.708             0.976
128                 3.505       4.034          5.221             2.341
256                 8.175       7.646          9.269             6.387
512                19.777      21.391         27.685            15.268
1024               62.961      42.508         41.245            33.012
2048              198.438     132.663        111.894            70.234

Energy consumption (Joules) against input array size

SLIDE 13

Genetic Improvement (GI)

Addresses scalability by applying (a variant of) GP to pre-existing programs. It can be used to:

  • Fix bugs⁴ (maintenance dominates software lifecycle cost).
  • Obtain a multi-objective trade-off between Non-Functional Properties⁵ (NFPs).
  • Optimize/improve functional properties⁶.

⁴ [Le Goues, Forrest, et al. 2013]. ⁵ [Harman, Langdon, et al. 2012]. ⁶ [Burles, Swan, et al. 2015].

SLIDE 14

Gen-O-Fix - Improving programs

A Scala framework for self-improving software systems⁷, i.e. it can improve a system as it runs.

  • Performs both GIP and GP, rather than ‘plastic surgery’.
  • Tight integration of compiler and improvement mechanism (via reflection) is more efficient and less brittle than existing approaches⁸.
  • Callbacks to newly-generated functionality can also be injected into legacy Java code.
  • Uses an actor-based approach for executing program variants.

⁷ [Swan, Epitropakis, et al. 2014]. ⁸ [Langdon and Harman 2013]; [Le Goues, Nguyen, et al. 2012].

SLIDE 15

Why Scala?

  • Supports expressive web-service frameworks (‘Hello, Web’ in 6 lines of code).
  • Increasingly popular for concurrency support (Twitter core rewritten in Scala).
  • Extensively used in industry.

SLIDE 16

Gen-O-Fix System Diagram

SLIDE 17

Gen-O-Fix example: Hotswapping webserver code

A stock-price predictor for shares⁹ in David Bowie¹⁰. Achieved via univariate symbolic regression of a function extracted from web-application source code.

⁹ Not actual shares. ¹⁰ Not the actual David Bowie.

SLIDE 18

Automatic Improvement

SLIDE 19

AIP versus Formal Methods

Automatic Improvement Programming (AIP) provides (some of) the benefits of formal methods without the difficulty of writing formal specs.

Formal Methods:

  • Require highly specialized developers.
  • Make large programs hard to write.

AIP:

  • Doesn’t need formal specs or tools with a steep learning curve.
  • Has no scalability issues, since we search for transformable patterns in the source code.

SLIDE 20

AIP versus GI

Metaheuristic approaches to program improvement (e.g. GI):

  • Rely heavily on random perturbation/recombination.
  • Can degrade program structure/correctness/explanatory power.

AIP:

  • Can use semantics-preserving transformations.
  • Can come with an asymptotic guarantee of superiority.

SLIDE 21

Automatic Improvement Programming - 1

It is well known that implementing hashCode in Java is (often fatally) error-prone.

  • Our analysis revealed 487 incorrect implementations in Apache Hadoop¹¹.
  • We repaired Hadoop by correcting hashCode implementations (semantics-preserving), whilst simultaneously improving on the efficiency of the incorrect version (generative).

¹¹ [Kocsis, Neumann, et al. 2014].
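The slide does not say which kinds of error were found, so the following is only a generic illustration of one way such code goes wrong. Python's `__eq__`/`__hash__` pair carries the same contract as Java's `equals`/`hashCode` (objects that compare equal must hash equally), and the two hypothetical classes below show a violation and a repair.

```python
class BrokenKey:
    """Equality ignores case, but the hash does not, so equal keys may hash differently."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, BrokenKey) and self.name.lower() == other.name.lower()

    def __hash__(self):
        return hash(self.name)  # bug: uses data that equality normalizes away


class RepairedKey:
    """The hash is derived from exactly the data that equality compares."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, RepairedKey) and self.name.lower() == other.name.lower()

    def __hash__(self):
        return hash(self.name.lower())


# Equal keys now land in the same hash bucket, so the set treats them as one element.
unique = {RepairedKey("Node"), RepairedKey("node")}
```

With `BrokenKey`, hash-based collections can silently hold two "equal" keys, which is why this class of bug tends to surface far from its cause.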

SLIDE 22

Automatic Improvement Programming - 2

PolyFunic uses Category Theory to replace stochastic Gen-O-Fix mutation operators with catamorphisms:

  • This guarantees that the mutation is semantics-preserving.
  • Trial study obtained an asymptotic improvement in efficiency¹² (O(n) to O(1)).

[Figure: performance comparison of naïve and AIP-optimized code.]

¹² [Kocsis and Swan 2014].

SLIDE 23

Real-World Impact

£80K research income from:

  • Dataductus: funded the application of search combinators to hybrid search.
  • BT: funding a research studentship in adaptive scheduling.
  • Keysight: funded development of Hylas (one of only 30 such grants worldwide).

SLIDE 24

References I

[1] Nathan Burles, Jerry Swan, Edward Bowles, Alexander E. I. Brownlee, Zoltan A. Kocsis, and Nadarajen Veerapen. “Object-Oriented Genetic Improvement for Improved Energy Consumption in Google Guava”. In: Search-Based Software Engineering - 7th International Symposium, SSBSE 2015, Bergamo, Italy, September 5-7, 2015, Proceedings. 2015, pp. 255–261. doi: 10.1007/978-3-319-22183-0_20.

[2] Jerry Swan and Nathan Burles. “Templar - A Framework for Template-Method Hyper-Heuristics”. In: Genetic Programming, LNCS 9025. Ed. by Penousal Machado et al. 2015. isbn: 978-3-319-16500-4.

[3] Z. A. Kocsis, G. Neumann, J. Swan, M. G. Epitropakis, A. E. I. Brownlee, S. O. Haraldsson, and E. Bowles. “Repairing and optimizing Hadoop hashCode implementations”. In: Symposium on Search-Based Software Engineering, Brazil, August 26 - 29. 2014.

[4] Zoltan A. Kocsis and Jerry Swan. “Asymptotic Genetic Improvement Programming with Type Functors and Catamorphisms”. In: Parallel Problem Solving from Nature - PPSN XIV - 13th International Conference, Ljubljana, Slovenia, September 13-17, 2014, Proceedings. Ed. by Jurij Šilc. Lecture Notes in Computer Science. Springer, 2014.

SLIDE 25

References II

[5] Jerry Swan, Michael G. Epitropakis, and John R. Woodward. Gen-O-Fix: An embeddable framework for Dynamic Adaptive Genetic Improvement Programming. Tech. rep. CSM-195. Stirling FK9 4LA, Scotland: Computing Science and Mathematics, University of Stirling, 2014, pp. 1–12.

[6] John R. Woodward and Jerry Swan. “Template Method Hyper-heuristics”. In: Proceedings of the 2014 Conference Companion on Genetic and Evolutionary Computation Companion. GECCO Comp ’14. Vancouver, BC, Canada: ACM, 2014, pp. 1437–1438. isbn: 978-1-4503-2881-4. doi: 10.1145/2598394.2609843. url: http://doi.acm.org/10.1145/2598394.2609843.

[7] William B. Langdon and Mark Harman. “Optimising Existing Software with Genetic Programming”. In: IEEE Transactions on Evolutionary Computation (2013). Accepted. issn: 1089-778X. doi: 10.1109/TEVC.2013.2281544.

[8] Claire Le Goues, Stephanie Forrest, and Westley Weimer. “Current Challenges in Automatic Software Repair”. In: Software Quality Journal 21.3 (2013), pp. 421–443.

SLIDE 26

References III

[9] Mark Harman, William B. Langdon, Yue Jia, David Robert White, Andrea Arcuri, and John A. Clark. “The GISMOE challenge: constructing the pareto program surface using genetic programming to find better programs”. In: ASE. Ed. by Michael Goedicke, Tim Menzies, and Motoshi Saeki. ACM, 2012, pp. 1–14. isbn: 978-1-4503-1204-2.

[10] Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. “GenProg: A Generic Method for Automatic Software Repair”. In: IEEE Transactions on Software Engineering 38 (2012), pp. 54–72. issn: 0098-5589. doi: http://doi.ieeecomputersociety.org/10.1109/TSE.2011.104.

[11] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design patterns: elements of reusable object-oriented software. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1995. isbn: 0-201-63361-2.