Nature-inspired Computing Search Based Software Engineering Genetic Improvement Automatic Improvement References References
Ants, Mutants and Beyond
Combining formal and stochastic techniques to improve software
jerry.swan@york.ac.uk
Overview
1. Nature-inspired Computing.
2. Search Based Software Engineering (SBSE).
3. Genetic Improvement (GI).
4. Automatic Improvement (AIP).
5. Industry perspective on SBSE: Douglas Carson (Keysight).
6. Hylas and Antbox: Zoltan A. Kocsis (U. Stirling).
Naturally-inspired Computing
Some classes of problem are well-understood and have efficient solution methods (e.g. the simplex algorithm for problems with linear objectives/constraints). However, in many cases we only have an operational description of the problem — calculating derivatives to drive a ‘conventional’ optimization method such as Newton’s method may not be easy (or even possible). In such cases, it is common to turn to a variety of ‘nature-inspired’ metaheuristic search methods: Genetic Algorithms, Ant-Colony Optimization etc.
General problem-solving with Metaheuristics
Metaheuristics are stochastic (a form of ‘generate-and-test’) and require only a few problem-specific ingredients:
- Solution representation: e.g. list of cities for the TSP.
- Objective function: measures the quality of a solution (e.g. tour length).
- Search operators: some means of mutating or (re)combining solutions to produce a new solution (e.g. swapping two cities).
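The three ingredients can be made concrete with a minimal sketch, assuming a tiny TSP instance and a simple stochastic hill climber. The class name `TspHillClimb`, the coordinates and the acceptance rule are illustrative assumptions, not taken from the talk; any metaheuristic could be substituted for the loop in `search`.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration: stochastic hill climbing on a tiny TSP instance. */
class TspHillClimb {
    // Objective function: total length of the closed tour through the cities.
    static double tourLength(int[] tour, double[][] xy) {
        double total = 0.0;
        for (int i = 0; i < tour.length; i++) {
            double[] a = xy[tour[i]];
            double[] b = xy[tour[(i + 1) % tour.length]];
            total += Math.hypot(a[0] - b[0], a[1] - b[1]);
        }
        return total;
    }

    // Search operator: swap two randomly chosen cities in a copy of the tour.
    static int[] swapTwo(int[] tour, Random rng) {
        int[] next = tour.clone();
        int i = rng.nextInt(next.length);
        int j = rng.nextInt(next.length);
        int tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
        return next;
    }

    // Generate-and-test loop: keep a candidate only if it is no worse.
    static int[] search(double[][] xy, int iterations, long seed) {
        Random rng = new Random(seed);
        int[] best = new int[xy.length];
        for (int i = 0; i < best.length; i++) best[i] = i; // representation: list of cities
        double bestLen = tourLength(best, xy);
        for (int k = 0; k < iterations; k++) {
            int[] candidate = swapTwo(best, rng);
            double len = tourLength(candidate, xy);
            if (len <= bestLen) {
                best = candidate;
                bestLen = len;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] xy = { {0, 0}, {1, 1}, {0, 1}, {1, 0} };
        int[] tour = search(xy, 1000, 42L);
        System.out.println(Arrays.toString(tour) + " length=" + tourLength(tour, xy));
    }
}
```

Only `tourLength` and `swapTwo` know anything about the TSP; the loop in `search` is problem-independent, which is why so few ingredients are needed.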
Genetic Algorithms/Genetic Programming - I
Idea: generate/breed solutions and apply ‘survival of the fittest’.
Pioneers: John H. Holland (1929-2015, Genetic Algorithms) and John Koza (Genetic Programming).
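As a rough sketch of the breed-and-select idea, the following toy genetic algorithm maximises the number of 1-bits in a bitstring (the standard OneMax benchmark). The class name `OneMaxGa`, the tournament selection, the one-point crossover and the parameter values are all assumptions chosen for brevity, not details from the talk.

```java
import java.util.Random;

/** Hypothetical illustration: a tiny generational GA maximising the number of 1-bits (OneMax). */
class OneMaxGa {
    static int fitness(boolean[] genome) {
        int ones = 0;
        for (boolean bit : genome) if (bit) ones++;
        return ones;
    }

    // Tournament selection: the fitter of two random individuals gets to breed.
    static boolean[] select(boolean[][] pop, Random rng) {
        boolean[] a = pop[rng.nextInt(pop.length)];
        boolean[] b = pop[rng.nextInt(pop.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    // One-point crossover followed by per-bit mutation; always returns a fresh array.
    static boolean[] breed(boolean[] mum, boolean[] dad, double mutationRate, Random rng) {
        int cut = rng.nextInt(mum.length);
        boolean[] child = new boolean[mum.length];
        for (int i = 0; i < child.length; i++) {
            child[i] = i < cut ? mum[i] : dad[i];
            if (rng.nextDouble() < mutationRate) child[i] = !child[i];
        }
        return child;
    }

    static boolean[] best(boolean[][] pop) {
        boolean[] best = pop[0];
        for (boolean[] g : pop) if (fitness(g) > fitness(best)) best = g;
        return best;
    }

    static boolean[] evolve(int bits, int popSize, int generations, long seed) {
        Random rng = new Random(seed);
        boolean[][] pop = new boolean[popSize][bits];
        for (boolean[] g : pop) for (int i = 0; i < bits; i++) g[i] = rng.nextBoolean();
        for (int gen = 0; gen < generations; gen++) {
            boolean[][] next = new boolean[popSize][];
            next[0] = best(pop); // elitism: never lose the best individual found so far
            for (int i = 1; i < popSize; i++) {
                next[i] = breed(select(pop, rng), select(pop, rng), 1.0 / bits, rng);
            }
            pop = next;
        }
        return best(pop);
    }

    public static void main(String[] args) {
        System.out.println("best fitness: " + fitness(evolve(32, 40, 60, 1L)));
    }
}
```

Genetic Programming follows the same loop but breeds expression trees or programs rather than fixed-length bitstrings.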
Genetic Algorithms/Genetic Programming - II
Search Based Software Engineering
There is a recent trend to tackle problems in software engineering using metaheuristics:
- Requirements selection
  Objective function: cost, value, ...
  Representation: bitvector of requirements.
- Test case prioritization
  Objective function: rate of coverage, time, faults, ...
  Representation: permutation of the test suite.
- Program synthesis
  Objective function: correctness, speed, power, memory, ...
  Representation: proof trees; expression trees; source code.
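To illustrate the first of these formulations, here is a minimal sketch of requirements selection with a bitvector representation. The budget constraint, the cost and value figures and the class name `RequirementsSelection` are made-up assumptions; the exhaustive loop is only a baseline for a handful of requirements, and a metaheuristic would take its place on realistic instances.

```java
/** Hypothetical illustration: requirements selection as a bitvector problem. */
class RequirementsSelection {
    // Objective: total value of the selected requirements, or -1 if the budget is exceeded.
    static int objective(int mask, int[] cost, int[] value, int budget) {
        int totalCost = 0;
        int totalValue = 0;
        for (int i = 0; i < cost.length; i++) {
            if ((mask & (1 << i)) != 0) { // bit i set: requirement i is selected
                totalCost += cost[i];
                totalValue += value[i];
            }
        }
        return totalCost <= budget ? totalValue : -1;
    }

    // Exhaustive baseline over all bitvectors; viable only for fewer than 31 requirements.
    static int bestSelection(int[] cost, int[] value, int budget) {
        int best = 0; // the empty selection is always within budget
        for (int mask = 0; mask < (1 << cost.length); mask++) {
            if (objective(mask, cost, value, budget) > objective(best, cost, value, budget)) {
                best = mask;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] cost = {4, 3, 2};
        int[] value = {10, 7, 4};
        int best = bestSelection(cost, value, 5);
        System.out.println("selected bits: " + Integer.toBinaryString(best)
                + ", value: " + objective(best, cost, value, 5));
    }
}
```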
Pragmatics of Program Synthesis
Scalability remains an issue for program synthesis:
- We don’t yet know how to generate sizeable algorithms from scratch.
- Generative approaches such as GP still work best at the scale of expressions ...
- ... but human ingenuity already provides a vast repertoire of (abstract) algorithms and (concrete) programs ...
Templar - Improving algorithms
The ‘Template Method’ Design Pattern¹ divides an algorithm into a fixed skeleton and some variants. The fixed parts orchestrate the behaviour of the variants. Example: Quicksort performance depends on the pivot function, so we can treat it as a variant:

  DoubleArray qsort(DoubleArray arr) {
    double pivot = pivotFn(arr); // Implementation of pivotFn can be varied generatively
    return qsort(arr.filter(_ < pivot)) ++ arr.filter(_ == pivot) ++ qsort(arr.filter(_ > pivot));
  }

Expressing algorithms as templates allows us to learn good implementations for the variant parts.
¹ [Gamma, Helm, et al. 1995].
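A runnable Java rendering of the same idea might look as follows. This is a sketch under stated assumptions, not Templar's API: the class `TemplateQuicksort`, the use of `List<Double>` in place of `DoubleArray`, the explicit base case and the two example pivot functions are all introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical illustration: quicksort as a template whose pivot choice is a variation point. */
class TemplateQuicksort {
    // Fixed skeleton. pivotFn must return an element of arr, otherwise recursion may not terminate.
    static List<Double> qsort(List<Double> arr, ToDoubleFunction<List<Double>> pivotFn) {
        if (arr.size() <= 1) return arr; // base case
        double pivot = pivotFn.applyAsDouble(arr); // variant part
        List<Double> less = new ArrayList<>();
        List<Double> equal = new ArrayList<>();
        List<Double> greater = new ArrayList<>();
        for (double x : arr) {
            if (x < pivot) less.add(x);
            else if (x > pivot) greater.add(x);
            else equal.add(x);
        }
        List<Double> out = new ArrayList<>(qsort(less, pivotFn));
        out.addAll(equal);
        out.addAll(qsort(greater, pivotFn));
        return out;
    }

    // Two hand-written candidates for the variation point.
    static double first(List<Double> arr) { return arr.get(0); }
    static double middle(List<Double> arr) { return arr.get(arr.size() / 2); }

    public static void main(String[] args) {
        List<Double> data = Arrays.asList(3.0, 1.0, 2.0, 1.0);
        System.out.println(qsort(data, TemplateQuicksort::first));
        System.out.println(qsort(data, TemplateQuicksort::middle));
    }
}
```

Because the skeleton only sees a `ToDoubleFunction`, a search procedure could supply generated pivot functions in place of `first` and `middle` and score each one with an objective such as running time or energy.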
‘Template Method Hyper-heuristics’³
Templar² is a Java™ framework designed to make the generation of customized algorithms as simple as possible. Ingredients:
- A list of variation points describing the parts of the algorithm to be automatically generated.
- An algorithm template expressing the algorithm skeleton. The template produces a customized version of the algorithm from automatically-generated implementations of the variation points.
- An objective function to evaluate the customized algorithm.
- An algorithm factory that searches the space of variation points to produce an optimized version of the algorithm.
² [Swan and Burles 2015]. ³ [Woodward and Swan 2014].
Hyper-quicksort: Optimizing for energy consumption
[Figure: energy consumption in Joules (log2 scale) against input array size (log2 scale, 8 to 2048) for the Mid, Sedgewick, Random and Hyper-quicksort pivot functions.]
Hyper-Quicksort - Results
Array size   Middle index   Sedgewick   Random index   Hyper-quicksort
8            0.191          0.163       0.446          0.094
16           0.296          0.345       0.410          0.173
32           0.651          0.757       0.967          0.410
64           1.366          1.145       1.708          0.976
128          3.505          4.034       5.221          2.341
256          8.175          7.646       9.269          6.387
512          19.777         21.391      27.685         15.268
1024         62.961         42.508      41.245         33.012
2048         198.438        132.663     111.894        70.234

Energy consumption (Joules) against input array size
Genetic Improvement (GI)
Addresses scalability by applying (a variant of) GP to pre-existing programs. It can be used to:
- Fix bugs⁴ (maintenance dominates software lifecycle cost).
- Obtain a multi-objective trade-off between Non-Functional Properties⁵ (NFPs).
- Optimize/improve functional properties⁶.
⁴ [Le Goues, Forrest, et al. 2013]. ⁵ [Harman, Langdon, et al. 2012]. ⁶ [Burles, Swan, et al. 2015].
Gen-O-Fix - Improving programs
A Scala framework for self-improving software systems⁷, i.e. it can improve a system as it runs.
- Performs both Genetic Improvement Programming (GIP) and GP, rather than ‘plastic surgery’.
- Tight integration of compiler and improvement mechanism (via reflection) is more efficient and less brittle than existing approaches⁸.
- Callbacks to newly-generated functionality can also be injected into legacy Java code.
- Uses an actor-based approach for executing program variants.
⁷ [Swan, Epitropakis, et al. 2014]. ⁸ [Langdon and Harman 2013]; [Le Goues, Nguyen, et al. 2012].
Why Scala?
- Supports expressive webservice frameworks (‘Hello, Web’ in 6 lines of code).
- Increasingly popular for concurrency support (Twitter core rewritten in Scala).
- Extensively used in industry.
Gen-O-Fix System Diagram
Gen-O-Fix example: Hotswapping webserver code
A stock-price predictor for shares⁹ in David Bowie¹⁰. Achieved via univariate symbolic regression of a function extracted from web-application source code.
⁹ Not actual shares. ¹⁰ Not the actual David Bowie.
Automatic Improvement
AIP versus Formal Methods
Automatic Improvement Programming (AIP) provides (some of) the benefits of formal methods without the difficulty of writing formal specs.
Formal Methods:
- Require highly specialized developers.
- Make it hard to write large programs.
AIP:
- Doesn’t need formal specs or tools with a steep learning curve.
- Has no scalability issues, since we search for transformable patterns in the source code.
AIP versus GI
Metaheuristic approaches to program improvement (e.g. GI):
- Rely heavily on random perturbation/recombination.
- Can degrade program structure/correctness/explanatory power.
AIP:
- Can use semantics-preserving transformations.
- Can come with an asymptotic guarantee of superiority.
Automatic Improvement Programming - 1
It is well known that implementing hashCode in Java is (often fatally) error-prone. Our analysis revealed 487 incorrect implementations in Apache Hadoop¹¹. We repaired Hadoop by correcting hashCode implementations (semantics-preserving), whilst simultaneously improving on the efficiency of the incorrect version (generative).
¹¹ [Kocsis, Neumann, et al. 2014].
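For readers unfamiliar with the contract at stake: Java requires that two objects which are `equals` return the same `hashCode`, otherwise hash-based collections misbehave. The sketch below shows one conventional way to satisfy it; the class `GridPoint` is a made-up example and is not drawn from Hadoop or from the repair described in the talk.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical illustration: equals and hashCode derived from the same fields. */
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Contract: objects that are equal must return the same hash code,
    // so the hash uses exactly the fields that equals compares.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Set<GridPoint> seen = new HashSet<>();
        seen.add(new GridPoint(1, 2));
        // Lookup with a distinct but equal object succeeds only if the contract holds.
        System.out.println(seen.contains(new GridPoint(1, 2)));
    }
}
```

A typical way to break the contract is to override `equals` but not `hashCode`, or to hash a field that `equals` ignores.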
Automatic Improvement Programming - 2
PolyFunic uses Category Theory to replace stochastic Gen-O-Fix mutation operators with catamorphisms:
- This guarantees that the mutation is semantics-preserving.
- Trial study obtained an asymptotic improvement in efficiency¹² (O(n) to O(1)).
[Figure: performance comparison of naïve and AIP-optimized code.]
¹² [Kocsis and Swan 2014].
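For orientation, a catamorphism is a generalised fold: it consumes a data structure by replacing each constructor with a function. The sketch below shows only the list case in plain Java and makes no claim about how PolyFunic itself is implemented; the class `ListFold` and its helper methods are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical illustration: the list catamorphism (right fold) in plain Java. */
class ListFold {
    // Replaces each "cons" cell of the list with f and the empty list with nil.
    static <A, B> B foldRight(List<A> xs, B nil, BiFunction<A, B, B> f) {
        B acc = nil;
        for (int i = xs.size() - 1; i >= 0; i--) {
            acc = f.apply(xs.get(i), acc);
        }
        return acc;
    }

    // Different behaviours arise purely from the choice of nil and f.
    static int sum(List<Integer> xs) {
        return foldRight(xs, 0, (x, acc) -> x + acc);
    }

    static int length(List<Integer> xs) {
        return foldRight(xs, 0, (x, acc) -> 1 + acc);
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3);
        System.out.println(sum(xs) + " " + length(xs));
    }
}
```

Because every traversal has the same shape, a transformation can be stated once at the level of the fold and then applies to any function written as one.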
Real-World Impact
£80K research income from:
- Dataductus: funded the application of search combinators to hybrid search.
- BT: funding a research studentship in adaptive scheduling.
- Keysight: funded development of Hylas (one of only 30 such grants worldwide).
References I
[1] Nathan Burles, Jerry Swan, Edward Bowles, Alexander E. I. Brownlee, Zoltan A. Kocsis, and Nadarajen Veerapen. “Object-Oriented Genetic Improvement for Improved Energy Consumption in Google Guava”. In: Search-Based Software Engineering - 7th International Symposium, SSBSE 2015, Bergamo, Italy, September 5-7, 2015, Proceedings. 2015, pp. 255–261. doi: 10.1007/978-3-319-22183-0_20.
[2] Jerry Swan and Nathan Burles. “Templar - A Framework for Template-Method Hyper-Heuristics”. In: Genetic Programming, LNCS 9025. Ed. by Penousal Machado et al. 2015. isbn: 978-3-319-16500-4.
[3] Z. A. Kocsis, G. Neumann, J. Swan, M. G. Epitropakis, A. E. I. Brownlee, S. O. Haraldsson, and E. Bowles. “Repairing and optimizing Hadoop hashCode implementations”. In: Symposium on Search-Based Software Engineering, Brazil, August 26 - 29. 2014.
[4] Zoltan A. Kocsis and Jerry Swan. “Asymptotic Genetic Improvement Programming with Type Functors and Catamorphisms”. In: Parallel Problem Solving from Nature - PPSN XIV - 13th International Conference, Ljubljana, Slovenia, September 13-17, 2014, Proceedings. Ed. by Jurij Šilc. Lecture Notes in Computer Science. Springer, 2014.
References II
[5] Jerry Swan, Michael G. Epitropakis, and John R. Woodward. Gen-O-Fix: An embeddable framework for Dynamic Adaptive Genetic Improvement Programming. Tech. rep. CSM-195. Stirling FK9 4LA, Scotland: Computing Science and Mathematics, University of Stirling, 2014, pp. 1–12.
[6] John R. Woodward and Jerry Swan. “Template Method Hyper-heuristics”. In: Proceedings of the 2014 Conference Companion on Genetic and Evolutionary Computation Companion. GECCO Comp ’14. Vancouver, BC, Canada: ACM, 2014, pp. 1437–1438. isbn: 978-1-4503-2881-4. doi: 10.1145/2598394.2609843. url: http://doi.acm.org/10.1145/2598394.2609843.
[7] William B. Langdon and Mark Harman. “Optimising Existing Software with Genetic Programming”. In: IEEE Transactions on Evolutionary Computation (2013). Accepted. issn: 1089-778X. doi: 10.1109/TEVC.2013.2281544.
[8] Claire Le Goues, Stephanie Forrest, and Westley Weimer. “Current Challenges in Automatic Software Repair”. In: Software Quality Journal 21.3 (2013), pp. 421–443.