CREST
Genetic Improvement
Justyna Petke
Centre for Research in Evolution, Search and Testing University College London
Thank you: Mark Harman, Yue Jia, Alexandru Marginean
“a person who makes calculations, especially with a calculating machine.”
“The term "computer", in use from the mid 17th century, meant "one who computes": a person performing mathematical calculations.”
Memory, Execution Time, Battery, Size, Bandwidth, Functionality of the Program
Functional Requirements Non-Functional Requirements
500,000,000 LoC
before one is writing unique code
A study of the uniqueness of source code. (FSE 2010)
after one line changes up to 89% of programs that compile run without error
Software is Not Fragile. (CS-DC 2015)
Can we improve the efficiency of an already highly-optimised piece of software using genetic programming?
Introduction of multi-donor software transplantation
Use of genetic improvement as a means to specialise software
Changes at the level of lines of source code
Each individual is composed of a list of changes
Specialised grammar used to preserve syntax
GP has access to both:
code bank contains all lines of source code GP has access to
Addition of one of the following operations: delete, copy, replace
Concatenation of two individuals by appending two lists of mutations
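The representation, mutation and crossover described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Edit` class, the function names and the exact semantics of `copy` (insert a code-bank line after the target line) are assumptions, and the specialised grammar that preserves syntax is omitted.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

OPS = ("delete", "copy", "replace")


@dataclass(frozen=True)
class Edit:
    """One line-level change (hypothetical structure)."""
    op: str                       # "delete", "copy" or "replace"
    target: int                   # index of a line in the host program
    source: Optional[int] = None  # index of a line in the code bank


def mutate(individual: List[Edit], host_len: int, bank_len: int,
           rng: random.Random) -> List[Edit]:
    """Mutation: append one new delete/copy/replace operation."""
    op = rng.choice(OPS)
    target = rng.randrange(host_len)
    source = None if op == "delete" else rng.randrange(bank_len)
    return individual + [Edit(op, target, source)]


def crossover(a: List[Edit], b: List[Edit]) -> List[Edit]:
    """Crossover: concatenate the two lists of mutations."""
    return a + b


def apply_edits(host: List[str], bank: List[str],
                individual: List[Edit]) -> List[str]:
    """Build the modified program from the host lines and a list of edits."""
    deleted, replaced, inserted = set(), {}, {}
    for e in individual:
        if e.op == "delete":
            deleted.add(e.target)
        elif e.op == "replace":
            replaced[e.target] = bank[e.source]
        else:  # copy: insert the code-bank line after the target line
            inserted.setdefault(e.target, []).append(bank[e.source])
    out = []
    for i, line in enumerate(host):
        if i not in deleted:
            out.append(replaced.get(i, line))
        out.extend(inserted.get(i, []))
    return out
```

Because an individual is just a list of edits against the unmodified host, crossover by concatenation always yields a well-formed individual.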
Based on solution quality and efficiency in terms of lines of source code
Avoids environmental bias
Test cases are sorted into groups
One test case is sampled uniformly from each group
Avoids overfitting
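The sampling step can be sketched in a few lines. How the test cases are grouped is not stated here, so the groups below are simply given as input; the function name is hypothetical.

```python
import random
from typing import List, Sequence


def sample_tests(groups: Sequence[Sequence[str]],
                 rng: random.Random) -> List[str]:
    """Pick one test case uniformly at random from each non-empty group."""
    return [rng.choice(group) for group in groups if group]


# Example: three groups of test cases, one test drawn from each
groups = [["t1", "t2"], ["t3"], ["t4", "t5", "t6"]]
picked = sample_tests(groups, random.Random(0))
```

Drawing a fresh sample for each evaluation round means no single test case dominates the search, which is one way to read the "avoids overfitting" point.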
Fixed number of generations
Fixed population size
Initial population contains single-mutation individuals
Based on a threshold fitness value
Mutation and Crossover applied
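Putting the population settings and the selection rule together, a search loop might look like the sketch below. It is an illustration under stated assumptions: lower fitness is better, the 50/50 split between mutation and crossover is invented for the example, and falling back to the best individual when nothing meets the threshold is also an assumption.

```python
import random
from typing import Callable, List


def evolve(fitness: Callable[[List[int]], float],
           new_mutation: Callable[[random.Random], int],
           generations: int, pop_size: int, threshold: float,
           rng: random.Random) -> List[int]:
    """Fixed generations and population size; threshold-based selection."""
    # Initial population: single-mutation individuals
    population = [[new_mutation(rng)] for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        # Selection: keep individuals that meet the fitness threshold
        survivors = [ind for ind in population if fitness(ind) <= threshold]
        if not survivors:
            survivors = [best]
        children = []
        while len(children) < pop_size:
            parent = rng.choice(survivors)
            if rng.random() < 0.5:
                child = parent + [new_mutation(rng)]      # mutation
            else:
                child = parent + rng.choice(survivors)    # crossover
            children.append(child)
        population = children
        best = min(population + [best], key=fitness)
    return best


# Toy problem: mutations are small integers, the goal is a sum of 10
toy_fitness = lambda ind: abs(10 - sum(ind))
toy_mutation = lambda r: r.randint(1, 3)
result = evolve(toy_fitness, toy_mutation, generations=5, pop_size=8,
                threshold=9, rng=random.Random(1))
```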
Mutations in best individuals are often independent
Greedy approach used to combine best individuals
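One way to read the greedy combination step: walk through the mutations found in the best individuals and keep each one only if adding it improves fitness. The sketch below assumes lower fitness is better and uses a toy fitness function; the names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_combine(candidates: Sequence[str],
                   fitness: Callable[[List[str]], float]
                   ) -> Tuple[List[str], float]:
    """Add candidate mutations one at a time, keeping those that help."""
    combined: List[str] = []
    best = fitness(combined)
    for edit in candidates:
        trial = combined + [edit]
        score = fitness(trial)
        if score < best:  # keep the mutation only if fitness improves
            combined, best = trial, score
    return combined, best


# Toy example: each mutation changes the cost by a fixed, independent amount
gains = {"a": 5, "b": 3, "c": -2}
cost = lambda edits: 100 - sum(gains[e] for e in edits)
combined, score = greedy_combine(["a", "b", "c"], cost)
```

The approach relies on the independence noted above: if mutations interacted strongly, the order in which they are tried would matter much more.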
Can we improve the efficiency of an already highly-optimised piece of software using genetic programming?
Boolean satisfiability (SAT) example: (x1 ∨ x2 ∨ ¬x4) ∧ (¬x2 ∨ ¬x3)
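To make the example concrete, the sketch below reads it as a conjunction of two clauses, x1 ∨ x2 ∨ ¬x4 and ¬x2 ∨ ¬x3, and searches for a satisfying assignment by brute force. The integer encoding of literals is an assumption made for the example, and exhaustive enumeration is only for illustration; it is not how MiniSAT works.

```python
from itertools import product
from typing import Dict, List, Optional

# A positive integer i stands for x_i, a negative one for its negation
CLAUSES = [[1, 2, -4], [-2, -3]]


def satisfies(assignment: Dict[int, bool], clauses: List[List[int]]) -> bool:
    """True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def solve(clauses: List[List[int]]) -> Optional[Dict[int, bool]]:
    """Try every assignment; return the first satisfying one, or None."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(assignment, clauses):
            return assignment
    return None


model = solve(CLAUSES)
```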
Bounded Model Checking, Planning, Software Verification, Automatic Test Pattern Generation, Combinational Equivalence Checking, Combinatorial Interaction Testing, and many other applications
MiniSAT
Can we evolve a version of the MiniSAT solver that is faster than any
Solvers used: MiniSAT2-070721
Test cases used:
∼2.5% improvement for general benchmarks (SSBSE’13)
Can we evolve a version of the MiniSAT solver that is faster than any
problem class?
Solvers used: MiniSAT2-070721
Test cases used: from Combinatorial Interaction Testing field
Use of SAT
SAT benchmarks containing millions of clauses
It takes hours to days to generate a CIT test suite using SAT
Host program: MiniSAT2-070721 (478 lines in main algorithm)
Donor programs: MiniSAT
MiniSAT
Solver               Donor   Lines   Seconds
MiniSAT (original)   —       1.00    1.00
MiniSAT              —       1.46    1.76
MiniSAT              —       0.72    0.87
MiniSAT              —       1.26    1.63
How much runtime improvement can we achieve?
Solver               Donor    Lines   Seconds
MiniSAT (original)   —        1.00    1.00
MiniSAT              —        1.46    1.76
MiniSAT              —        0.72    0.87
MiniSAT              —        1.26    1.63
MiniSAT              best09   0.93    0.95
Donor: best09
13 delete, 9 replace, 1 copy
Among changes:
3 assertions removed
1 deletion on variable used for statistics
Mainly if and for statements switched off
Decreased iteration count in for loops
Solver               Donor     Lines   Seconds
MiniSAT (original)   —         1.00    1.00
MiniSAT              —         1.46    1.76
MiniSAT              —         0.72    0.87
MiniSAT              —         1.26    1.63
MiniSAT              best09    0.93    0.95
MiniSAT              bestCIT   0.72    0.87
Donor: bestCIT
1 delete, 1 replace
Among changes:
1 assertion deletion
1 replace operation triggers 95% of donor code
Solver               Donor            Lines   Seconds
MiniSAT (original)   —                1.00    1.00
MiniSAT              —                1.46    1.76
MiniSAT              —                0.72    0.87
MiniSAT              —                1.26    1.63
MiniSAT              best09           0.93    0.95
MiniSAT              bestCIT          0.72    0.87
MiniSAT              best09+bestCIT   0.94    0.96
Donor: best09+bestCIT
50 delete, 20 replace, 5 copy
Among changes:
5 assertions removed
4 semantically equivalent replacements
3 operations used for statistics removed
∼ half of the mutations remove dead code
Solver               Donor            Lines   Seconds
MiniSAT (original)   —                1.00    1.00
MiniSAT              —                1.46    1.76
MiniSAT              —                0.72    0.87
MiniSAT              —                1.26    1.63
MiniSAT              best09           0.93    0.95
MiniSAT              bestCIT          0.72    0.87
MiniSAT              best09+bestCIT   0.94    0.96
MiniSAT              best09+bestCIT   0.54    0.83
Combining results:
37 delete, 15 replace, 4 copy
56 out of 100 mutations used
Among changes:
8 assertions removed
95% of the bestCIT donor code executed
Introduced multi-donor software transplantation
Used genetic improvement as a means to specialise software
Achieved 17% runtime improvement on MiniSAT for the Combinatorial Interaction Testing domain by combining best individuals
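The 17% figure follows from the results table if the Seconds column is read as runtime normalised to the original MiniSAT (1.00), which is an assumption based on the first row. A small check of that arithmetic:

```python
def improvement_percent(normalised_runtime: float) -> int:
    """Percentage improvement over a baseline whose runtime is 1.00."""
    return round((1.0 - normalised_runtime) * 100)


combined = improvement_percent(0.83)   # combined best individuals
best_cit = improvement_percent(0.87)   # bestCIT donor alone
```

Under the same reading, values above 1.00 in the table are slowdowns relative to the original solver.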
[Figure: genetic improvement workflow applied to MiniSat; labels: Programs (MiniSat v1, MiniSat v2, ..., MiniSat vn), GP, Tests, Fitness, Non-functional property, Sensitivity Analysis, Improved MiniSat]
Justyna Petke, Mark Harman, William B. Langdon and Westley Weimer. Using Genetic Improvement & Code Transplants to Specialise a C++ program to a Problem Class (EuroGP’14)
Multi-donor transplant
Specialised for CIT
17% faster
GECCO Humie silver medal
[Figure: genetic improvement workflow applied to Bowtie 2; labels: Programs, GP, Tests, Fitness, Non-functional property, Sensitivity Analysis]
Optimising Existing Software with Genetic Programming. TEC 2015
70 times faster
30+ interventions
HC clean up: 7
slight semantic improvement
[Figure: genetic improvement workflow applied to CUDA code; labels: Programs, GP, Tests, Fitness, Non-functional property, Sensitivity Analysis, Improved CUDA]
Genetically Improved CUDA C++ Software, EuroGP 2014
7 times faster
updated for new hardware
automated updating
[Figure: genetic improvement workflow applied to malloc; labels: System, malloc, GP, Tests, Fitness, Non-functional property, Sensitivity Analysis]
Fan Wu, Westley Weimer, Mark Harman, Yue Jia and Jens Krinke. Deep Parameter Optimisation. Conference on Genetic and Evolutionary Computation (GECCO 2015)
Improve execution time by 12% or achieve a 21% memory consumption reduction
[Figure: genetic improvement workflow for energy consumption; labels: Ensemble, AProVE, CIT, MiniSat, Improved MiniSat, GP, Tests, Fitness, Non-functional property, Sensitivity Analysis]
Bobby R. Bruce, Justyna Petke and Mark Harman. Reducing Energy Consumption Using Genetic Improvement. Conference on Genetic and Evolutionary Computation (GECCO 2015)
Energy consumption can be reduced by as much as 25%
[Figure: grow and graft workflow; labels: Human Knowledge, Grow, Feature, Graft, Host System, GP, Tests, Fitness, Non-functional property, Sensitivity Analysis]
Mark Harman, Yue Jia and Bill Langdon. Babel Pidgin: SBSE can grow and graft entirely new functionality into a real world system. Symposium on Search-Based Software Engineering, SSBSE 2014 (Challenge track)
Challenge Track Award
[Figure: a feature moved from the Donor into the Host, giving Host’ with the feature; labels: GP, Tests, Fitness, Non-functional property, Sensitivity Analysis]
Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean and Justyna Petke. Automated Software Transplantation (ISSTA 2015)
Successfully autotransplanted new functionality and passed all regression tests for 12 out of 15 real world systems
E.T. Barr, M. Harman,
ACM Distinguished Paper Award at ISSTA 2015
coverage in article in 2647 shares of
Video Player
Start from scratch
Check open source repositories
Why not handle H.264?
~100 players
[Figure: Host and Donor; the Donor contains the Organ, reached from ENTRY through the vein (V); Organ Test Suite]
Manual Work: Organ Entry, Organ’s Test Suite, Implantation Point
Inputs: Host, Donor, Organ Entry, Organ Test Suite, Implantation Point
Stage 1: Static Analysis
Stage 2: Genetic Programming
Stage 3: Organ Implantation
Result: Host Beneficiary
[Figure: Donor with ENTRY, Organ Entry (OE), Vein and Organ; Host with Implantation Point]
Dependency Graph: Stm: x = 10; -> Decl: int x;
Matching Table: Donor: int X -> Host: int A, B, C
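The matching table example above (donor `int X` mapped to host `int A, B, C`) suggests matching by type. The sketch below builds such a table from two hypothetical name-to-type dictionaries; the function name and the rule "same type string means compatible" are assumptions made for illustration, not the tool's actual analysis.

```python
from collections import defaultdict
from typing import Dict, Set


def build_matching_table(donor_vars: Dict[str, str],
                         host_vars: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each donor variable to the host variables of the same type."""
    by_type: Dict[str, Set[str]] = defaultdict(set)
    for name, var_type in host_vars.items():
        by_type[var_type].add(name)
    return {
        name: set(by_type.get(var_type, set()))
        for name, var_type in donor_vars.items()
    }


# Host variables assumed to be in scope at the implantation point
table = build_matching_table(
    {"X": "int"},
    {"A": "int", "B": "int", "C": "int", "name": "char*"},
)
```

Each donor variable then has a set of candidate host variables, and the search chooses one binding per variable.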
Genetic Programming
Statements: S1 S2 S3 S4 S5 … Sn
Matching Table: Donor Variable ID (V1D, V2D, …) -> Host Variable ID (set) (V1H, V2H, V3H, V4H, V5H)
Individual:
Var Matching: M1: V1D -> V1H; M2: V2D -> V4H; …
Statements: S1 S7 S73
\[
\mathrm{fitness}(i) =
\begin{cases}
\dfrac{1}{3}\left(1 + \dfrac{|TX_i|}{|T|} + \dfrac{|TP_i|}{|T|}\right) & i \in I_C \\
0 & i \notin I_C
\end{cases}
\]
Weak Proxies: Does it execute test cases without crashing? Does it compile?
Strong Proxies: Does it produce the correct output?
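A sketch of the fitness computation, assuming the formula reads fitness(i) = (1/3)(1 + |TX_i|/|T| + |TP_i|/|T|) for individuals that compile and 0 otherwise, with TX_i taken to be the tests that run without crashing and TP_i the tests that produce the correct output. The function and parameter names are hypothetical.

```python
def fitness(compiles: bool, n_tests: int,
            n_executed: int, n_passed: int) -> float:
    """Combine the weak proxies (compiles, runs) and the strong proxy (passes)."""
    if not compiles:
        return 0.0  # individuals outside I_C score zero
    if n_tests <= 0:
        raise ValueError("the test suite must not be empty")
    return (1 + n_executed / n_tests + n_passed / n_tests) / 3


# An individual that compiles, runs 2 of 4 tests without crashing, passes 1
partial = fitness(True, n_tests=4, n_executed=2, n_passed=1)
```

Under this reading, compiling alone earns one third of the maximum score, which gives the search a gradient even before any test passes.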
[Figure: Host, Organ, Donor]
Do we break the initial functionality?
Have we really added new functionality?
How about the computational effort?
Is autotransplantation useful?
Regression Tests, Acceptance Tests
Empirical Study: 15 Transplantations, 300 Runs, 5 Donors, 3 Hosts
Case Study: H.264 Encoding Transplantation
[Figure: validation of the Host Beneficiary; labels: Host, Donor, Regression Tests, Augmented Regression Tests, Donor Acceptance Tests, Acceptance Tests, Manual Validation]
Minimal size: 0.4k
Max size: 422k
Average Donor: 16k
Average Host: 213k

Subjects     Type    Size (KLOC)
Idct         Donor   2.3
Mytar        Donor   0.4
Cflow        Donor   25
Webserver    Donor   1.7
TuxCrypt     Donor   2.7
Pidgin       Host    363
Cflow        Host    25
SoX          Host    43
Case Study
x264         Donor   63
VLC          Host    422
[Figure: μSCALPEL takes the Host, Implantation Point, Donor, OE and Organ Test Suite, and produces the Host Beneficiary]
64 bit Ubuntu 14.10, 16 GB RAM, 8 threads
Donor   Host     All Passed   Regression   Regression++   Acceptance
Idct    Pidgin   16           20           17             16
Mytar   Pidgin   16           20           18             20
Web     Pidgin   -            20           -              18
Cflow   Pidgin   15           20           15             16
Tux     Pidgin   15           20           17             16
Idct    Cflow    16           17           16             16
Mytar   Cflow    17           17           17             20
Web     Cflow    -            -            -              17
Cflow   Cflow    20           20           20             20
Tux     Cflow    14           15           14             16
Idct    SoX      15           18           17             16
Mytar   SoX      17           17           17             20
Web     SoX      -            -            -              17
Cflow   SoX      14           16           15             14
Tux     SoX      13           13           13             14
TOTAL            188/300      233/300      196/300        256/300
In 12 out of 15 experiments we successfully autotransplanted new functionality
Execution Time (minutes)
Donor   Host
Idct    Pidgin   5    7    97
Mytar   Pidgin   3    1    65
Web     Pidgin   8    5    160
Cflow   Pidgin   58   16   1151
Tux     Pidgin   29   10   574
Idct    Cflow    3    5    59
Mytar   Cflow    3    1    53
Web     Cflow    5    2    102
Cflow   Cflow    44   9    872
Tux     Cflow    31   11   623
Idct    SoX      12   17   233
Mytar   SoX      3    1    60
Web     SoX      7    3    132
Cflow   SoX      89   53   74
Tux     SoX      34   13   94
Average 334 (min)
Total 72 (hours)
10 (Average)
Transplant Time & Test Suites
         Time (hours)   Regression   Regression++   Acceptance
H.264    26             100%         100%           100%
VLC H264
Within 26 hours, performed a task that took developers an average of 20 days of elapsed time
[Summary slide: Alexandru Marginean, Automated Software Transplantation. Manual Work: Organ Entry, Organ’s Test Suite, Implantation Point. μTrans: Stage 1: Static Analysis, Stage 2: Genetic Programming, Stage 3: Organ Implantation. Validation: Regression Tests, Augmented Regression Tests, Donor Acceptance Tests, Acceptance Tests, Manual Validation (RQ1.a, RQ1.b, RQ2)]
* http://crest.cs.ucl.ac.uk/autotransplantation/MuScalpel.html
Bug fixing
* http://dijkstra.cs.virginia.edu/genprog/
* http://people.csail.mit.edu/fanl/
Kali, SPR, ClearView (from MIT) and other …
Claire Le Goues, Stephanie Forrest, Westley Weimer: Current challenges in automatic software repair. Software Quality Journal 21(3): 421-443 (2013)
Bug fixing
Improving energy consumption
Porting old code to new hardware
Grafting new functionality into an existing system
Specialising software for a particular problem class
Other
Mark Harman & Justyna Petke. GI4GI: Improving Genetic Improvement Fitness Functions (Genetic Improvement Workshop 2015)
Many factors affect energy consumption, including:
screen behaviour
memory access
device communications
CPU utilisation
A hardware-dependent linear energy model for GI:
Schulte et al. Post-compiler software optimization for reducing energy (ASPLOS’14)
Idea:
Use GI to evolve a fitness function f for energy consumption.
Use f to improve energy consumption of software.
1st International Genetic Improvement Workshop
at GECCO 2015, Madrid, Spain www.geneticimprovementofsoftware.com
Special Issue
Special Session on GI
* http://www.wcci2016.org/
Functional Requirements Non-Functional Requirements
SBSE: Search Based Optimisation, Software Engineering
Genetic Improvement, Combinatorial Interaction Testing
Contact me if you want to visit CREST: j.petke at ucl.ac.uk
Centre for Research on Evolution, Search and Testing
University College London
20 mins walk:
National Gallery, Nelson’s Column, Eros, Royal Courts, Tate Modern, Globe Theatre, Covent Garden Market, Westminster Abbey, Houses of Parliament, London Eye, British Museum, Madame Tussaud’s, Sherlock Holmes Museum, Marble Arch, Natural History Museum
CREST Open Workshop
Roughly one per month
Discussion based
Recorded and archived
http://crest.cs.ucl.ac.uk/cow/
#Total Registrations: 1512
#Unique Attendees: 667
#Unique Institutions: 244
#Countries: 43
#Talks: 421
(Last updated on November 4, 2015)
CREST Open Workshop (COW)
25-26 January 2016
EPSRC Grant: DAASE
SBSE: Search Based Optimisation, Software Engineering
Genetic Improvement, Combinatorial Interaction Testing
COWs, Visitor Scheme, Open positions
Image credits:
Pickering's Harem: [Public domain], via Wikimedia Commons
IBM 026 Card Punch: By Ben Franske (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
BBC_Micro: [Public domain], via Wikimedia Commons
Programmer: Bundesarchiv, B 145 Bild-F031434-0006 / Gathmann, Jens / CC-BY-SA [CC-BY-SA-3.0-de (http://creativecommons.org/licenses/by-sa/3.0/de/deed.en)], via Wikimedia Commons
IBM PC: By Boffy B (Own work) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
IMac: By Matthieu Riegler, Wikimedia Commons [CC-BY-3.0 (http://creativecommons.org/licenses/by/3.0)], via Wikimedia Commons
Ada Lovelace: By Alfred Edward Chalon [Public domain], via Wikimedia Commons
Stonehenge: By Yuanyuan Zhang [All rights reserved] via Flickr
Bath Abbey: By Yuanyuan Zhang [All rights reserved] via Flickr