
How High Will It Be? Using Machine Learning Models to Predict Branch Coverage in Automated Testing
G. Grano, T. Titov, S. Panichella, H. Gall
MaLTeSQuE@SANER 2018, 20, Campobasso (Italy)
grano@ifi.uzh.ch
giograno90

U.S. Economy Impact [1]
> $59.5 billion
> $22.2 billion avoidable
> 0.6% GDP
[1] http://www.abeacha.com/NIST_press_release_bugs_cost.html
Giovanni Grano @ s.e.a.l.

Continuous Integration
The way we develop and release code is rapidly changing.
In an ideal world, the entire test suite would be executed at every single commit ...
... but companies like Google commit about 16,000 changes per day!

Test Data Generation
A very active research area in recent years
Mature tools able to generate test suites with high coverage:
> EvoSuite
> Randoop
> ...
> and many more!
How do they fit in such a CI/CD environment?

Continuous Test Generation [2]
Continuous integration enhanced with automated test generation
It raises many questions:
> testing order
> how much time to spend per class
[2] Campos et al., Continuous Test Generation: Enhancing Continuous Integration with Automated Test Generation

Test Suite Augmentation [3]
Automatic generation considering code changes and their effect on the previous codebase
Hardly feasible due to the large amount of time needed to generate tests
[3] Xu et al., Directed Test Suite Augmentation: Techniques and Tradeoffs

Coverage Prediction
Knowing a priori the coverage achieved by test data generation tools would allow us to:
> maximize the coverage for the entire system given an amount of time
> allocate budget for critical components
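As a rough illustration of the budget-allocation idea, the sketch below spends a fixed time budget greedily on the classes with the highest predicted coverage per second of generation time. The class names, predicted coverages, per-class costs, and the greedy rule itself are hypothetical assumptions made for this example; the slides do not prescribe an allocation strategy.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str                  # hypothetical class under test
    predicted_coverage: float  # assumed model output in [0, 1]
    cost_seconds: int          # assumed generation time for this class


def allocate_budget(candidates, budget_seconds):
    """Greedily pick classes by predicted coverage per second of budget."""
    ranked = sorted(
        candidates,
        key=lambda c: c.predicted_coverage / c.cost_seconds,
        reverse=True,
    )
    chosen, remaining = [], budget_seconds
    for cand in ranked:
        # Skip any class that no longer fits in the remaining budget.
        if cand.cost_seconds <= remaining:
            chosen.append(cand.name)
            remaining -= cand.cost_seconds
    return chosen, remaining


candidates = [
    Candidate("Parser", 0.90, 60),
    Candidate("Cache", 0.40, 60),
    Candidate("Resolver", 0.75, 90),
]
selected, leftover = allocate_budget(candidates, budget_seconds=180)
print(selected, leftover)
```

A greedy rule like this is simple but not guaranteed to be optimal; it is only meant to show how a coverage prediction could feed a scheduling decision.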

Research Questions

> RQ1: Which types of features can we leverage to train machine learning models to predict the branch coverage achieved by test data generation tools?
> RQ2: To what extent can we predict the coverage achieved by test data generation tools?

Project Selection
Open Source Projects from Defects4J
> Apache Cassandra
> Apache Ivy
> Google Guava
> Google Dagger

            Guava    Cassandra  Dagger  Ivy
LOC         78,525   220,573    848     50,430
Java Files  538      1,474      43      464

Training Set
> Run EvoSuite
> Labeled data (with coverage)
> Computed metrics = input variables
[Diagram labels: Coverage, Metrics Extractor]
Threats:
> we ran it only once

Feature Selection
Goal: capture complexity

Package Level Features
Computed with JDepend [4]

Name  Description
Ca    afferent couplings: indicator of the package's responsibility
Ce    efferent couplings: indicator of the package's independence
A     abstractness: abstract classes / total number of classes
I     instability: indicator of the package's resilience to change
...   ...

[4] https://github.com/clarkware/jdepend
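To make two of these package-level metrics concrete, here is a minimal sketch that computes abstractness (A) from the ratio given on the slide, and instability (I) as Ce / (Ca + Ce), which is the usual JDepend definition. The input counts are made up for a hypothetical package; in practice JDepend computes them from the compiled classes.

```python
def abstractness(abstract_classes, total_classes):
    """A = abstract classes / total number of classes."""
    return abstract_classes / total_classes if total_classes else 0.0


def instability(ca, ce):
    """I = Ce / (Ca + Ce); 0 means maximally stable, 1 maximally unstable."""
    return ce / (ca + ce) if (ca + ce) else 0.0


# Made-up counts for a hypothetical package.
a = abstractness(abstract_classes=2, total_classes=8)
i = instability(ca=3, ce=9)
print(a, i)
```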

CK and OO Features
ck tool provided by Aniche [5]

Name  Description
CBO   coupling between objects
DIT   depth of inheritance tree
NOC   number of children
NOSF  number of static fields
...   ...

[5] https://github.com/mauricioaniche/ck
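As an illustration of what two of these metrics measure, the sketch below derives DIT and NOC from a small child-to-parent map. The class hierarchy is hypothetical, and the counting convention (a root class has depth 0) is an assumption; the ck tool works on real Java sources and may count differently, for example by including java.lang.Object.

```python
# Hypothetical class hierarchy: child -> parent (None marks a root class).
PARENTS = {
    "Shape": None,
    "Polygon": "Shape",
    "Circle": "Shape",
    "Triangle": "Polygon",
}


def depth_of_inheritance(cls, parents):
    """DIT: number of ancestors between `cls` and the root of this map."""
    depth = 0
    while parents[cls] is not None:
        cls = parents[cls]
        depth += 1
    return depth


def number_of_children(cls, parents):
    """NOC: number of classes that directly extend `cls`."""
    return sum(1 for parent in parents.values() if parent == cls)


dit = {c: depth_of_inheritance(c, PARENTS) for c in PARENTS}
noc = {c: number_of_children(c, PARENTS) for c in PARENTS}
print(dit)
print(noc)
```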

Java Reserved Keywords
To capture additional complexity
Previously used in Information Retrieval as a feature [6]
52 Java reserved keywords
[6] Sanderson et al., The history of information retrieval research
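One simple way to turn reserved keywords into features is to count their occurrences in each source file. The sketch below does this with a regular-expression tokenizer. The keyword set is a small illustrative subset (the slides mention 52 keywords but do not list them), and the tokenizer is an assumption that does not strip comments or string literals.

```python
import re
from collections import Counter

# A small subset of Java reserved keywords, for illustration only.
JAVA_KEYWORDS = {
    "if", "else", "for", "while", "switch", "case",
    "try", "catch", "return", "new", "synchronized",
}


def keyword_counts(java_source):
    """Count how often each keyword in JAVA_KEYWORDS appears as a whole token."""
    tokens = re.findall(r"[A-Za-z_]\w*", java_source)
    counts = Counter(t for t in tokens if t in JAVA_KEYWORDS)
    # Emit every keyword, including zero counts, so each class
    # yields a feature vector of the same length.
    return {kw: counts.get(kw, 0) for kw in sorted(JAVA_KEYWORDS)}


snippet = """
int sign(int x) {
    if (x > 0) { return 1; }
    else if (x < 0) { return -1; }
    return 0;
}
"""
features = keyword_counts(snippet)
print(features)
```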

Grid Selection
> best hyper-parameters
> 3-fold cross-validation
Feature Transformation
> z-score
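The two preprocessing ideas on this slide can be sketched in plain Python: z-score standardization of a feature column, and a split of the samples into three folds for cross-validation. The metric values are made up, and the contiguous, unshuffled fold layout is an assumption. A grid search would repeat the train/validate loop for each hyper-parameter combination and keep the best-scoring one.

```python
from statistics import mean, pstdev


def z_score(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # a constant feature carries no signal
    return [(x - mu) / sigma for x in column]


def k_fold_indices(n_samples, k=3):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


cbo_values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up metric values
scaled = z_score(cbo_values)
folds = list(k_fold_indices(len(cbo_values), k=3))
print(scaled)
print(folds)
```

In a real pipeline the mean and standard deviation would normally be estimated on the training folds only and then applied to the validation fold, to avoid leaking information.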

Algorithms
> Huber Regression
> Support Vector Regression
> Multi-Layer Perceptron
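To give some intuition for the first algorithm, the sketch below implements the standard Huber loss and compares it with the squared loss. Huber regression minimizes a loss of this shape, which grows only linearly for large residuals and is therefore less sensitive to outliers. The residuals and the threshold `delta` are made-up values, and library implementations may parameterize the loss differently.

```python
def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear beyond `delta`."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


def squared_loss(residual):
    """Plain squared loss, for comparison."""
    return 0.5 * residual * residual


# Residuals = predicted coverage minus measured coverage (made-up values);
# the last one plays the role of an outlier.
residuals = [0.05, -0.10, 0.02, 0.90]
delta = 0.25  # arbitrary threshold chosen for this illustration
for r in residuals:
    print(r, squared_loss(r), huber_loss(r, delta))
```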

Results RQ1

          Huber   SVR     MLP     Tool's Average
EvoSuite  0.255   0.216   0.242   0.238
Randoop   0.172   0.088   0.139   0.131
Average   0.213   0.152   0.191

Results RQ2

          Time    Math    Lang    Tool's Average
EvoSuite  0.255   0.330   0.289   0.291
Randoop   0.168   0.262   0.246   0.225
Average   0.211   0.296   0.267

Conclusion
Knowing a priori the coverage achieved by test data generation tools might ease important decisions:
> we took the first steps, investigating well-known features
> well-known features (chosen by gut feeling) give reasonable results

Future work
Improvements:
> larger dataset
> branch-level features
> feature analysis
> differences between employed tools


