
“Automatically Assessing Code Understandability” Reanalyzed: Combined Metrics Matter
Asher Trockman, Keenen Cates, Mark Mozina, Tuan Nguyen, Christian Kästner, Bogdan Vasilescu

Automatically Assessing Code Understandability: How far are we?
Simone Scalabrino, Gabriele Bavota, Christopher Vendome, Mario Linares-Vásquez, Denys Poshyvanyk, Rocco Oliveto
• Motivation: Understandability…
  1. is crucial for maintenance
  2. could predict defects
• Understandability metric: extremely useful

Automatically Assessing Code Understandability: How far are we?
Simone Scalabrino, Gabriele Bavota, Christopher Vendome, Mario Linares-Vásquez, Denys Poshyvanyk, Rocco Oliveto
• 46 developers quizzed on 8 Java snippets
• Recorded 121 code-related metrics for the snippets
• n = 324 observations, p = 121 features

Original study: Individual correlations only
Understandability vs. 121 Metrics
• All correlations less than 16%.
(from “Automatically Assessing Code Understandability”, Scalabrino et al. (2017))

Our reanalysis: Combined metrics
Logistic models
• Improvement: multiple regression models
  • (Understandability ~ Combination of metrics + ε)
• Public data set: Thank you, Scalabrino et al.!
• Caveat: High dimensionality (121 metrics)
• Solution: Automatic variable selection
  • e.g., forward stepwise selection and LASSO
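Forward stepwise selection, as named on this slide, can be sketched roughly as follows. This is not the authors' code: the plain gradient-ascent fit, the function names, the stopping threshold `min_gain`, and the three-column toy data are all assumptions made for illustration. The idea is to start from an intercept-only logistic model and repeatedly add whichever remaining metric improves the log-likelihood most.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, y, cols, steps=500, lr=0.5):
    """Fit an intercept plus one weight per chosen column by gradient ascent."""
    w = [0.0] * (len(cols) + 1)  # w[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, target in zip(rows, y):
            x = [1.0] + [row[c] for c in cols]
            err = target - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w

def log_likelihood(rows, y, cols, w):
    total = 0.0
    for row, target in zip(rows, y):
        x = [1.0] + [row[c] for c in cols]
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return total

def forward_stepwise(rows, y, max_features=2, min_gain=1.0):
    """Greedily add the column that raises the log-likelihood the most."""
    selected = []
    remaining = list(range(len(rows[0])))
    best_ll = log_likelihood(rows, y, [], fit_logistic(rows, y, []))
    while remaining and len(selected) < max_features:
        scored = []
        for c in remaining:
            cols = selected + [c]
            w = fit_logistic(rows, y, cols)
            scored.append((log_likelihood(rows, y, cols, w), c))
        ll, c = max(scored)
        if ll - best_ll < min_gain:  # hypothetical stopping rule
            break
        selected.append(c)
        remaining.remove(c)
        best_ll = ll
    return selected

# Hypothetical toy data: column 0 carries the signal, columns 1 and 2 are noise.
rows = [[(i % 4) / 3.0, ((2 * i) % 5) / 4.0, ((3 * i) % 7) / 6.0] for i in range(40)]
y = [1 if i % 4 >= 2 else 0 for i in range(40)]
y[0], y[7] = 1, 0  # two flipped labels so the classes are not perfectly separable

selected = forward_stepwise(rows, y)
print(selected)
```

With 121 candidate metrics and 324 observations, a real analysis would more likely rely on an established statistics package and a criterion such as AIC; the greedy loop above only shows the shape of the procedure.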

1. Forward-Stepwise-Selected Understandability Classifier
What explains understandability?
1. Developer Experience
If a developer has 5 or more years of programming experience, their odds of understanding increase by 200% on average.

1. Forward-Stepwise-Selected Understandability Classifier
What explains understandability?
2. Maximum Line Length
Increasing the maximum line length by one character decreases the odds of understanding by 2%.
Takeaway: keep lines short.

1. Forward-Stepwise-Selected Understandability Classifier
What explains understandability?
3. Narrow Meaning Identifiers [1]
Increasing NMI, a measure of descriptiveness of variable names, by one unit increases the odds of understanding by 80%.
Takeaway: use specific variable names.
[1] “Automatically Assessing Code Understandability”, Scalabrino et al. (2017)
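The effect sizes on these three slides are percentage changes in odds, which compound multiplicatively, not additively. A minimal sketch of that arithmetic follows. The slides do not show the fitted coefficients, so the ones below are back-computed from the stated percentages, and the helper name is hypothetical.

```python
import math

def percent_change_in_odds(beta, delta=1.0):
    """Percent change in odds when a predictor moves by `delta` units."""
    return (math.exp(beta * delta) - 1.0) * 100.0

# Assumed coefficients, back-computed from the percentages on the slides.
beta_experience = math.log(3.0)     # +200% odds (5+ years of experience)
beta_max_line_len = math.log(0.98)  # -2% odds per extra character
beta_nmi = math.log(1.8)            # +80% odds per unit of NMI

one_char = percent_change_in_odds(beta_max_line_len)
twenty_chars = percent_change_in_odds(beta_max_line_len, delta=20)
print(round(one_char, 1), round(twenty_chars, 1))
```

Under these assumptions, a longest line that is 20 characters longer lowers the odds of understanding by roughly 33%, not 40%, because each character multiplies the odds by 0.98.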

1. Forward-Stepwise-Selected Understandability Classifier
What explains understandability?
By combining metrics on developer experience, code readability, and more…
Pseudo-R² = 41%
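The slide does not say which pseudo-R² variant is reported. One common choice for logistic models is McFadden's, which compares the fitted model's log-likelihood with that of an intercept-only model. A small sketch under that assumption, with made-up log-likelihood values that are not taken from the study:

```python
def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's pseudo-R²: 1 - LL(fitted model) / LL(intercept-only model)."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods, for illustration only.
r2 = mcfadden_pseudo_r2(ll_model=-60.0, ll_null=-100.0)
print(round(r2, 2))
```

A value of 0 means the metrics add nothing over the intercept, and values closer to 1 mean the model explains the outcomes much better than the baseline.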

Can we predict understandability?
• Binary classifier (Logistic)
  • Understood or not
• Random cross validation
• Avg. AUC: 0.64
  • i.e., ranks an easy-to-understand snippet above a hard-to-understand one 64% of the time
[Figure: Avg. ROC curve with 95 percentile band]
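The ranking interpretation of AUC given above can be computed directly: take every (understood, not understood) pair and count how often the understood snippet receives the higher predicted score. A minimal sketch with hypothetical scores and labels, not data from the study:

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 = understood, 0 = not understood.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7]
labels = [1, 1, 1, 0, 0, 0]
result = auc(scores, labels)  # 7 of the 9 pairs are ordered correctly
print(round(result, 3))
```

An AUC of 0.5 corresponds to random ranking, so the reported 0.64 is better than chance but far from a perfect ordering.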

Original Study: Correlations with individual metrics…
• Can we measure understandability? NO (Not with existing individual metrics.)
Our Reanalysis: Linear models with combined metrics…
• Can we measure understandability? YES (With more data.)

Creating a Metric of Code Understandability:
Now:
• 46 developers
• Small dataset
• Simple models
• ~64% accuracy
Future Work:
• 1000 developers
• Big data
• Advanced models
• Useful in real world
Thanks, Scalabrino et al.!
