SLIDE 1

Bugs, au Naturale.

Premkumar Devanbu DECAL Laboratory University of California, Davis

    public class FunctionCall {

        public static void funct1() {
            System.out.println("Inside funct1");
        }

        public static void main(String[] args) {
            int val;
            System.out.println("Inside main");
            funct1();
            System.out.println("About to call funct2");
            val = funct2(8);
            System.out.println("funct2 returned a value of " + val);
            System.out.println("About to call funct2 again");
            val = funct2(-3);
            System.out.println("funct2 returned a value of " + val);
        }

        public static int funct2(int param) {
            System.out.println("Inside funct2 with param " + param);
            return param * 2;
        }
    }

ICSE 2012, “On the Naturalness of Software”

Hmmmm…. Tiger, Tiger, Burning Bright…

SLIDE 2

TIGER!! RUN!!!

Meanwhile, back in Redmond.. (or Bangalore, or Shanghai, or Sunnyvale..)

Why is your feature behind schedule?

Code, Code, Code, Code, Code…

On the Uniqueness of Code (FSE 2010)

“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do...” (Donald Knuth, “Literate Programming”)

SLIDE 3

First, Some Differences

SLIDE 4

[Figure: cross-entropy per token (axis ticks 2.5 to 10) against n-gram order (1-gram to 5-gram), for the Brown corpus and Java.]
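As a rough illustration of the quantity plotted above, the sketch below estimates cross-entropy per token with a small n-gram model. The function names `train_ngram` and `cross_entropy`, the toy token sequence, and the use of add-one smoothing are all assumptions made for brevity; the experiments behind the slide may well have used a different smoothing method and toolkit.

```python
import math
from collections import Counter

def train_ngram(tokens, n):
    """Count n-grams and their (n-1)-token contexts in a training sequence."""
    padded = ["<s>"] * (n - 1) + list(tokens)
    ngrams = Counter()
    contexts = Counter()
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        ngrams[context + (padded[i],)] += 1
        contexts[context] += 1
    vocab = set(tokens) | {"<unk>"}
    return ngrams, contexts, vocab

def cross_entropy(tokens, model, n):
    """Average negative log2 probability per token under the model."""
    ngrams, contexts, vocab = model
    # Map out-of-vocabulary tokens to a single unknown symbol.
    padded = ["<s>"] * (n - 1) + [t if t in vocab else "<unk>" for t in tokens]
    total_bits = 0.0
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        # Add-one smoothing (an assumption) keeps unseen n-grams above zero.
        p = (ngrams[context + (padded[i],)] + 1) / (contexts[context] + len(vocab))
        total_bits -= math.log2(p)
    return total_bits / len(tokens)

if __name__ == "__main__":
    # Hypothetical toy corpus: one loop header repeated many times.
    corpus = "for ( int i = 0 ; i < n ; i ++ )".split() * 20
    for order in (1, 2, 3):
        model = train_ngram(corpus, order)
        print(order, round(cross_entropy(corpus[:14], model, order), 2))
```

On repetitive input like this, higher-order models should assign fewer bits per token than the unigram model, which is the general shape the figure reports for both corpora.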

The Skeptic asks..

Is it just that C, Java, Python... are syntactically simpler than English?

SLIDE 5

Is buggy code odd?

[Figure: line entropy (axis ticks 5 to 15) for non_buggy, buggy, and fixed lines.]

[Figure: "Buggy vs. Not." Effect size (Cohen’s d, axis 0.0 to 0.7) against defective line count (axis ticks 17 to 100). Plotted values: 0.61, 0.53, 0.43, 0.36, 0.29, 0.25, 0.18, 0.18.]
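For readers unfamiliar with the effect-size measure named above, here is a minimal sketch of Cohen's d using the pooled standard deviation. The function name `cohens_d` and the entropy values in the usage lines are hypothetical and are not data from the talk.

```python
import math
from statistics import mean, variance

def cohens_d(sample_a, sample_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_var)

if __name__ == "__main__":
    # Hypothetical per-line entropies, for illustration only.
    buggy = [6.1, 7.4, 5.9, 8.2, 6.8]
    non_buggy = [5.0, 6.2, 4.8, 6.9, 5.5]
    print(round(cohens_d(buggy, non_buggy), 2))
```

A positive value here means the first sample (buggy lines) has the higher mean entropy, expressed in units of the pooled standard deviation.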

SLIDE 6

[Figure: percent of total bugs (axis ticks 22.5 to 90) against defective line count (axis ticks 17 to 100). Plotted values: 15, 24, 35, 44, 56, 60, 72, 81.]

[Figure: line entropy (axis ticks 5 to 15) for low duration, medium duration, and high duration.]

Does it work? How to tell?

  • Problem: Line-level!
  • Cost-sensitive measures more suitable.
  • Comparable to static analysis warnings!

…measured using a cost-effectiveness curve.
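One way to picture a cost-effectiveness curve at line level: rank lines from highest to lowest entropy, inspect them in that order, and record what fraction of the buggy lines has been seen after each step. The sketch below assumes that setup; the names `cost_effectiveness_curve` and `bugs_found_at_budget` and the sample data are hypothetical, and the talk's exact measurement procedure may differ.

```python
def cost_effectiveness_curve(lines):
    """Build (fraction of lines inspected, fraction of bugs found) points.

    lines: list of (entropy, is_buggy) pairs, one per source line.
    Lines are inspected from highest entropy to lowest.
    """
    ranked = sorted(lines, key=lambda pair: pair[0], reverse=True)
    total_bugs = sum(1 for _, buggy in ranked if buggy)
    curve = []
    found = 0
    for inspected, (_, buggy) in enumerate(ranked, start=1):
        if buggy:
            found += 1
        bug_fraction = found / total_bugs if total_bugs else 0.0
        curve.append((inspected / len(ranked), bug_fraction))
    return curve

def bugs_found_at_budget(curve, budget):
    """Fraction of bugs found after inspecting at most `budget` of the lines."""
    best = 0.0
    for inspected, found in curve:
        if inspected <= budget:
            best = found
    return best

if __name__ == "__main__":
    # Hypothetical (entropy, is_buggy) data, for illustration only.
    sample = [(9.0, True), (8.0, False), (7.0, True), (2.0, False), (1.0, False)]
    curve = cost_effectiveness_curve(sample)
    print(bugs_found_at_budget(curve, 0.25), bugs_found_at_budget(curve, 1.0))
```

Reading the curve at a fixed inspection budget is what allows two ranking methods to be compared at the same cost.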

Findings Summary

  • More cost-effective than logistic regression at 5% inspection budget, but not at 20%.
  • Cost-effectiveness similar to FindBugs and PMD.
  • Entropy-based ordering improves cost-effectiveness of PMD & FindBugs
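The last finding can be pictured as re-sorting a static-analysis tool's warnings so that those on high-entropy lines are reviewed first. The sketch below is only an assumption about how such an ordering might be wired up: the function name `order_warnings_by_entropy`, the `(file, line, message)` tuple layout, and the sample values are hypothetical and do not reflect the actual output format of PMD or FindBugs.

```python
def order_warnings_by_entropy(warnings, line_entropy):
    """Return warnings sorted so those on high-entropy lines come first.

    warnings: list of (file, line_number, message) tuples.
    line_entropy: dict mapping (file, line_number) to an entropy score.
    Lines without a score are treated as entropy 0.0 (an assumption).
    """
    return sorted(
        warnings,
        key=lambda w: line_entropy.get((w[0], w[1]), 0.0),
        reverse=True,
    )

if __name__ == "__main__":
    # Hypothetical warnings and entropy scores, for illustration only.
    warnings = [
        ("A.java", 10, "unused variable"),
        ("A.java", 42, "possible null dereference"),
        ("B.java", 7, "empty catch block"),
    ]
    entropy = {("A.java", 10): 3.1, ("A.java", 42): 8.7, ("B.java", 7): 5.4}
    for warning in order_warnings_by_entropy(warnings, entropy):
        print(warning)
```

Because Python's sort is stable, warnings with equal entropy keep the order in which the tool reported them.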