1
Large-Scale API Protocol Mining for Automated Bug Detection
Michael Pradel
Department of Computer Science
ETH Zurich
2
Motivation
LinkedList pinConnections = ...;
Iterator i = pinConnections.iterator();
while (i.hasNext()) {
    PinLink curr = (PinLink) i.next();
    if (...) {
        pinConnections.remove(curr);
    }
}
(from DaCapo benchmarks)
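The loop above removes an element through the list while an iterator over that list is still in use. With the java.util collections this typically surfaces as a ConcurrentModificationException on a later call to next(). The following is a minimal sketch of that behavior, not the benchmark code: a list of strings stands in for the PinLink objects, and the class and method names are hypothetical. It also shows one common way to avoid the problem, namely removing through the iterator itself.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorRemovalDemo {
    // Same shape as the slide: removes through the list while iterating.
    // Returns true if the iteration failed with a ConcurrentModificationException.
    static boolean removeViaListFails(List<String> items) {
        try {
            Iterator<String> i = items.iterator();
            while (i.hasNext()) {
                String curr = i.next();
                if (curr.startsWith("x")) {
                    items.remove(curr); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One possible fix: let the iterator perform the removal.
    static void removeViaIterator(List<String> items) {
        Iterator<String> i = items.iterator();
        while (i.hasNext()) {
            if (i.next().startsWith("x")) {
                i.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> a = new LinkedList<>(Arrays.asList("x1", "y", "x2", "z"));
        System.out.println("list removal failed: " + removeViaListFails(a));

        List<String> b = new LinkedList<>(Arrays.asList("x1", "y", "x2", "z"));
        removeViaIterator(b);
        System.out.println("after iterator removal: " + b);
    }
}
```

Whether the exception is actually thrown depends on which element is removed (removing the second-to-last element can go unnoticed), which is one reason such bugs are easy to miss in testing.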
3
call x before y
eventually call x
don't call x while calling y and z
4
[Diagram labels: API, Training Programs, Target Program]
5
static vs. dynamic
API-based vs. client-based
single- vs. multi-object
[Diagram labels: static, dynamic; verification, testing]
7
Program & Input → Execution trace → Subtraces → Protocols
8
List<Foo> l = new LinkedList<>();
l.add(new Foo());
Iterator<Foo> i = l.iterator();
OutputStream s = new FileOutputStream("f");
while (i.hasNext()) {
    Foo f = i.next();
    if (f.isOK())
        s.write(f.getData());
}
s.close();
Execution trace: new LinkedList → 1, 1.add(2) → 3, 1.iterator → 4, new FileOS(6) → 5, 4.hasNext, 4.next, 5.write(7), 4.hasNext, 5.close
9
Program & Input → Execution trace → Subtraces → Protocols
10
Execution trace: new LinkedList → 1, 1.add(2) → 3, 1.iterator → 4, new FileOS(6) → 5, 4.hasNext, 4.next, 5.write(7), 4.hasNext, 5.close
Calls to x
Calls to parameters
Calls to objects
Subtraces:
new LinkedList → 1, 1.add(2) → 3, 1.iterator → 4, 4.hasNext, 4.next, 4.hasNext
4.hasNext, 4.next, 4.hasNext
new FileOS(6) → 5, 5.write(7), 5.close
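The split of the execution trace into per-object subtraces can be sketched as follows. This is one possible reading of the slide, not the author's implementation: the subtrace of an object x is assumed to contain the calls on x plus the calls on objects that were passed to or returned from calls on x. The Event class, the convention that a constructor counts as a call on the created object, and the use of 0 for "no tracked result" are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SubtraceSplitter {
    /** One traced API call. Ids are the object ids used on the slide; result 0 means "none". */
    static final class Event {
        final int receiver;
        final String method;
        final int result;
        final List<Integer> args;

        Event(int receiver, String method, int result, Integer... args) {
            this.receiver = receiver;
            this.method = method;
            this.result = result;
            this.args = Arrays.asList(args);
        }

        @Override
        public String toString() {
            return receiver + "." + method;
        }
    }

    /** Maps each receiver object to its subtrace, in order of first appearance. */
    static Map<Integer, List<Event>> split(List<Event> trace) {
        // related.get(x) = objects passed to or returned from calls on x
        Map<Integer, Set<Integer>> related = new LinkedHashMap<>();
        for (Event e : trace) {
            Set<Integer> r = related.computeIfAbsent(e.receiver, k -> new HashSet<>());
            r.addAll(e.args);
            if (e.result != 0) {
                r.add(e.result);
            }
        }
        Map<Integer, List<Event>> subtraces = new LinkedHashMap<>();
        for (Integer core : related.keySet()) {
            List<Event> sub = new ArrayList<>();
            for (Event e : trace) {
                if (e.receiver == core || related.get(core).contains(e.receiver)) {
                    sub.add(e);
                }
            }
            subtraces.put(core, sub);
        }
        return subtraces;
    }

    /** The trace from the slide, with constructors modeled as calls on the new object. */
    static List<Event> slideTrace() {
        return Arrays.asList(
            new Event(1, "new LinkedList", 0),
            new Event(1, "add", 3, 2),
            new Event(1, "iterator", 4),
            new Event(5, "new FileOS", 0, 6),
            new Event(4, "hasNext", 0),
            new Event(4, "next", 0),
            new Event(5, "write", 0, 7),
            new Event(4, "hasNext", 0),
            new Event(5, "close", 0));
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, List<Event>> e : split(slideTrace()).entrySet()) {
            System.out.println("object " + e.getKey() + ": " + e.getValue());
        }
    }
}
```

Under these assumptions the slide's trace yields three subtraces: one for the list (which also covers its iterator), one for the iterator alone, and one for the output stream.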
11
Program & Input → Execution trace → Subtraces → Protocols
12
LinkedList, Iterator
Iterator
FileOS
13
Method → state
Consecutive call → transition
[Protocol diagram: new LinkedList → l, l.add, l.iterator → i, i.hasNext, i.next]
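The rule "method → state, consecutive call → transition" can be illustrated with a small sketch. It is a simplification under stated assumptions, not the mining algorithm itself: subtraces are plain lists of method labels, and the first and last call of each subtrace are treated as initial and final states, which the slide does not spell out. The class name ProtocolMiner is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ProtocolMiner {
    // transitions.get(m) = methods observed directly after m in some subtrace
    final Map<String, Set<String>> transitions = new TreeMap<>();
    final Set<String> initial = new TreeSet<>();
    final Set<String> finals = new TreeSet<>();

    /** Adds one subtrace: every method becomes a state, every consecutive pair a transition. */
    void addSubtrace(List<String> calls) {
        if (calls.isEmpty()) {
            return;
        }
        initial.add(calls.get(0));
        finals.add(calls.get(calls.size() - 1)); // assumption: last call marks a final state
        for (int k = 0; k < calls.size(); k++) {
            Set<String> successors = transitions.computeIfAbsent(calls.get(k), m -> new TreeSet<>());
            if (k + 1 < calls.size()) {
                successors.add(calls.get(k + 1));
            }
        }
    }

    /** True if the call sequence follows only mined transitions and ends in a final state. */
    boolean accepts(List<String> calls) {
        if (calls.isEmpty() || !initial.contains(calls.get(0))) {
            return false;
        }
        for (int k = 0; k + 1 < calls.size(); k++) {
            Set<String> successors = transitions.get(calls.get(k));
            if (successors == null || !successors.contains(calls.get(k + 1))) {
                return false;
            }
        }
        return finals.contains(calls.get(calls.size() - 1));
    }

    public static void main(String[] args) {
        ProtocolMiner miner = new ProtocolMiner();
        miner.addSubtrace(Arrays.asList(
            "new LinkedList", "l.add", "l.add", "l.iterator", "i.hasNext", "i.next", "i.hasNext"));
        System.out.println(miner.transitions);
    }
}
```

Adding further subtraces from other training programs only adds states and transitions, so the mined automaton grows with the amount of observed usage.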
15
[Protocol diagram: new ZipFile → f, f.entries → e, e.hasMoreElements, e.nextElement, f.close]
16
[Protocol diagram: new URL → u, u.openStream → s, s.close]
17
OK / Not OK
20+ mining approaches
Use for some task
19
[Figure: mined protocol M and reference protocol for Formatter; method labels: Formatter, format, close, flush, println, locale, out, ioException]
20
12 training programs
32 reference protocols
21
12 programs
22
Types and methods covered in mined protocols
More programs → higher coverage
23
Recall of mined protocols
More programs → higher recall
25
[Diagram labels: API, Training Programs, Target Program]
27
[Protocol diagram: new LinkedList → l, l.add, l.iterator → i, i.hasNext, i.next]
28
[Protocol diagram: new LinkedList → l, l.add, l.iterator → i, i.hasNext, i.next]
Setup phase: bind parameters
Liable phase: all parameters bound
Violation: take non-existing transition, end in non-final state
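The distinction between setup and liable states, and the two kinds of violation, can be sketched as a small checker over call sequences. This is a hypothetical encoding for illustration only: the state numbers, the choice of liable and final states, and the class ProtocolMonitor are assumptions, not taken from the slides. An unexpected call in a setup state is treated as "protocol does not apply" rather than as a violation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ProtocolMonitor {
    enum Verdict { OK, VIOLATION, NOT_APPLICABLE }

    private final Map<Integer, Map<String, Integer>> transitions = new HashMap<>();
    private final Set<Integer> liable = new HashSet<>();
    private final Set<Integer> finals = new HashSet<>();

    ProtocolMonitor transition(int from, String call, int to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(call, to);
        return this;
    }

    ProtocolMonitor liableStates(Integer... states) {
        liable.addAll(Arrays.asList(states));
        return this;
    }

    ProtocolMonitor finalStates(Integer... states) {
        finals.addAll(Arrays.asList(states));
        return this;
    }

    Verdict check(List<String> calls) {
        int state = 1; // by convention in this sketch, state 1 is the initial state
        for (String call : calls) {
            Integer next = transitions
                .getOrDefault(state, Collections.<String, Integer>emptyMap())
                .get(call);
            if (next == null) {
                // A missing transition only counts once all parameters are bound.
                return liable.contains(state) ? Verdict.VIOLATION : Verdict.NOT_APPLICABLE;
            }
            state = next;
        }
        // Ending in a liable state that is not final is the second kind of violation.
        return (liable.contains(state) && !finals.contains(state)) ? Verdict.VIOLATION : Verdict.OK;
    }

    /** Assumed encoding of the list/iterator protocol; states 3 and 4 have l and i bound. */
    static ProtocolMonitor iteratorProtocol() {
        return new ProtocolMonitor()
            .transition(1, "new LinkedList", 2)
            .transition(2, "l.iterator", 3)
            .transition(3, "i.hasNext", 4)
            .transition(4, "i.hasNext", 4)
            .transition(4, "i.next", 3)
            .liableStates(3, 4)
            .finalStates(3, 4);
    }

    /** Assumed encoding of a stream protocol in which only the closed state is final. */
    static ProtocolMonitor streamProtocol() {
        return new ProtocolMonitor()
            .transition(1, "new FileOutputStream", 2)
            .transition(2, "write", 2)
            .transition(2, "close", 3)
            .liableStates(2, 3)
            .finalStates(3);
    }

    public static void main(String[] args) {
        System.out.println(iteratorProtocol().check(
            Arrays.asList("new LinkedList", "l.iterator", "i.next")));
        System.out.println(streamProtocol().check(
            Arrays.asList("new FileOutputStream", "write")));
    }
}
```

In this encoding, calling i.next without a preceding i.hasNext is reported because the iterator is already bound, while an unexpected call right after new LinkedList is not, since the protocol is still in its setup phase.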
29
setup / liable
ambiguous → split
[Protocol diagram after splitting: new LinkedList → l, l.add, l.iterator → i, i.hasNext, i.next]
31
Runtime verification (JavaMOP)
Input
Challenge 1: Check many different execution paths
Challenge 2: Monitoring mined protocols
32
Random test generation
Call sequences that trigger an exception
Non-exceptional sequences
Input
33
[Protocol diagram: new LinkedList → l, l.iterator → i, i.hasNext, i.next]
l = new LinkedList()
i1 = l.iterator()
i2.next()
34
[Protocol diagram with fail state F: new LinkedList → l, l.iterator → i, i.hasNext, i.next; edge labels near F: i.next, i.hasNext]
Violation:
Reach fail state
End in non-final, liable state
35
Find relevant issues by monitoring
How useful is generated input?
Setup: DaCapo benchmarks, 1.6 MLOC Java
36
                          Protocol violations
Program    Test cases     Total    Relevant
avrora         15,753         5           4
batik           3,477
daytrader      32,446
eclipse           816
fop             6,536        52          50
h2              7,584        14           7
lucene          1,985
pmd             1,286
sunflow         4,300         1
tomcat         14,627         1           1
xalan          21,083         1           1
Sum           160,857        74          63
Test cases: randomly generated
Relevant: bug (exception, unexpected behavior) or code smell (performance/maintainability problem)
37
try {
    is = u.openStream();
    r = new InputStreamReader(is, "UTF
    br = new BufferedReader(r);
} finally {
    if (is != null) {
        try { is.close(); } catch (IOException ignored) {}
        is = null;
    }
    if (r != null) {
        try { r.close(); } catch (IOException ignored) {}
        r = null;
    }
    if (br == null) {
        try { br.close(); } catch (IOException ignored) {}
        br = null;
    }
}
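In the excerpt above, the last guard tests br == null, so that branch never closes a reader that was successfully created and would dereference a null one. A minimal sketch of an alternative, assuming Java 7 or later: a try-with-resources statement closes the outermost reader, which in turn closes the wrapped stream. The names TrackingStream and ReaderCleanup are hypothetical, an in-memory stream stands in for u.openStream(), and UTF-8 is an assumed charset since the original argument is cut off.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** In-memory stream that records whether close() was called (for illustration only). */
class TrackingStream extends ByteArrayInputStream {
    boolean closed;

    TrackingStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}

class ReaderCleanup {
    /** Reads the first line and closes every resource exactly once, even on failure. */
    static String firstLine(InputStream source) {
        try (InputStream is = source;
             BufferedReader br = new BufferedReader(
                 new InputStreamReader(is, StandardCharsets.UTF_8))) { // charset is an assumption
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingStream s = new TrackingStream("hello\nworld".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(s) + ", closed = " + s.closed);
    }
}
```

Resources declared in the try header are closed in reverse order of declaration, so no explicit null checks are needed.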
38
Iterator i = pinConnections.iterator();
PinLink currLink = (PinConnect.PinLink) i.next();
currLink.propagateSignals();
while (i.hasNext()) {
    currLink = (PinConnect.PinLink) i.next();
    currLink.propagateSignals();
}
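The excerpt above calls i.next() once before the first hasNext() check, which fails with a NoSuchElementException if the collection is empty. A minimal sketch of the difference, with strings standing in for the PinLink objects and a counter standing in for propagateSignals(); the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class GuardedIteration {
    // Same shape as the excerpt: the first next() is not guarded by hasNext().
    static int visitUnguarded(List<String> links) {
        int visited = 0;
        Iterator<String> i = links.iterator();
        i.next(); // throws NoSuchElementException on an empty list
        visited++;
        while (i.hasNext()) {
            i.next();
            visited++;
        }
        return visited;
    }

    // Every next() is preceded by a successful hasNext().
    static int visitGuarded(List<String> links) {
        int visited = 0;
        Iterator<String> i = links.iterator();
        while (i.hasNext()) {
            i.next();
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitGuarded(Arrays.asList("a", "b")));
        System.out.println(visitGuarded(Collections.<String>emptyList()));
        try {
            visitUnguarded(Collections.<String>emptyList());
        } catch (NoSuchElementException e) {
            System.out.println("unguarded next() failed on an empty list");
        }
    }
}
```

Both variants behave the same on non-empty input, so the problem only shows up for inputs that ordinary test runs may never exercise.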
41
Typestate checking + specification: + Precise
Anomaly detection: + Automatic
Combine both! Precise checker for mined multi-object protocols
42
Joint work with Ciera Jaspan and Jonathan Aldrich (ISR, CMU)
Fusion analysis
[Diagram labels: f() ✓, g() ✗, h() ✗; Pruning; Relationship constraints]
@Constraint( requires="..." )
43
Checking Framework Interactions with Relationships (ECOOP '09)
Reasons about interacting objects
Distinguishes setup from checking
44
void m() { .. }
Effects: Add/remove objects
Requirements: Check before call
Keep track of protocol execution (e.g., current state)
Check protocol constraints if in liable state
45
LinkedList l = new LinkedList();
Iterator i = l.iterator();
i.next();
[Protocol diagram with states 1 to 4: new LinkedList → l, l.iterator → i, i.hasNext, i.next]
l ∈ r_state2, l ∈ r_iterator
l ∈ r_state3, i ∈ r_state3, i ∈ r_hasNext, (l, i) ∈ r_protocol
46
                 Warnings
Program    Total   Bugs   Code smells   True pos.
avrora        13      9                       69%
batik          1                               0%
daytrader      -
eclipse       15      2             1         20%
fop           13      8             1         69%
h2             1                               0%
jython         7      2             1         43%
lucene        13      3             3         46%
pmd           15      2             8         67%
sunflow        -
tomcat         2                               0%
xalan          1                    1        100%
Total         81     26            15         51%
Bugs: exception or unexpected behavior
Code smells: maintainability issue
47
LinkedList pinConnections = ...;
Iterator i = pinConnections.iterator();
while (i.hasNext()) {
    PinLink curr = (PinLink) i.next();
    if (...) {
        pinConnections.remove(curr);
    }
}
48
BufferedReader in = null;
try {
    in = new BufferedReader(...);
    ...
    in.close();
} finally {
    if (in != null) {
        try { in.close(); } catch (IOException e) { ... }
    }
}
49
liable setup
Both are practical
Complement each other
50
understand large systems
pinpoint problem areas
Reveals multi-object bugs
Easy to use
53
Positive protocols vs. negative protocols
[Diagram labels: new Writer, write, close; hasNext, next]
Reaching final state
Taking non-existing transition
Not reaching final state
54
Gabel & Su, FSE 2008
Language learning algorithm with
Don't consider dataflow
Lee, Chen & Rosu, ICSE 2011
First, learn related events; then, mine
Require unit tests for first step