SLIDE 1

Type-Directed Completion of Partial Expressions

Daniel Perelman† Sumit Gulwani‡ Thomas Ball‡ Dan Grossman†

†University of Washington ‡Microsoft Research Redmond

June 12, 2012

SLIDE 2

I want to shrink an image...

Document image = ...;
Size newSize = ...;
image.Shrink(newSize)

SLIDE 3

I want to shrink an image...

Document image = ...;
Size newSize = ...;
image.

SLIDE 7

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.

SLIDE 11

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Document.

SLIDE 14

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Data.

SLIDE 15

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Actions.

SLIDE 19

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Actions.ResizeAction.

SLIDE 20

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Actions.

SLIDE 21

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Actions.CanvasSizeAction.

SLIDE 22

I want to shrink an image...

Document image = ...;
Size newSize = ...;
PaintDotNet.Actions.CanvasSizeAction.ResizeDocument(
    /* PaintDotNet.Document image */,
    /* System.Drawing.Size size */,
    /* PaintDotNet.AnchorEdge edge */,
    /* PaintDotNet.ColorBgra bgColor */);

SLIDE 23

Programmer thought process

◮ I have a Document and a Size
◮ I want to shrink the Document
◮ There must be a method

SLIDE 24

Programmer thought process

◮ I have a Document and a Size
◮ I want to shrink the Document
◮ There must be a method
◮ Current code completion
    ◮ Left-to-right
    ◮ Complete, alphabetical list of just the next token
    ◮ Very limited filtering

SLIDE 25

Proposed workflow

Document image = ...;
Size newSize = ...;
var newImage = ?({image, newSize})

SLIDE 27

Programmer thought process

◮ I have a Document and a Size
◮ I want to shrink the Document
◮ There must be a method
◮ Query should contain what the programmer knows
    ◮ Some values and types the expression should involve
    ◮ Loose syntactic structure
◮ Query shouldn’t require what the programmer doesn’t know
    ◮ Names
    ◮ Argument order
    ◮ Other arguments
◮ Show “best” results first
◮ Similar in spirit to Prospector [Mandelin et al., PLDI’05]

SLIDE 28

Overview

◮ Expression of API queries as partial expressions
◮ Algorithm to generate results quickly in ranked order
◮ Experiment showing simple queries represent real code well

SLIDE 29

Unknown method queries

◮ Ex. ?({image, size})
    ◮ ⇒ PaintDotNet.Actions.CanvasSizeAction.ResizeDocument(img, size, ⋄, ⋄)
    ◮ ⇒ PaintDotNet.Functional.Func.Bind(⋄, size, img)
    ◮ ⇒ PaintDotNet.Pair.Create(size, img)
    ◮ ⇒ PaintDotNet.Quadruple.Create(size, img, ⋄, ⋄)
    ◮ ⇒ PaintDotNet.Triple.Create(size, img, ⋄)
    ◮ ⇒ PaintDotNet.PropertySystem.StaticListChoiceProperty.CreateForEnum(img, size, ⋄)
    ◮ ⇒ System.Drawing.Size.Equals(size, img)
    ◮ ⇒ System.Object.ReferenceEquals(size, img)
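An unknown method query can be read as: find every method whose parameter list has room for all the given arguments, in any order, and mark the leftover parameters with ⋄. The Python sketch below illustrates that reading only. The `METHODS` and `SUPERTYPES` tables are tiny hypothetical stand-ins (the `ResizeDocument` parameter types follow the signature shown earlier in the walkthrough, the rest are assumptions), and it does no ranking.

```python
from itertools import permutations

# Hypothetical stand-in for an API: method name -> parameter types.
METHODS = {
    "CanvasSizeAction.ResizeDocument": ["Document", "Size", "AnchorEdge", "ColorBgra"],
    "Pair.Create": ["Object", "Object"],       # assumed to accept any two objects
    "Hypothetical.Describe": ["Document"],     # invented; too few slots for two arguments
}

# Hypothetical supertype table: type -> immediate supertypes.
SUPERTYPES = {
    "Document": ["Object"], "Size": ["Object"],
    "AnchorEdge": ["Object"], "ColorBgra": ["Object"], "Object": [],
}


def is_subtype(t, target):
    """True if a value of type t can be passed where target is expected."""
    return t == target or any(is_subtype(p, target) for p in SUPERTYPES[t])


def complete(query, methods=METHODS):
    """Yield calls that place every query argument in some compatible slot.

    query maps a variable name to its static type; unused slots become '⋄'.
    """
    names = list(query)
    for method, params in methods.items():
        if len(params) < len(names):
            continue  # not enough parameters to hold every known argument
        for slots in permutations(range(len(params)), len(names)):
            if all(is_subtype(query[n], params[s]) for n, s in zip(names, slots)):
                args = ["⋄"] * len(params)
                for n, s in zip(names, slots):
                    args[s] = n
                yield f"{method}({', '.join(args)})"


results = list(complete({"image": "Document", "size": "Size"}))
```

Trying every assignment of arguments to slots is exponential in general; it is used here only because it keeps the "argument order is unknown" idea visible in a few lines.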

SLIDE 30

Unknown lookup queries

◮ Ex. float f = pointPair.*
    ◮ ⇒ pointPair.P1.X
    ◮ ⇒ pointPair.P1.Y
    ◮ ⇒ pointPair.P2.X
    ◮ ⇒ pointPair.P2.Y
    ◮ ⇒ pointPair.Midpoint.X
    ◮ ⇒ pointPair.Midpoint.Y
    ◮ ⇒ pointPair.FirstValidValue().X
    ◮ ⇒ pointPair.Length

SLIDE 31

Unknown expression queries

◮ Ex. XmlReader xr = ?
    ◮ ⇒ System.Xml.XmlReader.Create(⋄)
    ◮ ⇒ new System.Xml.XmlNodeReader(⋄)
    ◮ ⇒ System.Data.SqlTypes.SqlXml.Null.CreateReader()
    ◮ ⇒ new System.Xml.XmlNodeReader(⋄).ReadSubtree()
    ◮ ⇒ new System.Xml.XmlValidatingReader(⋄).Reader
    ◮ ⇒ Microsoft.SqlServer.Server.SqlContext.TriggerContext.EventData.CreateReader()
    ◮ ⇒ new System.Xml.XmlValidatingReader(⋄).Reader.ReadSubtree()

SLIDE 32

Partial expression language

(a) e    ::= call | varName | e.fieldName | e := e | e < e
    call ::= methodName(e1, ..., en)

(b) ẽ    ::= ã | ? | ⋄
    ã    ::= e | ã.* | c̃all | ẽ := ẽ | ẽ < ẽ
    c̃all ::= ?({ẽ1, ..., ẽn}) | methodName(ẽ1, ..., ẽn)

SLIDE 33

Partial expression language

(a) e    ::= call | varName | e.fieldName | e := e | e < e
    call ::= methodName(e1, ..., en)

(b) ẽ    ::= ã | ? | ⋄
    ã    ::= e | ã.* | c̃all | ẽ := ẽ | ẽ < ẽ
    c̃all ::= ?({ẽ1, ..., ẽn}) | methodName(ẽ1, ..., ẽn)

◮ Ex. ?({strBuilder.*, e.*})
    ⇒ ?({strBuilder, e.StackTrace})
    ⇒ strBuilder.Append(e.StackTrace)
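The grammar can be made concrete as a small syntax tree. The sketch below models only part of (b): the unknown expression `?`, the don't-care argument `⋄`, the lookup marker written here as `.*`, and the unknown-method call `?({...})`. The class names are hypothetical and not taken from the authors' tool; assignment, comparison, and known-method calls are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str  # a complete expression: a plain variable


@dataclass(frozen=True)
class Hole:
    pass  # "?": an entirely unknown expression


@dataclass(frozen=True)
class DontCare:
    pass  # "⋄": an argument the query leaves unfilled


@dataclass(frozen=True)
class Lookups:
    base: "Partial"  # "a.*": zero or more lookups or calls on base


@dataclass(frozen=True)
class UnknownCall:
    args: Tuple["Partial", ...]  # "?({...})": unknown method, unordered arguments


Partial = Union[Var, Hole, DontCare, Lookups, UnknownCall]


def show(p: Partial) -> str:
    """Render a partial expression as text."""
    if isinstance(p, Var):
        return p.name
    if isinstance(p, Hole):
        return "?"
    if isinstance(p, DontCare):
        return "⋄"
    if isinstance(p, Lookups):
        return show(p.base) + ".*"
    return "?({" + ", ".join(show(a) for a in p.args) + "})"


# The example query: an unknown method applied to something reachable
# from strBuilder and something reachable from e.
query = UnknownCall((Lookups(Var("strBuilder")), Lookups(Var("e"))))
```

Here `show(query)` gives `?({strBuilder.*, e.*})`.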

SLIDE 34

Algorithm

◮ Problem: given query, generate completions

SLIDE 35

Method index by parameter type

◮ Object (2210 methods): Equals, GetHashCode, Registry.SetValue, Array.IndexOf, IList.Add, Console.WriteLine, ...
◮ ICloneable (2211 methods): Clone
◮ IList (2257 methods): Add, Remove, ...
◮ ArrayList (2299 methods): BinarySearch, Reverse, ...

SLIDE 36

Infinite results

◮ Problem: too many results
    ◮ Inefficient to generate thousands of results to show only 20 to the programmer
    ◮ Programmer does not want to look at every result
    ◮ Result set is often infinite
◮ Ex. var res = foo.*;
    ◮ ⇒ foo
    ◮ ⇒ foo.GetType()
    ◮ ⇒ foo.GetType().GetType()
    ◮ ⇒ foo.GetType().GetType().GetType()
    ◮ ⇒ foo.GetType().GetType().GetType().GetType()
    ◮ ⇒ . . .
◮ Solution: generate in ranked order
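Generating in ranked order means the top few results of an infinite set can be produced without enumerating the rest. A minimal sketch of that idea is a best-first search driven by a priority queue. The member table is hypothetical, and the cost is simply the number of lookups or calls added, which stands in for the real ranking function.

```python
import heapq
from itertools import islice

# Hypothetical members: type -> list of (text appended, result type).
MEMBERS = {
    "Foo": [(".Name", "String"), (".GetType()", "Type")],
    "String": [(".Length", "Int32"), (".GetType()", "Type")],
    "Int32": [(".GetType()", "Type")],
    "Type": [(".Name", "String"), (".GetType()", "Type")],
}


def lookups(root, root_type):
    """Lazily yield (cost, expression) for chains of lookups on root.

    Results come out in nondecreasing cost order, and the stream never ends
    because every type here has at least one member.
    """
    frontier = [(0, root, root_type)]
    while frontier:
        cost, expr, typ = heapq.heappop(frontier)
        yield cost, expr
        for suffix, result_type in MEMBERS[typ]:
            heapq.heappush(frontier, (cost + 1, expr + suffix, result_type))


# Only the first five results are ever computed.
top5 = list(islice(lookups("foo", "Foo"), 5))
```

Because the generator is lazy, asking for 20 results costs roughly 20 heap pops, regardless of how many completions exist.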

SLIDE 37

Algorithm

◮ Simple structurally recursive algorithm
◮ Group by type to minimize redundant work
◮ Generate results in ranking order
    ◮ Allows determination of top n without computing all results

SLIDE 38

Heuristics: Type distance

[Diagram: type hierarchy with nodes Object, Shape, Rectangle, and IDrawingElement; edges labeled with distances 2, 1, and 2]
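A plausible reading of type distance is the number of steps from an argument's type up the hierarchy to the parameter's type. The sketch below computes that with a breadth-first search. The hierarchy is an assumption chosen so that the distances from Rectangle come out as 1 and 2, like the labels in the diagram; the diagram's actual edges are not recoverable from the text.

```python
from collections import deque

# Assumed hierarchy (type -> immediate supertypes), for illustration only.
SUPERTYPES = {
    "Rectangle": ["Shape"],
    "Shape": ["Object", "IDrawingElement"],
    "IDrawingElement": [],
    "Object": [],
}


def type_distance(source, target):
    """Fewest supertype steps from source up to target, or None if unrelated."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        typ, dist = queue.popleft()
        if typ == target:
            return dist
        for parent in SUPERTYPES[typ]:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    return None  # target is not a supertype of source
```

Under this assumed hierarchy, passing a Rectangle where a Shape is expected costs 1, while passing it where an Object or IDrawingElement is expected costs 2, so the more specific match would rank higher.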

SLIDE 40

Heuristics: Length

◮ Number of field/property lookups or method calls added

SLIDE 41

Heuristics: Length

◮ Number of field/property lookups or method calls added
◮ ?({strBuilder.*, e.*})
    Good (1): ⇒ strBuilder.Append(e.StackTrace)
    Bad (3): ⇒ strBuilder.Clear().Append(e.Data.Count)
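The two scores can be reproduced by counting, for each query argument, how many lookups or calls were appended to its root. The sketch below does this by counting dots in the appended text, which is a deliberate simplification: it assumes the completed argument starts with the root and contains no nested expressions with dots of their own. The function names are hypothetical.

```python
def added_steps(root, completed):
    """Count the lookups or calls appended to root to obtain completed."""
    if not completed.startswith(root):
        raise ValueError("completion must extend the query's root expression")
    suffix = completed[len(root):]
    return suffix.count(".")  # each ".Member" or ".Method()" adds one step


def length_score(pairs):
    """Sum the added steps over (root, completed) pairs for one result."""
    return sum(added_steps(root, completed) for root, completed in pairs)


# strBuilder.Append(e.StackTrace): strBuilder unchanged, e gains one lookup.
good = length_score([("strBuilder", "strBuilder"), ("e", "e.StackTrace")])

# strBuilder.Clear().Append(e.Data.Count): one call plus two lookups.
bad = length_score([("strBuilder", "strBuilder.Clear()"), ("e", "e.Data.Count")])
```

This yields 1 for the good completion and 3 for the bad one. The unknown method itself (`Append`) is not counted, since every result of the query has to supply one.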

SLIDE 42

Heuristics: Inferred abstract types

Example usages elsewhere in codebase:
string f = Path.GetTempFileName(); ...; File.Delete(f);
File.Delete(Path.Combine(dir, filename));
if(File.Exists(Path.Combine(otherDir, file))) {...}

Query:
string p = Path.GetTempFileName();
?({p})
    ⇒ GetCursor(p)
    ⇒ File.Delete(p)
    ⇒ File.Exists(p)

SLIDE 43

Ranking function

◮ Linear combination of these and other heuristics
◮ Sensitivity analysis showed these are most important and coefficients do not matter much
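A linear combination of heuristics can be sketched as a weighted sum of per-result penalties, with results sorted by that sum. Everything concrete below is an assumption for illustration: the feature names, the equal weights, the feature values, and the convention that a lower score is better. None of it is taken from the authors' implementation.

```python
# Hypothetical weights; all set to 1.0 because the talk reports that the
# exact coefficients matter little.
WEIGHTS = {"type_distance": 1.0, "length": 1.0, "abstract_type_mismatch": 1.0}


def score(features, weights=WEIGHTS):
    """Weighted sum of heuristic penalties; lower is assumed to be better."""
    return sum(weights[name] * value for name, value in features.items())


# Made-up feature values for two candidate completions of one query.
candidates = {
    "strBuilder.Append(e.StackTrace)": {
        "type_distance": 1, "length": 1, "abstract_type_mismatch": 0,
    },
    "strBuilder.Clear().Append(e.Data.Count)": {
        "type_distance": 1, "length": 3, "abstract_type_mismatch": 0,
    },
}

ranked = sorted(candidates, key=lambda c: score(candidates[c]))
```

With these values the shorter completion scores 2.0 and the longer one 4.0, so the shorter one is listed first.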

SLIDE 44

Outline

◮ Motivation
◮ Approach
◮ Language
◮ Algorithm
◮ Ranking
◮ Experiment
◮ Results
◮ Related work
◮ Conclusion

SLIDE 45

Experiment

◮ Automated test of expressiveness of partial expressions
◮ Generated queries for each call and looked at rank of actual call in query results
◮ Advantage: able to do many queries
◮ Disadvantage: many of the method calls are not ones a programmer would need API discovery for

SLIDE 46

Experiment

◮ Used Microsoft CCI to disassemble mature C# projects
◮ Converted every call with at least 3 arguments (including receiver) to a query with 1 or 2 arguments (including receiver)
◮ For ResizeDocument(document, size, anchorEdge, background) 16 queries would be generated:
    ⇒ ?(document)
    ⇒ ?(size)
    ⇒ ?(anchorEdge)
    ⇒ ?(background)
    ⇒ ?(document, size)
    ⇒ ?(document, background)
    ⇒ . . .
◮ Report rank for best-performing query for each call
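The reporting step can be sketched as two small functions: take the best rank over all queries generated for a call, then compute the proportion of calls whose best rank falls within a cutoff. The ranks below are made up purely to exercise the code. The sketch counts a call when its rank is at most x, which is an assumption about the inequality intended on the plots' axis.

```python
def best_rank(ranks_per_query):
    """Smallest rank of the real call across the queries generated for it.

    Each entry is the 1-based rank of the real call in one query's results,
    or None if that query did not return it.
    """
    found = [r for r in ranks_per_query if r is not None]
    return min(found) if found else None


def cdf(best_ranks, max_rank):
    """Proportion of calls whose best rank is at most x, for x = 1..max_rank."""
    total = len(best_ranks)
    return [
        sum(1 for r in best_ranks if r is not None and r <= x) / total
        for x in range(1, max_rank + 1)
    ]


# Made-up ranks for four calls; each inner list is one call's queries.
calls = [[3, 1, None], [12, 7], [None, None], [2]]
best = [best_rank(c) for c in calls]
curve = cdf(best, 10)
```

Calls that no query recovers stay in the denominator, so the curve can level off below 1.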

SLIDE 47

Projects used

◮ Paint.NET image editor
◮ Windows Installer XML library
◮ Gnome Do program launcher
◮ Banshee music player
◮ .NET core libraries
◮ Family.Show (WPF example application)
◮ LiveGeometry geometry visualizer
◮ Scale: .NET contains 280,000 methods in 30,000 types
◮ Analyzed 21,176 method calls in these applications

SLIDE 48

CDF of rank for best method query

[Plot: cumulative distribution of the rank of the correct answer. x-axis: "Rank of correct answer is < x" (ticks at 10, 20, 30); y-axis: "Proportion of analyzed calls" (0.1 to 1). Series: Partial expressions, ?({foo, bar}); Code completion, baz.]

SLIDE 49

CDF of rank for best method query (correct is static)

[Plot: cumulative distribution of the rank of the correct answer. x-axis: "Rank of correct answer is < x" (ticks at 10, 20, 30); y-axis: "Proportion of analyzed static calls" (0.1 to 1). Series: Partial expressions, ?({foo, bar}); Code completion, NS.Baz.]

SLIDE 50

CDF of rank for best method query

[Plot: cumulative distribution of the rank of the correct answer. x-axis: "Rank of correct answer is < x" (ticks at 10, 20, 30); y-axis: "Proportion of analyzed calls" (0.1 to 1). Series: Using two arguments, ?({foo, bar}); Using one argument, ?({foo}); Code completion, baz.]

SLIDE 51

Other experiments

◮ Time: unknown method queries take under 0.1 second
◮ Ran similar experiments on other partial expression templates
◮ Similar results: one argument or one lookup could be predicted within the top 10 about 80% of the time

SLIDE 52

Related work

◮ Lots of other work on API discovery discussed in paper

SLIDE 53

Related work

◮ Lots of other work on API discovery discussed in paper
◮ Prospector (for Java) [Mandelin et al., PLDI’05]
    ◮ Input is target type
    ◮ Similar to XmlReader xr = ? query
    ◮ Uses mined expressions which convert from one type to another
    ◮ Output is chain of mined expressions starting with some local
    ◮ Advantage: able to synthesize larger expressions
    ◮ Disadvantage: queries only specify a single input type and a single output type

SLIDE 54

Contributions

◮ Expressed API searches in terms of partial expressions
◮ Leveraged rich type structure to reduce information needed for queries
◮ Automated experiments across large codebases show small partial expressions often match real method calls
◮ Created Visual Studio plugin
    ◮ https://pec.codeplex.com/