Fast Synthesis of Fast Collections
Calvin Loncaric Emina Torlak Michael D. Ernst University of Washington
Data structures are everywhere
Lists, maps, and sets solve many problems. What if I need a custom data structure?
Cozy
[Figure: Cozy overview. Labels: Specification, Inductive Synthesizer, Verifier, Outline, Rep. (×3), Impl. (×3)]
… implementations, synthesized in < 90 seconds
[Figure: Request 1 and Request 2 plotted along a time axis]
Goal: efficient retrieval of entries for a particular request ID in a particular timespan
class AnalyticsLog {
    // Insert an entry into the data structure
    void log(Entry e)
    // Retrieve entries
    Iterator<Entry> getEntries(
        int queryId, int subqueryId, int fragmentId,
        long start, long end)
}
Specification:
    Entry has:
        queryId : Int, subqueryId : Int, fragmentId : Int,
        start, end : Long, …
    getEntries: all e where
        e.queryId = queryId and
        e.subqueryId = subqueryId and
        e.fragmentId = fragmentId and
        e.end >= start and e.start <= end
Specification:
    Entry has: field1 : Type1, field2 : Type2, …
    retrieveA: all e where condition
    retrieveB: all e where condition
Cozy generates:
class Structure {
    void add(Entry e)
    void remove(Entry e)
    void update(Entry e, …)
    Iterator<Entry> retrieveA(…)
    Iterator<Entry> retrieveB(…)
}
List<Entry> data;
Iterator<Entry> retrieve(input) {
    for e in data:
        if P(e, input):
            yield e
}
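To make the trivial solution concrete, here is a minimal Java sketch of the linear scan applied to the AnalyticsLog specification. The class names `ScanEntry` and `LinearScanLog` are hypothetical, the entry carries only the fields named in the specification, and `getEntries` returns a `List` rather than an `Iterator` for brevity. It is a baseline for comparison, not something Cozy is claimed to produce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry type: only the fields named in the specification.
class ScanEntry {
    final int queryId, subqueryId, fragmentId;
    final long start, end;

    ScanEntry(int queryId, int subqueryId, int fragmentId, long start, long end) {
        this.queryId = queryId;
        this.subqueryId = subqueryId;
        this.fragmentId = fragmentId;
        this.start = start;
        this.end = end;
    }
}

// Baseline: keep every entry in one list and test the predicate on each one.
class LinearScanLog {
    private final List<ScanEntry> data = new ArrayList<>();

    void log(ScanEntry e) {
        data.add(e);
    }

    // O(n) per query: every stored entry is examined.
    List<ScanEntry> getEntries(int queryId, int subqueryId, int fragmentId,
                               long start, long end) {
        List<ScanEntry> out = new ArrayList<>();
        for (ScanEntry e : data) {
            boolean sameRequest = e.queryId == queryId
                    && e.subqueryId == subqueryId
                    && e.fragmentId == fragmentId;
            // Two timespans overlap when each one starts before the other ends.
            boolean overlaps = e.end >= start && e.start <= end;
            if (sameRequest && overlaps) {
                out.add(e);
            }
        }
        return out;
    }
}
```

Every query touches every entry, which is the cost a synthesized structure is meant to avoid.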
In the quest for a good solution, the search space of “all possible programs” is simply too large
Specification → Implementation: Intractable
Synthesis algorithm: Specification → Outline, then Outline → Implementation
Outline:
    specific enough to describe asymptotic performance
    general enough to encode a data structure succinctly
Specification:
    Entry has: field1, field2, …
    retrieveA: all e where condition
    retrieveB: all e where condition
Implementation:
    void add(Entry e)
    void remove(Entry e)
    void update(Entry e, …)
    Iterator retrieveA(…)
    Iterator retrieveB(…)
Tractable ? Tractable
Plans for retrieving entries
class Structure {
    T data;
    Iterator<Entry> retrieve(q) { … }
}
class Structure {
    HMap<K,V> data;
    Iterator<Entry> retrieve(q) { … }
}
class Structure {
    HMap<int,V> data;
    Iterator<Entry> retrieve(q) { … }
}
V = ArrayList<Entry>
V = LinkedList<Entry>
class Structure {
    HMap<int,V> data;
    Iterator<Entry> retrieve(q) {
        v = data.get(q);
        return v.iterator();
    }
}
add, remove, update
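The final representation above can be written out as ordinary Java. This is a minimal sketch assuming the choice V = ArrayList<Entry> and a key of `queryId`. The names `KeyedEntry` and `HashIndexedStructure` are hypothetical, `update` is left out, and the null check in `retrieve` is added here because `data.get(q)` returns null for an unseen key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical entry type with a single indexed field.
class KeyedEntry {
    final int queryId;
    final String payload;

    KeyedEntry(int queryId, String payload) {
        this.queryId = queryId;
        this.payload = payload;
    }
}

// Sketch of the HMap<int, ArrayList<Entry>> representation.
class HashIndexedStructure {
    private final Map<Integer, List<KeyedEntry>> data = new HashMap<>();

    void add(KeyedEntry e) {
        // Create the bucket for this key on first use.
        data.computeIfAbsent(e.queryId, k -> new ArrayList<>()).add(e);
    }

    void remove(KeyedEntry e) {
        List<KeyedEntry> bucket = data.get(e.queryId);
        if (bucket == null) {
            return;
        }
        bucket.remove(e); // identity-based: KeyedEntry does not override equals
        if (bucket.isEmpty()) {
            data.remove(e.queryId); // drop empty buckets so the map does not grow forever
        }
    }

    Iterator<KeyedEntry> retrieve(int q) {
        List<KeyedEntry> v = data.get(q);
        return v == null ? Collections.<KeyedEntry>emptyIterator() : v.iterator();
    }
}
```

Retrieval now costs one hash lookup plus the size of the matching bucket, instead of a scan over all entries.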
Inductive Synthesizer → candidate → Verifier
Verifier → counterexample → Inductive Synthesizer
Verifier → certification of correctness
Inductive Synthesizer: remembers all examples; only reasons about examples collected thus far.
Verifier: must ensure the candidate is correct on all possible inputs and all possible data structure states.
retrieve: all e where e.queryId = q and …
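The synthesizer/verifier loop can be illustrated with a toy example. The sketch below is not Cozy's algorithm: candidates are a fixed list of integer predicates, and the verifier is an exhaustive check over a small range standing in for a real verifier. The class and method names (`ToyCegis`, `verify`, `synthesize`) are hypothetical.

```java
import java.util.List;
import java.util.function.IntPredicate;

class ToyCegis {
    // Verifier stand-in: returns an input in [lo, hi] where candidate and spec
    // disagree (a counterexample), or null if they agree on the whole range.
    static Integer verify(IntPredicate candidate, IntPredicate spec, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (candidate.test(x) != spec.test(x)) {
                return x;
            }
        }
        return null;
    }

    // Inductive synthesizer: tries candidates in order, rejecting any that is wrong
    // on a remembered example before asking the verifier. Returns the index of the
    // first accepted candidate, or -1 if none is accepted.
    static int synthesize(List<IntPredicate> candidates, IntPredicate spec,
                          int lo, int hi, List<Integer> examples) {
        for (int i = 0; i < candidates.size(); i++) {
            IntPredicate c = candidates.get(i);
            boolean okOnExamples = true;
            for (int x : examples) {
                if (c.test(x) != spec.test(x)) {
                    okOnExamples = false;
                    break;
                }
            }
            if (!okOnExamples) {
                continue; // cheap rejection, no verifier call needed
            }
            Integer counterexample = verify(c, spec, lo, hi);
            if (counterexample == null) {
                return i; // verifier certified the candidate on the bounded range
            }
            examples.add(counterexample); // remember it for all later candidates
        }
        return -1;
    }
}
```

The examples list only grows, so each counterexample rules out every later candidate that shares the same mistake.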
size 1: All
size 2: HashLookup(All, x=y), BinarySearch(All, x>y), Filter(All, x=y), …
size 3: HashLookup(HashLookup(…), a=b), Filter(BinarySearch(…), x<y), Filter(HashLookup(…), p=q), …
Filter(HashLookup(…), p=q): correct on all current examples
Concat(HashLookup(…),…) vs Concat(Filter(…),…)
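Enumeration by size can be sketched as follows, under simplifying assumptions that are not from the talk. Entries and inputs are int arrays, a plan is reduced to its printed form plus the predicate it computes, Concat is omitted, and no check is made that a nesting could be implemented efficiently. `SketchCond`, `SketchPlan`, and `PlanEnumerator` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical atomic condition over (entry, input), both encoded as int arrays.
class SketchCond {
    final String text;
    final boolean isEquality;
    final BiPredicate<int[], int[]> test;

    SketchCond(String text, boolean isEquality, BiPredicate<int[], int[]> test) {
        this.text = text;
        this.isEquality = isEquality;
        this.test = test;
    }
}

// A plan summarized by its printed form, its size, and which entries it keeps.
class SketchPlan {
    final String text;
    final int size;
    final BiPredicate<int[], int[]> keeps;

    SketchPlan(String text, int size, BiPredicate<int[], int[]> keeps) {
        this.text = text;
        this.size = size;
        this.keeps = keeps;
    }
}

class PlanEnumerator {
    // All plans of size 1..maxSize, built bottom-up by wrapping smaller plans.
    static List<SketchPlan> enumerate(List<SketchCond> conds, int maxSize) {
        List<SketchPlan> level = new ArrayList<>();
        level.add(new SketchPlan("All", 1, (e, in) -> true));
        List<SketchPlan> result = new ArrayList<>(level);
        for (int size = 2; size <= maxSize; size++) {
            List<SketchPlan> next = new ArrayList<>();
            for (SketchPlan p : level) {
                for (SketchCond c : conds) {
                    // Equalities get a hash index, comparisons a binary search.
                    String indexOp = c.isEquality ? "HashLookup" : "BinarySearch";
                    for (String op : new String[] {indexOp, "Filter"}) {
                        next.add(new SketchPlan(op + "(" + p.text + ", " + c.text + ")",
                                size, p.keeps.and(c.test)));
                    }
                }
            }
            result.addAll(next);
            level = next;
        }
        return result;
    }

    // Keep only plans that agree with the spec on every example {entry, input}.
    static List<SketchPlan> correctOn(List<SketchPlan> plans,
                                      BiPredicate<int[], int[]> spec,
                                      List<int[][]> examples) {
        List<SketchPlan> ok = new ArrayList<>();
        for (SketchPlan p : plans) {
            boolean agrees = true;
            for (int[][] ex : examples) {
                if (p.keeps.test(ex[0], ex[1]) != spec.test(ex[0], ex[1])) {
                    agrees = false;
                    break;
                }
            }
            if (agrees) {
                ok.add(p);
            }
        }
        return ok;
    }
}
```

Several surviving plans return the same entries and differ only in operator choice, which is why comparing plans by cost and grouping them by the predicate they compute both matter.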
Specification:
    Entry has: queryId : Int, subqueryId : Int, …
    retrieve: all e where e.queryId = q and …
HashLookup(All(), e.queryId = q)
representative predicate: e.queryId = q
Q: { e | e ∈ S ∧ Q(I, e) }
P: { e | e ∈ S ∧ P(I, e) }
Equivalence can be checked with an SMT solver:
{ e | e ∈ S ∧ P(I, e) } = { e | e ∈ S ∧ Q(I, e) } ?
Yes if and only if for all I, e: P(I, e) = Q(I, e)
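The pointwise condition can be illustrated without an SMT solver by checking it exhaustively over a small range. This is only a bounded stand-in: it assumes inputs and entries are single ints, and agreement on the range proves nothing outside it. `BoundedEquivalence` and `Pred` are hypothetical names.

```java
class BoundedEquivalence {
    // Hypothetical predicate over (input I, entry e), both plain ints here.
    interface Pred {
        boolean test(int input, int entry);
    }

    // True iff P(I, e) == Q(I, e) for every I and e in [lo, hi].
    static boolean equivalent(Pred p, Pred q, int lo, int hi) {
        for (int input = lo; input <= hi; input++) {
            for (int entry = lo; entry <= hi; entry++) {
                if (p.test(input, entry) != q.test(input, entry)) {
                    return false; // (input, entry) is a witness that P and Q differ
                }
            }
        }
        return true;
    }
}
```

For example, `e == I` and `!(e < I) && !(e > I)` agree everywhere, so the two plans they describe return the same entries, while `e == I` and `e >= I` differ as soon as an entry exceeds the input.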
Myria: analytics data indexed by timespan and by request ID
ZTopo: tracks map tiles in a least-recently-used cache
Bullet: stores axis-aligned bounding boxes for fast collision detection
Sat4j: tracks information about each variable in the formula
11 bugs, 15 bugs, 7 bugs
[Chart: lines of code, Original vs. Spec, for Myria, ZTopo, Sat4j, Bullet. Values as extracted: 11, 292, 25, 22, 23, 1383, 2582, 269]
[Chart: time (s) for Myria, ZTopo, Sat4j, Bullet, split into Outline Synthesis and Auto-Tuning. Axis ticks: 30, 60, 90]
[Chart: performance, Original vs. Synthesized]
Myria: original implementation has worst-case linear time
Sat4j: small overhead; performance dominated by other factors
Bullet: binary search tree vs. space partitioning tree
ZTopo: data structures are nearly identical
“… designing data structure representation” (1974)
“… indexes in sql databases” (2000)
… the planner to decide which ones to keep
… problem tractable
… matches handwritten implementation performance
Special thanks to: Michael Ernst, Emina Torlak
Also: Haoming Liu & Daniel Perelman