The Bigλ Project
Aws Albarghouthi, Calvin Smith
University of Wisconsin-Madison
Bigλ: Analyses from Examples [PLDI16]
[Figure: MapReduce data flow. Inputs i1, i2, i3 pass through map (producing m(i1), m(i2), m(i3)), then a shuffle, then parallel reduce steps that form the output data.]
Goal: synthesize data-parallel (MapReduce-style) programs from input/output examples [PLDI16].
Challenges and solutions:
- Non-determinism: generate proven-deterministic solutions
- Variety of domains: parameterize by extensible APIs
- Sparse search space: syntactically restrict to data-parallel programs
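Generating proven-deterministic solutions hinges on the reducer being commutative and associative, so the result is independent of shuffle order. Bigλ discharges these properties with a prover; a property-testing approximation (a sanity check, not a proof) can be sketched as:

```python
import random

def commutative_on(r, pairs):
    """Check r(x, y) == r(y, x) on every sampled pair."""
    return all(r(x, y) == r(y, x) for x, y in pairs)

def associative_on(r, triples):
    """Check r(r(x, y), z) == r(x, r(y, z)) on every sampled triple."""
    return all(r(r(x, y), z) == r(x, r(y, z)) for x, y, z in triples)

pairs = [(random.randint(0, 99), random.randint(0, 99)) for _ in range(1000)]
triples = [tuple(random.randint(0, 99) for _ in range(3)) for _ in range(1000)]

# max is commutative and associative: reducing with it is deterministic.
print(commutative_on(max, pairs) and associative_on(max, triples))  # True
# Subtraction is neither, so reducing with it depends on shuffle order.
print(commutative_on(lambda x, y: x - y, pairs))
```

A reducer that fails either check would make the synthesized pipeline's output depend on how the shuffle interleaves partial results.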
Bias the search heavily towards data-parallel programs. Bigλ uses 8 templates, gathered from reference implementations, e.g.:

map x . reduce x
flatmap x . reduce x . apply x
map x . reduceByKey x . filter x
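The template-guided search can be illustrated with a toy enumerator over the `map x . reduce x` template. The component pools below are hypothetical; Bigλ draws components from extensible, user-supplied APIs and a much richer search space:

```python
from functools import reduce
from itertools import product

# Toy component pools (illustrative only).
MAPPERS = {"len": len, "double": lambda x: 2 * x}
REDUCERS = {"add": lambda a, b: a + b, "max": max}

def run_template(m, r, xs):
    """Instantiate the `map m . reduce r` template on input xs."""
    return reduce(r, map(m, xs))

def consistent(m, r, examples):
    """Does the instantiated template reproduce every I/O example?"""
    try:
        return all(run_template(m, r, xs) == out for xs, out in examples)
    except TypeError:  # ill-typed candidate, e.g. len applied to an int
        return False

def synthesize(examples):
    """Return the first template instantiation consistent with the examples."""
    for (mn, m), (rn, r) in product(MAPPERS.items(), REDUCERS.items()):
        if consistent(m, r, examples):
            return f"map {mn} . reduce {rn}"
    return None

# Example: total length of the input strings.
print(synthesize([(["ab", "cde"], 5)]))  # map len . reduce add
```

The syntactic restriction is what keeps this tractable: only compositions matching a template are ever enumerated, rather than arbitrary programs.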
Example: which author uses the most hashtags?

Input:
@Alice: “Hello AAIP #aaip #germany”
@Bob: “Coffee machine refilled yet? #caffeine #java #4thcup #zzz”
@Claire: “Torn between wine cellar and seminar #wine #seminar #zzz”

The hashtag counts are 2, 4, 3…must be @Bob!

Synthesized program:
let p = map m . reduce r . apply f
where m = λt. (len(filter(is_hashtag, t)), author(t))
      r = λx,y. max(x, y)
      f = λp. snd(p)

Execution: map m yields {2, @Alice}, {4, @Bob}, {3, @Claire}; reduce r folds these pairs with max, leaving {4, @Bob}; apply f projects out the author, giving the output @Bob.
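The synthesized pipeline transcribes directly into Python. The tweet representation and the `is_hashtag`/`author` helpers are assumptions made for this sketch:

```python
from functools import reduce

# Assumed tweet representation: (author, list of tokens).
tweets = [
    ("@Alice",  "Hello AAIP #aaip #germany".split()),
    ("@Bob",    "Coffee machine refilled yet? #caffeine #java #4thcup #zzz".split()),
    ("@Claire", "Torn between wine cellar and seminar #wine #seminar #zzz".split()),
]

def is_hashtag(tok):  # assumed helper
    return tok.startswith("#")

def author(t):        # assumed helper
    return t[0]

# m = λt. (len(filter(is_hashtag, t)), author(t))
m = lambda t: (len([tok for tok in t[1] if is_hashtag(tok)]), author(t))
# r = λx,y. max(x, y) -- max on pairs compares the hashtag count first
r = max
# f = λp. snd(p)
f = lambda p: p[1]

print(f(reduce(r, map(m, tweets))))  # @Bob
```

Note that `r = max` works on the (count, author) pairs because Python compares tuples lexicographically, so the count dominates the comparison.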
Synthesis modulo differential privacy? [in progress]
For a candidate pipeline map m . reduce r:
- compute sensitivity
- charge price (against the privacy budget)
- add noise

A linear type system induces the cheapest such program.
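The "add noise" step is the standard Laplace mechanism: perturbing the true answer with noise scaled to sensitivity/ε yields ε-differential privacy. A minimal sketch (not Bigλ's implementation):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value perturbed with Laplace(sensitivity/epsilon) noise.

    Larger sensitivity (how much one individual's record can change the
    answer) or smaller privacy budget epsilon means more noise.
    """
    scale = sensitivity / epsilon
    u = random.random() - 0.5
    # Inverse-CDF sampling of the Laplace distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

# A counting query has sensitivity 1: adding or removing one record
# changes the count by at most 1.
noisy_count = laplace_mechanism(true_value=42, sensitivity=1.0, epsilon=0.5)
```

Computing the sensitivity of an arbitrary map/reduce pipeline automatically is exactly the hard part the type-system approach addresses; the mechanism itself is the easy final step.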
How can we automatically learn relational specifications? [FSE17, best paper award]
add(x, y) = z ⇐
Unsupervised learning: learn constraints consistent with the functions' observed input/output behavior.
Applied the technique to learn specifications of Python APIs, using ~1000 randomly sampled inputs per function.

Strings:
concat(y, reverse(y)) = x ⇒ reverse(x) = x
Z3:
valid(x) = p ∧ valid(y) = p ⇒ valid(and(x, y)) = p
Trig:
x = y − π/2 ⇒ (sin(x) = z ⇐
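A learned candidate such as the string specification above can be validated the way the sampling regime suggests: generate random arguments and check that the property holds on each. The generator and sample count below are illustrative:

```python
import random

def reverse(s):
    return s[::-1]

def concat(a, b):
    return a + b

def holds_on_samples(spec, gen, n=1000):
    """Return True if spec holds on n randomly generated inputs."""
    return all(spec(gen()) for _ in range(n))

def random_string():
    return "".join(random.choice("ab") for _ in range(random.randint(0, 8)))

# Candidate spec: concat(y, reverse(y)) = x ⇒ reverse(x) = x,
# i.e. a string followed by its reversal is a palindrome.
spec = lambda y: reverse(concat(y, reverse(y))) == concat(y, reverse(y))
print(holds_on_samples(spec, random_string))  # True
```

Sampling can only refute a candidate, never prove it; surviving ~1000 samples is evidence, which is why the learned constraints are specifications to be confirmed rather than theorems.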
Future directions:
- Synthesis of Datalog programs (graph analytics)
- Synthesis of fair decision-making programs
- Active-learning-based user interaction
- Proofs as programs
- …