Programming Not Only by Example Hila Peleg September 2017 Joint - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Programming Not Only by Example

Hila Peleg September 2017 Joint work with Sharon Shoham and Eran Yahav

The research leading to these results has received funding from the European Union's - Seventh Framework Programme (FP7) under grant agreement n° 615688 – ERC- COG-PRIME. Art from xkcd.com by Randall Munroe, licensed under Creative Commons

slide-2
SLIDE 2

Program Synthesis

  • The problem of producing code that satisfies a (usually) partial specification
  • Many existing works are aimed at end users performing simple repetitive tasks
  • Others produce code meant to be executed
  • These all require human-synthesizer interaction, often iteratively

I need code that will solve my problem

slide-3
SLIDE 3

Programming by Example

An example of the desired behavior:

  • Input: "abdfibfcfdebdfdebdihgfkjfdebd"
  • Output: "bd"

Find the most frequent bigram in a string

Synthesize!

slide-4
SLIDE 4

Many examples are inherently ambiguous

  • PBE aims for consistency with examples
  • Examples don’t convey intent uniquely
  • In other words: overfitting

Wait, that’s not what I meant!

input.takeRight(2)

Input: "abdfibfcfdebdfdebdihgfkjfdebd"
Output: "bd"
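The ambiguity can be made concrete with a minimal sketch. The names `Example`, `isConsistent` and `candidate` are hypothetical and not part of any synthesizer described in the talk; the second example is borrowed from a later slide.

```scala
// Illustrative only: Example, isConsistent and candidate are hypothetical names.
object Ambiguity {
  // One input/output example of the desired behavior.
  final case class Example(input: String, output: String)

  // A program is consistent if it reproduces the output of every example.
  def isConsistent(program: String => String, examples: Seq[Example]): Boolean =
    examples.forall(e => program(e.input) == e.output)

  // The overfitted candidate from the slide.
  val candidate: String => String = _.takeRight(2)

  val slideExample: Example = Example("abdfibfcfdebdfdebdihgfkjfdebd", "bd")
  // A second example whose most frequent bigram is not the last two characters.
  val secondExample: Example = Example("abbba", "bb")

  def main(args: Array[String]): Unit = {
    println(isConsistent(candidate, Seq(slideExample)))                // true
    println(isConsistent(candidate, Seq(slideExample, secondExample))) // false
  }
}
```

The single example cannot tell `takeRight(2)` apart from the intended program; only a carefully chosen second example does.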

slide-5
SLIDE 5

Problem I: Finding a differentiating example is hard

  • Both a counterexample and an example
  • Implicitly convey explicit knowledge
  • Usually takes an RTE and some time

Drat! Still consistent with the bad program!

slide-6
SLIDE 6

Problem II: Examples are terrible at directing search

input
.sliding(2) //ab,bd,df,…
.drop(1) //bd,df,fi,…
.min //bd

I want it to never use min again

Even if we give an example where it’s the maximum? Impossible.

Input: "abbba"
Output: "bb"

input //abbba
.sliding(2) //ab,bb,…
.drop(1) //bb,bb,ba
.dropRight(1) //bb,bb
.min //bb
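The two candidates above can be checked with a short sketch. The object and method names are hypothetical, and `.toList` is added here as an assumption so that `dropRight` is available on the result of `sliding` in standard Scala.

```scala
// Illustrative only: MinComesBack, viaMin and viaMinAgain are hypothetical names.
object MinComesBack {
  // First candidate: drop the first bigram, then take the minimum.
  def viaMin(input: String): String =
    input.sliding(2).toList.drop(1).min

  // Second candidate: also drop the last bigram, so min fits the new example too.
  def viaMinAgain(input: String): String =
    input.sliding(2).toList.drop(1).dropRight(1).min

  def main(args: Array[String]): Unit = {
    val first = "abdfibfcfdebdfdebdihgfkjfdebd"
    println(viaMin(first))        // bd
    println(viaMin("abbba"))      // ba, so the new example rules this candidate out
    println(viaMinAgain(first))   // bd
    println(viaMinAgain("abbba")) // bb, so min is back and fits both examples
  }
}
```

The new example eliminates one program that uses `min`, but not `min` itself.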

slide-7
SLIDE 7

Problem II: An impossibility result

  • Claim: If f is a function in the vocabulary, f cannot be eliminated from the search space by examples alone
  • And in practice, undesirable elements are all over the search space – they come back
  • As often as half the session!
  • The interaction model is too lean:
  • You can see the problem
  • But you can’t communicate the problem
  • Longer search
  • But also frustration

Aargh! I just need it to not use that anymore!

slide-8
SLIDE 8

Granularity of the interaction

  • Programmers can interact on a lower level

  • They understand sub-problems
  • They understand the code
  • (They can be helped to understand)
  • They should be given the power

This part looks fine… but that’s clearly wrong.

slide-9
SLIDE 9

The Granular Interaction Model (GIM)

  • A programmer can talk at the level of the program
  • Read debug info
  • Reason about subtrees or sequences of methods
  • Or intermediate states
  • But also examples, if those happen to be easier

input
.sliding(2) //ab,bd,df,…
.drop(1) //bd,df,fi,…
.min //bd

That looks right

Those are wrong
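One way to picture the debug information is as the intermediate state after each method in the chain. The sketch below is an assumption about how such a trace could be computed by hand for this one candidate; `DebugTrace` and `trace` are hypothetical names, not the tool's interface.

```scala
// Illustrative only: a hand-written trace for one candidate program.
object DebugTrace {
  // Returns each step of the chain paired with a printable intermediate state.
  def trace(input: String): List[(String, String)] = {
    val bigrams = input.sliding(2).toList // all adjacent two-character windows
    val dropped = bigrams.drop(1)         // the same list without its first element
    val result  = dropped.min             // lexicographically smallest remaining bigram
    List(
      "input"       -> input,
      ".sliding(2)" -> bigrams.mkString(","),
      ".drop(1)"    -> dropped.mkString(","),
      ".min"        -> result
    )
  }

  def main(args: Array[String]): Unit =
    trace("abbba").foreach { case (step, state) => println(s"$step //$state") }
}
```

With the states laid out per step, a programmer can point at the step that goes wrong instead of inventing a new example.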

slide-10
SLIDE 10

Back to our example

The user can answer locally: exclude takeRight(2)

Find the most frequent bigram in a string

input.takeRight(2)

Input: "abdfibfcfdebdfdebdihgfkjfdebd"
Output: "bd"

slide-11
SLIDE 11

Another step

The synthesizer prunes the search space and produces another answer

input //"abdfibfcfdebdfdebdihgfkjfdebd"
.drop(1) //"bdfibfcfdebdfdebdihgfkjfdebd"
.take(2) //"bd"

Answer: exclude drop(1)∙take(2)

slide-12
SLIDE 12

The synthesizer answers

input //"abdfibfcfdebdfdebdihgfkjfdebd"
.zip(input.drop(1)) //List((a,b),(b,d),(d,f),(f,i),...)
.take(2) //List((a,b),(b,d))
.map(p => p._1.toString + p._2) //List("ab","bd")
.max //"bd"

User provides a compound answer:

retain zip(input.drop(1))
exclude take(2)

And possibly even retain map(p => p._1.toString + p._2)
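The retain and exclude answers can be read as filters over candidate programs. The sketch below rests on two assumptions that the slides do not spell out: a program is modeled as a list of method-call strings, and a retained sequence must appear contiguously while an excluded one must not appear at all. All type and function names are hypothetical.

```scala
// Illustrative only: an assumed model of retain/exclude feedback.
object GranularFeedback {
  // A candidate program as a chain of method calls applied to `input`.
  type Program = List[String]

  sealed trait Feedback
  final case class Retain(calls: List[String]) extends Feedback
  final case class Exclude(calls: List[String]) extends Feedback

  // Assumed semantics: Retain requires the contiguous sequence, Exclude forbids it.
  def satisfies(program: Program, feedback: Feedback): Boolean = feedback match {
    case Retain(calls)  => program.containsSlice(calls)
    case Exclude(calls) => !program.containsSlice(calls)
  }

  // Keep only the candidates that satisfy every piece of feedback so far.
  def prune(candidates: List[Program], feedback: List[Feedback]): List[Program] =
    candidates.filter(p => feedback.forall(f => satisfies(p, f)))
}
```

Under this reading, `exclude take(2)` removes every candidate containing `take(2)` anywhere in the chain, which a single input/output example cannot express.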

slide-13
SLIDE 13

Until finally

input //"abdfibfcfdebdfdebdihgfkjfdebd"
.zip(input.drop(1)) //List((a,b),(b,d),(d,f),(f,i),…
.map(p => p._1.toString + p._2) //List("ab","bd",...
.groupBy(x => x) //Map("bf"->List("bf"),"ib"->List("ib"),...
.map(kv => kv._1 -> kv._2.length) //Map("bf"->1,"ib"->1,...
.maxBy(_._2) //("bd",4)
._1 //"bd"
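For readers who want to run the final program, here is the same chain wrapped in a function. The wrapper names `FinalProgram` and `mostFrequentBigram` are hypothetical; the chain itself is taken from the slide.

```scala
// The slide's final chain, wrapped so it can be called on any input.
object FinalProgram {
  def mostFrequentBigram(input: String): String =
    input
      .zip(input.drop(1))               // pairs of adjacent characters
      .map(p => p._1.toString + p._2)   // each pair as a two-character string
      .groupBy(x => x)                  // bigram -> all of its occurrences
      .map(kv => kv._1 -> kv._2.length) // bigram -> occurrence count
      .maxBy(_._2)                      // the (bigram, count) pair with the highest count
      ._1                               // keep only the bigram

  def main(args: Array[String]): Unit = {
    println(mostFrequentBigram("abdfibfcfdebdfdebdihgfkjfdebd")) // bd
    println(mostFrequentBigram("abbba"))                         // bb
  }
}
```

Note that `maxBy` fails on inputs shorter than two characters, since there are no bigrams to count; the slides do not address that case.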

slide-14
SLIDE 14

Can this also help the synthesizer?

  • “Well-suited” feedback
  • To the domain
  • And to the synthesizer
  • For example: functional concatenations are trees

  • Help prune the state

Dammit, GIM, I’m a programmer, not a gardener!

slide-15
SLIDE 15

The usability of GIM

  • We studied GIM in the wild (32 users, industry/academia)
  • Users loved the debug information
  • Total synthesis session time is the same
  • But each iteration was shorter than in PBE
  • Users spent less time fighting “distracting” sub-programs

Can your data fully reflect my frustration?

Well…

slide-16
SLIDE 16

Users like examples (but not that much)

[Chart: Portion of examples (%), per task (histogram, no. lines with text, most frequent word), for all users, users familiar with Scala, and users not familiar with Scala]

[Chart: Reached correct answer / reached target answer (no. users), per task (histogram, no. lines with text, most frequent word), for PBE, Syntax and GIM]

slide-17
SLIDE 17