Path-based Inductive Synthesis for Program Inversion Saurabh - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Path-based Inductive Synthesis for Program Inversion

Saurabh Srivastava (a) Sumit Gulwani (b) Swarat Chaudhuri (c) Jeffrey S. Foster (d)

(a) University of California, Berkeley (b) Microsoft Research, Redmond (c) Rice University (d) University of Maryland, College Park

slide-2
SLIDE 2

Program Inversion as Synthesis

  • Task

Given a program P, synthesize P⁻¹ such that P⁻¹(P(x)) = x

  • Motivation: Many common program/inverse pairs
  • Compress/decompress, insert/delete, lossless encode/decode, encrypt/decrypt, rollback, many more

  • Only having to write one increases productivity, reduces bugs
  • Problem
  • Existing synthesis techniques not well-suited for inversion
  • Dedicated inversion techniques limited in scope
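
The functional requirement P⁻¹(P(x)) = x stated in the task above can be checked on sample inputs. The following is a minimal sketch using delta encoding as a stand-in lossless encode/decode pair; the functions `delta_encode`, `delta_decode`, and `is_inverse_on` are illustrative names and do not come from the talk.

```python
def delta_encode(xs):
    """P: replace each element by its difference from the previous one."""
    prev, out = 0, []
    for x in xs:
        out.append(x - prev)
        prev = x
    return out


def delta_decode(ds):
    """P^-1: a running sum restores the original sequence."""
    total, out = 0, []
    for d in ds:
        total += d
        out.append(total)
    return out


def is_inverse_on(p, p_inv, inputs):
    """Check p_inv(p(x)) == x on a finite set of sample inputs only."""
    return all(p_inv(p(x)) == x for x in inputs)


samples = [[], [5], [1, 1, 1, 0, 0, 2, 2, 2, 2], [3, -2, 7]]
round_trip_ok = is_inverse_on(delta_encode, delta_decode, samples)
print(round_trip_ok)
```

A check like this only tests the property on the chosen samples; it does not prove it for all inputs.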
slide-3
SLIDE 3

PINS: Path-based Inductive Synthesis

  • Specification
  • Program to be inverted
  • Template hints: control flow, expressions, and predicates
  • Functional requirement: Program + Inverse = Identity
  • Engine: SMT solver (Z3)
  • Algorithm: Inspired by testing
  • Explore path through program + template
  • Ask engine for instantiations on path to match spec
  • Iterate, refining space
slide-4
SLIDE 4

Small path-bound hypothesis

“Program behavior can be summarized by examining a carefully chosen, small, finite set of paths”

  • Same hypothesis underlies program testing
  • As in testing, two questions:

1) Which paths?

  • Especially since the template describes a “set of programs”

2) How can we ensure the generated inverse is correct?

  • We check using: manual inspection, testing, bounded verification

slide-10
SLIDE 10

Example of templates

In-place run-length encoding:

A = [1,1,1,0,0,2,2,2,2]  →  A = [1,0,2], N = [3,2,4]  →  A’ = [1,1,1,0,0,2,2,2,2]

Original encoder

    assume(n >= 0);
    i, m := 0, 0; // parallel assignment
    while (i < n)
      r := 1;
      while (i+1 < n && A[i] = A[i+1])
        r, i := r + 1, i+1;
      A[m], N[m], m, i := A[i], r, m+1, i+1;

Template decoder

    iʼ, mʼ := e1, e2; // ei ∈ E
    while (p1) // pi ∈ P
      rʼ := e3;
      while (p2)
        rʼ, iʼ, Aʼ := e4, e5, e6;
      mʼ := e7;

P = { mʼ < m, rʼ > 0, Aʼ[iʼ] = Aʼ[iʼ+1] }

E = { 0, 1, mʼ+1, mʼ-1, rʼ+1, rʼ-1, iʼ+1, iʼ-1, Aʼ[mʼ] := A[iʼ], Aʼ[iʼ] := A[mʼ], N[mʼ] }

Template control flow, expressions E, and predicates P, semi-automatically mined from original
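
To make the template concrete, here is a hedged Python sketch of the encoder together with one plausible instantiation of the decoder template. The hole assignment used (e1 = e2 = 0, p1 = mʼ < m, e3 = N[mʼ], p2 = rʼ > 0, e4 = rʼ-1, e5 = iʼ+1, e6 = Aʼ[iʼ] := A[mʼ], e7 = mʼ+1) is an assumption chosen from E and P for illustration; the slides do not state the synthesized result. The function names are hypothetical.

```python
def rle_encode(data):
    """In-place run-length encoder, following the original encoder."""
    A = list(data)
    n = len(A)
    N = [0] * n
    i = m = 0
    while i < n:
        r = 1
        while i + 1 < n and A[i] == A[i + 1]:
            r, i = r + 1, i + 1
        # Parallel assignment: the right-hand side is evaluated first.
        A[m], N[m], m, i = A[i], r, m + 1, i + 1
    return A[:m], N[:m]


def rle_decode(A, N):
    """Decoder template with the holes filled as assumed in the lead-in."""
    m = len(A)
    out = []                      # plays the role of A'
    i_p, m_p = 0, 0               # e1, e2
    while m_p < m:                # p1
        r_p = N[m_p]              # e3
        while r_p > 0:            # p2
            out.append(A[m_p])    # e6: A'[i'] := A[m']
            r_p, i_p = r_p - 1, i_p + 1   # e4, e5
        m_p = m_p + 1             # e7
    return out


original = [1, 1, 1, 0, 0, 2, 2, 2, 2]
encoded = rle_encode(original)
decoded = rle_decode(*encoded)
print(encoded, decoded)
```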
slide-11
SLIDE 11

Symbolic execution of program paths

Original encoder

    assume(n >= 0);
    i, m := 0, 0;
    while (i < n)
      r := 1;
      while (i+1 < n && A[i] = A[i+1])
        r, i := r + 1, i+1;
      A[m], N[m], m, i := A[i], r, m+1, i+1;

Template decoder

    iʼ, mʼ := e1, e2;
    while (p1)
      rʼ := e3;
      while (p2)
        rʼ, iʼ, Aʼ := e4, e5, e6;
      mʼ := e7;

slide-21
SLIDE 21

Symbolic execution of program paths

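Path 1 (encoder loop skipped; decoder loop skipped):

    (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ ¬(i1 < n0) ∧
    iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ ¬(p1^V)
    ⇒ identity

Path 2 (encoder loop skipped; decoder outer loop runs once, inner loop skipped):

    (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ ¬(i1 < n0) ∧
    iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ (p1^Vʼ) ∧
    rʼ1 = e3^Vʼ ∧ ¬(p2^Vʼʼ) ∧ mʼ2 = e7^Vʼʼ ∧ ¬(p1^Vʼʼʼ)
    ⇒ identity

Path 3 (encoder outer loop runs once, inner loop skipped; decoder loop skipped):

    (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ (i1 < n0) ∧ r1 = 1 ∧
    ¬(i1+1 < n0 && A[i1] = A[i1+1]) ∧
    A[m1] = A[i1] ∧ N[m1] = r1 ∧ m2 = m1+1 ∧ i2 = i1+1 ∧ ¬(i2 < n0) ∧
    iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ ¬(p1^V)
    ⇒ identity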
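
The versioned variable names in these constraints (n0, i1, m1, ...) come from renaming each variable every time it is assigned along a path. Below is a minimal sketch of that renaming for straight-line paths. The tuple encoding of statements and the name `path_constraint` are hypothetical, and template holes and array updates are deliberately left out.

```python
import re


def path_constraint(path):
    """Turn a list of path statements into a conjunction over versioned names."""
    version = {}  # current version of each variable; unseen variables are 0

    def rename(expr):
        # Append the current version number to every identifier in expr.
        return re.sub(
            r"[A-Za-z_]\w*",
            lambda mo: f"{mo.group(0)}{version.get(mo.group(0), 0)}",
            expr,
        )

    conjuncts = []
    for kind, *args in path:
        if kind == "assume":          # branch condition taken as true
            conjuncts.append(f"({rename(args[0])})")
        elif kind == "assume_not":    # branch condition taken as false
            conjuncts.append(f"¬({rename(args[0])})")
        elif kind == "assign":        # parallel assignment
            targets, exprs = args
            rhs = [rename(e) for e in exprs]  # read old versions first
            for target, value in zip(targets, rhs):
                version[target] = version.get(target, 0) + 1
                conjuncts.append(f"{target}{version[target]} = {value}")
    return " ∧ ".join(conjuncts)


# Encoder prefix of the path in which the outer loop is skipped.
encoder_prefix = [
    ("assume", "n >= 0"),
    ("assign", ("i", "m"), ("0", "0")),
    ("assume_not", "i < n"),
]
constraint = path_constraint(encoder_prefix)
print(constraint)
```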

slide-27
SLIDE 27

Solving using SMT and SAT

Each explored path contributes one constraint of the form ϕk ⇒ identity:

    ϕ1(e1,e2,p1) = (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ ¬(i1 < n0) ∧
                   iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ ¬(p1^V)

    ϕ2(e1,e2,p1,e3,e7,p2) = (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ ¬(i1 < n0) ∧
                   iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ (p1^Vʼ) ∧
                   rʼ1 = e3^Vʼ ∧ ¬(p2^Vʼʼ) ∧ mʼ2 = e7^Vʼʼ ∧ ¬(p1^Vʼʼʼ)

    ϕ3(e1,e2,p1) = (n0 >= 0) ∧ i1 = 0 ∧ m1 = 0 ∧ (i1 < n0) ∧ r1 = 1 ∧
                   ¬(i1+1 < n0 && A[i1] = A[i1+1]) ∧
                   A[m1] = A[i1] ∧ N[m1] = r1 ∧ m2 = m1+1 ∧ i2 = i1+1 ∧ ¬(i2 < n0) ∧
                   iʼ1 = e1^V ∧ mʼ1 = e2^V ∧ ¬(p1^V)

The synthesis query conjoins them:

    ∃ ei,pj ∀ program vars:
      (ϕ1 ⇒ identity) ∧ (ϕ2 ⇒ identity) ∧ (ϕ3 ⇒ identity)

  • Naive approach:
  • Enumerate ei,pj and “validate”
  • Will not scale
  • 2^11 to 2^37 candidates in our experiments
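
The naive approach above can be sketched on the simplest path, the one where both the encoder loop and the decoder loop are skipped. This is only an illustration under stated assumptions: "identity" is taken to mean iʼ = n (decoded length equals original length), p1 is fixed to mʼ < m, e1 and e2 range over the constants 0 and 1 only, and the ∀ over program variables is approximated by a few small values of n. The name `path1_holds` is hypothetical.

```python
from itertools import product


def path1_holds(e1, e2, n):
    """Check (path condition => identity) for one candidate and one value of n."""
    i, m = 0, 0
    premise = (n >= 0) and not (i < n)      # encoder skips its loop
    i_p, m_p = e1, e2                       # i', m' := e1, e2
    premise = premise and not (m_p < m)     # decoder skips its loop (not p1)
    return (not premise) or (i_p == n)      # assumed identity: i' = n


# Enumerate every candidate and keep those that no tested n refutes.
survivors = [
    (e1, e2)
    for e1, e2 in product((0, 1), repeat=2)
    if all(path1_holds(e1, e2, n) for n in range(4))
]
print(survivors)
```

Under these assumptions the path rules out e1 = 1 but says nothing about e2, which is why further paths are needed and why plain enumeration becomes impractical as the number of holes grows.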
slide-28
SLIDE 28

Efficient solving from prior work on verification using SAT/SMT

  • Efficient solving strategy:
  • Verification solves ∃Invariant ∀vars
  • Reuse SMT-based verifier technology
  • Core idea:
  • Predicates/expressions form a lattice
  • Efficient encoding using lattice instead of enumerating entire domain
  • See prior work in PLDI’09/POPL’10

∃ ei,pj ∀ vars: ∧k ( ϕk(e1,e2,p1) ⇒ identity )

slide-35
SLIDE 35

The PINS Algorithm

    C = termination(T)
    while (true) {
      solns = solve(C, P, E, Spec)
      if (empty(solns)) fail
      if (stabilized(solns)) return solns
      s = pickone(solns)
      C = C ∧ directed-path-explore(T, s)
    }

  • C holds the accumulated constraints
  • Initialize with termination constraints (simple linear constraints that ensure that symbolic execution terminates)
  • If no change to candidate set then they are likely not refutable
  • Else use one s to parameterize next path exploration
  • Explore another path and add its constraint

We do not have a way of certifiably saying which remaining solutions are correct and which are not. So how do we find a path that prunes the space further?
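
The loop above can be sketched on a toy problem. Everything below is an assumption for illustration and not the authors' implementation: brute-force filtering stands in for the SMT solver, a small finite `DOMAIN` stands in for the ∀ over program variables, the program has branches but no loops (so no termination constraints are needed), and stabilization is simplified to "no remaining candidate exposes an unexplored path". The toy program maps x to x+1 for x ≥ 0 and to x-1 otherwise; the inverse template is `if (y > t) x' := y - a else x' := y + b` with holes t, a, b.

```python
from itertools import product

DOMAIN = range(-3, 4)  # finite stand-in for "all values of program variables"


def program(x):
    """Program to invert; returns (branch taken, output)."""
    return ("pos", x + 1) if x >= 0 else ("neg", x - 1)


def inverse(cand, y):
    """Template inverse instantiated with candidate (t, a, b)."""
    t, a, b = cand
    return ("hi", y - a) if y > t else ("lo", y + b)


def path_of(cand, x):
    """Path through program + instantiated template taken by input x."""
    branch_p, y = program(x)
    branch_inv, _ = inverse(cand, y)
    return (branch_p, branch_inv)


def satisfies(cand, path):
    """(path condition => identity), checked over DOMAIN; vacuous if infeasible."""
    return all(inverse(cand, program(x)[1])[1] == x
               for x in DOMAIN if path_of(cand, x) == path)


def solve(constraints, space):
    return [c for c in space if all(satisfies(c, p) for p in constraints)]


def directed_path_explore(cand, explored):
    """Return a path feasible under cand that is not yet explored, else None."""
    for x in DOMAIN:
        path = path_of(cand, x)
        if path not in explored:
            return path
    return None


def pins(space):
    constraints = []  # C: accumulated path constraints
    while True:
        solns = solve(constraints, space)
        if not solns:
            raise ValueError("no candidate satisfies the explored paths")
        new_path = None
        for s in solns:  # pick candidates without regard to their validity
            new_path = directed_path_explore(s, constraints)
            if new_path is not None:
                break
        if new_path is None:
            return solns  # stabilized: no candidate yields a new path
        constraints.append(new_path)


SPACE = list(product((-1, 0, 1), (0, 1, 2), (0, 1, 2)))
remaining = pins(SPACE)
print(remaining)
```

In this toy run the wrong candidate (1, 1, 1) survives the first two paths and is only eliminated by a path that is feasible under that wrong candidate itself, which is the point of instantiating the template with a picked solution regardless of its validity.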

slide-44
SLIDE 44

Directed path exploration

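[Diagram: the template program T defines a search space of size |E|^(# Expr Holes) × 2^(|P| × # Pred Holes). The explored paths leave a remaining search space that contains candidates S✓ and S✗; on the explored paths both satisfy “⇒ spec”. What if? On the paths of T instantiated with S✓, each of S✓ and S✗ is annotated with “⇒ spec” or “⇒ false”.]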
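
As a rough worked example, the size of the space can be computed for the run-length template shown earlier (11 expressions in E, 7 expression holes, 3 predicates in P, 2 predicate holes). This assumes the size expression reads |E|^(# expr holes) × 2^(|P| × # pred holes), i.e. that each predicate hole may take any subset of P; that reading, and the name `search_space_size`, are assumptions.

```python
def search_space_size(num_exprs, expr_holes, num_preds, pred_holes):
    # Each expression hole takes one of num_exprs expressions; each predicate
    # hole is assumed to take any subset of the num_preds predicates.
    return num_exprs ** expr_holes * 2 ** (num_preds * pred_holes)


size = search_space_size(num_exprs=11, expr_holes=7, num_preds=3, pred_holes=2)
log2_floor = size.bit_length() - 1  # largest k with 2**k <= size
print(size, log2_floor)
```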

slide-47
SLIDE 47

Directed path exploration

[Diagram: candidates S✓ and S✗, each annotated with “⇒ spec” or “⇒ false” on the newly explored paths.]

  • Pick any solution from remaining space; don’t care about its validity
  • Instantiate template with picked solution, and now symbolically execute to find a feasible path

slide-48
SLIDE 48

Program inversion benchmarks

  • Three domains
  • Lossless compression
  • Format conversion
  • Arithmetic
  • Semi-automatic procedure to extract template T
  • Control-flow derived from original program
  • Expression/predicates mined
  • Ran PINS to invert using template T
slide-49
SLIDE 49

Results

Benchmark      | Search space reduction | Number of iterations | Time   | Manual check | Model checker

Lossless compression
Run length     | 2^2? → 1               | 7                    | 26s    | ok           | validated
In place RL    | 2^3? → 1               | 7                    | 36s    | ok           | validated
LZ77           | 2^2? → 2               | 6                    | 1?1?s  | 1 of 2 ok    | validated
LZW            | 2^31 → 2               | 4                    | 1??s   | 2 of 2 ok    | too complex

Format conversion
Base 64        | 2^37 → 4               | 12                   | 1377s  | 1 of 4 ok    | too complex
UUencode       | 2^2? → 1               | 7                    | 34s    | ok           | too complex
Pkt wrap       | 2^2? → 1               | 6                    | 132s   | ok           | too complex
Serialize      | 2^11 → 1               | 14                   | ??s    | ok           | too complex

Arithmetic
Sum i          | 2^1? → 1               | 4                    | 1s     | ok           | validated
Vector rotate  | 2^16 → 1               | 3                    | 4s     | ok           | too complex
Vector shift   | 2^16 → 1               | 3                    | 4s     | ok           | validated
Vector scale   | 2^16 → 1               | 3                    | 36s    | ok           | too complex

(“?” marks a digit that is not recoverable from the slide text.)

  • PINS narrowed the valid candidates to 1 in almost all cases
  • Directed path exploration is successful in finding a small set of paths that prune the space
  • Symbolic execution is sometimes expensive, but mostly the paths are explored in reasonable time
  • Either only one candidate remained, or the remaining ones were easily examined
  • A testing-based approach is the only viable option, as most of the examples are too complex even for bounded verification

slide-55
SLIDE 55

Conclusions

  • PINS seems very promising
  • First testing-based approach to program synthesis
  • To our knowledge, no other technique can invert these programs with as little guidance
  • Supports small path-bound hypothesis for synthesis
  • Makes sense, since it works for testing (approximate verification), and we know verification and synthesis are related (see POPL’10 paper)
  • PINS should be applicable to other domains too

http://www.cs.umd.edu/~saurabhs/vs3/PINS/

slide-57
SLIDE 57

PINS approach vs CEGAR/CEGIS

[Diagram, CEGAR/CEGIS loop: Verifier and Constraint Solver, with edges labeled “Counterexample” and “SATisfying Soln”.]

[Diagram, PINS loop: Correctness checker and Constraint Solver, with edges labeled “Evidence of correctness” and “k SATisfying Solns picked”.]

slide-58
SLIDE 58

    void main(int n, BitString A) {
      BitString *D; int *B;
      int i,p,k,j,r,size,x,go;
      IN(str(A,0,n-1),n);
      ASSUME(n >= 1);
      D[0] = "0"; D[1] = "1";
      i = 0; p = 2; k = 0;
      while (i < n) {
        j = i; r = 0; size=-1;
        while (j < n && r != -1) {
          x = 0; r = -1;
          while (x < p) {
            if (D[x] == substr(A,i,j)) r = x;
            x++;
          }
          if (r != -1) { go = r; size = j-i+1; }
          j++;
        }
        B[k++] = go;
        D[p++] = substr(A,i,j-1);
        i += size;
      }
      OUT(B,k);
    }

LZW

    void main(int *A, int n) {
      int *P,*N,*C;
      int i,j,k,c,p,r;
      IN(BOUND(A,0,n),n);
      ASSUME(n >= 0);
      i = 0; k = 0;
      while (i < n) {
        c = 0; p = 0; j = 0;
        while (j < i) {
          r = 0;
          while (i+r < n-1 && A[j+r] == A[i+r]) r++;
          if (c < r) { c = r; p = i-j; }
          j++;
        }
        P[k] = p; N[k] = c; C[k] = A[i+c];
        i = i+1+c; k++;
      }
      OUT(P,N,C,k);
    }

LZW compressor  →  PINS  →  LZW decompressor