Generic and Complete Techniques for Straight-Line String Constraints

Taolue Chen (Birkbeck), Matthew Hague (Royal Holloway), Anthony W. Lin (Kaiserslautern), Philipp Ruemmer (Uppsala), Zhilin Wu (Chinese Academy of Sciences)

Abstract

  • New techniques for string constraint solving
    – Straight-line fragment
    – String operations/assertions not fixed
    – Two semantic conditions (regularity)
    – Proof of decidability
  • Implementation
    – OSTRICH solver
    – Competitive, expressive, and complete

String Programs

S ::= x := f(x1, …, xn) | assert g(x1, …, xn) | S1; S2

  • f is a function from strings to strings
  • g is a function from strings to Booleans
  • ; is sequential composition

Example

assert x in a*b*;
assert y in b*;
z := concat(x, y);
assert z in a*b*;

Here "assert x in a*b*" is shorthand for assert in(x, a*b*).

Solution

  • x = aa
  • y = bb
  • (z = aabb)
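
The grammar and the example above can be sketched as a tiny interpreter. This is only an illustration, not part of the talk: the class names `Assign` and `Assert`, the helper `in_re`, and the function `run` are hypothetical, and Python regular expressions stand in for the regular constraints.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple, Union


@dataclass
class Assign:
    """x := f(x1, ..., xn), with f a function from strings to a string."""
    target: str
    func: Callable[..., str]
    args: Tuple[str, ...]


@dataclass
class Assert:
    """assert g(x1, ..., xn), with g a function from strings to a Boolean."""
    pred: Callable[..., bool]
    args: Tuple[str, ...]


Stmt = Union[Assign, Assert]


def run(program: List[Stmt], free_vars: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Execute the statements in order (sequential composition).

    Returns the final variable assignment, or None if an assertion fails.
    """
    env = dict(free_vars)
    for stmt in program:
        values = [env[name] for name in stmt.args]
        if isinstance(stmt, Assign):
            env[stmt.target] = stmt.func(*values)
        elif not stmt.pred(*values):
            return None
    return env


def in_re(pattern: str) -> Callable[[str], bool]:
    """Regular constraint: the whole string must match the pattern."""
    return lambda s: re.fullmatch(pattern, s) is not None


# The example program from the slide.
program = [
    Assert(in_re("a*b*"), ("x",)),
    Assert(in_re("b*"), ("y",)),
    Assign("z", lambda x, y: x + y, ("x", "y")),
    Assert(in_re("a*b*"), ("z",)),
]

solution = run(program, {"x": "aa", "y": "bb"})
non_solution = run(program, {"x": "ab", "y": "a"})
```

Running a program on concrete values only checks a candidate solution; finding one, or proving that none exists, is the job of the decision procedure discussed later.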

Straight-Line Fragment

  • Similar to static single assignment (SSA) form
    – Each variable is assigned only once
    – Variables are not used before they are assigned
  • Free variables are never assigned
    – (Our language has no loop support)

Non-Example

x := concat(y, z)
y := x
y := z

  • y := x: assigned after use (circular dependency)
  • y := z: double assignment
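
The two conditions can be checked in a single pass over the assignments. The sketch below is an assumption about how one might do this, not the authors' code; the representation of an assignment as a `(target, inputs)` pair and the function name `straight_line_violations` are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (assigned variable, variables read by the function).
Assignment = Tuple[str, Tuple[str, ...]]


def straight_line_violations(assignments: List[Assignment]) -> List[str]:
    """Report every assignment that breaks the straight-line conditions."""
    assigned = set()  # variables assigned so far
    used = set()      # variables read so far, including by the current statement
    problems = []
    for target, inputs in assignments:
        used.update(inputs)
        if target in assigned:
            problems.append(f"{target}: double assignment")
        elif target in used:
            problems.append(f"{target}: assigned after use")
        assigned.add(target)
    return problems


# The non-example from the slide.
NON_EXAMPLE = [
    ("x", ("y", "z")),  # x := concat(y, z)
    ("y", ("x",)),      # y := x
    ("y", ("z",)),      # y := z
]

for problem in straight_line_violations(NON_EXAMPLE):
    print(problem)
```

Variables that are read but never appear as a target are the free variables of the program.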

Symbolic Execution

  • Explore paths through a program
  • Variables are represented symbolically
  • If-conditions etc. lead to constraints on variables
  • A path is feasible if its constraints are satisfiable
  • Verification / test-case generation
  • Famous tools such as KLEE

Example

function get_user_header(name)
    while name.contains("<script>")
        name = name.replaceAll("<script>", "")
    header = "<h1>" + name + "</h1>"
    assert not header.contains("<script>")
end

Program Path (loop body executed twice)

assert contains(n1, "<script>");
n2 := replaceAll(n1, "<script>", "");
assert contains(n2, "<script>");
n3 := replaceAll(n2, "<script>", "");
assert not contains(n3, "<script>");
hdr := concat("<h1>", n3, "</h1>");
assert contains(hdr, "<script>");

The final assertion is the negation of the assertion in the code.

  • No solution: path correct!
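
To get a feel for the path constraints, they can be evaluated on concrete inputs. The sketch below is illustrative only: the function name `path_constraints` is hypothetical, and Python's `str.replace` is assumed to match the `replaceAll` semantics (all non-overlapping occurrences, left to right). Testing a few inputs does not prove unsatisfiability; that requires a solver.

```python
from typing import List

TAG = "<script>"


def path_constraints(n1: str) -> List[bool]:
    """Evaluate the path on a concrete n1; one Boolean per assertion, in order."""
    n2 = n1.replace(TAG, "")
    n3 = n2.replace(TAG, "")
    hdr = "<h1>" + n3 + "</h1>"
    return [
        TAG in n1,      # assert contains(n1, "<script>")
        TAG in n2,      # assert contains(n2, "<script>")
        TAG not in n3,  # assert not contains(n3, "<script>")
        TAG in hdr,     # assert contains(hdr, "<script>"), the negated assertion
    ]


# A nested tag drives the loop twice, but the negated assertion still fails.
nested = path_constraints("<scr<script>ipt>")
# A plain tag leaves the loop after one iteration, so the path is not followed.
plain = path_constraints("<script>")
```

An input satisfies the path only if all four entries are true.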

Solving Such Constraints

Straight-line with

  • Regular constraints, concat, finite transductions
    – x := concat(y, z); x' := T(x); assert x' in a*b*;
    – EXPSPACE-complete / PSPACE-complete [Lin, Barcelo, 2016]
  • Regular constraints, concat, replaceAll
    – x := replaceAll(y, e, z)
    – Undecidable if e can be a variable
    – EXPSPACE / PSPACE if e is a regular expression
    – Undecidable with length constraints
    – [Chen et al., 2018]

Generic Approach

Which string constraints can we allow?

  • Maintain decidability
  • Expressivity: capture most benchmarks
  • Easy: solve with a straightforward algorithm
  • Extensible: allow user-defined string functions
  • Efficient: solve competitively

Basic Approach: Go Backwards

For one variable, assume:

  • assert g(x)
    – g is a regular constraint
  • x := f(y)
    – suppose x must satisfy a regular constraint
    – take the weakest precondition Pre(f, x)
    – Pre(f, x) is a regular constraint on y

Regular constraints on output variables become regular constraints on input variables.
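
As one concrete instance of a pre-image computation, consider replacing every occurrence of a single character by another. If the constraint on the output is given as a DFA, the pre-image is obtained by relabelling transitions (the standard inverse-homomorphism construction). The `DFA` class and the function `pre_replace_char` below are hypothetical names for this sketch and say nothing about how OSTRICH represents automata.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, FrozenSet, Tuple


@dataclass
class DFA:
    start: int
    finals: FrozenSet[int]
    delta: Dict[Tuple[int, str], int]  # total transition function

    def accepts(self, word: str) -> bool:
        state = self.start
        for ch in word:
            state = self.delta[(state, ch)]
        return state in self.finals


def pre_replace_char(dfa: DFA, old: str, new: str, alphabet: str) -> DFA:
    """DFA for all y such that y with every `old` replaced by `new` is accepted."""
    states = {q for (q, _) in dfa.delta}
    delta = {}
    for q in states:
        for ch in alphabet:
            image = new if ch == old else ch  # what the function writes for ch
            delta[(q, ch)] = dfa.delta[(q, image)]
    return DFA(dfa.start, dfa.finals, delta)


# b* over the alphabet {a, b}: state 0 is accepting, state 1 is a sink.
B_STAR = DFA(0, frozenset({0}), {(0, "b"): 0, (0, "a"): 1, (1, "a"): 1, (1, "b"): 1})

# Pre-image of b* under "replace every a by b": every word over {a, b}.
PRE = pre_replace_char(B_STAR, "a", "b", "ab")

# Cross-check against direct evaluation on all words up to length 4.
WORDS = ["".join(t) for n in range(5) for t in product("ab", repeat=n)]
AGREES = all(PRE.accepts(w) == B_STAR.accepts(w.replace("a", "b")) for w in WORDS)
```

The pre-image automaton has the same states as the original, so this particular operation adds no blow-up; other operations can be more expensive.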

Example

assert x in a*b*;
y := reverse(x);
assert y in b*a*;
z := replaceAll(y, a, b);
assert z in b*;

Going backwards, the last assignment and the assertion on its result are replaced by the pre-image:

z := replaceAll(y, a, b); assert z in b*;   becomes   assert y in (a | b)*;

assert x in a*b*;
y := reverse(x);
assert y in b*a*;
assert y in (a | b)*;

The two assertions on y are combined:

assert y in b*a*; assert y in (a | b)*;   becomes   assert y in (a | b)* & b*a*;

assert x in a*b*;
y := reverse(x);
assert y in (a | b)* & b*a*;

The assignment to y and the assertion on y are replaced by the pre-image under reverse:

y := reverse(x); assert y in (a | b)* & b*a*;   becomes   assert x in a*b*;

assert x in a*b*;
assert x in a*b*;

The two assertions on x are combined:

assert x in a*b*; assert x in a*b*;   becomes   assert x in a*b*;

assert x in a*b*;

Easy to solve
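
The result of the backward computation can be sanity-checked by brute force on short strings: the original program and the single remaining constraint should accept exactly the same values of x. The helper names below are hypothetical, and Python's `re` module and `str.replace` stand in for the regular constraints and for replaceAll on single characters.

```python
import re
from itertools import product
from typing import Iterator


def words(alphabet: str, max_len: int) -> Iterator[str]:
    """All words over the alphabet up to the given length."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def satisfies_program(x: str) -> bool:
    """Run the original program forwards on a concrete x."""
    if not re.fullmatch("a*b*", x):
        return False
    y = x[::-1]                    # y := reverse(x)
    if not re.fullmatch("b*a*", y):
        return False
    z = y.replace("a", "b")        # z := replaceAll(y, a, b)
    return re.fullmatch("b*", z) is not None


def satisfies_reduced(x: str) -> bool:
    """The constraint left after going backwards."""
    return re.fullmatch("a*b*", x) is not None


MISMATCHES = [w for w in words("ab", 6) if satisfies_program(w) != satisfies_reduced(w)]
```

An empty list of mismatches is evidence for bounded lengths only; the general argument is the backward computation itself.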

Algorithm in General

Assertions and functions may take several variables

  • assert g(x1, …, xn)
    – g admits a regular monadic decomposition
    – i.e. a finite union ∪ L1 × … × Ln of products of regular languages
  • x := f(x1, …, xn)
    – if x is constrained to a regular language, then
    – Pre(f, x) is a finite union ∪ L1 × … × Ln

Given these, the backwards algorithm still works

Genericity

Which string functions satisfy these constraints?

  • Concatenation
  • Reverse
  • One-way / two-way transductions
  • x := replaceAll(y, e, z)

Subsume previous results and allow extensions

  • E.g. capture groups in real-world regular expressions

Complexity

Depends on string operations permitted

  • PSPACE – conjunction of regular constraints
  • EXPSPACE – concat, one-way transductions, replaceAll
  • Non-elementary – two-way non-deterministic transductions
  • Undecidable – equals(x, y) and replaceAll(x, a, y)

Determinism handled carefully

  • f⁻¹(L1 & L2) = f⁻¹(L1) & f⁻¹(L2) if f is deterministic
  • avoid taking conjunctions until the end
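
Why determinism matters can be seen on a toy relation. In the sketch below, a string operation is modelled as a finite set of (input, output) pairs; this representation and the names `pre`, `NONDET`, and `DET` are assumptions made for illustration only.

```python
from typing import Set, Tuple

Relation = Set[Tuple[str, str]]


def pre(rel: Relation, lang: Set[str]) -> Set[str]:
    """Pre-image: inputs that have at least one output in lang."""
    return {inp for (inp, out) in rel if out in lang}


# Non-deterministic: the input "x" may produce "a" or "b".
NONDET: Relation = {("x", "a"), ("x", "b")}
# Deterministic: every input has exactly one output.
DET: Relation = {("x", "a"), ("y", "b")}

L1, L2 = {"a"}, {"b"}

# For NONDET, "x" is in both pre-images but has no output in L1 & L2.
nondet_joint = pre(NONDET, L1 & L2)
nondet_separate = pre(NONDET, L1) & pre(NONDET, L2)

# For DET the two computations agree.
det_joint = pre(DET, L1 & L2)
det_separate = pre(DET, L1) & pre(DET, L2)
```

For the non-deterministic relation, intersecting the separate pre-images over-approximates: it accepts "x" even though no single output of "x" lies in both languages.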

OSTRICH

Approach implemented in OSTRICH

  • Written in Scala
  • Built on Princess SMT solver
  • Extensible

    – Each string operation is a single class
    – New operations easily added

Benchmarking

  • Kaluza, Stranger, SLOG examples
  • Compared with CVC4 1.6, Z3-str, and SLOTH

Benchmarks on All Solvers

Benchmarks Unique Features

Optimisations

Pre-image computation should be done carefully

  • x := concat(y, z)
  • Pre(concat, L) = ∪q Lq × qL, with q ranging over the states of an automaton for L
    – Lq – words leading from the initial state to state q
    – qL – words leading from state q to an accepting state
  • Multiplies search by number of states
  • Only choose q that are feasible

Pre-image of replaceAll uses Cayley graphs
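
The pre-image of concatenation can be sketched on a small DFA. The code below is a minimal illustration under assumptions: the automaton is a total DFA stored as a transition dictionary, "feasible" is read as "reachable from the initial state and able to reach an accepting state", and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Delta = Dict[Tuple[int, str], int]


def run_from(delta: Delta, state: int, word: str) -> int:
    """State reached after reading word from the given state."""
    for ch in word:
        state = delta[(state, ch)]
    return state


def reachable(delta: Delta, src: int) -> Set[int]:
    """All states reachable from src."""
    seen, todo = {src}, [src]
    while todo:
        q = todo.pop()
        for (p, _), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                todo.append(r)
    return seen


def feasible_states(delta: Delta, start: int, finals: Set[int]) -> List[int]:
    """Split states q for which both Lq and qL are non-empty."""
    return sorted(q for q in reachable(delta, start) if reachable(delta, q) & finals)


def in_pre_concat(delta: Delta, start: int, finals: Set[int], y: str, z: str) -> bool:
    """(y, z) lies in the union over feasible q of Lq x qL."""
    return any(
        run_from(delta, start, y) == q and run_from(delta, q, z) in finals
        for q in feasible_states(delta, start, finals)
    )


# DFA for a*b*: state 0 reads a's, state 1 reads b's, state 2 is a rejecting sink.
DELTA: Delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 1, (2, "a"): 2, (2, "b"): 2}
START, FINALS = 0, {0, 1}

# Cross-check against direct evaluation of concat on all pairs of short words.
WORDS = ["".join(t) for n in range(4) for t in product("ab", repeat=n)]
AGREES = all(
    in_pre_concat(DELTA, START, FINALS, y, z) == (run_from(DELTA, START, y + z) in FINALS)
    for y in WORDS
    for z in WORDS
)
```

In this example the sink state is pruned, so the search branches over two split states rather than three.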

Summary

  • Generic decision procedure for straight-line string constraints
  • Semantic conditions for decidability
    – Regular monadic decomposition
  • OSTRICH
    – Competitive on popular benchmarks
    – Extensible with new string operations