SMT, Strings, Security: Philipp Rümmer, Uppsala University, SAT/SMT/AR



slide-1
SLIDE 1

1

Philipp Rümmer

Uppsala University SAT/SMT/AR, July 6th, 2018

SMT, Strings, Security

slide-2
SLIDE 2

2

Plan

  • String constraints by example
  • A word equation primer
  • Decidable fragments of string constraints

slide-3
SLIDE 3

3

Strings in Verification

slide-4
SLIDE 4

4

Strings in verification

slide-5
SLIDE 5

5

Strings in verification

ASCII, Unicode

slide-6
SLIDE 6

6

Strings in verification

Regular expression assertion:

slide-7
SLIDE 7

7

Strings in verification

Word/string concatenation

slide-8
SLIDE 8

8

Strings in verification

Loop invariant combining word equations, regex constraints, length constraints

slide-9
SLIDE 9

9

Strings in verification

Substring constraint

slide-10
SLIDE 10

10

Strings in verification

Or regex:

slide-11
SLIDE 11

11

Strings in verification

Presburger length constraint

slide-12
SLIDE 12

12

Strings in verification

→ Need a solver that supports all those operators!
slide-13
SLIDE 13

13

Alphabets

  • All constraints are formulated w.r.t. some fixed finite alphabet (e.g., 8-bit ASCII or UTF-32)

slide-14
SLIDE 14

14

Semantics and notation

  • Finite sequences of letters
  • Empty word
  • Concatenation
  • Equations
  • Language/regex membership
  • Word length

slide-15
SLIDE 15

15

LARGE Alphabets

  • Naive use of finite-state automata quickly becomes impossible
  • Concrete letters as transition guards → far too many transitions are needed to express interesting languages
  • Symbolic handling of letters is necessary
  • Sometimes complex string conversion functions are necessary, e.g. UTF-8 ↔ UTF-32

slide-16
SLIDE 16

16

Injection attacks

xkcd.com

slide-19
SLIDE 19

19

What is happening here?

Possible SQL command in a program

database.execute( "INSERT INTO students (name) VALUES ('" + name + "');");

Command with input substituted

INSERT INTO students (name) VALUES ('Robert'); DROP TABLE students;--');

Problem: Input string ends quotation! Command embedded in user input is executed
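
The substitution above can be replayed in a few lines. This is a minimal sketch that only mimics the string concatenation from the slide; `build_command` is a hypothetical helper name, and no database is involved.

```python
def build_command(name: str) -> str:
    # Mirrors the concatenation on the slide; no sanitisation is applied.
    return "INSERT INTO students (name) VALUES ('" + name + "');"


benign = build_command("Robert")
# The quote in the input ends the SQL string literal early, so the rest of
# the input is interpreted as further SQL.
attack = build_command("Robert'); DROP TABLE students;--")

print(benign)
print(attack)
```

The second printed line is the command shown on the slide: one INSERT statement, followed by the injected DROP TABLE statement and a comment that hides the leftover characters.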

slide-20
SLIDE 20

20

What is happening here?

Possible SQL command in a program

database.execute( "INSERT INTO students (name) VALUES ('" + name + "');");

  • Since no sanitisation is applied, the program is vulnerable to SQL injection attacks!

slide-23
SLIDE 23

23

How can this be detected?

Program code
Input: User-controlled strings
Output: SQL commands
Regex or CFG
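
As a rough illustration of checking outputs against a regex, the sketch below tests concrete inputs only; a string solver would instead search for an input whose output falls outside the specification. The pattern `SAFE_COMMAND` and the helper names are assumptions made for this example.

```python
import re

# Hypothetical specification of "safe" outputs: exactly one INSERT statement
# whose quoted value contains no quote character.
SAFE_COMMAND = re.compile(r"INSERT INTO students \(name\) VALUES \('[^']*'\);")


def build_command(name: str) -> str:
    # Same unsanitised concatenation as in the program under analysis.
    return "INSERT INTO students (name) VALUES ('" + name + "');"


def is_safe(command: str) -> bool:
    # The whole command must match the specification, not just a prefix.
    return SAFE_COMMAND.fullmatch(command) is not None


print(is_safe(build_command("Robert")))
print(is_safe(build_command("Robert'); DROP TABLE students;--")))
```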
slide-24
SLIDE 24

24

What is happening here?

Possible SQL command in a program

database.execute( "INSERT INTO students (name) VALUES ('" + name + "');");

  • However, this case could more easily be found with techniques like taint tracking
  • But what if sanitisation were actually applied?

slide-28
SLIDE 28

28

A subtle XSS vulnerability

JavaScript embedded in a web-page

var x = goog.string.htmlEscape(cat);
var y = goog.string.escapeString(x);
catElem.innerHTML = '<button onclick="createCatList(\'' + y + '\')">' + x + '</button>';

Input string: cat
HTML escape: & → &amp;
JavaScript escape: ' → \'
Implicit HTML unescape of the onclick attribute: &amp; → &

slide-31
SLIDE 31

31

An XSS vulnerability (2)

JavaScript embedded in a web-page

var x = goog.string.htmlEscape(cat);
var y = goog.string.escapeString(x);
catElem.innerHTML = '<button onclick="createCatList(\'' + y + '\')">' + x + '</button>';

One possible attack

Choose cat to be ');alert(1);//
Generated HTML string is then:

<button onclick="createCatList('&#39;);alert(1);//')"> &#39;);alert(1);//</button>

This will be unescaped to

createCatList('');alert(1);//')

Vulnerability since escape functions are applied in wrong order
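
The attack can be replayed outside the browser. In this sketch, `html_escape` and `js_escape` are simplified stand-ins for `goog.string.htmlEscape` and `goog.string.escapeString` (an assumption; the real functions handle more characters), and `html.unescape` plays the role of the browser's implicit unescaping of the attribute value.

```python
import html


def html_escape(s: str) -> str:
    # Simplified stand-in (assumption) for goog.string.htmlEscape.
    return (s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
             .replace('"', "&quot;").replace("'", "&#39;"))


def js_escape(s: str) -> str:
    # Simplified stand-in (assumption) for goog.string.escapeString.
    return s.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')


cat = "');alert(1);//"
x = html_escape(cat)   # the quote becomes &#39;
y = js_escape(x)       # nothing left to escape: the quote is already hidden
attribute = "createCatList('" + y + "')"
# The browser HTML-unescapes the attribute before running it as JavaScript.
executed = html.unescape(attribute)

print(attribute)
print(executed)
```

Because the HTML escape runs first, the JavaScript escape never sees the quote; the quote reappears only after the implicit unescape, where it closes the string literal.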

slide-32
SLIDE 32

32

Cross-site scripting

http://blog.aboutme.vn/choi-xss-tai-knock-xss-moe/

slide-37
SLIDE 37

37

Solvers for escape ops?

  • We need transducers! → Automata with multiple tracks

toUpperCase: a/A b/B c/C ...

htmlEscape: </&lt; >/&gt; &/&amp; ...

replaceAll: ...

Do not preserve length ...
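
A one-state transducer of the kind listed above can be sketched as a table from input letters to output strings. The representation and the names `run_transducer`, `TO_UPPER` and `HTML_ESCAPE` are assumptions for illustration; real transducers also need states.

```python
from typing import Dict


def run_transducer(rules: Dict[str, str], word: str) -> str:
    # One-state transducer: each input letter is replaced by the output
    # string of its rule; letters without a rule are copied unchanged.
    return "".join(rules.get(ch, ch) for ch in word)


# a/A b/B c/C ... : every rule maps one letter to one letter.
TO_UPPER = {chr(c): chr(c).upper() for c in range(ord("a"), ord("z") + 1)}
# </&lt; >/&gt; &/&amp; : one input letter yields several output letters.
HTML_ESCAPE = {"<": "&lt;", ">": "&gt;", "&": "&amp;"}

up = run_transducer(TO_UPPER, "abc<")
esc = run_transducer(HTML_ESCAPE, "a<b&c")

print(up, len(up))
print(esc, len(esc))
```

The upper-case transducer keeps the word length, while the escaping transducer lengthens the word, which is why length-preserving encodings are not enough for escape operations.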

slide-38
SLIDE 38

38

Other operations

  • String reversal
  • Context-free grammars
  • String-to-number conversions
  • Replace-all with symbolic arguments
  • ...

slide-39
SLIDE 39

39

Solving String Constraints

slide-40
SLIDE 40

40

Bit of Solver History

  • Bounded-length solvers
      • Bit-vector-based: Hampi, Kaluza
      • CP-based: Gecode
  • Automata-based tools
      • Stranger, TRAU
  • SMT/DPLL/CDCL-based methods
      • Z3-str/2/3, CVC4, S3/p, Norn, Sloth

(+ much theoretical work)

slide-41
SLIDE 41

41

Solving Word Equations

  • What are the solutions of those equations?

slide-42
SLIDE 42

42

Nielsen's transformation

(also called Levi's lemma)

Theorem
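
For reference, Levi's lemma is commonly stated as follows. The symbols $u, v, x, y, w$ are an assumption and may differ from the notation used on the slide.

```latex
\text{For all words } u, v, x, y \in \Sigma^* \text{ with } u \cdot v = x \cdot y
\text{ there is a word } w \in \Sigma^* \text{ such that}
\quad (u = x \cdot w \;\wedge\; y = w \cdot v)
\quad\text{or}\quad
(x = u \cdot w \;\wedge\; v = w \cdot y).
```

The first case applies when $u$ is at least as long as $x$, the second when $x$ is at least as long as $u$.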

slide-44
SLIDE 44

44

Nielsen's transformation

As a tableau rule

slide-46
SLIDE 46

46

In the example

slide-49
SLIDE 49

49

How about this one?

Cycle!

slide-50
SLIDE 50

50

What can be done?

  • Ignore cycles and hope for the best!
  • Identify fragments for which Nielsen's transformation (NT) is guaranteed to terminate
      • Acyclic; straight-line
  • Improve NT and add termination criteria
      • Makanin's method
      • Simpler algorithms for quadratic equations

slide-51
SLIDE 51

51

Quadratic word equations

  • E.g.
  • Consider satisfiability of a single quadratic equation

Definition: A word equation is quadratic if each variable occurs at most twice in the equation.
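
The definition translates directly into a check. The encoding is an assumption made for this sketch: both sides of the equation are strings in which uppercase letters are variables and lowercase letters are constants.

```python
from collections import Counter


def is_quadratic(lhs: str, rhs: str) -> bool:
    # Convention assumed here: uppercase letters are variables,
    # lowercase letters are constants of the alphabet.
    counts = Counter(sym for sym in lhs + rhs if sym.isupper())
    # Quadratic: no variable occurs more than twice in the whole equation.
    return all(n <= 2 for n in counts.values())


print(is_quadratic("XabY", "YbaX"))
print(is_quadratic("XaX", "XbY"))
```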
slide-53
SLIDE 53

53

Nielsen's transformation

Quadratic = simpler?

Number of variable occurrences cannot increase!

slide-55
SLIDE 55

55

A decision procedure

Modified Nielsen rule

Further rules

slide-56
SLIDE 56

56

Example

slide-57
SLIDE 57

57

Even more rules

One-sided Nielsen rule

slide-58
SLIDE 58

58

Decision procedure?

Soundness

  • If root is satisfiable, at least one branch cannot be closed

Completeness

  • If root is unsat, a closed proof exists
  • Follows from termination
  • Open branches → satisfying assignments

Termination

  • # of variable occurrences does not increase
  • Up to renaming of variables, only finitely many different equations exist
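
The termination argument can be made concrete with a small search that applies Nielsen steps and remembers the equations it has already seen. This is a sketch under stated assumptions, not the tableau calculus of the slides: variables are uppercase letters, constants are lowercase letters, variables are not renamed, and the node budget `max_nodes` is a hypothetical safeguard for non-quadratic inputs.

```python
from collections import deque
from typing import Optional, Tuple

Equation = Tuple[str, str]


def normalise(u: str, v: str) -> Equation:
    # Strip the longest common prefix; it never affects satisfiability.
    i = 0
    while i < len(u) and i < len(v) and u[i] == v[i]:
        i += 1
    return u[i:], v[i:]


def nielsen_sat(lhs: str, rhs: str, max_nodes: int = 10_000) -> Optional[bool]:
    """Return True/False if decided, None if the node budget is exhausted."""
    start = normalise(lhs, rhs)
    seen = {start}
    queue = deque([start])
    while queue:
        u, v = queue.popleft()
        if not u or not v:
            # One side is empty: satisfiable iff the rest consists of
            # variables only (all of them set to the empty word).
            if all(s.isupper() for s in u + v):
                return True
            continue
        a, b = u[0], v[0]  # distinct, because of normalisation
        steps = []
        if a.isupper():
            steps += [(a, ""), (a, b + a)]  # a := empty word, a := b.a
        if b.isupper():
            steps += [(b, ""), (b, a + b)]  # b := empty word, b := a.b
        # If both a and b are constants, no step applies: the branch closes.
        for var, repl in steps:
            child = normalise(u.replace(var, repl), v.replace(var, repl))
            if child not in seen:
                if len(seen) >= max_nodes:
                    return None
                seen.add(child)
                queue.append(child)
    return False


print(nielsen_sat("Xa", "aX"))
print(nielsen_sat("aX", "Xb"))
```

For the equation aX = Xb the only non-trivial step leads back to the same equation, so the set of visited equations closes the cycle and the search stops with an unsatisfiability verdict.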

slide-60
SLIDE 60

60

Soundness argument

  • Label equations in the proof with:
      • a distinguished label, if the equation is unsat
      • a pair (n, l), if the equation is sat, has n variable occurrences, and l is the length of its shortest solution
  • Order pairs lexicographically

Lemma: In each application of the Nielsen rule, if the parent is labelled with a pair (n, l), then at least one child has a smaller label (n', l').

Decreasing labels → Branch cannot be closed!

slide-68
SLIDE 68

68

Combinations ...

Equations
Quadratic
Regex Constraints
Length Constraints
Transduction

? ?

Undecidable

slide-74
SLIDE 74

74

The Norn fragment

  1. Boolean structure
  2. Acyclic (linear) word equations
  3. Regex memberships
  4. Length constraints

(a decidable fragment)

Order in which procedure handles operators

Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Lukáš Holík, Ahmed Rezine, Philipp Rümmer, Jari Stenman: String Constraints for Verification. CAV 2014
slide-75
SLIDE 75

75

Examples

slide-77
SLIDE 77

77

1. Boolean structure

  • Use standard DPLL/CDCL → Easy
  • Just consider conjunctions of literals
  • But we need to handle negation!
      • Negated word equations ?
      • Negated regex constraints ✓
      • Negated length constraints ✓

slide-79
SLIDE 79

79

1b. Negative word eqs.

Lemma

Large alphabets → a, b need to be handled symbolically in practice

Can be reduced to positive equations:

slide-80
SLIDE 80

80

1b. Negative word eqs.

Lemma

Can be reduced to positive equations:

Theorem: Any Boolean combination of word equations can be reduced to a single word equation with the same set of solutions (when projected to the original set of variables).
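
One standard way to express a negated equation positively is sketched below. The variable names $z, z_1, z_2$ are assumptions, and the formula on the slide may be phrased differently.

```latex
x \neq y \;\iff\;
\bigvee_{a \in \Sigma} \exists z.\,
  \bigl( x = y \cdot a \cdot z \;\lor\; y = x \cdot a \cdot z \bigr)
\;\lor\;
\bigvee_{\substack{a, b \in \Sigma \\ a \neq b}} \exists z, z_1, z_2.\,
  \bigl( x = z \cdot a \cdot z_1 \;\land\; y = z \cdot b \cdot z_2 \bigr)
```

The first disjunct covers the case that one word is a proper prefix of the other; the second covers a first position at which the words carry different letters $a$ and $b$. The disjunctions range over the alphabet, which is why large alphabets call for symbolic handling of $a$ and $b$.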
slide-81
SLIDE 81

81

2. Acyclic word equations

  • Reduce to solved form by systematic application of Nielsen's transformation (in solved form, the variables on the left-hand sides do not occur in the right-hand sides)
  • After that, eliminate equations by inlining!

slide-83
SLIDE 83

83

3. Regular expressions

  • Membership tests with concatenation can be split
      • Disjunction over states of the automaton representing the language
  • Tests with same left-hand side can be merged
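
The splitting rule can be checked on a small automaton. The NFA below, for the language (ab)*, and all names in the sketch are assumptions chosen for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical NFA for the language (ab)*: state 0 is initial and accepting.
DELTA: Dict[Tuple[int, str], Set[int]] = {(0, "a"): {1}, (1, "b"): {0}}
STATES, INITIAL, FINAL = {0, 1}, {0}, {0}


def reach(sources: Set[int], word: str) -> Set[int]:
    # States reachable from `sources` by reading `word`.
    current = set(sources)
    for ch in word:
        current = {t for s in current for t in DELTA.get((s, ch), set())}
    return current


def member(word: str) -> bool:
    return bool(reach(INITIAL, word) & FINAL)


def member_split(x: str, y: str) -> bool:
    # x.y is in L iff for some state q: x leads from an initial state to q
    # and y leads from q to an accepting state.
    return any(q in reach(INITIAL, x) and bool(reach({q}, y) & FINAL)
               for q in STATES)


for x, y in [("a", "b"), ("ab", "ab"), ("a", "a"), ("", "")]:
    print(x, y, member(x + y), member_split(x, y))
```

The disjunction over q is what the solver introduces symbolically when x and y are variables rather than concrete words.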

slide-85
SLIDE 85

85

4. Length constraints

  • Compute the length abstraction of each regex constraint
      • A Presburger formula that can be extracted in linear time from the regex constraint
  • Conjoin length abstractions with other length constraints and check satisfiability
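
To get a feeling for length abstractions, the sketch below forgets the letters of an NFA and lists the accepted word lengths up to a bound. This is a simplification chosen for illustration: it enumerates lengths instead of producing a Presburger formula, and the automaton for (ab)* is an assumption.

```python
from typing import Dict, List, Set, Tuple


def accepted_lengths(delta: Dict[Tuple[int, str], Set[int]],
                     initial: Set[int], final: Set[int],
                     bound: int) -> List[int]:
    # Forget the letters: only track which states are reachable after n steps.
    successors: Dict[int, Set[int]] = {}
    for (src, _letter), targets in delta.items():
        successors.setdefault(src, set()).update(targets)
    lengths, current = [], set(initial)
    for n in range(bound + 1):
        if current & final:
            lengths.append(n)
        current = {t for s in current for t in successors.get(s, set())}
    return lengths


# Hypothetical NFA for (ab)*: state 0 is initial and accepting.
AB_STAR = {(0, "a"): {1}, (1, "b"): {0}}
lengths = accepted_lengths(AB_STAR, {0}, {0}, 6)
print(lengths)
```

For this automaton only even lengths occur, which matches the Presburger formula ∃k. |x| = 2k that one would expect as the length abstraction of (ab)*.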

slide-86
SLIDE 86

86

5. Optimisations ...

  • E.g., exploit length information when splitting equations or regexes (still too slow ...)

slide-87
SLIDE 87

87

Adding Transducers

slide-89
SLIDE 89

89

3. Regular expressions

  • Membership tests with concatenation can be split
  • Tests with same left-hand side can be merged

Does not work any more with transducers!

slide-91
SLIDE 91

91

The Sloth fragments

  1. Boolean structure (no negation)
  2. Straight-line word equations
  3. n-track transducer constraints

→ also decidable!

Lukáš Holík, Petr Janků, Anthony W. Lin, Philipp Rümmer, Tomáš Vojnar: String constraints with concatenation and transducers solved efficiently. PACMPL 2(POPL): 4:1-4:32 (2018)

slide-92
SLIDE 92

92

Transducers

Definition: An n-track transducer is a finite-state automaton over the alphabet $(\Sigma \cup \{\epsilon\})^n$. An n-track transducer defines an n-ary rational relation.

slide-95
SLIDE 95

95

HTML Escaping

slide-97
SLIDE 97

97

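Undecidability

Proposition/Folklore: String constraints with rational relations are undecidable.

  • Post correspondence problem: Given word pairs $(u_1, v_1), \dots, (u_n, v_n)$, is there a non-empty index sequence $i_1, \dots, i_k$ with $u_{i_1} \cdots u_{i_k} = v_{i_1} \cdots v_{i_k}$? Undecidable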
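
Since the Post correspondence problem is undecidable, any program can only search up to a bound. The sketch below enumerates index sequences by brute force; the function name, the bound, and the example instance (a well-known textbook instance, not taken from the slides) are assumptions.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple


def pcp_bounded(pairs: Sequence[Tuple[str, str]],
                max_len: int) -> Optional[List[int]]:
    # Enumerate index sequences of length 1..max_len. A result of None only
    # means "no solution up to the bound", never "no solution at all".
    for k in range(1, max_len + 1):
        for seq in product(range(len(pairs)), repeat=k):
            top = "".join(pairs[i][0] for i in seq)
            bottom = "".join(pairs[i][1] for i in seq)
            if top == bottom:
                return list(seq)
    return None


PAIRS = [("a", "baa"), ("ab", "aa"), ("bba", "bb")]
solution = pcp_bounded(PAIRS, 4)
print(solution)
```

The reduction behind the folklore result encodes the choice of an index sequence with rational relations, so a decision procedure for such string constraints would decide PCP.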

slide-98
SLIDE 98

98

Fragments: acyclic formulas

  • Positive Boolean comb. of rational relations applied to distinct variables
  • In every $\varphi_1 \land \varphi_2$, $\varphi_1$ and $\varphi_2$ share at most one variable
  • PSPACE-complete [Barceló, Figueira, and Libkin '13]

slide-99
SLIDE 99

99

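Straight-line fragment SL

  • Conjunction of equations sorted by dependency:
      • All left-hand-side variables pairwise distinct
      • Each left-hand-side variable may only occur in later equations
      • Each right-hand side is a concatenation, or a transducer application (interpreted as a relational constraint)
  • Regex constraints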

slide-100
SLIDE 100

100

SL example

JavaScript embedded in a web-page

var x = goog.string.htmlEscape(cat);
var y = goog.string.escapeString(x);
catElem.innerHTML = '<button onclick="createCatList(\'' + y + '\')">' + x + '</button>';

slide-109
SLIDE 109

109

Decision procedure for SL

SL → Acyclic → Boolean Trans. System → Hardware Model Checker (PSPACE!)

  • SL → Acyclic: splitting concat, linear blow-up for each split

slide-114
SLIDE 114

114

Decidability for acyclic formulas

(Product automaton) Consistency = Non-emptiness

But how to do this in PSPACE?

slide-120
SLIDE 120

Alternating Finite Automata

[Figure: AFA P with states q1 to q6 and transitions labelled a]

AFA P has

  • Q – a set of states,
  • ∆ – a set of transitions, for example,
      ◮ ∆(q4) = a ∧ (q5 ∨ q6),
      ◮ ∆(q1) = a ∧ q2 ∧ q3,
  • I – a positive Boolean formula over states,
  • F – a negative Boolean formula over states.

The main advantage compared to NFA is that AFA/AFT can easily encode concatenation and all Boolean operations on formulae.
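
The two example transitions can be evaluated with a short backward pass over the input word. This sketch simplifies the definition above: it uses a single start state instead of an initial formula and a set of accepting states instead of a negative final formula, and it assumes that states without a listed transition have none and that q2, q3, q5 are accepting.

```python
from typing import Callable, Dict, Set

Valuation = Dict[str, bool]
# A transition maps (input letter, truth values of successor states) to a bool.
Transition = Callable[[str, Valuation], bool]

STATES = ["q1", "q2", "q3", "q4", "q5", "q6"]
DELTA: Dict[str, Transition] = {
    # ∆(q1) = a ∧ q2 ∧ q3: both successors must accept the rest of the word.
    "q1": lambda sym, nxt: sym == "a" and nxt["q2"] and nxt["q3"],
    # ∆(q4) = a ∧ (q5 ∨ q6): one accepting successor is enough.
    "q4": lambda sym, nxt: sym == "a" and (nxt["q5"] or nxt["q6"]),
}
# Assumption for this example only.
ACCEPTING: Set[str] = {"q2", "q3", "q5"}


def accepts_from(state: str, word: str) -> bool:
    # Evaluate backwards: start from the accepting states and read the word
    # from right to left, updating the truth value of every state.
    truth = {q: q in ACCEPTING for q in STATES}
    for sym in reversed(word):
        truth = {q: DELTA[q](sym, truth) if q in DELTA else False
                 for q in STATES}
    return truth[state]


print(accepts_from("q1", "a"))
print(accepts_from("q4", "a"))
print(accepts_from("q1", "aa"))
```

Conjunction in a transition formula means that several runs proceed in parallel, which is what makes intersections of languages cheap to represent with AFAs.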

P. Janků et al.: String Constraints with Concatenation and Transducers Solved Efficiently

slide-121
SLIDE 121

AND between Two AFTs

[Figure: two transducers with transition labels c/a, a/d, a/b, b/c, d/c, c/b]

R1(name, x) ∧ R2(x, y)

slide-122
SLIDE 122

AND between Two AFTs

[Figure: combined transducer with transition labels c/a/?, ?/a/d, a/b/?, ?/b/c, d/c/?, ?/c/b]

R1(name, x) ∧ R2(x, y) becomes R12(name, x, y), where

  • Q_R12 = Q_R1 ∪ Q_R2
  • I_R12 = I_R1 ∧ I_R2
  • F_R12 = F_R1 ∧ F_R2
  • ∆_R12 = ∆_R1 ∪ ∆_R2

slide-126
SLIDE 126

AFT → BTS

[Figure: AFT with states q1, q2, q3 and a transition labelled a]

Variables of the Boolean transition system (BTS) = states of the automaton (AFT):

  ◮ q1, q2, q3.

The initial and final formulae of the BTS = initial and final formulae of the AFT:

  ◮ I = q1,
  ◮ F = q2 ∧ q3.

The transition function of the BTS = conjunction of formulae derived from individual transitions of the AFT:

  ◮ Trans = ∃ symb : q1 → symb = a ∧ q2′ ∧ q3′.

slide-127
SLIDE 127

SL → AFT

The translation into an AFT will be carried out in the following steps:

  1. A translation of conjunctions to AFTs.
  2. A translation of concatenations to AFTs:
      1. Eliminating equations by substitutions.

ϕ ::= R12(name, x, y) ∧ z = w1 ◦ y ◦ w2 ◦ x ◦ w3 ∧ R3(z, innerHtml) ∧ P(innerHtml)

[Figure: AFT R12, AFT R3, AFT P]

slide-130
SLIDE 130

Eliminating Equations by Substitutions

ϕ ::= R12(name, x, y) ∧ z = w1 ◦ y ◦ w2 ◦ x ◦ w3 ∧ R3(z, innerHtml) ∧ P(innerHtml)

becomes

ϕ ::= R12(name, x, y) ∧ R3(w1 ◦ y ◦ w2 ◦ x ◦ w3, innerHtml) ∧ P(innerHtml)

slide-132
SLIDE 132

SL → AFT

The translation into an AFT will be carried out in the following steps:

  1. A translation of conjunctions to AFTs.
  2. A translation of concatenations to AFTs:
      1. Eliminating equations by substitutions.
      2. Handling rational relations with concatenations in arguments.

ϕ ::= R12(name, x, y) ∧ R3(w1 ◦ y ◦ w2 ◦ x ◦ w3, innerHtml) ∧ P(innerHtml)

[Figure: AFT R12, AFT R3, AFT P]

slide-137
SLIDE 137

Splitting of Arguments

[Figure: run of the AFT R over x, y and z1, z2, split at configuration C]

  • Consider ϕ ::= R(x ◦ y, z) ∧ ψ.
  • We need to split z into two parts: z = z1 ◦ z2, where z1 and z2 are fresh variables.
  • ϕ ::= RCF(x, z1) ∧ RCI(y, z2) ∧ ψ[z/z1 ◦ z2].

A simple variant:

  ◮ One conjunction for every configuration (set of states) C where a run may be split.
  ◮ Exponential (2^|Q|) for general AFTs.

slide-142
SLIDE 142

Splitting of Arguments

  • Given a formula ϕ ::= R(x ◦ y, z) ∧ ψ.
  • R is split into two AFTs: R(x, z1) and R(y, z2).
  • Guess the initial configuration of R(y, z2) nondeterministically.
      ◮ Remember the configuration by using additional states.
  • Check whether the final states of R(x, z1) correspond to the initial states of R(y, z2).

[Figure: R(x, z1) and R(y, z2) combined into R′(x, y, z1, z2)]

slide-143
SLIDE 143

115

Decision procedure for SL

SL → Acyclic → Boolean Trans. System → Model Checker (nuXmv, ABC)

  • SL → Acyclic: splitting concat, linear blow-up for each split

slide-145
SLIDE 145

117

The TRAU fragment

  1. General word-equations
  2. General transducers
  3. Context-free grammars
  4. Length constraints

Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Bui Phi Diep, Lukáš Holík, Ahmed Rezine, Philipp Rümmer: Flatten and conquer: a framework for efficient analysis of string constraints. PLDI 2017: 602-617

→ undecidable!

slide-146
SLIDE 146

118

Overview of TRAU ...

slide-147
SLIDE 147

119

Are we there yet?

Expressiveness, Efficiency, Precision/guarantees

slide-148
SLIDE 148

120

Joint work with ...

  • Parosh Aziz Abdulla
  • Mohamed Faouzi Atig
  • Yu-Fang Chen
  • Bui Phi Diep
  • Lukáš Holík
  • Petr Janků
  • Anthony W. Lin
  • Ahmed Rezine
  • Jari Stenman
  • Tomáš Vojnar
  • and others