Detection of Software Vulnerabilities: Static Analysis (Part I)



slide-1
SLIDE 1

Detection of Software Vulnerabilities: Static Analysis (Part I)

Lucas Cordeiro Department of Computer Science lucas.cordeiro@manchester.ac.uk Systems and Software Verification Laboratory

slide-2
SLIDE 2

Static Analysis

  • Lucas Cordeiro (Formal Methods Group)

§ lucas.cordeiro@manchester.ac.uk
§ Office: 2.28
§ Office hours: 15-16 Tuesday, 14-15 Wednesday

  • Textbook:

§ Model checking (Chapter 14)
§ Software model checking. ACM Comput. Surv., 2009
§ The Cyber Security Body of Knowledge, 2019
§ Software Engineering (Chapters 8, 13)

slide-9
SLIDE 9
Motivating Example

  • Functionality demanded increased significantly

– Peer reviewing and testing

  • Multi-core processors with scalable shared memory / message passing

– Static and dynamic verification

void *threadA(void *arg) {
  lock(&mutex);
  x++;
  if (x == 1) lock(&lock);
  unlock(&mutex);
  lock(&mutex);
  x--;
  if (x == 0) unlock(&lock);
  unlock(&mutex);
}

void *threadB(void *arg) {
  lock(&mutex);
  y++;
  if (y == 1) lock(&lock);
  unlock(&mutex);
  lock(&mutex);
  y--;
  if (y == 0) unlock(&lock);
  unlock(&mutex);
}

(CS1) (CS2) (CS3)

Deadlock
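
The deadlock can be found mechanically by exploring every interleaving of the two threads. The sketch below is only an illustration under stated assumptions: `lock`/`unlock` are modelled as plain blocking mutexes, each thread is split into eight atomic statements (a numbering chosen here, not taken from the slides), and the names `State`, `step`, `has_deadlock` and `initial_state` are hypothetical.

```c
#include <stdbool.h>

enum { FREE = -1, DONE = 8 };

typedef struct {
    int pc[2];     /* next statement of threadA (index 0) and threadB (index 1) */
    int count[2];  /* x for threadA, y for threadB */
    int mutex;     /* owner of `mutex`, or FREE */
    int lock;      /* owner of `lock`, or FREE */
} State;

/* Try to run the next statement of thread t.
   Returns false if the thread is blocked or has finished. */
static bool step(State *s, int t) {
    switch (s->pc[t]) {
    case 0: case 4:                          /* lock(&mutex) */
        if (s->mutex != FREE) return false;
        s->mutex = t;
        break;
    case 1: s->count[t]++; break;            /* x++ or y++ */
    case 2:                                  /* if (v == 1) lock(&lock) */
        if (s->count[t] == 1) {
            if (s->lock != FREE) return false;
            s->lock = t;
        }
        break;
    case 3: case 7: s->mutex = FREE; break;  /* unlock(&mutex) */
    case 5: s->count[t]--; break;            /* x-- or y-- */
    case 6:                                  /* if (v == 0) unlock(&lock) */
        if (s->count[t] == 0) s->lock = FREE;
        break;
    default: return false;                   /* thread already finished */
    }
    s->pc[t]++;
    return true;
}

/* Depth-first exploration of all interleavings: returns true if some
   reachable state has an unfinished thread and no thread can move. */
bool has_deadlock(State s) {
    if (s.pc[0] == DONE && s.pc[1] == DONE) return false;
    bool moved = false;
    for (int t = 0; t < 2; t++) {
        State next = s;
        if (step(&next, t)) {
            moved = true;
            if (has_deadlock(next)) return true;
        }
    }
    return !moved;
}

State initial_state(void) {
    State s = {{0, 0}, {0, 0}, FREE, FREE};
    return s;
}
```

Calling `has_deadlock(initial_state())` reports a deadlock: one witness is threadA running up to its first `unlock(&mutex)` while holding `lock`, then threadB blocking on `lock` while holding `mutex`, then threadA blocking on `mutex`.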

slide-15
SLIDE 15
Intended learning outcomes

  • Introduce software verification and validation
  • Understand soundness and completeness concerning detection techniques
  • Emphasize the difference among static analysis, testing / simulation, and debugging
  • Explain bounded model checking of software
  • Explain precise memory model for software verification

slide-18
SLIDE 18
Verification vs Validation

  • Verification: "Are we building the product right?"

§ The software should conform to its specification

  • Validation: "Are we building the right product?"

§ The software should do what the user requires

  • Verification and validation must be applied at each stage in the software process

§ The discovery of defects in a system
§ The assessment of whether or not the system is usable in an operational situation

slide-20
SLIDE 20
Static and Dynamic Verification

  • Software inspections are concerned with the analysis of the static system representation to discover problems (static verification)

§ May be supplemented by tool-based document and code analysis
§ Code analysis can prove the absence of errors but might be subject to incorrect results

  • Software testing is concerned with exercising and observing product behaviour (dynamic verification)

§ The system is executed with test data
§ Operational behaviour is observed
§ Can reveal the presence of errors NOT their absence

slide-21
SLIDE 21

Static and Dynamic Verification

Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley

slide-24
SLIDE 24
V & V planning

  • Careful planning is required to get the most out of dynamic and static verification

§ Planning should start early in the development process
§ The plan should identify the balance between static and dynamic verification

  • V & V should establish confidence that the software is fit for purpose

V & V planning depends on system’s purpose, user expectations and marketing environment

slide-25
SLIDE 25

The V-model of development

[Figure: the V-model. Development stages: Requirements specification, System specification, System design, Detailed design, Module and unit code and test. Test plans: Acceptance test plan, System integration test plan, Sub-system integration test plan. Test stages: Sub-system integration test, System integration test, Acceptance test, Service.]

Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley

slide-26
SLIDE 26
Intended learning outcomes

  • Introduce software verification and validation
  • Understand soundness and completeness concerning detection techniques
  • Emphasize the difference among static analysis, testing / simulation, and debugging
  • Explain bounded model checking of software
  • Explain unbounded model checking of software

slide-30
SLIDE 30

Detection of Vulnerabilities

  • Detect the presence of vulnerabilities in the code during the development, testing, and maintenance

  • Trade-off between soundness and completeness

§ A detection technique is sound for a given category of vulnerabilities if it can correctly conclude that a given program has no vulnerabilities of that category

  • An unsound detection technique may have false negatives, i.e., actual vulnerabilities that the detection technique fails to find

§ A detection technique is complete for a given category of vulnerabilities if any vulnerability it finds is an actual vulnerability

  • An incomplete detection technique may have false positives, i.e., it may detect issues that do not turn out to be actual vulnerabilities
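
The two kinds of error can be made concrete by comparing a tool's reports against a known ground truth. The following is a minimal sketch, not a real evaluation harness: the names `Verdict`, `compare_reports`, `looks_sound` and `looks_complete` are hypothetical, and a benchmark of this kind can only refute soundness or completeness, never establish it for all programs.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t false_negatives;  /* actual vulnerabilities the tool failed to find */
    size_t false_positives;  /* reported issues that are not actual vulnerabilities */
} Verdict;

/* actual[i]   : code location i really is vulnerable
   reported[i] : the tool flags code location i */
Verdict compare_reports(const bool *actual, const bool *reported, size_t n) {
    Verdict v = {0, 0};
    for (size_t i = 0; i < n; i++) {
        if (actual[i] && !reported[i]) v.false_negatives++;
        if (!actual[i] && reported[i]) v.false_positives++;
    }
    return v;
}

/* No false negatives on this benchmark: consistent with a sound technique. */
bool looks_sound(Verdict v) { return v.false_negatives == 0; }

/* No false positives on this benchmark: consistent with a complete technique. */
bool looks_complete(Verdict v) { return v.false_positives == 0; }
```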

slide-32
SLIDE 32

Detection of Vulnerabilities

  • Achieving soundness requires reasoning about all executions of a program (usually an infinite number)

§ This can be done by static checking of the program code while making suitable abstractions of the executions

  • Achieving completeness can be done by performing actual, concrete executions of a program that are witnesses to any vulnerability reported

§ The analysis technique has to come up with concrete inputs for the program that trigger a vulnerability
§ A typical dynamic approach is software testing: the tester writes test cases with concrete inputs and specific checks for the outputs

slide-34
SLIDE 34

Detection of Vulnerabilities

Detection tools can use a hybrid combination of static and dynamic analysis techniques to achieve a good trade-off between soundness and completeness

Dynamic verification should be used in conjunction with static verification to provide full code coverage

slide-36
SLIDE 36

Static analysis vs Testing/ Simulation

Simulation/ testing:

  • Checks only some of the system executions

§ May miss errors

  • A successful execution is an execution that discovers one or more errors

[Figure: simulation/ testing reports either OK or error]

slide-37
SLIDE 37

Static analysis vs Testing/ Simulation

Model Checking:

  • Exhaustively explores all executions
  • Reports errors as traces
  • May produce incorrect results

[Figure: model checking takes a specification and reports either OK or an error trace (Line 5: …, Line 12: …, …, Line 41: …)]

slide-38
SLIDE 38

Avoiding state space explosion

  • Bounded Model Checking (BMC)

§ Breadth-first search (BFS) approach

  • Symbolic Execution

§ Depth-first search (DFS) approach

slide-39
SLIDE 39

Bounded Model Checking

  • Bounded model checkers explore the state space up to a given depth k
  • Can only prove correctness if all states are reachable within the bound

[Figure: state space explored level by level, k = 0, 1, 2, …, 6]

A graph G = (V, E) consists of:

  • V: a set of vertices or nodes
  • E ⊆ V x V: set of edges connecting the nodes
slide-40
SLIDE 40

Breadth-First Search (BFS)

BFS(G,s)
  // Initialization of graph nodes
  for each vertex u ∈ V[G]-{s}        // every vertex except the anchor (s)
    colour[u] ← white                 // u colour
    d[u] ← ∞                          // distance from s
    π[u] ← NIL                        // u predecessor
  // Initializes the anchor node (s)
  colour[s] ← grey
  d[s] ← 0
  π[s] ← NIL
  enqueue(Q,s)
  while Q ≠ ∅ do
    u ← dequeue(Q)
    // Visit each adjacent node of u
    for each v ∈ Adj[u] do
      if colour[v] = white then
        colour[v] ← grey
        d[v] ← d[u] + 1
        π[v] ← u
        enqueue(Q,v)
    colour[u] ← blue
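
The BFS procedure above can be sketched in C as follows. This is an illustrative translation under assumptions that are not in the slides: the graph is stored as an adjacency matrix of at most `MAX_V` vertices numbered from 0, `-1` stands for NIL, and `INT_MAX` stands for ∞. The names `Graph`, `MAX_V` and `bfs` are hypothetical.

```c
#include <limits.h>

#define MAX_V 16

enum Colour { WHITE, GREY, BLUE };

typedef struct {
    int n;                   /* number of vertices, numbered 0..n-1 */
    int adj[MAX_V][MAX_V];   /* adj[u][v] != 0 iff there is an edge u -> v */
} Graph;

/* Fills d[] with the BFS distance from s (INT_MAX if unreachable)
   and pi[] with the predecessor of each vertex (-1 for NIL). */
void bfs(const Graph *g, int s, int d[], int pi[]) {
    enum Colour colour[MAX_V];
    int queue[MAX_V], head = 0, tail = 0;  /* each vertex is enqueued at most once */

    /* initialization of graph nodes */
    for (int u = 0; u < g->n; u++) {
        colour[u] = WHITE;
        d[u] = INT_MAX;
        pi[u] = -1;
    }

    /* initialization of the anchor node s */
    colour[s] = GREY;
    d[s] = 0;
    queue[tail++] = s;

    while (head < tail) {
        int u = queue[head++];
        /* visit each adjacent node of u */
        for (int v = 0; v < g->n; v++) {
            if (g->adj[u][v] && colour[v] == WHITE) {
                colour[v] = GREY;
                d[v] = d[u] + 1;
                pi[v] = u;
                queue[tail++] = v;
            }
        }
        colour[u] = BLUE;
    }
}
```

Because vertices leave the queue in order of increasing distance, `d[v]` is the length of a shortest path from `s` to `v`, which is the sense in which bounded model checking explores the state space level by level.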

slide-41
SLIDE 41

BFS Example

[Figure: example graph with vertices 1 to 7]

slide-49
SLIDE 49

Symbolic Execution

  • Symbolic execution explores all paths individually
  • Can only prove correctness if all paths are explored

slide-50
SLIDE 50

Depth-first search (DFS)

Paint all vertices white and initialize the fields π with NIL where π [u] represents the predecessor of u
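
A recursive depth-first search with discovery and finishing times can be sketched as below. The slide's own pseudocode did not survive extraction, so this is a textbook-style reconstruction under assumptions: adjacency-matrix storage, vertices numbered from 0, `-1` for NIL, and hypothetical names `Graph`, `DfsResult`, `dfs` and `dfs_visit`.

```c
#define MAX_V 16

typedef struct {
    int n;                   /* number of vertices, numbered 0..n-1 */
    int adj[MAX_V][MAX_V];   /* adj[u][v] != 0 iff there is an edge u -> v */
} Graph;

typedef struct {
    int colour[MAX_V];       /* 0 = white, 1 = grey (in progress), 2 = finished */
    int pi[MAX_V];           /* predecessor of each vertex, -1 for NIL */
    int d[MAX_V];            /* discovery time */
    int f[MAX_V];            /* finishing time */
    int time;                /* global clock */
} DfsResult;

static void dfs_visit(const Graph *g, int u, DfsResult *r) {
    r->colour[u] = 1;
    r->d[u] = ++r->time;                 /* u is discovered */
    for (int v = 0; v < g->n; v++) {
        if (g->adj[u][v] && r->colour[v] == 0) {
            r->pi[v] = u;
            dfs_visit(g, v, r);          /* go deeper before trying siblings */
        }
    }
    r->colour[u] = 2;
    r->f[u] = ++r->time;                 /* all descendants of u are finished */
}

void dfs(const Graph *g, DfsResult *r) {
    r->time = 0;
    /* paint all vertices white and initialize the predecessor fields with NIL */
    for (int u = 0; u < g->n; u++) {
        r->colour[u] = 0;
        r->pi[u] = -1;
    }
    for (int u = 0; u < g->n; u++)
        if (r->colour[u] == 0) dfs_visit(g, u, r);
}
```

Each vertex ends up with a pair `d[u]/f[u]`, the same discovery/finishing notation as labels of the form 2/9 in the DFS example.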

slide-62
SLIDE 62

DFS Example

[Figure: example graph with vertices 1 to 7, labelled with discovery/finishing times: 1/14 2/9 4/7 5/6 3/8 10/13 11/12 15/16]

slide-66
SLIDE 66
V&V and debugging

  • V & V and debugging are distinct processes
  • V & V is concerned with establishing the absence or existence of defects in a program
  • Debugging is concerned with two main tasks

§ Locating and
§ Repairing these errors

  • Debugging involves

§ Formulating hypotheses about program behaviour
§ Testing these hypotheses to find the system error

slide-67
SLIDE 67

The debugging process

[Figure: the debugging process. Stages: Locate error, Design error repair, Repair error, Re-test program. Inputs: Test results, Specification, Test cases.]

Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley

slide-71
SLIDE 71

Circuit Satisfiability

  • A Boolean formula contains

§ Variables whose values are 0 or 1
§ Connectives: ∧ (AND), ∨ (OR), and ¬ (NOT)

  • A Boolean formula is SAT if there exists some assignment to its variables that evaluates it to 1

slide-72
SLIDE 72

Circuit Satisfiability

  • A Boolean combinational circuit consists of one or more Boolean combinational elements interconnected by wires

SAT: <x1 = 1, x2 = 1, x3 = 0>

slide-75
SLIDE 75

Circuit-Satisfiability Problem

  • Given a Boolean combinational circuit of AND, OR, and NOT gates, is it satisfiable?

CIRCUIT-SAT = {<C> : C is a satisfiable Boolean combinational circuit}

§ Size: number of Boolean combinational elements plus the number of wires

  • if the circuit has k inputs, then we would have to check up to 2^k possible assignments

§ When the size of C is polynomial in k, checking every assignment takes Ω(2^k) time

  • Super-polynomial in the size of the circuit

slide-82
SLIDE 82

Formula Satisfiability (SAT)

  • The SAT problem asks whether a given Boolean formula is satisfiable

SAT = {<Φ> : Φ is a satisfiable Boolean formula}

§ Example:

  • Φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2
  • Assignment: <x1 = 0, x2 = 0, x3 = 1, x4 = 1>
  • Φ = ((0 → 0) ∨ ¬((¬0 ↔ 1) ∨ 1)) ∧ ¬0
  • Φ = (1 ∨ ¬(1 ∨ 1)) ∧ 1
  • Φ = (1 ∨ 0) ∧ 1
  • Φ = 1
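
The evaluation above can be replayed in code, and for four variables satisfiability can even be decided by trying all 2^4 assignments. The sketch below is illustrative only; the helper names `implies`, `iff`, `phi` and `count_models` are hypothetical.

```c
#include <stdbool.h>

static bool implies(bool p, bool q) { return !p || q; }
static bool iff(bool p, bool q)     { return p == q; }

/* Φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2 */
bool phi(bool x1, bool x2, bool x3, bool x4) {
    return (implies(x1, x2) || !(iff(!x1, x3) || x4)) && !x2;
}

/* Brute-force check: counts the satisfying assignments among all 2^4.
   Φ is satisfiable iff the result is greater than zero. */
int count_models(void) {
    int models = 0;
    for (int bits = 0; bits < 16; bits++) {
        bool x1 = bits & 1, x2 = bits & 2, x3 = bits & 4, x4 = bits & 8;
        if (phi(x1, x2, x3, x4)) models++;
    }
    return models;
}
```

`phi(false, false, true, true)` corresponds to the assignment <x1 = 0, x2 = 0, x3 = 1, x4 = 1> from the example. Exhaustive enumeration doubles in cost with every extra variable, which is why practical solvers rely on search procedures such as DPLL instead.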

slide-85
SLIDE 85

DPLL satisfiability solving

Given a Boolean formula φ in clausal form (an AND of ORs)

{{a, b}, {¬a, b}, {a,¬b}, {¬a,¬b}}

determine whether a satisfying assignment of variables to truth values exists.

Solvers based on Davis-Putnam-Logemann-Loveland algorithm:

  • 1. If φ = ∅ then SAT
  • 2. If □ ∈ φ then UNSAT
  • 3. If φ = φ’ ∪ {x} then DPLL(φ’[x ↦ true])
       If φ = φ’ ∪ {¬x} then DPLL(φ’[x ↦ false])
  • 4. Pick arbitrary x and return DPLL(φ[x ↦ false]) ∨ DPLL(φ[x ↦ true])

+ NP-complete but many heuristics and optimizations ⇒ can handle problems with 100,000’s of variables

Search tree for {{a, b}, {¬a, b}, {a,¬b}}:

{{a, b}, {¬a, b}, {a,¬b}}
  a ↦ false: {{b}, {¬b}}
    b ↦ false: {□}
    b ↦ true: {□}
  a ↦ true: {{b}}
    b ↦ true: ∅
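
The four DPLL rules can be sketched in C as follows. This is a deliberately naive illustration, not how production solvers are engineered: it rescans every clause on each call and has no clause learning or branching heuristics. The encoding (literal `+v` for variable v, `-v` for its negation, variables numbered from 1) and the names `Clause`, `lit_value` and `dpll` are assumptions made for this example.

```c
#include <stdbool.h>

#define MAX_VARS 32   /* size callers may use for the assignment array */
#define MAX_LITS 8

typedef struct {
    int n;               /* number of literals in the clause */
    int lit[MAX_LITS];   /* +v means variable v, -v means ¬v */
} Clause;

/* Value of a literal under a partial assignment: 1 true, -1 false, 0 unassigned. */
static int lit_value(int lit, const int assign[]) {
    int v = assign[lit > 0 ? lit : -lit];
    return lit > 0 ? v : -v;
}

/* assign[1..nvars] holds 0 (unassigned), 1 (true) or -1 (false).
   On success the satisfying assignment is left in assign[]. */
bool dpll(const Clause *cs, int ncs, int nvars, int assign[]) {
    int unit = 0;
    bool all_satisfied = true;

    for (int i = 0; i < ncs; i++) {
        int unassigned = 0, last = 0;
        bool satisfied = false;
        for (int j = 0; j < cs[i].n; j++) {
            int val = lit_value(cs[i].lit[j], assign);
            if (val == 1) { satisfied = true; break; }
            if (val == 0) { unassigned++; last = cs[i].lit[j]; }
        }
        if (satisfied) continue;
        all_satisfied = false;
        if (unassigned == 0) return false;               /* rule 2: empty clause */
        if (unassigned == 1 && unit == 0) unit = last;   /* unit clause found */
    }
    if (all_satisfied) return true;                      /* rule 1: no clauses left */

    if (unit != 0) {                                     /* rule 3: unit propagation */
        int v = unit > 0 ? unit : -unit;
        assign[v] = unit > 0 ? 1 : -1;
        if (dpll(cs, ncs, nvars, assign)) return true;
        assign[v] = 0;
        return false;
    }

    for (int v = 1; v <= nvars; v++) {                   /* rule 4: branch */
        if (assign[v] != 0) continue;
        for (int val = -1; val <= 1; val += 2) {         /* false first, then true */
            assign[v] = val;
            if (dpll(cs, ncs, nvars, assign)) return true;
        }
        assign[v] = 0;
        return false;
    }
    return false;  /* not reached when nvars covers every variable in cs */
}
```

With a = 1 and b = 2, the three clauses of the search tree are satisfied by a ↦ true, b ↦ true, while adding the fourth clause {¬a,¬b} makes the formula unsatisfiable.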

slide-86
SLIDE 86

SAT solving as enabling technology

slide-87
SLIDE 87

SAT Competition

slide-89
SLIDE 89

Bounded Model Checking (BMC)

  • MC: check if a property holds for all states
  • BMC: check if a property holds for a subset of states

[Figure: state space from Init to an error state; BMC covers only the states within the bound k]

slide-91
SLIDE 91

Bounded Model Checking (BMC)

[Figure: MC takes M, S and asks IS THERE ANY ERROR? (no: ok; yes: fail). BMC takes M, S and asks IS THERE ANY ERROR IN k STEPS? (yes: fail; no with completeness threshold reached: ok, which “never” happens in practice; no with k+1 still tractable: continue with k+1; no with k+1 intractable: bound)]

slide-95
SLIDE 95

Bounded Model Checking

Basic Idea: check negation of given property up to given depth

  • transition system M unrolled k times

– for programs: unroll loops, unfold arrays, …

  • translated into verification condition ψ such that ψ satisfiable iff ϕ has counterexample of max. depth k
  • has been applied successfully to verify HW/SW systems

[Figure: transition system M0, M1, M2, …, Mk-1, Mk unrolled up to the bound k; the negated property ¬ϕ0 ∨ ¬ϕ1 ∨ ¬ϕ2 ∨ … ∨ ¬ϕk-1 ∨ ¬ϕk is checked along the unrolling, and a violation yields a counterexample trace]
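
One common way to write the verification condition for a safety property is shown below. The symbols I (initial-state predicate), T (transition relation) and s_0, …, s_k (the states of the unrolled system) are notation assumed here for illustration; the slides only name ψ, ϕ, M and k.

```latex
\psi_k \;=\; I(s_0) \;\wedge\; \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \;\wedge\; \bigvee_{i=0}^{k} \neg\varphi(s_i)
```

The first two conjuncts describe every execution of M of length k, and the disjunction asks for at least one step where the property fails, so a satisfying assignment of ψ_k is a counterexample trace of depth at most k.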

slide-101
SLIDE 101

Satisfiability Modulo Theories (1)

SMT decides the satisfiability of first-order logic formulae using the combination of different background theories (built-in operators)

Theory             Example
Equality           x1=x2 ∧ ¬ (x1=x3) ⇒ ¬(x1=x3)
Bit-vectors        (b >> i) & 1 = 1
Linear arithmetic  (4y1 + 3y2 ≥ 4) ∨ (y2 – 3y3 ≤ 3)
Arrays             (j = k ∧ a[k]=2) ⇒ a[j]=2
Combined theories  (j ≤ k ∧ a[j]=2) ⇒ a[i] < 3

slide-104
SLIDE 104

Satisfiability Modulo Theories (2)

  • Given

§ a decidable Σ-theory T
§ a quantifier-free formula ϕ

ϕ is T-satisfiable iff T ∪ {ϕ} is satisfiable, i.e., there exists a structure that satisfies both the formula and the sentences of T

  • Given

§ a set Γ ∪ {ϕ} of first-order formulae over T

ϕ is a T-consequence of Γ (Γ ⊧T ϕ) iff every model of T ∪ Γ is also a model of ϕ

  • Checking Γ ⊧T ϕ can be reduced in the usual way to checking the T-satisfiability of Γ ∪ {¬ϕ}

slide-109
SLIDE 109

Satisfiability Modulo Theories (3)

  • Let a be an array, b, c and d be signed bit-vectors of width 16, 32 and 32 respectively, and let g be a unary function.

§ b' extends b to the signed equivalent bit-vector of size 32
§ replace b' by c−3 in the inequality
§ using facts about bit-vector arithmetic

slide-113
SLIDE 113

Satisfiability Modulo Theories (4)

step 3: g(select(store(a, c, 12), c)) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4

applying the theory of arrays

step 4: g(12) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4

The function g implies that for all x and y, if x = y, then g (x) = g (y) (congruence rule).

step 5: SAT (c = 5, d = 10)

  • SMT solvers also apply:

– standard algebraic reduction rules: r ∧ false ↦ false
– contextual simplification: a = 7 ∧ p(a) ↦ a = 7 ∧ p(7)

slide-115
SLIDE 115

BMC of Software

  • program modelled as state transition system

– state: program counter and program variables
– derived from control-flow graph
– checked safety properties give extra nodes

  • program unfolded up to given bounds

– loop iterations
– context switches

  • unfolded program optimized to reduce blow-up (crucial)

– constant propagation
– forward substitutions

  • front-end converts unrolled and optimized program into SSA

int main() {
  int a[2], i, x;
  if (x==0)
    a[i]=0;
  else
    a[i+2]=1;
  assert(a[i+1]==1);
}

g1 = x1 == 0
a1 = a0 WITH [i0:=0]
a2 = a0
a3 = a2 WITH [2+i0:=1]
a4 = g1 ? a1 : a3
t1 = a4 [1+i0] == 1

slide-116
SLIDE 116

BMC of Software

  • program modelled as state transition system

– state: program counter and program variables
– derived from control-flow graph
– checked safety properties give extra nodes

  • program unfolded up to given bounds

– loop iterations
– context switches

  • unfolded program optimized to reduce blow-up (crucial)

– constant propagation
– forward substitutions

  • front-end converts unrolled and optimized program into SSA
  • extraction of constraints C and properties P

– specific to selected SMT solver, uses theories

  • satisfiability check of C ∧ ¬P

int main() {
  int a[2], i, x;
  if (x==0)
    a[i]=0;
  else
    a[i+2]=1;
  assert(a[i+1]==1);
}

C := g1 := (x1 = 0)
   ∧ a1 := store(a0, i0, 0)
   ∧ a2 := a0
   ∧ a3 := store(a2, 2 + i0, 1)
   ∧ a4 := ite(g1, a1, a3)

P := i0 ≥ 0 ∧ i0 < 2
   ∧ 2 + i0 ≥ 0 ∧ 2 + i0 < 2
   ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2
   ∧ select(a4, i0 + 1) = 1
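Taking P exactly as written on the slide, its bounds part combines i0 ≥ 0 with 2 + i0 < 2, so no index can satisfy all three accesses at once, and C ∧ ¬P is satisfiable. The sketch below enumerates a small range of indices to illustrate this. The function names are hypothetical, and an enumeration over a finite range is only an illustration of what the solver establishes symbolically.

```c
#include <assert.h>

/* Bounds part of P for the three accesses a[i], a[i+2] and a[i+1]
   on an array declared as int a[2]. */
int all_accesses_in_bounds(int i)
{
    return i >= 0 && i < 2           /* a[i]   */
        && 2 + i >= 0 && 2 + i < 2   /* a[i+2] */
        && 1 + i >= 0 && 1 + i < 2;  /* a[i+1] */
}

/* Counts the values of i in [lo, hi] for which every access is in bounds. */
int count_safe_indices(int lo, int hi)
{
    int count = 0;
    assert(lo <= hi);
    for (int i = lo; i <= hi; i++)
        count += all_accesses_in_bounds(i);
    return count;
}
```

A call such as `count_safe_indices(-100, 100)` finds no safe index in that range, which matches the expectation that the bounded model checker reports a property violation for this program.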

slide-120
SLIDE 120

Encoding of Numeric Types

  • SMT solvers typically provide different encodings for numbers:

– abstract domains (Z, R)
– fixed-width bit vectors (unsigned int, …)

▹ “internalized bit-blasting”

  • verification results can depend on encodings

(a > 0) ∧ (b > 0) ⇒ (a + b > 0)

– valid in abstract domains such as Z or R
– doesn’t hold for bitvectors, due to possible overflows

– majority of VCs solved faster if numeric types are modelled by abstract domains, but with possible loss of precision
– ESBMC supports both types of encoding and also combines them to improve scalability and precision
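The dependence on the encoding can be reproduced with a few lines of C. The sketch below assumes an 8-bit two's-complement bit-vector width purely for illustration (the slides do not fix a width), and the function names are hypothetical. The wrap-around is computed explicitly on int values so that the code itself never triggers signed overflow.

```c
#include <assert.h>

/* Adds two values the way an 8-bit two's-complement bit-vector would:
   keep the low 8 bits, then reinterpret them as a signed number. */
int add_bv8(int a, int b)
{
    assert(a >= -128 && a <= 127 && b >= -128 && b <= 127);
    int low8 = (a + b) & 0xFF;               /* wrap to 8 bits */
    return low8 >= 128 ? low8 - 256 : low8;  /* signed reinterpretation */
}

/* Checks (a > 0) && (b > 0) => (a + b > 0) under 8-bit semantics. */
int implication_holds_bv8(int a, int b)
{
    return !(a > 0 && b > 0) || add_bv8(a, b) > 0;
}
```

With a = b = 100 the wrapped sum is negative, so the implication fails under the bit-vector reading, whereas over the integers 200 > 0 holds.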

slide-124
SLIDE 124

Encoding Numeric Types as Bitvectors

Bitvector encodings need to handle

  • type casts and implicit conversions

§ arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)

  • different conversions for every pair of types
  • uses type information provided by front-end

§ conversion to / from bool via if-then-else operator

t = ite(v ≠ k, true, false) // conversion to bool
v = ite(t, 1, 0) // conversion from bool

  • arithmetic over- / underflow

§ standard requires modulo-arithmetic for unsigned integers

unsigned_overflow ⇔ (r – (r mod 2^w)) < 2^w

§ define error literals to detect over- / underflow for other types

res_op ⇔ ¬ overflow(x, y) ∧ ¬ underflow(x, y)

  • similar to conversions
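The conversions and the overflow condition above can be mirrored in C. This is a sketch under assumptions: k is taken to be 0, SignExt is shown for one concrete pair of widths (8 to 32 bits), and all function names are hypothetical. Arithmetically, the expression (r – (r mod 2^w)) < 2^w is true exactly when the exact result r is below 2^w, that is, when r fits in w bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Conversion to bool: t = ite(v != k, true, false), with k = 0 assumed. */
bool to_bool(int32_t v) { return v != 0 ? true : false; }

/* Conversion from bool: v = ite(t, 1, 0). */
int32_t from_bool(bool t) { return t ? 1 : 0; }

/* SignExt for one concrete pair of widths: 8 bits to 32 bits. */
int32_t sign_ext_8_to_32(uint8_t bits)
{
    return (bits & 0x80u) ? (int32_t)bits - 256 : (int32_t)bits;
}

/* Literal transcription of (r - (r mod 2^w)) < 2^w for an exact result r. */
bool fits_in_width(uint64_t r, unsigned w)
{
    assert(w < 64);                      /* 2^w must be representable */
    uint64_t two_w = (uint64_t)1 << w;
    return (r - (r % two_w)) < two_w;
}
```

For example, `fits_in_width(255, 8)` and `fits_in_width(256, 8)` separate the largest 8-bit value from the first one that wraps.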
slide-127
SLIDE 127

Floating-Point Numbers

  • Over-approximate floating-point by fixed-point numbers

– encode the integral (i) and fractional (f) parts

  • Binary encoding: get a new bit-vector b = i @ f with the same bitwidth before and after the radix point of a.

// m = number of bits of i
// n = number of bits of f

i = Extract(b, n_b + m_a – 1, n_b)               : m_a ≤ m_b
    SignExt(Extract(b, t_b – 1, n_b), m_a – m_b) : otherwise

f = Extract(b, n_b – 1, n_b – n_a)               : n_a ≤ n_b
    Extract(b, n_b, 0) @ SignExt(b, n_a – n_b)   : otherwise

  • Rational encoding: convert a to a rational number

// p = number of decimal places

a = (i ∗ p + (f ∗ p / 2^n + 1)) / p : f ≠ 0
    i                               : otherwise
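The first case of each binary-encoding formula (narrowing, m_a ≤ m_b and n_a ≤ n_b) can be sketched with plain shifts and masks. The sketch assumes that the lower bound of the fractional Extract is n_b − n_a, so that exactly the top n_a fractional bits are kept. The helper names `extract` and `narrow_fixed_point` are hypothetical, and the widening cases that need SignExt are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Extract(b, hi, lo): bits hi down to lo of b, shifted to the bottom. */
uint32_t extract(uint32_t b, unsigned hi, unsigned lo)
{
    assert(lo <= hi && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);
    return (b >> lo) & mask;
}

/* Narrows a fixed-point value b with m_b integral and n_b fractional bits
   to m_a integral and n_a fractional bits (case m_a <= m_b, n_a <= n_b only). */
uint32_t narrow_fixed_point(uint32_t b, unsigned m_b, unsigned n_b,
                            unsigned m_a, unsigned n_a)
{
    assert(m_a >= 1 && n_a >= 1 && m_a <= m_b && n_a <= n_b && m_b + n_b <= 32);
    uint32_t i = extract(b, n_b + m_a - 1, n_b);  /* low m_a bits of the integral part */
    uint32_t f = extract(b, n_b - 1, n_b - n_a);  /* top n_a bits of the fractional part */
    return (i << n_a) | f;                        /* i @ f */
}
```

As a usage example, the 8.8 value 0x0180 (1.5) narrowed to 4.4 keeps integral part 1 and fractional nibble 8, which is again 1.5.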
slide-132
SLIDE 132

Floating-point SMT Encoding

  • The SMT floating-point theory is an addition to the SMT standard, proposed in 2010, and formalises:

§ Floating-point arithmetic
§ Positive and negative infinities and zeroes
§ NaNs
§ Comparison operators
§ Five rounding modes: round nearest with ties choosing the even value, round nearest with ties choosing away from zero, round towards zero, round towards positive infinity and round towards negative infinity

slide-135
SLIDE 135

Floating-point SMT Encoding

  • Missing from the standard:

§ Floating-point exceptions
§ Signaling NaNs

  • Two solvers currently support the standard:

§ Z3: implements all operators
§ MathSAT: implements all but two operators

  • fp.rem: remainder: x - y * n, where n in Z is nearest to x/y
  • fp.fma: fused multiplication and addition; (x * y) + z

  • Both solvers offer non-standard functions:

§ fp_as_ieeebv: converts floating-point to bitvectors
§ fp_from_ieeebv: converts bitvectors to floating-point

slide-136
SLIDE 136

How to encode Floating-point programs?

  • Most operations performed at program-level to encode FP numbers have a one-to-one conversion to SMT

  • Special cases are casts to boolean types and the fp.eq operator

§ Usually, cast operations are encoded using extend/extract operations
§ Extending floating-point numbers is non-trivial because of the format
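One likely reason fp.eq needs separate treatment is that IEEE equality differs from identity of values: +0 and −0 compare equal although they are different values, and NaN compares unequal to itself. The sketch below contrasts C's == (IEEE comparison) with a bit-for-bit comparison. The bit comparison is only a rough stand-in for identity, since a C NaN can carry different payloads, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* IEEE 754 comparison, the behaviour of C's == on floats. */
bool ieee_equal(float x, float y) { return x == y; }

/* Bit-for-bit comparison of the two representations. */
bool same_bits(float x, float y)
{
    return memcmp(&x, &y, sizeof x) == 0;
}

/* Returns true when the two notions of equality disagree on (x, y). */
bool equalities_disagree(float x, float y)
{
    bool disagree = ieee_equal(x, y) != same_bits(x, y);
    /* Disagreement needs a zero (signed zeros) or a NaN operand. */
    assert(!disagree || x == 0.0f || x != x || y != y);
    return disagree;
}
```

Calling `equalities_disagree(0.0f, -0.0f)` shows the signed-zero case; ordinary values such as 1.0f agree under both notions.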

slide-138
SLIDE 138

Cast to/from booleans

  • Simpler solutions:

§ Casting booleans to floating-point numbers can be done using an ite operator

– If true, assign 1f to b

slide-139
SLIDE 139

Cast to/from booleans

  • Simpler solutions:

§ Casting booleans to floating-point numbers can be done using an ite operator

– Otherwise, assign 0f to b

slide-141
SLIDE 141

Cast to/from booleans

  • Simpler solutions:

§ Casting floating-point numbers to booleans can be done using an equality and one not:

– true when the floating-point value is not 0.0

slide-142
SLIDE 142

Cast to/from booleans

  • Simpler solutions:

§ Casting floating-point numbers to booleans can be done using an equality and one not:

– otherwise, the result is false
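Both casts have direct C counterparts, which makes the intended semantics easy to try out. The sketch below writes the ite as a conditional expression and the float-to-bool cast as a negated IEEE equality with 0.0. The function names are hypothetical, and the formulas on the original slides are not preserved, so this follows only the textual description.

```c
#include <assert.h>
#include <math.h>     /* NAN, for trying out the special cases */
#include <stdbool.h>

/* bool -> float: ite(b, 1.0f, 0.0f). */
float bool_to_float(bool b) { return b ? 1.0f : 0.0f; }

/* float -> bool: not(x == 0.0), using IEEE equality. */
bool float_to_bool(float x) { return !(x == 0.0f); }

/* Round trip bool -> float -> bool; the assertion documents that it is lossless. */
bool round_trip(bool b)
{
    bool back = float_to_bool(bool_to_float(b));
    assert(back == b);
    return back;
}
```

Two special cases follow from using IEEE equality: `float_to_bool(-0.0f)` is false because −0 equals 0, and `float_to_bool(NAN)` is true because NaN is not equal to anything.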

slide-144
SLIDE 144

Floating-point Encoding: Illustrative Example

[Figure: SMT encoding of a small floating-point program, built up step by step. The listing itself was not preserved; the annotations shown on it, in order, are:]

– Variable declarations
– Nondeterministic symbol declaration (optional)
– Guard used to check satisfiability
– Assignment of nondeterministic value to x
– Assignment of x to y
– Check if the comparison satisfies the guard

  • Z3 produces: [output not preserved]
  • MathSAT produces: [output not preserved]
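The listing for this example is not preserved here. The annotations suggest a program that assigns a nondeterministic float to x, copies it to y, and then checks a comparison. The sketch below assumes that the comparison is x == y, which is a guess, and passes the nondeterministic value in as a parameter. The names `copy_compares_equal` and `is_counterexample` are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical reconstruction of the program behind the annotations:
     x = <nondeterministic float>;
     y = x;
     assert(x == y);
   The nondeterministic value is the parameter x. */
bool copy_compares_equal(float x)
{
    float y = x;
    return x == y;
}

/* True if x is a counterexample to the assertion x == y. */
bool is_counterexample(float x)
{
    bool violated = !copy_compares_equal(x);
    assert(!violated || isnan(x));  /* under IEEE 754 only NaN can violate it */
    return violated;
}
```

Under this reading, a solver that supports the floating-point theory would be expected to report NaN as the value of x in the counterexample, while every ordinary value and both infinities satisfy the assertion.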

slide-155
SLIDE 155
Intended learning outcomes

  • Introduce software verification and validation
  • Understand soundness and completeness concerning detection techniques
  • Emphasize the difference among static analysis, testing / simulation, and debugging
  • Explain bounded model checking of software
  • Explain precise memory model for software verification

slide-161
SLIDE 161

Encoding of Pointers

  • arrays and records / tuples typically handled directly by SMT-solver

  • pointers modelled as tuples

– p.o ≙ representation of underlying object
– p.i ≙ index (if pointer used as array base)

int main() {
  int a[2], i, x, *p;
  p=a;
  if (x==0)
    a[i]=0;
  else
    a[i+1]=1;
  assert(*(p+2)==1);
}

C := p1 := store(p0, 0, &a[0])                (store object at position 0)
   ∧ p2 := store(p1, 1, 0)                    (store index at position 1)
   ∧ g1 := (x1 == 0)
   ∧ a1 := store(a0, i0, 0)
   ∧ a2 := a0
   ∧ a3 := store(a2, 1 + i0, 1)
   ∧ a4 := ite(g1, a1, a3)
   ∧ p3 := store(p2, 1, select(p2, 1) + 2)    (update index)

slide-162
SLIDE 162

Encoding of Pointers

  • arrays and records / tuples typically handled directly by SMT-solver

  • pointers modelled as tuples

– p.o ≙ representation of underlying object
– p.i ≙ index (if pointer used as array base)

int main() {
  int a[2], i, x, *p;
  p=a;
  if (x==0)
    a[i]=0;
  else
    a[i+1]=1;
  assert(*(p+2)==1);
}

P := i0 ≥ 0 ∧ i0 < 2
   ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2
   ∧ select(p3, 0) == &a[0]
   ∧ select(select(p3, 0), select(p3, 1)) == 1

negation satisfiable (a[2] unconstrained) ⇒ assert fails
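The tuple view of a pointer can be imitated with a two-field struct. The sketch below is an illustration under assumptions: `ptr_model`, `ptr_from_array`, `ptr_add` and `deref_in_bounds` are hypothetical names, and the object is represented by the array's base address. It shows why *(p+2) is a problem for int a[2]: the object component still refers to a, but the index component has left the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define A_LEN 2

/* A pointer modelled as a tuple: the object it points into and an index. */
struct ptr_model {
    const int *object;  /* p.o: underlying object (here the base of an array) */
    long index;         /* p.i: index relative to that base */
};

/* p = a, corresponding to store(store(p0, 0, &a[0]), 1, 0). */
struct ptr_model ptr_from_array(const int *a)
{
    assert(a != NULL);
    struct ptr_model p = { a, 0 };
    return p;
}

/* p + k, corresponding to store(p, 1, select(p, 1) + k): only the index changes. */
struct ptr_model ptr_add(struct ptr_model p, long k)
{
    p.index += k;
    return p;
}

/* Bounds property for dereferencing p inside an array of len elements. */
bool deref_in_bounds(struct ptr_model p, long len)
{
    return p.index >= 0 && p.index < len;
}
```

Applying `ptr_add(ptr_from_array(a), 2)` yields index 2, for which `deref_in_bounds(..., A_LEN)` is false, in line with the slide's remark that a[2] is unconstrained.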

slide-165
SLIDE 165

Encoding of Memory Allocation

  • model memory just as an array of bytes (array theories)

– read and write operations to the memory array on the logic level

  • each dynamic object d_o consists of

– m ≙ memory array
– s ≙ size in bytes of m
– ρ ≙ unique identifier
– υ ≙ indicates whether the object is still alive
– l ≙ the location in the execution where m is allocated

  • to detect invalid reads/writes, we check whether

– d_o is a dynamic object
– i is within the bounds of the memory array

l_is_dynamic_object ⇔ (⋁_{j=1}^{k} d_o.ρ = j) ∧ (0 ≤ i < n)

slide-167
SLIDE 167

Encoding of Memory Allocation

  • to check for invalid objects, we

– set υ to true if the function malloc can allocate memory (d_o is alive)
– set υ to false if the function free is called (d_o is no longer alive)

l_valid_object ⇔ (l_is_dynamic_object ⇒ d_o.υ)

  • to detect forgotten memory, at the end of the (unrolled) program we check

– whether the d_o has been deallocated by the function free

l_deallocated_object ⇔ (l_is_dynamic_object ⇒ ¬ d_o.υ)
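The object record and the two literals can be written down as a struct with predicate functions. This is a sketch under assumptions: k is read as the number of dynamic objects and n as the size s of the memory array, neither of which is spelled out on the slides, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One dynamic object, following the fields listed above. */
struct dyn_object {
    unsigned char *m;  /* memory array */
    size_t s;          /* size of m in bytes */
    unsigned rho;      /* unique identifier, assumed to range over 1..k */
    bool upsilon;      /* still alive? */
    int l;             /* location where m was allocated (e.g. a line number) */
};

/* l_is_dynamic_object: rho names one of the k known objects and
   index i lies inside the memory array (n taken to be d->s here). */
bool is_dynamic_object(const struct dyn_object *d, unsigned k, long i)
{
    assert(d != NULL);
    bool known_id = d->rho >= 1 && d->rho <= k;   /* OR over j = 1..k of (rho == j) */
    bool in_bounds = i >= 0 && (size_t)i < d->s;  /* 0 <= i < n */
    return known_id && in_bounds;
}

/* l_valid_object: an access to a dynamic object requires the object to be alive. */
bool valid_object(const struct dyn_object *d, unsigned k, long i)
{
    return !is_dynamic_object(d, k, i) || d->upsilon;
}
```

Clearing `upsilon`, as the model does when free is called, turns a previously valid access into an invalid one, which is how a use-after-free shows up as a property violation.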

slide-168
SLIDE 168

Example of Memory Allocation

#include <stdlib.h>

int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p=q;
  free(p);
  p = malloc(5); // ρ = 3
  free(p);
}

Assume that the malloc calls succeed

slide-169
SLIDE 169

Example of Memory Allocation

#include <stdlib.h>

int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p=q;
  free(p);
  p = malloc(5); // ρ = 3
  free(p);
}

memory leak: pointer reassignment makes d_o1 an orphan

slide-170
SLIDE 170

Example of Memory Allocation

#include <stdlib.h>

int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p=q;
  free(p);
  p = malloc(5); // ρ = 3
  free(p);
}

C := d_o1.ρ=1 ∧ d_o1.s=5 ∧ d_o1.υ=true ∧ p=d_o1
   ∧ d_o2.ρ=2 ∧ d_o2.s=5 ∧ d_o2.υ=true ∧ q=d_o2
   ∧ p=d_o2 ∧ d_o2.υ=false
   ∧ d_o3.ρ=3 ∧ d_o3.s=5 ∧ d_o3.υ=true ∧ p=d_o3
   ∧ d_o3.υ=false

P := ¬d_o1.υ ∧ ¬d_o2.υ ∧ ¬d_o3.υ
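The constraints can be replayed by hand on the three alive flags. The sketch below does this in C, with each pointer holding the identifier ρ of the object it targets. It is a simulation of the model and performs no real allocation; the function name `count_leaked_objects` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_OBJECTS 3

/* Replays the example program on the alive flags (upsilon) of its three
   objects and returns how many are still alive at the end. */
int count_leaked_objects(void)
{
    bool alive[NUM_OBJECTS + 1] = { false };  /* indexed by rho = 1..3 */
    int p, q;                                 /* each pointer holds the rho of its target */

    p = 1; alive[p] = true;   /* p = malloc(5); rho = 1 */
    q = 2; alive[q] = true;   /* q = malloc(5); rho = 2 */
    p = q;                    /* object 1 is no longer reachable */
    alive[p] = false;         /* free(p) releases object 2 */
    p = 3; alive[p] = true;   /* p = malloc(5); rho = 3 */
    alive[p] = false;         /* free(p) releases object 3 */

    assert(!alive[q]);        /* q now dangles: its target was freed through p */

    int leaked = 0;
    for (int rho = 1; rho <= NUM_OBJECTS; rho++)
        leaked += alive[rho]; /* P requires every upsilon to be false */
    return leaked;
}
```

Exactly one flag stays true at the end, that of the first object, so P is violated and the leak is reported.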

slide-174
SLIDE 174

Align-guaranteed memory model

  • Alignment rules require that any pointer variable must be aligned to at least the alignment of the pointer type

§ E.g., an integer pointer’s value must be aligned to at least 4 bytes, for 32-bit integers

  • Encode property assertions when dereferences occur during symbolic execution

§ To guard against executions where an unaligned pointer is dereferenced
§ This is not as strong as the C standard requirement, that a pointer variable may never hold an unaligned value

  • But it provides a guarantee that any pointer dereference will either be correctly aligned or result in a verification failure

slide-177
SLIDE 177

ESBMC’s memory model

  • statically tracks possible pointer variable targets (objects)

– dereferencing a pointer leads to the construction of guarded references to each potential target

  • C is very liberal about permitted dereferences
  • SAT: immediate access to bit-level representation

struct foo {
  uint16_t bar[2];
  uint8_t baz;
};
struct foo qux;
char *quux = (char *)&qux;
quux++;
*quux;

pointer and object types do not match

slide-178
SLIDE 178

ESBMC’s memory model

  • statically tracks possible pointer variable targets (objects)

– dereferencing a pointer leads to the construction of guarded references to each potential target

  • C is very liberal about permitted dereferences
  • SMT: sorts must be repeatedly unwrapped

struct foo {
  uint16_t bar[2];
  uint8_t baz;
};
struct foo qux;
char *quux = (char *)&qux;
quux++;
*quux;

pointer and object types do not match

slide-182
SLIDE 182

Byte-level data extraction in SMT

  • access to underlying data bytes is complicated

– requires manipulation of arrays / tuples

  • problem is magnified by nondeterministic offsets
  • supporting all legal behaviors at SMT layer difficult

– extract (unaligned) 16bit integer from *fuzz

  • experiments showed significantly increased memory consumption

uint16_t *fuzz;
if (nondet_bool()) {
  fuzz = &qux.bar[0];
} else {
  fuzz = (uint16_t *)&qux.baz;
}

– chooses accessed field nondeterministically
– requires a byte_extract expression
– handles the tuple that encoded the struct
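At the C level, extracting a 16-bit value from an arbitrary byte offset is a single memcpy over the object's bytes, which is what makes the operation look harmless in the source program. The sketch below shows that concrete operation for struct foo. The function name `byte_extract_u16` is hypothetical and is not ESBMC's byte_extract expression; an SMT encoding has to reproduce the same effect over arrays and tuples for a symbolic offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct foo { uint16_t bar[2]; uint8_t baz; };

/* Reads a 16-bit value from the raw bytes of *obj at a byte offset.
   Returns false if the two bytes do not lie inside the object. */
bool byte_extract_u16(const struct foo *obj, size_t offset, uint16_t *out)
{
    assert(obj != NULL && out != NULL);
    if (offset > sizeof *obj || sizeof *obj - offset < sizeof *out)
        return false;
    memcpy(out, (const unsigned char *)obj + offset, sizeof *out);
    return true;
}
```

Reading at offsets 0 and 2 returns bar[0] and bar[1]. Reading at odd offsets mixes bytes of neighbouring fields, and the result then depends on the machine's byte order.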

slide-187
SLIDE 187

“Aligned” Memory Model

  • framework cannot easily be changed to SMT-level byte representation (à la LLBMC)

  • push unwrapping of SMT data structures to dereference
  • enforce C alignment rules

– static analysis of pointer alignment eliminates need to encode unaligned data accesses → reduces number of behaviors that must be modeled
– add alignment assertions (if static analysis not conclusive)

– extracting 16-bit integer from *fuzz if guard is true:

– offset = 0: project bar[0] out of foo
– offset = 1: “unaligned memory access” failure
– offset = 2: project bar[1] out of foo
– offset = 3: “unaligned memory access” failure
– offset = 4: “access to object out of bounds” failure
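The five outcomes listed above follow from two checks applied in order: alignment first, then bounds. The sketch below reproduces that classification. It assumes that the data reachable through fuzz is the 4 bytes of bar and that a 16-bit access needs 2-byte alignment; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum access_result { ACCESS_OK, ACCESS_UNALIGNED, ACCESS_OUT_OF_BOUNDS };

/* Classifies a read of `width` bytes at byte `offset` into an object of
   `object_size` bytes; `width` doubles as the required alignment. */
enum access_result classify_access(size_t offset, size_t width, size_t object_size)
{
    assert(width > 0);
    if (offset % width != 0)
        return ACCESS_UNALIGNED;
    if (offset > object_size || object_size - offset < width)
        return ACCESS_OUT_OF_BOUNDS;
    return ACCESS_OK;
}
```

With width 2 and object size 4, offsets 0 and 2 are accepted, offsets 1 and 3 are rejected as unaligned, and offset 4 is rejected as out of bounds, so only two projections ever need to be encoded.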

slide-188
SLIDE 188
Summary

  • Described the difference between soundness and completeness concerning detection techniques

– False positive and false negative

  • Pointed out the difference between static analysis and testing / simulation

– hybrid combination of static and dynamic analysis techniques to achieve a good trade-off between soundness and completeness

  • Explained bounded model checking of software

– it has been applied successfully to verify single-threaded software using a precise memory model