SLIDE 1 Detection of Software Vulnerabilities: Static Analysis (Part I)
Lucas Cordeiro Department of Computer Science lucas.cordeiro@manchester.ac.uk Systems and Software Verification Laboratory
SLIDE 2 Static Analysis
- Lucas Cordeiro (Formal Methods Group)
§ lucas.cordeiro@manchester.ac.uk § Office: 2.28 § Office hours: 15-16 Tuesday, 14-15 Wednesday
§ Model checking (Chapter 14) § Software model checking. ACM Comput. Surv., 2009 § The Cyber Security Body of Knowledge, 2019 § Software Engineering (Chapters 8, 13)
SLIDE 9
- The functionality demanded of software has increased significantly
– Peer reviewing and testing
- Multi-core processors with scalable shared memory /
message passing
– Static and dynamic verification
void *threadA(void *arg) {
  lock(&mutex);
  x++;
  if (x == 1) lock(&lock);
  unlock(&mutex);
  lock(&mutex);
  x--;
  if (x == 0) unlock(&lock);
  unlock(&mutex);
}

void *threadB(void *arg) {
  lock(&mutex);
  y++;
  if (y == 1) lock(&lock);
  unlock(&mutex);
  lock(&mutex);
  y--;
  if (y == 0) unlock(&lock);
  unlock(&mutex);
}
(CS1) (CS2) (CS3)
Deadlock
Motivating Example
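One way to see why the two threads can deadlock is to enumerate their interleavings mechanically. The following is a minimal sketch, not the lecture's tool: it assumes each statement is atomic, models both locks as owner fields, and uses hypothetical names (State, step, deadlock_reachable) that do not appear on the slides.

```c
#include <assert.h>

#define FREE (-1)
#define DONE 8

/* Hypothetical model state: program counters, the counters x and y
   (cnt[0] and cnt[1]), and the owner of each lock (FREE if unowned). */
typedef struct { int pc[2]; int cnt[2]; int mutex; int lck; } State;

State initial_state(void) {
    State s = { {0, 0}, {0, 0}, FREE, FREE };
    return s;
}

/* Execute one statement of thread t; returns 0 if t is blocked or finished. */
static int step(State *s, int t) {
    assert(t == 0 || t == 1);
    switch (s->pc[t]) {
    case 0: case 4:                          /* lock(&mutex) */
        if (s->mutex != FREE) return 0;
        s->mutex = t;
        break;
    case 1: s->cnt[t]++; break;              /* x++ or y++ */
    case 2:                                  /* if (x == 1) lock(&lock) */
        if (s->cnt[t] == 1) {
            if (s->lck != FREE) return 0;
            s->lck = t;
        }
        break;
    case 3: case 7: s->mutex = FREE; break;  /* unlock(&mutex) */
    case 5: s->cnt[t]--; break;              /* x-- or y-- */
    case 6:                                  /* if (x == 0) unlock(&lock) */
        if (s->cnt[t] == 0) s->lck = FREE;
        break;
    default: return 0;                       /* thread has finished */
    }
    s->pc[t]++;
    return 1;
}

/* Depth-first exploration of all interleavings; returns 1 if some
   reachable state has an unfinished thread and no thread can move. */
int deadlock_reachable(State s) {
    if (s.pc[0] == DONE && s.pc[1] == DONE) return 0;
    int moved = 0;
    for (int t = 0; t < 2; t++) {
        State next = s;
        if (step(&next, t)) {
            moved = 1;
            if (deadlock_reachable(next)) return 1;
        }
    }
    return !moved;
}
```

In this model the search finds a schedule where threadA holds lock and waits for mutex while threadB holds mutex and waits for lock, so neither can proceed.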
SLIDE 15
- Introduce software verification and validation
- Understand soundness and completeness
concerning detection techniques
- Emphasize the difference among static
analysis, testing / simulation, and debugging
- Explain bounded model checking of software
- Explain precise memory model for software
verification
Intended learning outcomes
SLIDE 18
- Verification: "Are we building the product right?"
§ The software should conform to its specification
- Validation: "Are we building the right product?"
§ The software should do what the user requires
- Verification and validation must be applied at each
stage in the software process, with two objectives:
§ The discovery of defects in a system
§ The assessment of whether or not the system is usable in an operational situation
Verification vs Validation
SLIDE 20
- Software inspections are concerned with the analysis of the static system representation to discover problems (static verification)
§ May be supplemented by tool-based document and code analysis
§ Code analysis can prove the absence of errors but might be subject to incorrect results
- Software testing is concerned with exercising and observing product behaviour (dynamic verification)
§ The system is executed with test data
§ Operational behaviour is observed
§ Can reveal the presence of errors, NOT their absence
Static and Dynamic Verification
SLIDE 21 Static and Dynamic Verification
Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley
SLIDE 24
- Careful planning is required to get the most out of
dynamic and static verification
§ Planning should start early in the development process
§ The plan should identify the balance between static and dynamic verification
- V & V should establish confidence that the software
is fit for purpose
V & V planning
V & V planning depends on the system’s purpose, user expectations, and marketing environment
SLIDE 25 The V-model of development
[Figure: V-model of development. Development stages: requirements specification, system specification, system design, detailed design, module and unit code and test. Test plans: acceptance test plan, system integration test plan, sub-system integration test plan. Test stages: sub-system integration test, system integration test, acceptance test, service.]
Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley
SLIDE 26
- Introduce software verification and validation
- Understand soundness and completeness
concerning detection techniques
- Emphasize the difference among static
analysis, testing / simulation, and debugging
- Explain bounded model checking of software
- Explain unbounded model checking of
software
Intended learning outcomes
SLIDE 30 Detection of Vulnerabilities
- Detect the presence of vulnerabilities in the code
during the development, testing, and maintenance
- Trade-off between soundness and completeness
§ A detection technique is sound for a given category of vulnerabilities if it can correctly conclude that a given program has no vulnerabilities of that category
- An unsound detection technique may have false negatives, i.e., actual vulnerabilities that the detection technique fails to find
§ A detection technique is complete for a given category of vulnerabilities if any vulnerability it finds is an actual vulnerability
- An incomplete detection technique may have false positives, i.e., it may detect issues that do not turn out to be actual vulnerabilities
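The two definitions can be made concrete by comparing a tool's reports against a known ground truth. The sketch below is illustrative only: the Verdict type and compare_reports function are hypothetical, and it assumes the ground truth for each benchmark program is available, which is rarely the case outside curated test suites.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int false_negatives; int false_positives; } Verdict;

/* actual[i] != 0: program i really is vulnerable (assumed ground truth);
   reported[i] != 0: the detection tool flagged program i. */
Verdict compare_reports(const int *actual, const int *reported, size_t n) {
    assert(actual != NULL && reported != NULL);
    Verdict v = {0, 0};
    for (size_t i = 0; i < n; i++) {
        if (actual[i] && !reported[i]) v.false_negatives++;  /* missed vulnerability */
        if (!actual[i] && reported[i]) v.false_positives++;  /* false alarm */
    }
    return v;
}
```

A non-zero false_negatives count is evidence that the tool is unsound for the category; a non-zero false_positives count is evidence that it is incomplete. Zero counts on a finite benchmark do not prove either property.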
SLIDE 32 Detection of Vulnerabilities
- Achieving soundness requires reasoning about all
executions of a program (usually an infinite number)
§ This can be done by static checking of the program code while making suitable abstractions of the executions
- Achieving completeness can be done by performing
actual, concrete executions of a program that are witnesses to any vulnerability reported
§ The analysis technique has to come up with concrete inputs for the program that trigger a vulnerability
§ A typical dynamic approach is software testing: the tester writes test cases with concrete inputs and specific checks for the outputs
SLIDE 34
Detection of Vulnerabilities
Detection tools can use a hybrid combination of static and dynamic analysis techniques to achieve a good trade-off between soundness and completeness.
Dynamic verification should be used in conjunction with static verification to provide full code coverage.
SLIDE 35
- Introduce software verification and validation
- Understand soundness and completeness
concerning detection techniques
- Emphasize the difference among static
analysis, testing / simulation, and debugging
- Explain bounded model checking of software
- Explain unbounded model checking of
software
Intended learning outcomes
SLIDE 36 Static analysis vs Testing/ Simulation
- Checks only some of the system executions
§ May miss errors
- A successful execution is an execution that
discovers one or more errors
[Figure: simulation/testing reports either OK or error]
SLIDE 37 Static analysis vs Testing/ Simulation
- Exhaustively explores all executions
- Report errors as traces
- May produce incorrect results
[Figure: model checking takes a specification and reports either OK or an error trace (Line 5: …, Line 12: …, …, Line 41: …)]
SLIDE 38 Avoiding state space explosion
- Bounded Model Checking (BMC)
§ Breadth-first search (BFS) approach
§ Depth-first search (DFS) approach
SLIDE 39 Bounded Model Checking
- Bounded model checkers explore the state space in depth
- They can only prove correctness if all states are reachable within the bound
[Figure: state space unrolled level by level, k = 0 to k = 6]
A graph G = (V, E) consists of:
- V: a set of vertices or nodes
- E ⊆ V × V: a set of edges connecting the nodes
SLIDE 40 Breadth-First Search (BFS)
BFS(G,s)
  // initialization of graph nodes
  for each vertex u ∈ V[G] - {s}    // all nodes except the anchor (s)
    colour[u] ← white               // u colour
    d[u] ← ∞                        // distance from s
    π[u] ← NIL                      // u predecessor
  // initializes the anchor node (s)
  colour[s] ← grey
  d[s] ← 0
  π[s] ← NIL
  enqueue(Q,s)
  while Q ≠ ∅ do
    u ← dequeue(Q)
    // visit each adjacent node of u
    for each v ∈ Adj[u] do
      if colour[v] = white then
        colour[v] ← grey
        d[v] ← d[u] + 1
        π[v] ← u
        enqueue(Q,v)
    colour[u] ← blue
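The BFS pseudocode translates fairly directly into C. The following is a sketch under assumptions that are not on the slide: vertices are numbered 0..n-1, the graph is an adjacency matrix of at most MAX_V vertices, and the value INF (-1) stands in for the distance ∞.

```c
#include <assert.h>

#define MAX_V 16
#define INF (-1)   /* stands in for the ∞ distance of unreached vertices */

enum { WHITE, GREY, BLUE };

/* adj[u][v] != 0 means there is an edge u -> v; vertices are 0..n-1.
   On return, d[v] is the distance from s and pi[v] the BFS predecessor. */
void bfs(int n, int adj[MAX_V][MAX_V], int s, int d[MAX_V], int pi[MAX_V]) {
    assert(n > 0 && n <= MAX_V && s >= 0 && s < n);
    int colour[MAX_V], queue[MAX_V], head = 0, tail = 0;

    for (int u = 0; u < n; u++) {       /* initialization of graph nodes */
        colour[u] = WHITE;
        d[u] = INF;
        pi[u] = -1;                     /* -1 plays the role of NIL */
    }
    colour[s] = GREY;                   /* initialize the anchor node */
    d[s] = 0;
    queue[tail++] = s;

    while (head < tail) {
        int u = queue[head++];
        for (int v = 0; v < n; v++) {   /* visit each adjacent node of u */
            if (adj[u][v] && colour[v] == WHITE) {
                colour[v] = GREY;
                d[v] = d[u] + 1;
                pi[v] = u;
                queue[tail++] = v;      /* each vertex is enqueued at most once */
            }
        }
        colour[u] = BLUE;
    }
}
```

Because vertices are dequeued in order of increasing distance, d[v] is the length of a shortest path from s to v, which is why a breadth-first bounded search reports a shortest counterexample first.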
SLIDE 41 BFS Example
[Figure: graph with nodes 1 to 7, traversed breadth-first step by step (slides 41 to 48)]
SLIDE 49 Symbolic Execution
- Symbolic execution explores all paths individually
- It can only prove correctness if all paths are explored
SLIDE 50
Depth-first search (DFS)
Paint all vertices white and initialize the fields π with NIL, where π[u] represents the predecessor of u
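A depth-first search with discovery and finishing timestamps, in the d/f style used on the DFS example slides, might look as follows. This is a sketch, not the slide's own pseudocode: the Dfs struct, the adjacency-matrix representation and the 0-based vertex numbering are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_V 16
enum { WHITE, GREY, BLACK };

typedef struct {
    int n;                              /* number of vertices, 0..n-1 */
    int adj[MAX_V][MAX_V];              /* adj[u][v] != 0: edge u -> v */
    int colour[MAX_V], pi[MAX_V];
    int disc[MAX_V], fin[MAX_V];        /* discovery / finishing times */
    int time;
} Dfs;

static void dfs_visit(Dfs *g, int u) {
    g->colour[u] = GREY;
    g->disc[u] = ++g->time;             /* u has just been discovered */
    for (int v = 0; v < g->n; v++) {
        if (g->adj[u][v] && g->colour[v] == WHITE) {
            g->pi[v] = u;
            dfs_visit(g, v);
        }
    }
    g->colour[u] = BLACK;
    g->fin[u] = ++g->time;              /* all descendants of u are finished */
}

void dfs(Dfs *g) {
    assert(g->n > 0 && g->n <= MAX_V);
    for (int u = 0; u < g->n; u++) {    /* paint all vertices white, π = NIL */
        g->colour[u] = WHITE;
        g->pi[u] = -1;
    }
    g->time = 0;
    for (int u = 0; u < g->n; u++)
        if (g->colour[u] == WHITE) dfs_visit(g, u);
}
```

For a chain 0 → 1 → 2 the timestamps come out as 1/6, 2/5 and 3/4: a vertex finishes only after everything reachable from it has finished.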
SLIDE 51 DFS Example
2 1 7 6 5 4 3
1/
SLIDE 52 DFS Example
2 1 7 6 5 4 3
1/ 2/
SLIDE 53 DFS Example
2 1 7 6 5 4 3
1/ 2/ 3/
SLIDE 54 DFS Example
2 1 7 6 5 4 3
1/ 4/ 3/ 2/
SLIDE 55 DFS Example
2 1 7 6 5 4 3
1/ 3/ 4/ 5/ 2/
SLIDE 56 DFS Example
2 1 7 6 5 4 3
1/ 2/9 4/7 5/6 3/8
SLIDE 57 DFS Example
2 1 7 6 5 4 3
1/ 2/9 4/7 5/6 3/8 10/
SLIDE 58 DFS Example
2 1 7 6 5 4 3
1/ 2/9 4/7 5/6 3/8 10/ 11/
SLIDE 59 DFS Example
2 1 7 6 5 4 3
1/ 2/9 4/7 5/6 3/8 10/ 11/12
SLIDE 60 DFS Example
2 1 7 6 5 4 3
1/ 2/9 4/7 5/6 3/8 10/13 11/12
SLIDE 61 DFS Example
2 1 7 6 5 4 3
1/14 2/9 4/7 5/6 3/8 10/13 11/12
SLIDE 62 DFS Example
2 1 7 6 5 4 3
1/14 2/9 4/7 5/6 3/8 10/13 11/12 15/16
SLIDE 66
- V & V and debugging are distinct processes
- V & V is concerned with establishing the absence or
existence of defects in a program
- Debugging is concerned with two main tasks
§ Locating and
§ Repairing these errors
- Debugging involves
§ Formulating a hypothesis about program behaviour
§ Testing these hypotheses to find the system error
V&V and debugging
SLIDE 67 The debugging process
[Figure: the debugging process. Inputs: test results, specification, test cases. Steps: locate error, design error repair, repair error, re-test program.]
Ian Sommerville. Software Engineering (6th,7th or 8th Edn) Addison Wesley
SLIDE 68
- Introduce software verification and validation
- Understand soundness and completeness
concerning detection techniques
- Emphasize the difference among static
analysis, testing / simulation, and debugging
- Explain bounded model checking of software
- Explain precise memory model for software
verification
Intended learning outcomes
SLIDE 71 Circuit Satisfiability
- A Boolean formula contains
§ Variables whose values are 0 or 1
§ Connectives: ∧ (AND), ∨ (OR), and ¬ (NOT)
- A Boolean formula is SAT if there exists some
assignment to its variables that evaluates it to 1
SLIDE 72 Circuit Satisfiability
- A Boolean combinational circuit consists of
one or more Boolean combinational elements
interconnected by wires
SAT: <x1 = 1, x2 = 1, x3 = 0>
SLIDE 75 Circuit-Satisfiability Problem
- Given a Boolean combinational circuit of
AND, OR, and NOT gates, is it satisfiable?
§ Size: number of Boolean combinational elements plus the number of wires
- if the circuit has k inputs, then we would have to check up to 2^k possible assignments
§ When the size of C is polynomial in k, checking every assignment takes Ω(2^k) time
- Super-polynomial in the size of the circuit
CIRCUIT-SAT = {<C> : C is a satisfiable Boolean combinational circuit}
SLIDE 82 Formula Satisfiability (SAT)
- The SAT problem asks whether a given Boolean
formula is satisfiable
§ Example:
- Φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2
- Assignment: <x1 = 0, x2 = 0, x3 = 1, x4 = 1>
- Φ = ((0 → 0) ∨ ¬((¬0 ↔ 1) ∨ 1)) ∧ ¬0
- Φ = (1 ∨ ¬(1 ∨ 1)) ∧ 1
- Φ = (1 ∨ 0) ∧ 1
- Φ = 1
SAT = {<Φ> : Φ is a satisfiable Boolean formula}
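The evaluation above can be replayed in code, and the same function can be used for a brute-force satisfiability check over all 2^4 assignments. This is a sketch for illustration: the helper names implies, iff, phi and count_models are not from the slides.

```c
#include <assert.h>
#include <stdbool.h>

static bool implies(bool p, bool q) { return !p || q; }
static bool iff(bool p, bool q)     { return p == q; }

/* Φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2 */
bool phi(bool x1, bool x2, bool x3, bool x4) {
    return (implies(x1, x2) || !(iff(!x1, x3) || x4)) && !x2;
}

/* Enumerate all 2^4 assignments and count the satisfying ones. */
int count_models(void) {
    int count = 0;
    for (unsigned bits = 0; bits < 16u; bits++)
        if (phi(bits & 1u, bits & 2u, bits & 4u, bits & 8u)) count++;
    assert(count <= 16);
    return count;
}
```

Here phi(false, false, true, true) is true, matching the slide's assignment, and 5 of the 16 assignments satisfy Φ. Enumeration like this costs 2^k evaluations for k variables, which is what SAT solvers try to avoid.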
SLIDE 85 DPLL satisfiability solving
Given a Boolean formula φ in clausal form (an AND of ORs)
{{a, b}, {¬a, b}, {a, ¬b}, {¬a, ¬b}}
determine whether a satisfying assignment of variables to truth values exists.
Solvers based on the Davis-Putnam-Logemann-Loveland (DPLL) algorithm:
1. If φ = ∅ then SAT
2. If ⃞ ∈ φ then UNSAT
3. If φ = φ’ ∪ {{x}} then DPLL(φ’[x ↦ true])
   If φ = φ’ ∪ {{¬x}} then DPLL(φ’[x ↦ false])
4. Pick arbitrary x and return
   DPLL(φ[x ↦ false]) ∨ DPLL(φ[x ↦ true])
+ NP-complete but many heuristics and optimizations ⇒ can handle problems with 100,000’s of variables
Search tree for {{a, b}, {¬a, b}, {a, ¬b}}:
  a ↦ false: {{b}, {¬b}}
    b ↦ false: {⃞}
    b ↦ true: {⃞}
  a ↦ true: {{b}}
    b ↦ true: ∅
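The four DPLL rules can be sketched in a few lines of C. This is a teaching sketch, not how production solvers are built: it re-scans every clause under the current partial assignment instead of simplifying the formula, and the Clause layout (literals as signed integers, variables numbered from 1) is an assumption of the example.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VARS 32
#define MAX_LITS 8

/* A clause is a list of literals: +v means variable v, -v means ¬v. */
typedef struct { int lits[MAX_LITS]; int len; } Clause;

/* value[v]: 0 = unassigned, 1 = true, -1 = false (variables are 1..nvars). */
static int lit_value(const int *value, int lit) {
    int v = value[abs(lit)];
    return lit > 0 ? v : -v;
}

/* Returns 1 (SAT) or 0 (UNSAT); on SAT, value[] holds a satisfying assignment. */
int dpll(const Clause *cs, int nclauses, int nvars, int *value) {
    assert(nvars < MAX_VARS);
    int unit = 0, open = 0;                 /* open = clauses not yet satisfied */
    for (int i = 0; i < nclauses; i++) {
        int satisfied = 0, unassigned = 0, last = 0;
        for (int j = 0; j < cs[i].len; j++) {
            int val = lit_value(value, cs[i].lits[j]);
            if (val > 0) { satisfied = 1; break; }
            if (val == 0) { unassigned++; last = cs[i].lits[j]; }
        }
        if (satisfied) continue;
        open++;
        if (unassigned == 0) return 0;      /* rule 2: empty clause, UNSAT */
        if (unassigned == 1 && unit == 0) unit = last;
    }
    if (open == 0) return 1;                /* rule 1: no clauses left, SAT */

    if (unit != 0) {                        /* rule 3: a unit clause forces its literal */
        value[abs(unit)] = unit > 0 ? 1 : -1;
        if (dpll(cs, nclauses, nvars, value)) return 1;
        value[abs(unit)] = 0;
        return 0;
    }

    int v = 1;                              /* rule 4: branch on an unassigned variable */
    while (v <= nvars && value[v] != 0) v++;
    assert(v <= nvars);
    for (int choice = -1; choice <= 1; choice += 2) {   /* false first, then true */
        value[v] = choice;
        if (dpll(cs, nclauses, nvars, value)) return 1;
    }
    value[v] = 0;
    return 0;
}
```

With a = 1 and b = 2, the four-clause formula from the slide is reported UNSAT, while dropping the clause {¬a, ¬b} yields SAT with a and b both true, as in the search tree.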
SLIDE 86
SAT solving as enabling technology
SLIDE 87
SAT Competition
SLIDE 89
Bounded Model Checking (BMC)
MC: check if a property holds for all states
BMC: check if a property holds for a subset of states
[Figure: state space from Init towards an error state, explored up to bound k]
SLIDE 91 Bounded Model Checking (BMC)
[Figure: MC takes M, S and asks "IS THERE ANY ERROR?" (yes: fail; no). BMC takes M, S and a bound and asks "IS THERE ANY ERROR IN k STEPS?" (yes: fail; no: either the completeness threshold is reached, which “never” happens in practice, or k+1 is still tractable, or k+1 is intractable).]
SLIDE 95 Bounded Model Checking
Basic Idea: check negation of given property up to given depth
- transition system M unrolled k times
– for programs: unroll loops, unfold arrays, …
- translated into verification condition ψ such that
ψ satisfiable iff ϕ has counterexample of max. depth k
- has been applied successfully to verify HW/SW systems
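[Figure: transition system unrolled up to the bound as M0, M1, M2, …, Mk-1, Mk, checked against the negated property ¬ϕ0 ∨ ¬ϕ1 ∨ ¬ϕ2 ∨ … ∨ ¬ϕk-1 ∨ ¬ϕk; a satisfying assignment yields a counterexample trace]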
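For programs, unrolling a loop k times replaces it with k nested copies of its body. The sketch below shows the idea by hand for k = 3; the function names and the complete flag, which plays the role of an unwinding assertion, are illustrative assumptions rather than output of any particular tool.

```c
#include <assert.h>

/* Original program: sums 0 + 1 + ... + (n-1). */
int sum_loop(int n) {
    int s = 0;
    for (int i = 0; i < n; i++) s += i;
    return s;
}

/* The same loop unrolled for bound k = 3. *complete is set to 0 when the
   loop would need a fourth iteration, i.e. the bound is too small. */
int sum_unrolled_k3(int n, int *complete) {
    assert(complete != 0);
    int s = 0, i = 0;
    *complete = 1;
    if (i < n) { s += i; i++;
        if (i < n) { s += i; i++;
            if (i < n) { s += i; i++;
                if (i < n) *complete = 0;   /* unwinding check fails */
            }
        }
    }
    return s;
}
```

For n ≤ 3 the unrolled program computes the same result as the loop, so any property checked on it carries over; for larger n the check only covers the first three iterations.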
SLIDE 101
Satisfiability Modulo Theories (1)
SMT decides the satisfiability of first-order logic
formulae using the combination of different background theories (built-in operators)
Theory             Example
Equality           x1 = x2 ∧ ¬(x1 = x3) ⇒ ¬(x2 = x3)
Bit-vectors        (b >> i) & 1 = 1
Linear arithmetic  (4y1 + 3y2 ≥ 4) ∨ (y2 – 3y3 ≤ 3)
Arrays             (j = k ∧ a[k] = 2) ⇒ a[j] = 2
Combined theories  (j ≤ k ∧ a[j] = 2) ⇒ a[i] < 3
SLIDE 104 Satisfiability Modulo Theories (2)
Given
§ a decidable Σ-theory T
§ a quantifier-free formula ϕ
ϕ is T-satisfiable iff T ∪ {ϕ} is satisfiable, i.e., there exists a structure that satisfies both the formula and the sentences of T
Given
§ a set Γ ∪ {ϕ} of first-order formulae over T
ϕ is a T-consequence of Γ (Γ ⊧T ϕ) iff every model of T ∪ Γ is also a model of ϕ
- Checking Γ ⊧T ϕ can be reduced in the usual way to
checking the T-satisfiability of Γ ∪ {¬ϕ}
SLIDE 109 Satisfiability Modulo Theories (3)
- let a be an array, b, c and d be signed bit-vectors of width
16, 32 and 32 respectively, and let g be a unary function.
§ b' extends b to the signed equivalent bit-vector of size 32
§ replace b' by c − 3 in the inequality
§ using facts about bit-vector arithmetic
SLIDE 113 Satisfiability Modulo Theories (4)
step 3: g(select(store(a, c, 12), c)) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
applying the theory of arrays
step 4: g(12) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
The function g implies that for all x and y, if x = y, then g(x) = g(y) (congruence rule).
step 5: SAT (c = 5, d = 10)
– standard algebraic reduction rules: r ∧ false ↦ false
– contextual simplification: a = 7 ∧ p(a) ↦ a = 7 ∧ p(7)
SLIDE 115 BMC of Software
- program modelled as state transition system
– state: program counter and program variables
– derived from control-flow graph
– checked safety properties give extra nodes
- program unfolded up to given bounds
– loop iterations
– context switches
- unfolded program optimized to reduce blow-up (crucial)
– constant propagation
– forward substitutions
- front-end converts unrolled and optimized program into SSA

int main() {
  int a[2], i, x;
  if (x==0)
    a[i]=0;
  else
    a[i+2]=1;
  assert(a[i+1]==1);
}

g1 = x1 == 0
a1 = a0 WITH [i0:=0]
a2 = a0
a3 = a2 WITH [2+i0:=1]
a4 = g1 ? a1 : a3
t1 = a4 [1+i0] == 1
SLIDE 116 BMC of Software
- program modelled as state transition system
– state: program counter and program variables
– derived from control-flow graph
– checked safety properties give extra nodes
- program unfolded up to given bounds
– loop iterations
– context switches
- unfolded program optimized to reduce blow-up (crucial)
– constant propagation
– forward substitutions
- front-end converts unrolled and optimized program into SSA
- extraction of constraints C and properties P
– specific to selected SMT solver, uses theories
- satisfiability check of C ∧ ¬P

int main() {
  int a[2], i, x;
  if (x==0)
    a[i]=0;
  else
    a[i+2]=1;
  assert(a[i+1]==1);
}

C := [ g1 := (x1 = 0)
     ∧ a1 := store(a0, i0, 0)
     ∧ a2 := a0
     ∧ a3 := store(a2, 2 + i0, 1)
     ∧ a4 := ite(g1, a1, a3) ]

P := [ i0 ≥ 0 ∧ i0 < 2
     ∧ 2 + i0 ≥ 0 ∧ 2 + i0 < 2
     ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2
     ∧ select(a4, i0 + 1) = 1 ]
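The check of C ∧ ¬P can be imitated on a small scale by evaluating the SSA constraints for concrete inputs. The sketch below is an enumeration over a few values, not an SMT encoding. It assumes the property conjoins the bounds checks 0 ≤ i0 < 2, 0 ≤ 2 + i0 < 2 and 0 ≤ 1 + i0 < 2 with select(a4, i0 + 1) = 1, and it models the total arrays of the array theory with a fixed window of indices; the names Array, select_at and count_counterexamples are hypothetical.

```c
#include <assert.h>

#define OFFSET 8
#define SIZE 16

/* SMT arrays are total maps, so out-of-bounds indices are still defined;
   this sketch models indices -8..7 only. */
typedef struct { int cell[SIZE]; } Array;

static Array store(Array a, int i, int v) {
    assert(i + OFFSET >= 0 && i + OFFSET < SIZE);
    a.cell[i + OFFSET] = v;
    return a;
}

static int select_at(Array a, int i) {
    assert(i + OFFSET >= 0 && i + OFFSET < SIZE);
    return a.cell[i + OFFSET];
}

/* Evaluates the constraints C for concrete x1, i0, a0 and returns the value of P. */
int property_holds(int x1, int i0, Array a0) {
    int   g1 = (x1 == 0);
    Array a1 = store(a0, i0, 0);
    Array a2 = a0;
    Array a3 = store(a2, 2 + i0, 1);
    Array a4 = g1 ? a1 : a3;
    return i0 >= 0 && i0 < 2
        && 2 + i0 >= 0 && 2 + i0 < 2
        && 1 + i0 >= 0 && 1 + i0 < 2
        && select_at(a4, i0 + 1) == 1;
}

/* Counts the inputs in a small range for which C ∧ ¬P is satisfied. */
int count_counterexamples(void) {
    Array a0 = {{0}};
    int count = 0;
    for (int x1 = 0; x1 <= 1; x1++)
        for (int i0 = -4; i0 <= 4; i0++)
            if (!property_holds(x1, i0, a0)) count++;
    return count;
}
```

Under these assumptions every one of the 18 inputs is a counterexample, because i0 ≥ 0 and 2 + i0 < 2 cannot both hold: the write a[i+2] is out of bounds for any valid i. An SMT solver reaches the same verdict symbolically by finding C ∧ ¬P satisfiable.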
SLIDE 120 Encoding of Numeric Types
- SMT solvers typically provide different encodings for
numbers:
– abstract domains (Z, R)
– fixed-width bit vectors (unsigned int, …)
▹ “internalized bit-blasting”
- verification results can depend on encodings
(a > 0) ∧ (b > 0) ⇒ (a + b > 0)
– valid in abstract domains such as Z or R
– doesn’t hold for bitvectors, due to possible overflows
– majority of VCs solved faster if numeric types are modelled by abstract domains but possible loss of precision
– ESBMC supports both types of encoding and also combines them to improve scalability and precision
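The dependence on the encoding can be demonstrated exhaustively for 8-bit values. The sketch below is illustrative and uses unsigned 8-bit arithmetic, where wrap-around is well defined in C; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* (a > 0) ∧ (b > 0) ⇒ (a + b > 0), with the sum truncated to 8 bits. */
int holds_u8(uint8_t a, uint8_t b) {
    uint8_t sum = (uint8_t)(a + b);     /* wraps modulo 2^8 */
    return !(a > 0 && b > 0) || sum > 0;
}

/* The same property with the sum kept in a wider type, so nothing wraps. */
int holds_wide(uint8_t a, uint8_t b) {
    int sum = a + b;
    return !(a > 0 && b > 0) || sum > 0;
}

/* Counts the pairs (a, b) of 8-bit values that violate the given property. */
int count_violations(int (*property)(uint8_t, uint8_t)) {
    assert(property != 0);
    int count = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (!property((uint8_t)a, (uint8_t)b)) count++;
    return count;
}
```

With the wide sum there are no violations, mirroring the abstract-domain reading. With the 8-bit sum there are 255, one for each a in 1..255 paired with b = 256 − a, for example a = b = 128.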
SLIDE 121 Encoding Numeric Types as Bitvectors
Bitvector encodings need to handle
- type casts and implicit conversions
§ arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)
- different conversions for every pair of types
- uses type information provided by front-end
SLIDE 122 Encoding Numeric Types as Bitvectors
Bitvector encodings need to handle
- type casts and implicit conversions
§ arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)
- different conversions for every pair of types
- uses type information provided by front-end
§ conversion to / from bool via if-then-else operator
t = ite(v ≠ k, true, false) //conversion to bool
v = ite(t, 1, 0) //conversion from bool
SLIDE 123 Encoding Numeric Types as Bitvectors
Bitvector encodings need to handle
- type casts and implicit conversions
§ arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)
- different conversions for every pair of types
- uses type information provided by front-end
§ conversion to / from bool via if-then-else operator
t = ite(v ≠ k, true, false) //conversion to bool
v = ite(t, 1, 0) //conversion from bool
- arithmetic over- / underflow
§ the C standard requires modular arithmetic for unsigned integers
unsigned_overflow ⇔ (r – (r mod 2^w)) < 2^w
SLIDE 124 Encoding Numeric Types as Bitvectors
Bitvector encodings need to handle
- type casts and implicit conversions
§ arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)
- different conversions for every pair of types
- uses type information provided by front-end
§ conversion to / from bool via if-then-else operator
t = ite(v ≠ k, true, false) //conversion to bool
v = ite(t, 1, 0) //conversion from bool
- arithmetic over- / underflow
§ the C standard requires modular arithmetic for unsigned integers
unsigned_overflow ⇔ (r – (r mod 2^w)) < 2^w
§ define error literals to detect over- / underflow for other types
res_op ⇔ ¬ overflow(x, y) ∧ ¬ underflow(x, y)
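As a rough source-level analogue of these error literals, the following C sketch checks an addition before performing it. It is an assumption-laden illustration (the helper names are hypothetical), not the encoding ESBMC emits: the unsigned check relies on the wrap-around that the C standard guarantees, and the signed checks avoid the overflow instead of observing it.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned addition wraps modulo 2^w, so the wrapped sum is smaller than
   an operand exactly when the exact sum did not fit. */
static bool unsigned_add_overflows(unsigned x, unsigned y) {
    return x + y < x;
}

/* Signed overflow is undefined behaviour in C, so test before adding. */
static bool signed_add_overflow(int x, int y) {
    return y > 0 && x > INT_MAX - y;
}

static bool signed_add_underflow(int x, int y) {
    return y < 0 && x < INT_MIN - y;
}

/* res_op <=> !overflow(x, y) && !underflow(x, y) */
static bool add_result_ok(int x, int y) {
    return !signed_add_overflow(x, y) && !signed_add_underflow(x, y);
}
```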
SLIDE 125 Floating-Point Numbers
- Over-approximate floating-point by fixed-point numbers
– encode the integral (i) and fractional (f) parts
SLIDE 126 Floating-Point Numbers
- Over-approximate floating-point by fixed-point numbers
– encode the integral (i) and fractional (f) parts
- Binary encoding: get a new bit-vector b = i @ f with the
same bitwidth before and after the radix point of a.
// m = number of bits of i // n = number of bits of f
i = Extract(b, n_b + m_a – 1, n_b)               : m_a ≤ m_b
    SignExt(Extract(b, t_b – 1, n_b), m_a – m_b) : otherwise
f = Extract(b, n_b – 1, n_b – n_b)               : n_a ≤ n_b
    Extract(b, n_b, 0) @ SignExt(b, n_a – n_b)   : otherwise
SLIDE 127 Floating-Point Numbers
- Over-approximate floating-point by fixed-point numbers
– encode the integral (i) and fractional (f) parts
- Binary encoding: get a new bit-vector b = i @ f with the
same bitwidth before and after the radix point of a.
- Rational encoding: convert a to a rational number
$$
a =
\begin{cases}
\dfrac{i \cdot p + \left( \dfrac{f \cdot p}{2^{n}} + 1 \right)}{p} & : f \neq 0 \\[2ex]
i & : \text{otherwise}
\end{cases}
$$
// p = number of decimal places // m = number of bits of i // n = number of bits of f
i = Extract(b, n_b + m_a – 1, n_b)               : m_a ≤ m_b
    SignExt(Extract(b, t_b – 1, n_b), m_a – m_b) : otherwise
f = Extract(b, n_b – 1, n_b – n_b)               : n_a ≤ n_b
    Extract(b, n_b, 0) @ SignExt(b, n_a – n_b)   : otherwise
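The binary encoding b = i @ f can be pictured with ordinary integer operations. The sketch below is a simplified illustration, assuming an unsigned value and a fixed number of fractional bits (`FRAC_BITS` and the function names are hypothetical); sign extension and width changes, which the formulas above handle, are left out.

```c
#include <stdint.h>

#define FRAC_BITS 8u  /* n: assumed number of bits after the radix point */

/* b = i @ f: concatenate the integral part i and the fractional part f. */
static uint32_t fx_concat(uint32_t i, uint32_t f) {
    return (i << FRAC_BITS) | (f & ((1u << FRAC_BITS) - 1u));
}

/* Extract the bits before the radix point. */
static uint32_t fx_integral(uint32_t b) {
    return b >> FRAC_BITS;
}

/* Extract the bits after the radix point. */
static uint32_t fx_fractional(uint32_t b) {
    return b & ((1u << FRAC_BITS) - 1u);
}
```

With 8 fractional bits, 3.75 is stored as i = 3 and f = 0.75 * 256 = 192, giving b = 960.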
SLIDE 128 Floating-point SMT Encoding
- The SMT floating-point theory is an addition to the
SMT standard, proposed in 2010 and formalises:
§ Floating-point arithmetic
SLIDE 129 Floating-point SMT Encoding
- The SMT floating-point theory is an addition to the
SMT standard, proposed in 2010 and formalises:
§ Floating-point arithmetic § Positive and negative infinities and zeroes
SLIDE 130 Floating-point SMT Encoding
- The SMT floating-point theory is an addition to the
SMT standard, proposed in 2010 and formalises:
§ Floating-point arithmetic § Positive and negative infinities and zeroes § NaNs
SLIDE 131 Floating-point SMT Encoding
- The SMT floating-point theory is an addition to the
SMT standard, proposed in 2010 and formalises:
§ Floating-point arithmetic § Positive and negative infinities and zeroes § NaNs § Comparison operators
SLIDE 132 Floating-point SMT Encoding
- The SMT floating-point theory is an addition to the
SMT standard, proposed in 2010 and formalises:
§ Floating-point arithmetic § Positive and negative infinities and zeroes § NaNs § Comparison operators § Five rounding modes: round nearest with ties choosing the even value, round nearest with ties choosing away from zero, round towards zero, round towards positive infinity and round towards negative infinity
SLIDE 133 Floating-point SMT Encoding
- Missing from the standard:
§ Floating-point exceptions § Signaling NaNs
SLIDE 134 Floating-point SMT Encoding
- Missing from the standard:
§ Floating-point exceptions § Signaling NaNs
- Two solvers currently support the standard:
§ Z3: implements all operators § MathSAT: implements all but two operators
- fp.rem: remainder: x - y * n, where n in Z is nearest to x/y
- fp.fma: fused multiplication and addition; (x * y) + z
SLIDE 135 Floating-point SMT Encoding
- Missing from the standard:
§ Floating-point exceptions § Signaling NaNs
- Two solvers currently support the standard:
§ Z3: implements all operators § MathSAT: implements all but two operators
- fp.rem: remainder: x - y * n, where n in Z is nearest to x/y
- fp.fma: fused multiplication and addition; (x * y) + z
- Both solvers offer non-standard functions:
§ fp_as_ieeebv: converts floating-point to bitvectors § fp_from_ieeebv: converts bitvectors to floating-point
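The two conversion functions reinterpret a bit pattern without changing it. A rough C analogue, assuming `float` is an IEEE 754 binary32 value (true on mainstream platforms but not guaranteed by the C standard), copies the bytes between a `float` and a 32-bit integer. The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Analogue of fp_as_ieeebv: view the floating-point value as 32 bits. */
static uint32_t float_to_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined type punning */
    return bits;
}

/* Analogue of fp_from_ieeebv: view 32 bits as a floating-point value. */
static float bits_to_float(uint32_t bits) {
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Under that assumption, 1.0f maps to 0x3F800000 and -0.0f to 0x80000000, which shows that the two zeroes are distinct bit patterns even though they compare equal.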
SLIDE 136 How to encode Floating-point programs?
- Most operations performed at program-level to encode
FP numbers have a one-to-one conversion to SMT
- Special cases are casts to
boolean types and the fp.eq operator
§ Usually, cast operations are encoded using extend/extract
§ Extending floating-point numbers is non-trivial because of the format
SLIDE 137 Cast to/from booleans
§ Casting booleans to floating-point numbers can be done using an ite operator
SLIDE 138 Cast to/from booleans
§ Casting booleans to floating-point numbers can be done using an ite operator If true, assign 1f to b
SLIDE 139 Cast to/from booleans
§ Casting booleans to floating-point numbers can be done using an ite operator Otherwise, assign 0f to b
SLIDE 140 Cast to/from booleans
§ Casting floating-point numbers to booleans can be done using an equality and one not:
SLIDE 141 Cast to/from booleans
§ Casting floating-point numbers to booleans can be done using an equality and one not: true when the floating-point number is not 0.0
SLIDE 142 Cast to/from booleans
§ Casting floating-point numbers to booleans can be done using an equality and one not:
false
SLIDE 143 Cast to/from booleans
§ Casting floating-point numbers to booleans can be done using an equality and one not:
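The two casts mirror what C itself does. The sketch below restates them with a conditional expression (the counterpart of ite) and with an equality plus a negation; the function names are hypothetical.

```c
#include <stdbool.h>

/* bool -> float: ite(t, 1.0, 0.0) */
static float bool_to_float(bool t) {
    return t ? 1.0f : 0.0f;
}

/* float -> bool: not (x == 0.0) */
static bool float_to_bool(float x) {
    return !(x == 0.0f);
}
```

Because the comparison is a floating-point equality, -0.0 converts to false, and a NaN converts to true since NaN compares unequal to every value.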
SLIDE 144
Floating-point Encoding: Illustrative Example
SLIDE 145
Floating-point Encoding: Illustrative Example
SLIDE 146
Variable declarations
Floating-point Encoding: Illustrative Example
SLIDE 147
Nondeterministic symbol declaration (optional)
Floating-point Encoding: Illustrative Example
SLIDE 148
Guard used to check satisfiability
Floating-point Encoding: Illustrative Example
SLIDE 149
Assignment of nondeterministic value to x
Floating-point Encoding: Illustrative Example
SLIDE 150
Assignment x to y
Floating-point Encoding: Illustrative Example
SLIDE 151
Check if the comparison satisfies the guard
Floating-point Encoding: Illustrative Example
SLIDE 152
Floating-point Encoding: Illustrative Example
SLIDE 153
Floating-point Encoding: Illustrative Example
SLIDE 154
Floating-point Encoding: Illustrative Example
SLIDE 155
- Introduce software verification and validation
- Understand soundness and completeness
concerning detection techniques
- Emphasize the difference among static
analysis, testing / simulation, and debugging
- Explain bounded model checking of software
- Explain precise memory model for software
verification
Intended learning outcomes
SLIDE 156 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
SLIDE 157 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
SLIDE 158 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
C := [ p1 := store(p0, 0, &a[0]) ∧ p2 := store(p1, 1, 0) ∧ g2 := (x2 == 0) ∧ a1 := store(a0, i0, 0) ∧ a2 := a0 ∧ a3 := store(a2, 1 + i0, 1) ∧ a4 := ite(g1, a1, a3) ∧ p3 := store(p2, 1, select(p2, 1) + 2) ]
SLIDE 159 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
C := [ p1 := store(p0, 0, &a[0]) ∧ p2 := store(p1, 1, 0) ∧ g2 := (x2 == 0) ∧ a1 := store(a0, i0, 0) ∧ a2 := a0 ∧ a3 := store(a2, 1 + i0, 1) ∧ a4 := ite(g1, a1, a3) ∧ p3 := store(p2, 1, select(p2, 1) + 2) ]
Store object at position 0
SLIDE 160 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
C := [ p1 := store(p0, 0, &a[0]) ∧ p2 := store(p1, 1, 0) ∧ g2 := (x2 == 0) ∧ a1 := store(a0, i0, 0) ∧ a2 := a0 ∧ a3 := store(a2, 1 + i0, 1) ∧ a4 := ite(g1, a1, a3) ∧ p3 := store(p2, 1, select(p2, 1) + 2) ]
Store object at position 0 Store index at position 1
SLIDE 161 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
C := [ p1 := store(p0, 0, &a[0]) ∧ p2 := store(p1, 1, 0) ∧ g2 := (x2 == 0) ∧ a1 := store(a0, i0, 0) ∧ a2 := a0 ∧ a3 := store(a2, 1 + i0, 1) ∧ a4 := ite(g1, a1, a3) ∧ p3 := store(p2, 1, select(p2, 1) + 2) ]
Store object at position 0 Store index at position 1 Update index
SLIDE 162 Encoding of Pointers
- arrays and records / tuples typically handled directly by
SMT-solver
- pointers modelled as tuples
– p.o ≙ representation of underlying object – p.i ≙ index (if pointer used as array base)
int main() { int a[2], i, x, *p; p=a; if (x==0) a[i]=0; else a[i+1]=1; assert(*(p+2)==1); }
P := [ i0 ≥ 0 ∧ i0 < 2 ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2 ∧ select(p3, 0) == &a[0] ∧ select(select(p3, 0), select(p3, 1)) == 1 ]
negation satisfiable (a[2] unconstrained) ⇒ assert fails
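The (object, index) view of a pointer can be mimicked in C with a small record. This is only a sketch of the idea under simplifying assumptions (a single `int` array as the underlying object; all names are hypothetical), not ESBMC's internal representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer modelled as a tuple: p.o (underlying object) and p.i (index). */
struct sym_ptr {
    const int *obj;   /* p.o: the object the pointer refers to */
    size_t obj_len;   /* number of elements in that object */
    long idx;         /* p.i: index relative to the object base */
};

/* p = a: store the object at position 0 and index 0 at position 1. */
static struct sym_ptr ptr_to(const int *base, size_t len) {
    struct sym_ptr p = { base, len, 0 };
    return p;
}

/* p + k: only the index component is updated. */
static struct sym_ptr ptr_add(struct sym_ptr p, long k) {
    p.idx += k;
    return p;
}

/* A dereference is safe only if the index lies inside the object. */
static bool deref_in_bounds(struct sym_ptr p) {
    return p.idx >= 0 && (size_t)p.idx < p.obj_len;
}
```

For `int a[2]`, the pointer `p + 2` keeps the object `a` but carries index 2, which is outside the array bounds; this is why `a[2]` is unconstrained in the example.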
SLIDE 163 Encoding of Memory Allocation
- model memory just as an array of bytes (array theories)
– read and write operations to the memory array on the logic level
SLIDE 164 Encoding of Memory Allocation
- model memory just as an array of bytes (array theories)
– read and write operations to the memory array on the logic level
- each dynamic object d_o consists of
– m ≙ memory array – s ≙ size in bytes of m – ρ ≙ unique identifier – υ ≙ indicate whether the object is still alive – l ≙ the location in the execution where m is allocated
SLIDE 165 Encoding of Memory Allocation
- model memory just as an array of bytes (array theories)
– read and write operations to the memory array on the logic level
- each dynamic object d_o consists of
– m ≙ memory array – s ≙ size in bytes of m – ρ ≙ unique identifier – υ ≙ indicate whether the object is still alive – l ≙ the location in the execution where m is allocated
- to detect invalid reads/writes, we check whether
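– d_o is a dynamic object – i is within the bounds of the memory array
$$
l_{is\_dynamic\_object} \Leftrightarrow \left( \bigvee_{j=1}^{k} d_o.\rho = j \right) \wedge (0 \le i < n)
$$
(j ranges over the identifiers ρ of the allocated dynamic objects, 1 to k)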
SLIDE 166 Encoding of Memory Allocation
- to check for invalid objects, we
– set υ to true if the function malloc can allocate memory (d_o is alive) – set υ to false if the function free is called (d_o is no longer alive)
l_valid_object ⇔ (l_is_dynamic_object ⇒ d_o.υ)
SLIDE 167 Encoding of Memory Allocation
- to check for invalid objects, we
– set υ to true if the function malloc can allocate memory (d_o is alive) – set υ to false if the function free is called (d_o is no longer alive)
- to detect forgotten memory, at the end of the (unrolled)
program we check
– whether d_o has been deallocated by the function free
l_deallocated_object ⇔ (l_is_dynamic_object ⇒ ¬ d_o.υ)
l_valid_object ⇔ (l_is_dynamic_object ⇒ d_o.υ)
SLIDE 168 Example of Memory Allocation
#include <stdlib.h>
int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p = q;
  free(p);
  p = malloc(5);       // ρ = 3
  free(p);
  return 0;
}
Assume that the malloc call succeeds
SLIDE 169 Example of Memory Allocation
#include <stdlib.h>
int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p = q;
  free(p);
  p = malloc(5);       // ρ = 3
  free(p);
  return 0;
}
memory leak: the pointer reassignment p = q makes the object d_o1 an orphan (it is never freed, so d_o1.υ stays true)
SLIDE 170 Example of Memory Allocation
#include <stdlib.h>
int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p = q;
  free(p);
  p = malloc(5);       // ρ = 3
  free(p);
  return 0;
}
C := [ d_o1.ρ=1 ∧ d_o1.s=5 ∧ d_o1.υ=true ∧ p=d_o1 ∧ d_o2.ρ=2 ∧ d_o2.s=5 ∧ d_o2.υ=true ∧ q=d_o2 ∧ p=d_o2 ∧ d_o2.υ=false ∧ d_o3.ρ=3 ∧ d_o3.s=5 ∧ d_o3.υ=true ∧ p=d_o3 ∧ d_o3.υ=false ]
P := [ ¬d_o1.υ ∧ ¬d_o2.υ ∧ ¬d_o3.υ ]
SLIDE 171 Example of Memory Allocation
#include <stdlib.h>
int main(void) {
  char *p = malloc(5); // ρ = 1
  char *q = malloc(5); // ρ = 2
  p = q;
  free(p);
  p = malloc(5);       // ρ = 3
  free(p);
  return 0;
}
C := [ d_o1.ρ=1 ∧ d_o1.s=5 ∧ d_o1.υ=true ∧ p=d_o1 ∧ d_o2.ρ=2 ∧ d_o2.s=5 ∧ d_o2.υ=true ∧ q=d_o2 ∧ p=d_o2 ∧ d_o2.υ=false ∧ d_o3.ρ=3 ∧ d_o3.s=5 ∧ d_o3.υ=true ∧ p=d_o3 ∧ d_o3.υ=false ]
P := [ ¬d_o1.υ ∧ ¬d_o2.υ ∧ ¬d_o3.υ ]
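The bookkeeping behind C and P can be replayed by hand. The sketch below is a hypothetical model, not ESBMC code: each dynamic object records its size s, identifier ρ and liveness flag υ, and the example program is re-run against these records to see which object is still alive at the end.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified dynamic object: size s, identifier rho, liveness flag. */
struct dyn_obj {
    size_t s;
    int rho;
    bool alive;
};

/* Models a successful malloc: the new object is alive. */
static struct dyn_obj model_malloc(size_t size, int id) {
    return (struct dyn_obj){ size, id, true };
}

/* Models free: the object is no longer alive. */
static void model_free(struct dyn_obj *d) {
    d->alive = false;
}

/* A read or write at byte i is valid if the object is alive and i is in bounds. */
static bool access_ok(const struct dyn_obj *d, size_t i) {
    return d->alive && i < d->s;
}

/* Leak check at program end: every dynamic object must be deallocated. */
static bool all_deallocated(const struct dyn_obj *objs, size_t count) {
    for (size_t k = 0; k < count; k++)
        if (objs[k].alive)
            return false;
    return true;
}

/* Replays the example: returns true if some object is never freed. */
static bool example_leaks(void) {
    struct dyn_obj objs[3];
    struct dyn_obj *p, *q;
    objs[0] = model_malloc(5, 1); p = &objs[0];
    objs[1] = model_malloc(5, 2); q = &objs[1];
    p = q;                        /* object 1 is now unreachable */
    model_free(p);                /* frees object 2 */
    objs[2] = model_malloc(5, 3); p = &objs[2];
    model_free(p);                /* frees object 3 */
    return !all_deallocated(objs, 3);
}
```

Only objects 2 and 3 are freed, so the check at the end reports object 1 as forgotten memory, matching the violated conjunct ¬d_o1.υ in P.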
SLIDE 172 Align-guaranteed memory mode
- Alignment rules require that any pointer variable
must be aligned to at least the alignment of the pointer type
§ E.g., an integer pointer’s value must be aligned to at least 4 bytes, for 32-bit integers
SLIDE 173 Align-guaranteed memory mode
- Alignment rules require that any pointer variable
must be aligned to at least the alignment of the pointer type
§ E.g., an integer pointer’s value must be aligned to at least 4 bytes, for 32-bit integers
- Encode property assertions when dereferences
occur during symbolic execution
§ To guard against executions where an unaligned pointer is dereferenced
SLIDE 174 Align-guaranteed memory mode
- Alignment rules require that any pointer variable
must be aligned to at least the alignment of the pointer type
§ E.g., an integer pointer’s value must be aligned to at least 4 bytes, for 32-bit integers
- Encode property assertions when dereferences
occur during symbolic execution
§ To guard against executions where an unaligned pointer is dereferenced § This is not as strong as the C standard requirement, that a pointer variable may never hold an unaligned value
- But it provides a guarantee that any pointer dereference will either
be correctly aligned or result in a verification failure
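The property asserted at each dereference amounts to a divisibility test on the address. A minimal C sketch, assuming the usual flat mapping from pointers to `uintptr_t` (implementation-defined in C) and a hypothetical helper name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the address held in p is a multiple of the required alignment. */
static bool is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}
```

A verifier would conceptually place a check such as `is_aligned(p, 4)` in front of every dereference of a pointer to a 32-bit integer, so that an unaligned dereference shows up as a property violation.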
SLIDE 175 ESBMC’s memory model
- statically tracks possible pointer variable targets (objects)
– dereferencing a pointer leads to the construction of guarded references to each potential target
SLIDE 176 ESBMC’s memory model
- statically tracks possible pointer variable targets (objects)
– dereferencing a pointer leads to the construction of guarded references to each potential target
- C is very liberal about permitted dereferences
struct foo { uint16_t bar[2]; uint8_t baz; };
struct foo qux;
char *quux = (char *)&qux;
quux++;
*quux;
pointer and object types do not match
SLIDE 177 ESBMC’s memory model
- statically tracks possible pointer variable targets (objects)
– dereferencing a pointer leads to the construction of guarded references to each potential target
- C is very liberal about permitted dereferences
- SAT: immediate access to bit-level representation
struct foo { uint16_t bar[2]; uint8_t baz; };
struct foo qux;
char *quux = (char *)&qux;
quux++;
*quux;
pointer and object types do not match
SLIDE 178 ESBMC’s memory model
- statically tracks possible pointer variable targets (objects)
– dereferencing a pointer leads to the construction of guarded references to each potential target
- C is very liberal about permitted dereferences
- SMT: sorts must be repeatedly unwrapped
struct foo { uint16_t bar[2]; uint8_t baz; };
struct foo qux;
char *quux = (char *)&qux;
quux++;
*quux;
pointer and object types do not match
SLIDE 179 Byte-level data extraction in SMT
- access to underlying data bytes is complicated
– requires manipulation of arrays / tuples
SLIDE 180 Byte-level data extraction in SMT
- access to underlying data bytes is complicated
– requires manipulation of arrays / tuples
- problem is magnified by nondeterministic offsets
uint16_t *fuzz;
if (nondet_bool()) {
  fuzz = &qux.bar[0];
} else {
  fuzz = (uint16_t *)&qux.baz;
}
─ chooses accessed field nondeterministically ─ requires a byte_extract expression ─ handles the tuple that encoded the struct
SLIDE 181 Byte-level data extraction in SMT
- access to underlying data bytes is complicated
– requires manipulation of arrays / tuples
- problem is magnified by nondeterministic offsets
- supporting all legal behaviors at SMT layer difficult
– extract (unaligned) 16-bit integer from *fuzz
uint16_t *fuzz;
if (nondet_bool()) {
  fuzz = &qux.bar[0];
} else {
  fuzz = (uint16_t *)&qux.baz;
}
─ chooses accessed field nondeterministically ─ requires a byte_extract expression ─ handles the tuple that encoded the struct
SLIDE 182 Byte-level data extraction in SMT
- access to underlying data bytes is complicated
– requires manipulation of arrays / tuples
- problem is magnified by nondeterministic offsets
- supporting all legal behaviors at SMT layer difficult
– extract (unaligned) 16-bit integer from *fuzz
- experiments showed significantly increased memory
consumption
uint16_t *fuzz;
if (nondet_bool()) {
  fuzz = &qux.bar[0];
} else {
  fuzz = (uint16_t *)&qux.baz;
}
─ chooses accessed field nondeterministically ─ requires a byte_extract expression ─ handles the tuple that encoded the struct
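At the C level, a byte_extract corresponds to reading raw bytes at an offset inside the object. The sketch below illustrates this with `memcpy`; the function name is hypothetical and the caller is assumed to keep `offset + 2` within the struct. At the SMT level the same read has to be expressed over the tuple that encodes the struct, which is what makes it costly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct foo { uint16_t bar[2]; uint8_t baz; };

/* Read 16 bits starting `offset` bytes into the object, whatever field
   (or part of a field) happens to live there. */
static uint16_t byte_extract_u16(const struct foo *obj, size_t offset) {
    uint16_t out;
    memcpy(&out, (const unsigned char *)obj + offset, sizeof out);
    return out;
}
```

Offsets 0 and 2 return `bar[0]` and `bar[1]` unchanged; an odd offset would stitch together bytes from two neighbouring fields, with a result that depends on the byte order of the machine.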
SLIDE 183 “Aligned” Memory Model
- framework cannot easily be changed to SMT-level
byte representation (a la LLBMC)
SLIDE 184 “Aligned” Memory Model
- framework cannot easily be changed to SMT-level
byte representation (a la LLBMC)
- push unwrapping of SMT data structures to dereference
SLIDE 185 “Aligned” Memory Model
- framework cannot easily be changed to SMT-level
byte representation (a la LLBMC)
- push unwrapping of SMT data structures to dereference
- enforce C alignment rules
– static analysis of pointer alignment eliminates need to encode unaligned data accesses → reduces number of behaviors that must be modeled
SLIDE 186 “Aligned” Memory Model
- framework cannot easily be changed to SMT-level
byte representation (a la LLBMC)
- push unwrapping of SMT data structures to dereference
- enforce C alignment rules
– static analysis of pointer alignment eliminates need to encode unaligned data accesses → reduces number of behaviors that must be modeled – add alignment assertions (if static analysis not conclusive)
SLIDE 187 “Aligned” Memory Model
- framework cannot easily be changed to SMT-level
byte representation (a la LLBMC)
- push unwrapping of SMT data structures to dereference
- enforce C alignment rules
– static analysis of pointer alignment eliminates need to encode unaligned data accesses → reduces number of behaviors that must be modeled – add alignment assertions (if static analysis not conclusive)
– extracting 16-bit integer from *fuzz if guard is true:
– offset = 0: project bar[0] out of foo – offset = 1: “unaligned memory access” failure – offset = 2: project bar[1] out of foo – offset = 3: “unaligned memory access” failure – offset = 4: “access to object out of bounds” failure
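The case split above can be written as a small decision function. This is an illustrative sketch with hypothetical names, covering only a 16-bit read through a pointer into `bar`; it is not the code ESBMC generates.

```c
#include <stddef.h>
#include <stdint.h>

struct foo { uint16_t bar[2]; uint8_t baz; };

enum access_result {
    ACCESS_BAR0,           /* project bar[0] out of foo */
    ACCESS_BAR1,           /* project bar[1] out of foo */
    ACCESS_UNALIGNED,      /* "unaligned memory access" failure */
    ACCESS_OUT_OF_BOUNDS   /* "access to object out of bounds" failure */
};

/* Classify a 16-bit read at the given byte offset from the start of bar. */
static enum access_result classify_u16_access(size_t offset) {
    if (offset >= 2 * sizeof(uint16_t))       /* past the end of bar */
        return ACCESS_OUT_OF_BOUNDS;
    if (offset % sizeof(uint16_t) != 0)       /* not on an element boundary */
        return ACCESS_UNALIGNED;
    return offset == 0 ? ACCESS_BAR0 : ACCESS_BAR1;
}
```

Only the two aligned offsets need a data projection; every other offset becomes a verification failure, so no unaligned extraction has to be encoded.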
SLIDE 188
- Described the difference between soundness and
completeness concerning detection techniques
– False positive and false negative
- Pointed out the difference between static analysis
and testing / simulation
– hybrid combination of static and dynamic analysis techniques to achieve a good trade-off between soundness and completeness
- Explained bounded model checking of software
– it has been applied successfully to verify single-threaded software using a precise memory model
Summary