SLIDE 1 TYPE-GUIDED WORST-CASE INPUT GENERATION
Di Wang, Jan Hoffmann
Carnegie Mellon University
SLIDE 5 RESOURCE ANALYSIS
Programs
Performance: Time, Memory, Power, …
Worst-Case Analysis:
  Performance bottlenecks
  Algorithmic complexity vulnerabilities
  Timing side channels
SLIDE 10 EXAMPLE OF WORST-CASE ANALYSIS
PHP
  Potential Denial-of-Service attack [1]
  Concrete exploits (by hash collisions) [2]
  Bug fixed! [3]
Worst-case inputs are instrumental in understanding and fixing performance bugs!
[1] CVE - CVE-2011-4885. Available at: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2011-4885.
[2] PHP 5.3.8 - Hashtables Denial of Service. Available at: https://www.exploit-db.com/exploits/18296/.
[3] PHP: PHP 5 ChangeLog. Available at: http://www.php.net/ChangeLog-5.php#5.3.9.
SLIDE 13 EXISTING APPROACHES
Dynamic: fuzz testing, symbolic execution, dynamic worst-case analysis, …
  Flexible & universal
  Potentially unsound: the resulting inputs might not expose the worst-case behavior.
Static: type systems, abstract interpretation, …
  Sound upper bounds
  Potentially not tight: there is no concrete witness, so the bound might be too conservative.
SLIDE 19 CONTRIBUTIONS
A type-guided worst-case input generation algorithm
Proof of soundness and relative completeness
Heuristics to improve scalability
Resource Aware ML (RaML) --Guide--> Symbolic Execution
SLIDE 20 OVERVIEW
Motivation
Resource Aware ML (RaML)
Type-Guided Worst-Case Input Generation
Evaluation
SLIDE 28 AMORTIZED RESOURCE ANALYSIS
The potential method
D0 → D1 → D2 → D3 → … → Dn
The Di are program states; the arrows are transitions with actual costs.
Φ(D0)  Φ(D1)  Φ(D2)  Φ(D3)  …  Φ(Dn)
The potential function Φ maps program states to nonnegative numbers.
Φ(D2) ≥ Cost(D2, D3) + Φ(D3)
The initial potential Φ(D0) is an upper bound on the total cost!
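As a brief worked step (a sketch, assuming the slide's inequality holds for every transition and writing Cost(D_i, D_{i+1}) for the actual cost of the i-th step), summing the per-step inequalities along a run telescopes, which is why the initial potential bounds the total cost:

```latex
\begin{align*}
\Phi(D_i) &\ge \mathrm{Cost}(D_i, D_{i+1}) + \Phi(D_{i+1}) \qquad (0 \le i < n) \\
\Phi(D_0) &\ge \sum_{i=0}^{n-1} \mathrm{Cost}(D_i, D_{i+1}) + \Phi(D_n) \\
          &\ge \sum_{i=0}^{n-1} \mathrm{Cost}(D_i, D_{i+1})
          \qquad \text{since } \Phi(D_n) \ge 0
\end{align*}
```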
SLIDE 31 TYPE-BASED ANALYSIS
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
The potential at a program point is defined by a static annotation of data structures.
A list of length n annotated with a nonnegative number q has q·n units of potential.
Each of [], ::, (,) consumes 2 memory cells.
Cost = 2 ⋅ |ℓ| + 2
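To make the cost model concrete, here is a minimal sketch that instruments lpairs with a cell counter. The names `cells`, `alloc`, and `cost` are hypothetical helpers introduced for illustration, not part of RaML; the only assumption taken from the slide is that each [], ::, and (,) is charged 2 memory cells.

```ocaml
(* Hypothetical instrumentation: [cells] counts memory cells, charging 2 for
   each [], ::, and (,) that lpairs constructs, as stated on the slide. *)
let cells = ref 0
let alloc v = cells := !cells + 2; v

let rec lpairs (l : int list) : (int * int) list =
  match l with
  | [] -> alloc []
  | x1 :: xs ->
    (match xs with
     | [] -> alloc []
     | x2 :: xs' ->
       if x1 < x2 then begin
         let p = alloc (x1, x2) in      (* the pair: 2 cells *)
         let rest = lpairs xs' in
         alloc (p :: rest)              (* the cons: 2 cells *)
       end else lpairs xs')

(* Run lpairs on [l] and return the number of cells charged. *)
let cost (l : int list) : int =
  cells := 0;
  ignore (lpairs l);
  !cells

let () =
  (* A list whose pairs are all increasing reaches the bound 2 * 4 + 2. *)
  Printf.printf "%d %d\n" (cost [0; 1; 0; 1]) (cost [1; 0; 1; 0])
```

Under this model an even-length list whose adjacent pairs are all increasing meets the bound 2 ⋅ |ℓ| + 2, while a list with no increasing pair only pays for the final [].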
SLIDE 34 TYPE-BASED ANALYSIS
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
The potential at a program point is defined by a static annotation of data structures.
A list of length n annotated with a nonnegative number q has q·n units of potential.
L²(int) --2/0--> L⁰(int × int)
Each of [], ::, (,) consumes 2 memory cells.
Φ0 = 2 ⋅ |ℓ| + 2
Cost = 2
SLIDE 35 TYPE-BASED ANALYSIS
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
The potential at a program point is defined by a static annotation of data structures.
A list of length n annotated with a nonnegative number q has q·n units of potential.
L²(int) --2/0--> L⁰(int × int)
Each of [], ::, (,) consumes 2 memory cells.
Φ1 = 2 ⋅ |xs| + 4
SLIDE 38 TYPE-BASED ANALYSIS
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
The potential at a program point is defined by a static annotation of data structures.
A list of length n annotated with a nonnegative number q has q·n units of potential.
L²(int) --2/0--> L⁰(int × int)
Each of [], ::, (,) consumes 2 memory cells.
Φ2 = 2 ⋅ |xs′| + 6
Cost = 4
Φ3 = 2 ⋅ |xs′| + 2
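The potentials on these slides can be chained into one short calculation. This is a worked sketch using only the slide's annotations (2 units per list element plus a constant of 2); the justifications in the right-hand column are added here for illustration:

```latex
\begin{align*}
\Phi_0 &= 2\,|\ell| + 2 \\
\Phi_1 &= 2\,|xs| + 4  && \text{since } |\ell| = |xs| + 1 \\
\Phi_2 &= 2\,|xs'| + 6 && \text{since } |xs| = |xs'| + 1 \\
\Phi_3 &= \Phi_2 - \mathrm{Cost} = 2\,|xs'| + 6 - 4 = 2\,|xs'| + 2
\end{align*}
```

Φ3 has the same form as Φ0 with xs′ in place of ℓ, which is exactly the potential the recursive call lpairs xs′ requires.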
SLIDE 47 SYMBOLIC EXECUTION
Idea: search all execution paths, record path constraints, and compute resource usage
γ ⊢ e ⇒ ⟨ψ, S⟩
  γ: symbolic environment
  e: expression
  ψ: path constraints
  S: symbolic evaluation result
Symbolic execution rules for conditional expressions
Then:
  γ ⊢ e1 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨γ(e) ∧ ψ, S⟩
Else:
  γ ⊢ e2 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨¬γ(e) ∧ ψ, S⟩
SLIDE 50 SYMBOLIC EXECUTION
An example of worst-case execution paths for input lists
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
ℓ ↦ [int₁, int₂, int₃, int₄] ⊢ lpairs ℓ ⇒ ⟨(int₁ < int₂) ∧ (int₃ < int₄), [(int₁, int₂), (int₃, int₄)]⟩
Invoke an SMT solver to find a model, e.g., [0,1,0,1]
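The exhaustive search can be sketched for lpairs specifically. This is a minimal illustration, not the authors' implementation: symbolic integers are identified by their position in the input list, the types `constr` and `path` and all function names are hypothetical, and a brute-force search over {0, 1} stands in for the SMT solver.

```ocaml
(* Path constraints over symbolic integers x_i, named by list position. *)
type constr = Lt of int * int | Ge of int * int   (* x_i < x_j, x_i >= x_j *)

(* One explored path: constraints, symbolic result, and memory cost. *)
type path = { psi : constr list; result : (int * int) list; cost : int }

(* Follow BOTH branches of every conditional; each [], ::, (,) costs 2 cells. *)
let rec sym_lpairs (l : int list) : path list =
  match l with
  | [] | [_] -> [ { psi = []; result = []; cost = 2 } ]
  | x1 :: x2 :: xs' ->
    let rest = sym_lpairs xs' in
    let then_paths =
      List.map (fun p -> { psi = Lt (x1, x2) :: p.psi;
                           result = (x1, x2) :: p.result;
                           cost = p.cost + 4 }) rest in
    let else_paths =
      List.map (fun p -> { p with psi = Ge (x1, x2) :: p.psi }) rest in
    then_paths @ else_paths

(* The path with the largest cost (the first one on ties). *)
let worst (ps : path list) : path =
  List.fold_left (fun best p -> if p.cost > best.cost then p else best)
    (List.hd ps) ps

let holds (m : int array) = function
  | Lt (i, j) -> m.(i) < m.(j)
  | Ge (i, j) -> m.(i) >= m.(j)

(* Stand-in for the SMT solver: try every list of length n over {0, 1}. *)
let rec candidates n =
  if n = 0 then [ [] ]
  else List.concat_map (fun t -> [ 0 :: t; 1 :: t ]) (candidates (n - 1))

let find_model (n : int) (psi : constr list) : int list option =
  List.find_opt
    (fun m -> let a = Array.of_list m in List.for_all (holds a) psi)
    (candidates n)

let () =
  let ps = sym_lpairs [0; 1; 2; 3] in
  Printf.printf "paths explored: %d, worst cost: %d\n"
    (List.length ps) (worst ps).cost
```

For four symbolic elements this explores four paths; the number of paths doubles with every additional pair of elements, which is the state explosion the type-guided variant is meant to avoid.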
SLIDE 52 TYPE-GUIDED SYMBOLIC EXECUTION
Nondeterminism leads to state explosion
Then:
  γ ⊢ e1 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨γ(e) ∧ ψ, S⟩
Else:
  γ ⊢ e2 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨¬γ(e) ∧ ψ, S⟩
Use the information about potentials obtained from resource-aware type checking to prune the search space of symbolic execution.
SLIDE 62 TYPE-GUIDED SYMBOLIC EXECUTION
let rec lpairs l =
  match l with
  | [] -> []
  | x1 :: xs ->
    match xs with
    | [] -> []
    | x2 :: xs' ->
      if (x1:int) < (x2:int) then (x1, x2) :: lpairs xs'
      else lpairs xs'
L²(int) --2/0--> L⁰(int × int)
ℓ ↦ [int₁, int₂, int₃, int₄]
x1 ↦ int₁, x2 ↦ int₂, xs′ ↦ [int₃, int₄]
Φ2 = 2 ⋅ |xs′| + 6 = 10
Cost = 4
Φ3 = 2 ⋅ |xs′| + 2 = 6
Waste! (The else branch allocates nothing, so Φ2 − Φ3 = 4 units of potential go unused there.)
If an execution path does not have potential waste, it must expose the worst-case resource usage.
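The pruning idea can be sketched for lpairs as follows. This is an illustrative simplification, not the algorithm from the talk: the potentials Φ2 and Φ3 are hard-coded from the slide's annotations rather than read from RaML's type derivation, and the names `phi2`, `phi3`, `waste`, and `guided_lpairs` are hypothetical.

```ocaml
type constr = Lt of int * int | Ge of int * int   (* x_i < x_j, x_i >= x_j *)
type path = { psi : constr list; result : (int * int) list; cost : int }

(* Potentials copied from the slide (assumed, not derived by a type checker). *)
let phi2 xs' = 2 * List.length xs' + 6   (* available at the conditional *)
let phi3 xs' = 2 * List.length xs' + 2   (* required by the recursive call *)

(* Potential left unused by a branch that allocates [branch_cost] cells. *)
let waste xs' branch_cost = phi2 xs' - branch_cost - phi3 xs'

(* Symbolic execution that only follows branches without potential waste. *)
let rec guided_lpairs (l : int list) : path list =
  match l with
  | [] -> [ { psi = []; result = []; cost = 2 } ]  (* potential 2, cost 2 *)
  | [_] -> []          (* potential 4, cost 2: waste, so the path is dropped *)
  | x1 :: x2 :: xs' ->
    let rest = guided_lpairs xs' in
    let then_paths =
      if waste xs' 4 = 0 then          (* pair + cons = 4 cells *)
        List.map (fun p -> { psi = Lt (x1, x2) :: p.psi;
                             result = (x1, x2) :: p.result;
                             cost = p.cost + 4 }) rest
      else [] in
    let else_paths =
      if waste xs' 0 = 0 then          (* no allocation on this branch *)
        List.map (fun p -> { p with psi = Ge (x1, x2) :: p.psi }) rest
      else [] in
    then_paths @ else_paths

let () =
  Printf.printf "paths kept: %d\n" (List.length (guided_lpairs [0; 1; 2; 3]))
```

With four symbolic elements only the all-then path survives, and its cost equals the bound 2 ⋅ 4 + 2 = 10. For odd-length inputs every path wastes potential under this annotation, so the sketch returns no path, in line with the approach applying only when the inferred bound is tight.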
SLIDE 65 SOUNDNESS & COMPLETENESS
Soundness: If the algorithm generates an input, then that input causes the program to consume exactly the amount of resources given by the upper bound that RaML infers.
Relative completeness: If there is an input of some given shape that causes the program to consume exactly the amount of resources given by the upper bound that RaML infers, then the algorithm is able to find a corresponding execution path.
SLIDE 70 SPEED UP INPUT GENERATION
How about eliminating some generation rules?
Then:
  γ ⊢ e1 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨γ(e) ∧ ψ, S⟩
Else:
  γ ⊢ e2 ⇒ ⟨ψ, S⟩
  ------------------------------------------
  γ ⊢ if e then e1 else e2 ⇒ ⟨¬γ(e) ∧ ψ, S⟩
Still Sound!
Generalization: enforce that all calls with the same input shape execute the same path in the function body.
SLIDE 72 IMPLEMENTATION
We implemented the generation algorithm for a purely functional fragment of Resource Aware ML (RaML), including higher-order functions, user-defined data structures, and polynomial resource bounds. We used the off-the-shelf SMT solver Z3.
SLIDE 73 BENCHMARKS (SELECTED)
Description                     Shape                            ALG       ALG+H1    ALG+H2
Insertion sort                  200 integers                     7.74s     6.97s     94.81s
Quicksort                       200 integers                     T/O       53.23s    157.21s
Lexicographic quicksort         Lists of length 100, 99, …, 1    439.35s   438.79s   T/O
Functional queue                200 operations                   444.64s   T/O       T/O
Zigzag on a tree                200 internal nodes               T/O       T/O       4.87s
Hash table for 8-char strings   64 insertions                    7.64s     7.62s     181.74s
(T/O = timed out)
SLIDE 78 EXAMPLE: HASH TABLE
Customized resource metric: count hash collisions
Use a hash function from a vulnerable PHP implementation
The program inserts 64 strings into an empty hash table
Our algorithm “realizes” that it should find 64 strings with the same hash key in order to trigger the most collisions
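To illustrate what such an input looks like, here is a hand-written sketch; it is not output of the algorithm. The slide does not name the hash function, so this assumes the times-33 string hash (DJBX33A, initial value 5381) that PHP 5 is generally described as using; the function and value names are hypothetical. The two-character blocks "Ez", "FY", and "G8" all give 33·c1 + c2 = 2399, so any two concatenations of four such blocks collide.

```ocaml
(* Assumption: times-33 string hash (DJBX33A) with initial value 5381,
   truncated to 32 bits. Requires a 64-bit OCaml for the mask literal. *)
let djbx33a (s : string) : int =
  let h = ref 5381 in
  String.iter (fun c -> h := (!h * 33 + Char.code c) land 0xFFFFFFFF) s;
  !h

(* Blocks with equal 33 * c1 + c2; swapping one for another keeps the hash. *)
let blocks = [ "Ez"; "FY"; "G8" ]

(* All strings made of [k] blocks (3^k strings of length 2k). *)
let rec strings (k : int) : string list =
  if k = 0 then [ "" ]
  else
    let tails = strings (k - 1) in
    List.concat_map (fun b -> List.map (fun s -> b ^ s) tails) blocks

(* First [n] elements of a list. *)
let rec take n = function
  | x :: xs when n > 0 -> x :: take (n - 1) xs
  | _ -> []

(* 64 distinct 8-character keys that all share one hash value. *)
let keys = take 64 (strings 4)

let () =
  List.iter (fun k -> Printf.printf "%s -> %d\n" k (djbx33a k)) (take 3 keys)
```

Inserting these 64 keys into an empty chained hash table sends every key to the same bucket, which is the shape of input the slide describes the algorithm as discovering on its own.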
SLIDE 83 SUMMARY
TYPE-GUIDED SYMBOLIC EXECUTION FOR WORST-CASE INPUT GENERATION
Theoretical Results:
  Formally developed algorithm
  Soundness & relative completeness
Experimental Results:
  Integrated with RaML
  Effective on 22 benchmark programs
Limitations:
  Purely functional programs
  Only works for tight bounds
  Depends on RaML
Future work:
  Support side effects
  Interact with resource analysis
  General theory for worst-case analysis