Cracking Sendmail crackaddr: Still a challenge for automated program analysis?



slide-1
SLIDE 1

Hackito Ergo Sum

October 29th, 2015

Cracking Sendmail crackaddr

Still a challenge for automated program analysis?

Name Lastname < name@mail.org > ()()()()()()()()() . . . ()()()

Bogdan Mihaila

Technical University of Munich, Germany

1 / 35

slide-4
SLIDE 4

Sendmail crackaddr Bug

Discovered 2003 by Mark Dowd

  • Buffer overflow in an email address parsing function of Sendmail.
  • Consists of a parsing loop using a state machine.
  • ∼500 LOC

Bounty for Static Analyzers since 2011 by Halvar Flake

  • Halvar extracted a smaller version of the bug as an example of a hard problem for static analyzers.
  • ∼50 LOC

Since then . . .

  • Various talks at security conferences and a paper presenting a static analysis of the example.
  • The solutions, however, required manual specification of the loop invariant.

2 / 35

slide-5
SLIDE 5

Backstory

Halvar likes to challenge people!

Halvar gave us the challenge some years ago: “The tool should automatically (i.e. without hints provided by the user) show that the vulnerable version has a bug and the fixed version is safe.”

We were sure our analyzer could not yet handle it, so we did not look into it. Last year we gave it a try and it suddenly worked :).

3 / 35

slide-6
SLIDE 6

Sendmail Bug (simplified)

Let’s see the bug details ...

4 / 35

slide-7
SLIDE 7

Sendmail Bug Code

    #define BUFFERSIZE 200
    #define TRUE 1
    #define FALSE 0

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[BUFFERSIZE];
        unsigned int upperlimit = BUFFERSIZE - 10;
        unsigned int quotation = FALSE, roundquote = FALSE;
        unsigned int inputIndex = 0, outputIndex = 0;

        while (inputIndex < length) {
            c = input[inputIndex++];
            if ((c == '<') && (!quotation)) {
                quotation = TRUE; upperlimit--;
            }
            if ((c == '>') && (quotation)) {
                quotation = FALSE; upperlimit++;
            }
            if ((c == '(') && (!quotation) && !roundquote) {
                roundquote = TRUE; upperlimit--; // decrementation was missing in bug
            }
            if ((c == ')') && (!quotation) && roundquote) {
                roundquote = FALSE; upperlimit++;
            }
            // If there is sufficient space in the buffer, write the character.
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (roundquote) {
            localbuf[outputIndex] = ')'; outputIndex++; }
        if (quotation) {
            localbuf[outputIndex] = '>'; outputIndex++; }
    }

5 / 35

slide-8
SLIDE 8

State Machine of Parser

We need to verify that outputIndex < upperlimit < BUFFERSIZE always holds in the good version.

Good:

    '<'  when !q       ulimit--
    '>'  when q        ulimit++
    '('  when !q, !r   ulimit--
    ')'  when !q, r    ulimit++

Bad:

    '<'  when !q       ulimit--
    '>'  when q        ulimit++
    '('  when !q, !r   (no change to ulimit)
    ')'  when !q, r    ulimit++

In the bad version upperlimit can be steadily incremented and a write outside of the stack-allocated buffer can be triggered.
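The drift of upperlimit in the bad version can be observed without ever touching a real buffer. The following is a minimal sketch, not part of the original slides: the helpers `max_upperlimit` and `fill_parens` are hypothetical names, and the `fixed` switch is an assumption used to select between the two state machines.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFERSIZE 200

/* Hypothetical helper: replays the parser's state machine without writing
 * to any buffer and returns the largest value upperlimit reaches.
 * fixed = 1 selects the good version, fixed = 0 the bad version. */
static unsigned int max_upperlimit(const char *input, size_t length, int fixed) {
    unsigned int upperlimit = BUFFERSIZE - 10;
    unsigned int max_seen = upperlimit;
    int quotation = 0, roundquote = 0;

    for (size_t i = 0; i < length; i++) {
        char c = input[i];
        if (c == '<' && !quotation) { quotation = 1; upperlimit--; }
        if (c == '>' && quotation)  { quotation = 0; upperlimit++; }
        if (c == '(' && !quotation && !roundquote) {
            roundquote = 1;
            if (fixed) upperlimit--;   /* the decrement the bad version omits */
        }
        if (c == ')' && !quotation && roundquote) { roundquote = 0; upperlimit++; }
        if (upperlimit > max_seen) max_seen = upperlimit;
    }
    return max_seen;
}

/* Hypothetical helper: writes `pairs` copies of "()" into out,
 * which must hold at least 2 * pairs bytes. */
static void fill_parens(char *out, size_t pairs) {
    for (size_t i = 0; i < pairs; i++) {
        out[2 * i] = '(';
        out[2 * i + 1] = ')';
    }
}
```

With 20 pairs of parentheses the good version never lets upperlimit exceed 190, while the bad version gains one per pair and ends above BUFFERSIZE, which is what allows the later write to leave the buffer.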

6 / 35

slide-13
SLIDE 13

Sendmail Bug Analysis

Why are these 50 LOC hard to analyze?

  • each iteration reads/writes one character
    ⇝ 201 loop iterations to trigger the bug
  • paths through the loop dependent on the input: ( ) < > combined with the last if-condition
    ⇝ 10 different paths

  • a naïve state space exploration in worst case would need to visit around 2 · 5^201 ≈ 2^468 paths to find the bug!

  • to naïvely prove the absence of the bug we would need to test all the possible input strings, e.g. with lengths from 0 to 65535 = UINT_MAX
    ⇝ around 10^65535 ≈ 2^217702 paths that need to be tested!
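Path counts like these can be converted to powers of two with b^n = 2^(n · log2 b). The derivation below is a sketch using rounded logarithms. Reading the per-iteration count as 5 character classes times 2 outcomes of the last if-condition is an assumption about how the figures were formed.

```latex
\begin{align*}
\log_2\!\left(2 \cdot 5^{201}\right) &= 1 + 201 \log_2 5 \approx 1 + 201 \cdot 2.321928 \approx 467.7 \\
\log_2\!\left(10^{200}\right) &= 200 \log_2 10 \approx 200 \cdot 3.321928 \approx 664.4 \\
\log_2\!\left(10^{65535}\right) &= 65535 \log_2 10 \approx 65535 \cdot 3.321928 \approx 217702.5
\end{align*}
```

An exponent of 664 corresponds to counting all 10 paths per iteration over 200 iterations. Whichever way the paths are counted, the totals are far beyond what exhaustive enumeration can cover.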

7 / 35

slide-18
SLIDE 18

Sendmail Bug Analysis

On the other hand . . . !

  • finding the bug requires just finding 1 of the faulty paths!
  • smarter tools combine many paths together and reason about all of them at once (abstraction)!

But unfortunately

  • abstraction might introduce imprecision and false positives
    ⇝ the non-vulnerable version is flagged as vulnerable, too, by an imprecise analyzer

8 / 35

slide-19
SLIDE 19

Abstraction Techniques

Let’s introduce one abstraction technique in more detail ...

9 / 35

slide-20
SLIDE 20

Abstract Interpretation Primer

Static program analysis using abstract interpretation

  • use abstract domains to over-approximate concrete states
  • abstract transformers simulate the concrete program semantics on the abstract state
  • perform a fixpoint computation to infer invariants for each program point
  • merge over all paths over-approximates all possible program executions (soundness)
  • precision depends on the abstraction (completeness)
  • for termination widening is necessary (introduces imprecision)

10 / 35

slide-21
SLIDE 21

Abstraction Examples

Some examples of concrete values and their abstractions ...

11 / 35

slide-22
SLIDE 22

Sets of Concrete Values and their Abstractions

[Figure: four x/y plots, axes from 0 to 10, one per abstraction.]

Concrete Points (±x = c)
    Constraints: x = 2 ∧ y = 6 ∨ x = 3 ∧ y = 5 ∨ x = 3 ∧ y = 7 ∨ x = 3 ∧ y = 8 ∨ . . .

Intervals (±x ≤ c)
    Constraints: 2 ≤ x ∧ x ≤ 8 ∧ 2 ≤ y ∧ y ≤ 8

Interval Sets (⋁ᵢ (lᵢ ≤ x ∧ x ≤ uᵢ))
    Constraints: 2 ≤ x ∧ x ≤ 5 ∨ 7 ≤ x ∧ x ≤ 8 ∨ 2 ≤ y ∧ y ≤ 3 ∨ 5 ≤ y ∧ y ≤ 8

Polyhedra (Σᵢ aᵢxᵢ ≤ c)
    Constraints: 2x − y ≤ −2 ∧ −2x − y ≤ −10 ∧ 2x + y ≤ −21 ∧ 3x + 4y ≤ 4 ∧ x + 4y ≤ 35

12 / 35

slide-23
SLIDE 23

Sets of Concrete Values and their Abstractions

[Figure: three x/y plots, axes from 0 to 10, one per abstraction.]

Concrete Points (±x = c)
    Constraints: x = 2 ∧ y = 1 ∨ x = 8 ∧ y = 5

Affine Equalities (Σᵢ aᵢxᵢ = c)
    Constraints: 2x − 3y = 3

Congruences (x ≡ b (mod a))
    Constraints: x ≡ 2 (mod 3) ∧ y ≡ 1 (mod 2)

13 / 35

slide-24
SLIDE 24

Operations on Abstractions

Some examples of operations on abstractions ...

14 / 35

slide-25
SLIDE 25

Some Operations on Intervals

Arithmetics:

    [0, 100] + [1, 2] = [1, 102]
    [0, 100] − [1, 2] = [−2, 99]

Tests or Assumptions, Meet ⊓

    x ∈ [−∞, +∞] ⊓ (x < 6)  gives  x ∈ [−∞, 5]
    x ∈ [−∞, +∞] ⊓ (6 ≤ x)  gives  x ∈ [6, +∞]

Merge of paths, Join ⊔

    x ∈ [0, 15] ⊔ x ∈ [30, 100]  gives  x ∈ [0, 100]
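The interval operations above can be written down in a few lines. This is a minimal sketch, not Bindead's implementation: the `Interval` type and the `iv_*` names are hypothetical, LONG_MIN and LONG_MAX stand in for −∞ and +∞, and finite bounds are assumed small enough that the additions do not overflow.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical interval type: LONG_MIN / LONG_MAX encode -inf / +inf. */
typedef struct { long lo, hi; } Interval;

/* [a.lo + b.lo, a.hi + b.hi], keeping infinite bounds infinite. */
static Interval iv_add(Interval a, Interval b) {
    Interval r;
    r.lo = (a.lo == LONG_MIN || b.lo == LONG_MIN) ? LONG_MIN : a.lo + b.lo;
    r.hi = (a.hi == LONG_MAX || b.hi == LONG_MAX) ? LONG_MAX : a.hi + b.hi;
    return r;
}

/* [a.lo - b.hi, a.hi - b.lo], keeping infinite bounds infinite. */
static Interval iv_sub(Interval a, Interval b) {
    Interval r;
    r.lo = (a.lo == LONG_MIN || b.hi == LONG_MAX) ? LONG_MIN : a.lo - b.hi;
    r.hi = (a.hi == LONG_MAX || b.lo == LONG_MIN) ? LONG_MAX : a.hi - b.lo;
    return r;
}

/* Meet: intersection, used for tests and assumptions.
 * The result is empty when lo > hi. */
static Interval iv_meet(Interval a, Interval b) {
    Interval r;
    r.lo = a.lo > b.lo ? a.lo : b.lo;
    r.hi = a.hi < b.hi ? a.hi : b.hi;
    return r;
}

/* Join: smallest interval containing both, used when paths merge. */
static Interval iv_join(Interval a, Interval b) {
    Interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}
```

The join of [0, 15] and [30, 100] also covers the values 16 to 29 that neither input contained. This is where an interval analysis loses precision.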

15 / 35

slide-26
SLIDE 26

Operations on Abstractions

Widening and Narrowing

To analyze loops in fewer steps than the real iteration count ... and especially to always analyze loops in finitely many steps.

Termination of Analysis!

16 / 35

slide-27
SLIDE 27

Widening and Narrowing on Intervals

    int x = 1;
    int y = 1;
    // shown x, y values
    // are at loop head
    while (x <= 6) {
        x = x + 1;
        y = y + 2;
    }

[Figure: four x/y plots, axes from 0 to 10, showing the inferred state at the loop head. Panels: 1st Iteration; 2nd Iteration: ⊔ join; 3rd Iteration: ∇ widening; 4th Iteration: ∆ narrowing (x ≤ 7).]

17 / 35
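The four steps on this slide can be replayed for the variable x alone. The sketch below uses the textbook interval widening and narrowing operators and is not Bindead's code. The names `Interval`, `iv_widen`, `iv_narrow`, `loop_step` and `analyze_x_at_loop_head` are hypothetical, and y is left out to keep the example short.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical interval type: LONG_MIN / LONG_MAX encode -inf / +inf. */
typedef struct { long lo, hi; } Interval;

static Interval iv_join(Interval a, Interval b) {
    Interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* Widening: any bound that grew since the last visit jumps to infinity. */
static Interval iv_widen(Interval old, Interval next) {
    Interval r = old;
    if (next.lo < old.lo) r.lo = LONG_MIN;
    if (next.hi > old.hi) r.hi = LONG_MAX;
    return r;
}

/* Narrowing: only bounds that are infinite get refined. */
static Interval iv_narrow(Interval old, Interval next) {
    Interval r = old;
    if (old.lo == LONG_MIN) r.lo = next.lo;
    if (old.hi == LONG_MAX) r.hi = next.hi;
    return r;
}

/* One abstract pass through `while (x <= 6) { x = x + 1; }`, joined with
 * the entry state x = 1. Assumes head.lo is finite and at most 6. */
static Interval loop_step(Interval head) {
    Interval entry = {1, 1};
    Interval body = head;
    if (body.hi > 6) body.hi = 6;   /* loop condition x <= 6 */
    body.lo += 1;                   /* x = x + 1 */
    body.hi += 1;
    return iv_join(entry, body);
}

/* Replays the slide: plain entry state, join, widening, narrowing. */
static Interval analyze_x_at_loop_head(void) {
    Interval h = {1, 1};                  /* 1st iteration: x = 1 */
    h = loop_step(h);                     /* 2nd iteration: join, [1, 2] */
    h = iv_widen(h, loop_step(h));        /* 3rd iteration: [1, +inf] */
    h = iv_narrow(h, loop_step(h));       /* 4th iteration: [1, 7] */
    return h;
}
```

Widening gives up the upper bound after seeing it grow once. The narrowing pass then recovers x ≤ 7 from the loop condition, matching the last panel.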

slide-28
SLIDE 28

Abstract Interpretation

Good introduction and overview material:

  • A gentle Introduction to Formal Verification of Computer Systems by Abstract Interpretation, P. Cousot and R. Cousot, 2010
  • Abstract Interpretation Based Formal Methods and Future Challenges, P. Cousot, 2001
  • Abstract Interpretation: Past, Present and Future, P. Cousot and R. Cousot, 2014

18 / 35

slide-29
SLIDE 29

Static Binary Analyzer

Now to our Analyzer “Bindead” ...

19 / 35

slide-30
SLIDE 30

Analyzer Features

Analysis of binaries using abstract interpretation

  • analyze machine code from disassembled executables
  • translate machine code to intermediate language (RREIL)
  • abstract transformers for instruction semantics of RREIL
  • perform a reachability analysis to infer jump targets and
  • use abstract domains to infer memory bounds and flag out-of-bounds accesses

Project page: https://bitbucket.org/mihaila/bindead

20 / 35

slide-31
SLIDE 31

Analyzer Overview

[Figure: analyzer architecture. The binary is decoded at a given address into L(rreil) for the fixpoint engine, which keeps the state and CFG storage. The engine queries a stack of domains: the memory domain via L(memory) (segments, heap, stack, . . ., fields), the finite domain via L(finite) (undef, points to, flags, . . ., wrapping) and the numeric domain via L(zeno) (predicates, affine, congruence, octagons, . . ., interval).]

  • disassembler frontend produces RREIL for the analysis
  • RREIL gets transformed to simpler languages for the abstract domains
  • fixpoint and disassembly process are intertwined
  • modular construction using co-fibered abstract domains
  • domains stack is a partially reduced product of domains
  • for interprocedural analysis we use either call-string or a summarization approach

21 / 35

slide-32
SLIDE 32

Sendmail Problem for Abstract Interpretation

... and what is needed to solve the Sendmail Example

22 / 35

slide-33
SLIDE 33

Sendmail Code Revisited (Problems and Ideas)

    #define BUFFERSIZE 200
    #define TRUE 1
    #define FALSE 0

    /* prove memory correctness for all possible concrete inputs! */
    int copy_it(char *input, unsigned int length) {
        /* *input[i] ∈ [−∞, +∞], length ∈ [0, +∞] */
        char c, localbuf[BUFFERSIZE];
        unsigned int upperlimit = BUFFERSIZE - 10;
        unsigned int quotation = FALSE, roundquote = FALSE;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if ((c == '<') && (!quotation)) { quotation = TRUE; upperlimit--; }
            if ((c == '>') && (quotation)) { quotation = FALSE; upperlimit++; }
            if ((c == '(') && (!quotation) && !roundquote) {
                roundquote = TRUE; upperlimit--; // decrementation was missing in bug
            }
            if ((c == ')') && (!quotation) && roundquote) { roundquote = FALSE; upperlimit++; }
            // If there is sufficient space in the buffer, write the character.
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;   /* prove that outputIndex < BUFFERSIZE holds */
                outputIndex++;
            }
        }
        if (roundquote) {
            /* prove that invariant outputIndex < BUFFERSIZE holds */
            localbuf[outputIndex] = ')'; outputIndex++;
        }
        if (quotation) {
            /* prove that invariant outputIndex < BUFFERSIZE holds */
            localbuf[outputIndex] = '>'; outputIndex++;
        }
    }

23 / 35

slide-36
SLIDE 36

Sendmail Code Revisited (Problems and Ideas)

    #define BUFFERSIZE 200
    #define TRUE 1
    #define FALSE 0

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[BUFFERSIZE];
        unsigned int upperlimit = BUFFERSIZE - 10;
        /* upperlimit ∈ [190, 190] */
        unsigned int quotation = FALSE, roundquote = FALSE;
        unsigned int inputIndex = 0, outputIndex = 0;
        /* inputIndex ∈ [0, 0], outputIndex ∈ [0, 0] */
        while (inputIndex < length) {
            /* widening ∇: inputIndex ∈ [0, +∞], outputIndex ∈ [0, +∞] */
            c = input[inputIndex++];
            /* inputIndex ∈ [1, 1] */
            if ((c == '<') && (!quotation)) { quotation = TRUE; upperlimit--; }
            if ((c == '>') && (quotation)) { quotation = FALSE; upperlimit++; }
            if ((c == '(') && (!quotation) && !roundquote) {
                roundquote = TRUE; upperlimit--; // decrementation was missing in bug
            }
            if ((c == ')') && (!quotation) && roundquote) { roundquote = FALSE; upperlimit++; }
            // If there is sufficient space in the buffer, write the character.
            if (outputIndex < upperlimit) {
                /* use threshold outputIndex < upperlimit for widening! */
                localbuf[outputIndex] = c;   /* prove that outputIndex < BUFFERSIZE holds */
                outputIndex++;
                /* outputIndex ∈ [1, 1] */
            }
            /* ⊔ : outputIndex ∈ [0, 1] */
        }
        if (roundquote) {
            /* prove that invariant outputIndex < BUFFERSIZE holds */
            localbuf[outputIndex] = ')'; outputIndex++;
        }
        if (quotation) {
            /* prove that invariant outputIndex < BUFFERSIZE holds */
            localbuf[outputIndex] = '>'; outputIndex++;
        }
    }

23 / 35
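The idea of widening with a threshold can be sketched on plain intervals. This is a simplified illustration rather than the analyzer's domain. The threshold on the slide is the relational test outputIndex < upperlimit, whereas the sketch assumes a constant threshold of 190 (the initial upperlimit). `Interval` and `widen_with_thresholds` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical interval type: LONG_MIN / LONG_MAX encode -inf / +inf. */
typedef struct { long lo, hi; } Interval;

/* Widening with thresholds: an unstable upper bound jumps to the smallest
 * threshold that still covers the new value, and only to +inf if no
 * threshold does. Lower bounds use plain widening in this sketch. */
static Interval widen_with_thresholds(Interval old, Interval next,
                                      const long *thresholds, size_t n) {
    Interval r = old;
    if (next.lo < old.lo) r.lo = LONG_MIN;
    if (next.hi > old.hi) {
        r.hi = LONG_MAX;
        for (size_t i = 0; i < n; i++) {
            if (thresholds[i] >= next.hi && thresholds[i] < r.hi)
                r.hi = thresholds[i];
        }
    }
    return r;
}
```

Applied to outputIndex growing from [0, 0] to [0, 1] with the threshold 190, the result is [0, 190] instead of [0, +∞]. That bound is what is needed to show the write stays below BUFFERSIZE.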

slide-39
SLIDE 39

Sendmail Code Revisited (Problems and Ideas)

    #define BUFFERSIZE 200
    #define TRUE 1
    #define FALSE 0

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[BUFFERSIZE];
        unsigned int upperlimit = BUFFERSIZE - 10;
        /* upperlimit ∈ [190, 190] */
        unsigned int quotation = FALSE, roundquote = FALSE;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            /* widening ∇ removes bounds: upperlimit ∈ [−∞, +∞] */
            c = input[inputIndex++];
            /* use relation with flag variables quotation and roundquote
               to keep upperlimit bounded! */
            if ((c == '<') && (!quotation)) { quotation = TRUE; upperlimit--; }
            /* ⊔ : upperlimit ∈ [189, 190] */
            if ((c == '>') && (quotation)) { quotation = FALSE; upperlimit++; }
            /* ⊔ : upperlimit ∈ [189, 191] */
            if ((c == '(') && (!quotation) && !roundquote) {
                roundquote = TRUE; upperlimit--; // decrementation was missing in bug
            }
            /* ⊔ : upperlimit ∈ [188, 191] */
            if ((c == ')') && (!quotation) && roundquote) { roundquote = FALSE; upperlimit++; }
            /* ⊔ : upperlimit ∈ [188, 192] */
            // If there is sufficient space in the buffer, write the character.
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (roundquote) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (quotation) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

23 / 35

slide-42
SLIDE 42

Sendmail Code Revisited (Problems and Ideas)

    #define BUFFERSIZE 200
    #define TRUE 1
    #define FALSE 0

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[BUFFERSIZE];
        unsigned int upperlimit = BUFFERSIZE - 10;
        unsigned int quotation = FALSE, roundquote = FALSE;
        /* quotation ∈ [0, 0], roundquote ∈ [0, 0] */
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            /* ∇ removes bounds: quotation ∈ [0, +∞], roundquote ∈ [0, +∞] */
            c = input[inputIndex++];
            /* delay widening until flags and relations stable! */
            if ((c == '<') && (!quotation)) { quotation = TRUE; upperlimit--; }
            /* ⊔ : quotation ∈ [0, 1] */
            if ((c == '>') && (quotation)) { quotation = FALSE; upperlimit++; }
            /* ⊔ : quotation ∈ [0, 1] */
            if ((c == '(') && (!quotation) && !roundquote) {
                roundquote = TRUE; upperlimit--; // decrementation was missing in bug
            }
            /* ⊔ : quotation ∈ [0, 1], roundquote ∈ [0, 1] */
            if ((c == ')') && (!quotation) && roundquote) { roundquote = FALSE; upperlimit++; }
            /* ⊔ : quotation ∈ [0, 1], roundquote ∈ [0, 1] */
            // If there is sufficient space in the buffer, write the character.
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (roundquote) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (quotation) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

23 / 35

slide-43
SLIDE 43

Stack of Required Domains

To verify the code (disassembled from the binary) we used these abstract domains:

  • stack regions
  • memory fields
  • cpu flags
  • points to
  • wrapping
  • delayed widening
  • thresholds widening
  • affine equalities
  • intervals

Adding more domains (e.g. predicates, congruences, octagons, polyhedra, interval-sets) improves the precision of the inferred bounds after widening but is not necessary to verify the code.

24 / 35

slide-44
SLIDE 44

Solving Sendmail with Abstract Interpretation

A Walkthrough of the Sendmail Analysis using Bindead

25 / 35

slide-45
SLIDE 45

Analysis Steps and inferred Values

Let's analyze the code!

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-46
SLIDE 46

Analysis Steps and inferred Values

1st iteration: infers the affine equality between the variables: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            /* inputIndex = 0, outputIndex = 0, length ∈ [−∞, +∞] */
            c = input[inputIndex++];
            if (... && (!q)) {
                /* upperlimit = 190, q = 0, rq = 0 */
                q = 1; upperlimit--;
            }
            /* ⊔ : upperlimit + q = 190, upperlimit ∈ [189, 190], q ∈ [0, 1] */
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-47
SLIDE 47

Analysis Steps and inferred Values

1st iteration: infers the affine equality between the variables: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) {
                /* upperlimit = 189, q = 1, rq = 0 */
                q = 0; upperlimit++;
            }
            /* ⊔ : upperlimit + q = 190, upperlimit ∈ [189, 190], q ∈ [0, 1] */
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-48
SLIDE 48

Analysis Steps and inferred Values

1st iteration: infers the affine equality between the variables: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) {
                /* upperlimit = 190, q = 0, rq = 0 */
                rq = 1; upperlimit--;
            }
            /* ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [189, 190], q ∈ [0, 1], rq ∈ [0, 1] */
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-49
SLIDE 49

Analysis Steps and inferred Values

1st iteration: infers the affine equality between the variables: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) {
                /* upperlimit = 189, q = 0, rq = 1 */
                rq = 0; upperlimit++;
            }
            /* ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [189, 190], q ∈ [0, 1], rq ∈ [0, 1] */
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-50
SLIDE 50

Analysis Steps and inferred Values

1st iteration: infers the affine equality between the variables: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                /* upperlimit ∈ [189, 190], outputIndex = 0 */
                localbuf[outputIndex] = c;
                outputIndex++;
            }
            /* ⊔ : upperlimit ∈ [189, 190], outputIndex ∈ [0, 1] */
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-51
SLIDE 51

Analysis Steps and inferred Values

2nd iteration: widening ∇ suppressed by the “delayed widening” domain because of the flag assignments. Join ⊔ performed instead. We analyze the loop again with still valid equality: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            /* ⊔ : inputIndex ∈ [0, 1], outputIndex ∈ [0, 1] */
            c = input[inputIndex++];
            if (... && (!q)) {
                /* upperlimit + rq = 190, upperlimit ∈ [189, 190], q = 0, rq ∈ [0, 1] */
                q = 1; upperlimit--;
            }
            /* ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1] */
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-52
SLIDE 52

Analysis Steps and inferred Values

2nd iteration: widening ∇ suppressed by the “delayed widening” domain because of the flag assignments. Join ⊔ performed instead. We analyze the loop again with still valid equality: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) {
                /* upperlimit + rq = 189, upperlimit ∈ [188, 189], q = 1, rq ∈ [0, 1] */
                q = 0; upperlimit++;
            }
            /* ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1] */
            if (... && (!q) && !rq) { rq = 1; upperlimit--; }
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-53
SLIDE 53

Analysis Steps and inferred Values

2nd iteration: widening ∇ suppressed by the “delayed widening” domain because of the flag assignments. Join ⊔ performed instead. We analyze the loop again with still valid equality: upperlimit + q + rq = 190

    int copy_it(char *input, unsigned int length) {
        char c, localbuf[200];
        unsigned int upperlimit = 190;
        unsigned int q = 0, rq = 0;
        unsigned int inputIndex = 0, outputIndex = 0;
        while (inputIndex < length) {
            c = input[inputIndex++];
            if (... && (!q)) { q = 1; upperlimit--; }
            if (... && (q)) { q = 0; upperlimit++; }
            if (... && (!q) && !rq) {
                /* upperlimit = 190, q = 0, rq = 0 */
                rq = 1; upperlimit--;
            }
            /* ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1] */
            if (... && (!q) && rq) { rq = 0; upperlimit++; }
            if (outputIndex < upperlimit) {
                localbuf[outputIndex] = c;
                outputIndex++;
            }
        }
        if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
        if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
    }

26 / 35

slide-54
SLIDE 54

Analysis Steps and inferred Values

2nd iteration: widening ∇ suppressed by the “delayed widening” domain because of the flag assignments. Join ⊔ performed instead. We analyze the loop again with still valid equality: upperlimit + q + rq = 190

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) {
      // upperlimit = 189, q = 0, rq = 1
      rq = 0; upperlimit++;
    }
    // ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1]
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35

slide-55
SLIDE 55

Analysis Steps and inferred Values

2nd iteration: widening ∇ suppressed by the “delayed widening” domain because of the flag assignments. Join ⊔ performed instead. We analyze the loop again with still valid equality: upperlimit + q + rq = 190

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      // upperlimit ∈ [188, 190], outputIndex ∈ [0, 1]
      localbuf[outputIndex] = c;
      outputIndex++;
    }
    // ⊔ : upperlimit ∈ [188, 190], outputIndex ∈ [0, 2]
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35
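The join steps of the 2nd iteration can be mimicked on plain intervals. The following is a minimal sketch, not Bindead's implementation: the `interval` type and the helpers `itv_shift` and `itv_join` are hypothetical names introduced here. It replays the `if (... && (q))` branch, assuming that the state bypassing the branch is the preceding join result upperlimit ∈ [188, 190]. Intervals alone do not remember how upperlimit relates to q and rq, which is why the analysis tracks the equality upperlimit + q + rq = 190 in a separate domain.

```c
#include <assert.h>

/* Closed integer interval [lo, hi]; a hypothetical, minimal interval domain. */
typedef struct { long lo, hi; } interval;

/* Abstract effect of "x++" (k = 1) or "x--" (k = -1): shift both bounds by k. */
static interval itv_shift(interval a, long k) {
    interval r = { a.lo + k, a.hi + k };
    return r;
}

/* Join: the smallest interval that contains both arguments. */
static interval itv_join(interval a, interval b) {
    interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}
```

Shifting the in-branch value [188, 189] by one gives [189, 190], and joining that with the bypassing state [188, 190] yields [188, 190], the value shown after ⊔ on the slide.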

slide-56
SLIDE 56

Analysis Steps and inferred Values

3rd iteration: now widening ∇ is applied using the widening threshold: outputIndex − 1 < upperlimit. Widening changes the lower bound of upperlimit but reduction with the equality upperlimit + q + rq = 190 restores it.

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    // ∇ : inputIndex ∈ [0, +∞], outputIndex ∈ [0, 190], upperlimit ∈ [0, 190]
    c = input[inputIndex++];
    if (... && (!q)) {
      // upperlimit + rq = 190, upperlimit ∈ [189, 190], q = 0, rq ∈ [0, 1]
      q = 1; upperlimit--;
    }
    // ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1]
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35

slide-57
SLIDE 57

Analysis Steps and inferred Values

3rd iteration: now widening ∇ is applied using the widening threshold: outputIndex − 1 < upperlimit. Widening changes the lower bound of upperlimit but reduction with the equality upperlimit + q + rq = 190 restores it.

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) {
      // upperlimit + rq = 189, upperlimit ∈ [188, 189], q = 1, rq ∈ [0, 1]
      q = 0; upperlimit++;
    }
    // ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1]
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35

slide-58
SLIDE 58

Analysis Steps and inferred Values

3rd iteration: now widening ∇ is applied using the widening threshold: outputIndex − 1 < upperlimit. Widening changes the lower bound of upperlimit but reduction with the equality upperlimit + q + rq = 190 restores it.

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) {
      // upperlimit = 190, q = 0, rq = 0
      rq = TRUE; upperlimit--;
    }
    // ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1]
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35

slide-59
SLIDE 59

Analysis Steps and inferred Values

3rd iteration: now widening ∇ is applied using the widening threshold: outputIndex − 1 < upperlimit. Widening changes the lower bound of upperlimit but reduction with the equality upperlimit + q + rq = 190 restores it.

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) {
      // upperlimit = 189, q = 0, rq = 1
      rq = 0; upperlimit++;
    }
    // ⊔ : upperlimit + q + rq = 190, upperlimit ∈ [188, 190], q ∈ [0, 1], rq ∈ [0, 1]
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35

slide-60
SLIDE 60

Analysis Steps and inferred Values

3rd iteration: now widening ∇ is applied using the widening threshold: outputIndex − 1 < upperlimit. Widening changes the lower bound of upperlimit but reduction with the equality upperlimit + q + rq = 190 restores it.

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      // upperlimit ∈ [188, 190], outputIndex ∈ [0, 189]
      localbuf[outputIndex] = c;
      outputIndex++;
    }
    // ⊔ : upperlimit ∈ [188, 190], outputIndex ∈ [0, 190]
  }
  if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
  if (q) { localbuf[outputIndex] = '>'; outputIndex++; }
}

26 / 35
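The two effects of the 3rd iteration, widening with a threshold and reduction with the equality, can be sketched on intervals as follows. This is an illustration under assumptions, not the analyzer's code: `itv_widen` and `reduce_upperlimit` are hypothetical helpers, the single threshold 190 is derived by hand from outputIndex − 1 < upperlimit together with upperlimit ≤ 190, and a shrinking lower bound is simply sent to 0 because the variables are unsigned.

```c
#include <assert.h>
#include <limits.h>

/* Closed integer interval [lo, hi]; hypothetical minimal interval domain. */
typedef struct { long lo, hi; } interval;

/* Widening with one upper threshold. A growing upper bound jumps to the
 * threshold if the new bound still fits below it, otherwise to LONG_MAX
 * (standing in for +infinity). A shrinking lower bound jumps to 0, the
 * minimum of an unsigned variable. Stable bounds are kept. */
static interval itv_widen(interval old, interval cur, long threshold) {
    interval r = old;
    if (cur.hi > old.hi)
        r.hi = (cur.hi <= threshold) ? threshold : LONG_MAX;
    if (cur.lo < old.lo)
        r.lo = 0;
    return r;
}

/* Reduction with upperlimit + q + rq = 190: intersect upperlimit with
 * the interval [190 - q.hi - rq.hi, 190 - q.lo - rq.lo]. */
static interval reduce_upperlimit(interval ul, interval q, interval rq) {
    long lo = 190 - q.hi - rq.hi;
    long hi = 190 - q.lo - rq.lo;
    interval r;
    r.lo = ul.lo > lo ? ul.lo : lo;
    r.hi = ul.hi < hi ? ul.hi : hi;
    return r;
}
```

With outputIndex growing from [0, 1] to [0, 2], the widening lands on [0, 190] instead of [0, +∞]. For upperlimit, a lower bound that dropped (here assumed to go from 189 to 188) is widened to 0, and the reduction with q, rq ∈ [0, 1] brings it back to [188, 190].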

slide-61
SLIDE 61

Analysis Steps and inferred Values

4th iteration: loop is stable; outside of the loop body the value of outputIndex is still bounded!

int copy_it (char *input, unsigned int length) {
  char c, localbuf[200];
  unsigned int upperlimit = 190;
  unsigned int q = rq = 0;
  unsigned int inputIndex = outputIndex = 0;
  while (inputIndex < length) {
    // ⊑ : inputIndex ∈ [0, +∞], outputIndex ∈ [0, 190]
    c = input[inputIndex++];
    if (... && (!q)) { q = 1; upperlimit--; }
    if (... && (q)) { q = 0; upperlimit++; }
    if (... && (!q) && !rq) { rq = TRUE; upperlimit--; }
    if (... && (!q) && rq) { rq = 0; upperlimit++; }
    if (outputIndex < upperlimit) {
      localbuf[outputIndex] = c;
      outputIndex++;
    }
  }
  if (rq) {
    // outputIndex ∈ [0, 190]
    localbuf[outputIndex] = ')'; outputIndex++;
  }
  if (q) {
    // outputIndex ∈ [1, 191]
    localbuf[outputIndex] = '>'; outputIndex++;
  }
}

26 / 35
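The stability test ⊑ and the final bounds check can be written down in a few lines. This is a small sketch with hypothetical helper names (`itv_leq`, `write_in_bounds`); it only restates the slide's numbers: the loop-head state no longer grows, and both writes after the loop use indices below the buffer size of 200.

```c
#include <assert.h>
#include <stdbool.h>

/* Closed integer interval [lo, hi]; hypothetical minimal interval domain. */
typedef struct { long lo, hi; } interval;

/* Inclusion test a ⊑ b: every value of a is also a value of b.
 * The fixpoint iteration stops once the new loop-head state is
 * included in the previous one. */
static bool itv_leq(interval a, interval b) {
    return b.lo <= a.lo && a.hi <= b.hi;
}

/* A write buf[idx] is safe if every possible index lies in [0, size - 1]. */
static bool write_in_bounds(interval idx, long size) {
    return idx.lo >= 0 && idx.hi < size;
}
```

For the values on the slide, `write_in_bounds` holds for [0, 190] and [1, 191] against a size of 200, so neither write after the loop is flagged.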

slide-62
SLIDE 62

Key Points

  • widening needs to be suppressed until the flag variables are stable to infer the equality relation with upperlimit
  • the inferred equality upperlimit + q + rq = 190 and reduction between domains result in more precise values for upperlimit; this recovers the precision lost by widening
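The equality can also be observed dynamically on a concrete version of the fixed code. The sketch below is an assumption-laden reconstruction: the slides elide the character tests as `...`, so the characters '<', '>', '(' and ')' are guesses based on the e-mail address shown on the title slide, and `copy_it_checked` is a hypothetical name. The assertion inside the loop checks exactly the invariant the analyzer infers.

```c
#include <assert.h>

#define BUF_SIZE 200u
#define LIMIT 190u

/* Concrete variant of the fixed loop with the invariant asserted in every
 * iteration. The character tests are assumptions; the slides elide them.
 * Returns the final outputIndex. */
static unsigned int copy_it_checked(const char *input, unsigned int length) {
    char localbuf[BUF_SIZE];
    unsigned int upperlimit = LIMIT;
    unsigned int q = 0, rq = 0;
    unsigned int inputIndex = 0, outputIndex = 0;

    while (inputIndex < length) {
        char c = input[inputIndex++];
        if (c == '<' && !q)        { q = 1;  upperlimit--; }
        if (c == '>' && q)         { q = 0;  upperlimit++; }
        if (c == '(' && !q && !rq) { rq = 1; upperlimit--; }
        if (c == ')' && !q && rq)  { rq = 0; upperlimit++; }

        /* Every flag flip is paired with an opposite change of upperlimit. */
        assert(upperlimit + q + rq == LIMIT);

        if (outputIndex < upperlimit) {
            localbuf[outputIndex] = c;
            outputIndex++;
        }
    }
    if (rq) { localbuf[outputIndex] = ')'; outputIndex++; }
    if (q)  { localbuf[outputIndex] = '>'; outputIndex++; }

    assert(outputIndex < BUF_SIZE);   /* both trailing writes stayed in bounds */
    (void)localbuf;                   /* the buffer content is not used further */
    return outputIndex;
}
```

Feeding it a long run of "()" pairs leaves the assertions intact and the output index capped at 190, which matches the bound the analysis reports for the loop.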

27 / 35

slide-63
SLIDE 63

Key Points (continued)

  • narrowing does not help here; instead the threshold outputIndex < upperlimit must be used for widening ❀ outputIndex is also restricted outside of the loop for the following two writes to the buffer
  • in the vulnerable version the equality relation does not hold because of the missing decrement ❀ upperlimit is unbounded after widening
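Why the missing decrement breaks the bound can be seen by replaying only the index arithmetic. The sketch below rests on two assumptions not spelled out on these slides: that the character tests are '<', '>', '(' and ')', and that the omitted statement is the `upperlimit--` in the '(' branch, as the "()()()" input on the title slide suggests. The hypothetical function `vulnerable_output_index` writes nothing to memory, so it is safe to run.

```c
#include <assert.h>

#define BUF_SIZE 200u

/* Replays the index arithmetic of the assumed vulnerable loop without
 * touching any buffer. Returns the outputIndex the loop would reach. */
static unsigned int vulnerable_output_index(const char *input,
                                            unsigned int length) {
    unsigned int upperlimit = 190;
    unsigned int q = 0, rq = 0;
    unsigned int inputIndex = 0, outputIndex = 0;

    while (inputIndex < length) {
        char c = input[inputIndex++];
        if (c == '<' && !q)        { q = 1;  upperlimit--; }
        if (c == '>' && q)         { q = 0;  upperlimit++; }
        if (c == '(' && !q && !rq) { rq = 1; /* assumed missing: upperlimit--; */ }
        if (c == ')' && !q && rq)  { rq = 0; upperlimit++; }

        if (outputIndex < upperlimit)
            outputIndex++;            /* stands for localbuf[outputIndex++] = c */
    }
    return outputIndex;
}
```

Each "()" pair now raises upperlimit by one, so upperlimit + q + rq is no longer constant. Twenty pairs followed by enough ordinary characters push the index to 210, past the 200-byte buffer.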

28 / 35

slide-69
SLIDE 69

Unfortunately

The original 500 LOC Sendmail Bug is more complex!

  • the code contains ∼10 loops (nesting depth is 4) and gotos
  • lots of pointer arithmetic inside loops
  • uses string manipulating functions
  • the bug fix is not a single line but touches several places

❀ we cannot yet automatically prove the invariant on that code; the non-vulnerable version is flagged as vulnerable, too

29 / 35

slide-70
SLIDE 70

Analysis Results of various Tools

Now let's look at how other tools fare on the simplified example ...

30 / 35

slide-71
SLIDE 71

Analysis Results of various Tools

Evaluated on the simplified Sendmail Crackaddr Example

Tool | non-vuln. vuln. | Techniques used | Input
Bindead | ✓ ✓ | AI | binary
Jakstab | ✓ | AI | binary
Astrée | ✓ | AI | C
Goblint | ✓ | AI | C
TIS-Analyzer/Frama-C | ✓(m) ✓ | AI + MC | C
PAGAI | ✓ ✓ | AI + MC | LLVM
SeaHorn | ✓ ✓ | AI + MC | LLVM
HAVOC | ✓(m) ✓ | MC | C
CProver | ✓(m) ✓ | MC | C
AFL | ✓ | Fuzz | C
Radamsa | ✓ | Fuzz | binary

m: manual hints from user required
AI: Abstract Interpretation
MC: Model Checking
Fuzz: Fuzzing

Still to test: KLEE, S2E, BAP, Java Path Finder, Triton, PySymEmu, Moflow, Angr, McSema, OpenREIL, Bincoa, CodeSonar, Polyspace, Goanna, Clousot . . .

31 / 35

slide-75
SLIDE 75

Conclusion

Program analysis tools can infer surprisingly nice results. ❀ here an invariant that shows the programmer’s intention

  • but the tools are quite complex ❀ hard to understand and to reason about the results
  • if an expected invariant cannot be proved it is difficult to find out why and fix it
  • however, being able to understand, use and debug an analyzer is key to building useful analyses! ❀ general adoption of static analyzers is an uphill battle :(

32 / 35

slide-80
SLIDE 80

Demo!

Initializing ... demo

... ... ... ... 25% ... ... ... ... 64% ... ... ... ... 98%

Project page: https://bitbucket.org/mihaila/bindead

33 / 35

slide-81
SLIDE 81

A Merci Beaucoup goes to ...

People who helped with ideas, discussions and the experiments: Halvar Flake, Joshua J. Drake, Pascal Cuoq, Julien Vanegue, Johannes Kinder, Julien Henry, Ralf Vogler

All the tool developers of Bindead, Astrée, TIS-Analyzer, Frama-C, Goblint, PAGAI, AFL, Radamsa, Jakstab, SeaHorn, CProver, HAVOC, KLEE, S2E, BAP, Java Path Finder, Triton, PySymEmu, Moflow, Angr, McSema, OpenREIL, Bincoa, CodeSonar, Polyspace, Goanna, Clousot, ...

Hackito Ergo Sum 2015

34 / 35

slide-82
SLIDE 82

Some previous material on Sendmail Crackaddr

Presentations

  • Checking the Boundaries of Static Analysis - Halvar Flake 2013
  • Exploitation and State Machines - Halvar Flake 2012
  • Exploit-Generation with Acceleration - Daniel Kröning et al. 2013
  • Modern Static Security Checking of C/C++ Code - Julien Vanegue 2012
  • Practical AI Applications to Information Security - Fyodor Yarochkin 2003

Papers and Web Resources

  • TIS Analyzer Sendmail Crackaddr Analysis Report - Pascal Cuoq 2014
  • SMT Solvers for Software Security - Julien Vanegue et al. 2012
  • Technical Analysis and Exploitation of Sendmail Bug - LSD 2003
  • Sendmail Crackaddr CVE-2002-1337 - MITRE Co. 2003
  • Remote Sendmail Header Processing Vulnerability - IBM ISS 2003

35 / 35