BinTrimmer: Towards Static Binary Debloating Through Abstract Interpretation


slide-1
SLIDE 1

BinTrimmer: Towards Static Binary Debloating Through Abstract Interpretation

DIMVA, June 19th, 2019

Nilo Redini
Computer Science @ UC Santa Barbara
nredini@cs.ucsb.edu

slide-2
SLIDE 2

Motivation

  • Software complexity pushes developers toward component re-use
  • Programs bloated with unused code
slide-3
SLIDE 3

Motivation

  • Software complexity pushes developers toward component re-use
  • Programs bloated with unused code

Unused code can be used to harm users

0x804fd2c: pop rdi (pointer to “bin/sh”); ret
...
0x7fff4cdda: pop rsi (pointer to null); ret
...
0x805ccac: pop rdx (pointer to null); ret
0x7fff39cd4: spawn shell (execve)

slide-4
SLIDE 4

Motivation

  • Software complexity pushes developers toward component re-use
  • Programs bloated with unused code

Unused code can be used to harm users

Remove dead code to reduce attack surface

0x804fd2c: pop rdi (pointer to “bin/sh”); ret
...
0x7fff4cdda: pop rsi (pointer to null); ret
...
0x805ccac: pop rdx (pointer to null); ret
0x7fff39cd4: spawn shell (execve)

slide-5
SLIDE 5

Current Techniques

State-of-the-art debloating techniques require:

  • Source code
  • Test cases
  • Runtime support
slide-6
SLIDE 6

Current Techniques

State-of-the-art debloating techniques require:

  • Source code: not always available
  • Test cases
  • Runtime support
slide-7
SLIDE 7

Current Techniques

State-of-the-art debloating techniques require:

  • Source code
  • Test cases: unreliable programs
  • Runtime support
slide-8
SLIDE 8

Current Techniques

State-of-the-art debloating techniques require:

  • Source code
  • Test cases
  • Runtime support: different architectures
slide-9
SLIDE 9

Debloating

Can we statically identify and remove unused code when only the binary program is available?

slide-10
SLIDE 10

Debloating

Build a complete & sound Control-Flow Graph, and remove the code not referenced

slide-11
SLIDE 11

Debloating

Build a complete & sound Control-Flow Graph, and remove the code not referenced

Undecidable ~> Impossible!

slide-12
SLIDE 12

Debloating

Build a complete & sound Control-Flow Graph, and remove the code not referenced

Undecidable ~> Impossible!

Sound debloating requires a complete Control-Flow Graph

slide-13
SLIDE 13

Debloating

Build a complete & sound Control-Flow Graph, and remove the code not referenced

Undecidable ~> Impossible!

Sound debloating requires a complete Control-Flow Graph

Completeness without precision ~> Ineffective debloating
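
The trade-off between completeness and precision can be made concrete with a plain reachability pass over a call graph. The sketch below is only an illustration: the five-function program, the adjacency-matrix representation, and the names `mark_reachable` and `count_dead` are assumptions, not BinTrimmer's implementation. A function is treated as trimmable when no path from the entry point reaches it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FUNCS 5 /* hypothetical program: main, dispatch, foo, bar, baz */

/* Marks every function reachable from `entry` in the call graph `edges`,
 * where edges[i][j] is true when function i may call function j. */
static void mark_reachable(bool edges[NUM_FUNCS][NUM_FUNCS], size_t entry,
                           bool reachable[NUM_FUNCS]) {
    assert(entry < NUM_FUNCS);
    size_t worklist[NUM_FUNCS]; /* each function is pushed at most once */
    size_t top = 0;
    for (size_t i = 0; i < NUM_FUNCS; i++) reachable[i] = false;
    reachable[entry] = true;
    worklist[top++] = entry;
    while (top > 0) {
        size_t f = worklist[--top];
        for (size_t g = 0; g < NUM_FUNCS; g++) {
            if (edges[f][g] && !reachable[g]) {
                reachable[g] = true;
                worklist[top++] = g;
            }
        }
    }
}

/* Counts the functions that are not reachable from `entry`. */
size_t count_dead(bool edges[NUM_FUNCS][NUM_FUNCS], size_t entry) {
    bool reachable[NUM_FUNCS];
    size_t dead = 0;
    mark_reachable(edges, entry, reachable);
    for (size_t i = 0; i < NUM_FUNCS; i++) {
        if (!reachable[i]) dead++;
    }
    return dead;
}
```

If the indirect call in `dispatch` is conservatively wired to all of foo, bar, and baz, `count_dead` finds nothing to trim. Narrowing that call to a single target leaves two functions unreachable, which is the effect a more precise CFG is meant to have.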

slide-14
SLIDE 14

Debloating

Assuming we have a complete but imprecise CFG, how do we increase its precision?

slide-15
SLIDE 15

Debloating

Assuming we have a complete but imprecise CFG, how do we increase its precision?

Through a precise approximation of variable values (e.g., function pointers)

slide-16
SLIDE 16

Debloating

Assuming we have a complete but imprecise CFG, how do we increase its precision?

Through a precise approximation of variable values (e.g., function pointers)

Define a precise abstract domain

slide-17
SLIDE 17

Example

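#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// foo, bar, and baz are defined in another module
void foo(void);
void bar(void);
void baz(void);

int main(void) {
    uint8_t opt;
    void (*f_ptr[])(void) = {foo, bar, baz};

    scanf("%" SCNu8, &opt);
    opt = (opt * 2) + 1;  // wraps modulo 256, so opt is always odd
    // ...
    if (opt == 0) {
        f_ptr[0]();  // call to foo
    } else if (opt == 100) {
        f_ptr[1]();  // call to bar
    } else if (opt < 0) {
        f_ptr[2]();  // call to baz
    }
    return 0;
}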
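
One way to see what a precise value approximation should conclude here is to run every possible input through the same arithmetic. The sketch below is a brute-force check, not the analysis BinTrimmer performs. It assumes the 8-bit unsigned semantics declared in the source, and the names `count_branch_hits` and `CALL_*` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indices into the counts array, one per guarded call in the example. */
enum { CALL_FOO = 0, CALL_BAR = 1, CALL_BAZ = 2, NUM_CALLS = 3 };

/* Applies the example's arithmetic and branch conditions to every possible
 * input byte and counts how many inputs reach each call. */
void count_branch_hits(unsigned counts[NUM_CALLS]) {
    assert(counts != NULL);
    for (int i = 0; i < NUM_CALLS; i++) counts[i] = 0;
    for (unsigned input = 0; input <= UINT8_MAX; input++) {
        uint8_t opt = (uint8_t)input;
        opt = (uint8_t)((opt * 2) + 1); /* wraps modulo 256, always odd */
        int value = opt;                /* unsigned byte promotes to 0..255 */
        if (value == 0) {
            counts[CALL_FOO]++;
        } else if (value == 100) {
            counts[CALL_BAR]++;
        } else if (value < 0) {
            counts[CALL_BAZ]++;
        }
    }
}
```

Under these assumptions all three counters stay at zero: an odd value is never 0 or 100, and an unsigned byte is never negative. An abstract domain that tracks the stride of `opt` and does not misjudge its sign can reach the same conclusion without enumerating inputs.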

slide-24
SLIDE 24

Signedness of Variables

While it is easy to detect the signedness of a variable in source code, it is harder on binary programs.

slide-25
SLIDE 25

Signedness of Variables

While it is easy to detect the signedness of a variable in source code, it is harder on binary programs.

The abstract domain must be signedness-agnostic.
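
The difficulty is that a register or memory cell holds only bits, and a comparison such as `x < 0` gives different answers depending on how those bits are read. The helpers below are a small illustration under the assumption of 8-bit two's-complement values. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decodes an 8-bit pattern as a two's-complement signed value. */
int as_signed8(uint8_t bits) {
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}

/* Decodes the same 8-bit pattern as an unsigned value. */
int as_unsigned8(uint8_t bits) {
    return (int)bits;
}

/* Evaluates `x < 0` under the chosen interpretation of the bits. */
bool is_negative(uint8_t bits, bool treat_as_signed) {
    int value = treat_as_signed ? as_signed8(bits) : as_unsigned8(bits);
    return value < 0;
}
```

For the pattern 0xFF, the signed reading is -1 and the unsigned reading is 255, so the same branch is taken in one interpretation and dead in the other. An analysis that guesses the wrong interpretation either keeps too much code or removes live code.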

slide-26
SLIDE 26

BinTrimmer

slide-27
SLIDE 27

High-level Idea

Goal: We want to recover a complete and precise CFG, thus guaranteeing program functionality and effective debloating

The more precise the CFG is, the more we can trim!

CFG Refinement ~> Debloating

slide-28
SLIDE 28

High-level Idea

Goal: We want to recover a complete and precise CFG, thus guaranteeing program functionality and effective debloating

The more precise the CFG is, the more we can trim!

Signedness-Agnostic Strided Intervals (SASI) ~> CFG Refinement ~> Debloating

slide-29
SLIDE 29

Signedness-Agnostic Strided Intervals

slide-30
SLIDE 30

Signedness-Agnostic Strided Intervals

+w represents modular addition of bit-width w

Example: 2[1010, 0010]4 = {1010, 1100, 1110, 0000, 0010}
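
The example can be reproduced by stepping from the lower bound to the upper bound in increments of the stride, with addition wrapping at the bit-width. This is a minimal sketch: the `struct sasi` layout, its field names, and the convention that stride 0 denotes a single value are assumptions made for illustration, not BinTrimmer's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A strided interval s[lower, upper]w on the w-bit number circle.
 * Field names are illustrative only. */
struct sasi {
    uint32_t stride;
    uint32_t lower;
    uint32_t upper;
    unsigned width; /* bit-width w, limited to 1..31 in this sketch */
};

/* Writes lower, lower+s, ..., upper (addition modulo 2^w) into out and
 * returns how many values the interval holds. At most cap are written. */
size_t sasi_members(struct sasi v, uint32_t *out, size_t cap) {
    assert(v.width >= 1 && v.width <= 31);
    uint32_t mask = (UINT32_C(1) << v.width) - 1;
    uint32_t span = (v.upper - v.lower) & mask; /* clockwise distance */
    /* upper must be reachable from lower in whole strides */
    assert(v.stride == 0 ? span == 0 : span % v.stride == 0);
    size_t count = (v.stride == 0) ? 1 : (size_t)(span / v.stride) + 1;
    for (size_t i = 0; i < count && i < cap; i++) {
        out[i] = (v.lower + (uint32_t)i * v.stride) & mask;
    }
    return count;
}
```

Called on stride 2, lower bound 1010, upper bound 0010, and width 4, the function yields the five values listed in the example, including the step from 1110 to 0000 that wraps around.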

slide-31
SLIDE 31

Signedness-Agnostic Strided Intervals

Number circle ~> Capture overflow behavior of variables on a computer

slide-32
SLIDE 32

Signedness-Agnostic Strided Intervals

Number circle ~> Capture overflow behavior of variables on a computer

Stride ~> To increase the precision of the values represented by an element in SASI

slide-33
SLIDE 33

Signedness-Agnostic Strided Intervals

Number circle ~> Capture overflow behavior of variables on a computer

Stride ~> To increase the precision of the values represented by an element in SASI

Signedness Agnosticity and Soundness ~> Achieved by a careful design of the operations on SASI
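
A membership test shows how the number circle and the stride work together without committing to a sign. The sketch below is an illustration under assumptions: the `struct sasi` layout and the names `sasi_contains` and `to_signed` are hypothetical, and widths are limited to 31 bits for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout: stride, bounds, and bit-width of the interval. */
struct sasi { uint32_t stride, lower, upper; unsigned width; };

/* True when the w-bit pattern `bits` lies on the arc from lower to upper
 * (walking clockwise) and is a whole number of strides away from lower. */
bool sasi_contains(struct sasi v, uint32_t bits) {
    assert(v.width >= 1 && v.width <= 31);
    uint32_t mask = (UINT32_C(1) << v.width) - 1;
    uint32_t span = (v.upper - v.lower) & mask;
    uint32_t offset = (bits - v.lower) & mask;
    if (offset > span) return false;
    return v.stride == 0 ? offset == 0 : offset % v.stride == 0;
}

/* Reads a w-bit pattern (already masked to w bits) as a two's-complement
 * signed number. */
int32_t to_signed(uint32_t bits, unsigned width) {
    assert(width >= 1 && width <= 31);
    uint32_t sign = UINT32_C(1) << (width - 1);
    return (int32_t)(bits ^ sign) - (int32_t)sign;
}
```

For the interval with stride 2, bounds 1010 and 0010, and width 4, the unsigned reading of the members is 10, 12, 14, 0, 2 and the signed reading is -6, -4, -2, 0, 2. The membership test gives the same answer for both readings because it only measures distances around the circle.
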
slide-34
SLIDE 34

Example: Addition

Given two SASIs r = Sr[a, b]w and t = St[c, d]w, addition is defined as follows:

[formula not recoverable from the slide text]

where Ss = gcd(Sr, St)
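
Only the stride rule Ss = gcd(Sr, St) survives in the slide text. The sketch below fills in the rest with assumptions borrowed from ordinary wrapped-interval arithmetic: the bounds are added modulo 2^w, and the result widens to the full circle when the two spans together could wrap all the way around. It should be read as one plausible sound addition, not as the definition given in the talk. The `struct sasi` layout and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout: stride, bounds, and bit-width of the interval. */
struct sasi { uint32_t stride, lower, upper; unsigned width; };

/* Greatest common divisor; gcd(0, x) is x, so singletons (stride 0) work. */
static uint32_t gcd_u32(uint32_t a, uint32_t b) {
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Sketch of strided-interval addition on the w-bit number circle. */
struct sasi sasi_add(struct sasi r, struct sasi t) {
    assert(r.width == t.width && r.width >= 1 && r.width <= 31);
    uint32_t mask = (UINT32_C(1) << r.width) - 1;
    uint32_t span_r = (r.upper - r.lower) & mask;
    uint32_t span_t = (t.upper - t.lower) & mask;
    struct sasi s;
    s.width = r.width;
    if ((uint64_t)span_r + span_t > mask) {
        /* Assumption: the sums may cover the whole circle, so return top. */
        s.stride = 1;
        s.lower = 0;
        s.upper = mask;
    } else {
        s.stride = gcd_u32(r.stride, t.stride); /* stride rule from the slide */
        s.lower = (r.lower + t.lower) & mask;
        s.upper = (r.upper + t.upper) & mask;
    }
    return s;
}
```

As a worked case at width 4, adding 2[1010, 0010] and 4[0001, 0101] gives stride gcd(2, 4) = 2 and bounds 1011 and 0111. That interval holds 11, 13, 15, 1, 3, 5, 7, which are exactly the pairwise sums of the two inputs.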

slide-35
SLIDE 35

CFG Refinement

slide-36
SLIDE 36

CFG Refinement

slide-37
SLIDE 37

CFG Refinement

slide-38
SLIDE 38

CFG Refinement

slide-39
SLIDE 39

Program Debloating

Delete code
  + Lighter binaries
  - Pointers must be updated

Modify code
  + Guaranteed functionality (no need to fix pointers)
  - Same size
slide-40
SLIDE 40

BinTrimmer

Static binary trimming tool

  • Leverages SASI to refine the CFG and identify dead code
  • Rewrites dead code with halt instructions
  • Implemented on top of angr
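
Rewriting dead code in place, rather than deleting it, can be sketched as a fill over a byte range. The example assumes an x86 target, where `hlt` is the single byte 0xF4. The function name `trim_range` and the in-memory buffer are illustrative and do not reflect how BinTrimmer drives angr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_HLT 0xF4 /* one-byte x86 `hlt` opcode */

/* Overwrites code[start, end) with hlt instructions. The buffer length and
 * every byte outside the range stay unchanged. Returns the number of bytes
 * rewritten. */
size_t trim_range(uint8_t *code, size_t len, size_t start, size_t end) {
    assert(code != NULL && start <= end && end <= len);
    memset(code + start, X86_HLT, end - start);
    return end - start;
}
```

Because the buffer keeps its size, every address outside the rewritten range stays valid, so no pointers need fixing. The cost is that the binary does not shrink.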

slide-41
SLIDE 41

Experimental Results

slide-42
SLIDE 42

SASI vs. Wrapped Intervals (on Sources)

slide-43
SLIDE 43

SASI vs. Wrapped Intervals (on Binaries)

slide-44
SLIDE 44

Trimming Results

slide-45
SLIDE 45

Trimming Results

slide-46
SLIDE 46

Trimming Results

slide-47
SLIDE 47

Trimming Results

slide-48
SLIDE 48

Trimming Results

slide-49
SLIDE 49

Conclusions

New abstract domain: SASI

  • 98% more precise than state-of-the-art!

BinTrimmer: Static Binary Debloating

  • Sound debloating: programs guaranteed to work!
  • No test cases needed
  • No source code needed
  • Remove up to 65.6% of a library’s code

slide-50
SLIDE 50

Thanks! && Questions?

Nilo Redini

nredini@cs.ucsb.edu
https://badnack.it
@badnack