

SLIDE 1

HI-CFG: Construction by Dynamic Binary Analysis, and Application to Attack Polymorphism

Dan Caselden, Alex Bazhanyuk, Mathias Payer, Stephen McCamant, Dawn Song, UC Berkeley

SLIDE 2

Recovering Information

Knowledge of both the information (data) flow and the control flow of an application is crucial for analysis

  • Current tools focus on just one type of flow

Combine information flow and control flow into high-level data structure

  • Hybrid Information- and Control-Flow Graph (HI-CFG) using binary analysis

SLIDE 3

HI-CFG Overview

[Figure: HI-CFG overview. CFG view with nodes 1-6; data-flow view with Buffers A, B, and C.]

SLIDE 4

Outline

  • Motivation
  • Attack Polymorphism
  • Dynamic HI-CFG Construction
  • Evaluation
  • Conclusion

SLIDE 5

HI-CFG: Attack Polymorphism

Step one: phase partitioning

  • Divide a computation into steps that transform data from an original input to an internal format

  • Based on HI-CFG buffers, information-flow and producer/consumer edges

Step two: phase-aware input generation

  • Aim is to produce an input that triggers a vulnerability deep within a program

  • Use phase structure to divide and conquer
  • Symbolic execution with search pruning
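Phase partitioning can be sketched roughly as ordering the information-flow edges between buffers. The snippet below is only an illustration under stated assumptions: the buffer names and the functions `read_file`, `decode_hex`, and `decode_rle` are hypothetical, and a real HI-CFG carries far more detail than a list of triples.

```python
from graphlib import TopologicalSorter

# Hypothetical HI-CFG fragment: (source buffer, destination buffer, code that
# moves the data). All names are made up for illustration.
info_flow_edges = [
    ("buf1", "buf2", "decode_rle"),
    ("input", "buf0", "read_file"),
    ("buf0", "buf1", "decode_hex"),
]

def partition_into_phases(edges):
    """Return the edges ordered so that each phase only consumes data
    produced by an earlier phase."""
    predecessors = {}
    for src, dst, _ in edges:
        predecessors.setdefault(src, set())
        predecessors.setdefault(dst, set()).add(src)
    # static_order() yields every buffer after all of its predecessors.
    order = list(TopologicalSorter(predecessors).static_order())
    rank = {buf: i for i, buf in enumerate(order)}
    return sorted(edges, key=lambda edge: rank[edge[0]])

phases = partition_into_phases(info_flow_edges)
for src, dst, code in phases:
    print(f"{src} -> {dst} via {code}")
```

Each resulting phase is a candidate unit for the divide-and-conquer input generation in step two.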
SLIDE 6

HI-CFG: Attack Polymorphism

[Figure: an Input fed into the Program (with target condition).]

SLIDE 7

HI-CFG: Attack Polymorphism

[Figure: Program (with target condition) containing the Input and buffers buf0, buf1, buf2, connected by transformations (trans.); result is a PoC input.]

SLIDE 8

HI-CFG: Attack Polymorphism

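[Figure: Program (with target condition) containing the Input and buffers buf0, buf1, buf2, connected by transformations (trans.), with one symbolic execution (SE) run per transformation; result is a PoC input.]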

SLIDE 9

Outline

  • Motivation
  • Attack Polymorphism
  • Dynamic HI-CFG Construction
  • Evaluation
  • Conclusion

SLIDE 10

HI-CFG: trace-based construction 1/3

An execution trace lets us recover both the control flow and the information flow of an application for a concrete input

  • 1. Start with specific input data
  • 2. Collect an instruction level trace (TEMU)
  • 3. Process the traces to create a HI-CFG
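As a rough illustration of step 3, the sketch below walks a toy trace and records control-flow edges, conservative information-flow edges, and producer/consumer relations. The trace format is an assumption made for this example and is not TEMU's actual output format.

```python
from collections import defaultdict

# Assumed toy trace format: (instruction address, addresses read, addresses written).
trace = [
    (0x1000, [0x8000], [0x9000]),
    (0x1004, [0x8001], [0x9001]),
    (0x1008, [0x9000, 0x9001], [0xA000]),
]

def build_hi_cfg(trace):
    control_edges = set()          # (previous instruction, next instruction)
    info_edges = set()             # (address read, address written)
    producers = defaultdict(set)   # address -> instructions that write it
    consumers = defaultdict(set)   # address -> instructions that read it
    prev_pc = None
    for pc, reads, writes in trace:
        if prev_pc is not None:
            control_edges.add((prev_pc, pc))
        for r in reads:
            consumers[r].add(pc)
            # Conservative assumption: every read may influence every write
            # of the same instruction.
            for w in writes:
                info_edges.add((r, w))
        for w in writes:
            producers[w].add(pc)
        prev_pc = pc
    return control_edges, info_edges, producers, consumers

control_edges, info_edges, producers, consumers = build_hi_cfg(trace)
```

A single trace only covers the paths that the chosen input exercises, which is why the choice of input in step 1 matters.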
SLIDE 11

HI-CFG: trace-based construction 2/3

Work through the execution trace and group “related” memory accesses

  • Categorize buffers hierarchically
  • Conservative and taint-based information flow

Grouping heuristics

  • Instructions use same base pointer
  • Temporally and spatially correlated memory accesses
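The grouping heuristics might look roughly like the following sketch. The record layout and the thresholds `max_gap` and `max_delay` are assumptions chosen for illustration, not values from the talk.

```python
from collections import defaultdict

# Assumed access records: (time step, base pointer value, accessed address).
accesses = [
    (0, 0x8000, 0x8000), (1, 0x8000, 0x8001), (2, 0x8000, 0x8002),
    (3, 0x9000, 0x9000), (4, 0x9000, 0x9004),
    (50, 0x8000, 0x8100),
]

def group_accesses(accesses, max_gap=8, max_delay=16):
    """Group accesses into (base, low, high) buffer candidates."""
    by_base = defaultdict(list)
    for t, base, addr in accesses:
        by_base[base].append((t, addr))      # heuristic 1: same base pointer
    buffers = []
    for base, items in by_base.items():
        items.sort()
        last_t, lo = items[0]
        hi = lo
        for t, addr in items[1:]:
            near_in_space = lo - max_gap <= addr <= hi + max_gap
            near_in_time = t - last_t <= max_delay
            if near_in_space and near_in_time:   # heuristic 2: correlated accesses
                lo, hi = min(lo, addr), max(hi, addr)
            else:
                buffers.append((base, lo, hi))
                lo = hi = addr
            last_t = t
        buffers.append((base, lo, hi))
    return sorted(buffers)

buffers = group_accesses(accesses)
```

In this toy input the late access to `0x8100` shares a base pointer with the first group but is far away in both time and space, so it becomes a separate buffer candidate.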
SLIDE 12

HI-CFG: trace-based construction 3/3

Apply graph partitioning algorithms to divide the HI-CFG at “natural” boundaries to separate code and data structures

  • Extract functionality into separate modules for reuse or transformation

No source info needed, except addresses of malloc/calloc/free
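The talk does not name the partitioning algorithm here. As a much simpler stand-in, the sketch below cuts weak edges and takes connected components; the node names and edge weights (imagined as access counts) are hypothetical.

```python
def partition(nodes, weighted_edges, min_weight):
    """Thresholded connected components using union-find.
    This is a stand-in, not the graph partitioning used by the authors."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    for a, b, weight in weighted_edges:
        if weight >= min_weight:            # keep only strong code/data coupling
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical code nodes and buffers; weights are made-up access counts.
nodes = ["decode_hex", "buf0", "buf1", "decode_rle", "buf2"]
edges = [
    ("decode_hex", "buf0", 90),
    ("decode_hex", "buf1", 80),
    ("buf1", "decode_rle", 3),
    ("decode_rle", "buf2", 70),
]
modules = partition(nodes, edges, min_weight=10)
```

The weak edge between `buf1` and `decode_rle` is the "natural" boundary in this toy graph, so the two decoders end up in separate modules.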

SLIDE 13

Outline

  • Motivation
  • Attack Polymorphism
  • Dynamic HI-CFG Construction
  • Evaluation
      • Scalable Symbolic Execution
      • Poppler Case Study
  • Conclusion

SLIDE 14

Scalable SE is key

Vulnerability detection

  • Both in malware and legit applications

Model extraction

  • Automatically learn security-relevant models

Binary code reuse

  • Identify interface and extract components
SLIDE 15

Evaluation setup

Simple transformation

  • RLE decoding
  • Output as target, SE produces input

Configurations

  • KLEE
  • FuzzBALL

Detailed results from TR Berkeley/EECS-2013-125
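To make the "output as target" setup concrete, here is a minimal sketch. It assumes a simple RLE format of (count, value) byte pairs, which may differ from the decoder used in the evaluation, and it uses exhaustive search as a stand-in for symbolic execution; it does not call KLEE or FuzzBALL.

```python
from itertools import product

def rle_decode(data: bytes) -> bytes:
    """Assumed RLE format: a sequence of (count, value) byte pairs."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)

def find_input(target: bytes, alphabet, max_len: int):
    """Exhaustive search for an input whose decoded output equals the target."""
    for length in range(0, max_len + 1, 2):
        for candidate in product(alphabet, repeat=length):
            if rle_decode(bytes(candidate)) == target:
                return bytes(candidate)
    return None

target = b"AAAB"
alphabet = [1, 2, 3, ord("A"), ord("B")]   # small alphabet keeps the search tiny
found = find_input(target, alphabet, max_len=4)
```

Even with five possible byte values, the search space grows as 5 to the power of the input length, which hints at why an unguided search does not scale.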

SLIDE 16

Limitations of SE

SLIDE 17

Limitations of SE

Vanilla symbolic execution does not scale!

SLIDE 18

Transformation-aware SE

Computations rely on input transformations

Focus on transformations to reduce complexity

  • Surjectivity guarantees that a pre-image exists
  • Sequentiality ensures output is never revoked
  • Streaming bounds the transformation state

Covered transformations include decryption, decompression, escape sequences, image or sound decoding
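The first two properties can be checked on a toy decoder. The sketch below again assumes a (count, value) pair RLE format; `trivial_preimage` and `is_sequential` are hypothetical helper names introduced for this example. The decoder also illustrates streaming in an informal sense, since it keeps no state beyond its position in the input.

```python
def rle_decode(data: bytes) -> bytes:
    """Assumed RLE format: a sequence of (count, value) byte pairs."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)

def trivial_preimage(target: bytes) -> bytes:
    """Surjectivity: any output has a pre-image, e.g. one run of length 1 per byte."""
    return b"".join(bytes([1, b]) for b in target)

def is_sequential(decode, data: bytes) -> bool:
    """Sequentiality: decoding any input prefix yields a prefix of the full
    output, so output that has been produced is never revoked."""
    full = decode(data)
    return all(full.startswith(decode(data[:n])) for n in range(len(data) + 1))

sample = bytes([3, ord("A"), 1, ord("B")])
print(rle_decode(trivial_preimage(b"AAAB")) == b"AAAB")
print(is_sequential(rle_decode, sample))
```

Sequentiality is what makes prefix-based pruning sound: once a partial input produces output that disagrees with the target, no extension of that input can repair it.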

SLIDE 19

Feedback-guided optimization (FGO)

Search pruning

  • if target “unreachable”

Search prioritization

  • look for short inputs that maximize size of output

Symbolic array accesses

  • treat choice of index like a branch (baseline)
  • combine all possible values into formula
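Search pruning and prioritization can be sketched on the toy RLE example. This is a plain best-first search over concrete inputs, not FuzzBALL's implementation, and the (count, value) pair format is an assumption. It does not cover the symbolic array access options.

```python
import heapq

def rle_decode(data: bytes) -> bytes:
    """Assumed RLE format: a sequence of (count, value) byte pairs."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)

def guided_search(target: bytes, alphabet, max_len: int):
    """Return (input, number of candidates explored) or (None, explored)."""
    # Heap entries: (-output length, input length, candidate). Candidates that
    # already produce more output are tried first; ties favor shorter inputs.
    heap = [(0, 0, b"")]
    explored = 0
    while heap:
        _, _, candidate = heapq.heappop(heap)
        explored += 1
        if rle_decode(candidate) == target:
            return candidate, explored
        if len(candidate) >= max_len:
            continue
        for count in alphabet:
            for value in alphabet:
                extended = candidate + bytes([count, value])
                out = rle_decode(extended)
                if not target.startswith(out):
                    continue   # pruning: the target is unreachable from here
                heapq.heappush(heap, (-len(out), len(extended), extended))
    return None, explored

alphabet = [1, 2, 3, ord("A"), ord("B")]
solution, explored = guided_search(b"AAAB", alphabet, max_len=4)
```

On this toy target the guided search pops only a handful of candidates from the queue, compared with enumerating hundreds of four-byte inputs without guidance.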
SLIDE 20

Evaluation setup

Simple transformation

  • RLE decoding
  • Output as target, SE produces input

Configurations

  • KLEE
  • FuzzBALL
  • FuzzBALL-FGO
SLIDE 21

FGO: 1 order of magnitude

SLIDE 22

Transformation-aware SE

Divide-and-conquer strategy for SE

  • HI-CFG captures transformations
  • Split SE on transformation boundaries
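The divide-and-conquer idea can be sketched by inverting a two-stage pipeline one stage at a time, working backwards from the target. The sketch assumes a hex decoder followed by a (count, value) pair RLE decoder, and it uses a prefix-pruned concrete search in place of symbolic execution; `invert_stage` is a hypothetical helper.

```python
from itertools import product

def hex_decode(data: bytes) -> bytes:
    usable = len(data) - len(data) % 2          # ignore a trailing odd digit
    return bytes.fromhex(data[:usable].decode("ascii"))

def rle_decode(data: bytes) -> bytes:
    """Assumed RLE format: a sequence of (count, value) byte pairs."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)

def invert_stage(decode, target: bytes, alphabet: bytes, chunk: int = 2):
    """Depth-first search for an input with decode(input) == target.
    Extensions must grow the output and keep it a prefix of the target."""
    stack = [b""]
    while stack:
        candidate = stack.pop()
        produced = decode(candidate)
        if produced == target:
            return candidate
        for ext in product(alphabet, repeat=chunk):
            extended = candidate + bytes(ext)
            out = decode(extended)
            if len(out) > len(produced) and target.startswith(out):
                stack.append(extended)
    return None

target = b"AAAB"
# Solve the last transformation first, then use its input as the next target.
buf1 = invert_stage(rle_decode, target, bytes([1, 2, 3, ord("A"), ord("B")]))
poc = invert_stage(hex_decode, buf1, b"0123456789abcdef")
```

Each stage is solved against its own small target, so the cost is roughly the sum of the per-stage searches rather than a search over the composed transformation.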
SLIDE 23

Evaluation setup

Two transformations

  • HEX decoding
  • RLE decoding

Different configurations:

  • KLEE/FuzzBALL
  • FuzzBALL-FGO
  • FuzzBALL-HI-CFG (includes FGO)
SLIDE 24

Transformation-aware SE: another order of magnitude
SLIDE 25

Poppler Case Study

Poppler PDF viewer

  • Type 1 font parsing vulnerability CVE-2010-3704

HI-CFG construction using benign document that loads a font

  • PDF generated by pdftex using a small tex file
SLIDE 26

Poppler Phases

I/O → Flate decode → Read Font → Parse Font

SLIDE 27

Poppler Buffers

Buffers (kind, address, size):

  • space bf792000 4096
  • alloc 828b420 312
  • alloc 829f008 34104
  • alloc 82b7550 9887

Code labels shown in the figure:

  • memcpy
  • GfxFont::readEmbedFontFile(Xref*, int*)
  • FlateStream::getHuffmanCodeWord(FlateHuffmanTab*)
  • FoFiType1::parse()
  • (implicit)

SLIDE 28

Poppler Buffers


Automatically produces compressed exploit

SLIDE 29

Outline

  • Motivation
  • Attack Polymorphism
  • Dynamic HI-CFG Construction
  • Evaluation
  • Related Work
  • Conclusion

SLIDE 30

Related Work

HOWARD (Slowinska et al., NDSS’11, ATC’12): Type and data structure inference from binaries

  • HI-CFG looks at code & relationships between code and data (not just data structures)

AEG (Avgerinos et al., NDSS’11) and MAYHEM (Cha et al., Oakland’12): SE-based attack input generation

  • HI-CFG enables a focus on iterative and scalable SE (rather than on coverage)

SLIDE 31

Outline

  • Motivation
  • Attack Polymorphism
  • Dynamic HI-CFG Construction
  • Evaluation
  • Related Work
  • Conclusion

SLIDE 32

Conclusion

Presented HI-CFG as a new data structure

  • Construction from binary execution traces

HI-CFG enables

  • Deep program analysis
  • Recover components from binaries
  • Guide SE along probable paths

FuzzBALL symbolic execution engine:

  • http://github.com/bitblaze-fuzzball/fuzzball