WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats


SLIDE 1

WEIZZ: Automatic Grey-Box Fuzzing for Structured Binary Formats

Andrea Fioraldi, Daniele Cono D’Elia and Emilio Coppa

@andreafioraldi andreafioraldi@gmail.com

SLIDE 2

Format-aware Fuzzing

Input Format Model → Input Generation → Program Under Test → Crashes

SLIDE 3

Format-aware Fuzzing

  • LangFuzz
  • Peach
  • Spike
  • CSmith
  • ...

SLIDE 8

Problems

  • Impossible if the input structure is unknown
  • May fail to find bugs related to syntactically invalid inputs in parsers
  • Parser implementations do not always closely mirror format specifications
  • Models take time for a human to write (and contain simplifications)
  • Wrong models make fuzzing ineffective

SLIDE 11

Solutions?

  • Automatically learn the model from the actual implementation of the parser
      ○ (Approximation of) Taint Tracking: [Tupni] [Autogram] [Polyglot] [Grimoire]
      ○ Machine Learning: [Learn&Fuzz] [REINAM]
      ○ Oracle based: [GLADE]
  • Generate inputs that are not always syntactically valid

SLIDE 12

Coverage-guided Fuzzing

Corpus → Input Mutation → Program Under Test → Crashes, Coverage (coverage feeds back into the Corpus)

SLIDE 14

Problems

  • Fail to explore deep paths behind parsers
  • Affected by roadblocks (multi-byte comparisons, checksums, hashes, …)

if (hash(input[0:8]) != input[8:12]) exit(1)
if (input[12:16] == 0xABADCAFE) bug()
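The two roadblocks in the pseudocode above can be reproduced in a small toy target. This is only a sketch: `zlib.crc32` stands in for the unspecified `hash`, and the names `target` and `craft` and the little-endian field layout are assumptions made for illustration.

```python
import struct
import zlib

MAGIC = 0xABADCAFE  # magic value taken from the slide's pseudocode


def target(data: bytes) -> str:
    """Toy program under test guarded by a checksum and a magic comparison."""
    if len(data) < 16:
        return "too short"
    stored = struct.unpack_from("<I", data, 8)[0]
    # Roadblock 1: checksum over the first 8 bytes (crc32 is an assumed stand-in).
    if zlib.crc32(data[0:8]) != stored:
        return "bad checksum"
    # Roadblock 2: 4-byte magic comparison.
    if struct.unpack_from("<I", data, 12)[0] == MAGIC:
        return "bug"
    return "ok"


def craft(payload: bytes) -> bytes:
    """Build an input that passes both roadblocks for an 8-byte payload."""
    assert len(payload) == 8
    return payload + struct.pack("<I", zlib.crc32(payload)) + struct.pack("<I", MAGIC)
```

A blind byte-level mutation has to guess 8 bytes correctly at once (checksum plus magic) to reach `bug`, which is why these checks stall a plain coverage-guided fuzzer.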

SLIDE 15

Structured Fuzzing

Corpus + Input Format Model → Input Mutation → Program Under Test → Crashes, Coverage (coverage feeds back into the Corpus)

SLIDE 16

Structured Fuzzing

  • AFLSmart
  • Nautilus
  • Superion
  • Libprotobuf-Mutator
  • Zest
  • ...

SLIDE 17

Bypass Roadblocks

  • Concolic Fuzzing
      ○ [Driller] [QSYM] [Eclipser]
  • (Approximation of) Taint Tracking
      ○ [TaintScope] [Vuzzer] [Angora] [Redqueen]
  • Sensitive feedback
      ○ [LAF-Intel] [CompareCoverage] [FuzzFactory] [IJON]

SLIDE 19

Idea #1

  • Reuse the expensive analyses that past works introduced to bypass roadblocks to also enable structure-aware mutations

SLIDE 21

Bypass Roadblocks [Redqueen]

  • Mutations targeting magic byte comparisons (Input-To-State)

input: AAAABBBBCCCCBBBB
cmp eax, FFFF → eax = BBBB

SLIDE 23

Bypass Roadblocks [Redqueen]

  • Mutations targeting magic byte comparisons (Input-To-State)

input: AAAABBBBDDCCDDCC (equivalent in coverage)
cmp eax, FFFF → eax = BBBB
new input: AAAAFFFFDDCCDDCC
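The Input-To-State replacement in this example can be sketched as a pattern substitution: wherever the observed comparison operand appears verbatim in the input, try the value it was compared against. The function name `input_to_state` is hypothetical, and the sketch ignores encodings (endianness, ASCII numbers) that a real implementation would also have to try.

```python
def input_to_state(data: bytes, observed: bytes, expected: bytes) -> list[bytes]:
    """Return one candidate input per occurrence of `observed`, patched with `expected`."""
    assert len(observed) == len(expected)
    candidates = []
    start = data.find(observed)
    while start != -1:
        # Replace only this occurrence, keeping the rest of the input intact.
        candidates.append(data[:start] + expected + data[start + len(observed):])
        start = data.find(observed, start + 1)
    return candidates
```

On the original input `AAAABBBBCCCCBBBB` the operand `BBBB` occurs twice, so two candidates are produced; on the coverage-equivalent input `AAAABBBBDDCCDDCC` it occurs once, which leaves a single candidate to test.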

SLIDE 24

Bypass Roadblocks [Redqueen]

  • Mutations targeting magic byte comparisons (Input-To-State)
  • Patch out checksum checks

SLIDE 25

Formats as an AST [Grimoire]

[Figure: example AST with operator nodes =, +, / and leaves 12, 3, 5]

SLIDE 26

Not all formats are parsed into an AST

SLIDE 27

Comparisons for validation

if (chunk->size_field > SIZE_MAX) error("Invalid Chunk Size");

SLIDE 28

Idea #2

  • Instead of using memory accesses to reconstruct the format ([Tupni] [Autogram]), use the comparison instructions that are likely validation checks

SLIDE 29

Idea #3

  • Don’t learn a model and then use it to guide the fuzzer; instead, reconstruct the structure each time and apply mutations. This avoids the problem of errors introduced by the learning process.

SLIDE 30

Weizz

  • Based on AFL 2.52b
  • Binary-only (QEMU)
  • Approximate taint tracking to bypass roadblocks and learn information about validation checks
  • Structural mutations based on that information (inspired by [AFLSmart])

SLIDE 31

Architecture

[Figure: architecture diagram, shown across slides 31 to 33]

SLIDE 35

GetDeps: Approximating Taint Tracking

Input: AAAABBBBCCCCDDDD
cmp eax, FFFF → eax = AAAA
Bitflip #1: BAAABBBBCCCCDDDD
cmp eax, FFFF → eax = BAAA
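The idea behind this example can be sketched as follows: flip bits in the input one at a time, re-run the target, and record which comparison operands change. This is a simplified illustration, not the Weizz implementation. The `run` callback (returning the logged operand of each comparison, in order) and `toy_run` are assumptions standing in for an instrumented execution.

```python
from typing import Callable


def get_deps(data: bytes, run: Callable[[bytes], list[bytes]]) -> dict[int, set[int]]:
    """Map each comparison index to the input byte offsets that influence its operand."""
    baseline = run(data)
    deps: dict[int, set[int]] = {i: set() for i in range(len(baseline))}
    for offset in range(len(data)):
        for bit in range(8):
            mutated = bytearray(data)
            mutated[offset] ^= 1 << bit  # flip a single bit
            observed = run(bytes(mutated))
            for i, (before, after) in enumerate(zip(baseline, observed)):
                if before != after:
                    deps[i].add(offset)  # operand changed, so this byte matters
    return deps


def toy_run(data: bytes) -> list[bytes]:
    """Hypothetical target with one comparison whose operand is the first 4 input bytes."""
    return [data[0:4]]
```

With `toy_run`, `get_deps(b"AAAABBBBCCCCDDDD", toy_run)` attributes the single comparison to offsets 0 through 3, matching the slide where flipping a bit in the first byte changes `eax`.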

SLIDE 36

Detect Checksum Checks

  • One operand is I2S
  • The other operand is not I2S and GetDeps revealed dependencies on some input bytes
  • The sets of their byte dependencies are disjoint
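The three conditions above can be written as a small predicate. This is a sketch under assumptions: the `Operand` record and the function name are hypothetical, and `deps` is taken to be the set of input byte offsets reported for that operand by a GetDeps-style analysis.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    is_i2s: bool                                  # operand is copied directly from the input
    deps: set[int] = field(default_factory=set)   # input byte offsets it depends on


def looks_like_checksum(a: Operand, b: Operand) -> bool:
    """Apply the three slide conditions to the two operands of one comparison."""
    if a.is_i2s == b.is_i2s:
        return False  # exactly one operand must be I2S
    i2s, other = (a, b) if a.is_i2s else (b, a)
    # The non-I2S operand must depend on some input bytes, disjoint from the I2S ones.
    return bool(other.deps) and i2s.deps.isdisjoint(other.deps)
```

A stored checksum at offsets 8 to 11 compared against a value computed from offsets 0 to 7 satisfies the predicate; a magic-number comparison against a constant does not, because the constant has no input dependencies.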

SLIDE 37

Input Tags

  • Comparison ID
  • Timestamp
  • Parent ID
  • Number of tags with the same ID
  • The Comparison ID of the inner checksum that guards this byte
  • Flags (which CMP operand, if this is a checksum field, …)

SLIDE 40

Many Comparisons affected by the same byte

1. Prioritize checksum fields
2. Prioritize comparisons that appeared earlier in time (possible validation checks)
3. Prioritize comparisons influenced by fewer bytes
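One way to read these rules is as a lexicographic ordering over the candidate tags for a byte. The sketch below assumes that reading, which the slides do not state explicitly. The `Tag` record keeps only a subset of the tag fields, its field names are hypothetical, and `same_id_count` is assumed to approximate how many bytes influence the comparison.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    cmp_id: int          # Comparison ID
    timestamp: int       # when the comparison was first observed
    parent_id: int       # Parent ID
    same_id_count: int   # number of tags with the same ID (assumed proxy for rule 3)
    is_checksum: bool    # flag: this byte belongs to a checksum field


def priority_key(tag: Tag) -> tuple:
    """Smaller keys win: checksum first, then earlier timestamp, then fewer bytes."""
    return (not tag.is_checksum, tag.timestamp, tag.same_id_count)


def pick_tag(candidates: list[Tag]) -> Tag:
    """Choose which comparison gets to tag a byte that affects several of them."""
    return min(candidates, key=priority_key)
```

Encoding the rules as a tuple keeps the ordering easy to change if the priorities should be weighted differently.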

SLIDE 41

Fixing Checksum

  • Late-stage repair
  • Topological Sort (Tags have the info for this)
  • Unpatch false positives
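The role of the topological sort can be illustrated with nested checksums: a checksum whose covered range contains another checksum field has to be recomputed after that inner field is fixed. The sketch below is an illustration under assumptions, not the Weizz repair routine: `zlib.crc32` is a stand-in checksum, fields are assumed to be 4 bytes little-endian, `ChecksumField` is a hypothetical record, and the dependency is derived from byte ranges rather than from tags.

```python
import struct
import zlib
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class ChecksumField:
    field_off: int  # offset of the stored 4-byte checksum
    start: int      # first covered byte
    end: int        # one past the last covered byte


def repair_checksums(data: bytes, fields: list[ChecksumField]) -> bytes:
    """Recompute every checksum, inner ones before the outer ones that cover them."""
    buf = bytearray(data)
    # graph[i] holds the fields that must be repaired before field i.
    graph = {
        i: {j for j, inner in enumerate(fields)
            if j != i and outer.start <= inner.field_off < outer.end}
        for i, outer in enumerate(fields)
    }
    for i in TopologicalSorter(graph).static_order():
        f = fields[i]
        struct.pack_into("<I", buf, f.field_off, zlib.crc32(bytes(buf[f.start:f.end])))
    return bytes(buf)
```

Repairing in the wrong order would leave the outer checksum computed over a stale inner value, which is the situation the ordering is meant to avoid.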

SLIDE 42

Locating Fields

SLIDE 48

Locating Chunks

1. Pick a tag type
2. Recurse if next Timestamp (ts) > current
3. Go forward if next ID = current Parent
4. With some probability, take the untagged part and recurse again

struct { int type; int x, y; int cksm; };

SLIDE 49

Mutating Chunks [AFLSmart]

  • Addition
  • Deletion
  • Splicing

SLIDE 50

Mutating Chunks [Weizz]

  • Addition
      ○ Select a chunk A and add, before or after A, a chunk from another input in the queue whose first tag has the same parent ID as the first tag of A

[Figure: current input with chunk A, other input with chunk B, generated input with both A and B]

SLIDE 51

Mutating Chunks [Weizz]

  • Deletion
      ○ Select a chunk and remove it

[Figure: current input with chunk A, generated input without A]

SLIDE 52

Mutating Chunks [Weizz]

  • Splicing
      ○ Select a chunk A and replace it with a chunk from another input in the queue that has the same comparison ID in its first tag

[Figure: current input with chunk A, other input with chunk B, generated input with B in place of A]
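At the byte level, the three chunk mutations reduce to slicing and concatenation once chunk boundaries are known. The sketch below assumes chunks are given as half-open `(start, end)` spans and that the caller has already checked tag compatibility (same parent ID for addition, same comparison ID for splicing); the function names are hypothetical.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open byte range [start, end) of a located chunk


def add_chunk(data: bytes, a: Span, donor: bytes, b: Span, before: bool = False) -> bytes:
    """Insert donor chunk B right before or right after chunk A."""
    pos = a[0] if before else a[1]
    return data[:pos] + donor[b[0]:b[1]] + data[pos:]


def delete_chunk(data: bytes, a: Span) -> bytes:
    """Remove chunk A from the input."""
    return data[:a[0]] + data[a[1]:]


def splice_chunk(data: bytes, a: Span, donor: bytes, b: Span) -> bytes:
    """Replace chunk A with donor chunk B."""
    return data[:a[0]] + donor[b[0]:b[1]] + data[a[1]:]
```

For example, with current input `hdrAAAAend` (A at span 3 to 7) and other input `xxBBBByy` (B at span 2 to 6), addition after A gives `hdrAAAABBBBend`, deletion gives `hdrend`, and splicing gives `hdrBBBBend`.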

SLIDE 53

Evaluation

1. Comparison with popular fuzzers over chunk-oriented programs
2. New bugs found by Weizz
3. Role of structural mutations and roadblock bypassing?

SLIDE 54

Evaluation

SLIDE 55

Evaluation (60% conf. intervals)

SLIDE 57

Evaluation

[Figure legend: w/o I2S]

SLIDE 58

Evaluation (60% conf. intervals)

SLIDE 60

Bugs

SLIDE 61

Evaluation

[Figure legend: w/o struct. mut., w/o I2S]

SLIDE 62

Future Directions

  • Taint Tracking for large inputs
  • More chunk location heuristics
      ○ Exclude types of tags as starting point for a chunk
      ○ Apply traditional file-format reverse engineering algorithms based on memory accesses to tags
  • Port to other OSes

SLIDE 63

Thank You

https://github.com/andreafioraldi/weizz-fuzzer