Coverage-guided Fuzzing of Individual Functions Without Source Code

Alessandro Di Federico, Politecnico di Milano, October 25, 2018


SLIDE 1

Coverage-guided Fuzzing of Individual Functions Without Source Code

Alessandro Di Federico

Politecnico di Milano

October 25, 2018

SLIDE 2

Index

  • Coverage-guided fuzzing
  • An overview of rev.ng
  • Experimental results

SLIDE 4

Fuzzing

  1. Generate a lot of different inputs
  2. Feed them to a program
  3. Wait for it to reach an invalid state
  4. Collect a report for the analyst
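The four steps above can be sketched as a small in-process loop. This is only an illustration: `toy_target`, `struct report`, `fuzz` and the xorshift generator are hypothetical names introduced here, and the "invalid state" is simulated by a return value instead of a real crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INPUT_SIZE 4

/* Hypothetical target: it reaches an "invalid state" (returns nonzero)
   when the first input byte is 0x42. */
int toy_target(const uint8_t *data, size_t size) {
    return size > 0 && data[0] == 0x42;
}

/* What the analyst gets back: whether a failing input was found,
   how many inputs were executed, and the failing input itself. */
struct report {
    int found;
    unsigned long execs;
    uint8_t input[INPUT_SIZE];
};

/* xorshift32: a tiny deterministic generator, so runs are reproducible. */
uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

struct report fuzz(uint32_t seed, unsigned long max_execs) {
    assert(max_execs > 0);
    struct report r;
    memset(&r, 0, sizeof r);
    uint32_t state = seed ? seed : 1; /* xorshift state must be nonzero */

    while (r.execs < max_execs) {
        uint8_t input[INPUT_SIZE];
        for (size_t i = 0; i < INPUT_SIZE; i++)   /* 1. generate an input */
            input[i] = (uint8_t)(next_random(&state) >> 24);
        r.execs++;
        if (toy_target(input, INPUT_SIZE)) {      /* 2. feed it, 3. check */
            r.found = 1;                          /* 4. collect a report */
            memcpy(r.input, input, INPUT_SIZE);
            break;
        }
    }
    return r;
}
```

A caller would run something like `fuzz(1234, 100000)` and inspect `found` and `input` in the returned report. Nothing here is guided by coverage yet: every input is generated blindly.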

SLIDE 5

Features

Pros:

  • Easy to set up
  • It can find subtle bugs

Cons:

  • It might require a large amount of resources
  • Semi-decidable: it can show that a bug exists, never that none is left

SLIDE 7

A huge leap forward

Coverage-guided fuzzing

Favor inputs that lead to covering new code paths

SLIDE 8

A huge leap forward

int main() {
  if (A && B) {
    crash();
  } else {
    all_good();
  }
}

SLIDE 9

The Control-flow Graph

[Figure: control-flow graph with the nodes A and B]

SLIDE 10

First run

Input: 0000 0000 0000 0000

[Figure: control-flow graph with the nodes A and B]

SLIDE 14

Second run

Input: 0000 0000 0000 0001

[Figure: control-flow graph with the nodes A and B]

SLIDE 18

This input is not interesting!

SLIDE 19

Third run

Input: 0001 0000 0000 0000

[Figure: control-flow graph with the nodes A and B]

SLIDE 24

This input is interesting! It led us to discover a new basic block
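The "interesting or not" decision in this walk-through can be sketched as a comparison between the blocks covered by one run and the blocks seen so far. The map size, the block numbering and the name `merge_coverage` are assumptions made for this example, not something taken from afl or libfuzzer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 64 /* hypothetical number of basic-block slots */

/* Merges the coverage of one run into the global map.
   Returns 1 if the run reached a block that was never seen before
   (the input is interesting), 0 otherwise. */
int merge_coverage(uint8_t *seen, const uint8_t *run) {
    assert(seen != NULL && run != NULL);
    int interesting = 0;
    for (size_t i = 0; i < MAP_SIZE; i++) {
        if (run[i] && !seen[i]) {
            seen[i] = 1;
            interesting = 1;
        }
    }
    return interesting;
}
```

With this scheme, a run that repeats an already known path returns 0 and its input is dropped, while a run that reaches one additional block returns 1 and its input is kept as a starting point for further mutations.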

SLIDE 25

Fourth run

Input: 0011 0000 0000 0000

[Figure: control-flow graph with the nodes A and B]

SLIDE 30

american fuzzy lop

  • It made coverage-guided fuzzing popular
  • Developed by lcamtuf
  • Performs instrumentation to detect executed basic blocks
  • Two key modes of operation:
      • Source mode
      • Binary mode

SLIDE 32

Source mode

Instrumentation is performed at compiler level

int main() {
  record(1);
  if (A && B) {
    record(2);
    crash();
  } else {
    record(3);
    all_good();
  }
  record(4);
}
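One possible body for `record` follows the scheme described in afl's technical notes: a 64 KiB map indexed by the current location XORed with the (shifted) previous one, so that transitions between blocks are counted and not just blocks. The names `coverage_map`, `prev_location` and `reset_coverage` are chosen for this sketch; real afl injects equivalent code at compile time and keeps the map in shared memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 65536 /* 64 KiB, the map size afl's notes describe */

uint8_t coverage_map[MAP_SIZE];
static uint32_t prev_location;

/* Called at the start of every instrumented basic block. */
void record(uint32_t cur_location) {
    assert(cur_location < MAP_SIZE);
    /* Count the edge prev -> cur rather than the block alone. */
    coverage_map[(cur_location ^ prev_location) % MAP_SIZE]++;
    /* The shift keeps A -> B distinct from B -> A. */
    prev_location = cur_location >> 1;
}

/* Clears the map before the next input is executed. */
void reset_coverage(void) {
    memset(coverage_map, 0, sizeof coverage_map);
    prev_location = 0;
}
```

The slide uses small consecutive identifiers (1 to 4) for readability; afl assigns random identifiers at compile time to spread the edges over the map.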

SLIDE 34

Binary mode

An emulator is employed to detect executed basic blocks

  • QEMU is the chosen emulator
  • It incurs a significant slowdown

SLIDE 35

libfuzzer

  • Alternative to afl
  • It requires the source code to be available
  • Based on LLVM

SLIDE 36

What’s LLVM?

LLVM is a compiler framework

Famous for its C/C++ frontend (clang) and its intermediate representation (the LLVM IR)

SLIDE 37

libfuzzer can be a lot faster

It doesn’t fork

int main() {
  while (true) {
    char *new_input = random_input();
    target(new_input);
  }
}
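For contrast with the in-process loop, here is a sketch of what running every input in its own process looks like on a POSIX system, which is roughly the cost afl pays even with its fork server. `run_forked` and `aborting_target` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*target_fn)(const uint8_t *data, size_t size);

/* Hypothetical target: aborts on inputs that start with '!'. */
void aborting_target(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == '!')
        abort();
}

/* Runs one input in a child process.
   Returns 1 if the child was killed by a signal (a crash),
   0 if it exited normally, -1 if fork or waitpid failed. */
int run_forked(target_fn target, const uint8_t *data, size_t size) {
    assert(target != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        target(data, size);
        _exit(0); /* the child never returns to the caller's loop */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

Every call pays for a `fork` and a `waitpid`, while the loop on the slide only pays for a function call. The price of staying in-process is that a crash in the target also takes the fuzzer down, and state left over from one input can leak into the next.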

SLIDE 38

Index

  • Coverage-guided fuzzing
  • An overview of rev.ng
  • Experimental results

SLIDE 40

What is rev.ng?

rev.ng is a unified framework for binary analysis based on QEMU and LLVM

Everything you’ll see here is architecture-agnostic

SLIDE 41

How does QEMU work?

SLIDE 42

A dynamic binary translator

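[Diagram: frontends for the guest architectures (AArch64, ARM, Alpha, CRIS, Unicore, SPARC, SPARC64, SuperH, SystemZ, PowerPC, PowerPC64, XCore, MIPS, MIPS64, OpenRISC, MicroBlaze, x86-64, x86, RISC V) produce QEMU IR, which the backends turn into host code (AArch64, ARM, x86, x86-64, MIPS, PowerPC, SystemZ, SPARC, TCI)]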

SLIDE 43

The frontend is a lifter

[Diagram: the same frontends, QEMU IR and backends as on the previous slide]

SLIDE 45

QEMU translates at run-time

rev.ng translates offline

SLIDE 46

rev.ng: a static binary translator

Input: md5sum.arm

  1. Collect entry points
  2. Lift to QEMU IR
  3. Translate to LLVM IR
  4. Collect new entry points
  5. Link runtime functions

Output: md5sum.x86-64
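The "collect entry points, translate, collect new entry points" part of this pipeline reads like a worklist loop. The sketch below is a toy model of that reading only: the "binary" is an array in which each address has at most one direct jump target, and marking an address stands in for lifting and translating it. All names are hypothetical, and rev.ng's real pipeline operates on machine code and QEMU IR.

```c
#include <assert.h>
#include <stddef.h>

#define CODE_SIZE 8
#define NO_TARGET (-1)

/* jump_target[a] is the address that the code at address a jumps to,
   or NO_TARGET. Marks in `translated` every address reachable from
   `entry` and returns how many addresses were translated. */
size_t discover(const int *jump_target, int entry, int *translated) {
    assert(entry >= 0 && entry < CODE_SIZE);
    int worklist[CODE_SIZE];
    size_t pending = 0, count = 0;

    worklist[pending++] = entry;           /* collect entry points */
    while (pending > 0) {
        int addr = worklist[--pending];
        if (translated[addr])
            continue;
        translated[addr] = 1;              /* stand-in for lift + translate */
        count++;

        int target = jump_target[addr];    /* collect new entry points */
        assert(target >= NO_TARGET && target < CODE_SIZE);
        if (target != NO_TARGET && !translated[target])
            worklist[pending++] = target;
    }
    return count;
}
```

The loop stops once translating code no longer reveals unseen jump targets. Only then would the result be linked against the runtime functions.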

SLIDE 47

[Diagram: QEMU IR together with the architectures Alpha, ARM, AArch64, RISC V, Hexagon, x86, x86-64, MicroBlaze, OpenRISC, MIPS64, MIPS, XCore, PowerPC64, PowerPC, SystemZ, SuperH, SPARC, SPARC64, Unicore, CRIS]

SLIDE 48

[Diagram: LLVM IR together with the same architectures as on slide 47]

SLIDE 49

[Diagram: rev.ng together with the same architectures as on slide 47]

SLIDE 52

We produce LLVM IR

We can employ libfuzzer directly

SLIDE 54

Steps

  1. Lift the program to LLVM IR
  2. Identify all the functions
  3. Identify a function to fuzz (MANUAL)
  4. Create the fuzzing function (MANUAL)
  5. Compile fuzzing function
  6. Instrument using libfuzzer
  7. Launch the fuzzer

SLIDE 55

Index

  • Coverage-guided fuzzing
  • An overview of rev.ng
  • Experimental results

SLIDE 57

We are significantly faster than QEMU

  1. The LLVM optimizer has a wider view of the code
  2. The translation is performed offline

SLIDE 58

Runtime (seconds)

[Bar charts comparing Native, QEMU and rev.ng. First panel (axis up to 4000 s): 458.sjeng, 464.h264ref, 400.perlbench, 471.omnetpp, 462.libquantum, 473.astar. Second panel (axis up to 2000 s): 401.bzip2, 483.xalancbmk, 429.mcf, 403.gcc, 445.gobmk, 456.hmmer.]

SLIDE 59

On average, 68% faster than QEMU

SLIDE 61

A practical case study

We want to fuzz the PCRE library

Not directly, but embedded in another program (less)

SLIDE 62

Steps (again)

  1. Lift the program to LLVM IR
  2. Identify all the functions
  3. Identify a function to fuzz
  4. Create the fuzzing function
  5. Compile fuzzing function
  6. Instrument using libfuzzer
  7. Launch the fuzzer

SLIDE 64

Fuzzing function (simplified)

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  char input_string[] = "Test string!";
  void *compiled_re;

  /* Simplified calls: the fuzzer-provided bytes are used as the pattern */
  compiled_re = pcre_compile(data);
  pcre_exec(compiled_re, input_string, strlen(input_string));
  pcre_free(compiled_re);
  return 0;
}
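The simplified function glosses over two details a working harness usually has to handle: libfuzzer passes raw bytes without a terminator, and the compile step can reject a pattern. The sketch below shows one way to deal with both. `stub_compile` and `stub_exec` are hypothetical stand-ins defined here so the example is self-contained; they are not the PCRE API, whose real functions take more arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pattern compiler: it expects a
   NUL-terminated string and returns NULL for patterns it rejects
   (here, only the empty pattern). */
void *stub_compile(const char *pattern) {
    return pattern[0] != '\0' ? malloc(1) : NULL;
}

/* Hypothetical stand-in for the matcher; it just returns the length. */
int stub_exec(const void *compiled, const char *subject, size_t length) {
    assert(compiled != NULL && subject != NULL);
    return (int)length;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    static const char input_string[] = "Test string!";

    /* Copy the input and add the terminator the compiler expects. */
    char *pattern = malloc(size + 1);
    if (pattern == NULL)
        return 0;
    if (size > 0)
        memcpy(pattern, data, size);
    pattern[size] = '\0';

    void *compiled_re = stub_compile(pattern);
    if (compiled_re != NULL) { /* skip matching for rejected patterns */
        stub_exec(compiled_re, input_string, strlen(input_string));
        free(compiled_re);
    }
    free(pattern);
    return 0;
}
```

Freeing everything before returning matters in this setting: the function is called over and over inside one process, so a leak per input would add up quickly.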

SLIDE 65

We were able to find a known vulnerability in PCRE

SLIDE 66

Comparing with afl

Are we faster than afl?

  • afl fuzzed PCRE directly (without less)
  • afl was used in black-box mode

SLIDE 67

Performance

          Execs per second                   Total execs
          1 min      10 min     60 min       60 min
afl       3 582      3 495      3 682        13 187 295
rev.ng    150 617    79 701     78 306       271 217 728
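To put these figures in proportion, the ratio between the two rows can be computed directly. The helper name `speedup` is introduced only for this worked example.

```c
#include <assert.h>

/* Ratio between the rev.ng figure and the afl figure of one column. */
double speedup(double revng, double afl) {
    assert(afl > 0.0);
    return revng / afl;
}
```

With the values reported above, `speedup(150617, 3582)` is about 42 after one minute, `speedup(79701, 3495)` about 22.8 after ten minutes, `speedup(78306, 3682)` about 21.3 after sixty minutes, and `speedup(271217728, 13187295)` about 20.6 for the total number of executions.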

SLIDE 68

Summary

  • We do not require the source code
  • We can fuzz any entry point
  • We are significantly faster than existing techniques

SLIDE 69

Future work

  • Improve performance
  • Perform symbolic execution (through KLEE)

SLIDE 70

Future work

Backup slides

SLIDE 71

Very effective!

SLIDE 72

License

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
