NEUZZ: Efficient Fuzzing with Neural Program Smoothing



SLIDE 1

NEUZZ: Efficient Fuzzing with Neural Program Smoothing

Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana Columbia University

SLIDE 2

Fuzzing: a popular way to uncover bugs

[Liang et al. 2019]

SLIDE 3

Evolutionary Fuzzing

Advantage: easy to implement
Disadvantage: inefficient

  • Random mutations are not effective
  • Often get stuck in a long sequence of wasteful mutations

Hard to find scalable and adaptive heuristics for guided mutation

[Figure: mutation tree from a seed to its children and grandchildren]
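The seed, children, grandchildren picture above can be sketched as a small coverage-guided loop. This is a minimal illustration, not the implementation of any particular fuzzer: `toy_program`, its three nested magic-byte checks, and the single-byte `mutate` operator are all hypothetical stand-ins chosen to show why purely random mutation tends to waste executions.

```python
import random


def toy_program(data: bytes) -> set:
    """Hypothetical target: returns the set of edge names the input reaches."""
    edges = {"entry"}
    if len(data) >= 1 and data[0] == ord("F"):
        edges.add("magic0")
        if len(data) >= 2 and data[1] == ord("U"):
            edges.add("magic1")
            if len(data) >= 3 and data[2] == ord("Z"):
                edges.add("magic2")
    return edges


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Random mutation: overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def evolutionary_fuzz(seed: bytes, iterations: int, rng_seed: int = 0):
    """Keep a child in the corpus only if it reaches an edge not seen before."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    seen = set(toy_program(seed))
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = mutate(parent, rng)
        new_edges = toy_program(child) - seen
        if new_edges:
            corpus.append(child)
            seen |= new_edges
    return corpus, seen
```

Calling `evolutionary_fuzz(b"AAAA", 200)` returns the surviving corpus and the edges it covers. Each nested check needs one specific byte value at one specific position, so most random children reach nothing new and are discarded.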

SLIDE 4

A new approach to fuzzing

SLIDE 5

Fuzzing: An Optimization Problem

x ∈ X: a program input
F(x): # of bugs found by input x
C(X): generate K inputs from input space X

Find C(X) that can maximize total no. of bugs:

$$\text{Maximize} \sum_{x \in C(X)} F(x)$$

F(x) is discrete and hard to optimize

SLIDE 6

Fuzzing: An Optimization Problem

[Figure: plot of F(x), the # of bugs, against input x, with inputs x1 and x2 marked]

Hard to find inputs like x1 and x2 among flat plateaus

SLIDE 7

Fuzzing: An Optimization Problem

x ∈ X: a program input
G(x): edge coverage of input x
C(X): generate K inputs from input space X

Find C(X) that can maximize total number of edges:

$$\text{Maximize} \sum_{x \in C(X)} G(x)$$
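As a concrete reading of the objective above, the sketch below evaluates the sum of G(x) over a chosen set of K inputs. Everything here is an assumption made for illustration: `edge_coverage` is a hypothetical toy G(x) with two nested byte checks, and the candidate pool is tiny enough to rank exhaustively, which a real input space never is.

```python
def edge_coverage(data: bytes) -> int:
    """Hypothetical G(x): number of edges a toy program exercises on input x."""
    edges = 1  # the entry edge is always taken
    if data[:1] == b"F":
        edges += 1
        if data[1:2] == b"U":
            edges += 1
    return edges


def objective(chosen) -> int:
    """Sum of G(x) over the chosen inputs, as in the slide's formulation."""
    return sum(edge_coverage(x) for x in chosen)


def best_k(candidates, k):
    """For a plain sum, the best K inputs are simply the K with the largest G(x)."""
    return sorted(candidates, key=edge_coverage, reverse=True)[:k]
```

With candidates `[b"AA", b"FA", b"FU", b"ZZ"]` and K = 2, `best_k` selects `b"FU"` and `b"FA"`, for an objective value of 5. Note that `b"AA"` and `b"ZZ"` both score 1: G(x) is flat almost everywhere, which is the plateau problem the following slides address.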

SLIDE 8

Fuzzing: An Optimization Problem

[Figure: plot of G(x), the # of edges, against input x]

SLIDE 9

Evolutionary optimization

[Figure: plot of G(x), the # of edges, against input x, with mutation steps 1 to 5 marked]

Random mutation is not efficient

SLIDE 10

Gradient-guided Optimization

Smooth Approximation + Gradient-guided Mutation

G(x): # of edges
H(x): smooth approximation of G(x)

[Figure: plot of G(x) and H(x) against input x]

SLIDE 11

Gradient-guided Optimization

Smooth Approximation + Gradient-guided Mutation

H(x): smooth approximation of G(x)

[Figure: plot of G(x) and H(x) against input x, with gradient-guided steps 1 to 5 marked]

SLIDE 12

Smooth Approximation

Problem: How to smoothly approximate G(x)?

NEUZZ Solution: Use a NN to learn a smooth H(x)

Universal Approximation Theorem: A NN can approximate any continuous function on a compact domain to arbitrary accuracy
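To make the idea of a smooth H(x) concrete, here is a minimal sketch in plain Python. It is not the NEUZZ model: the target `g` is a hypothetical one-dimensional step function standing in for G(x), and `Surrogate` is a tiny one-hidden-layer network (tanh hidden units, sigmoid output) trained with per-sample gradient descent on squared error. The layer size, learning rate, and epoch count are arbitrary choices for the example.

```python
import math
import random


def g(x: float) -> float:
    """Hypothetical discontinuous target standing in for G(x): a single step."""
    return 1.0 if x > 0.5 else 0.0


class Surrogate:
    """Tiny one-hidden-layer network H(x) = sigmoid(w2 . tanh(w1 * x + b1) + b2)."""

    def __init__(self, hidden: int = 8, seed: int = 0):
        rng = random.Random(seed)
        self.w1 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        self.b2 = 0.0

    def hidden(self, x: float):
        return [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]

    def __call__(self, x: float) -> float:
        z = sum(w * h for w, h in zip(self.w2, self.hidden(x))) + self.b2
        return 1.0 / (1.0 + math.exp(-z))

    def grad_input(self, x: float) -> float:
        """dH/dx, which exists everywhere because H is smooth."""
        h = self.hidden(x)
        y = self(x)
        inner = sum(w2 * (1.0 - hj * hj) * w1
                    for w2, hj, w1 in zip(self.w2, h, self.w1))
        return y * (1.0 - y) * inner

    def sgd_step(self, x: float, target: float, lr: float) -> None:
        """One gradient-descent step on the squared error for a single sample."""
        h = self.hidden(x)
        y = self(x)
        dz = 2.0 * (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            dh = dz * self.w2[j] * (1.0 - hj * hj)  # uses w2[j] before its update
            self.w2[j] -= lr * dz * hj
            self.w1[j] -= lr * dh * x
            self.b1[j] -= lr * dh
        self.b2 -= lr * dz


def mse(model: Surrogate, xs) -> float:
    return sum((model(x) - g(x)) ** 2 for x in xs) / len(xs)


def train(model: Surrogate, xs, epochs: int = 300, lr: float = 0.1, seed: int = 0) -> None:
    rng = random.Random(seed)
    order = list(xs)
    for _ in range(epochs):
        rng.shuffle(order)
        for x in order:
            model.sgd_step(x, g(x), lr)
```

After `train(model, [i / 49 for i in range(50)])`, `model.grad_input(x)` gives a usable slope at any x, whereas the step function `g` has zero slope everywhere except at the jump. That difference is what makes gradient guidance possible on the surrogate.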

SLIDE 13

Gradient-guided Mutation

Why gradient guidance?
  • Gradient indicates critical parts of input

What are critical parts of the input?
  • Critical parts of input affect program branches

How does gradient-guided mutation work?
  • Focus mutations on the critical parts of the input
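The three answers above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not NEUZZ's actual mutation algorithm: the surrogate is a hypothetical single-output logistic model over normalized bytes with hand-picked `weights`, "critical" is taken to mean the k largest gradient magnitudes, and each mutant changes one critical byte by a fixed `step` in the direction of the gradient sign.

```python
import math


def surrogate(data: bytes, weights, bias: float) -> float:
    """Hypothetical smooth model of one branch: sigmoid of a weighted byte sum."""
    z = sum(w * b / 255.0 for w, b in zip(weights, data)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(data: bytes, weights, bias: float):
    """Gradient of the surrogate output with respect to each input byte."""
    y = surrogate(data, weights, bias)
    return [y * (1.0 - y) * w / 255.0 for w in weights]


def critical_positions(grad, k: int):
    """Byte offsets with the k largest gradient magnitudes."""
    return sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]


def gradient_guided_mutants(data: bytes, weights, bias: float, k: int = 2, step: int = 32):
    """Mutate only the critical bytes, each in the direction of its gradient sign."""
    grad = input_gradient(data, weights, bias)
    mutants = []
    for pos in critical_positions(grad, k):
        direction = 1 if grad[pos] > 0 else -1
        buf = bytearray(data)
        buf[pos] = max(0, min(255, buf[pos] + direction * step))
        mutants.append(bytes(buf))
    return mutants
```

With `weights = [0.1, -3.0, 0.0, 2.0]` and input `bytes([128] * 4)`, the critical positions are offsets 1 and 3. The mutants lower byte 1 and raise byte 3, and both raise the surrogate's output, while the low-gradient bytes at offsets 0 and 2 are left alone.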

SLIDE 14

Main Idea behind NEUZZ

[Figure: the program maps an input to branching behaviors; a NN acts as a smooth surrogate that maps the same input to branching behaviors and enables gradient-guided mutation]

SLIDE 15

A Peek Into NN Model

SLIDE 16

Generalization to Unseen branches

Observations:

  • Real-world program inputs have critical parts
  • Most branches are affected by the critical parts

NEUZZ Solution:

  • Identify critical parts based on observed branches
  • Perform more mutations on the critical parts of inputs to explore unseen branches

SLIDE 17

Design of NEUZZ

SLIDE 18

Evaluation

  • 10 real-world programs
  • LAVA-M and DARPA CGC datasets
  • Comparison with RNN-based fuzzers
  • Performance of different model choices

SLIDE 19

Evaluations: Edge Coverage

NEUZZ vs. state-of-the-art fuzzers

10 real-world applications for 24 hours

NEUZZ achieves on average 3x more edge coverage than other fuzzers

SLIDE 20

Evaluations: Bug Finding

NEUZZ vs. state-of-the-art fuzzers

NEUZZ finds the most bugs and all 5 bug types, including two new CVEs

SLIDE 21

Evaluations: LAVA-M and CGC

NEUZZ outperforms state-of-the-art fuzzers on LAVA-M and CGC

[Results panels: LAVA-M dataset, DARPA CGC dataset]

SLIDE 22

Evaluations: NEUZZ vs. RNN-based Fuzzer

NEUZZ achieves 6x more edge coverage and 20x less training time

SLIDE 23

Evaluations: Effect of Different NNs

NEUZZ achieves best performance with NN + incremental learning

[Figure: edge coverage for 1M mutations]

SLIDE 24

Key Takeaways of NEUZZ

  • Use NN gradients to identify the critical locations of program inputs
  • Focus mutations on the critical locations
  • Minimize runtime overhead by using simple feed-forward neural networks
  • Retrain the network incrementally to find new critical locations
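The takeaways above fit together as a loop, sketched below with every component passed in as a callable. This is an outline under assumptions rather than the NEUZZ implementation: `run_program`, `train_model`, and `propose_mutants` are hypothetical hooks, and for simplicity the model is refit on the whole corpus each round, which is a stand-in for incremental retraining rather than a faithful version of it.

```python
def fuzz_with_retraining(seeds, run_program, train_model, propose_mutants, rounds):
    """Alternate between fitting a surrogate and mutating inputs under its guidance.

    run_program(x)            -> set of edges covered by input x
    train_model(corpus)       -> any model object fit on the current corpus
    propose_mutants(model, x) -> iterable of mutated inputs derived from x
    """
    corpus = list(seeds)
    covered = set()
    for x in corpus:
        covered |= run_program(x)

    for _ in range(rounds):
        model = train_model(corpus)  # refit so new critical locations can surface
        new_inputs = []
        for x in corpus:
            for child in propose_mutants(model, x):
                edges = run_program(child)
                if not edges <= covered:  # child reaches at least one new edge
                    covered |= edges
                    new_inputs.append(child)
        if not new_inputs:
            break  # no progress this round
        corpus.extend(new_inputs)
    return corpus, covered
```

Keeping the three hooks separate means the loop itself does not depend on the kind of model used. A simple feed-forward network and a gradient-based mutator can be plugged in without changing it.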

SLIDE 25

GitHub Repo

NEUZZ is available at https://github.com/Dongdongshe/neuzz

SLIDE 26

NEUZZ: Efficient Fuzzing with Neural Program Smoothing

Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana Columbia University
