through Coverage-guided Tracing Stefan Nagy Matthew Hicks - PowerPoint PPT Presentation

Full-speed Fuzzing: Reducing Fuzzing Overhead through Coverage-guided Tracing
Stefan Nagy (snagy2@vt.edu), Matthew Hicks (mdhicks2@vt.edu)
COMPUTER SCIENCE

Fuzzing

An Overview of Fuzzing
• Time-tested technique: AFL, honggFuzz, libFuzzer; CVEs galore
• Popular in the industry: Google, Microsoft
• Fuzzing platforms: MSRD, OSS-Fuzz, FuzzBuzz, FuzzIt
• Most popular: coverage-guided fuzzing
(Image source: lcamtuf.coredump.cx/afl)

Coverage-guided Fuzzing
[Figure: the coverage-guided fuzzing loop. All (N) generated test cases are traced, at 36–612% overhead. The few (<< N) with new coverage (✓) are kept and go on to trigger bugs; the (~N) with no new coverage (X) are discarded.]
• Prior work: Angora, Steelix, FidgetyAFL, VUzzer, AFLFast, Driller, QSYM, SkyFire, CollAFL, T-Fuzz, MutaGen, …
▲ Coverage-guided Tracing: 0.3% overhead
▲ Orthogonal to tracing, generation

How are coverage-increasing test cases found?
By tracing every test case!
[Figure: tracing approaches (dynamic translation, static callbacks, static inlining) arranged along two axes: slower to faster, and binary-only (“black-box”) to from source (“white-box”).]

How do fuzzers spend their time?
• AFL: “naïve” fuzzing
• Driller: “smart” fuzzing
• 8 benchmarks, 1hr trials

Fuzzer, tracer    Avg. % time on exec/trace    Avg. rate of cvg.-incr. test cases
AFL-Clang         91.8                         6.20E-5
AFL-QEMU          97.3                         2.57E-4
Driller-QEMU      95.9                         6.53E-5

▼ O1: > 90% of time on test case tracing, execution
▼ O2: < 3/10000 test cases increase coverage
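To make the rates in the table easier to read, the short sketch below converts each one into "roughly one test case in N" and checks observation O2. The numbers are copied from the slide; the helper name `one_in` is illustrative only.

```python
# Rates of coverage-increasing test cases, copied from the slide's table.
rates = {
    "AFL-Clang": 6.20e-5,
    "AFL-QEMU": 2.57e-4,
    "Driller-QEMU": 6.53e-5,
}


def one_in(rate: float) -> int:
    """Express a rate as 'roughly one test case in N' (illustrative helper)."""
    return round(1 / rate)


for name, rate in rates.items():
    print(f"{name}: about 1 in {one_in(rate)} test cases increases coverage")

# O2 on the slide: every configuration stays below 3 in 10,000.
assert all(rate < 3 / 10_000 for rate in rates.values())
```

Even the highest rate here (AFL-QEMU) works out to about one useful test case per few thousand executed.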

Likelihood of coverage-increasing test cases?
• AFL-QEMU, 5x 24hr trials x 8 benchmarks
▼ O3: rate decreases over time (< 1/10000)

Impact of tracing every test case?
▼ Over 90% of time is spent tracing test cases…
▼ Over 99.99% of which are discarded!
Equivalent to checking every straw to find the needle!

Why is tracing every test case expensive?
• Storing coverage: bitmaps, arrays
• Multiple additional instructions per block:
    call loc.__afl_maybe_log
    mov rax, qword [arg_10h]
    mov rcx, qword [arg_8h]
    mov rdx, qword [rsp]
    lea rsp, qword [rsp + 0x98]
• Many blocks, edges
• Long exec paths, loops
• Overhead quickly adds up

Benchmark    # blocks
bsdtar       31379
pdftohtml    54596
readelf      21249
tcpdump      33743
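As a rough illustration of why per-block logging adds up, here is a minimal Python model of the edge-bitmap update that AFL's documentation describes (`shared_mem[cur ^ prev]++; prev = cur >> 1`). It is a sketch, not AFL's actual code: the class name, the `updates` counter, and the block IDs are hypothetical, and real instrumentation runs as inlined assembly in the target.

```python
MAP_SIZE = 1 << 16  # 64 KiB, AFL's default coverage map size


class EdgeBitmap:
    """Toy model of an AFL-style edge-coverage bitmap."""

    def __init__(self) -> None:
        self.bitmap = bytearray(MAP_SIZE)
        self.prev_location = 0
        self.updates = 0  # how many times the logging code ran

    def log_block(self, cur_location: int) -> None:
        # Runs at the start of every executed basic block.
        index = (cur_location ^ self.prev_location) % MAP_SIZE
        self.bitmap[index] = (self.bitmap[index] + 1) & 0xFF  # 8-bit hit counter
        self.prev_location = cur_location >> 1
        self.updates += 1


def trace(path) -> EdgeBitmap:
    """Replay a sequence of block IDs through the logging code."""
    cov = EdgeBitmap()
    for block_id in path:
        cov.log_block(block_id)
    return cov


# Hypothetical block IDs; AFL assigns random ones at compile time.
loop_body = [0x1A2B, 0x3C4D, 0x5E6F]
cov = trace(loop_body * 1000)
print(cov.updates)  # 3000 bitmap updates for a 3-block loop run 1000 times
```

The cost scales with the number of blocks executed, not the number of distinct blocks, which is why long paths and loops dominate.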

Coverage-guided Tracing

Guiding Principle
Can we identify coverage-increasing test cases without tracing every test case?

Find New Coverage Without Tracing
Apply and dynamically remove interrupts
[Figure: control-flow graph with blocks B1 <init>, B2 <this>, B3 <that>, B4 <exit>.]

Original start of block B1:
    401a49: 55        push %rbp
    401a4a: 48 89 e5  mov %rsp, %rbp
    401a4d: 48 81 ec  sub $0x380, %rsp
    401a54: 89 bd 8c  mov %edi, -0x374(%rbp)

Overwrite with interrupt:
    401a49: CC        INT 03
    401a4a: 48 89 e5  mov %rsp, %rbp
    401a4d: 48 81 ec  sub $0x380, %rsp
    401a54: 89 bd 8c  mov %edi, -0x374(%rbp)

Hit: New coverage!

Reset:
    401a49: 55        push %rbp
    401a4a: 48 89 e5  mov %rsp, %rbp
    401a4d: 48 81 ec  sub $0x380, %rsp
    401a54: 89 bd 8c  mov %edi, -0x374(%rbp)

Continue!
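The overwrite/reset cycle can be modeled on a plain byte buffer. The sketch below is a toy, not UnTracer's implementation: the `Oracle` class and its methods are hypothetical names, and only the first four bytes from the slide's listing are used (the remaining instruction bytes are truncated on the slide and are left out).

```python
INT3 = 0xCC  # one-byte x86 breakpoint instruction


class Oracle:
    """Toy model of a binary whose basic-block entry bytes can be patched."""

    def __init__(self, image: bytearray, base: int, block_starts) -> None:
        self.image = image
        self.base = base
        self.saved = {}  # block address -> original first byte
        for addr in block_starts:
            self.saved[addr] = image[addr - base]
            image[addr - base] = INT3  # overwrite with interrupt

    def hits_interrupt(self, addr: int) -> bool:
        # A block still carrying its interrupt has not been covered yet.
        return addr in self.saved

    def reset(self, addr: int) -> None:
        # Restore the original byte so later runs execute this block natively.
        self.image[addr - self.base] = self.saved.pop(addr)


# First bytes of B1 from the slide: push %rbp; mov %rsp, %rbp
BASE = 0x401A49
image = bytearray([0x55, 0x48, 0x89, 0xE5])
oracle = Oracle(image, BASE, block_starts=[BASE])

print(hex(image[0]))                # 0xcc while the block is uncovered
if oracle.hits_interrupt(BASE):     # "Hit: new coverage!"
    oracle.reset(BASE)              # "Reset"
print(hex(image[0]))                # 0x55 again, so execution can continue
```

In a real binary the hit would arrive as a trap signal rather than a method call; the lookup here stands in for that.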

Coverage-guided Tracing
Approach: trace only coverage-increasing test cases
• “Filter out” those that don’t hit an interrupt
• Hit one: trace, reset, continue
▲ Common case (99.99%) don’t hit, thus aren’t traced
▲ Approaches native execution speed (0% overhead)

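Incorporating CGT into Fuzzing
Implementation: UnTracer
[Figure: fuzzing loop with the interrupt oracle placed in front of the tracer. Of (N) test cases, the (~N) that hit no interrupt (X) are discarded; only the (<< N) that hit one (✓) are traced.]
▲ (~N) of (N): native speed!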

Evaluation

Performance Evaluation
Goal: isolate tracing overhead
• 1-core VMs to avoid OS noise
• Strip AFL to tracing-only code
• 8 diverse real-world benchmarks
Compare tracer exec times
• 5 days’ test cases per benchmark
• 5x trials per day of test cases

Fuzzing tracer        Type    Description
AFL-Dyninst           [BB]    Static rewriting
AFL-QEMU              [BB]    Dynamic translation
AFL-Clang             [WB]    Assembly rewriting
UnTracer (Dyninst)    [BB]    Coverage-guided Tracing (static rewriting)

[BB] = black-box (binary-only); [WB] = white-box (from source)

Benchmarks

Benchmark name           Benchmark type
bsdtar (libarchive)      archiving
cert-basic (libksba)     cryptography
cjson (cjson)            web development
djpeg (libjpeg)          image processing
pdftohtml (poppler)      document processing
readelf (binutils)       development
sfconvert (audiofile)    audio processing
tcpdump (tcpdump)        networking

Can CGT beat tracing all with Black-box?
Avg. relative overhead:
▼ AFL-Dyninst 518%
▼ AFL-QEMU 618%
▲ UnTracer 0.3%

Can CGT beat tracing all with White-box?
Avg. relative overhead:
▼ AFL-Dyninst 518%
▼ AFL-QEMU 618%
▲ UnTracer 0.3%
▼ AFL-Clang 36%
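To read these percentages as slowdown factors, assume relative overhead is measured against the execution time of an untraced baseline. That is the usual definition, but the slide does not spell it out, so the symbols below are assumptions.

```latex
\[
  \text{overhead} = \frac{t_{\text{tracer}} - t_{\text{baseline}}}{t_{\text{baseline}}} \times 100\%
  \quad\Longrightarrow\quad
  t_{\text{tracer}} = \left(1 + \frac{\text{overhead}}{100\%}\right) t_{\text{baseline}}
\]
\[
  618\% \rightarrow 7.18\,t_{\text{baseline}}, \qquad
  518\% \rightarrow 6.18\,t_{\text{baseline}}, \qquad
  36\% \rightarrow 1.36\,t_{\text{baseline}}, \qquad
  0.3\% \rightarrow 1.003\,t_{\text{baseline}}
\]
```

Under that reading, a test case that takes 1 ms untraced takes about 7.2 ms under AFL-QEMU and about 1.003 ms under UnTracer.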

Can CGT boost hybrid fuzzing throughput?
Goal: measure impact on total test case throughput
• QSYM (concolic exec + fuzzing)
• 8 benchmarks, 5x 24-hr trials
QSYM-UnTracer throughput:
▲ 616% >> QSYM-QEMU
▲ 79% >> QSYM-Clang

Conclusions: Why Coverage-guided Tracing?
▼ Fuzzers find coverage-increasing test cases by tracing all of them
▼ Costs over 90% of time, yet over 99.99% are inevitably discarded
These resources could be better used to find bugs!
CGT restricts tracing to the few guaranteed to increase coverage
▲ Performance: Cuts tracing overhead from 36-618% to 0.3%; boosts test case throughput by 79-616%
▲ Compatibility: “Filter-out” approach allows plugging in any tracer
▲ Orthogonality: Can combine with other fuzzing improvements (e.g., better test case generation, faster tracing)

Thank you!
Our open-sourced software:
• UnTracer-AFL: UnTracer integrated with AFL
• afl-fid: AFL suite for fixed input datasets
• FoRTE-FuzzBench: our 8 real-world benchmarks
All repos are available here: https://github.com/FoRTE-Research

Expanding Coverage Metrics
Current work: edge coverage, hit counts
• Static critical edge handling doable
• Hit counts need more complex transforms
[Figure: control-flow graph of Blocks <A>, <B>, <C>, <D>.]

Covered blocks    Implicit edges
A, B, C           A-B, B-C
A, D, C           A-D, D-C

The figure also lists edge A-C.
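A small sketch of what "critical edge handling" could look like, assuming the figure's edges are directed as A→B, B→C, A→C, A→D, D→C. It uses the standard compiler definition (an edge whose source has several successors and whose destination has several predecessors) and the textbook fix of splitting such an edge with a dummy block; whether UnTracer does exactly this is not stated on the slide, and all function and block names here are hypothetical.

```python
from collections import defaultdict


def critical_edges(edges):
    """Edges whose source has >1 successor and whose destination has >1 predecessor."""
    succs, preds = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succs[src].add(dst)
        preds[dst].add(src)
    return [(s, d) for s, d in edges if len(succs[s]) > 1 and len(preds[d]) > 1]


def split_critical_edges(edges):
    """Insert a dummy block on each critical edge so block coverage implies it."""
    critical = set(critical_edges(edges))
    result = []
    for src, dst in edges:
        if (src, dst) in critical:
            dummy = f"{src}_{dst}"  # hypothetical name for the inserted block
            result += [(src, dummy), (dummy, dst)]
        else:
            result.append((src, dst))
    return result


# Assumed reading of the slide's control-flow graph.
cfg = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D"), ("D", "C")]
print(critical_edges(cfg))
print(critical_edges(split_critical_edges(cfg)))
```

Covering blocks A and C alone cannot tell whether A→C was taken directly; once the edge carries its own block, hitting that block's interrupt reveals it.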

CGT versus Hardware-Assisted Tracing
Can approximate Intel-PT overhead:
• AFL-Clang = 36% OH
• AFL-Clang ≅ 10-100% OH rel. to AFL-Clang-fast
• AFL-Clang-fast ≅ 18-32% OH
• Intel-PT ≅ 7% OH rel. to AFL-Clang-fast
• Intel-PT ≅ 19-35% OH
Trace decoding adds way more
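One reading that reproduces the slide's ranges, offered as an assumption rather than the authors' stated method, is to treat each "rel. to" figure as a scaling factor applied to the overhead percentage itself:

```latex
\[
  \mathrm{OH}_{\text{Clang-fast}} \approx \frac{36\%}{1 + r}, \quad r \in [0.10,\ 1.00]
  \quad\Rightarrow\quad
  \frac{36\%}{2.00} = 18\%, \qquad \frac{36\%}{1.10} \approx 32.7\%
\]
\[
  \mathrm{OH}_{\text{Intel-PT}} \approx 1.07 \times \mathrm{OH}_{\text{Clang-fast}}
  \quad\Rightarrow\quad
  1.07 \times 18\% \approx 19.3\%, \qquad 1.07 \times 32.7\% \approx 35.0\%
\]
```

These match the 18-32% and 19-35% ranges on the slide after rounding.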

Fully Black-box (binary-only) Implementation
• Oracle forkserver uses assembly-time instrumentation
• Theoretically doable via binary rewriting
  • Dyninst’s performance infeasible
• Binary hooking an alternative, e.g., via LD_PRELOAD

Appendix -- CGT step-by-step
Intuition: restrict tracing to coverage-increasing test cases
1. Statically overwrite start of each block with an interrupt
   • The “Interest Oracle”
2. Get a new test case and run it on the oracle
3. If an interrupt is triggered:
   • Trace the test case’s code coverage
   • Unmodify (reset) all newly-covered blocks
4. Return to step 2
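The steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not UnTracer's code: each test case is represented directly by the list of blocks it executes, "running on the oracle" is reduced to a set intersection, and the names `run_cgt`, `armed`, and the block labels B1 to B4 are illustrative.

```python
def run_cgt(test_cases, all_blocks):
    """Simulate coverage-guided tracing over precomputed execution paths.

    test_cases: list of paths, each a list of block IDs the test case executes.
    Returns (kept test cases, number of traces performed, blocks still armed).
    """
    # Step 1: every block starts with an interrupt (the "interest oracle").
    armed = set(all_blocks)
    traced = 0
    kept = []
    for path in test_cases:
        # Step 2: run on the oracle. An interrupt fires only if the path
        # touches a block that is still armed.
        if armed.isdisjoint(path):
            continue  # common case: discarded without tracing
        # Step 3: trace this test case and unmodify newly covered blocks.
        traced += 1
        armed -= set(path)
        kept.append(path)
        # Step 4: loop back for the next test case.
    return kept, traced, armed


blocks = ["B1", "B2", "B3", "B4"]
cases = [
    ["B1", "B2", "B4"],  # new coverage: traced
    ["B1", "B2", "B4"],  # same path: filtered out
    ["B1", "B3", "B4"],  # reaches B3: traced
    ["B1", "B2", "B4"],  # nothing new: filtered out
]
kept, traced, armed = run_cgt(cases, blocks)
print(traced, len(cases))  # 2 of 4 test cases were traced
```

As the armed set shrinks, fewer runs can trigger an interrupt, so a growing share of test cases skips tracing entirely.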

Appendix -- CGT step-by-step
• As more blocks are unmodified over time, the binary starts to mirror the original
• Thus, most test cases are run at native execution speed!

Appendix -- Implementation: UnTracer
• Built atop AFL
• Dyninst for CFG/tracing
• File I/O for mod/unmod
