
Automated Test Generation: A Journey from Symbolic Execution to Smart Fuzzing and Beyond

Koushik Sen
EECS Department, University of California, Berkeley
https://people.eecs.berkeley.edu/~ksen/

Programs are still written by humans, and will


  1. Feedback-directed Fuzzing 101
     The loop: start from seed inputs; pick an input; mutate it; run the program on each mutant; collect feedback; if the mutant is interesting, add it to the set of interesting inputs, otherwise discard it.
     Feedback: • Coverage • Execution length • Well-formed input • ...
     Interesting? • New coverage? • Longer execution? • Valid input? • ...
     Lots of choices:
     1. Which input to pick?
     2. How to mutate an input?
     3. How many mutants to generate?
     4. What kind of feedback?
     5. How to decide if an input is interesting?
     Resolved using heuristics over a period of 10 years.
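The loop on this slide can be sketched in a few lines. This is a minimal illustration, not AFL or any other real fuzzer: the toy `target` function, the branch names, the 80/20 mutation split, and the parameters `iterations` and `mutants_per_pick` are all assumptions made for the example. Comments mark where each of the five choices shows up.

```python
import random


def target(data: bytes) -> set:
    """Toy program under test; returns the ids of the branches it executed."""
    covered = {"entry"}
    if len(data) >= 4:
        covered.add("len>=4")
        if data[0] == ord("F"):
            covered.add("first=F")
            if data[1] == ord("U"):
                covered.add("second=U")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or insert one (always, if the input is empty)."""
    buf = bytearray(data)
    if buf and rng.random() < 0.8:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    return bytes(buf)


def fuzz(seeds, iterations=1000, mutants_per_pick=8, rng_seed=0):
    """Coverage-guided loop: keep a mutant only if it covers something new."""
    rng = random.Random(rng_seed)
    queue = list(seeds)                        # interesting inputs so far
    covered = set()
    for seed in queue:
        covered |= target(seed)
    for _ in range(iterations):
        parent = rng.choice(queue)             # choice 1: which input to pick
        for _ in range(mutants_per_pick):      # choice 3: how many mutants
            child = mutate(parent, rng)        # choice 2: how to mutate
            feedback = target(child)           # choice 4: coverage as feedback
            if not feedback.issubset(covered): # choice 5: new coverage?
                covered |= feedback
                queue.append(child)            # yes: add input
            # no: the mutant is simply discarded
    return queue, covered
```

Calling `fuzz([b"a"])` returns the saved inputs together with the set of branches reached. Every heuristic in the sketch is deliberately naive; real fuzzers differ mainly in how they make these five choices.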

  3. Feedback-directed Fuzzing 101
     Fuzzers built on the same loop (pick, mutate, run, collect feedback, keep or discard):
     • AFL
     • AFLFast
     • libFuzzer
     • Angora
     • VUzzer
     • Steelix
     • AFLGo
     • AFLSmart
     • Nautilus
     • FairFuzz
     • PerfFuzz
     • JQF/Zest
     • FuzzFactory
     • RLCheck

  4. What Bugs Can Fuzzing Find?
     • Assertion violations
     • Segmentation faults
     • Buffer overflows
     • Use-after-frees
     • Integer signedness errors
     • etc.

  5. What Bugs Has Fuzzing Found?
     • Tons of them ...
     • CVE-2014-6277: ShellShock bug in Bash
       – GNU Bash through 4.3 bash43-026 does not properly parse function definitions in the values of environment ...
     • CVE-2014-0160: Heartbleed bug in OpenSSL
       – A buffer over-read allowed an attacker to extract information from servers using OpenSSL
     • CVE-2016-8677: ImageMagick
       – Memory allocation failure in AcquireQuantumPixels (quantum.c)
     • CVE-2014-1564: Firefox
       – Mozilla Firefox before 32.0, Firefox ESR 31.x before 31.1, and Thunderbird 31.x before 31.1 do not properly initialize memory for GIF rendering
     • CVE-2010-0539: Safari Remote Execution
       – Integer signedness error in the window drawing implementation in Apple Java for Mac OS X 10.5 ...
     • See http://lcamtuf.coredump.cx/afl/ for an extensive list of bugs and security vulnerabilities found by AFL, a state-of-the-art fuzzer

  6. How Good is Fuzzing?

  7. What's Missing? Uneven Coverage
     Observation: some parts of the program are easier to cover than others.

         int process_xml(char *fuzzed_data, int fuzzed_data_len) {
           /* Hit by 100k+ inputs: code under this branch is well covered */
           if (fuzzed_data_len >= 10) {
             // more code
           }
           // ...
           /* Hit by 1 input: code under this branch is barely covered */
           if (starts_with(fuzzed_data, "<!ATTLIST")) {
             // ...
           }
           // ...
           return process_result;
         }

  8. FairFuzz: A Targeted Mutation Strategy for Increasing Greybox Fuzz Testing Coverage
     Caroline Lemieux, Koushik Sen
     University of California, Berkeley
     source: https://github.com/carolemieux/afl-rb

  9. Feedback-directed Fuzzing 101 (recap)
     Pick an input, mutate it, run the program, and collect feedback (coverage, execution length, well-formedness). Add the mutant if it is interesting (new coverage, longer execution, valid input); otherwise discard it.

  10. FairFuzz: Ideas
      The feedback loop stays the same; FairFuzz adds 2 heuristics:
      1. When picking an input: identify branches hit by few inputs (rare branches).
      2. When mutating: identify where an input can be mutated and still hit the rare branch.
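The two heuristics can be illustrated on a toy version of the `process_xml` example. This is only a sketch under stated assumptions and is not the afl-rb implementation: the `branches` function, the branch names, the power-of-two rarity cutoff, and the byte-flip probing used to build the mask are all simplifications chosen for the example.

```python
from collections import Counter


def branches(data: bytes) -> set:
    """Toy stand-in for process_xml; returns the branches an input takes."""
    taken = set()
    if len(data) >= 10:
        taken.add("len>=10")
    if data.startswith(b"<!ATTLIST"):
        taken.add("attlist")
    return taken


def rare_branches(corpus) -> set:
    """Heuristic 1: branches hit by few inputs.

    Assumed cutoff rule: the smallest power of two that is at least the
    minimum hit count over all branches seen so far.
    """
    hits = Counter(b for data in corpus for b in branches(data))
    if not hits:
        return set()
    cutoff = 1
    while cutoff < min(hits.values()):
        cutoff *= 2
    return {b for b, n in hits.items() if n <= cutoff}


def mutation_mask(data: bytes, branch: str) -> list:
    """Heuristic 2: mask[i] is True if byte i can change and still hit `branch`.

    Each position is probed by flipping all of its bits once.
    """
    mask = []
    for i in range(len(data)):
        probe = bytearray(data)
        probe[i] ^= 0xFF
        mask.append(branch in branches(bytes(probe)))
    return mask
```

With a corpus such as `[b"0123456789", b"abcdefghijk", b"<!ATTLIST x>"]`, only the `attlist` branch is rare, and the mask for `b"<!ATTLIST x>"` protects the nine prefix bytes while leaving the tail free to mutate. Later mutations restricted to the free positions keep exercising the rarely covered code.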

  12. Summary Results – Coverage Leaders
      ✓ FairFuzz achieves the highest coverage fast, for nearly all benchmarks

  14. PerfFuzz: Automatically Generating Pathological Inputs
      Caroline Lemieux, Rohan Padhye, Koushik Sen, Dawn Song
      University of California, Berkeley
      source: https://github.com/carolemieux/perffuzz

  15. Performance Problems Have Consequences
      • Poor user experience
      • Security vulnerabilities (DoS)
      • Excessive resource consumption

  16. Feedback-directed Fuzzing 101 (recap)
      Pick an input, mutate it, run the program, and collect feedback (coverage, execution length, well-formedness). Add the mutant if it is interesting (new coverage, longer execution, valid input); otherwise discard it.

  17. PerfFuzz: Idea
      PerfFuzz ideas: change the heuristic
      1. Feedback: # of times each branch is executed
      2. Interesting: longer execution of some branch
      The rest of the loop is unchanged: pick an input, mutate it, run the program, add interesting inputs, discard the others.
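A minimal sketch of this changed heuristic, assuming per-branch execution counts are available as a dictionary. The insertion-sort `run` function is a made-up target whose cost depends on its input, and `is_interesting` is a hypothetical helper name; neither comes from the PerfFuzz code base.

```python
from collections import Counter


def run(data: bytes) -> Counter:
    """Toy target: insertion sort over the input bytes, counting branch hits."""
    counts = Counter()
    items = list(data)
    for i in range(1, len(items)):
        counts["outer"] += 1
        j = i
        while j > 0 and items[j - 1] > items[j]:
            counts["swap"] += 1
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return counts


def is_interesting(counts: Counter, best: dict) -> bool:
    """Keep an input if it executes SOME branch more often than any input before.

    `best` maps each branch to the highest count seen so far and is updated
    in place.
    """
    improved = False
    for branch, n in counts.items():
        if n > best.get(branch, 0):
            best[branch] = n
            improved = True
    return improved
```

Because the check is per branch rather than on total path length, an input that stresses a single hot loop is saved even when its overall execution is shorter than others. In this toy target, a reverse-sorted input maximizes the `swap` count and is therefore kept.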

  18. Macro-Benchmarks: Maximum Path Length
      • Path length: total number of hits of CFG edges by an input
      (Figure panels: libpng, libxml2, libjpeg-turbo, zlib; annotation: 24.7x)

  20. PerfFuzz: Memory-alloc Fuzzing
      PerfFuzz ideas: change the heuristic
      1. Feedback: # of bytes allocated at each malloc() call
      2. Interesting: more bytes allocated at some malloc() call than by any other input
      The rest of the loop is unchanged.

  21. Memory-alloc Fuzzing: OOMs and Bombs
      • Libpng
        1. 100-byte input with large dimensions: reader allocates 2 billion bytes
        2. 100-byte input with large color space but fixed dimensions: color table allocated with 4 GB of space
      • Libarchive
        1. 50-byte zipped file: 4 GB output
        2. Memory leaks with LZMA compression (32-byte ZIP leaks 96 bytes)

  22. FuzzFactory: Domain-Specific Fuzzing with Waypoints
      Rohan Padhye, Caroline Lemieux, Koushik Sen, Laurent Simon, Hayawardh Vijayakumar
      source: https://github.com/rohanpadhye/FuzzFactory

  23. Domain-Specific Fuzzers
      • Zest [Padhye et al. 2018] – “increase coverage amongst valid inputs”
      • SlowFuzz [Petsios et al. 2017] – “increase path length”
      • PerfFuzz [Lemieux et al. 2018] – “maximize branch exec counts”
      • DifFuzz [Nilizadeh et al. 2019] – “leak more info on the side channel”
      • MemFuzz [Coppik et al. 2019] – “access new input-dependent memory locations”
      Common strategy: save intermediate inputs (“waypoints”)

  24. Can we rapidly create domain-specific fuzzers?
      Without touching the underlying search algorithm

  25. Feedback-directed Fuzzing 101
      The same loop, with domain-specific feedback:
      • Feedback: dsf (key-value map)
      • Interesting? Better value of dsf(k) for some k?
      • Yes: add input. No: discard input.
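The key-value formulation can be sketched as follows. This is a hypothetical illustration, not the FuzzFactory API: the `Waypoints` class, the key names, the magic bytes `PNG!`, the size-field layout in `mem_feedback`, and the use of "larger is better" as the only reducer are assumptions made for the example.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes on which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


MAGIC = b"PNG!"  # hypothetical magic bytes expected by the toy target


def cmp_feedback(data: bytes) -> dict:
    """CMP-style domain: progress made in a comparison against MAGIC."""
    return {"cmp:magic": common_prefix_len(data, MAGIC)}


def mem_feedback(data: bytes) -> dict:
    """MEM-style domain: bytes requested at one allocation site.

    Assumed layout: byte 4 is a size field and the target allocates
    1024 * size bytes.
    """
    size = data[4] if len(data) > 4 else 0
    return {"mem:alloc_site_1": 1024 * size}


class Waypoints:
    """Tracks the best dsf(k) seen for every key k."""

    def __init__(self):
        self.best = {}

    def is_waypoint(self, dsf: dict) -> bool:
        """True if dsf improves on the best value for some key."""
        improved = False
        for key, value in dsf.items():
            if value > self.best.get(key, 0):
                self.best[key] = value
                improved = True
        return improved
```

Composing two domains then amounts to merging their maps, for example `{**cmp_feedback(data), **mem_feedback(data)}`, while the search loop itself stays untouched.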

  26. Example Fuzzers using FuzzFactory
      • CMP
        – Goal: test programs whose inputs require magic bytes, checksums, etc.
        – Waypoints: inputs which increase the progress of strcmp, memcmp, strstr, etc.
      • MEM
        – Goal: find bugs related to memory allocation and management
        – Waypoints: inputs which increase the arguments to malloc()
      • CMP+MEM
        – Goal: find memory management bugs in programs with magic bytes, checksums, etc.
        – Waypoints: CMP or MEM

  27. Super-Fuzzer: CMP + MEM
      • LZ4 bomb (4 GB alloc when decoding a 21-byte input)
      • PNG bomb (2 GB alloc when reading a ~100-byte 20px image)

  29. Coverage is Still Low

  30. Why Is Coverage Still Low?
      ✗ Cannot explore “deep states”
      ✗ Cannot find complex logical bugs
      ✗ Gets stuck in the input parsing stage
      ✗ Barely reaches 20%-30% code coverage on real-world software
      ✓ But cheap and simple

  31. Time to Bring Human in the Loop
      Approach: the human restricts the set of inputs to be explored by providing
      • a randomized generator, or
      • a precondition on inputs, or
      • ...
      Algorithms then search the restricted input space.

  32. Semantic Fuzzing with Zest
      Rohan Padhye (UC Berkeley), Caroline Lemieux (UC Berkeley), Koushik Sen (UC Berkeley), Mike Papadakis (U. Luxembourg), Yves Le Traon (U. Luxembourg)
      source: https://github.com/rohanpadhye/jqf

  33. How do I test ...?
      • a program taking an XML file as input
        – (e.g. Maven, Ant)
      • a compiler
        – (e.g. Closure or Rhino compilers for JavaScript)
      • In general, a program taking structurally complex inputs

  34. Human Writes a Simple Input Generator

          public XMLElement genXML(Random random) {
            // Generate a random tag name
            String name = random.nextString(MAX_TAG_LENGTH);
            XMLElement node = new XMLElement(name);
            // Generate a random number of children
            int n = random.nextInt(MAX_CHILDREN);
            for (int i = 0; i < n; i++) {
              // Generate child nodes recursively
              node.addChild(genXML(random));
            }
            // Maybe insert text inside element
            if (random.nextBoolean()) {
              node.addText(random.nextString(MAX_TEXT_LENGTH));
            }
            return node;
          }

      ✓ Generates random syntactically valid XML documents
      ✗ May not conform to a given schema
      Example generated: <foo><i>xyz</i><br/></foo> (root foo with children i and br; i contains the text xyz)

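  35. Zest: Mutate Params to Generator
      The same feedback loop, except that the fuzzer mutates the parameters fed to the generator instead of the input itself:
      • Start from seed params; pick a set of params from the interesting params
      • Mutate the params
      • The generator turns each set of params into an input (generator plus program form the augmented program)
      • Run the program on the input
      • Feedback: coverage, input validity
      • Interesting? New coverage? Valid input?
      • Yes: add the params. No: discard them.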
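The idea of mutating generator parameters can be sketched as follows. This is an illustration under assumptions, not JQF's API: `ParamSource`, `gen_xml`, `is_well_formed`, the length and depth limits, and the zero padding of exhausted parameters are all invented for the example. The generator never calls a real random source; every "random" choice is read from a byte sequence, so changing one byte changes one choice and the output is still well-formed XML.

```python
import xml.etree.ElementTree as ET


class ParamSource:
    """Replays a byte sequence as the stream of 'random' choices."""

    def __init__(self, params: bytes):
        self.params = params
        self.pos = 0

    def next_byte(self) -> int:
        # Exhausted parameters are padded with zeros (an assumption).
        value = self.params[self.pos] if self.pos < len(self.params) else 0
        self.pos += 1
        return value

    def next_int(self, bound: int) -> int:
        return self.next_byte() % bound

    def next_bool(self) -> bool:
        return self.next_byte() % 2 == 1

    def next_string(self, max_len: int) -> str:
        length = 1 + self.next_int(max_len)
        return "".join(chr(ord("a") + self.next_int(26)) for _ in range(length))


def gen_xml(src: ParamSource, depth: int = 0) -> str:
    """Python analogue of the genXML generator, driven by params."""
    name = src.next_string(3)
    n_children = src.next_int(3) if depth < 2 else 0
    children = "".join(gen_xml(src, depth + 1) for _ in range(n_children))
    text = src.next_string(3) if src.next_bool() else ""
    return f"<{name}>{children}{text}</{name}>"


def is_well_formed(document: str) -> bool:
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False
```

For instance, `gen_xml(ParamSource(bytes([0, 5, 1, 0, 8, 0, 0])))` yields `<f><i></i></f>`. A byte-level mutator of the kind used earlier in the deck can be applied to the params unchanged, and each mutant still decodes to a syntactically valid document.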

  36. Zest: New bugs discovered
      • Google Closure Compiler: #2842, #2843, #3220, #3173
      • OpenJDK: JDK-8190332, JDK-8190511, JDK-8190512, JDK-8190997, JDK-8191023, JDK-8191076, JDK-8191109, JDK-8191174, JDK-8191073, JDK-8193444, JDK-8193877, CVE-2018-3214
      • Apache Commons: LANG-1385, COMPRESS-424, COLLECTIONS-714, CVE-2018-11771
      • Apache Ant: #62655
      • Apache Maven: #34, #57
      • Apache PDFBox: PDFBOX-4333, PDFBOX-4338, PDFBOX-4339, CVE-2018-8036
      • Apache TIKA: CVE-2018-8017, CVE-2018-12418
      • Apache BCEL: BCEL-303, BCEL-307, BCEL-308, BCEL-309, BCEL-310, BCEL-311, BCEL-312, BCEL-313
      • Mozilla Rhino: #405, #406, #407, #409, #410

  37. Zest finds complex semantic bugs
      On this JavaScript input, Google’s Closure compiler throws an “IllegalStateException: Unexpected variable” during optimization passes.
      (The JavaScript input was shown as an image on the slide.)


  39. Efficient Sampling of SAT and SMT Constraints
      Rafael Dutra, Kevin Laeufer, Jonathan Bachrach, and Koushik Sen
      EECS Department, UC Berkeley
      source: https://github.com/RafaelTupynamba/quicksampler

  40. Human Writes a Pre-condition on Inputs
      • An over-approximation of valid inputs
      • Restricts the set of inputs to be generated
      • Goal: sample inputs from the restricted input space
      In SMT (Satisfiability Modulo Theories):
          (x + y = 4 ∧ x ≥ 0 ∧ x < 4) ∧ (mem’[1] < 0 ∨ mem’[1] ≥ 4),
          where x = mem[0],
                y = mem[1],
                mem’ = store(mem, mem[0], -1 * mem[mem[0]]),
                mem ∈ Array(BV[4], BV[4])

  41. Sampling SAT and SMT Constraints
      Input: logical constraint (SAT formula)
      Goal: quickly generate lots of solutions that satisfy the constraint

      Formula:
          (x1 ∨ x4) ∧ (x1 ∨ ¬x3 ∨ ¬x8) ∧ (x1 ∨ x8 ∨ x6) ∧ (x2 ∨ x5) ∧
          (¬x7 ∨ ¬x3 ∨ x9) ∧ (¬x7 ∨ x8 ∨ ¬x9) ∧ (x7 ∨ x8 ∨ ¬x10) ∧ (x7 ∨ x10 ∨ ¬x6)

      Solutions:
                x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
          σ0     1  0  0  0  1  0  0  0  1  0
          σ1     0  0  0  1  1  0  0  1  1  0
          σ2     1  1  0  0  1  0  0  0  1  0
          σ3     0  1  0  1  1  0  0  1  1  0
          σ4     1  0  1  0  1  0  0  0  1  0
          σ5     1  1  1  0  1  0  0  0  1  0

  42. QuickSampler
      Our goals:
      • Generate samples >100x faster than other techniques
      • Sampling should be close to uniform
      Our approach:
      • Compute patterns of bit flips which preserve satisfiability
      • Combine those bit flip patterns to generate lots of samples
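The bit-flip idea can be tried on the ten-variable formula from the sampling slide. This is a sketch under assumptions rather than the QuickSampler implementation: the clauses are read as disjunctions joined by conjunction, the neighbouring solutions are taken from the slide's table instead of being computed with MAX-SAT, flip patterns are combined pairwise with bitwise OR, and every candidate is checked against the formula before it is kept.

```python
from itertools import combinations

# Positive i stands for x_i, negative i for NOT x_i (assumed CNF reading).
CLAUSES = [
    (1, 4), (1, -3, -8), (1, 8, 6), (2, 5),
    (-7, -3, 9), (-7, 8, -9), (7, 8, -10), (7, 10, -6),
]


def parse(bits: str) -> tuple:
    """Turn a string such as '1000100010' into a tuple of 0/1 values."""
    return tuple(int(b) for b in bits)


def satisfies(assignment: tuple) -> bool:
    """True if every clause contains at least one satisfied literal."""
    return all(
        any(assignment[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in clause)
        for clause in CLAUSES
    )


def flip_pattern(base: tuple, other: tuple) -> tuple:
    """Bits that differ between two solutions (XOR)."""
    return tuple(a ^ b for a, b in zip(base, other))


def combine(base: tuple, neighbours: list) -> set:
    """Apply pairwise ORs of flip patterns to base; keep the valid results."""
    patterns = [flip_pattern(base, n) for n in neighbours]
    samples = set()
    for p, q in combinations(patterns, 2):
        merged = tuple(a | b for a, b in zip(p, q))
        candidate = tuple(s ^ d for s, d in zip(base, merged))
        if satisfies(candidate):
            samples.add(candidate)
    return samples
```

Starting from σ0 with neighbours σ1, σ2, and σ4, the combination step reproduces σ3 and σ5 from the table and rejects the third candidate, which violates the clause (x1 ∨ ¬x3 ∨ ¬x8). New samples cost a few bitwise operations and one formula evaluation each, with no further solver call.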

  43. Formula φ(x0,x1,x2,x3,y0,y1,y2,y3)
      Built up step by step across the slides:
                                          x0 x1 x2 x3 y0 y1 y2 y3
          Random assignment σ’             0  0  1  0  1  1  0  0
          Solution σ (via MAX-SAT)         0  0  1  0  1  1  1  0
          σ0 (via MAX-SAT from σ)          1  0  1  0  0  1  1  0
          σ1                               0  1  1  1  1  0  1  0
      (A further row, beginning with 0, is cut off in the source.)
