Guarding Vulnerable Code: Module 1: Sanitization (Mathias Payer, Purdue University)


  1. Guarding Vulnerable Code: Module 1: Sanitization. Mathias Payer, Purdue University. http://hexhive.github.io

  2. Vulnerabilities everywhere?

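  3. Common Languages: TIOBE’18

     | Jul 2018 | Jul 2017 | Change | Language    | Ratings | Change |
     |----------|----------|--------|-------------|---------|--------|
     | 1        | 1        |        | Java        | 16.139% | +2.37% |
     | 2        | 2        |        | C           | 14.662% | +7.34% |
     | 3        | 3        |        | C++         | 7.615%  | +2.04% |
     | 4        | 4        |        | Python      | 6.361%  | +2.82% |
     | 5        | 7        | +      | VB .NET     | 4.247%  | +1.20% |
     | 6        | 5        | -      | C#          | 3.795%  | +0.28% |
     | 7        | 6        | -      | PHP         | 2.832%  | -0.26% |
     | 8        | 8        |        | JavaScript  | 2.831%  | +0.22% |
     | 9        | -        | ++     | SQL         | 2.334%  | +2.33% |
     | 10       | 18       | ++     | Objective-C | 1.453%  | -0.44% |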

  4. Software is highly complex
     ● Google Chrome: 76 MLoC
     ● Gnome: 9 MLoC
     ● Xorg: 1 MLoC
     ● glibc: 2 MLoC
     ● Linux kernel: 17 MLoC
     Low-level languages (C/C++) trade type safety and memory safety for performance.

  5. Defense: Testing vs. Mitigations
     Software Testing:
     ● Discover bugs
     ● Development tool
     ● Result oriented
     Mitigations:
     ● Stop exploitation
     ● Always on
     ● Low overhead

  6. Memory Corruption

  7. Memory error: invalid dereference
     ● Dangling pointer (temporal): free(foo); *foo = 23;
     ● Out-of-bounds pointer (spatial): char foo[40]; foo[42] = 23;
     Violation iff: pointer is read, written, or freed
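
The spatial case on this slide (writing index 42 of a 40-byte buffer) can be made concrete with an explicit bounds check. The following is a minimal sketch, not part of the original deck; `checked_store` is a hypothetical helper name chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the slides): performs the store only if idx
// lies inside the buffer. Returns true on success and false when the access
// would be out of bounds.
template <std::size_t N>
bool checked_store(std::array<char, N>& buf, std::size_t idx, char value) {
    if (idx >= N) {
        return false;  // spatial violation: refuse instead of corrupting memory
    }
    buf[idx] = value;
    return true;
}
```

With a `std::array<char, 40> foo`, `checked_store(foo, 42, 23)` is rejected, whereas the raw `foo[42] = 23` from the slide silently writes past the buffer.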

  8. Type Confusion

  9. Type confusion through downcasts
     Class hierarchy: Base is the parent of Greeter and Exec.
     Greeter *g = new Greeter();
     Base *b = static_cast<Base*>(g);   √ (upcast)
     Exec *e = static_cast<Exec*>(b);   X (downcast to the wrong type)

  10. C++ casting operations
      ● static_cast<ToClass>(Object)
        – Compile time check
        – No runtime type information
      ● dynamic_cast<ToClass>(Object)
        – Runtime check
        – Requires Runtime Type Information (RTTI)
        – Not used in performance critical code

  11. Static cast
      Base *b = …; a = static_cast<Greeter*>(b);
      movq -24(%rbp), %rax   # Load pointer
                             # Type "check" (no instructions emitted)
      movq %rax, -40(%rbp)   # Store pointer

  12. Dynamic cast (O2)
      Base *b = …; a = dynamic_cast<Greeter*>(b);
      leaq _ZTI7Greeter(%rip), %rdx
      leaq _ZTI4Base(%rip), %rsi
      xorl %ecx, %ecx
      movq %rbp, %rdi            # Load pointer
      call __dynamic_cast@PLT    # Type check
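
To illustrate the runtime check that `dynamic_cast` performs, here is a minimal sketch using simplified stand-ins for the deck's classes. The member functions and the `as_greeter` helper are assumptions made for this example; `Base` is given a virtual destructor so that RTTI is available.

```cpp
#include <cassert>

// Simplified stand-ins for the classes on the slides (assumed definitions).
struct Base { virtual ~Base() = default; };  // polymorphic, so RTTI exists
struct Greeter : Base { int greet() const { return 1; } };
struct Exec : Base { int run() const { return 2; } };

// Returns a Greeter* if b really points to a Greeter object, and nullptr
// otherwise. A static_cast would return a non-null pointer in both cases.
Greeter* as_greeter(Base* b) {
    return dynamic_cast<Greeter*>(b);
}
```

Passing the address of an `Exec` object yields `nullptr`, which the caller can test before use. That is the guarantee a `static_cast` gives up in exchange for emitting no check.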

  13. Type confusion
      class Base { int x; };
      class Greeter: Base { int y; virtual void Hi(); };
      Base *Bptr = new Base();
      Greeter *Gptr;
      Gptr = static_cast<Greeter*>(Bptr); // Type confusion
      Gptr->y = 0x43; // Memory safety violation!
      Gptr->Hi(); // Control-flow hijacking
      Figure: memory layout of the Base object (x only) compared with the Greeter view through Gptr (vtable*, x, y); vtable* and y fall outside the allocated Base object.

  14. Type Confusion Demo

  15. C++ virtual dispatch
      Class hierarchy: Base is the parent of Greeter and Exec.
      class Base { … };
      class Exec: public Base {
      public:
        virtual void exec(const char *prg) { system(prg); }
      };
      class Greeter: public Base {
      public:
        virtual void sayHi(const char *str) { std::cout << str << std::endl; }
      };
      Greeter *greeter = new Greeter();
      greeter->sayHi("Oh, hello there!");

  16. Simple exploitation demo
      int main() {
        Base *b1 = new Greeter();
        Base *b2 = new Exec();
        Greeter *g;
        g = static_cast<Greeter*>(b1);
        g->sayHi("Greeter says hi!"); // g[0][0](str);
        g = static_cast<Greeter*>(b2);
        g->sayHi("/usr/bin/xcalc"); // g[0][0](str);
        delete b1;
        delete b2;
        return 0;
      }
      Figure: the vtable* of b1 points to GreeterT and the vtable* of b2 points to ExecT, so the second sayHi call dispatches through ExecT.

  17. Sanitization

  18. Problem: broken abstractions?
      C/C++:
      void log(int a) { printf("Log: "); printf("%d", a); }
      void (*fun)(int) = &log;
      void init() { fun(15); }
      ASM:
      log: ...
      fun: .quad log
      init: ...
        movl $15, %edi
        movq fun(%rip), %rax
        call *%rax

  19. LLVM Sanitization
      ● Test cases detect bugs through assertions, segmentation faults, traps, exceptions
      ● Enforce stronger policies during testing!
        – Address Sanitizer: memory safety
        – Leak Sanitizer: memory leaks
        – Memory Sanitizer: uninitialized memory
        – UBSan: undefined behavior
        – Thread Sanitizer: data races
        – HexVASAN: variadic argument checker
        – HexType: type safety

  20. Type Safety

  21. Type confusion detection*
      ● A static cast is checked only at compile time
        – Fast but no runtime guarantees
      ● Dynamic casts are checked at runtime
        – High overhead, limited to polymorphic classes
      ● HexType design:
        – Conceptually check all casts dynamically
        – Aggressively optimize design and implementation
      * TypeSanitizer: Practical Type Confusion Detection. Istvan Haller, Yuseok Jeon, Hui Peng, Mathias Payer, Herbert Bos, Cristiano Giuffrida, Erik van der Kouwe. In CCS'16.
      * HexType: Efficient Detection of Type Confusion Errors for C++. Yuseok Jeon, Priyam Biswas, Scott A. Carr, Byoungyoung Lee, and Mathias Payer. In CCS'17.

  22. Making type checks explicit
      ● Enforce runtime check at all cast sites
        – static_cast<ToClass>(Object)
        – dynamic_cast<ToClass>(Object)
        – reinterpret_cast<ToClass>(Object)
        – (ToClass)(Object)
      ● Build global type hierarchy
      ● Keep track of the allocation type of each object
        – Must instrument all forms of allocation
        – Requires disjoint metadata
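
The three ingredients listed on this slide (a global type hierarchy, per-object allocation types kept in disjoint metadata, and a check at each cast site) can be sketched as follows. This is an illustrative toy, not HexType's actual data structures: the names `Hierarchy`, `ObjectTypes`, `is_subtype`, and `cast_is_safe` are hypothetical, types are identified by plain strings, and the hierarchy is assumed to be single-inheritance and acyclic.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical global type hierarchy: child type name -> parent type name.
using Hierarchy = std::unordered_map<std::string, std::string>;

// Hypothetical disjoint metadata: object address -> allocation type name.
// It lives outside the objects themselves, so object layout is unchanged.
using ObjectTypes = std::unordered_map<const void*, std::string>;

// True if `type` equals `target` or derives (transitively) from it.
// Assumes the hierarchy is acyclic.
bool is_subtype(const Hierarchy& h, std::string type, const std::string& target) {
    while (true) {
        if (type == target) return true;
        auto it = h.find(type);
        if (it == h.end()) return false;  // reached a root without a match
        type = it->second;
    }
}

// Check run at a cast site: the cast is accepted only if the object is
// tracked and its allocation type is compatible with the target type.
bool cast_is_safe(const Hierarchy& h, const ObjectTypes& objs,
                  const void* obj, const std::string& target) {
    auto it = objs.find(obj);
    if (it == objs.end()) return false;  // untracked: treated as unsafe here
    return is_subtype(h, it->second, target);
}
```

In this sketch an object recorded as `Base` fails a cast to `Greeter`, while an object recorded as `Greeter` passes casts to both `Greeter` and `Base`. Treating untracked objects as unsafe is a choice made for the example only.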

  23. HexType: design
      Figure (components): source code; HexType Clang and LLVM pass; instrumentation (type casting verification); type hierarchy information; HexType runtime library; link; HexType binary.

  24. HexType: aggressive optimization
      ● Limit tracing to unsafe types
        – Remove tracing of types that are never cast
      ● Limit checking to unsafe casts
        – Remove statically verifiable casts
      ● No more RTTI for dynamic casts
        – Replace dynamic casts with fast lookup

  25. Demo Time!

  26. HexType coverage

  27. Newly discovered bugs
      ● Discovered seven new vulnerabilities
      Figure: class hierarchies with type confusion. Apache Xerces C++: DOMNode, DOMCharacterData, DOMElement, DOMText, DOMElementImpl, DOMTextImpl. Qt base library: QMapNodeBase, QMapNode.

  28. Sanitizer Summary: Type Safety
      ● Type confusion fundamental in today’s exploits
      ● Existing sanitizers are incomplete, partial, slow
      ● HexType
        – (Almost) full coverage (2-6x increase)
        – Reasonable overhead (SPEC CPU: 0-32x improvement, Firefox: 0-0.5x slowdown)
        – Future work: remaining coverage, optimizations

  29. T-Fuzz

  30. Fuzzing Challenges
      ● Challenges
        – Shallow coverage
        – Hard to find “deep” bugs
      ● Root cause
        – Fuzzer-generated inputs cannot bypass complex sanity checks in the target program
        – Existing work limits itself to input generation
      Figure: program from start to end with shallow code paths and deep code paths; check1, check2, and check3 guard the bug.

  31. T-Fuzz: Fuzz the Program!
      ● Option 1: generate input to bypass checks by heavy-weight program analysis techniques
        – Driller (concolic analysis)
        – VUzzer (dynamic taint analysis)
      ● Our idea: remove program’s sanity checks
        – Checks filter orthogonal input, e.g., magic values, checksum, or hashes (Non-Critical Check, NCC)
        – Insight: removing NCCs is safe
      if (strncmp(hdr, "ELF", 3) == 0) {
        // main program logic
      } else {
        error();
      }

  32. Design and Implementation
      ● Fuzzer generates inputs
      ● When “stuck”
        – Detect NCCs*
      ● Transform program
      ● Verify crashes
      Figure: the fuzzer (e.g. AFL) sends inputs to the program; the program transformer produces transformed programs; crashing inputs go to the crash analyzer, which yields bug reports and false positives.
      * Approximation of NCCs: edges in the CFG connecting covered/uncovered nodes

  33. Detecting NCCs
      ● Approximate NCCs as edges connecting covered and uncovered nodes in CFG
        – Over approximate, may contain false positives
        – Lightweight and simple to implement
      Figure legend: covered node, uncovered node, NCC candidates.
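
The approximation on this slide (NCC candidates are CFG edges that lead from a covered node to an uncovered one) can be sketched in a few lines. This is a toy model under stated assumptions: nodes are plain integers, the CFG is an edge list, and `ncc_candidates` is a hypothetical name, not T-Fuzz's API.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

using Node = int;                    // assumed: basic blocks numbered by int
using Edge = std::pair<Node, Node>;  // (source, destination)

// Returns every CFG edge whose source was covered by fuzzing but whose
// destination was not. These edges over-approximate the NCCs.
std::vector<Edge> ncc_candidates(const std::vector<Edge>& cfg_edges,
                                 const std::set<Node>& covered) {
    std::vector<Edge> candidates;
    for (const Edge& e : cfg_edges) {
        if (covered.count(e.first) != 0 && covered.count(e.second) == 0) {
            candidates.push_back(e);
        }
    }
    return candidates;
}
```

For the edge list (0,1), (0,2), (1,3), (2,3) with covered nodes {0, 2, 3}, the only candidate is (0,1): the fuzzer reached node 0 but never took the branch into node 1.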

  34. Program Transformation
      ● Our approach: negate NCCs
        – Simple: static binary rewriting
        – Zero runtime overhead in resulting target program
        – Unchanged CFG
        – Trace in transformed program maps to original program
        – Path constraints of original program can be recovered
      Figure: original CFG (start, check A == B, true branch, false branch, end) next to the transformed CFG with the negated check A != B.
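
The effect of negating a check can be shown at source level, using the magic-value check from the earlier T-Fuzz slide. Note the assumption: the deck describes static binary rewriting, whereas this sketch models the same idea in C++ source. The names `Outcome`, `original`, and `transformed` are hypothetical.

```cpp
#include <cassert>
#include <cstring>

enum class Outcome { MainLogic, Error };

// Original program: the magic-value check guards the main logic.
Outcome original(const char* hdr) {
    if (std::strncmp(hdr, "ELF", 3) == 0) {
        return Outcome::MainLogic;
    }
    return Outcome::Error;
}

// Transformed program: same two branches and same structure, but the
// condition is negated, so inputs without the magic value reach the
// main logic.
Outcome transformed(const char* hdr) {
    if (std::strncmp(hdr, "ELF", 3) != 0) {
        return Outcome::MainLogic;
    }
    return Outcome::Error;
}
```

An input such as `"AAAA"` is rejected by `original` but reaches the main logic in `transformed`. Because both versions have the same branches, a trace through the transformed version corresponds to a path in the original one.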

  35. Comparison to Symbolic Execution
      ● Explores all code paths, tracks constraints
      ● Path explosion, e.g., loops
      ● Each branch doubles the number of code paths
      ● Resource requirement
      ● Theoretically beautiful, limited scalability
      Figure: tree of explored paths yielding (Path 1, constraint set 1), ..., (Path n, constraint set n).
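
The statement that each branch doubles the number of code paths means that n independent two-way branches give 2^n paths. A small sketch makes the growth tangible; `path_count` is a hypothetical helper, and the model assumes every branch is independent and two-way.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct paths through n independent two-way branches.
// Valid for n < 64, since the result must fit in 64 bits.
std::uint64_t path_count(unsigned n) {
    return std::uint64_t{1} << n;
}
```

Ten branches already give 1024 paths and thirty give over a billion, which is why exhaustive exploration scales poorly.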

  36. Comparison to Concolic Execution
      ● Guided by concrete inputs
      ● Follows single code path, collects constraints for new code paths
      ● Reduced resource requirements
      ● Still an exponential number of paths to explore!
      Figure: a concrete input follows one side of a branch (C1 or Not C1) through the path tree.

  37. Comparison to Driller (Fuzz & CE)
      ● Fuzzing until coverage wall
      ● When fuzzing gets “stuck”, concolic execution explores new code paths using fuzzer generated inputs
      ● Limitations
        – “SE & constraints solving” slows down fuzzing
        – Not able to bypass “hard” checks
      Figure: fuzzer (mutating), SE & constraint solving, inputs, target program, crashes.

  38. T-Fuzz: fuzz first, solve only crashes
      ● Fuzzing/SE decoupled
      ● SE only applied to detected crashes
      ● For “hard” checks, T-Fuzz detects the guarded bug, but cannot verify it
      Figure (T-Fuzz in action): fuzzer, program transformation, program, crashes, SE & constraints solving.

  39. Evaluation
      ● Implementation
        – Fuzzer: shellphish fuzzer (python wrapper of AFL)
        – Program Transformer: angr tracer, radare2
        – Crash Analyzer: 2k LoC Python hackery
      ● Evaluation
        – DARPA CGC dataset
        – LAVA-M dataset
        – 4 real-world programs
