 
Securing Software Systems by Preventing Information Leaks
Kangjie Lu, Georgia Institute of Technology

Computer devices are everywhere

Foundational software systems

Inherent insecurity: Vulnerabilities and insecure designs
• Implemented in unsafe languages (e.g., C/C++)
  – Increasing vulnerabilities
  [Chart: Number of reported vulnerabilities in Linux. Data source: U.S. National Vulnerability Database]
• System designers prioritize performance over security
  – Many insecure designs

Critical system attacks exploiting vulnerabilities and insecure designs
System attacks are evolving: they are increasingly advanced and increasingly hard to defend against

Two typical goals of system attacks
• Control attacks: to control victim systems
• Data leaks: to leak sensitive data

Defeating both data leaks and control attacks by preventing information leaks

A fundamental requirement of control attacks
Attackers have to replace a code pointer with a malicious one to gain control
[Diagram: memory before and after the attack. Overwriting control data replaces a code pointer with a malicious pointer that points to malicious code pieces.]
Attackers have to know the addresses of both a code pointer and the malicious code
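
The two address requirements can be illustrated with a harmless, self-contained sketch. Everything here is hypothetical and for illustration only: `struct victim`, the two handlers, and `arbitrary_write` (a stand-in for whatever memory-corruption primitive an attacker might have). The point is that the write needs both a destination (where the code pointer lives) and a value (where the replacement code lives).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handlers standing in for legitimate and attacker-chosen code. */
static int legit_handler(void)     { return 1; }
static int malicious_handler(void) { return 2; }

/* A victim object that keeps a code pointer next to a data buffer. */
struct victim {
    char buf[16];
    int (*handler)(void);
};

/*
 * Models the memory-write primitive of an attack: copy `len` bytes to
 * `where`. The caller must supply both the destination (the address of
 * the code pointer) and the payload (the address of the malicious code).
 */
static void arbitrary_write(void *where, const void *what, size_t len)
{
    memcpy(where, what, len);
}

/* Runs the scenario and returns what the hijacked call evaluates to. */
int run_hijack_demo(void)
{
    struct victim v;
    int (*target)(void) = malicious_handler;

    memset(&v, 0, sizeof v);
    v.handler = legit_handler;
    assert(v.handler() == 1);          /* normal behavior before the write */

    /* Requirement 1: the address of the code pointer (&v.handler).
       Requirement 2: the address of the malicious code (target). */
    arbitrary_write(&v.handler, &target, sizeof target);

    return v.handler();                /* the call now reaches the new target */
}
```

In this toy both addresses are known to the program itself. A real attacker has to learn them first, which is exactly what address randomization tries to prevent.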

A widely deployed defense: ASLR
ASLR: Address Space Layout Randomization
– Preventing attackers from knowing addresses
[Diagram: code/data is placed at a different memory address on each run (1st run, 2nd run, 3rd run, ..., nth run); 2^40 possibilities]

In principle, ASLR is “perfect”
[Diagram: memory with randomized addresses]
ASLR is efficient, easy to deploy, and effective as long as there is no information leak

In practice, ASLR is weak
[Chart: Number of reported information-leak vulnerabilities per year, 1999 to 2016. Data source: U.S. National Vulnerability Database]
Control attacks still work because of information leaks

ASLR re-defines the prevention problem in modern systems
[Diagram: ASLR turns the control-attack prevention problem into the address-leak prevention problem]
Preventing address leaks can defeat control attacks

Information leaks are inevitable for both attacks
[Diagram: exploiting information leaks leads directly to data leaks and, by bypassing ASLR, to control attacks]

Research goal: Preventing information leaks
[Diagram: exploiting information leaks leads directly to data leaks and, by bypassing ASLR, to control attacks]

Root causes of known information leaks
Category         Type                 Example
Vulnerabilities  Memory error         Uninitialized read
Vulnerabilities  Logic error          Missing check
Vulnerabilities  Hardware error       Row hammer
Design flaws     Specification issue  Uninitialized padding
Design flaws     Organization issue   Refork-on-crash
Design flaws     Mechanism issue      Deduplication + COW
Design flaws     Side channels        AnC on MMU

Three ways to prevent information leaks
Eliminating information-leak vulnerabilities
• UniSan: Eliminating uninitialized data leaks [CCS’16]
• PointSan: Eliminating uninitialized pointers [NDSS’17]
Securing system designs against information leaks
• Runtime re-randomization for process forking [NDSS’16]
Protecting sensitive data from information leaks
• ASLR-Guard: Preventing code pointer leaks [CCS’15]
• Buddy: Detecting memory disclosures for COTS

Motivation of UniSan
OS kernels are the trusted computing base
– Contain sensitive data like crypto keys
– Deploy security mechanisms like ASLR
Hundreds of information-leak vulnerabilities
– Data leaks
– ASLR bypass

UniSan: To eliminate (the most common) information-leak vulnerabilities in OS kernels
→ Mitigate data leaks, code-reuse attacks, and privilege-escalation attacks

Main contributions of UniSan
• Automatically secure the Linux and Android kernels with negligible runtime overhead
• Reported and patched 19 kernel vulnerabilities
  – CVE-2016-5243, CVE-2016-5244, CVE-2016-4569, CVE-2016-4578, CVE-2016-4485, CVE-2016-4486, CVE-2016-4482, ...
• Found and fixed a critical security problem in compilers
• Porting UniSan to GCC for adoption

The main cause of information leaks: Uninitialized data read
[Pie chart: causes of kernel information leaks]
• Uninitialized data read: 57% (targeted by UniSan)
• Out-of-bound read & use-after-free: 29%
• Logic error (e.g., missing check): 14%
Data source: U.S. National Vulnerability Database (kernel information leaks reported between 2013 and 2016)

How an uninitialized data read leads to an information leak
1. User A allocates object A in kernel space and writes “sensitive” into it
2. User A deallocates object A; “sensitive” is not cleared
3. User B allocates object B without initialization; “sensitive” is kept
4. User B reads object B; “sensitive” is leaked to user space!
We call such information leaks “uninitialized data leaks”
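
The four steps can be replayed in user space with a toy allocator. This is only a sketch under stated assumptions: `toy_alloc`, `toy_zalloc`, `toy_free`, and `replay_leak` are hypothetical names, and a single static slot stands in for a kernel object cache that hands the same memory to the next caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 32

/* A toy one-slot allocator standing in for a kernel object cache. */
static unsigned char pool[SLOT_SIZE];
static int pool_in_use;

/* Returns the slot WITHOUT clearing it, like kmalloc without __GFP_ZERO. */
void *toy_alloc(void)
{
    if (pool_in_use)
        return NULL;
    pool_in_use = 1;
    return pool;
}

/* Same allocator, but zeroes the slot first (the fix). */
void *toy_zalloc(void)
{
    void *p = toy_alloc();
    if (p)
        memset(p, 0, SLOT_SIZE);
    return p;
}

/* Releases the slot; the old contents stay in memory. */
void toy_free(void *p)
{
    (void)p;
    pool_in_use = 0;
}

/*
 * Replays the four steps. Returns 1 if user B can see user A's data in
 * its fresh object, 0 otherwise. `zero_on_alloc` selects the allocator
 * used for object B.
 */
int replay_leak(int zero_on_alloc)
{
    static const char secret[] = "sensitive";
    char *a, *b;
    int leaked;

    a = toy_alloc();                       /* 1: user A allocates object A   */
    assert(a != NULL);
    memcpy(a, secret, sizeof secret);      /*    and writes "sensitive"      */
    toy_free(a);                           /* 2: freed, contents not cleared */

    b = zero_on_alloc ? toy_zalloc()       /* 3: user B allocates object B   */
                      : toy_alloc();
    assert(b != NULL);
    leaked = memcmp(b, secret, sizeof secret) == 0;   /* 4: user B reads it  */
    toy_free(b);
    return leaked;
}
```

Calling `replay_leak(0)` models the vulnerable path and `replay_leak(1)` models the same path with zeroing on allocation.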

Troublemaker: Developer
Missing field initialization: Blame the developers?
Difficult to avoid inconsistency
– Too complex
[Diagram: data struct definition vs. object use/init]

Troublemaker: Compiler
Data structure padding: A fundamental feature for improving CPU efficiency

    struct test {
        unsigned int a;
        unsigned char b;
        /* 3-bytes padding */
    };

    /* both fields (5 bytes) are initialized */
    struct test t = {
        .a = 0,
        .b = 0
        /* 3 uninitialized padding bytes */
    };

    /* leaking uninitialized 3-byte padding */
    copy_to_user(dest, &t, sizeof(t));

A critical and prevalent security problem: Programs are built by compilers!
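
The padding problem can be made observable without relying on what a particular compiler does to a real stack object. The sketch below is an emulation, not the slide's kernel code: it writes only the bytes that belong to the fields of `struct test` into a buffer pre-filled with a marker value, then counts the bytes that were never written. `STALE`, `count_stale_bytes`, and `stale_bytes_in_test` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct test {
    unsigned int a;
    unsigned char b;
    /* the compiler typically adds trailing padding here */
};

#define STALE 0xAA  /* marker byte standing in for leftover stack data */

/*
 * Fills `out` (sizeof(struct test) bytes) the way field-by-field
 * initialization would: only the bytes that belong to `a` and `b` are
 * written; everything else keeps whatever was there before.
 * Returns the number of bytes that still hold stale data.
 */
size_t count_stale_bytes(unsigned char *out)
{
    const unsigned int a = 0;
    const unsigned char b = 0;
    size_t i, stale = 0;

    assert(out != NULL);
    memset(out, STALE, sizeof(struct test));            /* old stack contents */
    memcpy(out + offsetof(struct test, a), &a, sizeof a);
    memcpy(out + offsetof(struct test, b), &b, sizeof b);

    for (i = 0; i < sizeof(struct test); i++)
        if (out[i] == STALE)
            stale++;
    return stale;
}

/* Convenience wrapper that supplies a correctly sized buffer. */
size_t stale_bytes_in_test(void)
{
    unsigned char image[sizeof(struct test)];
    return count_stale_bytes(image);
}
```

On ABIs where `unsigned int` is 4 bytes with 4-byte alignment, the result matches the 3 padding bytes named on the slide; copying `sizeof(struct test)` bytes out of such an object would carry those stale bytes along.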

The root cause: C specifications
(C11) Chapter §6.2.6.1/6:
“When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.”

UniSan: A compiler-based solution
Simply initialize all allocated objects? Too expensive!
The UniSan approach:
[Diagram: kernel source code → LLVM IR → detecting unsafe allocations → initializing unsafe allocations → hardened LLVM IR → secured kernel image]

Unsafe allocation detection
Byte-level and flow-, context-, and field-sensitive taint tracking
[Diagram: data flow from sources (i.e., allocations) to sinks (e.g., copy_to_user), checked by reachability analysis and initialization analysis]

Technical challenges in detection
• Global call-graph construction
  – Conservative type analysis for indirect calls
• Byte-level tracking
  – Maintaining offsets of fields
• Eliminating false negatives
  – Be conservative! Assume it is unsafe for unhandled special cases!
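
The idea of byte-level tracking with field offsets can be sketched as a small bookkeeping structure. This is a runtime toy, not UniSan's static analysis over LLVM IR: `struct init_map` and its functions are hypothetical, one bit records whether each byte of an object has been written, and an object reaching a sink is flagged as unsafe if any bit is still clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example object: two fields plus (typically) trailing padding. */
struct test {
    unsigned int a;
    unsigned char b;
};

/* One bit per byte of the tracked object (objects of up to 64 bytes). */
struct init_map {
    uint64_t bits;
    size_t   size;
};

struct init_map init_map_new(size_t size)
{
    struct init_map m = { 0, size };
    assert(size <= 64);
    return m;
}

/* Records that bytes [offset, offset + len) have been written. */
void init_map_mark(struct init_map *m, size_t offset, size_t len)
{
    size_t i;
    assert(offset + len <= m->size);
    for (i = 0; i < len; i++)
        m->bits |= (uint64_t)1 << (offset + i);
}

/* At a sink: the allocation is unsafe if any byte was never written. */
int init_map_is_unsafe(const struct init_map *m)
{
    uint64_t all = (m->size == 64) ? UINT64_MAX
                                   : (((uint64_t)1 << m->size) - 1);
    return (m->bits & all) != all;
}

/* Field-wise initialization of struct test: padding bytes stay unmarked. */
int fieldwise_init_is_unsafe(void)
{
    struct init_map m = init_map_new(sizeof(struct test));
    init_map_mark(&m, offsetof(struct test, a), sizeof(unsigned int));
    init_map_mark(&m, offsetof(struct test, b), sizeof(unsigned char));
    return init_map_is_unsafe(&m);
}

/* Whole-object initialization (e.g., memset) covers every byte. */
int whole_init_is_unsafe(void)
{
    struct init_map m = init_map_new(sizeof(struct test));
    init_map_mark(&m, 0, sizeof(struct test));
    return init_map_is_unsafe(&m);
}
```

Tracking at field granularity alone would report `struct test` as fully initialized; tracking bytes is what exposes the padding.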

Zero-initializing all unsafe allocations
Stack:
    obj = 0
    memset(&obj, 0, sizeof(obj))
Heap:
    kmalloc(size, flags | __GFP_ZERO)
Zero initialization is semantics-preserving
– Robust
– Tolerant of false positives
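
A user-space analog of both cases might look like the following sketch. The assumptions are labeled: `struct reply` and the helper names are hypothetical, and `calloc` stands in for `kmalloc` with `__GFP_ZERO`, which is not available outside the kernel. The stack case relies on member stores leaving padding untouched, which mainstream compilers do in practice even though C11 §6.2.6.1/6 formally leaves padding values unspecified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct reply {
    unsigned int  id;
    unsigned char flag;
    /* trailing padding on typical ABIs */
};

/* Stack case: clear the whole object, padding included, then set fields. */
void reply_init(struct reply *r, unsigned int id, unsigned char flag)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);   /* sizeof *r (the object), not sizeof r (a pointer) */
    r->id = id;
    r->flag = flag;
}

/* Heap case: calloc is the user-space analog of kmalloc(..., __GFP_ZERO). */
struct reply *reply_alloc(void)
{
    return calloc(1, sizeof(struct reply));
}

/* Returns 1 if every one of the n bytes at p is zero. */
int all_zero(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    size_t i;
    for (i = 0; i < n; i++)
        if (bytes[i] != 0)
            return 0;
    return 1;
}

/* Starts from a dirty stack object and checks that nothing stale survives. */
int stack_reply_is_clean(void)
{
    struct reply r;
    memset(&r, 0xFF, sizeof r);          /* simulate leftover stack data */
    reply_init(&r, 0, 0);
    return all_zero(&r, sizeof r);
}

/* Checks that a freshly allocated heap object contains only zero bytes. */
int heap_reply_is_clean(void)
{
    struct reply *r = reply_alloc();
    int clean;
    if (r == NULL)
        return 0;
    clean = all_zero(r, sizeof *r);
    free(r);
    return clean;
}
```

Zeroing an object that did not strictly need it changes no program behavior, which is why a conservative detector can afford false positives.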

LLVM-based implementation
An analysis pass + an instrumentation pass
How to use UniSan:
    $ unisan @bitcode.list

UniSan is performant and effective
Applied to the latest Linux kernel and Android kernel
Accuracy: 10% (2K) of allocations are detected as unsafe
Performant: negligible runtime overhead
• System operations: 1.36%
• Web servers: <0.1%
• User programs: 0.54%
Effective: prevented known and new vulnerabilities
• 19 have been confirmed and fixed by Google and Linux

Three ways to prevent information leaks
Eliminating information-leak vulnerabilities
• UniSan: Eliminating uninitialized data leaks [CCS’16]
• PointSan: Eliminating uninitialized pointers [NDSS’17]
Securing system designs against information leaks
• Runtime re-randomization for process forking [NDSS’16]
Protecting sensitive data from information leaks
• ASLR-Guard: Preventing code pointer leaks [CCS’15]
• Buddy: Detecting memory disclosures for COTS

Insecure process forking violates ASLR
A common design of web servers:
[Diagram: a master process calls fork() to create worker processes; each worker serves HTTP(S) and has the same code/data layout as the master]
• Exactly the same memory layout in every worker
• Re-fork upon worker crashes

The clone-probing attack
Attack goal: To guess sensitive data (say, a randomized return address) with a simple buffer overflow
[Diagram: stack of a web server; a buffer overflow (AAAAAAA) overwrites the return address 12 34 56 78 9a bc ed f0 one byte at a time]
• Overwrite the first byte with 00: crash, try another one
• Overwrite the first byte with 01: crash, try another one
• ...
• Overwrite the first byte with 12: bingo, continue to guess the next byte
• ...
• Finally, get all bytes: 12 34 56 78 9a bc ed f0
Brute-forcing complexity is reduced from 2^64 to 8*2^8
Usually can be done within two minutes.
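
The complexity reduction can be checked with a small simulation. This sketch runs entirely in one process and attacks nothing: `worker_survives` is a hypothetical oracle that models a re-forked worker (same secret every time) surviving only when every overwritten byte matches, and the secret bytes are the ones shown on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_BYTES 8

/* The worker's secret; identical in every re-forked clone. */
static const uint8_t secret[ADDR_BYTES] =
    { 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xed, 0xf0 };

/*
 * Simulated worker: the overflow overwrites the first `n` bytes of the
 * return address with `guess`. Returns 1 if the worker survives (all
 * overwritten bytes match), 0 if it crashes and gets re-forked.
 */
static int worker_survives(const uint8_t *guess, size_t n)
{
    return memcmp(guess, secret, n) == 0;
}

/*
 * Recovers the secret one byte at a time and stores it in `out`.
 * Returns the number of probes (requests) sent.
 */
unsigned clone_probe(uint8_t out[ADDR_BYTES])
{
    unsigned probes = 0;
    size_t pos;

    for (pos = 0; pos < ADDR_BYTES; pos++) {
        unsigned v;
        for (v = 0; v < 256; v++) {
            out[pos] = (uint8_t)v;
            probes++;
            if (worker_survives(out, pos + 1))
                break;              /* bingo: move on to the next byte */
        }
        assert(v < 256);            /* one of the 256 values must match */
    }
    return probes;
}

/* Convenience wrapper: returns the probe count on success, 0 on failure. */
unsigned run_clone_probe(void)
{
    uint8_t got[ADDR_BYTES];
    unsigned probes = clone_probe(got);
    return memcmp(got, secret, ADDR_BYTES) == 0 ? probes : 0;
}
```

Because each byte is confirmed independently, the search needs at most 8 * 256 = 2048 probes instead of up to 2^64. The approach only works because every re-forked worker keeps the same layout and secret values.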