Security through Multi-Layer Diversity Meng Xu (Qualifying - PowerPoint PPT Presentation

Security • RIPE Benchmark Config Succeed Probabilistic Failed Not possible 114 16 720 2990 Default 8 0 842 2990 AddressSanitizer 8 0 842 2990 Bunshin • Real-world CVEs Config CVE Exploits Sanitizer Detect 2013-2028 Blind ROP AddressSanitizer nginx-1.4.0 2016-5636 Integer overflow AddressSanitizer cpython-2.7.10 2015-4602 Type confusion AddressSanitizer php-5.6.6 2014-0160 Heartbleed AddressSanitizer openssl-1.0.1a 2014-3581 Null dereference UndefinedBehaviorSanitizer httpd-2.4.10 44

Performance Benchmark Items Strict-Lockstep Selective-Lockstep Max 17.5% 14.7% SPEC CPU2006 Min 1.6% 1.0% (19 Programs) Ave 8.6% 5.6% Max 21.4% 18.9% SPLASH-2X / PARSEC Min 10.7% 6.6% (19 Programs) Ave 16.6% 14.5% lighttpd Ave 1.44% 1.21% 1MB File Request nginx Ave 1.71% 1.41% 1MB File Request 45

Performance Highlights • Low overhead (5% - 16%) for standard benchmarks • Negligible overhead (<= 2%) for server programs • Extra cost of ensuring weak determinism is 8% • Selective-lockstep saves around 3% overhead 46

Scalability - Number of Variants Ave Max Min 37.6 Sync Overhead (%) 20.9 17.2 11.4 11.2 10.5 6.6 4.4 1.7 0.6 0.5 0 2 4 6 8 Number of variants 47

Scalability - Number of Variants Ave Max Min 37.6 Sync Overhead (%) The number of variants Bunshin can 20.9 17.2 support with a reasonable overhead depends on machine configurations 11.4 11.2 10.5 and program characteristics. 6.6 4.4 1.7 0.6 0.5 0 2 4 6 8 Number of variants 48

Scalability - System Load Ave Max Min 13 9.7 Sync Overhead (%) 6.6 6.4 4.8 2.2 1.9 0.8 0.2 2% 50% 99% Number of variants 49

Scalability - System Load Ave Max Min 13 9.7 Sync Overhead (%) 6.6 6.4 Bunshin works well in all levels of system load 4.8 (i.e., Bunshin does not require exclusive cores) 2.2 1.9 0.8 0.2 2% 50% 99% Number of variants 50

Check Distribution - ASan 107 107 Overhead (%) Overhead (%) 65.6 63 57.4 43.1 37.2 34.8 34.9 Whole V1 V2 Bunshin Whole V1 V2 V3 Bunshin 51

Sanitizer Distribution - UBSan 228 228 Overhead (%) Overhead (%) 129 125 124 94.5 88 78.7 77.2 Whole V1 V2 Bunshin Whole V1 V2 V3 Bunshin 52

Unifying LLVM Sanitizers ASan MSan UBSan Bunshin 248 246 208 207 205 191 189 177 172 Overhead (%) 165 158 148 141 116 112 98.9 gobmk povray h264ref average 53

Unifying LLVM Sanitizers ASan MSan UBSan Bunshin 248 246 208 207 205 191 189 177 172 Overhead (%) With an average of 5% more slowdown, 165 158 Bunshin can seamlessly unify all three 148 141 LLVM sanitizers 116 112 98.9 gobmk povray h264ref average 54

Limitations and Future Work • Finer-grained check distribution • Sanitizer integration • Record-and-replay 55

Conclusion • It is feasible to achieve both comprehensive protection and high throughput with an N-version system • Bunshin is e ff ective in reducing slowdown caused by sanitizers • 107% → 47.1% for ASan, 228% → 94.5% for UBSan • Bunshin can seamlessly unify three LLVM sanitizers with 5% extra slowdown https://github.com/sslab-gatech/bunshin (Source code will be released soon) 56

Enhance System Security Through Diversity Input Virtualization Input MSan UBSan ASan Bunshin (ATC’17) HHVM Zend Zend JPHP Future work Linux Linux Windows MacOS PlatPal (Security’17) Software Variant 1 Variant 2 Variant 3 Stack Output Synchronize Execution & Consolidate Outputs Output 57

PlatPal: Detecting Malicious Documents with Platform Diversity Meng Xu and Taesoo Kim Georgia Tech Presented at the 2017 USENIX Security Symposium (Security’17) 58

Malicious Documents On the Rise 59

Adobe Components Exploited Element parser JavaScript engine 137 CVEs in 2015 Font manager 227 CVEs in 2016 System dependencies 62

Maldoc Formula Flexibility of doc spec More opportunities A large attack surface to profit Less caution from users 63

Battle against Maldoc - A Survey Category Focus Work Year Detection JavaScript PJScan 2011 Lexical analysis JavaScript Vatamanu et al. 2012 Token clustering JavaScript Lux0r 2014 API reference classification JavaScript MPScan 2013 Shellcode and opcode sig Static Metadata PDF Malware Slayer 2012 Linearized object path Metadata Srndic et al. 2013 Hierarchical structure Metadata PDFrate 2012 Content meta-features Both Maiorca et al. 2016 Many heuristics combined JavaScript MDScan 2011 Shellcode and opcode sig JavaScript PDF Scrutinizer 2012 Known attack patterns JavaScript ShellOS 2011 Memory access patterns Dynamic JavaScript Liu et al. 2014 Common attack behaviors Memory CWXDetector 2012 Violation of invariants 64

Reliance on External PDF Parser Category Focus Work Year Detection External Parser ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Metadata PDFrate 2012 Content meta-features Yes Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig Yes JavaScript PDF Scrutinizer 2012 Known attack patterns Yes JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors No Memory CWXDetector 2012 Violation of invariants No 65

Reliance on External PDF Parser Category Focus Work Year Detection External Parser ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Parser-confusion attacks Metadata PDFrate 2012 Content meta-features Yes (Carmony et al., NDSS’16) Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig Yes JavaScript PDF Scrutinizer 2012 Known attack patterns Yes JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors No Memory CWXDetector 2012 Violation of invariants No 66

Reliance on Machine Learning Category Focus Work Year Detection Machine Learning ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Metadata PDFrate 2012 Content meta-features Yes Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig No JavaScript PDF Scrutinizer 2012 Known attack patterns No JavaScript ShellOS 2011 Memory access patterns No Dynamic JavaScript Liu et al. 2014 Common attack behaviors No Memory CWXDetector 2012 Violation of invariants No 67

Reliance on Machine Learning Category Focus Work Year Detection Machine Learning ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Automatic classifier evasions Metadata PDFrate 2012 Content meta-features Yes (Xu et al., NDSS’16) Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig No JavaScript PDF Scrutinizer 2012 Known attack patterns No JavaScript ShellOS 2011 Memory access patterns No Dynamic JavaScript Liu et al. 2014 Common attack behaviors No Memory CWXDetector 2012 Violation of invariants No 68

Reliance on Known Attacks Category Focus Work Year Detection Known Attacks ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig Yes Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Metadata PDFrate 2012 Content meta-features Yes Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig Yes JavaScript PDF Scrutinizer 2012 Known attack patterns Yes JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors Yes Memory CWXDetector 2012 Violation of invariants No 69

Reliance on Known Attacks Category Focus Work Year Detection Known Attacks ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig Yes Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes How about zero-day attacks ? Metadata PDFrate 2012 Content meta-features Yes Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig Yes JavaScript PDF Scrutinizer 2012 Known attack patterns Yes JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors Yes Memory CWXDetector 2012 Violation of invariants No 70

Reliance on Detectable Discrepancy (between benign and malicious docs) Category Focus Work Year Detection Discrepancy ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Metadata PDFrate 2012 Content meta-features Yes Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig No JavaScript PDF Scrutinizer 2012 Known attack patterns No JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors Yes Memory CWXDetector 2012 Violation of invariants No 71

Reliance on Detectable Discrepancy (between benign and malicious docs) Category Focus Work Year Detection Discrepancy ? JavaScript PJScan 2011 Lexical analysis Yes JavaScript Vatamanu et al. 2012 Token clustering Yes JavaScript Lux0r 2014 API reference classification Yes JavaScript MPScan 2013 Shellcode and opcode sig No Static Metadata PDF Malware Slayer 2012 Linearized object path Yes Metadata Srndic et al. 2013 Hierarchical structure Yes Mimicry and reverse mimicry attacks Metadata PDFrate 2012 Content meta-features Yes (Srndic et al., Oakland’14 and Maiorca et al, AsiaCCS’13) Both Maiorca et al. 2016 Many heuristics combined Yes JavaScript MDScan 2011 Shellcode and opcode sig No JavaScript PDF Scrutinizer 2012 Known attack patterns No JavaScript ShellOS 2011 Memory access patterns Yes Dynamic JavaScript Liu et al. 2014 Common attack behaviors Yes Memory CWXDetector 2012 Violation of invariants No 72

Highlights of the Survey Prior works rely on • External PDF parsers Parser-confusion attacks • Machine learning Automatic classifier evasion • Known attack signatures Zero-day attacks • Detectable discrepancy Mimicry and reverse mimicry 73

Motivations for PlatPal Prior works rely on What PlatPal aims to achieve • External PDF parsers • Using Adobe’s parser • Machine learning • Using only simple heuristics • Known attack signatures • Capable to detect zero-days • Detectable discrepancy • Do not assume discrepancy • Complementary to prior works 74

A Motivating Example • A CVE-2013-2729 PoC against Adobe Reader 10.1.4 SHA-1: 74543610d9908698cb0b4bfcc73fc007bfeb6d84 80

Platform Diversity as A Heuristic When the same document is opened across different platforms: • A benign document “behaves” the same • A malicious document “behaves” differently 83

Questions for PlatPal • What is a “behavior” ? • What is a divergence ? • How to trace them ? • How to compare them ? 84

PlatPal Basic Setup ? Virtual Machine 1 Virtual Machine 2 Adobe Reader Adobe Reader Windows Host MacOS Host 85

PlatPal Dual-Level Tracing ? Virtual Machine 1 Virtual Machine 2 Adobe Reader Adobe Reader Traces of PDF Internal Tracer Internal Tracer processing Windows Host MacOS Host 86

PlatPal Dual-Level Tracing ? Virtual Machine 1 Virtual Machine 2 Adobe Reader Adobe Reader Traces of PDF Internal Tracer Internal Tracer processing Syscalls Syscalls Impacts on External Tracer External Tracer host platform Windows Host MacOS Host 87

PlatPal Internal Tracer • Implemented as an Adobe Adobe Reader Reader plugin. Internal Tracer • Hooks critical functions and callbacks during the PDF COS object parsing processing lifecycle. PD tree construction Script execution • Very fast and stable across Other actions Adobe Reader versions. Element rendering 88

PlatPal External Tracer Virtual Machine • Implemented based on NtTrace (for Windows) and Dtrace (for Adobe Reader MacOS). Syscalls • Resembles high-level system External Tracer impacts in the same manner as Cuckoo guest agent. Filesystem Program Operations Executions • Starts tracing only after the Network Normal Exit document is loaded into Adobe Activities or Crash Reader. Host Platform 89

PlatPal Automated Workflow PlatPal <file-to-check> Restore Clean Restore Clean Snapshot Snapshot Launch Adobe Launch Adobe Reader Reader Attach External Attach External Tracer Tracer Open PDF Open PDF Drive PDF by Drive PDF by Internal Tracer Internal Tracer Dump Traces Dump Traces Compare Windows VM MacOS VM Traces 90

Evaluate PlatPal • Robustness against benign samples A benign document “behaves” the same ? • E ff ectiveness against malicious samples A malicious document “behaves” di ff erently ? • Speed and resource usages 91

Robustness • 1000 samples from Google search. • 30 samples that use advanced features in PDF standards from PDF learning sites. Divergence Detected ? Sample Type Number of Samples (i.e., False Positive) 966 No Plain PDF 34 No Embedded fonts 32 No JavaScript code 17 No AcroForm 2 No 3D objects 92

Effectiveness • 320 malicious samples from VirusTotal with CVE labels. • Restricted to analyze CVEs published after 2013. • Use the most recent version of Adobe Reader when the CVE is published. 93

Effectiveness Analysis Results of   320 Maldoc Samples 24% 11% 65% No Divergence Both Crash Divergence 94

Effectiveness Analysis Results of   Breakdown of 77   320 Maldoc Samples potentially false positives 24% 25% 47% 3% 11% 65% 26% No Divergence Targets old versions Mis-classified by AV vendor No malicious activity trigerred Unknown 95

Time and Resource Usages Average Analysis Time Breakdown Resource Usages (unit. Seconds) • 2GB memory per running virtual Item Windows MacOS machine. 9.7 12.6 Snapshot restore 0.5 0.6 Document parsing • 60GB disk space for Windows 10.5 5.1 Script execution and MacOS snapshots that each corresponds to one of the 7.3 6.2 Element rendering 6 Adobe Readers versions. 23.7 22.1 Total 96

Evaluation Highlights • Confirms our fundamental assumption in general: benign document “behaves” the same malicious document “behaves” di ff erently • PlatPal is subject to the pitfalls of dynamic analysis i.e., prepare the environment to lure the malicious behaviors • Incurs reasonable analysis time to make PlatPal practical 97

Further Analysis • What could be the root causes of these divergences? 98

Diversified Factors across Platforms Category Factor Windows MacOS Shellcode Creation Memory Management Platform Features 99

Diversified Factors across Platforms Category Factor Windows MacOS Both the syscall number and the register set used to hold Syscall semantics syscall arguments are di ff erent Shellcode Calling convention rcx, rdx, r8 for first 3 args rdi, rsi, rdx for first 3 args Creation Library dependencies e.g., LoadLibraryA e.g. dlopen Memory Management Platform Features 100

Security through Multi-Layer Diversity Meng Xu (Qualifying - PowerPoint PPT Presentation

Security through Multi-Layer Diversity Meng Xu (Qualifying Examination Presentation) 1 Bringing Diversity to Computing Monoculture Current computing monoculture leaves our infrastructure vulnerable to massive and rapid attacks. Knowing

Multi Multi Multi- Multi - - -Layer Access Control Layer Access Control Layer Access

Overview Multi-layer networks: Cognitive Modeling limits of single layer networks; Lecture

Network Layer October 2, 2019 guha.jayachandran@sjsu.edu Layer 2: Protocol atop Layer 1

A multi- -layer layer A multi A multi-layer research and training platform research and

Lecture 6: Wireless Link Layer, Lecture 6: Wireless Link Layer, MAC protocols, CSMA MAC

1 Transport Layer Transport Layer Outline Message, Segment, Datagram Transport-layer

ELEC / COMP 177 Fall 2016 Some slides from Kurose and Ross, Computer Networking , 5 th Edition

5 Network Layer Network Layer Network Layer Network Layer Example: Choosing among multiple ASes

Diversity through time... Changes in dinosaur diversity by continent Count species? genera?

DNS and Security DNS and Security DNS and Security DNS and Security DNS and Security DNS and

10 mm Cytoarchitecture and function layer 4: input layer 5: output Motor cortex: expanded layer

Data-link layer Da Data ta-link link layer er Referred to as layer 2 Physical

CompSci 356: Computer Network Architectures Lecture 25: Application Layer Protocols Chapter 9.1

7 Network Layer Network Layer Network Layer Network Layer Subnets Classful Address

1 Network Layer Network Layer Recall: Circuit Switching vs. Packet Interplay between routing

CompSci 356: Computer Network Architectures Lecture 23: Application Layer Protocols Chapter 9.1

With your IT Ops Mission Using Kanban Dominica DeGrandis @Dominicad Cycle of Frustration

Financial Report Financial Report For the Period Ending For the Period Ending February 28, 2019

Consumer Online Chart Tools Group 2 - Chris, Lisa, Markus & Romy Reviewed Visualization

How SMPng Works and Why It Doesn't Work The Way You Think NYC*BUG February 6, 2013 John Baldwin

Miscellaneous Topics in Databases P ARALLEL DBMS W HY P ARALLEL A CCESS T O D ATA ? At 10 MB/s

Curriculum Night 2020-21 Ms. Golden and Mrs. Sundholm The Kindergarten Class of 2020-2021 Weekly

Agenda

Multimedia Systems Chapter 7.2: Interactive versus non-interactive Layer 6: Compression

Security through Multi-Layer Diversity Meng Xu (Qualifying - PowerPoint PPT Presentation

Security through Multi-Layer Diversity Meng Xu (Qualifying Examination Presentation) 1 Bringing Diversity to Computing Monoculture Current computing monoculture leaves our infrastructure vulnerable to massive and rapid attacks. Knowing

Multi Multi Multi- Multi - - -Layer Access Control Layer Access Control Layer Access

Overview Multi-layer networks: Cognitive Modeling limits of single layer networks; Lecture

Network Layer October 2, 2019 guha.jayachandran@sjsu.edu Layer 2: Protocol atop Layer 1

A multi- -layer layer A multi A multi-layer research and training platform research and

Lecture 6: Wireless Link Layer, Lecture 6: Wireless Link Layer, MAC protocols, CSMA MAC

1 Transport Layer Transport Layer Outline Message, Segment, Datagram Transport-layer

ELEC / COMP 177 Fall 2016 Some slides from Kurose and Ross, Computer Networking , 5 th Edition

5 Network Layer Network Layer Network Layer Network Layer Example: Choosing among multiple ASes

Diversity through time... Changes in dinosaur diversity by continent Count species? genera?

DNS and Security DNS and Security DNS and Security DNS and Security DNS and Security DNS and

10 mm Cytoarchitecture and function layer 4: input layer 5: output Motor cortex: expanded layer

Data-link layer Da Data ta-link link layer er Referred to as layer 2 Physical

CompSci 356: Computer Network Architectures Lecture 25: Application Layer Protocols Chapter 9.1

7 Network Layer Network Layer Network Layer Network Layer Subnets Classful Address

1 Network Layer Network Layer Recall: Circuit Switching vs. Packet Interplay between routing

CompSci 356: Computer Network Architectures Lecture 23: Application Layer Protocols Chapter 9.1

With your IT Ops Mission Using Kanban Dominica DeGrandis @Dominicad Cycle of Frustration

Financial Report Financial Report For the Period Ending For the Period Ending February 28, 2019

Consumer Online Chart Tools Group 2 - Chris, Lisa, Markus &amp; Romy Reviewed Visualization

How SMPng Works and Why It Doesn't Work The Way You Think NYC*BUG February 6, 2013 John Baldwin

Miscellaneous Topics in Databases P ARALLEL DBMS W HY P ARALLEL A CCESS T O D ATA ? At 10 MB/s

Curriculum Night 2020-21 Ms. Golden and Mrs. Sundholm The Kindergarten Class of 2020-2021 Weekly

Agenda

Multimedia Systems Chapter 7.2: Interactive versus non-interactive Layer 6: Compression

Consumer Online Chart Tools Group 2 - Chris, Lisa, Markus & Romy Reviewed Visualization