Bunshin: Compositing Security Mechanisms through Diversification



slide-1
SLIDE 1

Bunshin: Compositing Security Mechanisms through Diversification

Meng Xu, Kangjie Lu, Taesoo Kim, Wenke Lee Georgia Institute of Technology

1

slide-2
SLIDE 2

Memory Corruptions Are Costly…

2

slide-3
SLIDE 3

3

slide-4
SLIDE 4

4

slide-5
SLIDE 5

5

Name your phone “Nexus 5X %x.%x”

slide-6
SLIDE 6

Battle against Memory Errors

Existing security mechanisms: W⊕X, ASLR, CFI

→ Not hard to bypass

6

slide-7
SLIDE 7

Battle against Memory Errors

Existing security mechanisms: W⊕X, ASLR, CFI

→ Not hard to bypass

Protect all dangerous operations using sanity checks:

→ Auto-applied at compile time

7

Before:

    void foo(T *a) {
        *a = 0x1234;
    }

After sanitization:

    void foo(T *a) {
        if (!is_valid_address(a)) {
            report_and_abort();
        }
        *a = 0x1234;
    }

slide-8
SLIDE 8

Battle against Memory Errors

8

Memory Error            | Main Causes                                                                 | Defenses
Out-of-bound read/write | Lack of length check, Integer overflow, Format string bug, Bad type casting | Softbound, AddressSanitizer
Use-after-free          | Dangling pointer, Double free                                               | CETS, AddressSanitizer
Uninitialized read      | Lack of initialization, Data structure alignment, Subword copying           | MemorySanitizer
Undefined behaviors     | Divide-by-zero, Pointer misalignment, Null-pointer dereference              | UndefinedBehaviorSanitizer

slide-9
SLIDE 9

Comprehensive Protection: Goal and Reality

  • Accumulated execution slowdown
  • Example: Softbound + CETS → 110% slowdown
  • Implementation conflicts
  • Example: AddressSanitizer and MemorySanitizer

9

slide-10
SLIDE 10

Comprehensive Protection with Bunshin

  • Accumulated execution slowdown
  • Example: Softbound + CETS → 110% slowdown
  • Bunshin: Reduce to 60% or 40% (depends on the config)
  • Implementation conflicts
  • Example: AddressSanitizer and MemorySanitizer
  • Bunshin: Seamlessly enforce conflicting sanitizers

10

slide-11
SLIDE 11

The N-Version Way

11

[Diagram: Input → Program → Output]

slide-12
SLIDE 12

The N-Version Way

12

[Diagram: instead of a single Input → Program → Output path, a virtualization layer feeds the input to Variant 1, Variant 2, and Variant 3, synchronizes their execution, and consolidates their outputs]

slide-13
SLIDE 13

The N-Version Way

13

[Diagram: with a benign input, Variant 1, Variant 2, and Variant 3 reach consensus and a single output is released]

slide-14
SLIDE 14

The N-Version Way

14

[Diagram: with a malicious input, the outputs of Variant 1, Variant 2, and Variant 3 diverge]

slide-15
SLIDE 15

The N-Version Way

15


An attacker has to compromise all variants simultaneously in order to compromise the whole system
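The consolidation step can be pictured as a byte-for-byte comparison of what each variant is about to emit. The following is a minimal sketch under that assumption; `variant_output` and `find_divergence` are hypothetical names for illustration and are not taken from Bunshin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what one variant wants to emit for a request. */
typedef struct {
    const char *data;
    size_t len;
} variant_output;

/* Returns the index of the first variant whose output differs from
 * variant 0, or -1 when all variants agree (consensus). */
int find_divergence(const variant_output *outs, size_t n_variants)
{
    assert(n_variants > 0);
    for (size_t i = 1; i < n_variants; i++) {
        if (outs[i].len != outs[0].len ||
            memcmp(outs[i].data, outs[0].data, outs[0].len) != 0) {
            return (int)i;
        }
    }
    return -1;
}
```

A monitor built this way would release the output only when `find_divergence` returns -1 and would treat any other value as a sign of attack.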

slide-16
SLIDE 16

Similar Ideas

  • Two variants placed in disjoint memory partitions

[N-Variant Systems]

  • Two variants with stacks growing in different directions

[Orchestra]

  • Multiple variants with randomized heap object locations

[DieHard]

  • Multiple versions of the same program

[Varan, Mx]

16

slide-17
SLIDE 17

Bunshin Overview

  • Goal:
  • Reduce slowdown caused by security mechanisms
  • Enable different or even conflicting mechanisms

17

slide-18
SLIDE 18

Challenges for Bunshin

18

  • How to generate these variants?
  • What properties should they have?
  • How to make them appear as one to outsiders?
  • What is a “behavior” and what is a divergence?
  • What if the sanitizers introduce new behaviors?
  • Multi-threading support?
slide-19
SLIDE 19

Variant Generation Intuitions

  • Scope of protection required → Sanitizers selected
  • Instrumented checks by each sanitizer

19

Memory Error            | Defenses
Out-of-bound read/write | Softbound, AddressSanitizer
Use-after-free          | CETS, AddressSanitizer
Uninitialized read      | MemorySanitizer
Undefined behaviors     | UndefinedBehaviorSanitizer

    void foo(T *a) {
        if (!is_valid_address(a)) {
            report_and_abort();
        }
        *a = 0x1234;
    }

    void bar(T *b) {
        if (!is_valid_address(b)) {
            report_and_abort();
        }
        *b = 0x5678;
    }

slide-20
SLIDE 20

Variant Generation Principles

  • Check distribution
  • Sanitizer distribution

20

slide-21
SLIDE 21

Check Distribution

21

[Diagram: the program is split into Partition 1, Partition 2, and Partition 3; Variant 1, Variant 2, and Variant 3 run under the synchronization layer, each associated with one partition]

slide-22
SLIDE 22

Sanitizer Distribution

22

[Diagram: the program carries the ADDRESS, MEMORY, and UNDEF sanitizers; Variant 1, Variant 2, and Variant 3 run under the synchronization layer, each associated with one sanitizer]

slide-23
SLIDE 23

Cost Profiling

  • Calculate the slowdown caused by the sanity checks

With the sanity check:

    void foo(T *a) {
        timing_start();
        if (!is_valid_address(a)) {
            report_and_abort();
        }
        *a = 0x1234;
        timing_end();
    }

Without the sanity check (baseline):

    void foo(T *a) {
        timing_start();
        *a = 0x1234;
        timing_end();
    }
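Given the two timings, the slowdown attributable to the checks is the relative difference between the instrumented and baseline runs. A minimal sketch of that arithmetic, assuming both timings are measured in the same unit; `check_overhead_percent` is a hypothetical helper, not part of Bunshin.

```c
#include <assert.h>

/* Slowdown caused by the checks, as a percentage of the baseline time.
 * Both arguments must use the same time unit (for example nanoseconds). */
double check_overhead_percent(double baseline, double instrumented)
{
    assert(baseline > 0.0);
    return (instrumented - baseline) / baseline * 100.0;
}
```

For example, a function that takes 100 time units without checks and 117 with them would be charged roughly 17% overhead.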

23

slide-24
SLIDE 24

Cost Distribution

  • Equally distribute overhead to variants so that they execute at the same speed

24

Profiled check overhead per function: Foo 17%, Bar 28%, Baz 35%, Qux 20%

Variant 1 keeps the checks in Foo (17%) and Baz (35%): 52% overhead
Variant 2 keeps the checks in Bar (28%) and Qux (20%): 48% overhead
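One way to arrive at such a split is a greedy balancing heuristic: visit functions from most to least expensive and hand each one to the variant with the smallest load so far. The slides do not say which algorithm Bunshin uses, so the sketch below is only an illustration; `profiled_fn` and `distribute_checks` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    const char *name;  /* function whose checks are being placed */
    double overhead;   /* profiled check overhead, in percent */
    int variant;       /* output: variant that keeps this function's checks */
} profiled_fn;

/* qsort comparator: larger overhead first. */
static int by_overhead_desc(const void *a, const void *b)
{
    double x = ((const profiled_fn *)a)->overhead;
    double y = ((const profiled_fn *)b)->overhead;
    return (x < y) - (x > y);
}

/* Sorts fns[] by descending overhead, assigns each function to the
 * currently lightest variant, and stores per-variant totals in loads[]. */
void distribute_checks(profiled_fn *fns, size_t n_fns,
                       double *loads, size_t n_variants)
{
    assert(n_variants > 0);
    for (size_t v = 0; v < n_variants; v++)
        loads[v] = 0.0;

    qsort(fns, n_fns, sizeof fns[0], by_overhead_desc);

    for (size_t i = 0; i < n_fns; i++) {
        size_t lightest = 0;
        for (size_t v = 1; v < n_variants; v++)
            if (loads[v] < loads[lightest])
                lightest = v;
        fns[i].variant = (int)lightest;
        loads[lightest] += fns[i].overhead;
    }
}
```

Applied to Foo 17%, Bar 28%, Baz 35%, and Qux 20% with two variants, this heuristic groups Baz with Foo (52%) and Bar with Qux (48%), the same split as on the slide.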

slide-25
SLIDE 25

Variant Generation Process

25

[Diagram: source code and the selected security mechanisms (e.g., ASan, MSan, UBSan) go through cost profiling and overhead distribution; the variant generator and variant compiling then produce the variants (e.g., w/ ASan, w/ UBSan, w/ MSan), each instrumented either fully or selectively]

slide-26
SLIDE 26

Variant Sync Considerations

26

  • What is a behavior and what is a divergence?
      → System calls (both order and arguments)
  • How to hook it?
      → By patching the system call table with a kernel module
  • What if different sanitizers introduce different system calls?
      → Sync only when a program is in its main function
      → Do not check system calls for memory management
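The rules above can be sketched as two predicates: one that decides whether a system call takes part in synchronization at all, and one that compares the leader's call with a follower's. This is a simplified illustration, not Bunshin's kernel module; the `DEMO_SYS_*` numbers are placeholders rather than real syscall numbers, and arguments are compared as plain integers even though pointer arguments would in practice differ between variants and need to be compared by what they point to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYSCALL_ARGS 6

/* Hypothetical record of one intercepted system call. */
typedef struct {
    long number;
    long args[MAX_SYSCALL_ARGS];
    size_t n_args;
} syscall_record;

/* Placeholder numbers for illustration only. */
enum { DEMO_SYS_WRITE = 1, DEMO_SYS_MMAP = 9, DEMO_SYS_BRK = 12 };

/* Sync only inside main(), and skip memory-management calls. */
bool should_sync(long number, bool in_main)
{
    if (!in_main)
        return false;
    return number != DEMO_SYS_MMAP && number != DEMO_SYS_BRK;
}

/* A follower matches the leader when number and all arguments agree. */
bool syscalls_match(const syscall_record *leader,
                    const syscall_record *follower)
{
    assert(leader->n_args <= MAX_SYSCALL_ARGS);
    if (leader->number != follower->number ||
        leader->n_args != follower->n_args)
        return false;
    for (size_t i = 0; i < leader->n_args; i++)
        if (leader->args[i] != follower->args[i])
            return false;
    return true;
}
```

A mismatch reported by `syscalls_match` on a call for which `should_sync` is true would count as a divergence.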
slide-27
SLIDE 27

System Call Synchronization

27

[Diagram: Leader, Follower 1, and Follower 2 run in userspace, associated with Partition 1, Partition 2, and Partition 3; a sync slot in the kernel holds the syscall number, arguments, and execution result]

slide-28
SLIDE 28

System Call Synchronization

28


① Leader enters syscall

slide-29
SLIDE 29

System Call Synchronization

29


② Followers enter syscall

slide-30
SLIDE 30

System Call Synchronization

30


③ Kernel executes the syscall only once
slide-31
SLIDE 31

System Call Synchronization

31


④ Leader fetches syscall result ④ Followers fetch syscall result
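The four steps can be simulated in a single thread to make the data flow concrete. This is a sketch under simplifying assumptions: there is no real concurrency or kernel involvement, `demo_syscall` stands in for the actual system call, and every name here (`sync_slot`, `leader_enter`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOT_ARGS 3

/* Hypothetical sync slot: syscall number, arguments, execution result. */
typedef struct {
    long number;
    long args[SLOT_ARGS];
    long result;
    bool has_request;  /* leader has published number and arguments */
    bool has_result;   /* the call has been executed */
} sync_slot;

int demo_exec_count = 0;  /* counts how often the stand-in call runs */

/* Stand-in for the real system call: returns its third argument. */
static long demo_syscall(long number, const long *args)
{
    (void)number;
    demo_exec_count++;
    return args[2];
}

/* Step 1: the leader publishes its call in the slot. */
void leader_enter(sync_slot *slot, long number, const long *args)
{
    slot->number = number;
    memcpy(slot->args, args, sizeof slot->args);
    slot->has_request = true;
    slot->has_result = false;
}

/* Step 2: a follower checks its own call against the slot.
 * Returns false on divergence. */
bool follower_enter(const sync_slot *slot, long number, const long *args)
{
    assert(slot->has_request);
    return slot->number == number &&
           memcmp(slot->args, args, sizeof slot->args) == 0;
}

/* Step 3: the call is executed at most once per published request. */
void kernel_execute_once(sync_slot *slot)
{
    assert(slot->has_request);
    if (slot->has_result)
        return;
    slot->result = demo_syscall(slot->number, slot->args);
    slot->has_result = true;
}

/* Step 4: leader and followers all read the same stored result. */
long fetch_result(const sync_slot *slot)
{
    assert(slot->has_result);
    return slot->result;
}
```

Because followers only read the stored result, side effects such as writing to a file or socket happen once no matter how many variants are running.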

slide-32
SLIDE 32

Strict and Selective Lockstep

32

[Diagram: Leader, Follower 1, and Follower 2 in userspace share a sync ring buffer in the kernel]

  • Leader writes at the next available slot
  • Followers read at their own speed

slide-33
SLIDE 33

Strict and Selective Lockstep

33

Always strictly synchronized for “write” related system calls

slide-34
SLIDE 34

Strict and Selective Lockstep

34

Always strictly synchronized for “write” related system calls

  • Selective lockstep mitigates address leaks: an address leak involves a "write" system call, and with ASLR enabled such a leak attempt will be captured
  • Reduces sync overhead by 3% - 5%
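The ring-buffer behavior described on these slides can be sketched as follows: the leader appends entries, each follower consumes at its own pace, and for write-related calls the leader may only proceed once every follower has caught up. This is a single-threaded simulation with hypothetical names and sizes (`sync_ring`, `RING_SLOTS`, `N_FOLLOWERS`); a real implementation would need atomic operations or locking.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SLOTS 4
#define N_FOLLOWERS 2

typedef struct {
    long entries[RING_SLOTS];         /* syscall numbers, simplified */
    unsigned long head;               /* entries written by the leader */
    unsigned long tail[N_FOLLOWERS];  /* entries consumed per follower */
} sync_ring;

/* Number of entries consumed by the follower that is furthest behind. */
static unsigned long slowest_follower(const sync_ring *r)
{
    unsigned long min = r->tail[0];
    for (int f = 1; f < N_FOLLOWERS; f++)
        if (r->tail[f] < min)
            min = r->tail[f];
    return min;
}

/* Leader writes at the next available slot; fails when the ring is full. */
bool leader_publish(sync_ring *r, long number)
{
    if (r->head - slowest_follower(r) >= RING_SLOTS)
        return false;
    r->entries[r->head % RING_SLOTS] = number;
    r->head++;
    return true;
}

/* Follower reads at its own speed; returns false when nothing is pending. */
bool follower_consume(sync_ring *r, int f, long *number)
{
    assert(f >= 0 && f < N_FOLLOWERS);
    if (r->tail[f] == r->head)
        return false;
    *number = r->entries[r->tail[f] % RING_SLOTS];
    r->tail[f]++;
    return true;
}

/* Write-related calls are strict: every follower must have reached the
 * leader's latest entry. Other calls let the leader run ahead. */
bool leader_may_execute(const sync_ring *r, bool write_related)
{
    if (!write_related)
        return true;
    return slowest_follower(r) == r->head;
}
```

Under this scheme a compromised leader cannot emit data through a write-related call before the followers have reached and checked the same call.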

slide-35
SLIDE 35

Multi-threading Support

35

[Diagram: before the fork, Leader, Follower 1, and Follower 2 form the original execution group; after the fork, the new processes form a new execution group with a new ring buffer]

slide-36
SLIDE 36

Multi-threading Support

36


Works if there is no interleaving between threads

slide-37
SLIDE 37

Multi-threading Support

37

[Diagram: the Leader records the total order of lock acquisitions and releases in the kernel; Follower 1 and Follower 2 enforce that order]

slide-38
SLIDE 38

Multi-threading Support

38

  • Works under weak determinism (data race-free programs)
  • Implementation specific (pthread APIs only)
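The record-and-enforce idea can be sketched with a shared log of lock events: the leader appends the thread that took a lock, and a follower thread may take a lock only when it is next in the log. This is a simplified, single-threaded illustration with hypothetical names; it records acquisitions only, whereas the slides mention both acquisitions and releases, and it does not hook the pthread APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 64

typedef struct {
    int thread_ids[MAX_EVENTS];  /* logical thread that acquired a lock */
    size_t count;                /* events recorded by the leader */
    size_t cursor;               /* next event the follower must replay */
} lock_order_log;

/* Leader side: append the acquiring thread to the total order. */
bool record_acquire(lock_order_log *log, int thread_id)
{
    if (log->count == MAX_EVENTS)
        return false;
    log->thread_ids[log->count++] = thread_id;
    return true;
}

/* Follower side: a thread may acquire only when it is next in the
 * recorded order; otherwise it has to wait (false). */
bool try_enforce_acquire(lock_order_log *log, int thread_id)
{
    assert(log->cursor <= log->count);
    if (log->cursor == log->count)
        return false;  /* the leader has not reached this point yet */
    if (log->thread_ids[log->cursor] != thread_id)
        return false;  /* another thread goes first */
    log->cursor++;
    return true;
}
```

Replaying the same lock order in every variant yields the same results only if the program is free of data races, which is the weak-determinism condition stated above.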

slide-39
SLIDE 39

Evaluate Bunshin

39

  • Robustness and Security
  • Efficiency and Scalability
  • Protection Distribution Case Studies
slide-40
SLIDE 40

Robustness

40

Benchmark    | Single/Multi-thread | Feature       | Pass?
SPEC CPU2006 | Single              | CPU intensive |
SPLASH-2x    | Multi               |               |
PARSEC       | Multi               |               | 6 out of 13
lighttpd     | Single              | I/O intensive |
nginx        | Multi               |               |
python, php  | Single              | Interpreter   |

slide-41
SLIDE 41

Security

  • RIPE Benchmark
  • Real-world CVEs

41

RIPE benchmark:

Config           | Succeed | Probabilistic | Failed | Not possible
Default          | 114     | 16            | 720    | 2990
AddressSanitizer | 8       |               | 842    | 2990
Bunshin          | 8       |               | 842    | 2990

Real-world CVEs:

Config          | CVE       | Exploits         | Sanitizer                  | Detect
nginx-1.4.0     | 2013-2028 | Blind ROP        | AddressSanitizer           |
cpython-2.7.10  | 2016-5636 | Integer overflow | AddressSanitizer           |
php-5.6.6       | 2015-4602 | Type confusion   | AddressSanitizer           |
openssl-1.0.1a  | 2014-0160 | Heartbleed       | AddressSanitizer           |
httpd-2.4.10    | 2014-3581 | Null dereference | UndefinedBehaviorSanitizer |

slide-42
SLIDE 42

Performance

Benchmark                        | Item                 | Strict-Lockstep | Selective-Lockstep
SPEC CPU2006 (19 programs)       | Max                  | 17.5%           | 14.7%
                                 | Min                  | 1.6%            | 1.0%
                                 | Ave                  | 8.6%            | 5.6%
SPLASH-2X / PARSEC (19 programs) | Max                  | 21.4%           | 18.9%
                                 | Min                  | 10.7%           | 6.6%
                                 | Ave                  | 16.6%           | 14.5%
lighttpd (1MB file request)      | Ave                  | 1.44%           | 1.21%
nginx (1MB file request)         | Ave                  | 1.71%           | 1.41%

slide-43
SLIDE 43

Performance Highlights

  • Low overhead (5% - 16%) for standard benchmarks
  • Negligible overhead (<= 2%) for server programs
  • Extra cost of ensuring weak determinism is 8%
  • Selective-lockstep saves around 3% overhead
slide-44
SLIDE 44

Scalability - Number of Variants

44

[Chart: sync overhead (%) against the number of variants (2, 4, 6, 8), with average, maximum, and minimum series. Data labels as extracted: 0.5, 6.6, 11.4, 1.7, 11.2, 17.2, 37.6, 0.6, 4.4, 10.5, 20.9]

slide-45
SLIDE 45

Scalability - Number of Variants

45


The number of variants Bunshin can support with a reasonable overhead depends on machine configurations and program characteristics.

slide-46
SLIDE 46

Scalability - System Load

46

[Chart: sync overhead (%) at 2%, 50%, and 99% system load. Min: 0.2, 0.8, 1.9. Ave: 2.2, 4.8, 6.6. Max: 6.4, 9.7, 13]

slide-47
SLIDE 47

Scalability - System Load

47


Bunshin works well in all levels of system load (i.e., Bunshin does not require exclusive cores)

slide-48
SLIDE 48

Check Distribution - ASan

48

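[Chart: ASan overhead (%) with three variants: whole program 107; individual variants 37.2, 34.9, 34.8; Bunshin 43.1]

[Chart: ASan overhead (%) with two variants: whole program 107; individual variants 63, 57.4; Bunshin 65.6]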

slide-49
SLIDE 49

Sanitizer Distribution - UBSan

49

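[Chart: UBSan overhead (%) with three variants: whole program 228; individual variants 88, 78.7, 77.2; Bunshin 94.5]

[Chart: UBSan overhead (%) with two variants: whole program 228; individual variants 125, 124; Bunshin 129]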

slide-50
SLIDE 50

Deviation from Optimal - ASan

50

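[Chart: ASan overhead (%) with three variants: whole program 107; individual variants 37.2, 34.9, 34.8; Bunshin 43.1; optimal 35.7]

[Chart: ASan overhead (%) with two variants: whole program 107; individual variants 63, 57.4; Bunshin 65.6; optimal 53.5]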

slide-51
SLIDE 51

Deviation from Optimal - UBSan

51

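[Chart: UBSan overhead (%) with three variants: whole program 228; individual variants 88, 78.7, 77.2; Bunshin 94.5; optimal 76]

[Chart: UBSan overhead (%) with two variants: whole program 228; individual variants 125, 124; Bunshin 129; optimal 114]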
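The "optimal" figures on the two deviation slides are consistent with an ideal even split of the whole-program overhead across N variants, with no synchronization or non-distributable cost. Assuming that reading, and assuming the first value of each chart is Bunshin's measured overhead, the arithmetic works out as follows:

```latex
\[
O_{\text{opt}}(N) = \frac{O_{\text{whole}}}{N}
\]
\[
\text{ASan: } \frac{107}{2} = 53.5, \qquad \frac{107}{3} \approx 35.7
\qquad\qquad
\text{UBSan: } \frac{228}{2} = 114, \qquad \frac{228}{3} = 76
\]
\[
\text{Deviation (ASan): } 65.6 - 53.5 = 12.1, \quad 43.1 - 35.7 = 7.4
\]
\[
\text{Deviation (UBSan): } 129 - 114 = 15, \quad 94.5 - 76 = 18.5
\]
```

All deviations are in percentage points of overhead.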

slide-52
SLIDE 52

Reasons for Deviation from Optimal

  • Synchronization overhead
  • Inaccuracy in profiling
  • Suboptimal distribution
  • Non-distributable overhead
slide-53
SLIDE 53

Unifying LLVM Sanitizers

53

[Chart: overhead (%) of ASan, MSan, UBSan, and Bunshin on gobmk, povray, h264ref, and on average. Data labels as extracted: 177, 208, 248, 165, 172, 207, 189, 141, 148, 191, 246, 158, 98.9, 112, 205, 116]

slide-54
SLIDE 54


Unifying LLVM Sanitizers

54

With an average of 5% more slowdown, Bunshin can seamlessly unify all three LLVM sanitizers

slide-55
SLIDE 55

Limitations and Future Work

  • Finer-grained check distribution
  • Sanitizer integration
  • Record-and-replay

55

slide-56
SLIDE 56

Conclusion

  • It is feasible to achieve both comprehensive protection and high throughput with an N-version system
  • Bunshin is effective in reducing slowdown caused by sanitizers
  • 107% → 47.1% for ASan, 228% → 94.5% for UBSan
  • Bunshin can seamlessly unify three LLVM sanitizers with 5% extra slowdown

https://github.com/sslab-gatech/bunshin (Source code will be released soon)

56