Delta Pointers: Buffer Overflow Checks Without the Checks
Taddeüs Kroes & Koen Koning, Erik van der Kouwe, Herbert Bos, Cristiano Giuffrida
June 19, 2018
Preview
buffer[10] secret
◮ buffer[5]: in bounds
◮ buffer[11]: out of bounds, reaches secret
◮ Out-of-bounds access prevented: Automatic! FAST!
Buffer overflows still very common today
Bounds checking is slow
Runtime overhead of existing bounds checkers (% overhead):
◮ MPX: 139%
◮ SGXBounds: 94%
◮ ASan: 80%
◮ BaggyBounds: 72%
◮ Low-Fat Pointers: 64%
What is bounds checking?
void foo(char *buffer, size_t n) {
    buffer[n] = 10;    // n: attacker-controlled?
}
With an automatically inserted check:
void foo(char *buffer, size_t n) {
    if (n >= SIZE(buffer)) ERROR("overflow");
    buffer[n] = 10;
}
◮ Needs metadata (SIZE(buffer))
◮ Branching check (the if)
◮ Overhead!
◮ Efficient solution: pointer tagging. Still slow
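To make the two costs concrete, here is a minimal sketch of what an inserted check might look like once the size metadata is explicit. The `bounded_buf` struct and the `checked_store` function are hypothetical names chosen for illustration; real bounds checkers keep the metadata in tables, shadow memory, or pointer tags rather than in a struct passed by value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: the base pointer plus the size metadata
   that the SIZE(buffer) lookup on the slide would have to find. */
struct bounded_buf {
    char  *base;
    size_t size;
};

/* Store value at index n. Returns false instead of writing when n is
   out of bounds; the comparison is the branching check. */
static bool checked_store(struct bounded_buf b, size_t n, char value)
{
    assert(b.base != NULL);
    if (n >= b.size)
        return false;   /* overflow detected */
    b.base[n] = value;
    return true;
}
```

With a 10-byte buffer, `checked_store(b, 5, 10)` writes and returns true, while index 10 or 11 is rejected. Every access pays for one metadata read and one conditional branch.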
Our approach: Delta Pointers
◮ Use pointer tagging
◮ No memory access for metadata lookup
◮ No need for branches
◮ Delegate checks to (off-the-shelf) hardware instead
◮ Focus on common case: upper bound on x86-64
◮ Mitigates all CVEs reported by related work
Our approach: Delta Pointers
Runtime overhead compared with existing bounds checkers (% overhead):
◮ MPX: 139%
◮ SGXBounds: 94%
◮ ASan: 80%
◮ BaggyBounds: 72%
◮ Low-Fat Pointers: 64%
◮ Delta Pointers: 35%
Regular pointers
00 e8 02 0c 40 10 | 00 00
virtual address (48 bit) | ext (16 bit)
◮ Upper 16 bits must be zero (for user-space addresses)
00 e8 02 0c 40 10 | 00 01
virtual address (48 bit) | ext (16 bit)
◮ Non-canonical, MMU faults!
Tagged pointers
00 e8 02 0c 40 10 | 12 34
virtual address (48 bit) | tag (16 bit)
◮ Encode information in unused bits!
02 0c 40 10 | 12 34 56 78
virtual address (32 bit) | tag (32 bit)
◮ Shrink address space for bigger tags
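As an illustration of generic pointer tagging with a 16-bit tag, the following sketch packs a tag into the upper bits of a pointer value and strips it again. The helper names and the example address are assumptions made for this sketch, and pointer values are modelled as `uint64_t` so nothing is actually dereferenced.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_BITS 48u
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Place a 16-bit tag in the unused upper bits of a user-space address. */
static uint64_t tag_pointer(uint64_t addr, uint16_t tag)
{
    assert((addr & ~ADDR_MASK) == 0);   /* upper 16 bits start out zero */
    return addr | ((uint64_t)tag << ADDR_BITS);
}

/* Read the tag back without touching the address bits. */
static uint16_t get_tag(uint64_t tagged)
{
    return (uint16_t)(tagged >> ADDR_BITS);
}

/* Must be applied before a dereference, otherwise the address is
   non-canonical and the MMU faults. */
static uint64_t strip_tag(uint64_t tagged)
{
    return tagged & ADDR_MASK;
}
```

Widening the tag to 32 bits only changes `ADDR_BITS`, at the price of a 4GB address space.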
Delta Pointers
All pointers below: virtual address (32 bit) | tag (32 bit)
◮ Tag holds the size: 02 0c 40 10 | 00 00 00 18 (Size=24)
◮ What about internal pointers? 02 0c 40 1c | 00 00 00 18
◮ Tag holds the distance to the end of the object: 02 0c 40 1c | 00 00 00 0c (Distance=12). Check upper bound for any pointer
◮ Tag holds the negated distance: 02 0c 40 1c | ff ff ff f4
◮ Top bit of the tag is reserved as an overflow bit, the remaining 31 bits form the delta tag: 02 0c 40 1c | 7f ff ff f4. The overflow bit is set to 1 if the pointer goes out of bounds
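The encoding can be sketched in a few lines of C. This is an illustration under stated assumptions rather than the prototype's code: pointer values are modelled as `uint64_t`, the names `dp_encode` and `dp_distance` are hypothetical, and the sketch assumes the tag's top bit starts out clear and that the distance fits in 31 bits.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT  32u
#define DELTA_MASK UINT64_C(0x7fffffff)   /* low 31 bits of the tag */

/* Encode: address in the low half, negated distance in the delta tag. */
static uint64_t dp_encode(uint32_t addr, uint32_t distance)
{
    assert(distance >= 1 && distance <= DELTA_MASK);
    uint64_t delta = (UINT64_C(0) - distance) & DELTA_MASK;
    return (delta << TAG_SHIFT) | addr;
}

/* Decode: negate the delta tag again to recover the distance. */
static uint32_t dp_distance(uint64_t p)
{
    uint64_t delta = (p >> TAG_SHIFT) & DELTA_MASK;
    return (uint32_t)((UINT64_C(0) - delta) & DELTA_MASK);
}
```

For the internal pointer on the slide, address `0x020c401c` with distance 12 encodes to tag `7f ff ff f4`, and decoding that tag yields 12 again.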
Instrumentation
Pointer layout: virtual address | overflow bit | delta tag
◮ char *p = malloc(24);
  02 0c 40 10 | 0 | 00 00 00 00
◮ Distance=24: char *p = malloc(24) | (-24 << 32);
  02 0c 40 10 | 0 | 7f ff ff e8
◮ Replicate arithmetic on tag: p += 23 + (23 << 32);
  02 0c 40 27 | 0 | 7f ff ff ff (Distance=1)
◮ p += 1 + (1 << 32);
  02 0c 40 28 | 1 | 00 00 00 00 (carry into the overflow bit; Distance=0, out of bounds)
◮ p += -1 + (-1 << 32);
  02 0c 40 27 | 0 | 7f ff ff ff (carry; Distance=1, in-bounds again)
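The arithmetic on these slides can be replayed on plain 64-bit integers. The sketch below is illustrative only: `dp_add` and `dp_out_of_bounds` are hypothetical names, and pointer values are modelled as `uint64_t`, which also avoids shifting negative signed values (undefined behaviour in C).

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT    32u
#define OVERFLOW_BIT (UINT64_C(1) << 63)

/* Pointer arithmetic p += n + (n << 32), done as unsigned additions.
   The same offset is applied to the address and to the tag. */
static uint64_t dp_add(uint64_t p, int32_t n)
{
    uint64_t off = (uint64_t)(int64_t)n;   /* sign-extended offset */
    return p + off + (off << TAG_SHIFT);
}

/* The carry out of the 31-bit delta tag lands in the overflow bit. */
static bool dp_out_of_bounds(uint64_t p)
{
    return (p & OVERFLOW_BIT) != 0;
}
```

Starting from the tagged allocation `0x7fffffe8020c4010`, adding 23 gives tag `7f ff ff ff`, adding 1 more sets the overflow bit, and adding -1 clears it again, matching the byte patterns on the slides. No comparison or branch is involved in the arithmetic itself.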
Dereferencing an in-bounds pointer
    02 0c 40 27 | 0 | 7f ff ff ff
  & ff ff ff ff | 1 | 00 00 00 00    (mask strips away distance)
  = 02 0c 40 27 | 0 | 00 00 00 00
◮ Normal (in-bounds) pointer, access OK!
Dereferencing an out-of-bounds pointer
    02 0c 40 2c | 1 | 00 00 00 04
  & ff ff ff ff | 1 | 00 00 00 00
  = 02 0c 40 2c | 1 | 00 00 00 00
◮ Non-canonical pointer, MMU faults!
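The masking step amounts to a single AND. In the sketch below, `dp_mask` and `is_canonical_user` are hypothetical helpers, and the canonicality test is the simplified user-space rule from the "Regular pointers" slide (upper 16 bits zero); on real hardware that test is performed by the MMU, not by code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Keeps the 32-bit address and the overflow bit, clears the delta tag
   (the mask "ff ff ff ff | 1 | 00 00 00 00" on the slides). */
#define DEREF_MASK UINT64_C(0x80000000ffffffff)

static uint64_t dp_mask(uint64_t p)
{
    return p & DEREF_MASK;
}

/* Stand-in for the hardware check: a user-space address is canonical
   only if its upper 16 bits are zero. */
static bool is_canonical_user(uint64_t p)
{
    return (p >> 48) == 0;
}
```

Masking the in-bounds pointer `0x7fffffff020c4027` yields the plain address `0x020c4027`. Masking the out-of-bounds pointer `0x80000004020c402c` leaves the overflow bit set, so the result is non-canonical and the access would fault.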
Implementation
◮ LLVM-based prototype for C/C++
◮ Stack + heap + globals
◮ 32-bit address → 4GB address space
◮ 31-bit distance → 2GB allocations
◮ Instrument NULL pointer with distance = −1
◮ Optimizations: omit instrumentation on in-bounds pointers
Pointer tagging breaks things
◮ Uninstrumented libraries
// strdup(ptr);
TAG(strdup(MASK(ptr)));
◮ Non-zero NULL pointer
◮ Subtraction, addition, multiplication, vectors, etc.
◮ Incomplete type information (e.g., unions)
◮ Compiler quirks
◮ . . . and more
◮ Solved with TBAA + def-use chain analysis
◮ Details in paper
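The library-call pattern `TAG(strdup(MASK(ptr)))` can be sketched as a wrapper function. Several things here are assumptions made so the example runs in an ordinary 64-bit process: the tag lives in the upper 16 bits instead of the 32-bit tag used by Delta Pointers, `TAG` takes an explicit size argument, `lib_strdup` is a stand-in for the uninstrumented library routine, and NULL handling is simplified.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout for this sketch only: overflow bit plus a 15-bit
   delta tag in the upper 16 bits of a 64-bit pointer. */
#define TAG_SHIFT  48u
#define ADDR_MASK  ((UINT64_C(1) << TAG_SHIFT) - 1)
#define DELTA_MASK UINT64_C(0x7fff)

/* Strip the tag so uninstrumented code sees a plain pointer. */
static char *MASK(const char *p)
{
    return (char *)(uintptr_t)((uint64_t)(uintptr_t)p & ADDR_MASK);
}

/* Attach a delta tag for an object of `size` bytes. */
static char *TAG(char *p, size_t size)
{
    uint64_t delta = (UINT64_C(0) - (uint64_t)size) & DELTA_MASK;
    return (char *)(uintptr_t)((uint64_t)(uintptr_t)p | (delta << TAG_SHIFT));
}

/* Stand-in for an uninstrumented library routine such as strdup. */
static char *lib_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    return copy ? memcpy(copy, s, n) : NULL;
}

/* Wrapper: mask the argument, call the library, tag the result. */
static char *wrapped_strdup(const char *tagged)
{
    char *copy = lib_strdup(MASK(tagged));
    return copy ? TAG(copy, strlen(copy) + 1) : NULL;
}
```

The returned pointer carries a fresh tag for the copy, so it has to be masked again before it is handed to other uninstrumented code such as `free`.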
Nginx
[Figure: latency (ms) versus throughput (x1000 reqs/s), Baseline and Delta Pointers]
◮ 3-6% (I/O bound)
SPEC CPU2006 (C/C++)
[Figure: per-benchmark overhead (0% to 100%) for the SPEC CPU2006 C/C++ benchmarks, 400.perlbench through 483.xalancbmk, split into masking, tagging, and arithmetic]
◮ 35% geomean with optimizations
◮ Branching implementation: 48% overhead > 35%!
Conclusion
◮ Reliable pointer tagging implementation
◮ We can check (upper) bounds without checks
◮ Faster than existing solutions
https://github.com/vusec/deltapointers
Related work
System          C++  Metadata  Checks  Passing OoB pointers  Non-linear  Runtime  Memory
Softbound       ✗    Table     Deref   ✓                     ✓           67%      64%
Baggy Bounds    ✗    Layout    Arith   ✓ (a)                 ✓           72%      11%
PAriCheck       ✗    Shadow    Arith   ✓                     ✓ (b)       96%      18%
LBC             ✗    Shadow    Deref   ✓                     ✗           22%      7.7%
ASan            ✓    Shadow    Deref   ✓                     ✗           80%      237%
Intel MPX       ✓    Table     Deref   ✓                     ✓           139%     90%
LowFat          ✓    Layout    Deref   ✗                     ✓           54%      5.2%
SGXBounds       ✓    Tag       Deref   ✓                     ✓           89%      0.1%
Delta Pointers  ✓    Tag       -       ✓                     ✓           35%      0%
(a) Only up to alloc size/2 on 32-bit.
(b) Unless wrap-around on 16-bit labels occurs.
Impact of optimization with static analysis
[Figure: per-benchmark overhead (0% to 100%) for the SPEC CPU2006 C/C++ benchmarks, 400.perlbench through 483.xalancbmk, including the unoptimized configuration]
◮ 41% ⇒ 35%
Statistics
◮ 72% of SPEC offsets are dynamic
◮ 80% increase in code size with Delta Pointers
Branching implementation
(Pseudocode: bitwise operations on pointers need integer casts in real C.)
void foo(int n) {
    char *buffer = malloc(24);
    buffer |= (buffer + 24) << 32;    // store end pointer (not distance) in tag
    char *p = buffer + n;
    tag = p >> 32;                    // extract tag on load/store
    p = p & 0xffffffff;
    if (p >= tag) ERROR("overflow");  // branching check
    *p = 'x';
}
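For comparison with the branch-free scheme, the branching variant can be written as compilable C on integer values. This is a sketch with hypothetical names (`bt_tag`, `bt_access_ok`). It models 32-bit addresses as integers and assumes a non-negative offset that does not wrap the 32-bit address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tag a 32-bit base address with its end pointer (base + size). */
static uint64_t bt_tag(uint32_t base, uint32_t size)
{
    return (uint64_t)base | ((uint64_t)(base + size) << 32);
}

/* Check an access at offset n. On success, *addr_out receives the
   masked address that would be dereferenced. */
static bool bt_access_ok(uint64_t tagged, uint32_t n, uint32_t *addr_out)
{
    uint64_t p    = tagged + n;         /* arithmetic only touches the address */
    uint64_t tag  = p >> 32;            /* extract end pointer */
    uint64_t addr = p & 0xffffffff;     /* strip tag */
    if (addr >= tag)                    /* branching check */
        return false;
    *addr_out = (uint32_t)addr;
    return true;
}
```

Pointer arithmetic is cheaper here because the tag stays untouched, but every load and store pays for the extra shift, compare, and branch.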
Some pointer tagging challenges
◮ Some operations need masking to preserve semantics
char a[10];
// size_t len = &a[10] - &a[0];
size_t len = MASK(&a[10]) - MASK(&a[0]);
◮ Pointers that look like integers
union {
    char *buf;
    uint64_t foo;
} field;
field.buf += 42; // should instrument
field.foo += 42; // should NOT
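The pointer subtraction case is easy to reproduce on integer values. In this sketch, `dp_diff` is a hypothetical helper and the two tagged values are what `&a[0]` and `&a[10]` would look like for a 10-byte array at the illustrative address `0x020c4010`, under the 32-bit address plus 32-bit tag layout.

```c
#include <stdint.h>

#define ADDR_MASK UINT64_C(0xffffffff)

/* Difference between two tagged pointers. Masking first removes the
   tags, which differ between the two operands. */
static int64_t dp_diff(uint64_t a, uint64_t b)
{
    return (int64_t)(a & ADDR_MASK) - (int64_t)(b & ADDR_MASK);
}
```

Subtracting the raw tagged values `0x80000000020c401a` and `0x7ffffff6020c4010` gives `0xa0000000a` because the tag difference leaks into the result, whereas `dp_diff` returns the expected 10.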