ASLR on the Line
Ben Gras, Kaveh Razavi, Erik Bosman, Herbert Bos, Cristiano Giuffrida
VUSec
Erik Bosman (@brainsmoke)
Kaveh Razavi (@gober)
Ben Gras (@bjg)
Stephan van Schaik
Address Space Layout Randomization (ASLR)
Widely deployed exploit mitigation strategy: choose a different location for code and data every time a process is run.
[Diagram: the virtual address space, from lower addresses to higher addresses (2^48 − 1), with code and data placed at random locations]
Address Space Layout Randomization
Makes life for exploit writers a bit more difficult: usually, exploits need to know the location of code and data.
Address Space Layout Randomization
Exploit writers need to find a bug which leaks addresses without crashing the program. ... or do they?
ASLR⊕Cache (AnC): a side-channel attack on a process baked into the hardware, to discover ASLR information from JavaScript in the browser.
Modern CPU architectures

[Diagram: each CPU core has its own L1 code / L1 data caches and an L2 cache; the L3 (Last Level Cache) is shared between cores, in front of DDR memory. A memory access presents a virtual address; the MMU translates it to a physical address, caching translations in the TLB; on a TLB miss, the MMU performs a page table (PT) walk.]
Timing an operation with performance.now():

t0 = performance.now();
// operation to time
t1 = performance.now();
t = t1 - t0;

[Plot: measured time vs. real time]
After anti-side-channel mitigations (Firefox), performance.now() is too coarse, so we build our own clock by counting:

c = 0;
t0 = p.now();
while (t0 == p.now());
t1 = p.now();
while (t1 == p.now()) { c++; }

[Plots: measured time (counter value) vs. real time, after anti-side-channel mitigations in Firefox and in Chrome]
SharedArrayBuffer: memory which may be shared between multiple worker threads. Enabled by default in Firefox, Chrome and Edge since 2017.
Building a clock using SharedArrayBuffer and worker threads:

// thread 2 (counter thread):
c = 0;
while (buf[0] == 0);
while (buf[0] == 1) { c++; }

// thread 1 (measuring thread):
buf[0] = 1;
// operation to time
buf[0] = 0;

[Plot: measured time (counter value c) vs. real time]
L3 cache structure:

- cache line: 64 bytes; 1 page (4KB) = 64 cache lines
- N-way associative: 2048 cache sets, with as many slices as cores
- cache_set = (addr >> 6) % 2048 — a direct mapping of the physical address, repeated every 128KB
- cache_slice = xor_hash(addr)
- two cache lines mapping to the same cache set have the same physical address modulo 128KB, and therefore the same offset within a 4KB memory page

[Diagram: memory access → physical address → L3 cache set]
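A sketch of this set-index mapping (slice selection via xor_hash is left out, and the address is treated as physical). Addresses a multiple of 128KB (2048 sets × 64-byte lines) apart land in the same cache set:

```javascript
function cacheSet(addr) {
  return (addr >>> 6) & 2047;            // (addr >> 6) % 2048
}

const a = 0x1234540;
console.log(cacheSet(a));                // → 1301
console.log(cacheSet(a + 128 * 1024));   // → 1301: same set, 128KB apart
console.log(cacheSet(a + 4096));         // → 1365: the next page shifts the set by 64
```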
L3 cache EVICT + TIME (does an operation use a specific cache line?):

evict(line_x);    // access an eviction set in mybuf: lines X X X ... mapping to the same cache set
t0 = time();
// trigger memory access (or not)
t = time() - t0;  // a slow reload means the operation used the evicted cache line
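A sketch of evict(line_x): gather, from a large buffer, offsets that share the target's cache set and touch them all. Assumptions to flag: the buffer is treated as physically contiguous and cache slices are ignored; a real attack must find true conflicts, e.g. by timing candidate sets.

```javascript
const SET_STRIDE = 128 * 1024;                // the same cache set repeats every 128KB
const mybuf = new Uint8Array(8 * 1024 * 1024);

function evictionOffsets(target, count) {
  const offs = [];
  for (let o = target % SET_STRIDE; offs.length < count && o < mybuf.length; o += SET_STRIDE)
    offs.push(o);
  return offs;
}

function evict(offs) {
  let sink = 0;                               // reads fill the cache set, evicting line_x
  for (const o of offs) sink += mybuf[o];
  return sink;
}

const offs = evictionOffsets(12345, 16);
evict(offs);
console.log(offs.length, offs[1] - offs[0]);  // → 16 131072
```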
[Recap diagram: on a TLB miss, the MMU performs a page table (PT) walk to translate the virtual address.]
Virtual address lookup (x86_64)

The 48-bit virtual address space (lower addresses … 2^48 − 1) is translated through a 4-level page table walk starting from the CR3 register:
- level 4: 512 entries, covering 512GB each
- level 3: 512 entries, covering 1GB each
- level 2: 512 entries, covering 2MB each
- level 1: 512 entries, pointing to 4096-byte regions in memory
Example: the address 7 F 8 3 B 6 3 7 3 4 misses in the TLB. The MMU walks from CR3:
- level-4 page table: entry 255 (of 512)
- level-3 page table: entry 14
- level-2 page table: entry 433
- level-1 page table: entry 370
- the actual data, at offset 64 within its 4K page

Address information is directly encoded into the page table lookups, and the page tables are 4K pages themselves.
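A sketch of how a 48-bit x86_64 virtual address splits into the four 9-bit page-table indices plus the 12-bit page offset (BigInt is used because the values exceed JavaScript's 32-bit bitwise range):

```javascript
function ptIndices(vaddr) {
  const v = BigInt(vaddr);
  return {
    pml4: Number((v >> 39n) & 0x1FFn),  // level 4: each entry covers 512GB
    pdpt: Number((v >> 30n) & 0x1FFn),  // level 3: 1GB per entry
    pd:   Number((v >> 21n) & 0x1FFn),  // level 2: 2MB per entry
    pt:   Number((v >> 12n) & 0x1FFn),  // level 1: 4KB per entry
    off:  Number(v & 0xFFFn),           // offset within the 4K page
  };
}

console.log(ptIndices(0x7FFFFFFFF000));
// → { pml4: 255, pdpt: 511, pd: 511, pt: 511, off: 0 }
```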
CR3
... ...
255 254 253 252 251 250 249 248 256 247 255
... ...
255 254 253 252 251 250 249 248 256 247 255 1 Cache line = 64 bytes = 8 possible page table entries
... ...
255 254 253 252 251 250 249 248 256 247 255 1 Cache line = 64 bytes = 8 possible page table entries
... ...
255 254 253 252 251 250 249 248 256 247 255 1 Cache line = 64 bytes = 8 possible page table entries cache line reveals 6 address bits
[Diagram: observed cache lines for page-table entries 255, 14, 433, 370 and page offset 64]

The location of our object within its page is known by studying the browser's memory allocator.
Max entropy left: 4 × 3 bits (3 unresolved bits per level) plus, since we don't know which cache hit belongs to which cache line (i.e. which page-table level), log2(4 × 3 × 2 × 1) bits for the ordering: ~16.6 bits in total.
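A sketch of the per-level information AnC observes. Page-table entries are 8 bytes, so a 64-byte cache line holds 8 consecutive entries: seeing the line reveals the top 6 of each 9-bit index, leaving 3 bits unknown.

```javascript
const lineOf = (index) => index >> 3;   // which of the page table's 64 cache lines
console.log(lineOf(370), lineOf(433));  // → 46 54

// Remaining entropy: 3 unknown bits at each of the 4 levels, plus not
// knowing which observed cache hit belongs to which level (4! orderings).
const bits = 4 * 3 + Math.log2(4 * 3 * 2 * 1);
console.log(bits.toFixed(1));           // → 16.6
```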
Allocate a buffer; perform this side-channel attack on buffer entries 4096 bytes apart; measure when the page table lookup crosses a cache line boundary.
[Diagram: sliding in +4096-byte steps advances the level-1 page-table entry from 370 through 371, 372, … until at 376 the lookup crosses into the next cache line]
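A simulation of this sliding step (no real cache probing): each +4KB step advances the level-1 index by one entry, and the step count at which the index enters the next cache line reveals its low 3 bits.

```javascript
function stepsUntilLineCross(ptIndex) {
  const startLine = ptIndex >> 3;       // cache line of the initial entry
  let steps = 0;
  while (((ptIndex + steps) >> 3) === startLine) steps++;
  return steps;                         // in the real attack: observed via the cache side channel
}

const steps = stepsUntilLineCross(370);
console.log(steps);                     // → 6: the boundary is crossed at entry 376
console.log(8 - steps);                 // → 2: the low 3 bits of 370 (370 % 8 === 2)
```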
we can do the same thing for the 2nd level page table
[Diagram: sliding in +2MB steps advances the level-2 page-table entry from 433 through 434, 435, … to 440, crossing a cache line boundary]
Max entropy left after resolving levels 1 and 2: 2 × 3 + log2(2 × 1) = 7 bits
[Diagram: sliding in +1GB steps resolves the level-3 entry (14); sliding in +512GB steps resolves the level-4 entry (255)]
Allocating large chunks of memory

Firefox (on Linux) does not initialize ArrayBuffers, so Linux does not allocate space for the actual pages: we can allocate huge chunks and use sliding to recover the whole address.

Chrome does initialize memory, but jumps ahead in the address space every time it creates a new heap: the 3rd-level address bits can be recovered, while the 4th-level bits would need Chrome to initialize and free up to 4TB :-)
This side channel was found to work on 22 out of 22 tested architectures!
Demo video
Conclusions

Protecting against side-channel attacks has been neglected in favor of adding features :,-(

Attacks from JavaScript on the Memory Management Unit can recover ASLR information.
VUSec
project page: https://vusec.net/projects/anc