Exploiting Microarchitectural Flaws in the Heart of the Memory Subsystem

Daniel Moghimi, Worcester Polytechnic University, Feb 20, 2020, Columbia University


slide-1
SLIDE 1

Exploiting Microarchitectural Flaws in the Heart of the Memory Subsystem

Daniel Moghimi, Worcester Polytechnic University Feb 20, 2020 Columbia University

slide-2
SLIDE 2

Spoiler!!

slide-10
SLIDE 10

CPU Memory Subsystem

[Diagram: a store, stor $$, (add_A), passes from the Allocation Queue in the Front End to the Scheduler, the execution units (Store, Load, Load, ALU, ALU) and the ROB in the Back End, and is recorded in the Store Buffer of the Memory Subsystem. Each Store Buffer entry holds VFN, PFN [8:0], Offset and DATA. The Memory Subsystem also contains the Fill Buffer, DTLB, L1, L2, L3 and DRAM. The Store Virtual Address 0x000401 is looked up in the DTLB, whose entries hold P, RW, US, A and the Physical Page Number; the PMH performs the Page Walk.]

slide-16
SLIDE 16

CPU Memory Subsystem

[Diagram: a load, load (add_B), AX, takes the same path through the Allocation Queue, Scheduler, execution units (Store, Load, Load, ALU, ALU) and ROB, and is tracked in the Load Buffer (entries: VFN, PFN, Offset, DATA) next to the Store Buffer (entries: VFN, PFN [8:0], Offset, DATA). L1, Fill Buffer, DTLB, L2, L3 and DRAM complete the Memory Subsystem.]

slide-19
SLIDE 19

CPU Memory Subsystem – Store Forwarding

[Diagram: the Allocation Queue holds stor $$, (add_A); stor ##, (add_B); load (add_C), CX; add CX, BX. The load address is compared against the pending Store Buffer entries.]

  • addr_c == addr_a?
  • addr_c == addr_b?
slide-20
SLIDE 20

slide-21
SLIDE 21

slide-24
SLIDE 24
  1. Pepperoni
  2. Chicken
  3. Pepperoni

[Diagram labels: Tom, Alex, Professor; Cut and Precook, Cut Tomato, Grind Cheese, Cook, Deliver]

slide-28
SLIDE 28
  1. Pepperoni
  2. Chicken
  3. Pepperoni

[Diagram labels: Cut, Cut, Grind; Precook and mix (twice); Cook and deliver (twice)]

slide-29
SLIDE 29
  1. Pepperoni
  2. Chicken
  3. Pepperoni
  4. ???

[Diagram labels: Speculative Cut (twice); Busy]

slide-31
SLIDE 31
  1. Pepperoni
  2. Chicken
  3. Pepperoni
  4. Chicken

[Diagram labels: Speculative Cut (twice); Precook and mix (twice)]

slide-32
SLIDE 32

MemJam

slide-35
SLIDE 35

MemJam Attack

  • Memory loads/stores are executed out of order and speculatively.
  • Address translation can be expensive.
  • 4K Aliasing: addresses that are a multiple of 4 KiB apart are assumed to be dependent.

slide-37
SLIDE 37

CPU Memory Subsystem – Store Forwarding

[Diagram: the Allocation Queue holds stor $$, (add_A); stor ##, (add_B); load (add_C), CX; add CX, BX. The load is matched against Store Buffer entries on the low 12 address bits only, and the match still has to be verified against the full address ("Verify?").]

  • addr_c[11:0] == addr_a[11:0]?

slide-38
SLIDE 38

MemJam Attack

  • Memory loads/stores are executed out of order and speculatively.
  • Address translation can be expensive.
  • 4K Aliasing: addresses that are a multiple of 4 KiB apart are assumed to be dependent.
  • The dependency is verified only after execution!
  • The load block is re-executed when the dependency turns out to be false.
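
The 4K-aliasing condition in the list above can be illustrated in a few lines of Python. This is a minimal sketch rather than the hardware logic: `is_4k_aliased` is a hypothetical helper, and it assumes the early dependency check compares only the low 12 address bits (the page offset).

```python
PAGE_OFFSET_BITS = 12                      # 4 KiB pages: low 12 bits are the page offset
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1  # 0xFFF

def is_4k_aliased(load_addr: int, store_addr: int) -> bool:
    """True when two different addresses share their low 12 bits."""
    same_offset = (load_addr & OFFSET_MASK) == (store_addr & OFFSET_MASK)
    return same_offset and load_addr != store_addr

store = 0x12ABC200
print(is_4k_aliased(store + 3 * 4096, store))  # True: three pages apart
print(is_4k_aliased(store + 4, store))         # False: different page offset
```

A load for which this predicate holds against a pending store is the case the slides call a false dependency: it is treated as dependent first and re-executed once the full addresses are compared.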

slide-39
SLIDE 39

MemJam – 4K Aliasing across Sibling Threads

Core: Thread A, Thread B
Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8

slide-40
SLIDE 40

MemJam – 4K Aliasing across Sibling Threads

Core: Thread A, Thread B
Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8
Stores: 0x12ABCDEF (repeated 10 times)

slide-41
SLIDE 41

MemJam – 4K Aliasing across Sibling Threads

Core: Thread A, Thread B
Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8
Stores: 0x12ABC200 (repeated 10 times)

slide-42
SLIDE 42

MemJam – 4K Aliasing across Sibling Threads

Core: Thread A, Thread B
Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8
Stores: 0x12ABC (repeated 10 times)

slide-46
SLIDE 46

MemJam – Intra Cache Line Resolution

[Diagram labels: Least 12 bits (Virtual Address = Physical Address); Rest of the bits (Virtual != Physical); L1 Cache Attacks; L2/L3 Cache Attacks; MemJam]

  • Conflict-based intra-cache-line leakage (4-byte granularity)
  • Higher latency → memory accesses that share address bits 2 to 11 (zero-indexed)
  • 4 bits of intra-cache-line leakage
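
The arithmetic behind "4 bits of intra-cache-line leakage" can be spelled out as follows. This is a sketch under two assumptions: a 64-byte cache line (offset bits 0 to 5) and the 4-byte granularity stated above (bits 0 and 1 not distinguished). The function names are illustrative only.

```python
CACHE_LINE_BITS = 6    # 64-byte cache line: offset bits 0..5
GRANULARITY_BITS = 2   # 4-byte granularity: bits 0..1 are not distinguished

def cache_line(addr: int) -> int:
    """What a cache-line-granular attack can observe (bits 6 and up)."""
    return addr >> CACHE_LINE_BITS

def intra_line_slot(addr: int) -> int:
    """4-byte slot inside the line (bits 2..5), i.e. the extra 4 bits."""
    slot_bits = CACHE_LINE_BITS - GRANULARITY_BITS
    return (addr >> GRANULARITY_BITS) & ((1 << slot_bits) - 1)

a, b = 0x1000, 0x1024
print(cache_line(a) == cache_line(b))          # True: same cache line
print(intra_line_slot(a), intra_line_slot(b))  # 0 9: different 4-byte slots
```

Two accesses that are indistinguishable at cache-line granularity fall into one of 16 slots, which is where the 4 bits come from.
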
slide-47
SLIDE 47

MemJam – Attacking So-Called Constant Time AES

  • Scatter-gather implementation of AES
  • Intel SGX Software Development Kit (SDK) and IPP Cryptography Library
  • 256-byte S-Box spread over 4 cache lines
  • Cache-independent access pattern

[Diagram: S-Box Lookup over 4 Cache Lines of 64 Bytes each; one entry (A, B, C, D) is read from each line and B is selected]
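
A sketch of the general scatter-gather idea, not the Intel IPP code: every lookup touches all four cache lines at the same offset and selects the wanted byte with a mask. `sg_lookup` and the contiguous 4 x 64-byte layout are assumptions made for illustration.

```python
LINE = 64  # bytes per cache line

def sg_lookup(sbox: bytes, index: int):
    """Return (value, touched_addresses) for a 256-byte table."""
    offset = index % LINE    # offset inside each line: still secret-dependent
    wanted = index // LINE   # which of the 4 lines holds the real entry
    value = 0
    touched = []
    for line in range(4):
        addr = line * LINE + offset
        touched.append(addr)
        mask = -(line == wanted) & 0xFF  # 0xFF for the wanted line, else 0x00
        value |= sbox[addr] & mask
    return value, touched

sbox = bytes(range(256))  # identity table, so the result is easy to check
value, touched = sg_lookup(sbox, 0x9A)
print(hex(value))                            # 0x9a
print(sorted({a // LINE for a in touched}))  # [0, 1, 2, 3]: every line is touched
print({a % LINE for a in touched})           # {26}: one shared offset
```

The set of cache lines touched is the same for every index, but the shared offset is not. That offset is what an attack with intra-cache-line resolution observes.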

slide-48
SLIDE 48

MemJam – Attacking So-Called Constant Time AES

LINE 2 64 Bytes 4 Cache Lines

slide-49
SLIDE 49

MemJam – AES Key Recovery

slide-50
SLIDE 50

Are there other kinds of address aliasing?

[Diagram: Memory Subsystem with Load Buffer (entries: VFN, PFN, Offset, DATA), Store Buffer (entries: VFN, PFN [8:0], Offset, DATA), L1 and DTLB.]

slide-58
SLIDE 58

Spoiler: Finding Undocumented Aliasing

[Diagram: a window of 64 Virtual Pages. Stores go to the same page offset 0C0 on consecutive pages (0x400FE1, 0x400FE2, …, 0x401020) and fill the Store Buffer; they are followed by a Load from offset 0C0 of page 0x4F1234. The window then slides forward one page at a time (0x400FE2 … 0x401021, 0x400FE3 … 0x401022, 0x400FE4 … 0x401023). Physical Addresses shown: 0x65F32XX and 0x32AC2XX, both with offset 0C0.]

slide-60
SLIDE 60

Spoiler: Finding Undocumented Aliasing

Virtual Pages
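
The search sketched on these slides amounts to finding store pages whose physical address collides with the load's in some bits above the page offset. The snippet below models only that comparison. The 20-bit width (1 MiB aliasing) comes from the published SPOILER paper and is not stated on the slides, and the frame numbers are made-up values shaped like the slide's 0x65F32XX and 0x32AC2XX.

```python
ALIAS_BITS = 20   # assumption: 1 MiB aliasing, per the SPOILER paper
PAGE_BITS = 12    # 4 KiB pages

def aliased_pages(load_pfn, store_pfns):
    """Indices of store pages whose PFN matches the load's in bits 12..19."""
    pfn_mask = (1 << (ALIAS_BITS - PAGE_BITS)) - 1   # low 8 bits of the PFN
    return [i for i, pfn in enumerate(store_pfns)
            if (pfn & pfn_mask) == (load_pfn & pfn_mask)]

# Hypothetical physical frame numbers; the page offset (0C0) is the same
# for every access, so only the frame number decides the collision.
load_pfn = 0x65F32AB
store_pfns = [0x32AC2AB, 0x32AC2AC, 0x11111AB]
print(aliased_pages(load_pfn, store_pfns))  # [0, 2]
```

In the real attack the frame numbers are unknown to the attacker; the matching pages are the ones that show up as timing peaks while the window slides.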

slide-64
SLIDE 64

Spoiler: Learning on Physical Address Bits

[Diagram labels: Least 12 bits (Virtual Address = Physical Address); VFN; PFN; L1 Cache Attacks; L2/L3 Cache Attacks; MemJam; Spoiler; Prime+Probe on Cache, Eviction Sets, Rowhammer]

slide-65
SLIDE 65

Spoiler – JavaScript Eviction Sets

slide-66
SLIDE 66

Spoiler - Rowhammer

  • Row Buffer Conflict
  • Single-sided Rowhammer
slide-67
SLIDE 67

Spoiler - Rowhammer

  • Detecting Contiguous Memory
  • Double-sided Rowhammer
slide-78
SLIDE 78

2018: Meltdown Attack?

[Diagram: the Virtual Address Space is split into User Space and Kernel Space. A user-space load from kernel address 0xf…81a0123, which holds the bytes P A S S W O R D, raises a Fault, but the byte (P) still reaches the CPU Registers and indexes an Oracle of 256 different CPU Cache Lines. Flush+Reload (F+R) on the Oracle recovers the value: ‘P’ = 0x50.]
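
The decoding step with the 256-entry oracle can be modelled without any hardware. The sketch below simulates the cache as a Python set; `transient_touch`, `reload_and_decode` and the 4096-byte stride are assumptions for illustration, not part of the slides.

```python
STRIDE = 4096  # assumption: one page per oracle slot

def transient_touch(cached_lines: set, secret_byte: int) -> None:
    """Model of the transient access oracle[secret_byte * STRIDE]."""
    cached_lines.add(secret_byte * STRIDE)

def reload_and_decode(cached_lines: set) -> int:
    """Model of the reload phase: the single cached slot reveals the byte."""
    hits = [slot for slot in range(256) if slot * STRIDE in cached_lines]
    if len(hits) != 1:
        raise ValueError("expected exactly one cached oracle slot")
    return hits[0]

cache = set()                          # flush phase: no oracle slot is cached
transient_touch(cache, ord("P"))       # faulting load leaves a footprint
print(hex(reload_and_decode(cache)))   # 0x50
```

On real hardware the membership test is replaced by timing each of the 256 reloads; the fast one is the hit.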

slide-79
SLIDE 79

Microarchitectural Data Sampling (MDS)

  • Meltdown is fixed, but data can still be leaked on the fixed hardware.
  • Which part of the CPU leaks the data?
  • Why does it leak?

slide-80
SLIDE 80

CPU Memory Subsystem – Challenges?

[Diagram: Front End, Back End and Memory Subsystem, with stor $$, (add_A); stor ##, (add_B); load (add_C), CX; add CX, BX in the Allocation Queue, followed by the Scheduler, execution units (Store, Load, Load, ALU, ALU), ROB, Load Buffer, Store Buffer, Fill Buffer, DTLB, L1, L2, L3 and DRAM.]

slide-90
SLIDE 90

Memory Access

[Flowchart: checks applied to a memory access, starting from the Virtual Address (VFN, Offset) and the PTE fields (P, RW, US, A, Physical Page Number).]

  • Canonical? If not: #GP
  • TLB hit? If not: PMH
  • Perm. and Present checks pass? If not: #PF
  • Accessed? If not: Set A Bit
  • Aligned Vector? If not: #GP
  • Cache Aligned? If not: Split Cache
  • Cached? If not: Cache Miss Handler
  • False Store Dep.? If so: Hazard Recovery
  • TSX Failure? If so: #RTM
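
The first check in this flow, the canonical-address test that leads to #GP, is simple enough to write out. This is a sketch assuming 48-bit virtual addresses (4-level paging); `is_canonical` is an illustrative name.

```python
def is_canonical(vaddr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    upper = vaddr >> (va_bits - 1)              # the bits that must agree
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits for va_bits = 48
    return upper == 0 or upper == all_ones

print(is_canonical(0x00007FFFFFFFFFFF))  # True: top of the lower half
print(is_canonical(0xFFFF800000000000))  # True: bottom of the upper half
print(is_canonical(0x0000800000000000))  # False: would raise #GP
```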

slide-91
SLIDE 91

Fault vs. Assist Dilemma

  • Microcode Assist: the CPU executes an internal event handler to service complex instructions/operations.
  • Fault (#GP, #PF, #RTM): an assist that runs a software-based callback.

slide-95
SLIDE 95

CPU Memory Subsystem – Hazard Recovery

[Diagram: Front End, Back End and Memory Subsystem, with stor $$, (addr_B) followed by load (addr_A), AX in the Allocation Queue, then the Scheduler, execution units (Store, Load, Load, ALU, ALU), ROB, Load Buffer, Store Buffer, Fill Buffer, DTLB, L1, L2, L3 and DRAM.]

slide-96
SLIDE 96

MDS Attacks (ZombieLoad, RIDL, Fallout, …)

  • The CPU must flush the pipeline before executing an assist.
  • Upon an Exception/Fault/Assist on a Load, Intel CPUs:
      • Execute the load until the last stage.
      • Flush the pipeline at the retirement stage (Cheap Recovery Logic).
      • Continue the load with some data to reach the retirement stage.
  • Which data?

slide-98
SLIDE 98

CPU Memory Subsystem – Leaky Buffers

[Diagram: Memory Subsystem with Load Buffer, Store Buffer, Fill Buffer, DTLB, L1, L2, L3 and DRAM. Attack labels: ZombieLoad, Fallout.]

slide-99
SLIDE 99

MDS Attacks (ZombieLoad, RIDL, Fallout, …)

  • The CPU must flush the pipeline before executing an assist.
  • Upon an Exception/Fault/Assist on a Load, Intel CPUs:
      • Execute the load until the last stage.
      • Flush the pipeline at the retirement stage (Cheap Recovery Logic).
      • Continue the load with some data to reach the retirement stage.
  • Which data? (Fill Buffer, Store Buffer, Load Buffer)
  • Which one will be leaked first?

slide-111
SLIDE 111

ZombieLoad Attack

[Diagram: Core, LFB (10 entries), L1D Cache, L2, L3 and DRAM. A Cache Line passes through an LFB entry on its way to the Core, and the entry is later de-allocated. A load that faults or needs an assist (PTE fields: P, RW, US, A, Physical Page Number) is shown reading LFB data.]

  • Variant 1: #GP
  • Variant 2: #RTM
  • Variant 3: MC

slide-112
SLIDE 112

Meltdown-style Attacks

slide-115
SLIDE 115

Data Sampling – Domino Attack

  • We may leak bytes of data from other, unimportant fill buffer entries.
  • Leak domino bytes to perform error correction.

[Diagram: Target Secret bits 1 1 0 1 0 0 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1. Leaked bytes: 0xd3 0x10 0x4f 0x37 0x0e 0xb0 0x7f 0x84. Selected: 0xd3 0x37 0x7f.]
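
One way to read the error-correction step is sketched below: a domino byte overlaps two neighbouring secret bytes, so only byte pairs that agree with a leaked domino byte survive. The nibble layout (low nibble of the first byte, high nibble of the second) is an assumption chosen to match the slide's 0xd3, 0x37, 0x7f, and the split of the leaked bytes into candidate sets is invented for the example.

```python
from itertools import product

def domino(b0: int, b1: int) -> int:
    """Byte straddling b0 and b1: low nibble of b0, then high nibble of b1."""
    return ((b0 & 0x0F) << 4) | (b1 >> 4)

def consistent_pairs(cands0, cands1, domino_cands):
    """Keep only (b0, b1) pairs confirmed by some leaked domino byte."""
    return [(b0, b1) for b0, b1 in product(cands0, cands1)
            if domino(b0, b1) in domino_cands]

# Hypothetical grouping of the slide's leaked bytes into noisy candidates.
first = [0xd3, 0x10, 0x4f]
second = [0x7f, 0x84, 0x0e]
dominoes = {0x37, 0xb0}
print([(hex(a), hex(b)) for a, b in consistent_pairs(first, second, dominoes)])
# [('0xd3', '0x7f')]
```

Bytes sampled from unrelated fill buffer entries rarely satisfy the overlap, which is why the extra domino leak filters them out.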

slide-116
SLIDE 116

slide-124
SLIDE 124

ZombieLoad - Recovering Intel SGX Sealing Key

  • Intel SGX gives developers hardware support for a TEE.
  • A malicious OS is part of the threat model.
  • We can read register values of a trusted enclave with the help of a malicious OS.
  • Repeated context switches in the transient domain with the same register values.

[Diagram: sgx-step, then z-step: Mark Non-Executable → Try to Execute → Exception → Handle Exception]

slide-125
SLIDE 125

Transynther and Medusa

  • A tool based on fuzzing techniques to generate data-leakage code
  • Microarchitectural grooming to find new MDS variants/subvariants
  • Medusa, a new variant that leaks only Write Combining stores
  • Medusa
      • Write Combining fills up the entire data bus.
      • We leak only the upper half of the data bus to recover pre-filtered data.
      • Implicit WC, i.e., ‘rep mov’, ‘rep stos’, can be leaked.

slide-126
SLIDE 126

Mitigation

  • Spoiler and MemJam
      • Hardware: no plan to fix, no hardware mitigation!
      • Software: constant-time implementation (secret obliviousness)
  • MDS
      • Hardware:
          • All CPUs before Ice Lake are vulnerable
          • Disable Hyper-Threading to reduce the impact
      • Software: special microcode sequence
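
As an illustration of what "secret obliviousness" asks of software, the sketch below scans the whole table on every lookup so that no address depends on the secret index. `oblivious_lookup` is a hypothetical name. Python gives no timing guarantees, so this shows only the access pattern; a real implementation would be written in C or assembly.

```python
def oblivious_lookup(table: bytes, secret_index: int) -> int:
    """Read every entry; keep only the one at secret_index via a mask."""
    result = 0
    for i, entry in enumerate(table):
        mask = -(i == secret_index) & 0xFF   # 0xFF on the match, else 0x00
        result |= entry & mask
    return result

table = bytes((7 * i + 3) % 256 for i in range(256))  # arbitrary demo table
print(oblivious_lookup(table, 0x9A) == table[0x9A])   # True
```

Because every offset in every cache line is read for every index, neither a cache-line attack nor an intra-cache-line attack learns which entry was wanted.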

slide-127
SLIDE 127

Questions?!

Publications

  • MemJam: A False Dependency Attack against Constant-Time Crypto Implementations (IACR CT-RSA 2018, IJPP 2019)
  • SPOILER: Speculative Load Hazards Boost Rowhammer and Cache Attacks (USENIX Security 2019)
  • ZombieLoad: Cross-Privilege-Boundary Data Sampling (ACM CCS 2019)
  • Fallout: Leaking Data on Meltdown-resistant CPUs (ACM CCS 2019)
  • Medusa (to appear at USENIX Security 2020)