Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core. Fourth Workshop on Computer Architecture Research with RISC-V (CARRV 2020), May 29th, 2020. Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser


slide-1
SLIDE 1

May 29th, 2020

Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core

Fourth Workshop on Computer Architecture Research with RISC-V (CARRV 2020)

Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser

slide-2
SLIDE 2

Outline

  • 1. Covert channels?
  • 2. Measure
  • 3. Mitigate
  • 4. Costs
  • 5. Conclusion

Integrated Systems Laboratory


slide-3
SLIDE 3

Covert Channel

[Figure: File System and Mail Client, separated by a security boundary, on top of the Supervisor (OS) and the Hardware]


slide-5
SLIDE 5

Microarchitectural Timing Channel

[Figure: Application A (Trojan) and Application B (Spy), separated by a security boundary]

slide-6
SLIDE 6

Microarchitectural Timing Channel

Application A (Trojan) and Application B (Spy), separated by a security boundary

  • Microarchitectural state: temporally shared HW
  • Trojan: indirectly modifies it depending on a secret
  • Spy: measures execution time

slide-7
SLIDE 7

Example: D$ Timing Channel

[Figure: Application A (Trojan) and Application B (Spy) share the D$ in front of main memory]

  • (1) Spy: Prime
  • (2) OS: Context switch
  • (3) Trojan: Encode s
  • (4) OS: Context switch
  • (5) Spy: Probe
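The five steps can be sketched with a toy model. This is only an illustration under simplifying assumptions: a direct-mapped cache with one owner per line, and a miss count standing in for the execution time the Spy would really measure. The names `ToyCache` and `transmit` are hypothetical and not part of channel bench.

```python
from typing import List, Optional


class ToyCache:
    """Direct-mapped toy cache that remembers which party owns each line."""

    def __init__(self, n_lines: int) -> None:
        self.lines: List[Optional[str]] = [None] * n_lines

    def access(self, owner: str, index: int) -> bool:
        """Touch one line; return True on a hit and False on a miss."""
        hit = self.lines[index] == owner
        self.lines[index] = owner
        return hit


def transmit(s: int, n_lines: int = 16) -> int:
    """Run prime, encode and probe once; return the Spy's miss count."""
    cache = ToyCache(n_lines)
    # (1) Spy: Prime, fill every line with the Spy's own data.
    for i in range(n_lines):
        cache.access("spy", i)
    # (2) OS: context switch (leaves the cache untouched in this model).
    # (3) Trojan: Encode s, touch s lines and thereby evict the Spy's data.
    for i in range(s):
        cache.access("trojan", i)
    # (4) OS: context switch.
    # (5) Spy: Probe, re-access all lines and count the misses.
    return sum(not cache.access("spy", i) for i in range(n_lines))


print([transmit(s) for s in range(5)])  # prints [0, 1, 2, 3, 4]
```

In this model the Spy recovers s exactly; on real hardware the probe time is noisy, which is why the channel is characterised statistically.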

slide-8
SLIDE 8

Example: D$ Timing Channel – Prime


slide-10
SLIDE 10

Example: D$ Timing Channel – Context switch


slide-11
SLIDE 11

Example: D$ Timing Channel – Encode s

[Figure: s lines of the D$ are marked while the Trojan encodes s]

slide-13
SLIDE 13

Example: D$ Timing Channel – Context Switch


slide-14
SLIDE 14

Example: D$ Timing Channel – Probe

[Figure: s lines of the D$ are marked while the Spy probes]

slide-16
SLIDE 16

Spatial Partitioning

[Figure: Application A (Trojan), Application B (Spy), OS, D$ and main memory]

slide-18
SLIDE 18

Temporal Partitioning

[Figure: Application A (Trojan) and Application B (Spy) share the D$ in front of main memory]

  • OS: Flush the D$ on context switch
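A minimal sketch of why flushing on context switch closes the toy channel, under the same simplifying assumptions as before (direct-mapped cache, miss count instead of time). The function `run_round` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List, Optional


def run_round(s: int, n_lines: int = 16, flush_on_switch: bool = False) -> int:
    """Prime, encode s, probe; optionally flush the cache at steps (2) and (4)."""
    cache: List[Optional[str]] = [None] * n_lines

    def touch(owner: str, i: int) -> bool:
        hit = cache[i] == owner
        cache[i] = owner
        return hit

    def context_switch() -> None:
        if flush_on_switch:
            # Temporal partitioning: reset the shared state to a fixed value.
            for i in range(n_lines):
                cache[i] = None

    for i in range(n_lines):      # (1) Spy: Prime
        touch("spy", i)
    context_switch()              # (2) OS: context switch
    for i in range(s):            # (3) Trojan: Encode s
        touch("trojan", i)
    context_switch()              # (4) OS: context switch
    return sum(not touch("spy", i) for i in range(n_lines))  # (5) Spy: Probe


leaky = [run_round(s) for s in range(4)]
flushed = [run_round(s, flush_on_switch=True) for s in range(4)]
print(leaky)    # prints [0, 1, 2, 3]
print(flushed)  # prints [16, 16, 16, 16]
```

With the flush, the Spy always sees a cold cache, so its observation no longer depends on s.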


slide-24
SLIDE 24

Flush: SW Approach

[Figure: on context switch, the D$ is filled with lines labelled OS]

slide-25
SLIDE 25

Evaluation Platform

Hardware platform: Ariane RV64GC core [4]

  • FPGA (Genesys 2) @ 50 MHz
  • Add timer peripheral and 512 KiB LLC [3]
  • Write-through 32 KiB L1 D$ and 16 KiB L1 I$
  • 16-entry DTLB, 16-entry BTB, 64-entry BHT

Supervisor: seL4 microkernel [5]

  • Formally verified kernel by Data61
  • Focus on security
  • Port to Ariane
  • Enable cache colouring of LLC

Application: Channel bench [1]

  • Measure covert channels on ARM/x86
  • Port to RISC-V
  • Tailor attacks to Ariane's µArch
slide-28
SLIDE 28

Channel Bench Output: L1 D$

i    s_i    t_i
0    107    83316
1    11     80209
2    112    82069
3    235    88152
4    246    88856
5    152    86627

slide-29
SLIDE 29

Channel Matrix: L1 D$

[Figure: channel matrix of the L1 D$ channel]

N = 10^6, M = 1667.3 mb

slide-32
SLIDE 32

Channel Bench Output: L1 D$

[Figure: M is computed from the (s, t) samples of the channel bench output]

slide-33
SLIDE 33

Channel Bench Output: L1 D$

Shuffle test:

  • Compute M from the measured pairs (s0, t0), ..., (s5, t5).
  • Shuffle the outputs t among the inputs s and compute M0 from the shuffled pairs:
      round 1: s0 t2, s1 t1, s2 t0, s3 t4, s4 t3, s5 t5
      round 2: s0 t1, s1 t2, s2 t0, s3 t3, s4 t4, s5 t5
      round 3: s0 t5, s1 t2, s2 t0, s3 t1, s4 t3, s5 t4
      round 4: s0 t5, s1 t4, s2 t0, s3 t3, s4 t1, s5 t2
  • Repeat.
  • M0: 95% confidence interval of the M0 values from the shuffled rounds
  • M > M0 ⇒ covert channel!
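The shuffle test can be sketched as follows, assuming that M denotes the mutual information between input s and output t (reported in millibits, as in [1]). This sketch uses a plain histogram estimator over discrete values and takes the 95th percentile of the shuffled estimates as M0; the real tool may estimate both differently. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Tuple

Pair = Tuple[int, int]


def mutual_information_bits(pairs: List[Pair]) -> float:
    """Plug-in estimate of I(S;T) in bits from discrete (s, t) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    p_s = Counter(s for s, _ in pairs)
    p_t = Counter(t for _, t in pairs)
    return sum(
        (c / n) * math.log2(c * n / (p_s[s] * p_t[t]))
        for (s, t), c in joint.items()
    )


def shuffled_bound(pairs: List[Pair], rounds: int = 100, seed: int = 0) -> float:
    """Estimate M0: shuffle the outputs, re-estimate, take the 95th percentile."""
    rng = random.Random(seed)
    inputs = [s for s, _ in pairs]
    outputs = [t for _, t in pairs]
    estimates = []
    for _ in range(rounds):
        rng.shuffle(outputs)  # destroys any real dependence between s and t
        estimates.append(mutual_information_bits(list(zip(inputs, outputs))))
    estimates.sort()
    return estimates[int(0.95 * (rounds - 1))]


# A perfectly leaky toy channel: the output equals the input (4 symbols).
samples = [(s, s) for s in range(4)] * 50
M = mutual_information_bits(samples)
M0 = shuffled_bound(samples)
print(f"M = {1000 * M:.1f} mb, channel detected: {M > M0}")
```

Any estimator reports a small non-zero value even for independent data; comparing against M0 separates that estimation noise from a real channel.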

slide-35
SLIDE 35

Channel Matrix: L1 D$

N = 10^6, M = 1667.3 mb, M0 = 0.5 mb


slide-37
SLIDE 37

Software Mitigation: L1 D$ Channel

                                 N       M (mb)    M0 (mb)
Unmitigated                      10^6    1667.3    0.5
L1 D$ prime on context switch    10^6    1471.5    0.6

slide-38
SLIDE 38

Software Mitigation: L1 D$ Channel

                                        N       M (mb)    M0 (mb)
Single L1 D$ prime on context switch    10^6    1471.5    0.6
Double L1 D$ prime on context switch    10^6    515.7     1.1
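One plausible reason why a single software prime leaves a large residual channel is the random replacement policy: a pass over a cache-sized buffer does not necessarily evict every old line. The slide does not state this explanation, so treat it as an assumption. The sketch below models it with a set-associative cache and uniformly random victims; the geometry (256 sets, 8 ways) and the name `spy_lines_left` are illustrative assumptions, not measured properties of Ariane.

```python
import random
from typing import List, Tuple

Tag = Tuple[str, int]


def spy_lines_left(passes: int, n_sets: int = 256, n_ways: int = 8,
                   seed: int = 1) -> List[int]:
    """Count Spy lines remaining after each OS pass over a cache-sized buffer."""
    rng = random.Random(seed)
    # The cache starts fully primed by the Spy.
    cache: List[List[Tag]] = [
        [("spy", w) for w in range(n_ways)] for _ in range(n_sets)
    ]
    remaining = []
    for _ in range(passes):
        for ways in cache:
            for w in range(n_ways):
                tag = ("os", w)
                if tag not in ways:
                    # Miss: a random way is replaced, possibly an OS line.
                    ways[rng.randrange(n_ways)] = tag
        remaining.append(sum(t[0] == "spy" for ways in cache for t in ways))
    return remaining


single, double = spy_lines_left(passes=2)
print(f"Spy lines left after one pass: {single}, after two passes: {double}")
```

In this model a second pass can only reduce the number of surviving Spy lines, which is consistent in direction with the lower M measured for the double prime.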

slide-39
SLIDE 39

Temporal Fence Instruction (fence.t)

[Figure: fence.t select [4]]

[Figure: [4] + Pipeline]

slide-42
SLIDE 42

fence.t: L1 D$ Channel

                                               N       M (mb)    M0 (mb)
Unmitigated                                    10^6    1667.3    0.5
Flush targeted components on context switch    10^6    7.7       1.4

… but wait!

slide-44
SLIDE 44

Vulnerable 2nd Order State-Holding Components

▪ L1 D$:
    ▪ LFSR for pseudo-random replacement policy
    ▪ Memory arbiter
    ▪ TX FIFO
    ▪ Write-buffer arbiters
▪ L1 I$:
    ▪ LFSR for pseudo-random replacement policy
▪ TLBs:
    ▪ Pseudo-LRU tree for replacement policy
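To illustrate why a replacement-policy LFSR holds state that can leak, here is a minimal sketch. It assumes an 8-bit Fibonacci LFSR with taps chosen for illustration only (the actual width and polynomial in Ariane are not given on the slide) and a hypothetical reset value `RESET`.

```python
RESET = 0x01  # hypothetical reset value of the LFSR


def lfsr_step(state: int) -> int:
    """Advance an 8-bit Fibonacci LFSR by one step (illustrative taps)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    return (state >> 1) | (bit << 7)


def state_seen_by_spy(trojan_replacements: int, reset_on_switch: bool) -> int:
    """LFSR state after the Trojan ran; it picks the Spy's next victim way."""
    state = RESET
    for _ in range(trojan_replacements):
        state = lfsr_step(state)  # every replacement advances the LFSR
    if reset_on_switch:
        state = RESET             # flushing resets the second-order state too
    return state


# Without a reset, the state depends on how many replacements the Trojan caused.
print(hex(state_seen_by_spy(3, False)), hex(state_seen_by_spy(5, False)))
# prints 0x20 0x88
# With a reset on context switch, the Spy always sees the same state.
print(hex(state_seen_by_spy(3, True)), hex(state_seen_by_spy(5, True)))
# prints 0x1 0x1
```

Flushing the cache contents alone leaves this state untouched, which is why such components have to be reset as well.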

slide-45
SLIDE 45

Full fence.t: L1 D$ Channel

                                                     N       M (mb)    M0 (mb)
Unmitigated                                          10^6    1667.3    0.5
Flush all vulnerable components on context switch    10^6    8.4       9.6
slide-46
SLIDE 46

L1 I$ Channel

                                                     N       M (mb)    M0 (mb)
Unmitigated                                          10^6    1905.0    0.5
Flush all vulnerable components on context switch    10^6    19.5      20.5
slide-47
SLIDE 47

TLB Channel

                                                     N       M (mb)    M0 (mb)
Unmitigated                                          10^6    409.2     0.1
Flush all vulnerable components on context switch    10^6    2.7       5.4
slide-48
SLIDE 48

BTB Channel

                                                     N       M (mb)    M0 (mb)
Unmitigated                                          10^6    3481.3    0.1
Flush all vulnerable components on context switch    10^6    33.0      57.6
slide-49
SLIDE 49

BHT Channel

                                                     N       M (mb)    M0 (mb)
Unmitigated                                          10^6    4873.3    0.1
Flush all vulnerable components on context switch    10^6    44.1      58.8
slide-50
SLIDE 50

Context Switch Latency

seL4 one-way inter-address-space IPC microbenchmark

Configuration                 Latency
Unmitigated, hot              430 (7.0)
Unmitigated, cold             1,180 (1.0)
D$ software flush, single     12,099 (52)
D$ software flush, double     51,876 (256)
HW flush                      1,502 (0.9)

slide-53
SLIDE 53

Hardware Costs: FPGA

              LUTs            Registers       Muxes
Unmodified    102,796 (10)    58,957 (208)    13,590 (38)
w/ fence.t    102,792 (57)    60,607 (5)      15,038 (2)
Change        0%              +2.8%           +10.6%

slide-54
SLIDE 54

Conclusion

▪ We measure five distinct covert channels on Ariane
▪ Confirmed: OS needs HW support for time protection [1]
▪ HW mechanism must flush all µArch state
    ▪ Identifying µArch state is not always straightforward
    ▪ Systematic approach for HW / security codesign needed
▪ Further, off-core covert channels still need to be addressed
    ▪ e.g. DRAM, thermal controller, etc.

slide-55
SLIDE 55

Sources

[1] Qian Ge, Yuval Yarom, Tom Chothia, and Gernot Heiser: “Time Protection: The Missing OS Abstraction”, EuroSys, 2019
[2] R. E. Kessler and Mark D. Hill: “Page Placement Algorithms for Large Real-Indexed Caches”, ACM Trans. Comp. Syst. 10, 1992
[3] Wolfgang Rönninger: “Memory Subsystem for the First Fully Open-Source RISC-V Heterogeneous SoC”, Master’s thesis, ETH Zurich, 2019
[4] Florian Zaruba and Luca Benini: “The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology”, IEEE Trans. on VLSI Systems 27, 2019
[5] Gerwin Klein, June Andronick, Kevin Elphinstone, Toby Murray, Thomas Sewell, Rafal Kolanski, and Gernot Heiser: “Comprehensive Formal Verification of an OS Microkernel”, ACM Trans. Comp. Syst. 32, 2014

slide-56
SLIDE 56

May 29th, 2020

Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core

Fourth Workshop on Computer Architecture Research with RISC-V (CARRV 2020)

Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser

slide-57
SLIDE 57

Hardware Costs: FPGA

              LUTs                   Registers            Muxes
Unmodified    102,796 (10)           58,957 (208)         13,590 (38)
w/ fence.t    102,792 (57), 50.4%    60,607 (5), 14.9%    15,038 (2), 9.8%
Change        0%                     +2.8%                +10.6%

slide-58
SLIDE 58

Time Protection [1]

[Figure: A, B and HW under spatial and under temporal partitioning]

Spatial partitioning

  • Off-core components
  • e.g. cache colouring (LLC) [2]
  • Not a solution for on-core components!

Temporal partitioning

  • On-core components
  • e.g. L1 caches, TLBs, branch predictors
  • Reset µArch state on context switch