Microarchitectural Attacks: Protecting Cloud Accelerators

SLIDE 1

Microarchitectural Attacks: Protecting Cloud Accelerators

By Ahmad “Daniel” Moghimi
PhD Candidate, Worcester Polytechnic Institute (WPI)
@danielmgmi

SLIDE 2

OUTLINE

▪ Summary of Recent Contributions:
  • Microarchitecture → MemJam
  • Intel SGX → CacheZoom
  • Intel EPID → CacheQuote
  • Speculation → Spoiler
  • Mitigation → MicroWalk
▪ Shared FPGA-CPU Hardware Security
  • Proposal
  • Lab Equipment/Setup
  • Ongoing Work

SLIDE 3

Microarchitecture (Memory)

SLIDE 4

μArch Attacks: Data Dependency

add %ebx, %eax
sub %eax, %edx
xor %ecx, %ecx
add %eax, %edi
sub %ecx, %edi
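
As a rough illustration of the data dependencies among these five instructions, the Python sketch below records read-after-write (RAW) dependencies. The tuple encoding, the name `raw_dependencies`, and the choice to treat `xor %ecx, %ecx` as a dependency-breaking zeroing idiom are assumptions made for this example and do not come from the slides.

```python
# AT&T syntax is "op src, dst"; add/sub/xor read both operands and write dst.
INSTRS = [
    ("add", "ebx", "eax"),  # 1
    ("sub", "eax", "edx"),  # 2
    ("xor", "ecx", "ecx"),  # 3
    ("add", "eax", "edi"),  # 4
    ("sub", "ecx", "edi"),  # 5
]


def raw_dependencies(instrs):
    """Return (producer, consumer, register) triples, numbering instructions from 1."""
    last_writer = {}
    deps = []
    for i, (op, src, dst) in enumerate(instrs, start=1):
        # Assumption: xor of a register with itself does not depend on its old value.
        reads = set() if (op == "xor" and src == dst) else {src, dst}
        for reg in sorted(reads):
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dst] = i
    return deps


if __name__ == "__main__":
    for producer, consumer, reg in raw_dependencies(INSTRS):
        print(f"instruction {consumer} needs %{reg} from instruction {producer}")
```

Under these assumptions, instructions 2 and 4 wait on instruction 1, while instruction 5 waits on both 3 and 4.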

SLIDE 5

μArch Attacks: Pipelined Memory Exec

add %ebx, %eax
sub %eax, %edx
xor %ecx, %ecx
add %eax, %edi
sub %ecx, %edi

[Figure: pipeline timing diagram for the five instructions, partially drawn. Stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, WB = Write Back.]

SLIDE 6

μArch Attacks: Pipelined Memory Exec

add %ebx, %eax
sub %eax, %edx
xor %ecx, %ecx
add %eax, %edi
sub %ecx, %edi

[Figure: pipeline timing diagram for the five instructions, partially drawn. Stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, WB = Write Back.]

SLIDE 7

μArch Attacks: Pipelined Memory Exec

add %ebx, %eax
sub %eax, %edx
xor %ecx, %ecx
add %eax, %edi
sub %ecx, %edi

[Figure: pipeline timing diagram for the five instructions, partially drawn. Stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, WB = Write Back.]

SLIDE 8

μArch Attacks: Pipelined Memory Exec

add %ebx, %eax
sub %eax, %edx
xor %ecx, %ecx
add %eax, %edi
sub %ecx, %edi

[Figure: complete pipeline timing diagram for the five instructions. Stages: IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, WB = Write Back.]
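
To make the staggering of stages concrete, here is a minimal Python sketch of an idealized four-stage pipeline. It assumes one cycle per stage and no stalls, so it will not reproduce any longer EX phases the slide's diagram may contain. The function name `pipeline_schedule` is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "WB"]


def pipeline_schedule(n_instr, stages=STAGES):
    """Row i lists the stage instruction i occupies in each cycle ('' when idle)."""
    total_cycles = n_instr + len(stages) - 1
    rows = []
    for i in range(n_instr):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            # Instruction i enters the pipeline one cycle after instruction i - 1.
            row[i + s] = name
        rows.append(row)
    return rows


if __name__ == "__main__":
    for number, row in enumerate(pipeline_schedule(5), start=1):
        print(number, " ".join(f"{stage:2}" for stage in row))
```

With five instructions and four stages, this idealized schedule finishes in eight cycles instead of the twenty a fully sequential execution would need.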

SLIDE 9

μArch Attacks: 4K Aliasing False Dependency

Memory loads/stores are executed out of order and speculatively

The dependency is verified after the execution!

4K Aliasing: Addresses that are a multiple of 4 KiB apart (same low 12 bits) are assumed dependent

Re-execute the load and corresponding instructions due to false dependency

Virtual-to-physical address translation → Memory disambiguation

mov %eax, (%ebx)
mov (%ecx), %edx

[Figure: execution flow of the store (first instruction) and the load (second instruction), ending in a "Dependent?" check that answers "Yes".]
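
The aliasing condition can be written as a simple bit test. The sketch below is an illustration only: the names `PAGE_OFFSET_MASK` and `is_4k_aliased` are hypothetical, and it models only the address comparison, not the processor's actual disambiguation logic.

```python
PAGE_OFFSET_MASK = 0xFFF  # low 12 bits, identical in virtual and physical addresses


def is_4k_aliased(store_addr, load_addr):
    """True when two distinct addresses share their low 12 bits."""
    same_low_bits = ((store_addr ^ load_addr) & PAGE_OFFSET_MASK) == 0
    return same_low_bits and store_addr != load_addr


if __name__ == "__main__":
    print(is_4k_aliased(0x1000, 0x2000))  # addresses exactly 4 KiB apart
    print(is_4k_aliased(0x1000, 0x1004))  # same page, different offset
```

Identical addresses are excluded because they form a true dependency rather than a false one.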

SLIDE 10

μArch Attacks – Hyperthreading 4K Aliasing

Core: HT – Thread A | HT – Thread B

Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8

SLIDE 11

μArch Attacks – Hyperthreading 4K Aliasing

Core: HT – Thread A | HT – Thread B

Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8

Stores: 0x12ABCDEF (×10)

SLIDE 12

μArch Attacks – Hyperthreading 4K Aliasing

Core: HT – Thread A | HT – Thread B

Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8

Stores: 0x12ABC200 (×10)

SLIDE 13

μArch Attacks – Hyperthreading 4K Aliasing

Core: HT – Thread A | HT – Thread B

Loads (Execute & Time): 0xFECD1, 0xFECD2, 0xFECD3, 0xFECD4, 0xFECD5, 0xFECD6, 0xFECD7, 0xFECD8

Stores: 0x12ABC (×10)

SLIDE 14

MemJam

SLIDE 15

MemJam – Intra Cache Line Resolution

Least significant 12 bits (Virtual Address = Physical Address)
Rest of the bits (Virtual != Physical)

SLIDE 16

MemJam – Intra Cache Line Resolution

Least significant 12 bits (Virtual Address = Physical Address)
Rest of the bits (Virtual != Physical)
L1 Cache Attacks

SLIDE 17

MemJam – Intra Cache Line Resolution

Least significant 12 bits (Virtual Address = Physical Address)
Rest of the bits (Virtual != Physical)
L1 Cache Attacks
L2/LLC Cache Attacks

SLIDE 18

MemJam – Intra Cache Line Resolution

Least significant 12 bits (Virtual Address = Physical Address)
Rest of the bits (Virtual != Physical)
L1 Cache Attacks
L2/LLC Cache Attacks

Conflicted intra-cache line leakage (4-byte granularity)

Higher time correlates with memory accesses that share bits 3 to 12

4 bits of intra-cache line leakage
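
The address fields behind these numbers can be sketched in Python as follows. The 64-byte line size matches the slides. The 64-set L1D geometry and the function name `split_address` are assumptions for this example. The sketch uses zero-indexed bits, so the 4-byte-granular field is bits 2 to 5 of the address; the slide's "bits 3 to 12" presumably counts from 1.

```python
LINE_SIZE = 64   # bytes per cache line (from the slides)
NUM_L1_SETS = 64  # assumed L1D geometry


def split_address(addr):
    """Break an address into the fields relevant to cache and 4K-aliasing attacks."""
    return {
        "page_offset": addr & 0xFFF,                        # same in virtual and physical
        "l1_set": (addr // LINE_SIZE) % NUM_L1_SETS,        # what L1 cache attacks observe
        "line_offset": addr % LINE_SIZE,                    # byte within the cache line
        "dword_in_line": (addr % LINE_SIZE) // 4,           # 4-byte slot: 16 values = 4 bits
    }


if __name__ == "__main__":
    for name, value in split_address(0x12ABC204).items():
        print(f"{name}: {value:#x}")
```

A classic L1 cache attack only learns `l1_set`; the additional `dword_in_line` field is the 4 bits of intra-cache-line information.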

SLIDE 19

MemJam Attack

[Figure: a CPU with two cores, each with two hyperthreads (HT). An encryption service interleaves load and compute operations.]

Execute, then execute again: the time is higher if there are more 4K conflicts.
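
The relationship between conflicts and timing can be mimicked with a toy model. Everything numeric here is hypothetical: `BASE_CYCLES` and `PENALTY_CYCLES` are made-up constants, and the model simply counts victim loads whose address bits 2 to 11 (zero-indexed) match the address the attacker keeps storing to.

```python
BASE_CYCLES = 100     # hypothetical cost of the victim routine without conflicts
PENALTY_CYCLES = 5    # hypothetical extra cost per conflicting load
CONFLICT_MASK = 0xFFC  # address bits 2..11: page offset at 4-byte granularity


def count_4k_conflicts(victim_loads, jam_addr):
    """Number of victim loads that collide with the attacker's store address."""
    return sum(1 for addr in victim_loads
               if ((addr ^ jam_addr) & CONFLICT_MASK) == 0)


def modeled_time(victim_loads, jam_addr):
    """Toy timing: more conflicts means a slower victim."""
    return BASE_CYCLES + PENALTY_CYCLES * count_4k_conflicts(victim_loads, jam_addr)


if __name__ == "__main__":
    loads = [0x70000204, 0x70000208, 0x70001204, 0x70000240]
    print(modeled_time(loads, 0x12ABC204))  # two loads conflict
    print(modeled_time(loads, 0x12ABC300))  # no load conflicts
```

An attacker who repeats the measurement for different store offsets can, in this model, tell which offsets the victim touches most often.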

SLIDE 20

Constant-time AES – Safe2Encrypt_RIJ128

Scatter-gather implementation of AES

256-byte S-Box – 4 cache lines

Cache-independent access pattern

Implemented and distributed as part of Intel products

Intel SGX Linux Software Development Kit (SDK)

Intel IPP Cryptography Library

[Figure: S-Box lookup. The table spans 4 cache lines of 64 bytes each (labelled A, B, C, D); bytes are gathered into a local buffer.]
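
A scatter-gather lookup of this kind can be sketched as below. This is not Intel's code: the table contents are a hypothetical stand-in for the AES S-box, and `scatter_gather_lookup` is an illustrative name. The sketch reads one byte from every cache line at the same offset and keeps only the wanted one with a mask.

```python
LINE_SIZE = 64

# Hypothetical 256-entry table standing in for the AES S-box.
TABLE = [(7 * i + 3) % 256 for i in range(256)]


def scatter_gather_lookup(table, index):
    """Return (value, touched_addresses) for a lookup that reads every cache line."""
    offset = index % LINE_SIZE       # same offset is read in every line
    wanted_line = index // LINE_SIZE
    result = 0
    touched = []
    for line in range(len(table) // LINE_SIZE):
        addr = line * LINE_SIZE + offset
        touched.append(addr)
        mask = -(line == wanted_line) & 0xFF  # 0xFF for the wanted line, else 0x00
        result |= table[addr] & mask
    return result, touched


if __name__ == "__main__":
    value, touched = scatter_gather_lookup(TABLE, 0x9A)
    print(hex(value), [hex(a) for a in touched])
```

Every lookup touches all four lines, so the access pattern is independent at cache-line granularity, but the offset within the line still equals `index % 64`, which is the kind of intra-line information the MemJam slides describe.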

SLIDE 21

MemJam Attack on Safe2Encrypt_RIJ128

[Figure: 4 cache lines of 64 bytes each and the local buffer.]

SLIDE 22

MemJam Attack on Safe2Encrypt_RIJ128

[Figure: 4 cache lines of 64 bytes each and the local buffer.]

SLIDE 23

Intel SGX

SLIDE 24

INTEL SOFTWARE GUARD EXTENSIONS (SGX)

▪ Trusted Execution Environment (TEE)
▪ Enclave: Hardware protected user-level software module
  • Loaded by the user program
  • Mapped by the Operating System
  • Authenticated and Encrypted by CPU
▪ Memory accesses are protected by the hardware

SLIDE 25

MemJam Attack on SGX

SLIDE 26

CacheZoom: Controlled Cache Attack ON SGX

1. Isolation of the target & victim cache
2. Stabilize the processor frequency
3. Perform the attack on small exec steps by interrupting the victim
4. Measure and filter the remaining noise

SLIDE 27

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets

SLIDE 28

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code

SLIDE 29

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code

SLIDE 30

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code
Step 3: Attacker interrupts the execution pipeline

SLIDE 31

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code
Step 3: Attacker interrupts the execution pipeline
Step 4: Attacker probes the access times → Go to step 1

SLIDE 32

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code
Step 3: Attacker interrupts the execution pipeline
Step 4: Attacker probes the access times → Go to step 1

SLIDE 33

CacheZoom: Interrupted Cache Attack

[Figure: L1D cache drawn as numbered sets (1–8, …, 56–63).]

Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code
Step 3: Attacker interrupts the execution pipeline
Step 4: Attacker probes the access times → Go to step 1
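
The four steps can be simulated on a toy cache model. The sketch below is a software model only, not CacheZoom itself: the class `L1DModel`, the helper names, the 8-way associativity and the LRU policy are all assumptions, and the interrupt in step 3 is represented simply by the victim making a few accesses between `prime` and `probe`. Sets are numbered from 0 here.

```python
from collections import deque


class L1DModel:
    """Toy set-associative cache with LRU replacement (parameters are assumptions)."""

    def __init__(self, num_sets=64, ways=8, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # Each set holds at most `ways` lines; the leftmost entry is the LRU line.
        self.sets = [deque(maxlen=ways) for _ in range(num_sets)]

    def access(self, owner, addr):
        """Touch one address; return True on a hit and False on a miss."""
        line = addr // self.line_size
        cache_set = self.sets[line % self.num_sets]
        key = (owner, line)
        hit = key in cache_set
        if hit:
            cache_set.remove(key)
        cache_set.append(key)  # a full deque drops its LRU entry automatically
        return hit


def attacker_lines(cache, set_index):
    """Addresses of `ways` attacker lines that all map to the given set."""
    return [(way * cache.num_sets + set_index) * cache.line_size
            for way in range(cache.ways)]


def prime(cache):
    """Step 1: fill every set with attacker lines."""
    for s in range(cache.num_sets):
        for addr in attacker_lines(cache, s):
            cache.access("attacker", addr)


def probe(cache):
    """Step 4: return the sets where the attacker observes at least one miss."""
    touched = []
    for s in range(cache.num_sets):
        hits = [cache.access("attacker", a) for a in attacker_lines(cache, s)]
        if not all(hits):
            touched.append(s)
    return touched


if __name__ == "__main__":
    cache = L1DModel()
    prime(cache)
    cache.access("victim", 0x1040)  # steps 2 and 3: a short burst of victim accesses
    cache.access("victim", 0x2FC0)
    print(probe(cache))
```

In this model the probe also restores the primed state, which mirrors the "Go to step 1" loop on the slide.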

SLIDE 34

CacheZoom: Interrupted Cache Attack

SLIDE 35

CacheQuote

SLIDE 36

CacheQuote Attack

Quoting Enclave:

Built-in Intel enclave implementing the EPID signature scheme

Attests the integrity of user-provided enclaves

▪ EPID implementation was not constant-time

SLIDE 37

CacheQuote Attack

Loop iteration count leaks the number of leading zero bits

CacheZoom to accurately measure

Feed the short vectors to a lattice and
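
As a hedged illustration of why an iteration count reveals leading zero bits, the Python sketch below assumes a loop that walks a secret value from its highest set bit down to bit 0, so that shorter values take fewer iterations. The 256-bit width and both function names are assumptions for this example and are not taken from the EPID implementation.

```python
def loop_iterations(secret):
    """Iterations of a loop that starts at the highest set bit of the secret."""
    return secret.bit_length()


def leading_zero_bits(secret, width=256):
    """Leading zero bits an observer can infer from the iteration count."""
    return width - loop_iterations(secret)


if __name__ == "__main__":
    for secret in (1 << 255, 1 << 250, (1 << 240) | 12345):
        print(loop_iterations(secret), leading_zero_bits(secret))
```

An attacker who can count iterations therefore learns an upper bound on the secret's size.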

SLIDE 38

Memory Speculation

SLIDE 39

Speculative Memory Accesses

SLIDE 40

Spoiler on Spoiler Attack

SLIDE 41

MicroWalk: Finding μArch Sources in Binaries

Detecting Leakages based on Binary Instrumentation and Mutual Information Analysis
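
The mutual information part of such an analysis can be sketched as follows. This is not MicroWalk's implementation: it is a plain plug-in estimate over (secret, observed trace) pairs, where the trace could be, for example, a hash of the memory addresses recorded by instrumentation. The function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Estimate I(secret; trace) in bits from a list of (secret, trace) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    secrets = Counter(secret for secret, _ in pairs)
    traces = Counter(trace for _, trace in pairs)
    total = 0.0
    for (secret, trace), count in joint.items():
        p_joint = count / n
        p_independent = (secrets[secret] / n) * (traces[trace] / n)
        total += p_joint * log2(p_joint / p_independent)
    return total


if __name__ == "__main__":
    constant_trace = [(s, "same") for s in range(4)]     # trace ignores the secret
    leaking_trace = [(s, f"path-{s}") for s in range(4)]  # trace reveals the secret
    print(mutual_information(constant_trace))
    print(mutual_information(leaking_trace))
```

A value of zero means the observed trace carries no information about the secret under this estimate, while larger values flag a candidate leak.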

SLIDE 42

Accelerators in the Cloud

SLIDE 43

Side-channel Threats: Shared FPGA-CPU Platforms

FPGAs on the cloud can boost applications

Optimized Application-specific Hardware Configuration

e.g., Real-time Artificial Intelligence

New Attack Surface:

Accelerator Function Units (AFUs) placed on the FPGA can be used to interact with the CPU or other AFUs for malicious purposes.

AFU to AFU Attack

AFU to HPS Attack

AFU to CPU Attack

CPU to AFU Attack

Across VMs?

SLIDE 44

Shared FPGA-CPU Platforms

SLIDE 45

Attack Vectors

Rowhammer

Trojan Bitstreams

Cache Attacks

Cold Boot

DMA/IOMMU

FPGA-centric Attacks

SLIDE 46

What is interesting about FPGA-CPU in the Cloud?

Infancy, Attack/Defense Playground (Intel SGX in 2015)

Customizable Hardware → More Devastating Attacks

E.g. Design your own timers, Direct access to memory interface, etc.

Complex Threat Model

SLIDE 47

Lab/Collaboration Setup

Weekly meeting (2 faculty + 3 students = 5 people actively involved)

Software

OPAE Stack

Intel Quartus (Synthesis)

KVM (Virtualization Scenario)

Hardware

Remote Access to Intel Labs (Xeon)

Local Server including Intel PAC

Heavy Load Workstation (Synthesis)

SLIDE 48

Ongoing Work: Threat Modeling and Security Analysis

Threat Modeling of the Technology based on Modern Use Cases

Security Analysis of the Entire Stack Based on Available Resources

SLIDE 49

WPI + Lübeck Team

SLIDE 50

Acknowledgements

Thanks to Carlos Rosaz, Matthias Schunter, Anand Rajan, Evan Custodio from Intel

SLIDE 51

THANKS

▪ Questions?

SLIDE 52

Ongoing Work: Replicating μArch Attacks on FPGA-CPU Interface

Memory Interface and the Cache Coherency Protocol

Side-channel Analysis of Memory Operations
