CPU Side-Channel Attacks: the Meltdown Attack, Heechul Yun

SLIDE 1

CPU Side-Channel Attacks: the Meltdown Attack

Heechul Yun

SLIDE 2

This Week: Hardware Security

  • Introduction

– Meltdown

  • Papers

– Spectre Attacks: Exploiting Speculative Execution, IEEE Security and Privacy (S&P), 2019 (Dustin)
– SpectreGuard: An Efficient Data-centric Defense Mechanism against Spectre Attacks, DAC, 2019 (Jacob)

SLIDE 3

Meltdown

  • What is it?

– An attack that exploits a flaw in Intel CPUs that allows any user-level process to read the contents of kernel-only accessible memory, which usually covers the entire DRAM

  • What’s the impact?

– An attacker can dump the entire memory, including passwords and other confidential information

  • Which CPUs are affected?

– Almost all Intel CPUs that do Out-of-Order Execution to improve performance

SLIDE 4

Meltdown

  • Is there any solution to address the issue?

– Yes. Major OS vendors already provide a solution.
– But it comes at a performance cost of varying degree (we will see why later).

  • What are the technical details?

– Meltdown paper

  • Meltdown, Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, Mike Hamburg, arXiv preprint (submitted on 3 Jan 2018)

SLIDE 5

Outline

  • Background

– Virtual Memory
– Out-of-Order Execution
– Cache Side-Channel

  • Meltdown Attack
  • Meltdown Defense
  • Other Side-Channel Attacks

SLIDE 6

Virtual Memory

  • Abstraction

– A large (e.g., 4GB) linear address space for each process

  • Reality

– A limited (e.g., 1GB) amount of actual physical memory shared with many other processes

  • How?

SLIDE 7

MMU

[Figure: Process A, Process B, and Process C issue virtual addresses; the MMU translates each virtual address into a physical address in Physical Memory.]

SLIDE 8

Page Table

  • OS-managed table that describes the virtual-to-physical address mapping.
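To make the mapping concrete, here is a minimal sketch of a page-table lookup. It assumes 4 KiB pages and a flat, single-level table stored as a dictionary; real MMUs walk multi-level tables in hardware. The names `translate`, `table_a`, and `table_b` are hypothetical and chosen only for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


class PageFault(Exception):
    """Raised when a virtual page has no mapping."""


def translate(page_table, vaddr):
    """Translate vaddr with a flat page table (dict: virtual page number -> frame number)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise PageFault(hex(vaddr))
    return page_table[vpn] * PAGE_SIZE + offset


# Hypothetical tables: both processes use virtual page 0x10,
# but the OS maps it to different physical frames.
table_a = {0x10: 0x200}
table_b = {0x10: 0x300}

print(hex(translate(table_a, 0x10004)))  # lands in frame 0x200, offset 4
print(hex(translate(table_b, 0x10004)))  # lands in frame 0x300, offset 4
```

The same virtual address resolves to different physical addresses depending on which table is active, which is what gives each process its own view of memory.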

SLIDE 9

Properties of Virtual Memory

  • Memory isolation among different processes

– E.g., process A cannot see process B’s memory (and vice versa).

  • What about memory isolation between kernel and user?

– Q1. How does the kernel map its own private memory?
– Q2. How to prevent user processes from accessing the kernel-mapped memory?

SLIDE 10

Kernel/User Virtual Memory

  • Kernel memory

– Kernel code, data
– Identical in all address spaces
– Fixed 1-1 mapping of physical memory

  • User memory

– Process code, data, heap, stack, ...
– Unique to each address space
– On-demand mapping (page fault)

[Figure: address space layout. Kernel: 0xC0000000 to 0xFFFFFFFF; User: 0x00000000 up to 0xC0000000.]

SLIDE 11

Kernel/User Virtual Memory

  • Every user process has mappings to kernel memory
  • But the kernel memory is only accessible in kernel mode

– when you execute system calls or interrupt handlers.

  • Benefits of this design: Performance

– The kernel can move data between user memory and kernel memory easily without changing the address space.
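The "mapped but only accessible in kernel mode" rule can be sketched with a per-page user/supervisor flag. This is a simplified model, not a real page-table format: the `PTE` class, its `user` field, and the example addresses are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PTE:
    frame: int
    user: bool  # True: accessible in user mode; False: kernel mode only


class ProtectionFault(Exception):
    """Raised when user-mode code touches a kernel-only page."""


def lookup(page_table, vpn, kernel_mode):
    """Return the frame for vpn, enforcing the user/supervisor flag."""
    pte = page_table[vpn]
    if not pte.user and not kernel_mode:
        raise ProtectionFault(hex(vpn))
    return pte.frame


# Hypothetical address space: one user page and one kernel page,
# both present in the same page table.
address_space = {
    0x00010: PTE(frame=0x200, user=True),
    0xC0000: PTE(frame=0x001, user=False),
}
```

Calling `lookup(address_space, 0xC0000, kernel_mode=True)` succeeds, while the same call with `kernel_mode=False` raises `ProtectionFault`. The mapping itself is present in both cases; only the mode check differs.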

[Figure: address space layout. Kernel: 0xC0000000 to 0xFFFFFFFF; User: 0x00000000 up to 0xC0000000.]

SLIDE 12

ARM Page Table

SLIDE 13

Kernel/User Virtual Memory

  • Meltdown tricks the CPU so that the user can access its kernel memory
  • How?

– By exploiting weaknesses in Intel’s out-of-order execution engine

[Figure: address space layout. Kernel: 0xC0000000 to 0xFFFFFFFF; User: 0x00000000 up to 0xC0000000.]

SLIDE 14

Out-of-Order Execution

  • Background
  • A cache-miss can take ~100 cycles
  • Idling the CPU while waiting for data from memory is bad
  • Out-of-order execution

– A technique to minimize data waiting time by executing later instructions that do not depend on the pending data
– Introduced in 1967 (Tomasulo algorithm)

  • Most (all) high-performance CPUs use OoOE

– Intel, AMD, ARM, ...

SLIDE 15

Out-of-Order Execution

  • Instructions are fetched into a queue
  • Any instructions whose data (operands) are ready are executed out-of-order (subject to data dependency)
  • Results are “retired” in-order.
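The three rules above can be sketched with a small timing model. It is an idealized core, not any real microarchitecture: it assumes unlimited execution units, that every instruction is already in the queue, and that only true data dependencies delay execution. The instruction names and latencies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dst: str
    srcs: tuple
    latency: int  # cycles needed to produce the result


def simulate(program):
    """Return (finish, retire): the cycle each instruction finishes executing
    and the cycle it retires, for an idealized out-of-order core."""
    ready_at = {}  # register -> cycle at which its newest value is available
    finish, retire = {}, {}
    last_retire = 0
    for ins in program:  # walk the queue in program order
        start = max((ready_at.get(s, 0) for s in ins.srcs), default=0)
        finish[ins.name] = start + ins.latency
        ready_at[ins.dst] = finish[ins.name]
        # In-order retirement: never retire before an older instruction.
        last_retire = max(last_retire, finish[ins.name])
        retire[ins.name] = last_retire
    return finish, retire


program = [
    Instr("load", "r1", (), 100),         # cache miss, ~100 cycles
    Instr("add", "r2", ("r1",), 1),       # depends on the load
    Instr("mul", "r3", ("r4", "r5"), 3),  # independent of the load
]
finish, retire = simulate(program)
```

In this model the `mul` finishes executing long before the `load` does, yet it retires only after the `load` and the `add`. That window between execution and retirement is what Meltdown later exploits.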

SLIDE 16

Speculative Execution

  • Guess which branch to take.
  • Speculatively execute instructions in the likely branch.
  • If guessed wrong, squash the results
  • But the side-effects (cache state changes) remain

if (condition) {
    Do something A1
    Do something A2
    Do something A3
} else {
    Do something B1
    Do something B2
    Do something B3
}
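A toy model of the squash behavior, as a sketch only: the register file is rolled back on a wrong guess, but the set of cached lines is not. The function `run_branch`, the register `r1`, and the line names are hypothetical.

```python
def run_branch(condition, prediction, regs, cache):
    """Speculatively run the predicted side of a branch, then resolve it.

    regs is the architectural register file (dict); cache is the set of
    cached memory lines. Both are updated in place.
    """
    def path(taken):
        # Each side writes a register and touches a different memory line.
        if taken:
            regs["r1"] = "A"
            cache.add("line_A")
        else:
            regs["r1"] = "B"
            cache.add("line_B")

    checkpoint = dict(regs)      # saved so a wrong guess can be undone
    path(prediction)             # speculative execution
    if prediction != condition:  # guessed wrong
        regs.clear()
        regs.update(checkpoint)  # squash the architectural results...
        path(condition)          # ...and run the correct side
    # The cache is never rolled back.


regs, cache = {"r1": None}, set()
run_branch(condition=False, prediction=True, regs=regs, cache=cache)
```

After the mispredicted run, `regs` only shows the effect of the correct (else) side, but `cache` contains both `line_A` and `line_B`. The footprint of the squashed path is still observable.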

SLIDE 17

Cache Side-Channel Attack

  • A side-effect of a memory access is a change in micro-architectural state (cache state change)
  • By measuring access timing differences of a memory location, an attacker can determine whether the memory is cached or not.
  • This can be used to leak secret information
  • Methods: Flush+Reload, Prime+Probe, etc.

SLIDE 18

Flush+Reload Attack

  • Flush the shared cache-lines
  • Let the victim execute
  • Measure access timing of each cache-line

– Fast: accessed by the victim, Slow: not-accessed
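The three steps can be sketched against a simulated cache. This does not measure real hardware: the latencies, the threshold, and the victim's access pattern are assumed values for illustration, and `SimCache` is a hypothetical stand-in for shared cache lines.

```python
HIT_CYCLES, MISS_CYCLES = 40, 200  # assumed latencies, for illustration only
THRESHOLD = 100                    # below this, treat the access as a cache hit


class SimCache:
    """Toy cache that tracks which lines are present and reports a latency."""

    def __init__(self):
        self.lines = set()

    def flush(self, line):
        self.lines.discard(line)

    def access(self, line):
        latency = HIT_CYCLES if line in self.lines else MISS_CYCLES
        self.lines.add(line)  # any access brings the line into the cache
        return latency


def flush_reload(cache, shared_lines, victim):
    """Return the shared lines that the victim touched."""
    for line in shared_lines:                   # 1. flush
        cache.flush(line)
    victim(cache)                               # 2. let the victim execute
    return [line for line in shared_lines
            if cache.access(line) < THRESHOLD]  # 3. reload and time


def victim(cache):
    # Hypothetical victim whose accesses depend on a secret.
    for line in (3, 7):
        cache.access(line)


touched = flush_reload(SimCache(), list(range(16)), victim)
```

On real hardware the flush would be an instruction such as `clflush` and the latency would come from a cycle counter; here both are modeled so the logic can be followed without them.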

Image credit: “Cache Side Channels: State of the Art and Research Opportunities” by Prof. Yinqian Zhang at OSU

SLIDE 19

Prime+Probe Attack

  • The attacker primes the cache sets
  • The victim runs and evicts the attacker’s cache-lines
  • The attacker measures the access timing of each set

– Slow: victim accessed, Fast: victim not accessed
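A sketch of the same idea on a simulated set-associative cache. It assumes LRU replacement, a simple modulo set index, and illustrative latencies; real caches use other replacement policies and index functions. The cache geometry and the victim's addresses are hypothetical.

```python
from collections import OrderedDict

HIT_CYCLES, MISS_CYCLES = 40, 200  # assumed latencies, for illustration only


class SetAssocCache:
    """Toy set-associative cache with LRU replacement."""

    def __init__(self, num_sets, ways):
        self.num_sets, self.ways = num_sets, ways
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, line):
        entries = self.sets[line % self.num_sets]
        if line in entries:
            entries.move_to_end(line)    # mark as most recently used
            return HIT_CYCLES
        if len(entries) == self.ways:
            entries.popitem(last=False)  # evict the least recently used line
        entries[line] = True
        return MISS_CYCLES


def prime_probe(cache, victim):
    """Return the indices of the cache sets that the victim touched."""
    def attacker_lines(s):
        return [s + way * cache.num_sets for way in range(cache.ways)]

    for s in range(cache.num_sets):      # 1. prime every set
        for line in attacker_lines(s):
            cache.access(line)
    victim(cache)                        # 2. victim evicts some attacker lines
    slow_sets = []
    for s in range(cache.num_sets):      # 3. probe and time each set
        total = sum(cache.access(line) for line in attacker_lines(s))
        if total > cache.ways * HIT_CYCLES:
            slow_sets.append(s)
    return slow_sets


def victim(cache):
    # Hypothetical victim lines that map to sets 2 and 5.
    for s in (2, 5):
        cache.access(s + 100 * cache.num_sets)


touched_sets = prime_probe(SetAssocCache(num_sets=8, ways=4), victim)
```

Unlike Flush+Reload, this variant needs no memory shared with the victim: the attacker only learns which sets were touched, not which exact lines.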

Image credit: “Cache Side Channels: State of the Art and Research Opportunities” by Prof. Yinqian Zhang at OSU

SLIDE 20

Outline

  • Background
  • Meltdown Attack
  • Meltdown Defense
  • Other Side-Channel Attacks

SLIDE 21

Meltdown

  • An exploit of Out-of-Order Execution, which changes the microarchitectural state in a way that leaks information

SLIDE 22

Toy Example

  • Line 1 diverts execution flow to the exception handler.
  • So, line 3 should not be reached.

SLIDE 23

Toy Example

  • But due to OoOE, line 3 may have already been executed before the exception is raised.
  • The OoOE results will be squashed, but the side-effect (cache state change) remains.

SLIDE 24

Toy Example

  • By measuring access timings, you can uncover the value of ‘data’

SLIDE 25

Meltdown

  • This is it.

SLIDE 26

Meltdown

  • Step 1: load the content of an attacker-chosen kernel address into a register. This raises an exception, but raising it may take some time.

SLIDE 27

Meltdown

  • Step 2: access memory at an address derived from the secret content of the rax register. The accessed line is brought into the cache.

SLIDE 28

Meltdown

  • Step 3: measure the access timing of the probe array (one access per page) to determine which page is in the cache, which in turn tells what the content of the kernel address [rcx] was.
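Steps 1 to 3 can be put together in a purely simulated model. This sketch does not run the attack on real hardware: `ToyCPU` is a hypothetical model in which the illegal load and the dependent probe-array access are assumed to execute before the exception is delivered. The latencies, threshold, base address, and secret are made-up values.

```python
PAGE = 4096
HIT_CYCLES, MISS_CYCLES, THRESHOLD = 40, 200, 100  # assumed, illustrative


class SegFault(Exception):
    """Models the exception raised by the illegal load."""


class ToyCPU:
    def __init__(self, kernel_memory):
        self.kernel_memory = kernel_memory  # address -> byte, kernel-only
        self.cached = set()                 # cached probe-array offsets
        self.rax = 0                        # architectural register

    def user_load(self, rcx):
        # Step 1: the load from kernel address rcx executes out of order,
        # before the permission check takes effect.
        transient_rax = self.kernel_memory[rcx]
        # Step 2: a dependent access touches probe_array[rax * 4096].
        self.cached.add(transient_rax * PAGE)
        # The exception arrives: transient_rax is squashed and self.rax is
        # never updated, but the cache footprint stays.
        raise SegFault(hex(rcx))

    def probe_time(self, offset):
        return HIT_CYCLES if offset in self.cached else MISS_CYCLES


def leak_byte(cpu, rcx):
    cpu.cached.clear()                      # flush the probe array
    try:
        cpu.user_load(rcx)
    except SegFault:
        pass                                # the attacker survives the fault
    # Step 3: time one access per page of the probe array.
    for value in range(256):
        if cpu.probe_time(value * PAGE) < THRESHOLD:
            return value
    return None


BASE = 0xFFFF880000000000  # hypothetical kernel address
secret = b"secret"
cpu = ToyCPU({BASE + i: byte for i, byte in enumerate(secret)})
dumped = bytes(leak_byte(cpu, BASE + i) for i in range(len(secret)))
```

The architectural register `cpu.rax` never holds the secret, yet `dumped` recovers it one byte at a time from the cache footprint alone. Spacing the probe entries one page apart gives each possible byte value its own page.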

SLIDE 29

Meltdown

  • The entire physical memory is 1-to-1 mapped into part of kernel memory. So, you can dump it.

[Figure: process address space (kernel and user regions) and physical memory.]

SLIDE 30

Outline

  • Background
  • Meltdown Attack
  • Meltdown Defense

– KAISER/KPTI

  • Other Side-Channel Attacks

SLIDE 31

KAISER/KPTI

  • OS solution to mitigate the Meltdown attack.
  • Idea: Do not map the kernel in each process’s address space. Instead, the kernel has its own address space.
  • If the kernel is not mapped in a user process’s address space, there’s no way to access kernel memory, thus preventing Meltdown.
  • But, to enter the kernel (system call or interrupt), you first have to switch to the kernel’s address space (a context switch).
  • There’s direct/indirect overhead of a context switch.
  • I/O intensive workloads are especially affected by this.

SLIDE 32

KAISER/KPTI

  • Hide the kernel from the user’s address space.
  • Switch to the kernel’s address space to enter the kernel (i.e., system calls, interrupt handling)

SLIDE 33

KAISER/KPTI

  • Just the top-level table (PML4) needs to have two distinct copies. Memory overhead is low.
  • Benefits: kernel addresses are not mapped in the user’s address space. So, no speculative reads of kernel memory can happen.
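The two-table idea can be sketched as follows. This is a simplified model rather than the Linux implementation: the `Process` class, its `cr3` attribute, and the page numbers are hypothetical, and the page tables are flat dictionaries.

```python
class PageFault(Exception):
    """Raised when the active page table has no mapping for a page."""


class Process:
    def __init__(self, user_pages, kernel_pages):
        # Two top-level tables per process, as with KAISER/KPTI.
        self.user_table = dict(user_pages)                  # used in user mode
        self.kernel_table = {**user_pages, **kernel_pages}  # used in kernel mode
        self.cr3 = self.user_table                          # active table

    def translate(self, vpn):
        if vpn not in self.cr3:
            raise PageFault(hex(vpn))
        return self.cr3[vpn]

    def syscall(self, handler):
        self.cr3 = self.kernel_table    # entering the kernel: switch tables
        try:
            return handler(self)
        finally:
            self.cr3 = self.user_table  # returning to user mode: switch back


proc = Process(user_pages={0x10: 0x200}, kernel_pages={0xC0000: 0x001})
```

In user mode, `proc.translate(0xC0000)` raises `PageFault` because no translation exists at all, so there is nothing for a transient load to read. Inside `proc.syscall(...)` the same lookup succeeds. Every `syscall` pays for two table switches, which is where the overhead comes from.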

SLIDE 34

KAISER/KPTI: Overhead

  • Flush TLB on context-switch between user/kernel

– CPUs that support PCID (or ASID) don’t need to flush the TLB.

  • Page table swapping (CR3 update): ~hundreds of cycles
  • The overhead is incurred for every single system call/interrupt, leading to a noticeable performance hit (5-30%)
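A back-of-the-envelope estimate shows why the hit depends so strongly on the workload. All numbers below are assumptions for illustration (a 3 GHz core, 400 extra cycles per system call for the two CR3 updates, and made-up system call rates). The estimate covers only the direct switching cost and ignores indirect costs such as TLB misses.

```python
def kpti_overhead(syscalls_per_sec, extra_cycles_per_syscall, cpu_hz):
    """Fraction of CPU time spent on the extra address-space switches."""
    return syscalls_per_sec * extra_cycles_per_syscall / cpu_hz


# Hypothetical workloads on an assumed 3 GHz core, with an assumed
# 400 extra cycles per system call (two CR3 updates: entry and exit).
compute_bound = kpti_overhead(10_000, 400, 3e9)
io_bound = kpti_overhead(500_000, 400, 3e9)
print(f"compute-bound: {compute_bound:.2%}, I/O-bound: {io_bound:.2%}")
```

With these assumed inputs the I/O-bound workload loses 50 times more CPU time than the compute-bound one, simply because it enters the kernel 50 times more often.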

SLIDE 35

Summary

  • OoOE introduces side-effects, which can be exploited to leak information.
  • Meltdown is such an attack, exploiting a weakness in Intel CPUs’ OoOE implementation. It can dump the entire kernel memory.
  • KAISER/KPTI mitigates the issue by completely separating the kernel from user processes’ address spaces.
  • But the cost can be quite high for OS intensive workloads.

SLIDE 36

References

  • M. Lipp et al., Meltdown, arXiv preprint, 2018
  • D. Gruss et al., KASLR is Dead: Long Live KASLR, 2017
  • https://lwn.net/Articles/738975/

SLIDE 37

Reliability Side-Channel

  • Disturbance errors in DRAM (*)
  • a.k.a. Row Hammer Bug
  • Repeatedly opening/closing a DRAM row can cause bit flips in adjacent rows.
  • Found in more than 80% of DRAM modules from 2010-2013
  • Google demonstrated a successful attack utilizing the bug (**)

– manipulate page tables at the user-level

(*) Yoongu Kim et al., “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors,” ISCA’14
(**) Google Project Zero, “Exploiting the DRAM rowhammer bug to gain kernel privileges,” 2015
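A deliberately simplified model of the disturbance effect, as a sketch only. Real bit flips are probabilistic and depend on the module, the data pattern, and the refresh interval. Here a flip is assumed to happen deterministically once a row has been activated a hypothetical `flip_threshold` times since the last refresh.

```python
class ToyDRAM:
    """Toy DRAM bank: hammering a row disturbs its neighbours."""

    def __init__(self, num_rows, flip_threshold):
        self.rows = [0xFF] * num_rows          # one byte of data per row
        self.activations = [0] * num_rows      # activations since last refresh
        self.flip_threshold = flip_threshold   # hypothetical value

    def activate(self, row):
        """Open and close a row once."""
        self.activations[row] += 1
        if self.activations[row] == self.flip_threshold:
            for victim in (row - 1, row + 1):
                if 0 <= victim < len(self.rows):
                    self.rows[victim] ^= 0x01  # flip one bit in the neighbour

    def refresh(self):
        self.activations = [0] * len(self.rows)


dram = ToyDRAM(num_rows=8, flip_threshold=10_000)
for _ in range(10_000):
    dram.activate(4)                           # the aggressor row
```

The aggressor only ever touches row 4, yet the data in rows 3 and 5 changes. This is why the bug breaks the usual assumption that memory protection is enforced purely by address translation.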

SLIDE 38

[Figure: DRAM chip with rows of cells and their wordlines (VLOW/VHIGH, opened/closed); an aggressor row sits between two victim rows.]

Repeatedly opening and closing a row induces disturbance errors in adjacent rows

This slide is from Dr. Yoongu Kim’s ISCA 2014 presentation

SLIDE 39

RowHammer

SLIDE 40

References

  • Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors, ISCA, 2014
  • Drammer: Deterministic Rowhammer Attacks on Mobile Platforms, CCS'16 [blog] (optional)

SLIDE 41

DeepPicar Build

  • Parts

– Playstation Eye camera x1
– Pololu DRV8835 motor hat x1
– New Bright 1:24 RC Car x1
– INIU USB power bank battery (2.4A) x1
– Form board 11x14x3/16in x1
– Machine Screw: #2-56, 1/4″ x4 (or more)
– JST-USB cable x1
– Jumper wires male->female x6

  • Instructions

– https://docs.google.com/document/d/1Tzx2_GrA6KHZhD3Up83Q3eieVpPXd5ZUJ4jvwPXNy6o/edit?usp=sharing

SLIDE 42

Appendix

SLIDE 43

Address Layout

  • 32 bit systems

– User: 3GB (0x00000000 to 0xbfffffff)
– Kernel: 1GB (0xc0000000 to 0xffffffff)

  • 64 bit systems (x86-64)

– User: 2^47 bytes (0x0 to 0x7fffffffffff)
– Kernel: 0xffff880000000000 to 0xffffffffffffffff
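The ranges above translate directly into a small classifier. This is a sketch that uses only the boundaries listed on this slide; the function names are hypothetical, and the "other" label simply means the address falls outside both listed 64-bit ranges.

```python
def region_32(addr):
    """Classify a 32-bit address under the 3GB/1GB split."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return "kernel" if addr >= 0xC0000000 else "user"


def region_64(addr):
    """Classify an x86-64 address using the ranges listed above."""
    if 0 <= addr <= 0x7FFFFFFFFFFF:
        return "user"
    if 0xFFFF880000000000 <= addr <= 0xFFFFFFFFFFFFFFFF:
        return "kernel"
    return "other"  # outside both listed ranges
```

For example, `region_32(0xBFFFFFFF)` is the last user address and `region_32(0xC0000000)` is the first kernel address.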

SLIDE 44

Foreshadow

  • https://software.intel.com/security-software-guidance/insights/deep-dive-intel-analysis-l1-terminal-fault