Security Needs a Better Hardware-Software Contract | Gernot Heiser



SLIDE 1

https://trustworthy.systems

Security Needs a Better Hardware-Software Contract

Gernot Heiser | gernot@unsw.edu.au | @GernotHeiser

  • DAC’19, Las Vegas, 5 June 2019
SLIDE 2

Threats

  • Speculation: an “unknown unknown” until recently
  • Microarchitectural timing channels: a “known unknown” for decades

SLIDE 3

What Are Timing Channels?

SLIDE 4

Timing Channels

Information leakage through timing of events

  • Typically by observing response latencies or own execution speed

Covert channel: Information flow that bypasses the security policy

  • Trojan (High) encodes info, Spy (Low) observes

Side channel: Covert channel exploitable without insider help

  • Victim (High) executes normally, Attacker (Low) observes

SLIDE 5

Cause: Competition for Shared HW Resources

Shared hardware

  • Inter-process interference
  • Competing access to microarchitectural features
  • Hidden by the HW-SW contract!

[Figure: High and Low domains on shared hardware; sharing affects execution speed]

SLIDE 6

Security: A HW-SW Codesign Issue

SLIDE 7

Enforcing Security

[Figure: High and Low domains run on the operating system, which enforces policies; the hardware (CPU etc) provides mechanisms; the HW-SW contract sits between the two]

SLIDE 8

Why Hardware Cannot Do Security Alone

  • Security policies are high-level
  • Coarse-grain: “applications” are sets of cooperating processes
  • Hardware mechanisms are fine-grain: instructions, pages, address spaces
  • Much semantics lost in mapping to hardware level
  • Security policies are complex: “Can A talk to B?” is too simple
      • maybe one-way communication is allowed
      • maybe communication is allowed under certain conditions
      • maybe low-bandwidth leakage doesn’t matter
      • maybe secrets only matter for a short time
      • maybe only a subset of {confidentiality, integrity, availability} is important

SLIDE 9

Why the ISA is an Insufficient Contract

  • The ISA is a purely operational contract
  • Sufficient for ensuring functional correctness
  • Insufficient for ensuring confidentiality or availability

  • Observe execution speed: confidentiality violation
  • Affect execution speed: availability violation

The ISA intentionally abstracts time away

SLIDE 10

What Is Needed?

SLIDE 11

Confidentiality Needs Time Protection

Time protection: A collection of OS mechanisms which collectively prevent interference between security domains that make execution speed in one domain dependent on the activities of another. [Ge et al. EuroSys’19]

Traditionally OSes enforce security by memory protection, i.e. enforcing spatial isolation

SLIDE 12

Time Protection: Partition Hardware

Temporally partition: flush

  • Cannot spatially partition on-core caches (L1, TLB, branch predictor, pre-fetchers)
      • virtually-indexed
      • OS cannot control

Spatially partition

  • Flushing useless for concurrent access
      • HW threads
      • cores

Need both!

SLIDE 13

Requirements for Time Protection

Timing channels can be closed iff the OS can

  • (spatially) partition or
  • reset

all shared hardware

Categories of shared hardware: on-core state; off-core state & stateless HW

SLIDE 14

Sharing 1: Stateless Interconnect

HW is bandwidth-limited

  • Interference during concurrent access
  • Generally reveals no data or addresses
  • Must encode info into access patterns
  • Only usable as covert channel, not side channel

[Figure: High and Low domains accessing memory over a shared interconnect]

No effective defence with present hardware!

SLIDE 15

Sharing 2: Stateful Hardware

HW is capacity-limited

  • Interference during
      • concurrent access
      • time-shared access
  • Collisions reveal addresses
  • Usable as side channel

[Figure: High and Low domains sharing a cache]

Any state-holding microarchitectural feature:

  • cache, branch predictor, pre-fetcher state machine

Solvable problem – focus of this work

SLIDE 16

Implementing Time Protection on Stateful Hardware

SLIDE 17

Spatial Partitioning: Cache Colouring

[Figure: RAM frames mapped by colour onto disjoint cache partitions for High and Low; each partition has its own kernel objects (TCB, PT)]

  • Partitions get frames of disjoint colours
  • seL4: userland supplies kernel memory ⇒ colouring userland colours dynamic kernel memory
  • Per-partition kernel image to colour kernel

[Ge et al. EuroSys’19]
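A minimal sketch of how frame colours could be computed, assuming a physically-indexed, set-associative cache. The cache geometry and the names `num_colours`, `frame_colour` and `frames_for_partition` are illustrative assumptions, not taken from the talk or from seL4.

```python
def num_colours(cache_size: int, associativity: int, page_size: int = 4096) -> int:
    """Number of page colours: cache bytes per way divided by the page size."""
    way_size = cache_size // associativity
    return max(1, way_size // page_size)


def frame_colour(phys_addr: int, colours: int, page_size: int = 4096) -> int:
    """Colour of the frame containing phys_addr: low bits of the frame number."""
    return (phys_addr // page_size) % colours


def frames_for_partition(frames, allowed_colours, colours, page_size=4096):
    """Keep only the frames whose colour belongs to this partition."""
    return [f for f in frames if frame_colour(f, colours, page_size) in allowed_colours]


# Hypothetical 2 MiB, 16-way cache with 4 KiB pages.
COLOURS = num_colours(2 * 1024 * 1024, 16)
all_frames = [n * 4096 for n in range(64)]
high_frames = frames_for_partition(all_frames, set(range(0, 16)), COLOURS)
low_frames = frames_for_partition(all_frames, set(range(16, 32)), COLOURS)
```

Because the two partitions draw from disjoint colour sets, their frames never map to the same cache sets in this model.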

SLIDE 18

Temporal Partitioning: Flush on Switch

  1. T0 = current_time()
  2. Switch user context
  3. Flush on-core state                      // latency depends on prior execution!
  4. Touch all shared data needed for return  // ensure deterministic execution
  5. while (current_time() < T0+WCET) ;       // time padding to remove the dependency
  6. Reprogram timer
  7. return

Must remove any history dependence!
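The steps above can be sketched with a simulated cycle counter. This is a toy model under stated assumptions: `SimClock`, `padded_switch` and all cycle costs are hypothetical, and a real kernel would read a hardware timer instead.

```python
class SimClock:
    """Hypothetical cycle counter standing in for current_time()."""

    def __init__(self):
        self.now = 0

    def advance(self, cycles: int) -> None:
        self.now += cycles


def padded_switch(clock: SimClock, flush_cost: int, wcet: int) -> int:
    """Run one domain switch and return its total latency in cycles."""
    t0 = clock.now                  # 1. T0 = current_time()
    clock.advance(50)               # 2. switch user context (fixed cost assumed)
    clock.advance(flush_cost)       # 3. flush; cost depends on prior execution
    clock.advance(20)               # 4. touch shared data needed for return
    if clock.now > t0 + wcet:
        raise RuntimeError("WCET bound too small: padding cannot hide the flush")
    while clock.now < t0 + wcet:    # 5. pad until T0 + WCET has elapsed
        clock.advance(1)
    return clock.now - t0           # 6./7. reprogram timer and return (omitted)


clock = SimClock()
latencies = [padded_switch(clock, cost, wcet=1000) for cost in (10, 400, 900)]
```

In this model every switch takes exactly `wcet` cycles regardless of the flush cost, provided the bound really is an upper bound.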

SLIDE 19

Reality Check: Flushing On-Core State

SLIDE 20

Evaluating Intra-Core Channels

Flush mitigation on Intel and Arm processors:

  • Disable data prefetcher (just to be sure)
  • On context switch, perform all architected flush operations:
      • Intel: wbinvd + invpcid (no targeted L1-cache flush supported!)
      • Arm: DCCISW + ICIALLU + TLBIALL + BPIALL

SLIDE 21

Methodology: Prime and Probe

  1. Spy (Low): fill cache with own data
  2. Trojan (High): touch n cache lines (input signal)
  3. Spy (Low): traverse cache, measure execution time (output signal)

The Trojan encodes; the Spy observes.
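A toy simulation of prime and probe, assuming a direct-mapped cache in which a set holds one line and an access by another domain evicts it. `SimCache`, `prime_and_probe` and the hit and miss costs are hypothetical; real measurements are noisy and use a cycle counter.

```python
HIT, MISS = 1, 10  # hypothetical access costs in cycles


class SimCache:
    """Toy direct-mapped cache: each set remembers which domain owns its line."""

    def __init__(self, sets: int):
        self.owner = [None] * sets

    def access(self, domain: str, index: int) -> int:
        cost = HIT if self.owner[index] == domain else MISS
        self.owner[index] = domain  # the accessed line is now cached
        return cost


def prime_and_probe(n: int, sets: int = 64) -> int:
    cache = SimCache(sets)
    for s in range(sets):           # 1. spy primes: fills the cache with own data
        cache.access("spy", s)
    for s in range(n):              # 2. trojan touches n cache lines
        cache.access("trojan", s)
    # 3. spy probes: traverses the cache and measures the total time
    return sum(cache.access("spy", s) for s in range(sets))


probe_times = {n: prime_and_probe(n) for n in (0, 16, 32)}
```

In this noise-free model the probe time grows linearly with n, which is what lets the Spy decode the Trojan's input.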
SLIDE 22

Methodology: Channel Matrix

Channel matrix:

  • Conditional probability of observing time, t, given input, n.
  • Represented as heat map:
      • bright = high probability

[Figure: channel matrix of the raw I-cache channel on Intel Sandy Bridge; axes: cache sets accessed (input) and probing time in cycles (output). Horizontal variation indicates channel.]
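A minimal sketch of estimating such a channel matrix from measurements, assuming samples arrive as (input n, observed time t) pairs. The function names, the bin width and the sample values are made up for illustration, and `shows_channel` is only a naive exact-match comparison, not a statistical test.

```python
from collections import Counter, defaultdict


def channel_matrix(samples, bin_width: int):
    """Estimate P(t | n) from (input n, measured time t) pairs.

    Returns {n: {time_bin: probability}}; each row sums to 1.
    """
    counts = defaultdict(Counter)
    for n, t in samples:
        counts[n][(t // bin_width) * bin_width] += 1
    return {
        n: {b: c / sum(row.values()) for b, c in sorted(row.items())}
        for n, row in counts.items()
    }


def shows_channel(matrix) -> bool:
    """Naive check: rows that differ mean the output depends on the input."""
    rows = list(matrix.values())
    return any(row != rows[0] for row in rows[1:])


# Made-up measurements: (cache sets touched by the Trojan, Spy probe time in cycles)
samples = [(0, 7010), (0, 7040), (0, 7090), (0, 7130),
           (32, 9020), (32, 9050), (32, 9110), (32, 9080)]
matrix = channel_matrix(samples, bin_width=100)
```

Each row of `matrix` corresponds to one input value; plotting the rows side by side gives the heat map.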
SLIDE 23

I-Cache Channel With Full State Flush

[Figure: four channel matrices after a full state flush, for Intel Sandy Bridge, Intel Haswell, Intel Skylake and HiSilicon A53; axes: input (cache sets) and output time (cycles). Panel annotations: “CHANNEL!” (twice), “No evidence of channel”, “SMALL CHANNEL!”]

SLIDE 24

HiSilicon A53 Branch History Buffer

Branch history buffer (BHB)

  • One-bit channel
  • All reset operations applied

[Figure: channel matrix; Trojan signal (0 or 1) against Spy execution time, log-scale probability]

Channel!

SLIDE 25

Intel Haswell Branch Target Buffer

Branch target buffer

  • All reset operations applied

[Figure: channel matrix; axis labels “Trojan cache footprint” and “Spy execution time”, time in cycles]

Channel!

Found residual channels in all recent Intel and ARM processors examined!

SLIDE 26

Intel Spectre Defences

Intel added indirect branch control (IBC) feature, which closes most channels, but…

[Figure: Intel Skylake branch history buffer channel matrix] Small channel!

https://ts.data61.csiro.au/projects/TS/timingchannels/arch-mitigation.pml

SLIDE 27

Requirements on Hardware

SLIDE 28

New HW/SW Contract: aISA

Augmented ISA supporting time protection

For all shared microarchitectural resources:

  1. Resource must be spatially partitionable or flushable
  2. Concurrently shared resources must be spatially partitioned
  3. Resource accessed solely by virtual address must be flushed and not concurrently accessed
      • Implies cannot share HW threads across security domains!
  4. Mechanisms must be sufficiently specified for OS to partition or reset
  5. Mechanisms must be constant time, or of specified, bounded latency
  6. Desirable: OS should know if resettable state is derived from data, instructions, data addresses or instruction addresses
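Requirements 1 to 3 can be read as a check over a description of each shared resource. The sketch below is one possible encoding; the `Resource` fields, the `violations` function and the example resources are hypothetical and not part of the talk.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical descriptor of a shared microarchitectural resource."""
    name: str
    partitionable: bool        # OS can spatially partition it
    flushable: bool            # OS can reset it
    concurrently_shared: bool  # accessed by several domains at the same time
    virtually_addressed: bool  # accessed solely by virtual address


def violations(r: Resource) -> list:
    """Return the numbers of requirements 1-3 that the resource cannot meet."""
    failed = []
    if not (r.partitionable or r.flushable):
        failed.append(1)
    if r.concurrently_shared and not r.partitionable:
        failed.append(2)
    if r.virtually_addressed and (not r.flushable or r.concurrently_shared):
        failed.append(3)
    return failed


# Illustrative inputs only; the flags are assumptions, not measured properties.
l1_across_threads = Resource("L1 cache shared by HW threads", False, True, True, True)
coloured_llc = Resource("coloured last-level cache", True, False, True, False)
interconnect = Resource("interconnect", False, False, True, False)
```

Under these assumed flags, the L1 cache shared across hardware threads fails requirements 2 and 3, matching the remark that HW threads cannot be shared across security domains.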

SLIDE 29

https://trustworthy.systems

THANK YOU

Gernot Heiser | gernot@unsw.edu.au | @GernotHeiser