SLIDE 1

SCATTERCACHE: Thwarting Cache Attacks via Cache Set Randomization

Werner, Unterluggauer, Giner, Schwarz, Gruss, Mangard August 15, 2019

Graz University of Technology

SLIDE 2

What is SCATTERCACHE?

www.tugraz.at

  • Alternative design for n-way set associative caches
  • Designed as a countermeasure against cache attacks
  • Breaks the fixed link between addresses and cache sets
  • Increases the number of possible cache sets
  • Security domain IDs (SDIDs) to change the mapping between security domains

→ Exploitation of side channel information is much harder

  • Reuses established concepts
      • Skewed caches [Sez93]
      • Low latency cryptography (e.g., QARMA-64 [Ava17])
  • Still similar to existing cache designs (usability, hardware)

1 Werner, Unterluggauer, Giner, Schwarz, Gruss, Mangard — Graz University of Technology

SLIDE 3

Motivation and Background

SLIDE 4

CPU Cache

[Figure: animation of repeated printf("%d", i); calls. The first access to i is a cache miss: a request goes to DRAM and the response brings i into the cache. Later accesses to i are cache hits.]

  • Cache miss: DRAM access, slow
  • Cache hit: no DRAM access, much faster

SLIDE 5

Memory Access Latency

[Figure: histogram of memory access latency. x-axis: Latency [Cycles], 50 to 400; y-axis: Number of Accesses, up to 3·10^6; series: Cache Hits, Cache Misses.]

Generated using the CTA calibration tool [GSM15] on my i5-4200U laptop.

SLIDE 6

Regular 2-way Set Associative Cache

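[Figure: regular 2-way set associative cache. The memory address is split into fields labelled Tag, n bits (cache index), and b bits. A function f maps the cache index to one of 2^n cache sets. Each set has two ways (Way 1, Way 2), each storing a tag and data; both stored tags are compared (=?) with the address tag to select the data.]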
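
The lookup in the figure can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the slide's hardware: the constants `OFFSET_BITS`, `INDEX_BITS`, the class name `TwoWayCache`, the identity mapping for f, and the FIFO replacement are all illustrative choices.

```python
OFFSET_BITS = 5   # assumed: 32 B cache lines
INDEX_BITS = 2    # assumed toy size: 2**2 = 4 cache sets
N_WAYS = 2


class TwoWayCache:
    def __init__(self):
        # One list of stored tags per cache set, oldest entry first.
        self.sets = [[] for _ in range(1 << INDEX_BITS)]

    @staticmethod
    def split(addr):
        """Split an address into (tag, cache index); the offset is dropped."""
        index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = addr >> (OFFSET_BITS + INDEX_BITS)
        return tag, index

    def access(self, addr):
        """Return True on a cache hit, False on a miss (the line is then filled)."""
        tag, index = self.split(addr)
        ways = self.sets[index]
        if tag in ways:              # the "=?" tag comparison in each way
            return True
        if len(ways) == N_WAYS:      # set is full: evict the oldest line (FIFO)
            ways.pop(0)
        ways.append(tag)
        return False


cache = TwoWayCache()
print(cache.access(0x1000))  # first access: miss
print(cache.access(0x1000))  # second access: hit
```

Because the index is taken directly from the address bits, any two addresses with the same index bits always compete for the same two ways. That fixed link is what the rest of the talk sets out to break.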

SLIDE 7

Prime+Probe

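[Figure: PRIME+PROBE. The attacker address space and the victim address space both map into the shared cache. The attacker loads data to fill cache sets, the victim loads data, and the attacker then measures a fast access for lines that are still cached and a slow access for lines the victim evicted.]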
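
A minimal simulation of the prime, wait, and probe steps, assuming a toy 2-way cache with 4 sets, 32 B lines, and LRU replacement. The names `ToyCache` and `prime_probe` and all sizes are hypothetical, and a boolean hit/miss result stands in for the timing measurement.

```python
N_WAYS = 2
N_SETS = 4
LINE_BYTES = 32  # assumed line size


class ToyCache:
    def __init__(self):
        # Per set: list of tags, least recently used first.
        self.sets = [[] for _ in range(N_SETS)]

    def access(self, addr):
        """Return True on a hit (fast access), False on a miss (slow access)."""
        line = addr // LINE_BYTES
        index, tag = line % N_SETS, line // N_SETS
        ways = self.sets[index]
        hit = tag in ways
        if hit:
            ways.remove(tag)         # refresh LRU position
        elif len(ways) == N_WAYS:
            ways.pop(0)              # evict the least recently used line
        ways.append(tag)
        return hit


def prime_probe(cache, eviction_set, victim):
    """Return True if the probe step observes at least one slow access."""
    for addr in eviction_set:        # prime: fill the target set
        cache.access(addr)
    victim(cache)                    # the victim runs
    probe = [cache.access(addr) for addr in eviction_set]
    return not all(probe)


# Two attacker addresses that map to set 1, and two candidate victim accesses.
eviction_set = [32, 160]
same_set = lambda c: c.access(12832)   # maps to set 1
other_set = lambda c: c.access(12864)  # maps to set 2

print(prime_probe(ToyCache(), eviction_set, same_set))   # victim activity seen
print(prime_probe(ToyCache(), eviction_set, other_set))  # no activity seen
```

The attack only works because the attacker can find addresses that are guaranteed to share a set with the victim's address.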

SLIDE 8

Why should we care?

  • Cache attacks are powerful and break isolation boundaries
  • Many attacking techniques
      • FLUSH+RELOAD, EVICT+RELOAD, FLUSH+FLUSH
      • PRIME+PROBE, EVICT+TIME
  • Numerous attack scenarios
      • Extracting cryptographic keys
      • Keyloggers
      • Breaking of ASLR
      • Collection of private information
  • Often used as a building block for further microarchitectural attacks

SLIDE 9

SCATTERCACHE

SLIDE 10

SCATTERCACHE - Idea

[Figure: cache sets Set 0 to Set 3, with the cache lines used by Addr. A in Domain X, Addr. A in Domain Y, and Addr. B highlighted.]

@DAC [Tri+18], @MICRO [Qur18]

SLIDE 11

How can we build such a SCATTERCACHE?

SLIDE 12

SCATTERCACHE - Naive Concept

[Figure: naive SCATTERCACHE concept. The cache line address (address fields: tag, index, offset), the key, and the SDID enter the IDF, which outputs the indices idx0 to idx3.]

$\binom{n_{ways} \cdot 2^{b_{indices}} + n_{ways} - 1}{n_{ways}}$ possible cache sets

512 KiB (32 B lines), $n_{ways} = 8$, $b_{indices} = 11$ → $2^{96.7}$ sets

  • Index Derivation Function (IDF) takes an address and returns a cache set
  • Depends on hardware key and optional Security Domain ID (SDID)

→ Unique combination of cache lines for each address
− Potential index collisions
− One $n_{ways}$ multi-port memory
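
The set count for the naive concept can be checked numerically. The sketch below assumes the count is the number of multisets of n_ways lines drawn from all n_ways · 2^b_indices cache lines, which is the binomial coefficient on this slide. The function name `naive_set_count_log2` is illustrative.

```python
import math


def naive_set_count_log2(cache_bytes, line_bytes, n_ways):
    """log2 of the number of multisets of n_ways lines out of all cache lines."""
    n_lines = cache_bytes // line_bytes  # equals n_ways * 2**b_indices
    return math.log2(math.comb(n_lines + n_ways - 1, n_ways))


bits = naive_set_count_log2(512 * 1024, 32, 8)
print(f"about 2^{bits:.1f} possible cache sets")  # about 2^96.7 possible cache sets
```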

SLIDE 13

SCATTERCACHE - Concept

We want something that is closer to a traditional cache!

Instead of this:

[Figure: address fields tag, index, offset; sets set[idx-2], set[idx-1], set[idx+1], set[idx+2]; way 0 to way 3.]

let’s do this:

[Figure: the IDF takes the cache line address (address fields: tag, index, offset), the key, and the SDID, and outputs idx0 to idx3, one index for each of way 0 to way 3.]

SLIDE 14

SCATTERCACHE - Concept

[Figure: SCATTERCACHE concept. The IDF takes the cache line address (address fields: tag, index, offset), the key, and the SDID, and outputs idx0 to idx3, one index for each of way 0 to way 3.]

$2^{b_{indices} \cdot n_{ways}}$ possible cache sets

512 KiB (32 B lines), $n_{ways} = 8$, $b_{indices} = 11$ → $2^{88}$ sets

  • Skewed cache [Sez93] (i.e., traditional cache with additional addressing logic) and an IDF
  • Similar to building larger caches from smaller cache slices
  • We use random replacement policy (for now)
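
The following sketch shows how a skewed lookup with one index per way and random replacement might behave. It is a toy model, not the authors' design: `toy_idf` uses a seeded PRNG as a stand-in for a real IDF, the sizes are deliberately small, and storing the full (SDID, address) pair as the tag is a simplification.

```python
import random

N_WAYS, B_INDICES = 4, 3  # small toy sizes, not the slide's 8 ways / 11 bits


def toy_idf(line_addr, sdid, key):
    """Stand-in IDF: one pseudorandom index per way (not a real cipher)."""
    rng = random.Random(f"{key}:{sdid}:{line_addr}")
    return [rng.randrange(1 << B_INDICES) for _ in range(N_WAYS)]


class SkewedCache:
    def __init__(self, key, seed=0):
        self.key = key
        # ways[w][idx] holds the (sdid, line_addr) cached there, or None.
        self.ways = [[None] * (1 << B_INDICES) for _ in range(N_WAYS)]
        self.rng = random.Random(seed)  # drives the random replacement

    def access(self, line_addr, sdid=0):
        """Return True on a hit, False on a miss (the line is then filled)."""
        indices = toy_idf(line_addr, sdid, self.key)
        entry = (sdid, line_addr)
        for way, idx in enumerate(indices):
            if self.ways[way][idx] == entry:
                return True
        way = self.rng.randrange(N_WAYS)     # random replacement: pick a way
        self.ways[way][indices[way]] = entry
        return False


cache = SkewedCache(key=1234)
print(cache.access(0x40))          # miss
print(cache.access(0x40))          # hit
print(cache.access(0x40, sdid=1))  # miss: a different domain gets its own lines
```

Each way is indexed independently, so the number of index combinations is (2^b_indices)^n_ways, which gives the 2^88 on this slide for 8 ways and 11 index bits.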

SLIDE 15

SCATTERCACHE - Selecting the IDF

  • Inputs: cache line address, SDID, key
  • Outputs: $n_{ways}$ indices with $b_{indices}$ bits
  • Reuse concepts and existing cryptographic primitives
  • SCv1: hashing variant
      • Block ciphers (e.g., PRINCE [Bor+12])
      • Tweakable block ciphers (e.g., QARMA [Ava17])
      • Permutation-based primitives (e.g., Keccak-p [Ber+11])
  • SCv2: permutation variant
      • Prevents birthday-bound index collisions
      • No off-the-shelf primitives
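
To illustrate the interface of a hashing-style IDF, here is a sketch that derives n_ways indices of b_indices bits each from one keyed hash output. Keyed BLAKE2b from Python's `hashlib` is used only as a convenient stand-in. It is not a low-latency primitive and not the construction from the talk. The input encoding (8-byte address, 2-byte SDID) is also an assumption.

```python
import hashlib

N_WAYS, B_INDICES = 8, 11


def idf(line_addr: int, sdid: int, key: bytes) -> list:
    """Map (cache line address, SDID, key) to one index per way."""
    msg = line_addr.to_bytes(8, "little") + sdid.to_bytes(2, "little")
    n_bytes = (N_WAYS * B_INDICES + 7) // 8          # 88 bits -> 11 bytes
    digest = hashlib.blake2b(msg, digest_size=n_bytes, key=key).digest()
    bits = int.from_bytes(digest, "little")
    mask = (1 << B_INDICES) - 1
    # Slice the hash output into N_WAYS chunks of B_INDICES bits each.
    return [(bits >> (way * B_INDICES)) & mask for way in range(N_WAYS)]


key = bytes(16)  # placeholder key; a real key would be random
print(idf(0x1234, sdid=0, key=key))
print(idf(0x1234, sdid=1, key=key))  # same address, other domain
```

Changing the key or the SDID changes the indices for the same address, which is the property the rekeying and security-domain mechanisms rely on.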

SLIDE 16

System Integration

SLIDE 17

SCATTERCACHE - System Integration

  • SCATTERCACHE as last level cache
  • Hardware managed key
      • Randomly generated at boot time
      • Rekeying with full cache flush
      • Potential for iterative rekeying

→ concurrently developed CEASER-S @ISCA [Qur19]

  • SDID management via page table (indirection)
      • x86: Page Attribute Table (PAT)
      • ARM: Memory Attribute Indirection Registers (MAIRs)

SLIDE 18

SCATTERCACHE - Software Support

  • SCATTERCACHE requires no software support, default SDID = 0
  • But OS support enables page-wise security domains

→ shared read-only pages can be private in the cache!

  • OS can define domains as needed (pages, processes, containers, VMs, ...)
  • Software-based page “rekeying” by changing the SDID

SLIDE 19

Security and Evaluation

SLIDE 20

Applicable Cache Attacks

  • Unshared memory has no shared (physical) addresses

→ No FLUSH+RELOAD, EVICT+RELOAD, FLUSH+FLUSH
→ Specialized PRIME+PROBE is possible

  • Shared, read-only memory

→ Like unshared memory given OS support
→ Otherwise, eviction-based attacks are hindered

  • Shared, writable memory can’t be separated

→ Eviction-based attacks are hindered

SLIDE 21

SCATTERCACHE - PRIME+PROBE

  • No end-to-end attack yet

→ Simplified setting: perfect control, single access, no noise
→ Investigate the building blocks in simulation and analytically

  • Finding congruent addresses ($n_{ways} = 8$, $b_{indices} = 11$)
      • Full collisions are unlikely → use partial collisions
      • Approach in the paper: ≈ $2^{25}$ profiled victim accesses
      • Generalized by Purnal and Verbauwhede [PV19]: ≈ $2^{10}$
      • Evicting one set with 99 % probability needs 275 addresses
  • Two PRIME+PROBE variants ($n_{ways} = 8$, $b_{indices} = 12$)
      • 99 % confidence: 35 to 152 victim accesses (repetitions)
      • Between 9870 and 1216 congruent addresses, respectively
  • Investigate the effect of noise (coupon collector problem)
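
One simple model gives roughly the 275-address figure. The model is an assumption made here for illustration and is not taken from the slides: the attacker's n addresses each collide with the victim in exactly one way, they are spread evenly over the ways, and random replacement places each access in its colliding way with probability 1/n_ways. The eviction probability is then 1 − (1 − 1/n_ways)^(n/n_ways).

```python
import math


def addresses_for_eviction(n_ways, target=0.99):
    """Solve 1 - (1 - 1/n_ways)**(n / n_ways) = target for n (assumed model)."""
    per_way = math.log(1 - target) / math.log(1 - 1 / n_ways)
    return n_ways * per_way


n = addresses_for_eviction(8)
print(f"about {n:.0f} addresses for 99 % eviction probability")
```

Under these assumptions the result for 8 ways is about 276 addresses, which is close to the number quoted above.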

SLIDE 22

SCATTERCACHE - Performance

  • Micro benchmarks using the gem5 full system simulator (ARM)
      • Poky Linux from Yocto 2.5 (kernel version 4.14.67)
      • GAP, MiBench, lmbench, scimark2
  • SPEC CPU 2017 on custom cache simulator
  • Cache hit rate always at or above levels of set-associative cache with random replacement
  • Typically 2 % to 4 % below LRU on micro benchmarks, 0 % to 2 % for SPEC

SLIDE 23

Conclusion

  • SCATTERCACHE builds upon skewed caches and low latency cryptographic primitives
  • Breaks the fixed link between addresses and cache sets
  • Removes the rigid assignment of cache lines to sets
  • Enables software control over the cache congruencies via SDIDs
  • Comparable performance to contemporary caches
  • Harder to attack even in very strong attack models
  • Attacks are probabilistic and demand new approaches
  • Still, more analysis is required in more realistic models to determine if and how often rekeying is needed

SLIDE 24

Acknowledgements - We want to thank ...

  • the anonymous USENIX reviewers.
  • our shepherd Yossi Oren.
  • Antoon Purnal and Ingrid Verbauwhede from KU Leuven for their analysis.
  • our funding partners:
      • European Research Council (ERC), Horizon 2020 grant agreement No 681402
      • Intel

SLIDE 26

References i

[Ava17] Roberto Avanzi. “The QARMA Block Cipher Family. Almost MDS Matrices Over Rings With Zero Divisors, Nearly Symmetric Even-Mansour Constructions With Non-Involutory Central Rounds, and Search Heuristics for Low-Latency S-Boxes”. In: IACR Trans. Symmetric Cryptol. (2017), pp. 4–44. DOI: 10.13154/tosc.v2017.i1.4-44.

[Ber+11] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. The KECCAK reference. https://keccak.team/files/Keccak-reference-3.0.pdf. 2011.

[Bor+12] Julia Borghoff et al. “PRINCE - A Low-Latency Block Cipher for Pervasive Computing Applications - Extended Abstract”. In: Advances in Cryptology – ASIACRYPT. 2012, pp. 208–225. DOI: 10.1007/978-3-642-34961-4_14.

[GSM15] Daniel Gruss, Raphael Spreitzer, and Stefan Mangard. Cache Template Attacks Repository. https://github.com/IAIK/cache_template_attacks. 2015.

SLIDE 27

References ii

[PV19] Antoon Purnal and Ingrid Verbauwhede. “Advanced profiling for probabilistic Prime+Probe attacks and covert channels in ScatterCache”. In: arXiv abs/1908.03383 (2019). URL: http://arxiv.org/abs/1908.03383.

[Qur18] Moinuddin K. Qureshi. “CEASER: Mitigating Conflict-Based Cache Attacks via Encrypted-Address and Remapping”. In: IEEE/ACM International Symposium on Microarchitecture – MICRO. 2018, pp. 775–787. DOI: 10.1109/MICRO.2018.00068.

[Qur19] Moinuddin K. Qureshi. “New attacks and defense for encrypted-address cache”. In: International Symposium on Computer Architecture – ISCA. 2019, pp. 360–371. DOI: 10.1145/3307650.3322246.

[Sez93] André Seznec. “A Case for Two-Way Skewed-Associative Caches”. In: International Symposium on Computer Architecture – ISCA. 1993, pp. 169–178. DOI: 10.1145/165123.165152.

[Tri+18] David Trilla, Carles Hernández, Jaume Abella, and Francisco J. Cazorla. “Cache side-channel attacks and time-predictability in high-performance critical real-time systems”. In: Design Automation Conference – DAC. 2018, 98:1–98:6. DOI: 10.1145/3195970.3196003.