Cache-Timing Attacks Matteo BOCCHI System Research & - - PowerPoint PPT Presentation

SLIDE 1

Cache-Timing Attacks

Matteo BOCCHI System Research & Applications STMicroelectronics S.r.l. 2018/12/06

SLIDE 2

Agenda

  • STM32
  • Nucleo Boards
  • STM32 Firewall
  • Cache-Timing Attack
  • Evict&Time vs. MbedTLS AES on STM32
  • Flush+Reload vs. OpenSSL AES on PC

SLIDE 3

STM32: 32-bit Cortex-M MCUs

[Figure: STM32 portfolio]

  • Product lines: Ultra-low-power, Mainstream, High-performance, Wireless
  • Cores: Cortex-M0, Cortex-M0+, Cortex-M3, Cortex-M4, Cortex-M7
  • Note: Cortex-M0+ Radio Co-processor

  • Low-power
  • Cost sensitive
  • Advanced peripherals
  • Software libraries
  • Rich choice of tools
SLIDE 4

STM32 Nucleo Development Board

  • Based on STM32 microcontrollers
  • Boards with one MCU and the hardware to program/debug it
  • Two connectors to connect to companion-chip boards
  • For all STM32 families: complete product range from ultra-low power to high-performance

Board features (figure labels):

  • Arduino UNO extension connectors: easy access to add-ons (*)
  • Integrated debugging and programming probe
  • Morpho extension headers: direct access to all STM32 I/Os
  • STM32 microcontroller
  • Flexible board power supply through USB or external source

(*) Thanks to the electrical compatibility, it can be used as a shield for Arduino UNO R3 or similar

SLIDE 5

STM32 Firewall

SLIDE 6

Firewall Overview

  • HW peripheral that protects code and sensitive data, monitoring each access from the AHB masters to the Flash memory or SRAM1 AHB slaves
      • Code in Flash memory or SRAM1
      • Constant data in Flash memory
      • Volatile data in SRAM1

Application benefits

  • Protects the intellectual property of embedded software interacting with OEM code.
  • Prevents attacks designed to dump/execute protected code outside of a protected area.
  • Protects access to sensitive data from non-protected user code execution.

[Figure: block diagram with Cortex-M4, DMA, AHB bus matrix, Firewall, SRAM1 and Flash memory]

SLIDE 7

Firewall: Run Time Protection

[Figure: firewall state diagram and example code execution]

  • Firewall states: Idle, Closed, Open
  • Flow: Call Gate Entry → Protected Code Execution → Execution Complete? (NO/YES) → Clear Intermediate Variables and CPU Registers → Call Gate Exit
  • Example code execution: Unprotected Code (FIREWALL CLOSED) → CALL GATE → TRUSTED EXECUTION (FIREWALL OPEN) → EXIT GATE → Unprotected Code (FIREWALL CLOSED)

SLIDE 8

Secure Computations with Firewall

  • With the firewall, we can build secure computation elements
  • They compute encryptions/decryptions using secrets not accessible to other entities
  • They expose an API to perform cryptographic/security services
  • They take inputs and return outputs, without leaking any secret information
  • For example:
      • Trusted Execution Environment services
      • Secure Boot / Secure Firmware Update
      • Secure transmission protocols

SLIDE 9

Evict&Time vs. MbedTLS AES on STM32

SLIDE 10

AES

  • This cipher is abstractly defined by algebraic operations: it could be implemented using just logical and arithmetic operations
  • Performance-oriented software implementations typically use precomputed (hardcoded) lookup tables, or compute them during system initialization
  • Thus, during each round, only XORs and table lookups in memory are performed
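
A minimal C sketch of the table-lookup structure described above. It is illustrative only and is not taken from MbedTLS or OpenSSL: the names `aes_tables_init` and `aes_round_column` are hypothetical, the single table `Te0` follows the layout commonly used by table-based AES code (the other three tables are obtained here by rotating it), and the tables are computed at initialization rather than hardcoded. The point to notice is that every table index is a byte of the secret-dependent state, so the memory addresses touched depend on the key.

```c
#include <stdint.h>

uint8_t  sbox[256]; /* AES S-box, filled by aes_tables_init()           */
uint32_t Te0[256];  /* combined SubBytes + MixColumns table (one column) */

/* Multiply by 2 in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1. */
static uint8_t xtime(uint8_t a) {
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

/* General multiplication in GF(2^8). */
static uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = xtime(a);
        b >>= 1;
    }
    return r;
}

static uint8_t rotl8(uint8_t x, int n) {
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

static uint32_t ror32(uint32_t v, int n) {
    return (v >> n) | (v << (32 - n));
}

/* Compute the tables once, "during system initialization". */
void aes_tables_init(void) {
    for (int x = 0; x < 256; x++) {
        uint8_t inv = 0;
        if (x != 0) { /* multiplicative inverse = x^254 in GF(2^8) */
            inv = 1;
            for (int i = 0; i < 254; i++) inv = gmul(inv, (uint8_t)x);
        }
        /* Affine transformation of the inverse gives the S-box value. */
        uint8_t s = (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                              rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
        sbox[x] = s;
        /* Column (2*s, s, s, 3*s) packed big-endian into one word. */
        Te0[x] = ((uint32_t)xtime(s) << 24) | ((uint32_t)s << 16) |
                 ((uint32_t)s << 8) | (uint32_t)(uint8_t)(xtime(s) ^ s);
    }
}

/* One output column of a round: four table lookups and XORs only.
 * s0..s3 are the state words supplying the four input bytes, rk is the
 * round-key word. */
uint32_t aes_round_column(uint32_t s0, uint32_t s1, uint32_t s2,
                          uint32_t s3, uint32_t rk) {
    return Te0[s0 >> 24]
         ^ ror32(Te0[(s1 >> 16) & 0xff], 8)
         ^ ror32(Te0[(s2 >> 8) & 0xff], 16)
         ^ ror32(Te0[s3 & 0xff], 24)
         ^ rk;
}
```

In the first round the state is simply plaintext XOR key, so the very first lookups are indexed by a key byte XOR a plaintext byte.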
SLIDE 11

Attack core idea

  • At the beginning of the AES computation, the variable-index array lookup $T[k_i \oplus n_i]$ is done ($k$ = key, $n$ = plaintext)
  • The attacker:
      • watches the time taken by the victim to handle many $n$'s
      • totals the AES timings for each possible value of $n_i$, for each $i$
      • for each $i$, collects the value of $n_i$ for which the timing is maximum (e.g. for $i = 13$, the encryption is slowest when $n_{13} = 147$)
  • The attacker observes, by carrying out experiments with known keys $k$ on an identical machine, that the overall AES time is maximum when $k_{13} \oplus n_{13} = 8$
  • The attacker concludes that:

$$
\begin{aligned}
k_{13} \oplus n_{13} \oplus n_{13} &= 8 \oplus n_{13} \\
k_{13} \oplus n_{13} \oplus n_{13} &= 8 \oplus 147 \\
k_{13} &= 8 \oplus 147 \\
k_{13} &= 155
\end{aligned}
$$
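
The final deduction can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the demo's actual code: the function names are hypothetical, and the timing tables are assumed to have been collected already (average encryption time per value of one plaintext byte on the victim, and per value of key byte XOR plaintext byte on the identical reference machine with a known key). A real attack would typically correlate the whole 256-entry profiles rather than trust a single maximum.

```c
#include <stdint.h>

/* Index of the largest entry in a 256-entry table of average timings. */
static uint8_t argmax256(const double t[256]) {
    int best = 0;
    for (int v = 1; v < 256; v++) {
        if (t[v] > t[best]) best = v;
    }
    return (uint8_t)best;
}

/*
 * victim_avg[v]: average AES time on the victim when the chosen plaintext
 *                byte equals v (the key is unknown).
 * ref_avg[c]:    average AES time on the reference machine when
 *                (key byte XOR plaintext byte) equals c (the key is known).
 * The slowest plaintext value XOR the slowest table index is the key byte.
 */
uint8_t recover_key_byte(const double victim_avg[256],
                         const double ref_avg[256]) {
    uint8_t slowest_plain = argmax256(victim_avg); /* e.g. 147 */
    uint8_t slowest_index = argmax256(ref_avg);    /* e.g. 8   */
    return (uint8_t)(slowest_plain ^ slowest_index);
}
```

With the numbers from the slide (slowest plaintext value 147, slowest index 8), the function returns 155.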

SLIDE 12

Demo

NUCLEO-L476RG, serial port terminal: 115200 bit/s, 8-N-1

SLIDE 13

Flush+Reload vs. OpenSSL AES on PC

SLIDE 14

Flush+Reload (1/2)

  • There is an attacker who wants to steal secret information from the victim
  • The attacker runs a spy process on the same physical machine as the victim

Last-Level Cache. The attacker:

  1. Clears (flushes) a specific cache section
  2. Launches/observes the victim’s process
  3. Checks the new cache status by measuring the time needed to access those memory addresses
SLIDE 15

Flush+Reload (2/2)

  • Allows the attacker’s and the victim’s processes to run on different cores
  • They share the LLC (Last-Level Cache)
  • The spy process can run concurrently with the victim to collect cache information step by step (useful if the victim’s process is long and complex)
  • Requires accurate (but feasible) synchronization between the two processes, plus some additional observations

  • Several cross-VM attacks exploit this mechanism

SLIDE 16

Flush+Reload vs. OpenSSL AES

Steps:

  1. Empty the cache (flush)
  2. Launch the victim’s process
  3. Check the new cache status

Attacker’s code:

    flush(probe[i]);                                      /* 1. evict the probed line */
    service_encrypt(plaintext, ciphertext, service_ctx);  /* 2. let the victim encrypt */
    size_t start = rdtsc();                               /* 3. time a reload of the line */
    maccess(probe[i]);
    size_t delta = rdtsc() - start;
    if (delta > MIN_CACHE_MISS_CYCLES) {
        /* cache miss: the victim did not touch this line */
    }

Victim’s OpenSSL-based “secure” service:

  • AES-128-ECB
  • Unknown key
  • S-box look-up tables: 4x1024 bytes (ENC) + 4x1024 bytes (DEC)
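
The attacker's code compares the reload time against `MIN_CACHE_MISS_CYCLES`, but the slides do not say how that constant is chosen. One plausible approach, sketched below as an assumption rather than as the exercise's actual method, is to calibrate it on the target machine: time many accesses known to hit the cache and many known to miss, then place the threshold midway between the two averages. The function names are hypothetical, and the sample arrays are assumed to have been measured beforehand.

```c
#include <stddef.h>
#include <stdint.h>

/* Arithmetic mean of n cycle counts; n must be greater than zero. */
static uint64_t mean_cycles(const uint64_t *samples, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) sum += samples[i];
    return sum / n;
}

/*
 * hit[]:  access times (cycles) for lines known to be cached.
 * miss[]: access times (cycles) for lines flushed just before the access.
 * Returns a threshold halfway between the two means, or 0 if the samples
 * are empty or do not separate (misses not slower than hits).
 */
uint64_t calibrate_miss_threshold(const uint64_t *hit, size_t n_hit,
                                  const uint64_t *miss, size_t n_miss) {
    if (n_hit == 0 || n_miss == 0) return 0;
    uint64_t h = mean_cycles(hit, n_hit);
    uint64_t m = mean_cycles(miss, n_miss);
    if (m <= h) return 0;
    return h + (m - h) / 2;
}
```

A return value of 0 signals that the two timing distributions overlap too much for a single threshold, in which case the measurement itself needs attention (for example more samples or a quieter machine).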

SLIDE 17

Exercise

flush_reload_RELEASE.tar.gz

  • attack/  /* attacker’s and victim’s source code */
  • openssl-1.1.0f.tar.gz  /* OpenSSL source code */
  • README.md  /* Step-by-step HowTo */

SLIDE 18

Why does the attack (hopefully) work?

  • main():92 – The attacker’s process maps the (public) crypto library used by the victim into its own memory
  • main():122 (146, 168, 190) – For each of the 4 S-box encryption tables, it performs Flush+Reload attacks, flushing the portion of cache used by that table
      • The table locations are usually unknown: before the Flush+Reload, perform a Prime+Probe attack on the whole cache in order to identify them
  • main():220 – It selects the most probable key-byte candidates by choosing the byte values that caused shorter timings
      • Probably, these are values loaded into the cache by the victim’s process
  • main():224 – It starts an algorithm to recover the AES master key from the “guessed” AES last-round key
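
The last step works because the AES-128 key schedule is invertible: any single round key determines all the others. The sketch below is one way to do it and is not the exercise's code; the function name `aes128_invert_key_schedule` and the helpers are hypothetical, and the S-box is computed on the fly only to keep the example short.

```c
#include <stdint.h>

/* Multiplication in GF(2^8) modulo the AES polynomial (0x11b). */
static uint8_t gf_mul(uint8_t a, uint8_t b) {
    uint8_t r = 0;
    while (b) {
        if (b & 1) r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
        b >>= 1;
    }
    return r;
}

/* AES S-box value: inverse in GF(2^8), then the affine transformation. */
static uint8_t sbox_byte(uint8_t x) {
    uint8_t inv = 0;
    if (x != 0) {
        inv = 1;
        for (int i = 0; i < 254; i++) inv = gf_mul(inv, x); /* x^254 */
    }
    uint8_t s = inv;
    for (int n = 1; n <= 4; n++) s ^= (uint8_t)((inv << n) | (inv >> (8 - n)));
    return (uint8_t)(s ^ 0x63);
}

/* SubWord(RotWord(w)) as used by the AES key schedule. */
static uint32_t sub_rot_word(uint32_t w) {
    w = (w << 8) | (w >> 24);
    return ((uint32_t)sbox_byte((uint8_t)(w >> 24)) << 24) |
           ((uint32_t)sbox_byte((uint8_t)(w >> 16)) << 16) |
           ((uint32_t)sbox_byte((uint8_t)(w >> 8)) << 8) |
            (uint32_t)sbox_byte((uint8_t)w);
}

/* Walk the AES-128 key schedule backwards from round key 10 to round key 0. */
void aes128_invert_key_schedule(const uint8_t last_round_key[16],
                                uint8_t master_key[16]) {
    static const uint8_t rcon[11] = {0x00, 0x01, 0x02, 0x04, 0x08, 0x10,
                                     0x20, 0x40, 0x80, 0x1b, 0x36};
    uint32_t w[44];
    for (int j = 0; j < 4; j++) { /* load words w[40..43], big-endian */
        w[40 + j] = ((uint32_t)last_round_key[4 * j] << 24) |
                    ((uint32_t)last_round_key[4 * j + 1] << 16) |
                    ((uint32_t)last_round_key[4 * j + 2] << 8) |
                     (uint32_t)last_round_key[4 * j + 3];
    }
    /* Forward rule: w[i] = w[i-4] ^ f(w[i-1]); solve it for w[i-4]. */
    for (int i = 43; i >= 4; i--) {
        uint32_t t = w[i - 1];
        if (i % 4 == 0) t = sub_rot_word(t) ^ ((uint32_t)rcon[i / 4] << 24);
        w[i - 4] = w[i] ^ t;
    }
    for (int j = 0; j < 4; j++) { /* store words w[0..3], big-endian */
        master_key[4 * j]     = (uint8_t)(w[j] >> 24);
        master_key[4 * j + 1] = (uint8_t)(w[j] >> 16);
        master_key[4 * j + 2] = (uint8_t)(w[j] >> 8);
        master_key[4 * j + 3] = (uint8_t)w[j];
    }
}
```

The loop runs from the highest word down so that `w[i - 1]` is always known by the time it is needed.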

SLIDE 19

Countermeasures

  • Avoid secret-dependent memory accesses, e.g. bitsliced implementations
  • Avoid cache misses
  • Lookup-table masking
  • Hardware AES instructions
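
As a small illustration of removing the secret-dependent access pattern, the sketch below reads every entry of a 256-byte table and keeps the wanted one with a branch-free mask. This is not bitslicing and is not taken from any library named in the slides; it is a simpler, slower technique shown only to make the idea concrete, and the function name `ct_lookup` is hypothetical. Because all entries are touched on every call, the cache footprint no longer depends on the index.

```c
#include <stdint.h>

/*
 * Constant-access-pattern lookup: returns table[idx] while touching all
 * 256 entries in the same order, whatever the (secret) value of idx is.
 */
uint8_t ct_lookup(const uint8_t table[256], uint8_t idx) {
    uint8_t result = 0;
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t diff = i ^ idx;                  /* 0 only when i == idx */
        /* diff - 1 wraps to 0xFFFFFFFF when diff == 0, otherwise it stays
         * below 0x100, so bits 8..15 form an all-ones or all-zeros mask. */
        uint8_t mask = (uint8_t)((diff - 1) >> 8);
        result |= (uint8_t)(table[i] & mask);
    }
    return result;
}
```

The price is 256 reads per lookup instead of one, which is why bitsliced code or hardware AES instructions are usually preferred where available.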

SLIDE 20

The END

Thanks!

Any questions?

Contact: matteo.bocchi@st.com