Pipeline for Our Example Using SCALE Atmel dataset

Sep 19, 2022

SLIDE 1

How Differential Power Attacks Work: Profiled Attack Example

Pipeline for Our Example

Using SCALE Atmel dataset

A  Acquire training data
     1000 traces, random known plaintexts
     Fixed known key is less ideal
     Traces are already aligned
B  Build a profile
     1. We already identified potential PoIs
     2. Model and profiling to be determined
C  Collect target traces
     1000 traces, random known plaintexts
D  Distinguish
     1. Template Attack
     2. Stochastic Attack

SLIDE 2


Template Attacks

Naive Bayes

  i     x_i   power              k0    score
  0     12    1.35...            0     0.134...
  1     123   4.65...      ⇒     1     0.116...
  ...   ...   ...                ...   ...
  999   59    2.79...            255   0.098...

1. From Probability to Likelihood
For each key candidate k, determine its a posteriori probability given the observed leakage L.

SLIDE 3


1. From Probability to Likelihood
$$\Pr[k \mid L] = \frac{\Pr[L \mid k] \cdot \Pr[k]}{\Pr[L]}$$
Pr[L | k] is the likelihood
Pr[k] and Pr[L] can be ignored
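To see why Pr[k] and Pr[L] can be dropped when ranking candidates, here is a minimal sketch that turns log2 likelihoods back into posteriors. It assumes a uniform prior over key candidates; the function name and the three example values are hypothetical.

```python
def posteriors_from_log2_likelihoods(log2_lik):
    """Normalise log2 likelihoods into posteriors, assuming a uniform prior Pr[k].

    With a uniform prior, Pr[k] is the same for every candidate and Pr[L] is
    just the normalising constant, so the ordering of candidates is unchanged.
    """
    top = max(log2_lik)                       # subtract the max for numerical stability
    weights = [2.0 ** (v - top) for v in log2_lik]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log2 likelihoods for three key candidates.
example = [-10.0, -12.0, -11.0]
post = posteriors_from_log2_likelihoods(example)
```

The candidate with the largest likelihood is also the one with the largest posterior, so a distinguisher that only needs a ranking can work with likelihoods directly.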

SLIDE 4


2. From Likelihood to Sum of Log Likelihoods
Assume each trace leaks independently, then
$$\Pr[L \mid k] = \prod_i \Pr[L_i \mid k]$$

SLIDE 5


2. From Likelihood to Sum of Log Likelihoods
Assume each trace leaks independently, then after taking logs
$$\log_2 \Pr[L \mid k] = \sum_i \log_2 \Pr[L_i \mid k]$$
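One practical reason for moving to logs can be sketched as follows: a product of many small per-trace likelihoods underflows in double precision, while the sum of their logs stays finite. The per-trace value 1e-5 is purely illustrative.

```python
import math

def log2_likelihood(per_trace_likelihoods):
    """Sum of per-trace log2 likelihoods (valid under the independence assumption)."""
    return sum(math.log2(p) for p in per_trace_likelihoods)

# 1000 traces, each with a (hypothetical) likelihood of 1e-5 for some candidate k.
per_trace = [1e-5] * 1000
product = math.prod(per_trace)          # underflows to 0.0
log_sum = log2_likelihood(per_trace)    # about -16609.6, still usable for ranking
```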

SLIDE 6


From Log Likelihood to QDA
Assume
$$L(\text{data}) \sim \hat{M}(x_i \oplus k^*) + \mathcal{N}(0, \sigma)$$
then
$$\log_2 \Pr[L_i \mid k] = \log_2 \mathcal{N}\left(L_i - \hat{M}(x_i \oplus k);\, 0, \sigma\right)$$

SLIDE 7


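From Log Likelihood to QDA
Assume
$$L(\text{data}) \sim \hat{M}(x_i \oplus k^*) + \mathcal{N}(0, \sigma)$$
then
$$\log_2 \Pr[L_i \mid k] = -\log_2 e \cdot \frac{\left(L_i - \hat{M}(x_i \oplus k)\right)^2}{2\sigma^2} - \frac{1}{2}\left(1 + \log_2 \pi\right) - \log_2 \sigma$$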
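The expansion can be checked numerically. The sketch below assumes that N(·; 0, σ) denotes the Gaussian density with mean 0 and standard deviation σ, and compares the term-by-term form (with final term log2 σ) against taking log2 of the density directly; the function names are hypothetical.

```python
import math

def log2_gauss_expanded(residual, sigma):
    """log2 of the N(0, sigma) density at `residual`, written term by term."""
    return (-math.log2(math.e) * residual ** 2 / (2 * sigma ** 2)
            - 0.5 * (1 + math.log2(math.pi))
            - math.log2(sigma))

def log2_gauss_direct(residual, sigma):
    """Same quantity, computed by evaluating the density and then taking log2."""
    density = math.exp(-residual ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return math.log2(density)

difference = abs(log2_gauss_expanded(0.3, 0.5) - log2_gauss_direct(0.3, 0.5))
```

Only the first term depends on the key candidate k, which is why the remaining terms can be dropped when comparing candidates.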

SLIDE 8


QDA Summary
$$\mathrm{score}(k \mid L) = \sum_i \left(L_i - \hat{M}(x_i \oplus k)\right)^2$$
To profile: $\hat{M}(z)$ for all 256 possible $z$
Warning: Scores can no longer be interpreted as posteriors
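A minimal sketch of this scoring rule on toy data follows. The profiled means `model`, the key byte, and the noise level are all hypothetical stand-ins, not values from the SCALE data set. With the score written as a sum of squared residuals, the most likely candidate is the one with the smallest score.

```python
import random

def qda_scores(plaintext_bytes, leakages, model):
    """score(k) = sum_i (L_i - model[x_i ^ k])**2 for every key-byte candidate k."""
    return [
        sum((leak - model[x ^ k]) ** 2 for x, leak in zip(plaintext_bytes, leakages))
        for k in range(256)
    ]

# Toy data: a hypothetical profile M_hat(z) for the 256 values z = x ^ k.
rng = random.Random(1)
model = [rng.gauss(0.0, 1.0) for _ in range(256)]
true_key = 0x3C
xs = [rng.randrange(256) for _ in range(200)]
ls = [model[x ^ true_key] + rng.gauss(0.0, 0.5) for x in xs]

scores = qda_scores(xs, ls, model)
best = min(range(256), key=scores.__getitem__)   # smallest residual sum wins
```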

SLIDE 9


Template and Stochastic Attacks

SCALE Atmel Profiling

[Figure: two panels, titled 11005 (left) and 11223 (right), with x-axis ticks from 50 to 250.]

Template Attack
For all 256 possible S-box input values
  determine the sample mean
  (optional) determine the sample variance
Problem: 1000 traces is not enough to estimate 256 parameters
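The profiling step can be sketched as grouping the training leakages by S-box input value and computing per-group statistics. The leakage values below are random placeholders, and `build_templates` is a hypothetical helper name.

```python
import random
from collections import defaultdict
from statistics import mean, variance

def build_templates(sbox_inputs, leakages):
    """Return {z: (sample mean, sample variance or None, count)} per S-box input z."""
    groups = defaultdict(list)
    for z, leak in zip(sbox_inputs, leakages):
        groups[z].append(leak)
    return {
        z: (mean(vals), variance(vals) if len(vals) > 1 else None, len(vals))
        for z, vals in groups.items()
    }

rng = random.Random(0)
sbox_inputs = [rng.randrange(256) for _ in range(1000)]
leakages = [rng.gauss(0.0, 0.01) for _ in sbox_inputs]   # placeholder leakage values
templates = build_templates(sbox_inputs, leakages)

# With 1000 traces spread over 256 values, each mean rests on about 3.9 samples.
traces_per_value = 1000 / 256
```

The small number of samples per value is what makes the per-value estimates noisy and motivates a model with fewer parameters.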

SLIDE 10


[Figure: two panels, titled 11005 (left) and 11223 (right), each showing original data and a fitted line, with x-axis ticks from 1 to 8.]

Stochastic Attack
Assume the leakage model
$$M_{a,b}(k, x) = a \cdot \mathrm{HammingWeight}(\mathrm{Sbox}(x \oplus k)) + b$$
estimate a and b
(Warning: The right estimation is naively unweighted)
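Estimating a and b amounts to a straight-line fit of leakage against Hamming weight. The sketch below uses ordinary least squares over individual traces, which is one possible choice and not necessarily the fit used for the plots. Because the S-box is a permutation, the Hamming weights for uniformly random plaintexts are simulated here as Hamming weights of uniform random bytes. The values of `true_a`, `true_b` and the noise level are hypothetical.

```python
import random

def hamming_weight(value):
    return bin(value).count("1")

def fit_stochastic_model(hw_values, leakages):
    """Ordinary least squares for leakage ~ a * hw + b; returns (a, b)."""
    n = len(hw_values)
    mean_hw = sum(hw_values) / n
    mean_leak = sum(leakages) / n
    sxx = sum((h - mean_hw) ** 2 for h in hw_values)
    sxy = sum((h - mean_hw) * (l - mean_leak) for h, l in zip(hw_values, leakages))
    a = sxy / sxx
    return a, mean_leak - a * mean_hw

rng = random.Random(7)
true_a, true_b = 0.005, -0.02                       # hypothetical model parameters
hws = [hamming_weight(rng.randrange(256)) for _ in range(1000)]
leaks = [true_a * h + true_b + rng.gauss(0.0, 0.005) for h in hws]
a_hat, b_hat = fit_stochastic_model(hws, leaks)
```

Only two parameters are estimated from the 1000 traces, compared with 256 means for the template approach.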

SLIDE 11


Template Attacks

SCALE Atmel Scores

[Figure: two panels, titled 11005 (left) and 11223 (right), with x-axis ticks from 50 to 250.]

Final distinguishing scores
After incorporating 1000 target traces
  left: One candidate key very clearly sticks out
  right: One candidate key sticks out, but not as much

SLIDE 12


[Figure: two panels, titled 11005 (left) and 11223 (right), with x-axis ticks from 200 to 1000.]

Evolution of distinguishing scores
Look at scores as a function of number of traces incorporated
  left: the true key quickly separates from the rest
  right: it takes much longer for the true key to stand out
In blue the actual keybyte

SLIDE 13


[Figure: evolution of distinguishing scores for panels 11005 (left) and 11223 (right), zoomed in to x-axis ticks from 0 to 17.5; in blue the actual keybyte.]

SLIDE 14


Template Attacks

SCALE Atmel Success Rate

[Figure: two panels, titled 11005 (left, x-axis ticks from 0 to 17.5, y-axis ticks from 0.0 to 0.8) and 11223 (right, x-axis ticks from 25 to 200, y-axis ticks from 0.00 to 0.25).]

Success Rate: Probability that best guess wins
For each i (x-axis), ran 2000 experiments:
  1. Selected i out of 1000 traces
  2. Check if best guess is actual keybyte
Warning: resampling methodology used due to available data
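The resampling procedure can be sketched as follows, using a squared-residual distinguisher on toy data. The profile `model`, the key byte and the noise level are hypothetical, and only 40 experiments per point are run here (rather than 2000) to keep the example fast.

```python
import random

def best_guess(xs, ls, model, chosen):
    """Key-byte candidate with the smallest sum of squared residuals over `chosen` traces."""
    scores = [sum((ls[i] - model[xs[i] ^ k]) ** 2 for i in chosen) for k in range(256)]
    return min(range(256), key=scores.__getitem__)

def success_rate(xs, ls, model, true_key, n_traces, n_experiments, rng):
    """Fraction of experiments in which a random subset of n_traces yields the true key."""
    wins = 0
    for _ in range(n_experiments):
        chosen = rng.sample(range(len(xs)), n_traces)   # resample from the fixed pool
        wins += best_guess(xs, ls, model, chosen) == true_key
    return wins / n_experiments

rng = random.Random(5)
model = [rng.gauss(0.0, 1.0) for _ in range(256)]       # hypothetical profile
true_key = 0xA7
xs = [rng.randrange(256) for _ in range(1000)]
ls = [model[x ^ true_key] + rng.gauss(0.0, 0.1) for x in xs]

sr_1 = success_rate(xs, ls, model, true_key, 1, 40, rng)
sr_20 = success_rate(xs, ls, model, true_key, 20, 40, rng)
```

Because every experiment draws from the same pool of 1000 traces, the experiments are not independent, which is the caveat the warning refers to.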

SLIDE 15


Success rate conclusion
  1. Left performs better than right
  2. A success rate of $2^{-2}$ for a single keybyte only gives $2^{-32}$ for the full 16-byte key
Note: jaggedness likely due to low number of experiments
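The arithmetic behind the second point, assuming the 16 key bytes are attacked independently and each best guess succeeds with the same probability:

```python
import math

per_byte_success = 2 ** -2      # success rate for a single key byte
n_key_bytes = 16                # AES-128 key length in bytes

# Under independence, all 16 best guesses must be right at the same time.
full_key_success = per_byte_success ** n_key_bytes
log2_full_key_success = math.log2(full_key_success)   # -32
```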

SLIDE 16

Key Enumeration and Ranking: Enumeration

Different Adversarial Scenarios

Not-Quite-Kerckhoffs Principle

[Diagram: security game between the adversary and its oracles]
  K* ←$ Kg
  C ← E_{K*}(X)
  L ←$ Leak(K*, X)
  L ← Leak(K, X)
  win ← (K = K*)    (input K, output win)
  Adversary output: K̂

The adversary can exhaustively search the key

SLIDE 17


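[Diagram: security game between the adversary and its oracles]
  K* ←$ Kg
  Leak ←$ $\mathcal{L}$
  C ← E_{K*}(X)
  L ←$ Leak(K*, X)  (input X, output L)
  L ← Leak(K, X)    (input K, X; output L)
  win ← (K = K*)    (input K, output win)
  Adversary output: K̂

The adversary can enumerate the key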

SLIDE 18


Enumeration

Enhancing Divide-and-Conquer Attacks

  k0    score          k1    score                 k15   score
  0     0.123...       0     0.134...              0     0.184...
  1     0.127...       1     0.116...      ...     1     0.167...
  ...   ...            ...   ...                   ...   ...
  255   0.238...       255   0.098...              255   0.152...

Best guess: Simply output the most likely 128-bit key overall
Key enumeration: Test keys from most likely to least likely until success

SLIDE 19


Best guess: obviously k0 = 0, k1 = 255, ..., k15 = 255
But what about the next best guess?
Question posed by Veyrat-Charvillon et al. (SAC'12)
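To make the "next best guess" question concrete, here is a generic best-first sketch using a priority queue. It is not the algorithm of Veyrat-Charvillon et al. It assumes per-subkey scores are probability-like and combine by multiplication, so log scores add. The two small score tables are hypothetical.

```python
import heapq
import math

def enumerate_keys(subkey_scores):
    """Yield (key tuple, log2 of combined score) from most to least likely.

    subkey_scores: one dict {candidate: positive score} per subkey.
    """
    ranked = [sorted(s.items(), key=lambda kv: -kv[1]) for s in subkey_scores]

    def total(idx):
        return sum(math.log2(ranked[j][i][1]) for j, i in enumerate(idx))

    start = (0,) * len(ranked)          # best candidate of every subkey
    heap = [(-total(start), start)]
    seen = {start}
    while heap:
        neg_total, idx = heapq.heappop(heap)
        yield tuple(ranked[j][i][0] for j, i in enumerate(idx)), -neg_total
        # Successors: move exactly one subkey to its next-best candidate.
        for j in range(len(idx)):
            if idx[j] + 1 < len(ranked[j]):
                nxt = idx[:j] + (idx[j] + 1,) + idx[j + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-total(nxt), nxt))

# Hypothetical scores for two subkeys with three candidates each.
example_scores = [{0: 0.5, 1: 0.3, 2: 0.2}, {0: 0.1, 1: 0.7, 2: 0.2}]
ordered_keys = [key for key, _ in enumerate_keys(example_scores)]
```

The second-best key changes only one subkey relative to the best guess, but which subkey that is depends on the score gaps. The memory held by the queue and the visited set grows with the number of keys output, which is one of the costs that dedicated enumeration algorithms try to control.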

SLIDE 20


DPA with Enumeration
A number of cost metrics
  1. The number of traces (profile vs. target)
  2. The running time of the distinguisher
  3. The number of keys to test
  4. The overhead (in time) to enumerate

SLIDE 21


Some approaches
  Naive: Create ordered list of all $2^{128}$ keys
  2012: Tree-like recursion algorithm [Veyrat-Charvillon, Gérard, Renauld, Standaert / SAC]
  2015: Dynamic programming enabling parallelization [Martin, O'Connell, Oswald, Stam / Asiacrypt]

SLIDE 22


A Typical Side-Channel Attack Pipeline

Adding Enumeration
After the Distinguish phase, the scores are fed to an Enumeration phase

SLIDE 23


Adding Enumeration
After the Distinguish phase, the scores are fed to an Enumeration phase
But how long will it take, roughly?
Question posed by Veyrat-Charvillon et al. (Eurocrypt'13)

SLIDE 24

Key Enumeration and Ranking: Ranking

A Typical Side-Channel Attack Pipeline

Emulating Enumeration
After the Distinguishing phase, use knowledge of the target key to determine its rank.
Rather than running enumeration, emulate it to predict its runtime.

SLIDE 25


Key Ranking

Emulating the cost of key enumeration

Relevance: Evaluation
Many SCA are run by evaluation labs:
  They care not about actually recovering the key
  Only how difficult it is to do so
  The target key will be known already!

Ranking algorithms
A number of relevant metrics
  1. The time to compute
  2. Potential for parallelization
  3. Quality of the returned rank when approximating

SLIDE 26


Some algorithmic approaches
  lore: Adding "guessing entropies"
  2013: Tree-like recursion algorithm [Veyrat-Charvillon, Gérard, Renauld, Standaert / Eurocrypt]
  2015: Dynamic programming enabling parallelization [Martin, O'Connell, Oswald, Stam / Asiacrypt]
  2015: Convolution of histograms [Glowacz, Grosso, Poussier, Schüth, Standaert / FSE] [Bernstein, Lange, van Vredendaal / eprint]
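The histogram idea can be sketched in a few lines: bin each subkey's log scores, convolve the histograms to get a histogram over full keys, and count the keys in bins above the known key's bin. This is written in the spirit of the convolution approach, not taken from the cited papers. Rank here means the number of keys that score strictly higher than the true key, and the toy scores (3 subkeys with 16 candidates each) are random placeholders so that a brute-force check remains feasible.

```python
import itertools
import random

def convolve(h1, h2):
    out = [0] * (len(h1) + len(h2) - 1)
    for i, a in enumerate(h1):
        if a:
            for j, b in enumerate(h2):
                out[i + j] += a * b
    return out

def rank_bounds(log_scores, true_key, n_bins=64):
    """Lower and upper bounds on the number of keys scoring above the true key."""
    d = len(log_scores)
    lo = min(map(min, log_scores))
    hi = max(map(max, log_scores))
    width = (hi - lo) / n_bins
    bin_of = lambda s: int((s - lo) / width)
    total = [1]
    for scores in log_scores:
        hist = [0] * (n_bins + 1)
        for s in scores:
            hist[bin_of(s)] += 1
        total = convolve(total, hist)
    true_bin = sum(bin_of(scores[k]) for scores, k in zip(log_scores, true_key))
    # Binning can shift each subkey by less than one bin, so d bins of slack overall.
    lower = sum(total[true_bin + d:])
    upper = sum(total[max(true_bin - d + 1, 0):]) - 1   # exclude the true key itself
    return lower, upper

def brute_force_rank(log_scores, true_key):
    target = sum(s[k] for s, k in zip(log_scores, true_key))
    return sum(1 for combo in itertools.product(*log_scores) if sum(combo) > target)

rng = random.Random(3)
log_scores = [[rng.gauss(0.0, 2.0) for _ in range(16)] for _ in range(3)]
true_key = (3, 7, 11)
lower, upper = rank_bounds(log_scores, true_key)
exact = brute_force_rank(log_scores, true_key)
```

More bins tighten the bounds at the cost of longer convolutions. For 16 subkeys with 256 candidates the brute-force check is out of reach, while the convolution cost depends only on the number of bins and subkeys.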

SLIDE 27


Key Rank Distributions

Martin, Mather, Oswald, Stam / Asiacrypt'16

[Diagram: security game between the adversary and its oracles]
  K* ←$ Kg
  Leak ←$ $\mathcal{L}$
  C ← E_{K*}(X)
  L ←$ Leak(K*, X)  (input X, output L)
  L ← Leak(K, X)    (input K, X; output L)
  win ← (K = K*)    (input K, output win)
  Adversary output: K̂

The Rank Distribution
Evaluator's task for some keyed device: How long will it roughly take to recover the key as a function of the number of traces?

SLIDE 28


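[Diagram: security game between the adversary and its oracles]
  K* ←$ Kg
  Leak ←$ $\mathcal{L}$
  assert Leak ∈ $\mathcal{L}$
  C ← E_{K*}(X)
  L ←$ Leak(K*, X)  (input Leak, X; output L)
  L ← Leak(K, X)    (input K, X; output L)
  win ← (K = K*)    (input K, output win)
  Adversary output: K̂

MMOS Setup
  AES-128 with simulated leakage
  Sbox output Hamming weight with Gaussian noise
  For SNRs $2^x$ with $x \in \{-7, -5, -3\}$
  Ran an unprofiled Correlation Power Attack (CPA)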
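A small simulation along these lines, for a single key byte, might look as follows. It is a sketch and not the MMOS code. It assumes SNR is defined as the variance of the signal over the variance of the noise, with the Hamming weight of a uniform byte having variance 2; the key byte, seed and trace count are arbitrary.

```python
import math
import random

def aes_sbox():
    """Generate the AES S-box (inverse in GF(2^8) followed by the affine map)."""
    rotl8 = lambda v, s: ((v << s) | (v >> (8 - s))) & 0xFF
    sbox = [0] * 256
    p = q = 1
    while True:
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF   # multiply p by 3
        q ^= q << 1                                             # divide q by 3
        q ^= q << 2
        q ^= q << 4
        q &= 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63
    return sbox

SBOX = aes_sbox()
HW = [bin(v).count("1") for v in range(256)]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def cpa(plaintext_bytes, leakages):
    """Unprofiled CPA: correlate HW(Sbox(x ^ k)) with the leakage for every k."""
    corr = [pearson([HW[SBOX[x ^ k]] for x in plaintext_bytes], leakages)
            for k in range(256)]
    return max(range(256), key=corr.__getitem__), corr

rng = random.Random(2016)
snr = 2.0 ** -3
sigma = math.sqrt(2.0 / snr)            # noise standard deviation for this SNR
true_key = 0x2B
xs = [rng.randrange(256) for _ in range(2000)]
ls = [HW[SBOX[x ^ true_key]] + rng.gauss(0.0, sigma) for x in xs]
guess, corr = cpa(xs, ls)
```

The correlation vector `corr` plays the role of the per-subkey score list that enumeration and ranking algorithms take as input.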

SLIDE 29


MMOS Lessons

  1. Average log rank is more useful than log of average rank (geometric mean versus arithmetic mean)
  2. The variance in the rank is considerable, esp. in the middle
  3. SNR does not affect the shape of the distribution beyond scaling the x-axis
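The difference between the two averages in the first lesson can be seen on a handful of made-up ranks. The four values below are hypothetical and chosen only to span a wide range.

```python
import math

# Hypothetical key ranks from four repetitions of the same attack.
ranks = [2 ** 0, 2 ** 10, 2 ** 40, 2 ** 100]

avg_log_rank = sum(math.log2(r) for r in ranks) / len(ranks)   # 37.5 (geometric mean, in bits)
log_avg_rank = math.log2(sum(ranks) / len(ranks))              # about 98 (arithmetic mean, in bits)
```

The log of the average is dominated by the single worst experiment, while the average of the logs reflects the typical enumeration effort in bits.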

SLIDE 30


Challenges

  1. Improved sensor fusion to combine subkey scores
  2. Optimize distinguishers w.r.t. resulting key ranks
       Model and feature selection
       Score computation
  3. Rank distribution against various countermeasures

SLIDE 31

Conclusion: Want to Learn More?

SCALE: A Resource by Dan Page

https://github.com/danpage/scale

Side-Channel Attack Lab. Exercises
Provides a suite of material related to side-channel (and fault) attacks that is low-cost, accessible, relevant, coherent, and effective.

SCALE Data Sets
  1. Four platforms: an Atmel atmega328p (an AVR) plus three NXP ARM Cortex-M processors
  2. Implementation uses an 8-bit datapath and look-up tables for the S-box and xtime operations (but code not known)
  3. 2 × 1000 traces of AES-128 each (known vs. unknown key)
  4. Traces acquired using a Picoscope 2206B, using triggers for alignment

SLIDE 32


Power Analysis Attacks

Stefan Mangard, Elisabeth Oswald, and Thomas Popp’s Classic

Revealing the Secrets of Smart Cards
  "first comprehensive treatment of power analysis attacks and countermeasures"
  Aimed at the practitioner
  From 2007 ⇒ no modern ideas and theory

SLIDE 33


CHES

An IACR Conference

Cryptographic Hardware and Embedded Systems
  Established in 1999
  Efficient implementations
  How to mount implementation attacks
  How to protect against them
  New designs that allow efficient yet secure implementations
https://ches.iacr.org