

slide-1
SLIDE 1

A Comprehensive Study of Deep Learning for Side-Channel Analysis

Loïc Masure 1,3, Cécile Dumas 1, Emmanuel Prouff 2,3

  • 1 Univ. Grenoble Alpes, CEA, LETI, DSYS, CESTI, F-38000 Grenoble, France (loic.masure@cea.fr)
  • 2 ANSSI, France
  • 3 Sorbonne Université, UPMC Univ Paris 06, POLSYS, UMR 7606, LIP6, F-75005, Paris, France

17/09/2020, CHES | Loïc Masure, Cécile Dumas, Emmanuel Prouff | 1/18

slide-2
SLIDE 2

Outline

  • 1. Context
  • 2. SCA Optimization Problem versus Deep-Learning-Based SCA
  • 3. NLL Minimization is PI Maximization
  • 4. Simulation results
  • 5. Experimental results
slide-3
SLIDE 3

Who am I?

◮ PhD student, studying Deep Learning (DL) for Side-Channel Analysis (SCA)

[Figure: the French Certification Scheme. A developer conceives a component and commercialises the certified product; an ITSEF evaluates the security claims; ANSSI delivers the security certification. Loïc and Cécile work on the evaluation side, Emmanuel at ANSSI]

slide-6
SLIDE 6

What is SCA?

[Figure: during a sensitive operation of an encryption (e.g. LOAD X; LOAD B; MV B; ...), the device manipulates a sensitive value Z = C(P, K) derived from the plaintext P and the secret K, while the attacker measures a trace X]

Profiling Attack

Attack using open samples similar to the target device (same code, same chip, etc.), with full knowledge of the secret key.

Two steps:
◮ Profiling phase: P, K known ⇒ Z known; X acquired on an open sample
◮ Attack phase: P known; X acquired on the target device; K guessed
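As a concrete instance of the sensitive value Z = C(P, K) above, profiled attacks on AES classically target the first-round S-box output Z = Sbox(P ⊕ K). A minimal sketch; note the `SBOX` table is deliberately truncated to its first eight real entries and zero-padded, for illustration only:

```python
# Z = C(P, K) = Sbox(P xor K): the classical sensitive value targeted
# by profiled attacks on AES.  Only the first 8 entries of the real
# AES S-box are listed; the rest is zero-padded for brevity.
SBOX = [0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5] + [0x00] * 248

def sensitive_value(p: int, k: int) -> int:
    """Z = C(p, k): the intermediate value that leaks through the trace X."""
    return SBOX[p ^ k]

print(sensitive_value(0x03, 0x01))  # Sbox[0x02] = 0x77
```

During profiling, Z is known because both p and k are known; during the attack, Z is recomputed for every key hypothesis k.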

slide-15
SLIDE 15

Profiling Attacks

Key Recovery (i.e. attack step)

Given Na attack traces xi with plaintexts pi, compute the score vectors yi = F(xi).

[Figure: each score vector yi, indexed over the K key candidates through the hypothetical values Zi = C(pi, k⋆), is accumulated into the key guess k̂]

Goal: find F that minimizes Na s.t. k̂ = k⋆ with probability ≥ β (e.g. 0.9)
Optimal model: F⋆, requiring N⋆a traces

How to find F⋆ ⇒ profiling step

Requires knowing the probability distribution F⋆ = Pr[Z|X]
Reality: unknown to the evaluator/attacker. Estimated with parametric models F(·; θ):

[Figure: the estimator F(·; θ) approximating the conditional distribution Pr[Z|X = x] over the values of Z]
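The attack step above can be sketched as a log-likelihood accumulation over key hypotheses. This is an illustrative sketch, not the paper's code; `model_scores` stands in for the trained model F and `C` for the sensitive-value function:

```python
import math

def key_recovery(traces, plaintexts, model_scores, C, n_keys=256):
    """Accumulate log-scores over the Na attack traces and guess the key.

    model_scores(x) returns a probability vector y over the values of Z;
    C(p, k) is the sensitive-value function, e.g. Sbox(p ^ k).
    """
    log_score = [0.0] * n_keys
    for x, p in zip(traces, plaintexts):
        y = model_scores(x)
        for k in range(n_keys):
            # score hypothesis k via the predicted probability of Z = C(p, k)
            log_score[k] += math.log(max(y[C(p, k)], 1e-40))
    # k_hat: the hypothesis with the highest accumulated score
    return max(range(n_keys), key=lambda k: log_score[k])
```

The better the model F approximates Pr[Z|X], the fewer traces Na are needed before k̂ = k⋆ holds with the target probability β.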

slide-18
SLIDE 18

Deep Learning (DL) based SCA is currently a hot topic

Recent milestones on its effectiveness: more robust against counter-measures such as masking [MPP16] and jitter (misalignment) [CDP17], whether on software or FPGA targets [Kim+19]

Training a Neural Network

[Figure: the network F(x, θ) with parameters θ outputs y, which is compared to the label z = C(p, k⋆) through L(y, z)]

L: a performance metric (accuracy, recall, ...) or a loss function (Mean Squared Error, NLL, ...)

slide-28
SLIDE 28

Open issue with Machine Learning based SCA¹

“How to evaluate the quality of a model during training?”
◮ Accuracy: probability to recover the secret key with one trace

Their observations

“Accuracy does not seem to be the right performance metric in SCA”
◮ High accuracy ⇒ successful key recovery
◮ Low accuracy ⇒ nothing; problem: this often happens (e.g. highly noisy leakages)
◮ Apparently, no other machine learning metric relates to SCA metrics

Accuracy: find β s.t. N⋆a = 1
≠ SCA: fix β and find N⋆a instead

Our claim: we can accurately estimate N⋆a with DL!

¹ Picek et al., CHES 2019 [Pic+18]

slide-38
SLIDE 38

Bridging the gap between the loss function and the SCA metric

Training: minimization of the NLL, a.k.a. Cross-Entropy:

L(θ) = (1/Np) Σ_{i=1}^{Np} −log2 F(xi, θ)[zi] = H(Z) − PI_{Np}(Z; X; θ)

This talk → E_{X,Z}[L(θ)]

[Figure: entropy decomposition H(Z) = H(Z|X) + MI(Z; X)]

MI(Z; X) ≥ f(β)/N⋆a   [Cherisey et al., CHES 2019]
with f(β) = n − (1 − β) log2(2^n − 1) + β log2(β) + (1 − β) log2(1 − β)

PI(Z; X; θ) ≤ MI(Z; X)   [Bronchain et al., CRYPTO 2019]
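With the slide's notation, the empirical NLL directly yields the perceived information PI = H(Z) − L(θ), and f(β) gives a lower bound on N⋆a. A minimal numeric sketch under the assumption of uniform Z over n bits (so H(Z) = n); the input values are hypothetical:

```python
import math

def f_beta(beta: float, n: int) -> float:
    """de Chérisey et al. bound term f(beta) for an n-bit target."""
    return (n - (1 - beta) * math.log2(2**n - 1)
            + beta * math.log2(beta) + (1 - beta) * math.log2(1 - beta))

def perceived_information(nll_bits: float, n: int) -> float:
    """PI(Z; X; theta) = H(Z) - L(theta), with H(Z) = n for uniform Z."""
    return n - nll_bits

def min_attack_traces(nll_bits: float, beta: float, n: int) -> float:
    """Lower-bound estimate N*_a >= f(beta) / PI (PI approaches MI after training)."""
    return f_beta(beta, n) / perceived_information(nll_bits, n)

# e.g. an 8-bit target, NLL of 7.9 bits, success probability beta = 0.9
print(min_attack_traces(7.9, beta=0.9, n=8))
```

The loss is measured in bits (log base 2), matching the −log2 in L(θ) above.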

slide-46
SLIDE 46

Main Result

Proposition

Let θ̂_{Np} = argmin_θ L(θ) = argmax_θ PI_{Np}(Z; X; θ). Then:

PI(Z; X; θ̂_{Np}) converges in probability, as Np → ∞, to sup_θ PI(Z; X; θ) ≤ MI(Z; X)

[Figure: as the number of training steps grows, L(θ) decreases towards H(Z|X) while PI(Z; X; θ̂_{Np}) rises towards MI(Z; X), shown for Np = 1,000, 2,000, 5,000, ∞]

slide-51
SLIDE 51

Tightness of the Lower Bound

To what extent is the gap between PI and MI negligible?

The gap is composed of three kinds of errors:
◮ Approximation error: sup_{θ∈Θ} PI(Z; X; θ) − MI(Z; X) ≤ 0
◮ Estimation error: Np < ∞ ⇒ sup_{θ∈Θ} PI(Z; X; θ) is only approached by PI_{Np}(Z; X; θ̂_{Np})
◮ Optimization error: θ̂_{Np} unknown; a θ_SGD obtained by SGD is used instead

⇒ Ideally, each error must be discussed through simulations and experiments

[Figure: the resulting gap between PI(Z; X; θ̂) and MI(Z; X) on the training curve]

slide-56
SLIDE 56

Settings of the simulations

Leakage model
◮ Hamming weight with additive Gaussian noise (σ ∈ [0.01; 3.2])
◮ Draw an exhaustive dataset: estimation error negligible

PI/MI computation
◮ Computation of MI(X; Z) with Monte-Carlo simulations
◮ Training of a one-layer MLP with 1,000 neurons to maximize PI(Z; X; θ) = n − L(θ), where n = 4 bits

Several case studies
◮ Higher-order masking: sensitive variable split into d independent shares
◮ Shuffling: independent operations (e.g. the 16 S-boxes in SubBytes) randomly shuffled
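The simulated leakage model above can be sketched as follows. This is an illustrative reconstruction under the stated assumptions (Hamming weight plus Gaussian noise, optional Boolean masking into d shares), not the paper's simulation code:

```python
import random

def hamming_weight(v: int) -> int:
    """Number of set bits in v."""
    return bin(v).count("1")

def simulate_trace(z: int, sigma: float, d: int = 1, n_bits: int = 4):
    """Leak each of d Boolean shares of z as HW(share) + N(0, sigma^2).

    For d = 1 this is plain Hamming-weight leakage; for d > 1 the
    sensitive variable is split as z = share_1 xor ... xor share_d.
    """
    mask_max = (1 << n_bits) - 1
    shares = [random.randint(0, mask_max) for _ in range(d - 1)]
    last = z
    for m in shares:
        last ^= m  # the final share completes the Boolean sharing of z
    shares.append(last)
    return [hamming_weight(s) + random.gauss(0.0, sigma) for s in shares]

trace = simulate_trace(z=0b1011, sigma=0.5, d=2)  # one 2-share simulated trace
```

Drawing every (z, noise) combination over a grid approximates the "exhaustive dataset" of the slide, which makes the estimation error negligible.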

slide-60
SLIDE 60

Simulation results

[Figure: higher-order masking w.r.t. the level of noise σ — MI(Z, X) for 1, 2, 3 and 4 shares, in bits, with the corresponding PI]

[Figure: shuffling w.r.t. the level of noise σ — MI(Z, X) with no shuffle and with 2, 4 or 16 shuffled bytes, in bits, with the corresponding PI(Z, X; θ)]

What to interpret
◮ No matter the masking order, PI(Z; X; θ_SGD) ≈ MI(Z; X)
◮ For a simple MLP, the approximation error and the optimization error are negligible
◮ Any more complex model should have a negligible approximation error too
◮ Empirical verifications: see appendix

slide-69
SLIDE 69

Application on Public Datasets

◮ Na(θ) = f(β)/PI(Z; X; θ) ≈ f(β)/(n − L(θ)): the number of traces needed for key recovery?
◮ So far: N⋆a ≥ f(β)/MI(Z; X) and PI(Z; X; θ_SGD) ≈ MI(Z; X)
◮ Tests on public datasets, using architectures proposed in recent papers [MDP19; Kim+19]
◮ Relative error ε computed at the final epoch

Micro-controller protected with misalignment
[Figure: AES-RD — N⋆a estimated as f(β)/PI vs. the measured key recovery, over 200 training epochs; ε = 0.16]

Micro-controller protected with masking
[Figure: ASCAD — N⋆a estimated as f(β)/PI vs. the measured key recovery, over 30 training epochs; ε = 0.16]

Implementation on FPGA (no counter-measure)
[Figure: AES-HD — N⋆a estimated as f(β)/PI vs. the measured key recovery, over 50 training epochs; ε = 0.18]
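The per-epoch comparison above can be mimicked as follows. A minimal sketch with hypothetical loss values and a hypothetical measured trace count, assuming uniform Z over n bits; only the formula Na(θ) ≈ f(β)/(n − L(θ)) comes from the slides:

```python
import math

def f_beta(beta: float, n: int) -> float:
    """de Chérisey et al. bound term f(beta) for an n-bit target."""
    return (n - (1 - beta) * math.log2(2**n - 1)
            + beta * math.log2(beta) + (1 - beta) * math.log2(1 - beta))

def predicted_na(loss_bits: float, beta: float, n: int) -> float:
    """Na(theta) ~ f(beta) / (n - L(theta)): traces predicted from the loss."""
    return f_beta(beta, n) / (n - loss_bits)

# hypothetical validation losses (in bits) across training epochs
losses = [7.99, 7.95, 7.90, 7.88]
estimates = [predicted_na(l, beta=0.9, n=8) for l in losses]

measured_na = 60.0  # hypothetical trace count from an actual key recovery
eps = abs(estimates[-1] - measured_na) / measured_na  # relative error at final epoch
```

As the loss decreases over the epochs, the predicted Na shrinks and, per the paper's experiments, tracks the measured key-recovery curve within a small relative error ε.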

slide-73
SLIDE 73

Conclusion

Takeaway messages

  • 1. Minimizing the NLL loss ≡ maximizing the PI ⇒ tight lower bound on the MI ⇒ accurate estimation of N⋆a
  • 2. NLL as a loss function is sound from an evaluator's point of view
  • 3. Enables quantitative measurement of the impact of counter-measures

Thank you! Questions?
Looking for a postdoc candidate in machine-learning-based SCA? Hire me!

slide-74
SLIDE 74

A Comprehensive Study of Deep Learning for Side-Channel Analysis

References I

[CDP17] Eleonora Cagli, C´ ecile Dumas, and Emmanuel Prouff. “Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures - Profiling Attacks Without Pre-processing”. In: Cryptographic Hardware and Embedded Systems - CHES 2017 - 19th International Conference, Taipei, Taiwan, September 25-28, 2017, Proceedings. Ed. by Wieland Fischer and Naofumi Homma. Vol. 10529. Lecture Notes in Computer Science. Springer, 2017, pp. 45–68. isbn: 978-3-319-66786-7. doi: 10.1007/978-3-319-66787-4\_3. url: https://doi.org/10.1007/978-3-319-66787-4\_3. [Kim+19] Jaehun Kim et al. “Make Some Noise. Unleashing the Power of Convolutional Neural Networks for Profiled Side-channel Analysis”. In: IACR Transactions on Cryptographic Hardware and Embedded Systems 2019.3 (2019), pp. 148–179. doi: 10.13154/tches.v2019.i3.148-179. url: https: //tches.iacr.org/index.php/TCHES/article/view/8292.


slide-75
SLIDE 75

A Comprehensive Study of Deep Learning for Side-Channel Analysis

References II

[MPP16] Houssem Maghrebi, Thibault Portigliatti, and Emmanuel Prouff. “Breaking Cryptographic Implementations Using Deep Learning Techniques”. In: Security, Privacy, and Applied Cryptography Engineering - 6th International Conference, SPACE 2016, Hyderabad, India, December 14-18, 2016, Proceedings. Ed. by Claude Carlet, M. Anwar Hasan, and Vishal Saraswat. Vol. 10076. Lecture Notes in Computer Science. Springer, 2016, pp. 3–26. isbn: 978-3-319-49444-9. doi: 10.1007/978-3-319-49445-6_1. url: https://doi.org/10.1007/978-3-319-49445-6_1.

[MDP19] Loïc Masure, Cécile Dumas, and Emmanuel Prouff. “Gradient Visualization for General Characterization in Profiling Attacks”. In: Constructive Side-Channel Analysis and Secure Design - 10th International Workshop, COSADE 2019, Darmstadt, Germany, April 3-5, 2019, Proceedings. Ed. by Ilia Polian and Marc Stöttinger. Vol. 11421. Lecture Notes in Computer Science. Springer, 2019, pp. 145–167. isbn: 978-3-030-16349-5. doi: 10.1007/978-3-030-16350-1_9. url: https://doi.org/10.1007/978-3-030-16350-1_9.


slide-76
SLIDE 76

A Comprehensive Study of Deep Learning for Side-Channel Analysis

References III

[Pic+18] Stjepan Picek et al. “The Curse of Class Imbalance and Conflicting Metrics with Machine Learning for Side-channel Evaluations”. In: IACR Transactions on Cryptographic Hardware and Embedded Systems 2019.1 (2018), pp. 209–237. doi: 10.13154/tches.v2019.i1.209-237. url: https://tches.iacr.org/index.php/TCHES/article/view/7339.

[Vey+12] Nicolas Veyrat-Charvillon et al. “Shuffling against Side-Channel Attacks: A Comprehensive Study with Cautionary Note”. In: Advances in Cryptology - ASIACRYPT 2012 - 18th International Conference on the Theory and Application of Cryptology and Information Security, Beijing, China, December 2-6, 2012, Proceedings. Ed. by Xiaoyun Wang and Kazue Sako. Vol. 7658. Lecture Notes in Computer Science. Springer, 2012, pp. 740–757. isbn: 978-3-642-34960-7. doi: 10.1007/978-3-642-34961-4_44. url: https://doi.org/10.1007/978-3-642-34961-4_44.


slide-77
SLIDE 77

A Comprehensive Study of Deep Learning for Side-Channel Analysis

Our home dataset

Figure: ChipWhisperer-Lite board

Figure: SNR at orders d = 1, 2

Algorithm 1 loadData

1: LD r0, X    ⊲ Loads the first byte in r0
2: CLR r0      ⊲ Clears the register
3: ST X, r0    ⊲ Stores 0 in the plaintext array
4: LD r0, X    ⊲ Do it again to clear the bus
5: CLR r0
6: ST X, r0
7: LD r0, X    ⊲ One more time to be sure
8: CLR r0
9: ST X+, r0

Sequentially loads an array of 16 bytes into one register and clears it ⇒ no joint leakage at order d ≥ 2. 500,000 traces acquired. We only work on n = 4 bits, |Z| = 2^n = 16.
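The SNR figure on this slide is the classical first-order SNR: per time sample, the variance of the class-conditional means over the mean of the class-conditional variances. A minimal sketch (function name and data layout are illustrative assumptions, not the authors' code):

```python
import numpy as np

def snr(traces, labels, n_classes=16):
    """First-order Signal-to-Noise Ratio at each time sample t:
    SNR(t) = Var_z( E[X_t | Z=z] ) / E_z( Var[X_t | Z=z] ).
    traces: (N, D) measurements; labels: (N,) sensitive 4-bit values."""
    means = np.array([traces[labels == z].mean(axis=0) for z in range(n_classes)])
    noise = np.array([traces[labels == z].var(axis=0) for z in range(n_classes)])
    return means.var(axis=0) / noise.mean(axis=0)

# Synthetic check: plant leakage of Z at sample index 3.
rng = np.random.default_rng(0)
z = rng.integers(0, 16, size=2000)
traces = rng.normal(size=(2000, 10))
traces[:, 3] += z
print(np.argmax(snr(traces, z)))  # → 3
```

Samples whose SNR peaks above the noise floor are kept as Points of Interest (PoIs).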



slide-82
SLIDE 82

A Comprehensive Study of Deep Learning for Side-Channel Analysis

Experiment on ChipWhisperer-Lite: masking

◮ Emulation of order d leakages: Z = ⊕_{i ∈ [[0, d]]} plain[i] for d ∈ {0, 1, 2}
◮ Extraction of PoIs according to the SNR.
◮ Learning curve: PI(Z; X; θSGD) and PI_Np(Z; X; θSGD) w.r.t. Np ⇒ measures the estimation error.
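The order-d emulation above builds the target as the XOR of d + 1 plaintext nibbles, each nibble playing the role of one share, so the model must recombine d + 1 independent leakages. A sketch of this labelling step (helper name illustrative):

```python
import numpy as np

def emulate_masked_labels(plaintexts, d):
    """Emulate an order-d masked target: Z = plain[0] ^ ... ^ plain[d].
    d in {0, 1, 2} gives one, two, or three shares, as on the slide.
    plaintexts: (N, >= d+1) array of 4-bit values."""
    z = plaintexts[:, 0].copy()
    for i in range(1, d + 1):
        z ^= plaintexts[:, i]  # each extra share raises the masking order
    return z

# Usage: labels for the "three shares" (d = 2) experiment.
rng = np.random.default_rng(0)
plains = rng.integers(0, 16, size=(1000, 3))
labels = emulate_masked_labels(plains, d=2)
```

Because the shares are drawn independently, no single sample carries information about Z, which is exactly what forces the learning curves apart for two and three shares.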

Figure: learning curves of PI(Z; X; θSGD) and PI_Np(Z; X; θSGD) (in bits) vs. the number of profiling traces, for one, two, and three shares.

What to interpret

◮ ≈ one decade lost for each new masking order ⇒ masking remains sound
◮ Masking has an effect on the estimation error
◮ For d = 3, Np < 100,000: no information!



slide-86
SLIDE 86

A Comprehensive Study of Deep Learning for Side-Channel Analysis

Experiment 5: shuffling

◮ Emulation of order c shuffling: Z = plain[i], where i is randomly drawn from a subset of c indices
◮ Complete trace: D = 250
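The order-c shuffling emulation above can be sketched as follows (helper name illustrative): each trace's label comes from a position drawn uniformly among c candidates, so any fixed leaking sample is aligned with the label only 1/c of the time.

```python
import numpy as np

def emulate_shuffled_labels(plaintexts, c, rng):
    """Emulate order-c shuffling: Z = plain[i] with i uniform in [0, c).
    plaintexts: (N, >= c) array of 4-bit values."""
    idx = rng.integers(0, c, size=len(plaintexts))  # one position per trace
    return plaintexts[np.arange(len(plaintexts)), idx]

# Usage: labels for the c = 4 shuffling experiment.
rng = np.random.default_rng(0)
plains = rng.integers(0, 16, size=(1000, 16))
labels = emulate_shuffled_labels(plains, c=4, rng=rng)
```

This 1/c alignment is consistent with the roughly linear decrease of PI in c observed on the next figure.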

Figure: Exp. 5, shuffling. PI(Z; X; θSGD) vs. epoch for c ∈ {1, 2, 4, 16}.

What to interpret

◮ Linear decrease of PI, as expected [Vey+12]
◮ Clear over-fitting: the estimation error is non-negligible
