Learning Strikes Again: the Case of the DRS Signature Scheme

SLIDE 1

Learning Strikes Again: the Case of the DRS Signature Scheme

Yang Yu¹   Léo Ducas²

¹Tsinghua University   ²Centrum Wiskunde & Informatica

January 2019, London


SLIDE 2

This is a cryptanalysis work...

Target: DRS — a NIST lattice-based signature proposal
Techniques: learning & lattices

Statistical learning ⇒ secret key information leaks
Lattice techniques ⇒ better use of the leaks

The designers claim that Parameter Set-I offers at least 128 bits of security. We show that it actually offers at most 80 bits of security!

SLIDE 7

Outline

1 Background
2 DRS signature
3 Learning secret key coefficients
4 Exploiting the leaks
5 Countermeasures

SLIDE 9

Lattice

Definition
A lattice L is a discrete subgroup of Rᵐ.

A lattice is generated by a basis G = (g1, · · · , gn) ∈ Rⁿˣᵐ, i.e. L = {xG | x ∈ Zⁿ}.
L has infinitely many bases: G is good, B is bad.

[Figure: the same lattice drawn with a good basis (g1, g2) and a bad basis (b1, b2)]

SLIDE 12

Finding close vectors

Each basis defines a parallelepiped P.

[Figure: a target m and the tiling of the plane by translates of the parallelepiped P]

Babai's round-off algorithm outputs v ∈ L such that v − m ∈ P.
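To make the round-off step concrete, here is a minimal sketch in Python, assuming numpy, a row-basis convention (lattice points are integer combinations xB of the rows of B), and the toy basis reused in the reduction example later in the talk; it illustrates the idea, not the DRS implementation.

```python
# Minimal sketch of Babai's round-off algorithm (assumes a row basis B).
import numpy as np

def babai_round_off(B: np.ndarray, m: np.ndarray) -> np.ndarray:
    """Return v in L(B) such that v - m lies in the parallelepiped P(B)."""
    # Write m in basis coordinates, round each coordinate to the nearest
    # integer, then map back to the lattice.
    x = np.round(m @ np.linalg.inv(B))
    return x @ B

# Toy usage with a near-orthogonal ("good") basis: the output stays close to m.
B = np.array([[10.0, 1.0], [-1.0, 10.0]])
m = np.array([-933.0, 1208.0])
v = babai_round_off(B, m)   # -> [-929. 1210.], with every |v_i - m_i| small
```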

SLIDE 14

GGH & NTRUSign schemes

Public key: P, secret key: S

Sign
1 Hash the message to a random vector m
2 Round m (using S) to v ∈ L

Verify
1 Check v ∈ L (using P)
2 Check v is close to m

SLIDE 15

GGH & NTRUSign are insecure!

v − m ∈ P(S) ⇒ (v, m) leaks some information about S.
GGH and NTRUSign were broken by "learning the parallelepiped" [NR06]. Some countermeasures were also broken by a similar attack [DN12].

SLIDE 17

Countermeasures

Let us focus on the hash-then-sign approach!

Provably secure method [GPV08]
  rounding based on Gaussian sampling
  v − m is independent of S

Heuristic method [PSW08]
  rounding based on CVP w.r.t. the ℓ∞-norm
  the support of v − m is independent of S
  DRS [PSDS17] is an instantiation, submitted to NIST.

SLIDE 19

Outline

1 Background
2 DRS signature
3 Learning secret key coefficients
4 Exploiting the leaks
5 Countermeasures

SLIDE 20

DRS

DRS = Diagonal-dominant Reduction Signature
Parameters: (n, D, b, Nb, N1)

n : the dimension
D : the diagonal coefficient
b : the magnitude of the large coefficients (i.e. {±b})
Nb : the number of large coefficients per row vector
N1 : the number of small coefficients (i.e. {±1}) per row vector

S = D·I + E, where each row of E contains Nb large and N1 small coefficients, D > b·Nb + N1 (diagonal dominance), and E is absolute circulant.
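As a sanity check on these constraints, here is a hedged sketch of generating a DRS-like secret matrix; the toy numbers are hypothetical, and the real scheme additionally enforces the absolute circulant structure, which this sketch skips.

```python
# Sketch of a DRS-like secret key S = D*I + E: each row of E gets Nb
# coefficients in {±b} and N1 coefficients in {±1} at random off-diagonal
# positions. The absolute circulant constraint of DRS is NOT enforced here.
import numpy as np

def keygen_like(n: int, D: int, b: int, Nb: int, N1: int, seed: int = 0):
    assert D > b * Nb + N1, "diagonal dominance needs D > b*Nb + N1"
    rng = np.random.default_rng(seed)
    S = D * np.eye(n, dtype=np.int64)
    for i in range(n):
        off = [j for j in range(n) if j != i]            # off-diagonal slots
        pos = rng.choice(off, size=Nb + N1, replace=False)
        signs = rng.choice([-1, 1], size=Nb + N1)
        S[i, pos[:Nb]] = signs[:Nb] * b                  # large coefficients
        S[i, pos[Nb:]] = signs[Nb:]                      # small coefficients
    return S

S = keygen_like(n=32, D=29, b=4, Nb=3, N1=13)            # hypothetical toy sizes
```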

SLIDE 23

Message reduction algorithm

Input: a message m ∈ Zⁿ, the secret matrix S
Output: a reduced message w such that w − m ∈ L and ‖w‖∞ < D

1: w ← m, i ← 0
2: repeat
3:     w ← w − ⌊wi/D⌋→0 · si      (⌊·⌋→0 rounds toward 0)
4:     i ← (i + 1) mod n
5: until ‖w‖∞ < D
6: return w

Worked example with s1 = (10, 1), s2 = (−1, 10):
w = (−933, 1208)
w = w − (−93) · s1 = (−3, 1301)
w = w − 130 · s2 = (127, 1)
w = w − 12 · s1 = (7, −11)
w = w − (−1) · s2 = (6, −1)
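The loop above is short enough to transcribe directly; the following sketch (assuming the quotient ⌊·⌋→0 means truncation toward zero) reproduces the 2-dimensional example.

```python
# Sketch of the DRS message reduction loop; int() truncates toward zero,
# matching the quotient ⌊w_i/D⌋→0 in the pseudocode.
import numpy as np

def reduce_message(m: np.ndarray, S: np.ndarray, D: int) -> np.ndarray:
    n = S.shape[0]
    w = m.astype(np.int64)
    i = 0
    while np.abs(w).max() >= D:
        q = int(w[i] / D)          # quotient rounded toward zero
        w = w - q * S[i]           # reduce coordinate i using row s_i
        i = (i + 1) % n
    return w

S = np.array([[10, 1], [-1, 10]])
print(reduce_message(np.array([-933, 1208]), S, 10))   # -> [ 6 -1]
```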

SLIDE 29

Message reduction algorithm

Intuition: use si to reduce wi
  |wi| decreases a lot
  for j ≠ i, |wj| increases a bit

A reduction at i: w → w − q·si, q = ⌊wi/D⌋→0

‖w − q·si‖₁ = ∑_{k≠i} |wk − q·si,k| + |wi| − |q|·D    (since q·wi > 0)
            ≤ ∑_{k≠i} (|wk| + |q·si,k|) + |wi| − |q|·D
            = ‖w‖₁ − |q|·(D − ∑_{k≠i} |si,k|)
            < ‖w‖₁    (diagonal dominance)

⇒ message reduction always terminates!

SLIDE 32

Resistance to NR attack

The support of w: (−D, D)ⁿ

[Figure: the DRS domain (−D, D)ⁿ versus the parallelepiped P(S)]

The support is "zero-knowledge", but maybe the distribution is not!

SLIDE 35

Outline

1 Background
2 DRS signature
3 Learning secret key coefficients
4 Exploiting the leaks
5 Countermeasures

SLIDE 36

Intuition

[Figure: joint distribution of (wi, wj) over (−D, D)², for Si,j = −b, Si,j = 0 and Si,j = b]

SLIDE 37

Correlations

Two sources of correlations between (wi, wj):
  a reduction at i when Si,j ≠ 0
  ⋆ a reduction at k when Sk,i, Sk,j ≠ 0

⇒ Si,j should be strongly related to Wi,j (the distribution of (wi, wj))!

SLIDE 41

Figure out the model

Can we devise a formula Si,j ≈ f(Wi,j)?

Seems complicated!
  cascading phenomenon: a reduction triggers another one
  parasite correlations

⇒ Search for the best linear fit f?
Search space for all linear f: too large!
⇒ choose some features {fℓ} and search in span({fℓ}), i.e. f = ∑ℓ xℓ·fℓ

SLIDE 46

Training — feature selection

Lower degree moments:
f1(W) = E(wi·wj)
f2(W) = E(wi·|wi|^(1/2)·wj)
f3(W) = E(wi·|wi|·wj)

[Figure: the moment features f1, f2, f3 plotted over the normalized domain [−1, 1]²]

Not enough!

SLIDE 48

Training — feature selection

[Figure: the joint distribution of (wi, wj) again, for Si,j = −b, Si,j = 0 and Si,j = b, motivating a closer look at the central region]

SLIDE 49

Training — feature selection

Pay more attention to the central region (i.e. |wi| small).

f4 = E(wi(wi − 1)(wi + 1)wj)
f5 = E(2wi(2wi − 1)(2wi + 1)wj | |2wi| ≤ 1)
f6 = E(4wi(4wi − 1)(4wi + 1)wj | |4wi| ≤ 1)
f7 = E(8wi(8wi − 1)(8wi + 1)wj | |8wi| ≤ 1)

[Figure: the windowed features f4–f7 over the normalized domain [−1, 1]²]

Together with transposes (i.e. fᵗ(wi, wj) = f(wj, wi)), we finally selected 7 × 2 − 1 = 13 features in experiments.
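A sketch of how these features could be estimated from samples; we assume (this is our reading, not spelled out on the slides) that coordinates are first normalized to (−1, 1) by dividing by D, and that the transposed features are obtained by swapping the two arguments.

```python
# Estimate the moment features f1..f7 for a coordinate pair (i, j) from N
# normalized samples; the transposes come from features(wj, wi).
import numpy as np

def features(wi: np.ndarray, wj: np.ndarray) -> np.ndarray:
    feats = [np.mean(wi * wj),                          # f1
             np.mean(wi * np.abs(wi) ** 0.5 * wj),      # f2
             np.mean(wi * np.abs(wi) * wj)]             # f3
    for c in (1, 2, 4, 8):                              # f4 .. f7
        mask = np.abs(c * wi) <= 1                      # central window
        u = c * wi[mask]
        feats.append(np.mean(u * (u - 1) * (u + 1) * wj[mask]))
    return np.array(feats)
```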

SLIDE 51

Training — model construction

Si,j seems easier to learn when (i − j mod n) is smaller.
⇒ two models f⁺ = ∑ℓ x⁺ℓ·fℓ and f⁻ = ∑ℓ x⁻ℓ·fℓ, chosen according to (i − j mod n).

Build models by least-squares fitting (a sketch follows this slide):
  30 instances and 400 000 samples per instance
  38 core-hours

Possible improvements:
  advanced machine learning techniques
  more blocks
  new features
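A minimal sketch of the fitting step, assuming numpy: stack the feature vectors of all training pairs into a matrix F, with the true coefficients s known because the training instances are self-generated, and solve for the weights by least squares.

```python
# Least-squares fit of model weights x so that F @ x ≈ s, i.e.
# S'_{i,j} = sum_l x_l * f_l(W_{i,j}); one such model per block (f+, f-).
import numpy as np

def fit_model(F: np.ndarray, s: np.ndarray) -> np.ndarray:
    x, *_ = np.linalg.lstsq(F, s, rcond=None)   # minimizes ||F x - s||_2
    return x

def predict(F: np.ndarray, x: np.ndarray) -> np.ndarray:
    return F @ x                                # guessed coefficients S'
```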

SLIDE 58

The models

[Figure: the learned models f⁻ and f⁺ over the normalized domain [−1, 1]²]

SLIDE 59

Learning

Let's learn a new S as S′ = f(W)!

[Figure: probability density of the model output under f⁻ and f⁺, split by the true value Si,j ∈ {b, −b, 1, −1, 0}]

SLIDE 62

Learning — location

S = D·I + E is "absolute circulant" ⇒ more confidence via diagonal amplification

The weight of the k-th diagonal: Wk = ∑i (S′i,i+k)²

[Figure: scaled diagonal weights W⁺k, W⁻k for k = 1, …, n − 1; the large-coefficient diagonals stand out]
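A sketch of the amplification step: since |E| is circulant, the large coefficients of one diagonal all sit at the same offset k = j − i mod n, so summing squared guesses along each diagonal boosts the signal; the exact scaling of the weights is an assumption here.

```python
# Sum squared guessed coefficients along each wrap-around diagonal and keep
# the Nb heaviest diagonals as the guessed large-coefficient offsets.
import numpy as np

def locate_large_diagonals(S_guess: np.ndarray, Nb: int) -> np.ndarray:
    n = S_guess.shape[0]
    idx = np.arange(n)
    weights = np.array([float(np.sum(S_guess[idx, (idx + k) % n] ** 2))
                        for k in range(n)])
    weights[0] = 0.0                      # skip the D diagonal itself
    return np.argsort(weights)[::-1][:Nb]
```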

SLIDE 64

Learning — location

#signatures   13/16   14/16   15/16   16/16
50 000          5       3       6       6
100 000         0       0       0      20
200 000         0       0       0      20
400 000         0       0       0      20

Table: Location accuracy. The column labeled K/16 shows the number of tested instances in which the largest Nb scaled weights corresponded to exactly K large-coefficient diagonals.

We locate all large coefficients successfully (given enough signatures)!
But we are still missing the signs!

SLIDE 67

Learning — sign

Si,j ∈ {±b, ±1, 0}; restricted to the located large coefficients, Si,j ∈ {±b}.

[Figure: probability density of the model output, restricted to Si,j = b and Si,j = −b — the two classes are well separated]

#signatures     pl       pu       p      prow
400 000       0.9975   0.9939   0.9956   0.9323
200 000       0.9920   0.9731   0.9826   0.7546
100 000       0.9722   0.9330   0.9536   0.4675
50 000        0.9273   0.8589   0.8921   0.1608

Table: Experimental measures for pl, pu, p and prow.

p = accuracy of guessing the sign of a large coefficient
pl = accuracy for a large coefficient in the lower triangle
pu = accuracy for a large coefficient in the upper triangle
prow = p^Nb

We can determine all large coefficients in one row!
However, it is still hard to learn small coefficients...

SLIDE 72

Outline

1 Background
2 DRS signature
3 Learning secret key coefficients
4 Exploiting the leaks
5 Countermeasures

SLIDE 73

BDD & uSVP

BDD (Bounded Distance Decoding)
Given a lattice L and a target t "very close" to L, find v ∈ L minimizing ‖v − t‖.

uSVP (Unique SVP)
Given a lattice L with λ1(L) ≪ λ2(L), find its shortest non-zero vector.

BDD ⇒ uSVP on the lattice L′ spanned by
( B  0 )
( t  1 )
with λ1(L′) = √(1 + dist(t, L)²) and vol(L′) = vol(L).
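The embedding is easy to write down; here is a sketch, assuming a row basis B and an integer target t (this is the generic construction, not code from the DRS submission).

```python
# Kannan-style embedding: a shortest vector of the extended lattice has the
# form ±(t - v, 1) for the lattice vector v closest to t.
import numpy as np

def kannan_embed(B: np.ndarray, t: np.ndarray) -> np.ndarray:
    n, m = B.shape
    Bp = np.zeros((n + 1, m + 1), dtype=B.dtype)
    Bp[:n, :m] = B        # original basis, last column zero
    Bp[n, :m] = t         # BDD target as an extra row
    Bp[n, m] = 1          # embedding coefficient
    return Bp             # hand this to a reduction routine such as BKZ
```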

SLIDE 76

Solving uSVP by BKZ

Required blocksize β [ADPS16, AGVW17]:
√(β/d) · λ1(L′) ≤ δβ^(2β−d) · vol(L′)^(1/d),
where d = dim(L′) and δβ ≈ ((πβ)^(1/β) · β/(2πe))^(1/(2(β−1))) for β > 50.

Cost of BKZ-β [Che13, Alb17]: C_BKZ-β = 16d · C_SVP-β

Cost of solving SVP in dimension β:
  Enum [APS15]: 2^(0.270·β·ln β − 1.019·β + 16.10)
  ⋆ Sieve [Duc17]: 2^(0.396·β + 8.4)
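These formulas are simple enough to evaluate directly; the following sketch searches for the smallest β satisfying the success condition and returns the log2 cost under either SVP model (a transcription of the estimates above, not the authors' script).

```python
# Evaluate the [ADPS16, AGVW17] blocksize condition and the quoted BKZ cost
# models, all in log2 units.
import math

def delta(beta: int) -> float:
    return ((math.pi * beta) ** (1.0 / beta) * beta / (2 * math.pi * math.e)) \
        ** (1.0 / (2 * (beta - 1)))

def required_blocksize(lam1: float, d: int, log2_vol: float) -> int:
    for beta in range(50, d + 1):
        lhs = math.log2(math.sqrt(beta / d) * lam1)
        rhs = (2 * beta - d) * math.log2(delta(beta)) + log2_vol / d
        if lhs <= rhs:
            return beta
    return d

def log2_bkz_cost(beta: int, d: int, sieve: bool = True) -> float:
    log2_svp = 0.396 * beta + 8.4 if sieve else \
               0.270 * beta * math.log(beta) - 1.019 * beta + 16.10
    return math.log2(16 * d) + log2_svp          # C_BKZ-β = 16d · C_SVP-β
```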

SLIDE 81

Leaks help a lot!

Attack without leaks
d = n + 1, λ1(L′) = √(b²·Nb + N1 + 1)
cost: > 2^128

Naive attack with leaks
d = n + 1, λ1(L′) = √(N1 + 1)
cost: 2^78

Improved attack with leaks
d = n − Nb, λ1(L′) = √(N1 + 1)
cost: 2^73

SLIDE 84

Improved BDD-uSVP attack

Red: D, ±b (known); Blue: 0, ±1 (unknown)

[Figure: the target t and the secret row sk drawn coordinate-wise, with known (red) and unknown (blue) coordinates]

Let H = HNF(L) and write s = cH.

Let M be such that tM = (0, r), skM = (b, r), cM = (p, r), i.e. M moves the known coordinates into one block.

Let MᵗHM = ( H′  H′′ ; 0  I ), and let L′ be the lattice spanned by
( H′  0 )
( t′  1 )
with t′ = rH′′.

dim(L′) = n − Nb
vol(L′) = vol(L)
λ1(L′) = ‖(b, 1)‖ = √(N1 + 1)

SLIDE 89

Improved BDD-uSVP attack

Once one si is recovered exactly ⇒ all 0's in S are determined (via the absolute circulant structure).

[Figure: after the first recovery the known part of t, sk, c grows, and the instance dimension drops from n − Nb to N1 + Nb + 1 ≈ n/2]

Recovering the whole secret matrix ≈ recovering a first secret vector.
Can we do better with the help of many tk close to sk? [KF17]

SLIDE 93

Conclusion

We present a statistical attack against DRS: given 100 000 signatures, the security is below 80 bits; even less with the current progress of lattice algorithms.

SLIDE 94

Outline

1 Background
2 DRS signature
3 Learning secret key coefficients
4 Exploiting the leaks
5 Countermeasures

SLIDE 95

Modified DRS

In DRS: S = D·I + E is diagonal-dominant.

Version 1 [PSDS17]
  absolute circulant, Ei,i = 0
  three types of coefficients ({0}, {±1}, {±b}) with fixed numbers

Version 2 [PSDS18]
  e1, · · · , en ←$ {v : ‖v‖₁ < D}
  variable diagonal elements

Impact
  no circulant structure ⇒ diagonal amplification doesn't work
  coefficients are less sparsely distributed ⇒ less confidence in guessing

SLIDE 99

Learning attack on modified DRS

We regard each Si,j as a random variable following the same distribution. Let S′ be the guess of S and N be the sample size.

As N grows, we hope that
  Var(Si,j − S′i,j) < Var(Si,j) ⇒ more confidence in the guess
  ‖si − s′i‖ < ‖si‖ ⇒ the guessed vector gets closer to the lattice

SLIDE 101

Some experiments on modified DRS

We conducted some experiments on reduced parameters, re-using the same approach with the same features.

[Figure: Var(Sj,i+j − S′j,i+j)/Var(Sj,i+j) and ‖si − s′i‖/‖si‖ as functions of i, for log(N) = 15, …, 20]

We also tried the case of n blocks and some new features.

[Figure: the same ratios for the n-block models]

Further study is ongoing...

SLIDE 105

Conclusion

A leak still exists despite the new countermeasure.

Work in progress: use the timing leakage to locate the endpoint of the message reduction, then classify samples and choose the most relevant ones.

Open question: well-designed perturbations & statistical arguments.

SLIDE 107

Thank you!

SLIDE 108

References

[NR06] Learning a Parallelepiped: Cryptanalysis of GGH and NTRU Signatures. Phong Q. Nguyen and Oded Regev. EUROCRYPT 2006.
[DN12] Learning a Zonotope and More: Cryptanalysis of NTRUSign Countermeasures. Léo Ducas and Phong Q. Nguyen. ASIACRYPT 2012.
[GPV08] Trapdoors for Hard Lattices and New Cryptographic Constructions. Craig Gentry, Chris Peikert and Vinod Vaikuntanathan. STOC 2008.
[PSW08] A Digital Signature Scheme Based on CVP∞. Thomas Plantard, Willy Susilo and Khin Than Win. PKC 2008.
[PSDS17] DRS: Diagonal dominant Reduction for lattice-based Signature. Thomas Plantard, Arnaud Sipasseuth, Cedric Dumondelle and Willy Susilo. Submission to the NIST PQC competition.
[ADPS16] Post-quantum Key Exchange—A New Hope. Erdem Alkim, Léo Ducas, Thomas Pöppelmann and Peter Schwabe. USENIX Security 2016.
[AGVW17] Revisiting the Expected Cost of Solving uSVP and Applications to LWE. Martin R. Albrecht, Florian Göpfert, Fernando Virdia and Thomas Wunderer. ASIACRYPT 2017.
[Che13] Réduction de réseau et sécurité concrète du chiffrement complètement homomorphe. Yuanmi Chen. PhD thesis, https://www.theses.fr/2013PA077242.
[Alb17] On Dual Lattice Attacks Against Small-Secret LWE and Parameter Choices in HElib and SEAL. Martin R. Albrecht. EUROCRYPT 2017.
[APS15] On the Concrete Hardness of Learning with Errors. Martin R. Albrecht, Rachel Player and Sam Scott. Journal of Mathematical Cryptology, 2015.
[Duc17] Shortest Vector from Lattice Sieving: A Few Dimensions for Free. Léo Ducas. EUROCRYPT 2018.
[KF17] Revisiting Lattice Attacks on Overstretched NTRU Parameters. Paul Kirchner and Pierre-Alain Fouque. EUROCRYPT 2017.
[PSDS18] DRS: Diagonal dominant Reduction for lattice-based Signature, Version 2. Thomas Plantard, Arnaud Sipasseuth, Cedric Dumondelle and Willy Susilo. https://www.uow.edu.au/~thomaspl/drs/current/specification.pdf.