SPRING: Fast Pseudorandom Functions from Rounded Ring Products



SLIDE 1

. . . . . . . . SPRING . . . Tweaks . . . . Implementation

SPRING

Fast Pseudorandom Functions from Rounded Ring Products

Abhishek Banerjee¹, Hai Brenner², Gaëtan Leurent³, Chris Peikert¹, Alon Rosen²

¹ Georgia Institute of Technology, ² IDC Herzliya, ³ UCL, Inria

FSE 2014

  • G. Leurent ()

SPRING FSE 2014 1 / 16

SLIDE 2

Motivation

Public key

▶ Strong algebraic structure
▶ Security reduction
▶ Slow

Secret key

▶ Security from cryptanalysis
▶ Fast

Bridging the gap

▶ Can we have an efficient design with strong algebraic structure?
▶ Security reduction from a well-understood problem?
▶ Extra features?
▶ Previous examples: SWIFFT, FSB, Lapin, HB family

SLIDE 4

SPRING construction

Subset Product with Rounding over a ring

$F_{a,\vec{s}}(x_1, \dots, x_k) := S\left(a \cdot \prod_{j=1}^{k} s_j^{x_j}\right)$

▶ Lattice-based PRF [BPR, Eurocrypt ’12]
▶ Polynomial ring $R_p = \mathbb{Z}_p[X]/(X^n + 1)$
▶ Key: $a, (s_i)_{i=1}^{k} \in R_p$
▶ Rounding function $S$
  ▶ e.g. MSB of each polynomial coefficient
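The definition above can be sketched directly, without any of the optimizations discussed later in the talk. This is a minimal illustration, not the authors' code: it assumes the parameters p = 257, n = 128 and the rounding x ↦ ⌊2x/257⌉ mod 2 that appear on later slides, and the names `ring_mul`, `round_msb`, and `spring` are hypothetical.

```python
import random

P, N = 257, 128  # assumed parameters (taken from later slides)

def ring_mul(f, g):
    """Schoolbook product in Z_P[X]/(X^N + 1): X^N wraps around to -1."""
    h = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < N:
                h[i + j] = (h[i + j] + fi * gj) % P
            else:
                h[i + j - N] = (h[i + j - N] - fi * gj) % P
    return h

def round_msb(u):
    """Rounding S: each coefficient x of Z_P becomes round(2x/P) mod 2."""
    return [((4 * c + P) // (2 * P)) % 2 for c in u]

def spring(a, s, x):
    """F_{a,s}(x) = S(a * product of the s_j with x_j = 1)."""
    acc = a
    for sj, xj in zip(s, x):
        if xj:
            acc = ring_mul(acc, sj)
    return round_msb(acc)

# Toy usage with a random (not necessarily invertible) key and k = 8.
rng = random.Random(1)
k = 8
a = [rng.randrange(P) for _ in range(N)]
s = [[rng.randrange(P) for _ in range(N)] for _ in range(k)]
y = spring(a, s, [1, 0, 1, 1, 0, 0, 1, 0])  # 128 output bits
```

Each evaluation costs up to k quadratic-time ring products here, which is what the FFT-based tricks later in the talk are meant to avoid.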

SLIDE 5

SPRING security

▶ Based on the RL W E assumption

▶ Secret polynomial s ∈ Rp,

Rp = ℤp[X]/(Xn + 1)

▶ Distinguish (ai, ai ⋅ s + ei) from uniform ▶ Reduction to worstcase ideal lattice problems

▶ Deterministic version: RL W R assumption

▶ Secret polynomial s ∈ Rp ▶ Distinguish (ai, ⌊ai ⋅ s⌉) from uniform ▶ Rounding removes information, like adding noise

▶ Two SPRING outputs gives something similar to an LWR sample

▶ Fa,⃗

s(x1, … , xk) ∶= S 􏿶a ⋅ ∏k j=1 s xj j 􏿹

▶ Secret polynomials s, t ▶ Output (⌊t⌉, ⌊t ⋅ s⌉)

SLIDE 7

From provable security to efficiency

▶ Security reduction requires huge parameters
▶ What happens when we use small parameters?
  ▶ Security reduction not applicable as such
  ▶ Guideline towards reasonable constructions (mode of operation?)
  ▶ Bias can appear (was negligible with large parameters)
  ▶ Concrete security evaluation needed

SLIDE 8

Choice of ring

SPRING: $F_{a,\vec{s}}(x_1, \dots, x_k) := S\left(a \cdot \prod_{j=1}^{k} s_j^{x_j}\right)$ over $R_p = \mathbb{Z}_p[X]/(X^n + 1)$

▶ Select parameters with fast polynomial product
  1 Polynomial product very efficient using FFT algorithm
  2 Arithmetic mod $2^i + 1$ is efficient in software
▶ Problem was studied for SWIFFT
▶ Use $p = 257$, $n = 128$

SLIDE 9

Product in the ring $R_{257}$

Fast polynomial product $h = f \cdot g$

1 Evaluate $f$ and $g$: $f_i = f(x_i)$, $g_i = g(x_i)$ (256 points)
2 Multiply values coefficient-wise
3 Interpolate $h$ s.t. $h(x_i) = f_i \times g_i$ (degree 256)

▶ Let $\omega$ be a 256th root of unity, $x_i = \omega^i$, $\omega = 41$
▶ Use FFT for evaluation/interpolation in $n \log(n)$

▶ We want $f \cdot g \bmod x^{128} + 1$
  ▶ $x^{128} + 1 = \prod (x - \omega^{2i+1})$
  ▶ Chinese Remainder: compute $h \bmod (x - \omega^{2i+1})$, i.e. $h(\omega^{2i+1})$

▶ Evaluating $f(\omega^{2i+1})$
  ▶ $\varrho : \sum b_i \cdot x^i \mapsto \sum (b_i \cdot \omega^i) \cdot x^i$
  ▶ $\varrho(f)(\omega^{2i}) = f(\omega^{2i+1})$

▶ $\mathrm{FFT}_{128}(\varrho(f \cdot g)) = \mathrm{FFT}_{128}(\varrho(f)) \times \mathrm{FFT}_{128}(\varrho(g))$ (coefficient-wise ×)

SLIDE 12

Product in the ring $R_{257}$

Fast polynomial product $h = f \cdot g \bmod x^{128} + 1$

1 Evaluate $f$ and $g$: $f_i = f(x_i)$, $g_i = g(x_i)$ (128 points)
2 Multiply values coefficient-wise
3 Interpolate $h$ s.t. $h(x_i) = f_i \times g_i$ (degree 128)

▶ Let $\omega$ be a 256th root of unity, $x_i = \omega^{2i+1}$, $\omega = 41$
▶ Use FFT for evaluation/interpolation in $n \log(n)$

▶ We want $f \cdot g \bmod x^{128} + 1$
  ▶ $x^{128} + 1 = \prod (x - \omega^{2i+1})$
  ▶ Chinese Remainder: compute $h \bmod (x - \omega^{2i+1})$, i.e. $h(\omega^{2i+1})$

▶ Evaluating $f(\omega^{2i+1})$
  ▶ $\varrho : \sum b_i \cdot x^i \mapsto \sum (b_i \cdot \omega^i) \cdot x^i$
  ▶ $\varrho(f)(\omega^{2i}) = f(\omega^{2i+1})$

▶ $\mathrm{FFT}_{128}(\varrho(f \cdot g)) = \mathrm{FFT}_{128}(\varrho(f)) \times \mathrm{FFT}_{128}(\varrho(g))$ (coefficient-wise ×)

SLIDE 13

Implementation tricks

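SPRING PRF: $F_{a,\vec{s}}(x_1, \dots, x_k) := S\left(a \cdot \prod_{j=1}^{k} s_j^{x_j}\right)$

▶ Use FFT for the subset product
  ▶ $\prod_{x_j=1} s_j = \varrho^{-1}\left(\mathrm{FFT}^{-1}\left(\bigtimes_{x_j=1} \mathrm{FFT}(\varrho(s_j))\right)\right)$
  ▶ Store $\tilde{s}_j := \mathrm{FFT}(\varrho(s_j))$ (equivalent key)
  ▶ $\prod_{x_j=1} s_j = \varrho^{-1}\left(\mathrm{FFT}^{-1}\left(\bigtimes_{x_j=1} \tilde{s}_j\right)\right)$ (coefficient-wise product)

▶ Use counter mode for a stream cipher
  ▶ Single addition instead of subset-sum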

SLIDE 14

Implementation tricks

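SPRING PRF: $F_{a,\vec{s}}(x_1, \dots, x_k) := S\left(a \cdot \prod_{j=1}^{k} s_j^{x_j}\right)$

▶ Use FFT for the subset product
  ▶ $\prod_{x_j=1} s_j = \varrho^{-1}\left(\mathrm{FFT}^{-1}\left(\bigtimes_{x_j=1} \mathrm{FFT}(\varrho(s_j))\right)\right)$
  ▶ Store $\hat{s}_{ij} := \log(\tilde{s}_{ij})$, where $\tilde{s}_j := \mathrm{FFT}(\varrho(s_j))$ (equivalent key)
  ▶ $\prod_{x_j=1} s_j = \varrho^{-1}\left(\mathrm{FFT}^{-1}\left(\exp\left(\sum_{x_j=1} \hat{s}_j\right)\right)\right)$ (coefficient-wise product)

▶ Use counter mode for a stream cipher
  ▶ Single addition instead of subset-sum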
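The log/exp trick replaces a coefficient-wise product in ℤ257 by a sum of discrete logarithms in ℤ256. A minimal sketch for a single coefficient, assuming base 3 (the exp step x ↦ 3^x mod 257 used elsewhere in the talk); the names `EXP`, `LOG`, and `product_via_logs` are hypothetical:

```python
P, G = 257, 3  # G = 3 generates the multiplicative group of Z_257 (order 256)

# exp table Z_256 -> nonzero elements of Z_257, and its inverse (discrete log).
EXP = [pow(G, e, P) for e in range(256)]
LOG = {v: e for e, v in enumerate(EXP)}

def product_via_logs(values):
    """Multiply nonzero elements of Z_257 by adding their logs mod 256."""
    return EXP[sum(LOG[v] for v in values) % 256]

# Usage: three table lookups and two additions instead of two multiplications.
prod = product_via_logs([41, 100, 200])   # equals 41 * 100 * 200 mod 257
```

Zero has no logarithm, so this representation only works when every transformed key coefficient is nonzero, that is, when the key polynomials are units of the ring.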

SLIDE 16

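SPRING over $R_{257}$ ($p = 257$, $n = 128$)

[Diagram: data flow of one SPRING evaluation, from key and input to output]

▶ Key $s_{ij}$: $1024(k + 1)$ bits
▶ Inputs: constant 1 and the $k$-bit input $x = (x_1, x_2, \dots, x_k)$
▶ Subset sum $\sum_j x_j s_{ij}$: 1024-bit state (128 8-bit words)
▶ exp: $\mathbb{Z}_{256} \to \mathbb{Z}_{257}$, $x \mapsto 3^x \bmod 257$
▶ FFT over $(\mathbb{Z}_{257})^{128}$
▶ Multiplication by $\omega^{-0}, \omega^{-1}, \omega^{-2}, \dots$: $x_i \mapsto x_i \times \omega^{-i}$
▶ msb: $\mathbb{Z}_{257} \to \mathbb{Z}_2$, $x \mapsto \lfloor 2x/257 \rceil$, 128-bit output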

SLIDE 19

Tweaks to the construction

Problems because of the small parameters

1 Polynomials are non-invertible with high probability
  ▶ Product in a subspace
  ▶ Use only units for the key elements

2 Rounding from $\mathbb{Z}_{257}$ has a bias 1/257
  ▶ Output bits biased
  ▶ Combine bits to reduce bias: SPRING-BCH
  ▶ Or use $\mathbb{Z}_{514}$: SPRING-CRT
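The bias figure can be reproduced by exhaustive counting. The sketch below assumes the rounding x ↦ ⌊2x/q⌉ mod 2 from the diagram slide and measures bias as Pr[0] − Pr[1] for a uniform input; `round_bit` and `bias` are hypothetical helper names.

```python
from fractions import Fraction

def round_bit(x, q):
    """x -> round(2x/q) mod 2, computed with integers only."""
    return ((4 * x + q) // (2 * q)) % 2

def bias(q):
    """Pr[bit = 0] - Pr[bit = 1] for x uniform in Z_q."""
    ones = sum(round_bit(x, q) for x in range(q))
    return Fraction(q - 2 * ones, q)

b257 = bias(257)   # Fraction(1, 257): 129 inputs round to 0, 128 round to 1
b514 = bias(514)   # Fraction(0, 1): an even modulus splits evenly
```

With an odd modulus the two rounding classes cannot have equal size, which is why the even modulus 514 removes the bias.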

SLIDE 23

SPRING-BCH

▶ Reduce the bias by combining output bits
  ▶ Piling-up lemma: bias(a ⊕ b) = bias(a) ⋅ bias(b)

▶ Multiply with the transpose of the generating matrix of a code
  ▶ Syndrome for the dual code
  ▶ Any linear combination of output bits is the sum of d biased bits
  ▶ Bias reduced exponentially in d

▶ We use an extended BCH code
  ▶ Efficient
  ▶ Best known distance

▶ Efficiency loss: only 64 output bits
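The exponential decrease in d follows from the piling-up lemma and can be verified exactly for small d. This sketch assumes the combined bits are independent and each has Pr[0] = 129/257 (bias 1/257, measured as Pr[0] − Pr[1]); `xor_bias` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

def xor_bias(p_zero, d):
    """Exact Pr[0] - Pr[1] for the XOR of d independent bits with Pr[0] = p_zero."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=d):
        prob = Fraction(1)
        for b in bits:
            prob *= p_zero if b == 0 else 1 - p_zero
        total += prob if sum(bits) % 2 == 0 else -prob
    return total

p0 = Fraction(129, 257)        # one rounded coefficient of Z_257
one_bit = xor_bias(p0, 1)      # 1/257
three_bits = xor_bias(p0, 3)   # (1/257)^3
```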

SLIDE 24

SPRING-CRT

▶ Use the ring $R_{514} = \mathbb{Z}_{514}[X]/(X^n + 1)$
  ▶ Unbiased rounding from $\mathbb{Z}_{514}$

▶ Chinese Remainder decomposition: $R_{514} \cong R_{257} \times R_2$
  ▶ Compute modulo 257 and modulo 2, combine outputs

▶ Computation in $R_2$:
  ▶ Efficient algorithms for subset-product in the paper
  ▶ In counter mode: single multiplication using PCLMUL, or tables
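For a single coefficient, the "combine outputs" step amounts to recombining a residue mod 257 and a residue mod 2 into one element of ℤ514 before rounding. A minimal sketch, assuming the rounding x ↦ ⌊2x/514⌉ mod 2; `crt_combine` and `round_514` are hypothetical names and this is not taken from the paper:

```python
def crt_combine(u257, u2):
    """Unique x in Z_514 with x = u257 (mod 257) and x = u2 (mod 2)."""
    # Adding 257 (odd) flips the parity without changing the residue mod 257.
    return u257 if u257 % 2 == u2 else u257 + 257

def round_514(x):
    """x -> round(2x/514) mod 2, i.e. round(x/257) mod 2."""
    return ((2 * x + 257) // 514) % 2

# Usage: the coefficient is 200 mod 257 and 1 mod 2, so x = 457 in Z_514.
x = crt_combine(200, 1)
bit = round_514(x)
```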

SLIDE 25

Implementation

▶ Implementation using SIMD instructions
  ▶ Compute operations in parallel on vectors of data
  ▶ SSE2 on Intel/AMD x86: desktop (Core) and embedded (Atom)
  ▶ NEON on ARM: embedded CPU (Cortex A in smartphones, tablets)

▶ Subset sum optimized with precomputed tables
  ▶ 2-bit inputs: $[0, s_0, s_1, s_0 + s_1]$
  ▶ 8-bit inputs: 256 entries

▶ Multiplication in $R_2$ using PCLMUL instruction (if available), or precomputed tables

▶ Bottleneck is FFT
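The precomputed-table idea for the subset sum can be sketched as follows. This is scalar Python rather than SIMD code, it assumes key elements are vectors of 128 bytes added mod 256, and `build_tables` and `subset_sum` are hypothetical names.

```python
import random

N = 128  # words per key element, each in Z_256

def add_vec(u, v):
    return [(a + b) % 256 for a, b in zip(u, v)]

def build_tables(key, width=8):
    """One table per group of `width` input bits; entry m holds the sum of the
    key elements selected by the bits of m (2**width entries per full group)."""
    tables = []
    for start in range(0, len(key), width):
        table = [[0] * N]
        for sj in key[start:start + width]:
            # Doubling step: each existing entry, with and without sj.
            table += [add_vec(row, sj) for row in table]
        tables.append(table)
    return tables

def subset_sum(tables, x_bits, width=8):
    """Sum of the key elements with x_j = 1, using one lookup per bit group."""
    acc = [0] * N
    for t, table in enumerate(tables):
        chunk = x_bits[t * width:(t + 1) * width]
        index = sum(b << i for i, b in enumerate(chunk))
        acc = add_vec(acc, table[index])
    return acc

# Usage: a 16-bit input needs two lookups instead of up to 16 vector additions.
rng = random.Random(7)
key = [[rng.randrange(256) for _ in range(N)] for _ in range(16)]
x_bits = [rng.randrange(2) for _ in range(16)]
state = subset_sum(build_tables(key), x_bits)
```

Wider groups trade memory (256 vectors per group at 8 bits) for fewer additions per evaluation.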

SLIDE 26

FFT implementation tricks

▶ Reuse efficient FFT from the SIMD hash function
▶ Decompose FFT as a two-dimensional FFT
  ▶ Parallel FFT on lines and columns

▶ Elements in $\mathbb{Z}_{257}$ as 16-bit words
▶ Partial reduction mod 257 with (x & 255) - (x >> 8)
  ▶ Output in [−127, 383]

▶ Multiplication in $\mathbb{Z}_{257}$ using 16-bit signed multiplication
  ▶ Reduce operands to [−128, 128] beforehand
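The partial reduction works because 256 ≡ −1 (mod 257): writing x = 256·hi + lo gives x ≡ lo − hi. A scalar sketch of the arithmetic, taking the low byte as `x & 255` and using Python integers in place of 16-bit SIMD lanes; `partial_reduce`, `centered`, and `mul257` are hypothetical names.

```python
def partial_reduce(x):
    """Partial reduction mod 257 of a 16-bit signed value: low byte minus high byte."""
    return (x & 255) - (x >> 8)

def centered(x):
    """Representative of x mod 257 in [-128, 128]."""
    r = x % 257
    return r - 257 if r > 128 else r

def mul257(a, b):
    """Product mod 257 whose intermediate value fits a signed 16-bit word."""
    prod = centered(a) * centered(b)   # |prod| <= 128 * 128 = 16384
    return partial_reduce(prod)        # congruent to a * b, not fully reduced

# Usage: the result is only congruent to the product; a final % 257 normalizes it.
r = mul257(200, 100) % 257
```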

SLIDE 27

Performance

▶ 20-30 cycles/byte on Core i7 using SSE
  ▶ Slow for a stream cipher, fast enough for practical use

▶ SPRING-CRT-CTR is about 4.5 times slower than AES-CTR
  ▶ Excluding hardware AES instructions
  ▶ Same ratio on a range of architectures

Cycles per byte:

                     SPRING-BCH       SPRING-CRT       AES-CTR
                     Single   CTR     Single   CTR     no AES-NI   AES-NI
ARM Cortex A15       220      170     250      77      17.8        N/A
Atom                 247      137     235      76      17          N/A
Core i7 Nehalem      74       60      76       29.5    6.9         N/A
Core i7 Ivy Bridge   60       46      62       23.5    5.4         1.3

SLIDE 28

Conclusion

S: Subset Product with Rounding over a ring

▶ Strong algebraic structure

▶ Simple design ▶ Subset sum, table lookup, FFT, table lookup with small output ▶ Large linear part good for masking, MPC

▶ Based on a design with security reduction

▶ Security reduction does not apply with small parameters ▶ Cryptanalysis is needed to evaluate the security ▶ Expected security: about 128 bit

▶ High parallelism

▶ Reasonable performances with vector instructions ▶ Good performances in hardware?

SLIDE 29

Pseudo-code for SPRING

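Implementation key: $(\hat{a}_i)_{i=0}^{127}$, $(\hat{s}_{ij})_{i=0,\dots,127;\ j=0,\dots,k-1} \in \mathbb{Z}_{256}$

Input: $x_1, x_2, \dots, x_k \in \{0, 1\}$

for $0 \le i < 128$ do
    $u_i \leftarrow \hat{a}_i + \sum_j x_j \hat{s}_{ij} \bmod 256$
    $u_i \leftarrow 3^{u_i} \bmod 257$
$\vec{u} \leftarrow \mathrm{FFT}^{-1}_{128}(\vec{u})$
for $0 \le i < 128$ do
    $u_i \leftarrow u_i \cdot \omega^{-i} \bmod 257$
    $y_i \leftarrow \lfloor 2 \cdot u_i / 257 \rceil$
return $\vec{y}$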
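The pseudo-code can be turned into a runnable sketch along the following lines. Several points are assumptions rather than the authors' implementation: a naive quadratic transform replaces the real FFT, the 1/128 factor of the inverse transform is applied explicitly, the rounded value is reduced mod 2 to give one bit per coefficient, and the key is sampled directly in the log domain. `spring_eval` and `dft` are hypothetical names; negative exponents in `pow` need Python 3.8 or later.

```python
import random

P, N, OMEGA, G = 257, 128, 41, 3   # constants quoted on the slides

EXP = [pow(G, e, P) for e in range(256)]   # exp table: Z_256 -> nonzero Z_257

def dft(v, root):
    """Naive O(N^2) transform standing in for a real FFT."""
    return [sum(c * pow(root, i * m, P) for m, c in enumerate(v)) % P
            for i in range(len(v))]

def spring_eval(a_hat, s_hat, x):
    # Subset sum of the log-domain key, one byte per coefficient.
    u = [(a_hat[i] + sum(xj * sj[i] for xj, sj in zip(x, s_hat))) % 256
         for i in range(N)]
    # exp step: from logs back to values in Z_257.
    u = [EXP[e] for e in u]
    # Inverse transform over the 128th roots of unity, 1/N made explicit.
    n_inv = pow(N, -1, P)
    u = [c * n_inv % P for c in dft(u, pow(OMEGA, -2, P))]
    # Multiply coefficient i by OMEGA^(-i), then round to one bit.
    u = [c * pow(OMEGA, -i, P) % P for i, c in enumerate(u)]
    return [((4 * c + P) // (2 * P)) % 2 for c in u]

# Toy usage with k = 4 and a random log-domain key.
rng = random.Random(2014)
k = 4
a_hat = [rng.randrange(256) for _ in range(N)]
s_hat = [[rng.randrange(256) for _ in range(N)] for _ in range(k)]
y = spring_eval(a_hat, s_hat, [1, 0, 1, 1])   # 128 output bits
```

Sampling the key in the log domain means every transformed coefficient is nonzero, so the corresponding key polynomials are units by construction.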
