McBits Revisited, ia.cr/2017/793, Tung Chou, Osaka University, Japan

slide-1
SLIDE 1

McBits Revisited

ia.cr/2017/793 Tung Chou

Osaka University, Japan

slide-2
SLIDE 2

Code-based cryptography (encryption)

  • Sender: m
  • Noisy channel: m + e = r
  • Receiver: r, which differs from m wherever e is nonzero

slide-3
SLIDE 3

Code-based cryptography (encryption)

  • Sender: c = mG
  • Noisy channel: c + e = r
  • Receiver: c, e = Decode(r)

slide-5
SLIDE 5

Code-based cryptography (encryption)

  • Sender: r = mG + e
  • Sent to the receiver: r
  • Receiver: c, e = Decode(r)

  • McEliece (1978) using binary Goppa codes remains secure.
  • Niederreiter as the dual system.
  • Confidence-inspiring post-quantum cryptosystems.
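
The relation r = mG + e and the step c, e = Decode(r) can be tried out on a toy code. The sketch below is only an illustration and assumes the [7,4] Hamming code (one correctable error) in place of a binary Goppa code; the matrices G and H and the helper names encode and decode are hypothetical choices made for this example, and nothing here is secure.

```python
# Toy stand-in for a Goppa code: the [7,4] Hamming code, which corrects 1 error.
G = [  # generator matrix in systematic form [I | P]
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [  # parity-check matrix [P^T | I], so every codeword c satisfies H c = 0
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(m):
    """c = mG over GF(2)."""
    return [sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def decode(r):
    """Return (c, e) for a received word r that contains at most one error."""
    syndrome = [sum(H[i][j] * r[j] for j in range(7)) % 2 for i in range(3)]
    e = [0] * 7
    if any(syndrome):
        # A single error at position p makes the syndrome equal to column p of H.
        columns = [[H[i][col] for i in range(3)] for col in range(7)]
        e[columns.index(syndrome)] = 1
    c = [r[j] ^ e[j] for j in range(7)]
    return c, e

m = [1, 0, 1, 1]
e = [0, 0, 0, 0, 0, 1, 0]                       # weight-1 error added by the sender
r = [ci ^ ei for ci, ei in zip(encode(m), e)]   # r = mG + e
c, e_found = decode(r)
```

Because G is systematic in this toy setting, the message is simply the first four bits of the recovered codeword c.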

slide-7
SLIDE 7

The old and the new McBits

The old McBits (2013)

  • “McBits: Fast constant-time code-based cryptography” by Daniel J. Bernstein, Tung Chou, and Peter Schwabe
  • Bitslicing, non-conventional algorithms for decoding
  • Using external parallelism
  • High throughput, high latency

The new McBits (2017)

  • Using internal parallelism
  • High throughput, low latency

slide-11
SLIDE 11

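Bitslicing

“Simulating w copies of a circuit using bitwise logical operations.”

One w-bit word holds the bits $b_0, \ldots, b_{w-1}$.

  • McBits 2013: the w bits belong to w different instances (Inst. 1 to Inst. w).
  • McBits 2017: all w bits belong to the same instance (Inst. 1).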
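
As a small illustration of the quoted definition, the sketch below runs w = 8 copies of a one-bit full adder at once. The circuit, the word size W, and the helpers pack and unpack are assumptions picked for this example; they are not taken from the McBits software.

```python
W = 8  # number of circuit copies simulated per word (hypothetical choice)

def pack(bits):
    """Store one bit per instance: instance i lives at bit position i."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word

def unpack(word):
    """Recover the per-instance bits from a word."""
    return [(word >> i) & 1 for i in range(W)]

def full_adder(a, b, cin):
    """A one-bit full adder built from gates only, so it also works on whole words."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

# Eight independent inputs, one per bit position.
a_bits = [0, 1, 0, 1, 0, 1, 0, 1]
b_bits = [0, 0, 1, 1, 0, 0, 1, 1]
c_bits = [0, 0, 0, 0, 1, 1, 1, 1]

s_word, cout_word = full_adder(pack(a_bits), pack(b_bits), pack(c_bits))
sums, carries = unpack(s_word), unpack(cout_word)
```

Each XOR, AND, or OR on a word evaluates the same gate in all eight copies, which is the sense in which the 2013 software spreads a word over many instances and the 2017 software spreads it over the data of one instance.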

slide-12
SLIDE 12

Speeds

reference    m   n     t    bytes    sec  perm   synd   key eq  root   all     arch
McBits 2013  13  6624  115  958482   252  23140  83127  102337  65050  444971  IB
             13  6960  119  1046739  263  23020  83735  109805  66453  456292  IB
McBits 2017  13  8192  128  1357824  297  3783   62170  170576  53825  410132  IB
                                          3444   36076  127070  34491  275092  HW

Timings for decoding

key-generation  encryption  decryption  arch
1552717680      312135      492404      IB
1236054840      289152      343344      HW

Timings for key generation, encryption, and decryption

slide-15
SLIDE 15

Decoder

  • Received word: r = c + e
  • Syndrome computation ≈ transposed multi-point evaluation: permutation + transposed FFT
  • Key-equation solving: BM
  • Root finding ≈ multi-point evaluation: additive FFT + permutation
  • Output: e

slide-16
SLIDE 16

Beneš network

  • if c, swap(b0, b1)
  • d ← b0 ⊕ b1; d ← cd; b0 ← b0 ⊕ d; b1 ← b1 ⊕ d;
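
The four assignments above translate directly into a branch-free conditional swap. The sketch below is a minimal version; the function name cswap is hypothetical, and the multiplication cd is written as a bitwise AND so that the same code also handles whole words of control bits.

```python
def cswap(b0, b1, c):
    """Swap b0 and b1 wherever c has a 1-bit, without branching on c."""
    d = b0 ^ b1   # d <- b0 xor b1
    d &= c        # d <- c * d (on bits, multiplication is AND)
    return b0 ^ d, b1 ^ d

# Single bits: c = 1 swaps, c = 0 leaves the pair alone.
swapped = cswap(0, 1, 1)
kept = cswap(0, 1, 0)

# Words: only the positions selected by c are exchanged.
# Here only bit 2 is exchanged, giving x == 0b1000 and y == 0b1110.
x, y = cswap(0b1100, 0b1010, 0b0100)
```

Since the sequence of operations does not depend on c, the control bits of the network can stay secret.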

slide-17
SLIDE 17

Beneš network

[Figure, built up over slides 17 to 24: a Beneš network diagram with numbered labels up to 15; successive slides highlight Stage 1 through Stage 7.]

slide-25
SLIDE 25

Bit-matrix transposition

[Figure, built up over slides 25 to 27: a small matrix with entries labelled 3 2 1, shown again with the labels in the order 3 1 2.]

slide-28
SLIDE 28

Bit-matrix transposition

[Figure, built up over slides 28 to 32: a matrix with entries labelled up to 15; the first step highlights entries 8 9 12 13 2 3 6 7, the second step highlights entries 12 14 9 11 4 6 1 3.]
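
The block-swapping idea behind bit-matrix transposition can be sketched with masks and shifts. The code below is an illustrative version only: the function name transpose_bits, the row-as-integer layout, and the power-of-two size are assumptions for this example, not the routine used in the McBits software.

```python
def transpose_bits(rows):
    """Transpose a w x w bit matrix, w a power of two.

    rows[i] is an integer whose bit j is the entry in row i, column j.
    """
    w = len(rows)
    rows = list(rows)
    s = w // 2
    while s >= 1:
        # mask selects the columns whose index has bit s cleared
        mask = 0
        for j in range(w):
            if (j & s) == 0:
                mask |= 1 << j
        for i in range(w):
            if (i & s) == 0:
                lo, hi = rows[i], rows[i + s]
                # Exchange the off-diagonal blocks of each 2s x 2s block:
                # (row bit s = 0, column bit s = 1) <-> (row bit s = 1, column bit s = 0).
                rows[i] = (lo & mask) | ((hi & mask) << s)
                rows[i + s] = ((lo >> s) & mask) | (hi & ~mask)
        s //= 2
    return rows

# A fixed 8 x 8 example matrix.
matrix = [(i * 37 + 11) & 0xFF for i in range(8)]
transposed = transpose_bits(matrix)
```

Each pass exchanges one bit of the row index with the same bit of the column index, so log2(w) passes of word operations replace w*w single-bit moves.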

slide-33
SLIDE 33

The Gao–Mateer Additive FFT

  • Multiplicative FFT: $f(x) = f^{(0)}(x^2) + x f^{(1)}(x^2)$
  • Additive FFT: $f(x) = f^{(0)}(x^2 + x) + x f^{(1)}(x^2 + x)$
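
The additive decomposition can be checked on small inputs. The sketch below is a naive illustration that finds the two halves by repeated division by x^2 + x; it assumes coefficients in GF(2) only (the slides work over a larger binary field), stores a polynomial as an integer with bit i holding the coefficient of x^i, and uses hypothetical helper names. It is not the recursive radix conversion of Gao and Mateer, although it produces the same decomposition.

```python
def clmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials stored as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def divmod_x2_plus_x(f):
    """Divide f by x^2 + x over GF(2); return (quotient, remainder of degree < 2)."""
    q = 0
    while f.bit_length() > 2:
        shift = f.bit_length() - 3      # degree of f minus 2
        q |= 1 << shift
        f ^= 0b110 << shift             # cancel the leading term
    return q, f

def radix_convert(f):
    """Return (f0, f1) with f(x) = f0(x^2 + x) + x * f1(x^2 + x)."""
    f0 = f1 = 0
    i = 0
    while f:
        f, r = divmod_x2_plus_x(f)
        f0 |= (r & 1) << i              # constant term of the i-th digit
        f1 |= (r >> 1) << i             # x term of the i-th digit
        i += 1
    return f0, f1

def compose(g, y):
    """Evaluate g at the polynomial y with Horner's rule."""
    acc = 0
    for i in reversed(range(g.bit_length())):
        acc = clmul(acc, y) ^ ((g >> i) & 1)
    return acc

f = 0b10110101                          # an arbitrary polynomial of degree 7
f0, f1 = radix_convert(f)
rebuilt = compose(f0, 0b110) ^ clmul(0b10, compose(f1, 0b110))
```

Both halves have about half the degree of f, which is what lets the FFT recurse on two smaller problems.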

slide-34
SLIDE 34

Additive FFT (butterflies)

[Figure, built up over slides 34 to 36: butterfly diagrams labelled “Full” FFT, transpose, and Low-degree FFT.]

slide-37
SLIDE 37

Additive FFT (radix conversions)

[Figure, built up over slides 37 to 42: the coefficients f0 f1 f2 f3 f4 f5 f6 f7 combined by additions (+), with a further step marked “∗ = α”.]

  • Additions: logical operations &, ^, ≫, ≪.
  • Bitsliced multiplications.
  • Small polynomial degree ⇒ relatively cheap.

slide-43
SLIDE 43

Berlekamp-Massey algorithm

Picture from: “Implementation of Berlekamp-Massey algorithm without inversion” by Xu Youzhi
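
For readers who want to see the algorithm itself, the sketch below is the textbook Berlekamp-Massey iteration over GF(2), where it finds the shortest linear recurrence generating a bit sequence. This is a simplifying assumption: the decoder in the talk works over a larger binary field and the cited picture shows an inversion-free variant, neither of which is reproduced here. The function name and the test sequence are hypothetical.

```python
def berlekamp_massey_gf2(s):
    """Return (c, L): the shortest recurrence sum_{i=0..L} c[i] * s[N-i] = 0 over GF(2)."""
    n = len(s)
    c = [0] * n          # current connection polynomial
    b = [0] * n          # copy of c from before the last length change
    c[0] = b[0] = 1
    L, m = 0, -1
    for N in range(n):
        # discrepancy between s[N] and the value predicted by the current recurrence
        d = s[N]
        for i in range(1, L + 1):
            d ^= c[i] & s[N - i]
        if d:
            t = c[:]
            for i in range(n - N + m):
                c[N - m + i] ^= b[i]
            if 2 * L <= N:
                L = N + 1 - L
                m = N
                b = t
    return c[:L + 1], L

# Sequence generated by s[N] = s[N-1] xor s[N-3], starting from 1, 0, 0.
seq = [1, 0, 0]
while len(seq) < 12:
    seq.append(seq[-1] ^ seq[-3])

poly, length = berlekamp_massey_gf2(seq)
```

On this input the recovered recurrence has length 3 and coefficients [1, 1, 0, 1], matching the rule used to generate the sequence.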

slide-45
SLIDE 45

Key generation

Public-key generation

  • Constant-time Gaussian elimination in $F_2$.
  • $H \to (I \mid H')$

Secret-key generation

  • Goppa polynomial: degree-t, irreducible $g \in F_{2^m}[x]$.
  • Generate a random element $\alpha \in F_{2^{mt}}$.
  • Derive the minimal polynomial of $\alpha$ with Gaussian elimination in $F_{2^m}$.
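
The public-key step can be sketched as Gaussian elimination over GF(2) in which every row operation is applied through a mask instead of a data-dependent branch. The code below is a minimal illustration under stated assumptions: rows are integers with bit j as column j, the identity is formed on the first k columns, and the name reduce_to_systematic is hypothetical. Python integers do not give real constant-time guarantees, so this only shows the control flow.

```python
def reduce_to_systematic(rows, k):
    """Row-reduce a k-row GF(2) matrix so that columns 0..k-1 form the identity.

    Returns the reduced rows, or None if the leading k x k block is singular.
    """
    rows = list(rows)
    for j in range(k):
        # Make sure row j has a 1 in column j by conditionally adding lower rows.
        for i in range(j + 1, k):
            mask = -(((rows[j] >> j) & 1) ^ 1)   # all ones iff the pivot is still 0
            rows[j] ^= rows[i] & mask
        if ((rows[j] >> j) & 1) == 0:
            return None                          # no pivot: the caller would retry
        # Clear column j in every other row, again through a mask.
        for i in range(k):
            if i != j:                           # depends on public indices only
                mask = -((rows[i] >> j) & 1)
                rows[i] ^= rows[j] & mask
    return rows

H_rows = [0b1010011, 0b0110110, 0b1100100]       # a 3 x 7 example matrix
reduced = reduce_to_systematic(H_rows, 3)
singular = reduce_to_systematic([0b011, 0b110, 0b101], 3)
```

The loops always visit the same rows in the same order; only the mask values depend on the matrix entries.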

slide-46
SLIDE 46

tungchou.github.io/mcbits/