Saber on ARM
CCA-secure module lattice-based key encapsulation on ARM
Angshuman Karmakar
Joint work with: Jose Maria Bermudo Mera, Sujoy Sinha Roy, Ingrid Verbauwhede
CHES, Amsterdam, 11th September 2018
Saber: CCA secure post-quantum KEM*
○ Easy rounding
○ Easy modular reduction in HW/SW
○ Precludes use of NTT
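The first two bullets can be illustrated with a minimal sketch. It assumes power-of-two moduli q = 2^13 and p = 2^10 as given in the Saber specification (the slide text itself does not state them); the function names are hypothetical.

```c
#include <stdint.h>

#define EQ 13  /* log2(q); q = 2^13 is an assumption taken from the Saber spec */
#define EP 10  /* log2(p); p = 2^10 is an assumption taken from the Saber spec */

/* Reduction modulo q = 2^EQ is a single bit mask. */
static uint16_t mod_q(uint16_t x) {
    return (uint16_t)(x & ((1u << EQ) - 1));
}

/* Rounding from Z_q to Z_p: add half of the dropped range, reduce mod q,
   then shift away the EQ - EP low bits. No division is needed. */
static uint16_t round_q_to_p(uint16_t x) {
    const uint16_t h = (uint16_t)(1u << (EQ - EP - 1));
    return (uint16_t)(((x + h) & ((1u << EQ) - 1)) >> (EQ - EP));
}
```

Both operations are one or two ALU instructions, with no conditional subtraction or Barrett/Montgomery step.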
* J.-P. D’Anvers, A. Karmakar, S. Sinha Roy, and F. Vercauteren. Saber: Module-LWR based key exchange, CPA-secure encryption and CCA-secure KEM. In A. Joux, A. Nitaj, and T. Rachidi, editors, Progress in Cryptology – AFRICACRYPT 2018. https://eprint.iacr.org/2018/230.pdf
Toom-Cook+Karatsuba+School-book
[Figure: polynomial multiplication A × B in three layers. Toom-Cook 4-way turns one multiplication of 256-coefficient polynomials into 7 multiplications of 64-coefficient polynomials. Karatsuba 2-level turns each of these into 9 multiplications of 16-coefficient polynomials. The 16-coefficient multiplications are done with School-book.]
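The lower two layers named on this slide can be sketched as follows: a 2-level Karatsuba on 64 coefficients that bottoms out in 9 school-book multiplications of 16 coefficients. This is an illustrative sketch, not the authors' code; the function names, the call counter, and the use of wrapping 16-bit arithmetic are assumptions. The Toom-Cook layer and its interpolation are omitted.

```c
#include <stdint.h>
#include <string.h>

#define SB_N 16        /* school-book threshold, as on the slide */
static int sb_calls;   /* hypothetical counter, only for illustration */

/* School-book product of two n-coefficient polynomials.
   r receives 2n-1 coefficients; arithmetic wraps modulo 2^16. */
static void schoolbook(const uint16_t *a, const uint16_t *b, uint16_t *r, int n) {
    sb_calls++;
    memset(r, 0, (size_t)(2 * n - 1) * sizeof *r);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            r[i + j] = (uint16_t)(r[i + j] + (uint32_t)a[i] * b[j]);
}

/* Recursive Karatsuba for n <= 64 (n a power of two).
   Starting from n = 64 this recurses twice: 3 * 3 = 9 school-book calls. */
static void karatsuba(const uint16_t *a, const uint16_t *b, uint16_t *r, int n) {
    if (n <= SB_N) { schoolbook(a, b, r, n); return; }
    int h = n / 2;
    uint16_t sa[32], sb[32], lo[63], hi[63], mid[63];  /* sized for n <= 64 */
    for (int i = 0; i < h; i++) {
        sa[i] = (uint16_t)(a[i] + a[i + h]);
        sb[i] = (uint16_t)(b[i] + b[i + h]);
    }
    karatsuba(a, b, lo, h);          /* A_lo * B_lo */
    karatsuba(a + h, b + h, hi, h);  /* A_hi * B_hi */
    karatsuba(sa, sb, mid, h);       /* (A_lo + A_hi) * (B_lo + B_hi) */
    memset(r, 0, (size_t)(2 * n - 1) * sizeof *r);
    for (int i = 0; i < 2 * h - 1; i++) {
        uint16_t m = (uint16_t)(mid[i] - lo[i] - hi[i]);  /* cross term */
        r[i]         = (uint16_t)(r[i] + lo[i]);
        r[i + h]     = (uint16_t)(r[i + h] + m);
        r[i + 2 * h] = (uint16_t)(r[i + 2 * h] + hi[i]);
    }
}
```

With Toom-Cook 4-way on top, the 7 products of 64 coefficients each cost 9 school-book products, i.e. 7 × 9 = 63 school-book multiplications of 16 coefficients per 256-coefficient product.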
○ Cortex-M0: XMC2Go by Infineon
○ Cortex-M4: STM32F4-discovery by STMicroelectronics
○ … instructions
○ … and Cortex-M4
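[Figure: multiplication on Cortex-M4 with two 16-bit coefficients packed in each 32-bit register. Each step computes rb[0/1] * rc[0/1] + rd; done one product at a time, this takes 16 instructions. SMLADX computes rb[0] * rc[1] + rb[1] * rc[0] + rd in a single instruction, reducing the instruction count for that step from 2 to 1 and the total instruction count to 12, a 25% reduction. Similarly, using PKHBT together with SMLADX brings the total instruction count to 11.]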
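The two instructions named on this slide can be modelled in portable C to show what they compute. This is a behavioural sketch of the ARMv7E-M semantics of SMLADX and PKHBT, not the authors' assembly; the helper pack16 and the register names are hypothetical.

```c
#include <stdint.h>

/* Pack two signed 16-bit coefficients into one 32-bit register image. */
static uint32_t pack16(int16_t lo, int16_t hi) {
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* Model of SMLADX rd, rn, rm, ra:
   rd = ra + rn[15:0] * rm[31:16] + rn[31:16] * rm[15:0]  (signed halves).
   The result is kept as uint32_t so that wrap-around is well defined. */
static uint32_t smladx(uint32_t rn, uint32_t rm, uint32_t ra) {
    int16_t n_lo = (int16_t)(rn & 0xFFFFu), n_hi = (int16_t)(rn >> 16);
    int16_t m_lo = (int16_t)(rm & 0xFFFFu), m_hi = (int16_t)(rm >> 16);
    return ra + (uint32_t)(n_lo * m_hi) + (uint32_t)(n_hi * m_lo);
}

/* Model of PKHBT rd, rn, rm, LSL #shift:
   bottom half from rn, top half from (rm << shift). */
static uint32_t pkhbt(uint32_t rn, uint32_t rm, unsigned shift) {
    return (rn & 0x0000FFFFu) | ((rm << shift) & 0xFFFF0000u);
}
```

For rb = (b0, b1) and rc = (c0, c1), smladx(rb, rc, rd) yields b0·c1 + b1·c0 + rd, which is the middle coefficient of (b0 + b1·x)(c0 + c1·x) accumulated onto rd. On the target, the CMSIS intrinsic `__SMLADX` or inline assembly would take the place of the model.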
○ … divided into 4 smaller polynomials A3-A0, each with 64 coefficients
○ … A0-A3, i.e. 256 memory accesses
[Figure: Toom-Cook evaluation in registers. The i-th coefficients of A0, A1, A2 and A3 are loaded into registers and combined there, for example as a0_i + a1_i + a2_i + a3_i, to produce the i-th coefficients of the evaluated polynomials (labelled aw1, aw2, aw5 in the figure).]
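A sketch of the idea on these slides: each coefficient of A0..A3 is read once into a local variable (a register on the target) and all evaluations are formed from those values, so one pass costs 4 × 64 = 256 loads. The evaluation points used here (∞, 2, 1, -1, 1/2, -1/2, 0, with the half points scaled by 8) are a common Toom-Cook 4-way choice and an assumption; the slide text does not list them. Names are hypothetical.

```c
#include <stdint.h>

#define N4 64  /* coefficients per limb A0..A3 (256 / 4) */

/* a holds 256 coefficients as A0 | A1 | A2 | A3.
   w[k][i] is the i-th coefficient of the k-th evaluated polynomial.
   Arithmetic wraps modulo 2^16. */
static void tc4_evaluate(const uint16_t *a, uint16_t w[7][N4]) {
    const uint16_t *A0 = a, *A1 = a + N4, *A2 = a + 2 * N4, *A3 = a + 3 * N4;
    for (int i = 0; i < N4; i++) {
        /* one load per limb; everything below works on these four values */
        uint16_t r0 = A0[i], r1 = A1[i], r2 = A2[i], r3 = A3[i];
        uint16_t even = (uint16_t)(r0 + r2), odd = (uint16_t)(r1 + r3);

        w[0][i] = r3;                                        /* point infinity */
        w[1][i] = (uint16_t)(r0 + 2 * r1 + 4 * r2 + 8 * r3); /* A(2)          */
        w[2][i] = (uint16_t)(even + odd);                    /* A(1)          */
        w[3][i] = (uint16_t)(even - odd);                    /* A(-1)         */
        w[4][i] = (uint16_t)(8 * r0 + 4 * r1 + 2 * r2 + r3); /* 8 * A(1/2)    */
        w[5][i] = (uint16_t)(8 * r0 - 4 * r1 + 2 * r2 - r3); /* 8 * A(-1/2)   */
        w[6][i] = r0;                                        /* A(0)          */
    }
}
```

Evaluating each point in its own pass over A0..A3 would repeat those 256 loads once per point.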
[Figure: generation of the 3×3 matrix (a00 a01 a02; a10 a11 a12; a20 a21 a22) from a random seed with SHAKE-128(). Size shown: 3744 bytes.]
… each element polynomial.
[Figure: the same matrix generated from the random seed with Keccak-absorb() followed by Keccak-squeeze() on the state S. Size shown: 280 bytes.]
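The 3744-byte figure matches 9 polynomials × 256 coefficients × 13 bits / 8, assuming 13-bit coefficients as in the Saber specification. The sketch below shows the streaming pattern suggested by the absorb/squeeze slides: absorb the seed once, then squeeze and unpack one polynomial at a time instead of buffering the whole matrix. The toy_xof functions are a deliberately trivial stand-in for Keccak (not cryptographic, not SHAKE-128), and every name here is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define SABER_L 3     /* matrix dimension, from the 3x3 figure */
#define SABER_N 256   /* coefficients per polynomial */
#define SABER_EQ 13   /* bits per coefficient; assumption from the Saber spec */
#define POLY_BYTES (SABER_N * SABER_EQ / 8)              /* 416  */
#define MATRIX_BYTES (SABER_L * SABER_L * POLY_BYTES)    /* 3744 */

/* Stand-in for the Keccak state: a toy generator, NOT a real XOF. */
typedef struct { uint32_t s; } toy_xof;

static void toy_absorb(toy_xof *x, uint32_t seed) { x->s = seed; }

static void toy_squeeze(toy_xof *x, uint8_t *out, size_t len) {
    for (size_t i = 0; i < len; i++) {
        x->s = x->s * 1664525u + 1013904223u;
        out[i] = (uint8_t)(x->s >> 24);
    }
}

/* Unpack 8 little-endian 13-bit coefficients from 13 bytes. */
static void unpack13(const uint8_t *in, uint16_t *c) {
    uint32_t acc = 0;
    int bits = 0, k = 0;
    for (int i = 0; i < 13; i++) {
        acc |= (uint32_t)in[i] << bits;
        bits += 8;
        while (bits >= 13) {
            c[k++] = (uint16_t)(acc & 0x1FFFu);
            acc >>= 13;
            bits -= 13;
        }
    }
}

/* Generate the next matrix element using only a 13-byte scratch buffer. */
static void gen_poly(toy_xof *x, uint16_t poly[SABER_N]) {
    uint8_t buf[13];
    for (int i = 0; i < SABER_N; i += 8) {
        toy_squeeze(x, buf, sizeof buf);
        unpack13(buf, poly + i);
    }
}
```

Because the squeeze is stateful, calling gen_poly nine times after one absorb yields the same coefficients as squeezing all 3744 bytes up front; only the buffer requirement changes.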
Cryptosystem    Platform    Key generation       Encapsulation        Decapsulation        Multiplication
                            [Kcycles]/[bytes]    [Kcycles]/[bytes]    [Kcycles]/[bytes]    [type]
NewHope-CCA*    Cortex-M4   1,246 / 11,160       1,966 / 17,456       1,977 / 19,656       NTT
Kyber*          Cortex-M4   1,200 / 10,304       1,497 / 13,464       1,526 / 14,624       NTT
Saber-speed     Cortex-M4   1,147 / 13,883       1,444 / 16,667       1,543 / 17,763       TC+Kara+SB
Saber-memory    Cortex-M4   1,165 / 6,931        1,530 / 7,019        1,635 / 8,115        TC+Kara+SB
Saber-mem-M0    Cortex-M0   4,786 / 5,031        6,328 / 5,119        7,509 / 6,215        TC+Kara+SB
* pqm4: post-quantum crypto library for the ARM Cortex-M4. https://github.com/mupq/pqm4, 2018. [Accessed 15-April-2018]
○ … constrained devices
○ … multiplication can be very competitive