
Highly Efficient Architecture of NewHope-NIST on FPGA using Low-Complexity NTT/INTT



  1. CHES 2020: Highly Efficient Architecture of NewHope-NIST on FPGA using Low-Complexity NTT/INTT
     Neng Zhang, Bohan Yang, Chen Chen, Shouyi Yin, Shaojun Wei and Leibo Liu
     Institute of Microelectronics, Tsinghua University, Beijing, China

  2. Outline
     1. Introduction
     2. Low-Complexity NTT/INTT
     3. Hardware Architecture
     4. Implementation Results

  3. 1 Introduction
     • NewHope: a PQC algorithm for key encapsulation (KEM). Variants: NewHope-USENIX, NewHope-Simple, NewHope-NIST.
     • A candidate in the 2nd round of the NIST PQC standardization process, but not in the 3rd round.
     • The low-complexity NTT/INTT can be used by other algorithms:
       ➢ PQC: CRYSTALS-Dilithium, qTesla, Falcon
       ➢ FHE: LTV, BFV

  4. 1 Introduction
     • Main mathematical objects of NewHope: polynomials over the ring $R_q = \mathbb{Z}_q[x]/(x^N + 1)$
       ➢ $q = 12289$
       ➢ $\omega_N$: primitive N-th root of unity over $\mathbb{Z}_q$
       ➢ $\gamma_{2N}$: square root of $\omega_N$
       ➢ $N = 1024$ or $512$
     • Encryption-based KEM
       ➢ Key generation: 2 NTTs
       ➢ Encryption: 2 NTTs, 1 INTT
       ➢ Decryption: 1 INTT
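To make the parameters above concrete, here is a minimal sketch that searches for a primitive 2N-th root of unity modulo q = 12289 and derives the N-th root from it. The function name `find_gamma` is a hypothetical helper, not something from the slides, and the search may return a different (equally valid) root than the one fixed by the NewHope specification.

```python
Q = 12289  # NewHope modulus from the slide

def find_gamma(n, q=Q):
    """Return some primitive 2n-th root of unity mod q (n must be a power of two)."""
    assert (q - 1) % (2 * n) == 0, "need q = 1 (mod 2n)"
    for x in range(2, q):
        g = pow(x, (q - 1) // (2 * n), q)  # g^(2n) = 1 by Fermat's little theorem
        # For power-of-two n, g has order exactly 2n iff g^n = -1 (mod q).
        if pow(g, n, q) == q - 1:
            return g
    raise ValueError("no primitive 2n-th root found")

for n in (512, 1024):
    gamma = find_gamma(n)       # plays the role of gamma_2N
    omega = gamma * gamma % Q   # omega_N = gamma_2N^2
    print(n, gamma, omega)
```

The check `g^n = -1` is enough only because 2N is a power of two; for other orders each prime divisor would have to be tested.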

  5. 1 Introduction
     • Multiplication over the ring $\mathbb{Z}_q[x]/f(x)$
       ➢ $f(x)$ is arbitrary: convolution theorem, with $q \equiv 1 \pmod{N}$
       ➢ $f(x) = x^N + 1$: Negative Wrapped Convolution (NWC), with $q \equiv 1 \pmod{2N}$
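The NWC case can be sketched as follows: scale both inputs by powers of $\gamma_{2N}$ (pre-processing), transform, multiply pointwise, transform back, then scale by $N^{-1}\gamma_{2N}^{-i}$ (post-processing). This is an illustrative sketch with a toy size N = 8 and direct O(N^2) transforms for readability; the helper names are assumptions, not from the slides.

```python
Q = 12289
N = 8  # toy size; 2N divides Q - 1, as NWC requires

def find_gamma(n, q=Q):
    """Hypothetical helper: some primitive 2n-th root of unity mod q."""
    for x in range(2, q):
        g = pow(x, (q - 1) // (2 * n), q)
        if pow(g, n, q) == q - 1:
            return g
    raise ValueError("no root found")

GAMMA = find_gamma(N)
OMEGA = GAMMA * GAMMA % Q

def ntt_plain(a, w):
    """Direct transform: A_i = sum_j a_j * w^(i*j) mod Q."""
    n = len(a)
    return [sum(a[j] * pow(w, i * j, Q) for j in range(n)) % Q for i in range(n)]

def nwc_multiply(a, b):
    """Multiply a and b in Z_Q[x]/(x^N + 1) via pre-processing, NTT, INTT, post-processing."""
    n = len(a)
    a_s = [a[j] * pow(GAMMA, j, Q) % Q for j in range(n)]   # pre-processing
    b_s = [b[j] * pow(GAMMA, j, Q) % Q for j in range(n)]
    c_hat = [x * y % Q for x, y in zip(ntt_plain(a_s, OMEGA), ntt_plain(b_s, OMEGA))]
    c_s = ntt_plain(c_hat, pow(OMEGA, -1, Q))               # unscaled inverse transform
    n_inv, g_inv = pow(n, -1, Q), pow(GAMMA, -1, Q)
    return [c_s[i] * n_inv * pow(g_inv, i, Q) % Q for i in range(n)]  # post-processing

def schoolbook_negacyclic(a, b):
    """Reference product with x^N replaced by -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % Q
            else:
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % Q
    return c
```

The pre- and post-processing loops are exactly the extra modular multiplications that the low-complexity NTT/INTT aims to remove.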

  6. 1 Introduction
     • Why do we need low complexity?
     [Figure: area versus speed; low complexity gives both low area and high speed]

  7. 2.1 Low-Complexity NTT
     • The cost of the pre-processing is considerable. Number of modular multiplications of the NTT: $(N/2)\log N + N$, where $(N/2)\log N$ comes from the FFT and $N$ from the pre-processing.
     • Low-complexity NTT
       ➢ A low-complexity NTT with twiddle factors computed on the fly [1].
       ➢ Merge the pre-processing into the DIT FFT with pre-computed twiddle factors.
     [1] S. Roy, et al., Compact ring-LWE cryptoprocessor. CHES 2014

  8. 2.1 Low-Complexity NTT
     • Derivation of the low-complexity NTT
       ➢ Inspired by the strategy of the Cooley-Tukey FFT
       ➢ Follows the divide-and-conquer method of the FFT that divides in the time domain (DIT)
       ➢ First, the pre-processing and the FFT are written together as a summation of N terms
       ➢ Second, the summation is split into two groups according to the parity of the index of $a$

  9. 2.1 Low-Complexity NTT
     • Derivation of the low-complexity NTT
       ➢ Third, the equation is grouped into two parts according to the size of the index $i$. $\hat{a}_i^{(0)}$ and $\hat{a}_i^{(1)}$ are N/2-point NTTs of $a_{2j}$ and $a_{2j+1}$
       ➢ In this way, an N-point NTT can be resolved with two N/2-point NTTs
     [Figure: recursive decomposition of an N-point NTT into two N/2-point NTTs, each into two N/4-point NTTs, and so on down to 2-point NTTs]
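The three derivation steps can be sketched as a recursion. Writing the merged transform as $\hat{a}_i = \sum_j a_j \gamma_{2N}^{j(2i+1)}$ and splitting by the parity of $j$ gives $\hat{a}_i = \hat{a}_i^{(0)} + \gamma_{2N}^{2i+1}\hat{a}_i^{(1)}$ and $\hat{a}_{i+N/2} = \hat{a}_i^{(0)} - \gamma_{2N}^{2i+1}\hat{a}_i^{(1)}$. The code below is a reconstruction under that reading of the slides, not the authors' implementation; it uses a toy size and computes twiddles in the loop, where hardware would read them from a pre-computed table.

```python
Q = 12289

def find_gamma(n, q=Q):
    """Hypothetical helper: some primitive 2n-th root of unity mod q."""
    for x in range(2, q):
        g = pow(x, (q - 1) // (2 * n), q)
        if pow(g, n, q) == q - 1:
            return g
    raise ValueError("no root found")

def ntt_merged(a, gamma):
    """DIT NTT with the gamma^j pre-processing folded into the twiddle factors.

    gamma must be a primitive 2n-th root of unity for n = len(a).
    """
    n = len(a)
    if n == 1:
        return list(a)
    g2 = gamma * gamma % Q              # root for the two half-size sub-NTTs
    even = ntt_merged(a[0::2], g2)      # N/2-point NTT of a_{2j}
    odd = ntt_merged(a[1::2], g2)       # N/2-point NTT of a_{2j+1}
    out = [0] * n
    w = gamma                           # gamma^(2i+1), starting at i = 0
    for i in range(n // 2):
        t = w * odd[i] % Q              # the only multiplication in the butterfly
        out[i] = (even[i] + t) % Q
        out[i + n // 2] = (even[i] - t) % Q
        w = w * g2 % Q                  # table lookup in a real design
    return out

def ntt_with_preprocessing(a, gamma):
    """Reference: scale by gamma^j, then apply the direct transform with omega = gamma^2."""
    n = len(a)
    omega = gamma * gamma % Q
    scaled = [a[j] * pow(gamma, j, Q) % Q for j in range(n)]
    return [sum(scaled[j] * pow(omega, i * j, Q) for j in range(n)) % Q for i in range(n)]
```

Each recursion level performs N/2 butterfly multiplications and there are log N levels, so no separate pass of N pre-processing multiplications is needed.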

  10. 2.1 Low-Complexity NTT
     [Figures: butterfly of the low-complexity NTT; dataflow of an 8-point low-complexity NTT]

  11. 2.1 Low-Complexity NTT
     • In the classic FFT: $\omega = \omega_N^{jN/m}$
     • Computational complexity: $(N/2)\log N + N \rightarrow (N/2)\log N$
     • No additional timing cost; no additional hardware resource cost
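As a quick worked example of the complexity reduction, the following sketch evaluates both formulas for the two NewHope sizes. The function name is an assumption made for illustration.

```python
def ntt_mult_counts(n):
    """Return (classic, merged) modular multiplication counts for an n-point NTT."""
    log_n = n.bit_length() - 1
    assert 1 << log_n == n, "n must be a power of two"
    fft = (n // 2) * log_n
    return fft + n, fft   # classic = FFT + pre-processing, merged = FFT only

for n in (512, 1024):
    classic, merged = ntt_mult_counts(n)
    print(n, classic, merged)
# 512 2816 2304
# 1024 6144 5120
```

For N = 1024 the pre-processing accounts for 1024 of 6144 multiplications, about one sixth of the total.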

  12. 2.2 Low-Complexity INTT
     • The cost of the post-processing is greater than that of the pre-processing. Number of modular multiplications of the INTT: $(N/2)\log N + 2N$, where $(N/2)\log N$ comes from the FFT and $2N$ from the post-processing.
     • Low-complexity INTT
       ➢ [1] merges the scaling of $\gamma_{2N}^{-i}$ into the FFT.
       ➢ Further merge the scaling of $N^{-1}$ into the FFT.
     [1] T. Pöppelmann, et al., High-performance ideal lattice-based cryptography on 8-bit ATxmega microcontrollers. LATINCRYPT 2015

  13. 2.2 Low-Complexity INTT
     • Derivation of the low-complexity INTT
       ➢ Inspired by the strategy of the Gentleman-Sande FFT
       ➢ Follows the divide-and-conquer method of the FFT that divides in the frequency domain (DIF)
       ➢ First, the post-processing and the FFT are written together as a summation of N terms
       ➢ Second, the summation is split into two groups according to the size of the index of $\hat{a}$

  14. 2.2 Low-Complexity INTT
     • Derivation of the low-complexity INTT
       ➢ Third, the equation is grouped into two parts according to the parity of $i$. $a_{2i}$ and $a_{2i+1}$ correspond to N/2-point INTTs of $\hat{b}_i^{(0)}$ and $\hat{b}_i^{(1)}$
       ➢ In this way, an N-point INTT can be resolved with two N/2-point INTTs
     [Figure: recursive decomposition of an N-point INTT into two N/2-point INTTs, each into two N/4-point INTTs, and so on down to 2-point INTTs]
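The INTT derivation can be sketched the same way. Starting from $a_i = N^{-1}\sum_j \hat{a}_j \gamma_{2N}^{-i(2j+1)}$ and pairing $\hat{a}_j$ with $\hat{a}_{j+N/2}$, the even outputs are an N/2-point INTT of $(\hat{a}_j + \hat{a}_{j+N/2})/2$ and the odd outputs an N/2-point INTT of $(\hat{a}_j - \hat{a}_{j+N/2})/2 \cdot \gamma_{2N}^{-(2j+1)}$. This is a reconstruction under that reading, with hypothetical names; the multiplication by `INV2` stands in for a halving that hardware can do without a multiplier.

```python
Q = 12289
INV2 = (Q + 1) // 2   # 2^{-1} mod Q

def find_gamma(n, q=Q):
    """Hypothetical helper: some primitive 2n-th root of unity mod q."""
    for x in range(2, q):
        g = pow(x, (q - 1) // (2 * n), q)
        if pow(g, n, q) == q - 1:
            return g
    raise ValueError("no root found")

def intt_merged(a_hat, gamma_inv):
    """DIF INTT with the gamma^-i and N^-1 post-processing folded into the butterflies.

    gamma_inv must be the inverse of a primitive 2n-th root of unity, n = len(a_hat).
    """
    n = len(a_hat)
    if n == 1:
        return list(a_hat)
    half = n // 2
    g2 = gamma_inv * gamma_inv % Q
    b0, b1 = [0] * half, [0] * half
    w = gamma_inv                              # gamma^-(2j+1), starting at j = 0
    for j in range(half):
        u, t = a_hat[j], a_hat[j + half]
        b0[j] = (u + t) * INV2 % Q             # (u + t) / 2
        b1[j] = (u - t) * INV2 % Q * w % Q     # (u - t) / 2 times the twiddle
        w = w * g2 % Q
    out = [0] * n
    out[0::2] = intt_merged(b0, g2)            # even-indexed coefficients
    out[1::2] = intt_merged(b1, g2)            # odd-indexed coefficients
    return out

def ntt_direct(a, gamma):
    """Reference forward transform: a_hat_i = sum_j a_j * gamma^(j*(2i+1))."""
    n = len(a)
    return [sum(a[j] * pow(gamma, j * (2 * i + 1), Q) for j in range(n)) % Q
            for i in range(n)]
```

Halving once per level over log N levels accumulates the full $N^{-1}$ factor, which is why no final scaling pass remains.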

  15. 2.2 Low-Complexity INTT
     [Figures: butterfly of the low-complexity INTT; dataflow of an 8-point low-complexity INTT]

  16. 2.2 Low-Complexity INTT
     • In the classic FFT: $\omega = \omega_N^{-jN/m}$
     • Computational complexity: $(N/2)\log N + 2N \rightarrow (N/2)\log N$
     • The butterfly operates on $u + t$ and $u - t$
     • No additional timing cost; the butterfly unit is only slightly modified
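One way a butterfly can absorb the $N^{-1}$ scaling cheaply is to halve each result modulo q with a shift and a conditional add instead of a multiplication. The slide does not spell out the circuit, so the sketch below is an assumption about how such a halving can be done, not a description of the authors' unit.

```python
Q = 12289

def half_mod_q(x):
    """Return x / 2 mod Q for 0 <= x < Q using a shift and a conditional add."""
    # Even x: x >> 1 is exact.
    # Odd x: x + Q is even (Q is odd), and (x + Q) / 2 = (x >> 1) + (Q + 1) / 2.
    return (x >> 1) + (x & 1) * ((Q + 1) >> 1)

def gs_butterfly_halved(u, t, w):
    """Illustrative DIF butterfly: ((u + t)/2, (u - t)/2 * w), all mod Q."""
    s = half_mod_q((u + t) % Q)
    d = half_mod_q((u - t) % Q)
    return s, d * w % Q
```

Only the twiddle product `d * w` needs a modular multiplier, so the butterfly keeps the same multiplication count as in the classic FFT.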

  17. 3 The Hardware Architecture
     • The architecture of NTT/INTT
     • Multi-bank memory
       ➢ Address generator [1]: works when log N is even (√) but not when it is odd (╳)
       ➢ The execution order of the last s-loop is rearranged
     [1] W. Wang, et al., VLSI design of a large number multiplier for fully homomorphic encryption. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 22(9):1879-1887, Sept 2014.

  18. 3 The Hardware Architecture
     • Compact butterfly unit

  19. 3 The Hardware Architecture
     • Low-complexity modular multiplication
       ➢ No additional multiplication
       ➢ Constant-time
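The slide does not show the reduction circuit, so the following is not the authors' design. As an illustration of how a reduction modulo q = 12289 = 3·2^12 + 1 can avoid extra multiplications, here is a sketch of a different, published technique: the K-RED reduction of Longa and Naehrig, which uses $3 \cdot 2^{12} \equiv -1 \pmod q$. The names `prescale` and `mul_mod` are hypothetical, and Python itself gives no timing guarantees; the mask arithmetic only mirrors what a constant-time datapath would do.

```python
Q = 12289                      # 3 * 2**12 + 1
K2_INV = pow(9, -1, Q)         # offline constant: inverse of 3^2 mod Q

def k_red(c):
    """Return a value congruent to 3*c mod Q using only mask, shift, add, subtract."""
    c0 = c & 0xFFF             # low 12 bits
    c1 = c >> 12               # remaining high bits, so c = c0 + c1 * 2**12
    return (c0 << 1) + c0 - c1 # 3*c0 - c1, since 3 * 2**12 = -1 (mod Q)

def prescale(w):
    """Pre-scale a constant (e.g. a stored twiddle factor) by 3^-2, done offline."""
    return w * K2_INV % Q

def mul_mod(a, w_scaled):
    """Return a*w mod Q for 0 <= a < Q, where w_scaled = prescale(w)."""
    c = k_red(k_red(a * w_scaled))   # two passes multiply by 9, cancelling 3^-2
    c += Q & -(c < 0)                # add Q if negative (mask is all ones or zero)
    c -= Q & -(c >= Q)               # subtract Q if too large
    return c
```

After two passes the intermediate value lies within one multiple of Q of the target range, which is why a single masked correction in each direction is enough for inputs below Q.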

  20. 3 The Hardware Architecture
     • The architecture of NewHope-NIST
       ➢ Supports key generation, encryption and decryption
       ➢ Doubled bandwidth matching
       ➢ RAM (R0, R1): two data words per address

  21. 3 The Hardware Architecture
     • Timing hiding
       ➢ Resource conflict
       ➢ Data dependency
     A RAM may be read and written by operations in the same line.

  22. 4 Implementation Results
     • Implementation platform
       ➢ Xilinx Artix-7 FPGA
       ➢ Vivado 2019.1.1
     • Implementation results of NTT/INTT
     [Figure: bar charts comparing Ours, [FS19], [KLC+7], [JGCS19], [FSM+19] and [BUC19b]. Panels: Time (us), ATP (LUT x ms), ATP (FF x ms), ATP (DSP x us), ATP (BRAM x us)]

  23. 4 Implementation Results
     • Implementation results of NewHope-NIST
     [Figure: bar charts comparing Ours, [FSM+19], [JGCS19-1], [JGCS19-2] and [BUC19b]. Panels: Time (us) for KeyGen+Decrypt and Encrypt; ATP (LUT x ms), ATP (FF x ms), ATP (DSP x us), ATP (BRAM x us); LUTs, FFs, DSPs, BRAMs]

  24. Conclusion
     • Low-complexity NTT/INTT
       ➢ NTT: no pre-processing
       ➢ INTT: no post-processing
     • A highly efficient architecture of NewHope-NIST
       ➢ A clear advantage in both speed and ATP
     • The low-complexity NTT/INTT can benefit other NTT-based algorithms

  25. Thanks!
