
A New Multiplicative Inverse Architecture in Normal Basis Using Novel Concurrent Serial Squaring and Multiplication
Amin Monfared, Hayssam El-Razouk and Arash Reyhani-Masoleh
Presented by: Arash Reyhani-Masoleh
Department of Electrical and Computer Engineering, Western University, London, Ontario, Canada
24th IEEE Symposium on Computer Arithmetic, 2017

Outline
• Motivation
• Arithmetic operations over $GF(2^m)$ using Gaussian Normal Basis (GNB)
• Proposed digit-level square-multiply architecture
  • It computes $A \times B^{2^e}$
  • The digits of both inputs $A$ and $B$ are entered serially
  • Denoted by Digit-Level Fully Serial-In Square-Multiply (DL-FSISM)
• Proposed inversion architecture
  • It uses the DL-FSISM
• ASIC implementations and comparison
• Conclusions and future work

Motivation: Finite Fields
• Many applications use arithmetic operations over $GF(2^m)$
  • Cryptography: Elliptic Curve, AES
  • Error control coding
    • Reed-Solomon code
• There are different bases to represent a field element.
  • Polynomial basis, normal basis (NB), dual basis, etc.
• In NB, squaring is free in hardware.

Motivation: Gaussian Normal Basis (GNB)
• A GNB over $GF(2^m)$ is a special class of NB and exists whenever $m$ is not divisible by 8.
• GNBs have been included in IEEE and NIST standards for ECDSA.
• Any field element $A$ can be represented as $A = \sum_{i=0}^{m-1} a_i \beta^{2^i}$, where $a_i \in \{0,1\}$ and $\{\beta, \ldots, \beta^{2^{m-1}}\}$ is a GNB over $GF(2^m)$.
• In this paper, we consider GNB and propose new digit-level architectures for square-multiply and inversion.

Arithmetic Operations over $GF(2^m)$ using GNB
• Addition
  • Let $A$ and $B$ be two field elements represented in GNB.
  • Addition is the bit-wise XOR of the coordinates of the two inputs: $A + B = \sum_{i=0}^{m-1} (a_i + b_i)\,\beta^{2^i}$
• Squaring
  • Squaring is performed by a right cyclic shift of the coordinates of $A$: $A^2 = \sum_{i=0}^{m-1} a_i \beta^{2^{i+1}}$
  • It is free in hardware if all coordinates are available in parallel.
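The two coordinate-level operations above can be sketched in a few lines of Python. This is only an illustration: the function names `gnb_add` and `gnb_square` and the list-of-bits representation (index $i$ holds the coefficient of $\beta^{2^i}$) are assumptions made here, not something taken from the slides.

```python
def gnb_add(a, b):
    """Add two elements given as normal-basis coordinate lists (bit-wise XOR)."""
    return [x ^ y for x, y in zip(a, b)]


def gnb_square(a):
    """Square an element: coordinate a_i moves from beta^(2^i) to beta^(2^(i+1)).

    The last coordinate wraps around to index 0 because beta^(2^m) = beta.
    """
    return a[-1:] + a[:-1]


a = [1, 0, 1, 1, 0]
b = [0, 0, 1, 0, 1]
print(gnb_add(a, b))
print(gnb_square(a))
```

Squaring $m$ times returns the original coordinates, which mirrors $A^{2^m} = A$ in $GF(2^m)$.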

Arithmetic Operations using GNB: Multiplication
• Finite field multiplication is more complex than addition and squaring.
• Multiplication can be implemented in digit-level architectures, in which the digit size can be chosen based on available resources.
• In this paper, we have used two different types of digit-level multiplier, namely:
  • Digit-Level Parallel-In Serial-Out (DL-PISO)
  • Digit-Level Parallel-In Parallel-Out (DL-PIPO)
• Also, we have proposed a new multiplier/squarer architecture:
  • Digit-Level Fully Serial-In Square-Multiply (DL-FSISM).

Arithmetic Operations using GNB: Inversion
• Based on Fermat's Little Theorem, an inversion can be calculated by
  • $A^{-1} = A^{2^m - 2} \in GF(2^m)$, $A \neq 0$.
• In the Itoh and Tsujii algorithm (ITA) [4], the number of multiplications is reduced by decomposing $2^{m-1} - 1$.
• As an example, for the NIST recommended field $GF(2^{233})$: $2^{232} - 1 = (1 + 2)(1 + 2^2)(1 + 2^4)\bigl(1 + 2^8 (1 + 2^8)(1 + 2^{16})\bigl(1 + 2^{32} (1 + 2^{32})\bigl(1 + 2^{64} (1 + 2^{64})\bigr)\bigr)\bigr)$
• The inversion using ITA takes a total of 10 iterations.
• Each iteration consists of one single digit-level parallel-in parallel-out (DL-PIPO) multiplication and one free squaring.

[4] T. Itoh and S. Tsujii, "A fast algorithm for computing multiplicative inverses in GF(2^m) using normal bases," Information and Computation, vol. 78, no. 3, pp. 171–177, 1988.
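As a quick sanity check of the decomposition quoted for $GF(2^{233})$, the nested product can be evaluated with Python's arbitrary-precision integers. The variable names are illustrative only; the check confirms the integer identity and says nothing about the hardware schedule.

```python
m = 233

# Nested product exactly as written on the slide for 2^232 - 1.
decomposition = (
    (1 + 2) * (1 + 2**2) * (1 + 2**4)
    * (1 + 2**8 * (1 + 2**8) * (1 + 2**16)
       * (1 + 2**32 * (1 + 2**32)
          * (1 + 2**64 * (1 + 2**64))))
)

# The decomposition must reproduce 2^(m-1) - 1.
assert decomposition == 2**(m - 1) - 1

# One final squaring doubles the exponent and gives the inversion exponent 2^m - 2.
inverse_exponent = 2 * decomposition
assert inverse_exponent == 2**m - 2
```

The expression contains ten groups of the form $(1 + \ldots)$, which lines up with the 10 iterations stated for ITA in this field.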

Arithmetic Operations using GNB: Inversion (cont'd)
• Our inversion flow (based on ITA) interleaves the computations of a digit-level parallel-in serial-out (DL-PISO) multiplier and our new DL-FSISM architecture.
• It only needs a total of 5 iterations for $GF(2^{233})$.
• Each iteration consists of two single multiplications (and squarings).
• In this paper, we propose a new digit-level fully serial-in parallel-out square-multiply (DL-FSISM) architecture which performs concurrent squaring and multiplication without introducing any delay.

Proposed Digit-Level Fully Serial-In Square-Multiply (DL-FSISM)
• Let $A$ and $B$ be field elements and $e$ be an integer.
• The proposed scheme reads the inputs $A$ and $B$ serially, digit by digit, and concurrently computes $F = A \times B^{2^e}$.
• The composite operations of squaring and multiplication are performed concurrently without introducing any additional delay.
• For a digit size of $d$ bits, it takes $\lceil m/d \rceil$ clock cycles to generate the result $F = A \times B^{2^e}$.

Proposed DL-FSISM: Key Formulation
Proposition 1: Let $A$ and $B$ be two $GF(2^m)$ elements that are represented in GNB $\{\beta, \ldots, \beta^{2^{m-1}}\}$. One can compute $F = AB^{2^e}$ by proceeding from $i = 0$ to $k - 1$; the result $F = F^{(k-1)} = A^{(k-1)} \bigl(B^{(k-1)}\bigr)^{2^e}$ is obtained using the following recurrence relation:

$$F^{(i)} = \bigl(F^{(i-1)}\bigr)^{2^d} + \sum_{j=0}^{d-1} \delta_j\Bigl(a_{d(k-1-i)+j},\ \bigl(B^{(i)}\bigr)^{2^e}\Bigr) + \Bigl(\sum_{j=0}^{d-1} \delta_j\Bigl(b_{d(k-1-i)+j},\ \bigl(A^{(i-1)}\bigr)^{2^{d-e}}\Bigr)\Bigr)^{2^e}$$

where $\delta_j(u, V) = uV\beta^{2^j}$, $u \in \{0,1\}$ and $V = \sum_{l=0}^{m-1} v_l \beta^{2^l} \in GF(2^m)$.
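The recurrence can be checked numerically with a small software model. The sketch below is not the authors' hardware: it assumes a toy field $GF(2^{10})$ built from the all-ones polynomial (whose root $\beta = x$ generates a type-I GNB), carries out field arithmetic in polynomial basis for convenience, and zero-pads the coordinates when $d$ does not divide $m$. The partial inputs are taken as $A^{(i)} = (A^{(i-1)})^{2^d} + \sum_j a_{d(k-1-i)+j}\beta^{2^j}$, which is an assumption consistent with MSB-first digit entry. All names (`gf_mul`, `frob`, `from_gnb`, `digit_sum`, `fsism`) are hypothetical.

```python
M = 10                     # toy field GF(2^10), chosen only for illustration
MOD = (1 << (M + 1)) - 1   # x^10 + x^9 + ... + x + 1, irreducible since 2 is primitive mod 11
BETA = 0b10                # beta = x; its conjugates form a type-I Gaussian normal basis


def gf_mul(x, y):
    """Multiply two field elements held as polynomial-basis bit masks."""
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if (x >> M) & 1:
            x ^= MOD
    return r


def frob(x, n):
    """Return x^(2^n); n is reduced mod M, so a negative n is allowed."""
    for _ in range(n % M):
        x = gf_mul(x, x)
    return x


def from_gnb(bits):
    """Map normal-basis coordinates (bits[i] multiplies beta^(2^i)) to a field element."""
    acc = 0
    for i, bit in enumerate(bits):
        if bit:
            acc ^= frob(BETA, i)
    return acc


def digit_sum(bits, base, d, v):
    """Sum over j < d of delta_j(bits[base + j], v), with delta_j(u, V) = u*V*beta^(2^j)."""
    acc = 0
    for j in range(d):
        if bits[base + j]:
            acc ^= gf_mul(v, frob(BETA, j))
    return acc


def fsism(a, b, d, e):
    """Evaluate F = A * B^(2^e) digit by digit, most significant digit first."""
    k = -(-M // d)                       # ceil(M / d) iterations
    a = list(a) + [0] * (k * d - M)      # zero padding is an assumption of this sketch
    b = list(b) + [0] * (k * d - M)
    A = B = F = 0
    for i in range(k):
        base = d * (k - 1 - i)
        A_prev = A
        A = frob(A_prev, d) ^ digit_sum(a, base, d, 1)   # A^(i)
        B = frob(B, d) ^ digit_sum(b, base, d, 1)        # B^(i)
        term_a = digit_sum(a, base, d, frob(B, e))
        term_b = frob(digit_sum(b, base, d, frob(A_prev, d - e)), e)
        F = frob(F, d) ^ term_a ^ term_b
    return F


a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
b = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
expected = gf_mul(from_gnb(a), frob(from_gnb(b), 4))
print(fsism(a, b, d=3, e=4) == expected)
```

The model runs for $\lceil m/d \rceil$ loop iterations, one per input digit, matching the clock-cycle count stated for the architecture.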

Proposed DL-FSISM: Architecture

$$F^{(i)} = \bigl(F^{(i-1)}\bigr)^{2^d} + \sum_{j=0}^{d-1} \delta_j\Bigl(a_{d(k-1-i)+j},\ \bigl(B^{(i)}\bigr)^{2^e}\Bigr) + \Bigl(\sum_{j=0}^{d-1} \delta_j\Bigl(b_{d(k-1-i)+j},\ \bigl(A^{(i-1)}\bigr)^{2^{d-e}}\Bigr)\Bigr)^{2^e}$$

• Three registers X, Y, and Z are initially cleared.
• Digits of the inputs are entered into X and Y serially, starting from the MSB.
• After $\lceil m/d \rceil$ clock cycles, Z contains $AB^{2^e}$.

[Figure: DL-FSISM block diagram showing registers ⟨X⟩, ⟨Y⟩, ⟨Z⟩ and the shifted operands $B^{(i)} \gg e$ and $(A^{(i-1)} \gg d) \ll e$.]

Proposed DL-FSISM: Architecture (cont'd)

$$F^{(i)} = \bigl(F^{(i-1)}\bigr)^{2^d} + \sum_{j=0}^{d-1} \delta_j\Bigl(a_{d(k-1-i)+j},\ \bigl(B^{(i)}\bigr)^{2^e}\Bigr) + \Bigl(\sum_{j=0}^{d-1} \delta_j\Bigl(b_{d(k-1-i)+j},\ \bigl(A^{(i-1)}\bigr)^{2^{d-e}}\Bigr)\Bigr)^{2^e}$$

[Figure: DL-FSISM block diagram repeated, with a detailed view of the $\delta_j$ block.]

Proposed Inversion Architecture
• The inversion core is made by serially connecting the DL-PISO and DL-FSISM multipliers.
• The register file only stores outputs from the multipliers.
• A $d$-bit register is added between the two multipliers to shorten the critical path.
• Each iteration selects one of the multiplexer inputs and takes $\lceil m/d \rceil + 1$ clock cycles.

[Figure: inversion architecture block diagram; it carries the label ε = {2, 8, 32, 64}.]

Inversion Architecture Comparison (Number of Iterations)

| Architecture | Algorithm | Multiplication type | Number of iterations | m = 163 | m = 233 | m = 283 | m = 409 | m = 571 |
|---|---|---|---|---|---|---|---|---|
| [4] | ITA | 1 × single | $N_1$ | 9 | 10 | 11 | 11 | 13 |
| [7, 6] | TIT/MTIT | 1 × double | $N_2$ | 5 | 9 | 8 | 7 | 8 |
| [8] | Optimal-3 chain | 1 × double | $N_3$ | 5 | 7 | 6 | 7 | 7 |
| Proposed | ITA | 2 × single, interleaved | $\lceil N_1/2 \rceil$ | 5 | 5 | 6 | 6 | 7 |

• Our proposed inversion architecture reduces the required number of iterations as compared with previous works.
• The best performance is achieved when $m = 233$.

[4] T. Itoh and S. Tsujii, "A fast algorithm for computing multiplicative inverses in GF(2^m) using normal bases," Information and Computation, vol. 78, no. 3, pp. 171–177, 1988.
[6] J. Hu, W. Guo, J. Wei, and R. Cheung, "Fast and Generic Inversion Architectures Over GF(2^m) Using Modified Itoh–Tsujii Algorithms," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 367–371, April 2015.
[7] R. Azarderakhsh, K. Jarvinen, and V. Dimitrov, "Fast Inversion in GF(2^m) with Normal Basis Using Hybrid-Double Multipliers," IEEE Trans. Comput., vol. 63, pp. 1041–1047, April 2014.
[8] K. Jarvinen, V. Dimitrov, and R. Azarderakhsh, "A Generalization of Addition Chains and Fast Inversions in Binary Fields," IEEE Trans. Comput., vol. 64, pp. 2421–2432, Sept. 2015.
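The $N_1$ and $\lceil N_1/2 \rceil$ rows of the comparison can be reproduced with a short calculation. The sketch assumes the usual ITA multiplication count $N_1 = \lfloor \log_2(m-1) \rfloor + H(m-1) - 1$, where $H$ is the Hamming weight; this formula is not stated on the slides, and the function names are illustrative.

```python
def ita_multiplications(m):
    """Standard ITA count: floor(log2(m-1)) + HammingWeight(m-1) - 1 (assumed formula)."""
    n = m - 1
    return (n.bit_length() - 1) + bin(n).count("1") - 1


def interleaved_iterations(m):
    """Two single multiplications per iteration, so ceil(N1 / 2) iterations."""
    return -(-ita_multiplications(m) // 2)


for m in (163, 233, 283, 409, 571):
    print(m, ita_multiplications(m), interleaved_iterations(m))
```

For the five field sizes listed, both functions agree with the values given for [4] and for the proposed architecture.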
