Design of an ASIP IDEA Crypto Processor


SLIDE 1

Design of an ASIP IDEA Crypto Processor

Reza Faghih Mirzaee (1) and Mohammad Eshghi (2)

(1) Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.

(2) Faculty of Electrical and Computer Engineering, Shahid Beheshti University, G.C., Tehran, Iran.

SLIDE 2

Design of an ASIP IDEA Crypto Processor, R. Faghih Mirzaee and M. Eshghi

- ASIP vs. ASIC and General-Purpose Processor
- IDEA Crypto Algorithm
- Implementing Specific Instructions
- Testing Process and Results

SLIDE 3

v Hardware Dominant Advantages : Ø High Throughput Ø Energy Efficiency

Application Specific Instruction-set Processor (ASIP) General-Purpose Programmable Processor Application Specific Integrated Circuit (ASIC)

Hardware / Software v Software Dominant Advantages : Ø Software Programmability and Flexibility Advantages : Ø Programmability Ø Simple Designing Ø Fast Debugging Ø Short Time to Market Ø Efficiency


SLIDE 4

International Data Encryption Algorithm (IDEA)

v Lai & Massey (1991) v 64-bit Plaintext v 128-bit Initial Key v 64-bit Ciphertext v 8 Identical Rounds

1 Final Round


SLIDE 5

Instruction-Set

- General-purpose instructions
- Specific instructions:
  - CODER × 8
  - FINAL CODER × 1


SLIDE 6

CODER Specific Instruction

v Only One ALU Ø 14 Stages : Ø Modulo 216 Adder : Ø Modulo 216+1 Multiplier : Ø Bitwise Exclusive-OR Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Stage 6 Stage 7 Stage 8 Stage 9 Stage 10 Stage 11 Stage 12 Stage 13 Stage 14


SLIDE 7

Proposed Register and ALSU Configuration

v RTL Ø #Clock Cycles Specifications : Ø Address Bus : AR/PC Ø ACH / ACL Ø Specific Xi Registers Ø Specific Ki Registers Ø 17-bit Subtractor


SLIDE 8

Operations on inputs A and B:
- Modulo 2^16+1 multiplication: 16-bit result
- Modulo 2^16 addition: carry / 16-bit sum
- Bitwise exclusive-OR: 16-bit result

Modulo 2^16+1 multiplication (Meier and Zimmermann):

    /* requires <stdint.h>; the 16-bit word 0 represents 2^16 */
    uint16_t mulmod(uint16_t x, uint16_t y) {
        x = (x - 1) & 0xFFFF;
        y = (y - 1) & 0xFFFF;
        uint32_t t = (uint32_t)x * y + x + y + 1;  /* = (x+1)(y+1), wraps at 2^32 */
        x = t & 0xFFFF;                            /* low word  */
        y = t >> 16;                               /* high word */
        x = (x - y) + (x <= y);                    /* low - high, +1 on borrow */
        return x;
    }

RTL of the multiplication inside COD (F21, D0):

    F21 D0 Ti:      ACH/ACL ← DR × ACL , TR1 ← ACL
    F21 D0 Ti+1:    ACL ← DR + ACL , DR ← TR1
    F21 D0 C Ti+2:  ACH ← ACH + 1
    F21 D0 Ti+2:    ACL ← DR + ACL
    F21 D0 C Ti+3:  ACH ← ACH + 1
    F21 D0 Ti+3:    ACL ← ACL + 1
    F21 D0 C Ti+4:  ACH ← ACH + 1
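To make the low-high trick easier to check, here is a small Python sketch of the same routine next to a direct reference computation. It assumes the usual IDEA convention that the 16-bit word 0 represents 2^16; the name `mulmod_reference` is introduced here only for comparison.

```python
def mulmod(x, y):
    """Low-high multiplication modulo 2^16+1 on 16-bit words (0 means 2^16)."""
    x = (x - 1) & 0xFFFF
    y = (y - 1) & 0xFFFF
    t = (x * y + x + y + 1) & 0xFFFFFFFF   # emulate a 32-bit register
    lo = t & 0xFFFF
    hi = t >> 16
    # lo - hi is the residue; add 1 when the 16-bit subtraction borrows
    return (lo - hi + (1 if lo <= hi else 0)) & 0xFFFF


def mulmod_reference(x, y):
    """Direct computation with arbitrary-precision integers."""
    x = x or 1 << 16
    y = y or 1 << 16
    return (x * y % 65537) & 0xFFFF
```

The two functions can be compared over any set of 16-bit inputs, for example `mulmod(0xFFFF, 0xFFFF) == mulmod_reference(0xFFFF, 0xFFFF)`.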


SLIDE 9

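Arithmetic Logic Shift Unit (ALSU)

ALSU Opcode | Unit | Operation | Function
00000 | AU | ADD  | A+B
00001 | AU | SUB  | A-B
00010 | AU | DEC  | A-1
00011 | AU | INC  | A+1
001×× | AU | MUL  | A×B
01000 | LU | AND  | A∧B
01001 | LU | OR   | A∨B
01010 | LU | NAND | (A∧B)'
01011 | LU | NOR  | (A∨B)'
01100 | LU | XOR  | A⊕B
01101 | LU | XNOR | (A⊕B)'
01110 | LU | NOT  | B'
01111 | LU | PASS | A
1××00 | SU | SHL  | SHL(B)
1××01 | SU | SHR  | SHR(B)
1××10 | SU | ROL  | ROL(B)
1××11 | SU | ROR  | ROR(B)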

Ø Carry Flag = Cout when ADD|SUB|DEC|INC|SHL|SHR Ø Overflow Flag = ‘1’ when (ADD|INC|SHL & Cout = ‘1’) | (SUB|DEC & Cout = ‘0’) | (MUL & ACH ≠ 0) Ø Zero Flag = ‘1’ when ACL=0 Ø Sign Flag = ‘1’ when (SUB|DEC & Cout = ‘0’) Ø Even Flag = ACL[0]’


SLIDE 10

Instruction Types

Di | I1 | I0 | Instruction Type | Instruction Reference | Addressing Mode
D0 | 0 | 0 | 2 | Reg. / I/O |
D1 | 0 | 1 | 1 | Memory | Immediate
D2 | 1 | 0 | 1 | Memory | Direct
D3 | 1 | 1 | 1 | Memory | Indirect

Type 1:
- Bits 15-14: I1 I0
- Bits 13-9: Opcode
- Bits 8-0: No. of Iterations / Unused
- Followed by: Address / Data

Type 2:
- Bits 15-14: I1 I0
- Bits 13-9: Opcode
- Bits 8-0: Unused
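As a rough illustration of the field layout, the sketch below splits a 16-bit instruction word into its parts. It assumes that I1 I0 occupy bits 15-14, the 5-bit opcode occupies bits 13-9, and the iteration count (when present) sits in bits 8-0 of the same word; the function name `decode` and the returned dictionary keys are hypothetical.

```python
def decode(word):
    """Split a 16-bit instruction word into its fields."""
    d = (word >> 14) & 0b11        # I1 I0 select D0..D3
    f = (word >> 9) & 0b11111      # 5-bit opcode selects Fi
    low = word & 0x1FF             # bits 8-0: iterations (Type 1) or unused
    return {"D": d, "F": f, "low": low, "type": 2 if d == 0 else 1}


# COD is listed as D0, F21, opcode 10101 (a Type 2 instruction).
cod = decode(0b00_10101_000000000)

# A direct-addressed LDK (D2, F15) with 8 iterations in the low field.
ldk = decode((0b10 << 14) | (15 << 9) | 8)
```

Under these assumptions `cod` decodes to D0/F21 and `ldk` to D2/F15 with a low field of 8.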


SLIDE 11

Generating Encryption Subkeys

- 128-bit initial private key: SK0 SK1 SK2 SK3 SK4 SK5 SK6 SK7
- 6 subkeys per round + 4 subkeys for final round → 8 × 6 + 4 = 52 subkeys
- Rotate left 25 bits: each 16-bit word is split into bit fields 15-7 and 6-0, which recombine into SK8 SK9 SK10 SK11 SK12 SK13 SK14 SK15
  - Ki[8-0] ← K(i+2) mod 8 [15-7]
  - Ki[15-9] ← K(i+1) mod 8 [6-0]
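The word-level update rule can be written directly in code. This is a minimal sketch, assuming the key is held as eight 16-bit words with SK0 as the most significant word; `expand_key` is an illustrative name, not an instruction of the processor.

```python
def expand_key(key128):
    """Derive the 52 encryption subkeys from a 128-bit key given as an int."""
    words = [(key128 >> (112 - 16 * i)) & 0xFFFF for i in range(8)]
    subkeys = list(words)
    while len(subkeys) < 52:
        # Rotating the 128-bit key left by 25 bits, expressed per word:
        #   new Ki[15-9] = old K(i+1) mod 8 [6-0]
        #   new Ki[8-0]  = old K(i+2) mod 8 [15-7]
        words = [((words[(i + 1) % 8] << 9) | (words[(i + 2) % 8] >> 7)) & 0xFFFF
                 for i in range(8)]
        subkeys.extend(words)
    return subkeys[:52]


subkeys = expand_key(0x00010002000300040005000600070008)
```

Seven batches of eight words are produced and the last four are discarded, which gives the 8 × 6 + 4 = 52 subkeys counted on the slide.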


SLIDE 12

Generating Decryption Subkeys

v IDEA : Symmetric v K’ : Decryption Subkey v -K : 216 Additive Inversion Ø - Ki = (216 – Ki) & 0xFFFF v K-1 : 216+1 Multiplicative Inversion v Extended Euclidean Algorithm Ø ax + by = GCD(a, b) , a ≥ b ≥ 0 Ø K-1 = y Round r r = 1 2 ≤ r ≤ 8 r = 9 K’1

(r)

(K1

(10-r))-1

(K1

(10-r))-1

(K1

(10-r))-1

K’2

(r)

(10-r)

K’3

(r)

(10-r)

K’4

(r)

(K4

(10-r))-1

(K4

(10-r))-1

(K4

(10-r))-1

K’5

(r)

(9-r)

K’6

(r)

(9-r)

v Devider

Ø Low Frequency & High Chip Area v Binary Extended GCD Algorithm Ø a = 216+1 , b = Ki , y = Ki


SLIDE 13

Simplified Version of Binary Extended GCD Algorithm for IDEA

1. if (b = 0) then: y ← 0, return y
2. u ← a, v ← b, B ← 0, D ← 1
3. while (u is even) do:
     u ← u/2
     if (B is even) then: B ← B/2
     else: B ← (B-a)/2
4. while (v is even) do:
     v ← v/2
     if (D is even) then: D ← D/2
     else: D ← (D-a)/2
5. if (u ≥ v) then: u ← u-v, B ← B-D
   else: v ← v-u, D ← D-B
6. if (u = 0) then:
     if (D < 0) then: D ← D + (2^16+1)
     y ← D, return y
   else: goto step 3

Selector | Operation | Function
0 | SUB | A-B
1 | SUB+SHR | (A-B)/2

- u, v, B, D
  - Xi specific registers
  - B, D: negative values
  - Sign bits
  - 17-bit subtractor

F19 (D2+D3) X2[0]' X0[0]' Ti: X0 ← SHR[X0], X2 ← SHR[X2], X0S ← 0, SC ← i
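The six steps translate almost line for line into Python. This is a sketch under two stated assumptions: a = 2^16+1 as on the previous slide, and the final correction is done with a full modular reduction (`D % a`) instead of the single conditional addition in step 6, because the sketch does not verify that D always stays above -(2^16+1). The name `mul_inverse` is illustrative.

```python
A = (1 << 16) + 1   # a = 2^16 + 1


def mul_inverse(b, a=A):
    """Multiplicative inverse of b modulo a via the simplified binary GCD."""
    if b == 0:                      # step 1: 0 stands for 2^16, its own inverse
        return 0
    u, v, B, D = a, b, 0, 1         # step 2
    while True:
        while u % 2 == 0:           # step 3
            u //= 2
            B = B // 2 if B % 2 == 0 else (B - a) // 2
        while v % 2 == 0:           # step 4
            v //= 2
            D = D // 2 if D % 2 == 0 else (D - a) // 2
        if u >= v:                  # step 5
            u, B = u - v, B - D
        else:
            v, D = v - u, D - B
        if u == 0:                  # step 6
            return (D % a) & 0xFFFF
```

Only subtraction and halving appear in the loop, which matches the selector table (SUB and SUB+SHR) and avoids the divider needed by the classical extended Euclidean algorithm.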


SLIDE 14

Testing & Synthesis Results

v Structural VHDL Code v FPGA Family : Virtex5 v Max. Frequency = 110.713MHz v Throughput ≈ 9.4Mbps 100 GEK Addr1 101 CLA 102 LDD Addr2,0 103 QLA 8 104 STA Addr4 105 LDK Addr1,8 106 COD 107 DSZ Addr4 108 JMP 105 109 LDK Addr1,8 110 FCD 111 CLA 112 STD Addr3,0

#Slice Registers #Slice LUTs #Fully used Bit Slices

#Bonded IOBs Used 346 1348 312 66 Utilization 1% 7% 22% 16%
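The two reported figures imply a rough cycle budget per 64-bit block. The estimate below assumes that throughput = 64 bits × maximum frequency / cycles per block; the slide does not state how the throughput was measured or what the program loop includes, so the result is only an order-of-magnitude reading of rounded numbers.

```python
f_max_hz = 110.713e6        # maximum frequency from the synthesis report
throughput_bps = 9.4e6      # reported throughput (approximate)
block_bits = 64             # IDEA block size

# Implied number of clock cycles spent per encrypted block
cycles_per_block = f_max_hz * block_bits / throughput_bps
```

Under this assumption the implied budget is on the order of 750 clock cycles per block.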


SLIDE 15


Di | Fi | Instruction | Name | Description | Ins. Reference | Opcode
D0 | F0 | RCV | Receive | ACL ← INPR | I/O | 00000
D0 | F1 | SND | Send | OUTR ← ACL | I/O | 00001
D0 | F2 | SKI | Skip if FGI | FGI: PC ← PC+1 | I/O | 00010
D0 | F3 | SKO | Skip if FGO | FGO: PC ← PC+1 | I/O | 00011
D0 | F4 | ION | IEN On | IEN ← 1 | I/O | 00100
D0 | F5 | IOF | IEN Off | IEN ← 0 | I/O | 00101
D0 | F6 | CLA | Clear Accumulator | ACL ← 0 | Register | 00110
D0 | F7 | CAF | Clear Arithmetic Flags | S, Z, O, C, E ← 0 | Register | 00111
D0 | F8 | CMA | Complement Accumulator | ACL ← ¬ACL | Register | 01000
D0 | F9 | CMC | Complement Carry Flag | C ← ¬C | Register | 01001
D0 | F10 | INC | Increment Accumulator | ACL ← ACL+1 | Register | 01010
D0 | F11 | DEC | Decrement Accumulator | ACL ← ACL-1 | Register | 01011
D0 | F12 | SHL | Shift Left Accumulator | ACL ← SHL(ACL) | Register | 01100
D0 | F13 | SHR | Shift Right Accumulator | ACL ← SHR(ACL) | Register | 01101
D0 | F14 | ROL | Rotate Left Accumulator | ACL ← ROL(ACL) | Register | 01110
D0 | F15 | ROR | Rotate Right Accumulator | ACL ← ROR(ACL) | Register | 01111
D0 | F16 | SNA | Skip if Negative Accumulator | S: PC ← PC+1 | Register | 10000
D0 | F17 | SZA | Skip if Zero Accumulator | Z: PC ← PC+1 | Register | 10001
D0 | F18 | SEA | Skip if Even Accumulator | E: PC ← PC+1 | Register | 10010
D0 | F19 | SZC | Skip if Zero Carry Flag | ¬C: PC ← PC+1 | Register | 10011
D0 | F20 | HLT | Halt | SC ← Disable | Register | 10100
D0 | F21 | COD | Coder | Xi ← COD(Xi) | Specific / Register | 10101
D0 | F22 | FCD | Final Coder | Xi ← FCD(Xi) | Specific / Register | 10110

SLIDE 16


Di | Fi | Instruction | Name | Description | Ins. Reference | Opcode
D2 / D3 | F0 | COM | Complement | ACL ← ¬M[AR] | Memory | 00000
D2 / D3 | F1 | AND | AND | ACL ← M[AR]∧ACL | Memory | 00001
D2 / D3 | F2 | ORR | OR | ACL ← M[AR]∨ACL | Memory | 00010
D2 / D3 | F3 | NAN | NAND | ACL ← ¬(M[AR]∧ACL) | Memory | 00011
D2 / D3 | F4 | NOR | NOR | ACL ← ¬(M[AR]∨ACL) | Memory | 00100
D2 / D3 | F5 | XOR | XOR | ACL ← M[AR]⊕ACL | Memory | 00101
D2 / D3 | F6 | XNR | XNOR | ACL ← ¬(M[AR]⊕ACL) | Memory | 00110
D2 / D3 | F7 | ADD | Addition | ACL ← M[AR]+ACL | Memory | 00111
D2 / D3 | F8 | SUB | Subtraction | ACL ← M[AR]-ACL | Memory | 01000
D2 / D3 | F9 | MUL | Multiplication | ACL ← M[AR]×ACL | Memory | 01001
D2 / D3 | F10 | LDA | Load Accumulator | ACL ← M[AR] | Memory | 01010
D2 / D3 | F11 | STA | Store Accumulator | M[AR] ← ACL | Memory | 01011
D2 / D3 | F12 | JMP | Jump | PC ← AR | Memory | 01100
D2 / D3 | F13 | JSR | Jump and Save Return Address | M[AR] ← PC, PC ← AR | Memory | 01101
D2 / D3 | F14 | DSZ | Decrement and Skip if Zero | ACL/M[AR] ← M[AR]-1, Z: PC ← PC+1 | Memory | 01110
D2 / D3 | F15 | LDK | Load Keys | Ki ← M[AR] | Specific / Memory | 01111
D2 / D3 | F16 | LDD | Load Data | Xi ← M[AR] | Specific / Memory | 10000
D2 / D3 | F17 | STD | Store Data | M[AR] ← Xi | Specific / Memory | 10001
D2 / D3 | F18 | GEK | Generate Encryption Keys | M[AR] ← K1 … K52 | Specific / Memory | 10010
D2 / D3 | F19 | GDK | Generate Decryption Keys | M[AR+54] ← K'1 … K'52 | Specific / Memory | 10011

SLIDE 17


References

1. K. Keutzer, S. Malik, and A.R. Newton, “From ASIC to ASIP: the next design discontinuity”, In Proceedings of IEEE International Conference on Computer Design: VLSI in Computers and Processors, pp. 84-90, 2002.

2. X. Lai and J. Massey, “A proposal for a new block encryption standard”, Advances in Cryptology-EUROCRYPT’90, Berlin, Germany: Springer-Verlag, pp. 389-404, 1991.

3. A. Hamalainen, M. Tommiska, and J. Skytta, “6.78 gigabits per second implementation of the IDEA cryptographic algorithm”, 12th International Conference on Field Programmable Logic and Applications, pp. 760-769, 2002.

4. S. Wolter, H. Matz, A. Schubert, and R. Laur, “On the VLSI implementation of the international data encryption algorithm IDEA”, IEEE International Symposium on Circuits and Systems, ISCAS’95, Seattle, WA, USA, pp. 397-400, 1995.

5. I. Gonzalez, S. Lopez-Buedo, F.J. Gomez, and J. Martinez, “Using partial reconfiguration in cryptographic applications: an implementation of the IDEA algorithm”, 13th International Conference on Field Programmable Logic and Application, pp. 194-203, 2003.

6. N. Sklavos and O. Koufopavlou, “Asynchronous low power VLSI implementation of the International Data Encryption Algorithm”, 8th IEEE International Conference on Electronics, Circuits and Systems, ICECS’01, Vol. 3, pp. 1425-1428, 2001.

7. O.Y.H. Cheung, K.H. Tsoi, P.H.W. Leong, and M.P. Leong, “Tradeoffs in Parallel and Serial Implementations of the International Data Encryption Algorithm IDEA”, In Proceedings of the Cryptographic Hardware and Embedded Systems Workshop, CHES, Paris, pp. 333-347, 2001.

8. J.M. Granado, M.A. Vega-Rodriguez, J.M. Sanchez-Perez, and J.A. Gomez-Pulido, “IDEA and AES, two cryptographic algorithms implemented using partial and dynamic reconfiguration”, Microelectronics Journal, Vol. 40, No. 6, pp. 1032-1040, June 2009.

9. R. Buchty, Cryptonite - A Programmable Crypto Processor Architecture for High-Bandwidth Applications, PhD thesis, Technische Universitat Munchen, LRR, Sep. 2002.

10. R. Modugu, Y.-B. Kim, and M. Choi, “Design and performance measurement of efficient IDEA (International Data Encryption Algorithm) crypto-hardware using novel modular arithmetic components”, Instrumentation and Measurements Technology Conference (I2MTC), Austin, TX, pp. 1222-1227, 2010.

11. S. Mukherjee and B. Sahoo, “A novel modulo (2^n+1) multiplication approach for IDEA cipher”, International Journal of Programmable Device Circuits and Systems, Vol. 2, No. 11, Nov. 2010.

12. A.J. Menezes, P.C. Van Oorschot, and S.A. Vanstone, Handbook of Applied Cryptography, CRC Press, Page 608, 1996.

13. C. Meier and R. Zimmermann, A multiplier modulo 2^n+1, Diploma thesis, Institut fur Integrierte Systeme, ETH, Zurich, Switzerland, 1991.

14. M.P. Leong, O.Y.H. Cheung, K.H. Tsoi, and P.H.W. Leong, “A Bit-Serial Implementation of the International Data Encryption Algorithm IDEA”, In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM, Napa Valley, California, USA, pp. 122-131, 2000.

15. M.M. Mano, Computer System Architecture, 3rd Edition, 1992.

16. A.J. Menezes, P.C. Van Oorschot, and S.A. Vanstone, Handbook of Applied Cryptography, CRC Press, pp. 265-266, 1996.

SLIDE 18


Thank you for your attention
