 
Lightweight Cryptography and RFID Security
Svetla Nikova
COSIC, KULeuven and UTwente
Overview
• Lightweight cryptography - state of the art
• Comparison Standard vs. Lightweight
• SCA countermeasures - TI approach
• TI implementations
• Area, Power or Throughput?
• Conclusions
Lightweight Crypto
Stream ciphers (3): only the eStream finalists
  2005  Grain, Trivium, Mickey
Block ciphers (25):
  1977  DES
  1989  GOST
  1997  XTEA
  1998  AES
  2005  mCrypton, STEA
  2006  Hight, SEA
  2007  Clefia, Kasumi, DESL, DESXL, Present
  2008  Puffin
  2009  Katan, Ktantan, Hummingbird, MIBS
  2010  PRINT
  2011  Klein, LED, Twine, EPCBC, Vitamin-B, Piccolo
Hash functions (10):
  2007  MAME
  2008  Squash, DM-Present, H-Present, Keccak
  2010  Quark, Armadillo
  2011  Spongent, Vitamin-H, Photon
Standard vs. Lightweight

AES-T (TUG), 2005: 0.35 µm CMOS technology of Philips, 100 kHz @ 1.5 V, encryption + decryption
Present (RUB/DTU/ORANGE), 2007: UMC L180 0.18 µm 1P6M, 100 kHz @ 1.8 V, encryption only

Modules           AES-T   Present
Memory [GE]        2040       887
Mix Column [GE]     306         0
S-box [GE]          408        32
FSM + Rest [GE]     646       192
Total Area [GE]    3400      1111
Cycles             1000       547
Power [µA]          3.0      1.34

Difficult to compare:
• Different technology: the GE differs.
• Power depends even more on the technology used.
• Here 0.35 µm vs. 0.18 µm is a big technology difference!
• Cycles do not depend on technology.
• But AES has a 128-bit block while Present has a 64-bit block.
• Some implementations are encryption only, others include decryption too.
• Including decryption in AES adds cost in MixColumn and FSM.
Standard (hits back) vs. Lightweight

AES-T: 0.35 µm CMOS technology of Philips, 100 kHz @ 1.5 V, encryption + decryption
Present and AES-B: UMC L180 0.18 µm 1P6M, 100 kHz @ 1.8 V, encryption only

Modules           AES-T   AES-B   Present
Memory [GE]        2040    1678       887
Mix Column [GE]     306     373         0
S-box [GE]          408     233        32
FSM + Rest [GE]     646     317       192
Total Area [GE]    3400    2601      1111
Cycles             1000     226       547
Power [µA]          3.0     3.7      1.34

A fair comparison between AES-B and Present is now possible.
Standard (hits back) vs. Lightweight
Standard designs hit back: AES-B (RUB/NTU), 2011
Standard (hits back) vs. Lightweight

Modules           AES-T   AES-B   Present
Memory [GE]        2040    1678       887
Mix Column [GE]     306     373         0
S-box [GE]          408     233        32
FSM + Rest [GE]     646     317       192
Total Area [GE]    3400    2601      1111
Cycles             1000     226       547
Power [µA]          3.0     3.7      1.34

• Memory becomes smaller in GE due to the technology change.
• MixColumns becomes bigger, but this is the trade-off made in order to gain more in the FSM.
• Canright's S-box is used, which is smaller, but not by as much as indicated (again because of the technology change).
• It is difficult to compare the FSM since AES-T also contains the decryption; still, the AES-B state machine is smaller.
Standard vs. Lightweight (updated)

Modules           AES-B   Present
Memory [GE]        1678       887
Mix Column [GE]     373         0
S-box [GE]          233        32
FSM + Rest [GE]     317       192
Total Area [GE]    2601      1111
Cycles              226       547
Power [µA]          3.7      1.34

Smaller key and block size
• 128 bit: too much
• 80-bit key and 64-bit data: ok
• 32- or 48-bit data might be acceptable?

Key + data bits   Relative state size
128 + 128         100 %
80 + 64           56.25 %
80 + 48           50 %
80 + 32           43.75 %

• Memory share of the total area
  - 65% for AES-B
  - 80% for Present
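The percentages on this slide can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function names `state_share` and `memory_share` are hypothetical, and it assumes the 128 + 128 bit configuration is the 100 % reference and that the memory share is memory area divided by total area from the table.

```python
def state_share(key_bits, block_bits, ref_bits=128 + 128):
    """Key + data storage as a percentage of the 128 + 128 bit reference."""
    return 100.0 * (key_bits + block_bits) / ref_bits


def memory_share(memory_ge, total_ge):
    """Memory area as a percentage of the total area (both in GE)."""
    return 100.0 * memory_ge / total_ge


if __name__ == "__main__":
    for key_bits, block_bits in [(128, 128), (80, 64), (80, 48), (80, 32)]:
        print(key_bits, block_bits, state_share(key_bits, block_bits))
    # Table values: AES-B memory 1678 of 2601 GE, Present memory 887 of 1111 GE.
    print(round(memory_share(1678, 2601)), round(memory_share(887, 1111)))
```

The rounded memory shares come out at 65 and 80, matching the figures quoted for AES-B and Present.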
Standard vs. Lightweight (updated)

Modules           AES-B   Present
Memory [GE]        1678       887
Mix Column [GE]     373         0
S-box [GE]          233        32
FSM + Rest [GE]     317       192
Total Area [GE]    2601      1111
Cycles              226       547
Power [µA]          3.7      1.34

• The P-layer costs 0 GE for Present.
• A simple FSM can save a lot.
• An 8x8 S-box costs ~300 GE, or at least 200 GE.
• A 4x4 S-box costs ~50 GE, or at least 30 GE.
• Saving of 6 to 7 times in the S-box.
• The weaker S-box and P-layer are compensated by a larger number of rounds: 31 vs. 10.
Standard vs. Lightweight (updated)
The lightweight cipher is still less than half the size of AES-B (1111 GE vs. 2601 GE), and its power consumption is roughly 3 times lower (1.34 µA vs. 3.7 µA).
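As a quick check of the two ratios behind this conclusion, the following sketch divides the AES-B figures by the Present figures from the table. The helper name `ratio` is hypothetical; the inputs are the table values for total area and power.

```python
def ratio(standard, lightweight):
    """How many times larger the standard figure is than the lightweight one."""
    return standard / lightweight


# Total area in GE and power in µA, AES-B vs. Present, taken from the table.
area_ratio = ratio(2601, 1111)
power_ratio = ratio(3.7, 1.34)

if __name__ == "__main__":
    print(f"area:  AES-B is {area_ratio:.2f}x Present")
    print(f"power: AES-B is {power_ratio:.2f}x Present")
```

The area ratio lands a little above 2.3 and the power ratio a little below 2.8, which is what the slide rounds to "more than twice" and "~3 times".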
Side-Channel Attacks
• A device executing the cryptographic algorithm leaks information on its internal state.
• Instantaneous leakage depends on intermediate variables, which results in equations
  - that have lower nonlinearity
  - that may contain noise
• Power consumption depends on:
  - instructions executed
  - data processed
• The signal is noisy; multiple measurements are needed.
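To illustrate why a data-dependent but noisy signal calls for multiple measurements, here is a minimal simulation sketch. It assumes a Hamming-weight leakage model with additive Gaussian noise, which is a common textbook simplification and not something stated on the slide; the function names, the noise level `sigma` and the fixed seed are all hypothetical choices.

```python
import random


def hamming_weight(value):
    """Number of set bits in the processed value."""
    return bin(value).count("1")


def noisy_trace(value, sigma, rng):
    """One simulated power sample: Hamming weight plus Gaussian noise (assumed model)."""
    return hamming_weight(value) + rng.gauss(0.0, sigma)


def averaged_leakage(value, n_traces, sigma=2.0, seed=0):
    """Average n_traces simulated samples; the noise shrinks roughly as 1/sqrt(n_traces)."""
    rng = random.Random(seed)
    total = sum(noisy_trace(value, sigma, rng) for _ in range(n_traces))
    return total / n_traces


if __name__ == "__main__":
    for n in (1, 100, 10000):
        print(n, averaged_leakage(0xFF, n), averaged_leakage(0x00, n))
```

With a single trace the two values are hard to tell apart under this noise level; after averaging many traces the estimates settle near 8 and 0, the Hamming weights of the two bytes.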
SCA countermeasures at different levels
• Hardware logic style
  - Relieves cryptographers
  - Places burden on hardware designers
• Algorithms and implementations
  - Probably lowest feasible level
• Ciphers and protocols
  - New standards, takes time
Lightweight SCA protection
• Simple masking schemes are vulnerable due to glitches.
• Private circuits [Ishai et al.]: too expensive, not a realistic model.
• Multi-party computation made practical: threshold implementation (TI), with three properties:
  1. Correctness: z = f(x)
  2. Non-completeness: z_1 = f_1(x_1, x_2), z_2 = f_2(x_1, x_3), z_3 = f_3(x_2, x_3)
  3. Independent uniform distribution of the input shares
• Each f_i sees only two of the three shares x_1, x_2, x_3, so its power consumption is independent of the unshared value x.
• Secure in the presence of glitches (transition count model) against 1st-order SCA.
Example: multiplier
• Multiplier = secure AND gate
• 3 shares
• Secure in the presence of glitches
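The circuit figure for this slide is not recoverable, so the following is only a sketch of one well-known way to share an AND gate over three shares. The assignment of cross products to outputs is an assumption chosen to match the indexing z_1 = f_1(x_1, x_2), z_2 = f_2(x_1, x_3), z_3 = f_3(x_2, x_3); the names `shared_and`, `is_correct` and `is_non_complete` are hypothetical. The sketch checks properties 1 and 2 exhaustively and makes no claim about property 3.

```python
from itertools import product


def shared_and(x, y):
    """Three-share AND: each output share uses only two of the three input shares."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    z1 = (x1 & y1) ^ (x1 & y2) ^ (x2 & y1)  # uses shares 1 and 2 only
    z2 = (x3 & y3) ^ (x1 & y3) ^ (x3 & y1)  # uses shares 1 and 3 only
    z3 = (x2 & y2) ^ (x2 & y3) ^ (x3 & y2)  # uses shares 2 and 3 only
    return z1, z2, z3


def unshare(s):
    """Recombine three Boolean shares by XOR."""
    return s[0] ^ s[1] ^ s[2]


def all_sharings():
    """All 8 x 8 combinations of three-bit sharings of x and y."""
    bits = list(product((0, 1), repeat=3))
    return product(bits, bits)


def is_correct():
    """Property 1: the output shares recombine to x AND y for every sharing."""
    return all(unshare(shared_and(x, y)) == (unshare(x) & unshare(y))
               for x, y in all_sharings())


# Index of the input share that each output share must not depend on.
MISSING = (2, 1, 0)


def is_non_complete():
    """Property 2: flipping the excluded input share never changes that output share."""
    for x, y in all_sharings():
        z = shared_and(x, y)
        for i, m in enumerate(MISSING):
            xf, yf = list(x), list(y)
            xf[m] ^= 1
            yf[m] ^= 1
            if shared_and(xf, y)[i] != z[i] or shared_and(x, yf)[i] != z[i]:
                return False
    return True


if __name__ == "__main__":
    print(is_correct(), is_non_complete())
```

Because the nine cross products x_i y_j are split so that no output share touches all three share indices, a glitch inside any single f_i can only mix values from two shares.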
Lightweight SCA protection
Protecting arbitrary functions:
• Multiplication of n elements needs at least n+1 shares.
• Hardware size increases about quadratically with the number of shares.
• Can we reduce the number of shares?
• Hence 3 shares can be applied only to quadratic functions.
Pipelining:
• Registers are insensitive to glitches.
• Split functions into parts with less nonlinearity.
• Use registers between combinatorial parts.
Problem:
• Property 3: the inputs of each step need to be independent and uniformly distributed.
• Pipelining: the output of each step is the input of the next step.
• We need Property 3 for the output as well.
Lightweight SCA protection
Which functions can we protect?
The number of shares depends on the degree of the function. Hence 3 shares can be applied only to quadratic functions:
• The multiplications in GF(2) (AND gate) and GF(4).
• The Boolean functions with 2 and 3 inputs.
• Noekeon (KUL) 2000, S-box: S(x) = NL(L(NL(x))), pipelined implementation.
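Since the share count is tied to the degree of the function, a small helper can compute the algebraic degree of a Boolean function from its truth table and apply the "degree + 1" lower bound. This is a sketch under assumptions: the truth table is indexed so that bit k of the index is input k, the degree is read off the algebraic normal form obtained with the binary Möbius transform, and the names `anf`, `algebraic_degree` and `min_shares` are hypothetical.

```python
def anf(truth_table):
    """Algebraic normal form coefficients via the binary Moebius transform."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1  # number of input bits (table length is 2**n)
    for i in range(n):
        step = 1 << i
        for j in range(len(coeffs)):
            if j & step:
                coeffs[j] ^= coeffs[j ^ step]
    return coeffs


def algebraic_degree(truth_table):
    """Largest number of variables in any monomial of the ANF (0 for constants)."""
    coeffs = anf(truth_table)
    return max((bin(m).count("1") for m, c in enumerate(coeffs) if c), default=0)


def min_shares(truth_table):
    """Lower bound on the number of shares: degree + 1."""
    return algebraic_degree(truth_table) + 1


if __name__ == "__main__":
    and2 = [0, 0, 0, 1]       # x0 AND x1
    and3 = [0] * 7 + [1]      # x0 AND x1 AND x2
    print(min_shares(and2), min_shares(and3))
```

For the two-input AND the bound gives 3 shares, while the three-input AND has degree 3 and would need at least 4, which is why splitting a function into quadratic stages with registers in between keeps the share count at 3.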
Noekeon Implementation Results
• Implementation using the Austria Microsystems standard cell library, CMOS 0.35 µm
• S-box:
  - 54 GE (implementation of 2 quadratic mappings)
  - correlation
• Protected S-box:
  - 188 GE (excluding the 12-bit register)
  - no correlation between shares and unshared values
• Less than 4x increase in size (actually 3.5x)
Note: nonlinear part only.
Noekeon Implementation Results
• A 4x4 S-box costs ~50 GE, or at least 30 GE, but the S-box of Noekeon is 54 GE when decomposed into two quadratic mappings.
• Since the shared mappings are less efficient than the originals, the increase is slightly more than the theoretically expected 3x: about 3.5x (188 GE vs. 54 GE).