Lightweight Cryptography and RFID Security, Svetla Nikova



slide-1
SLIDE 1

Lightweight Cryptography and RFID Security

Svetla Nikova
COSIC, KULeuven and UTwente

slide-2
SLIDE 2

Overview

  • Lightweight cryptography - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • TI implementations
  • Area, Power or Throughput?
  • Conclusions
slide-3
SLIDE 3

Lightweight Crypto

Stream ciphers (3): only the eStream finalists:

2005 Grain, Trivium, Mickey

slide-4
SLIDE 4

Lightweight Crypto

Stream ciphers (3): only the eStream finalists:

2005 Grain, Trivium, Mickey

Block ciphers (25):

1977 DES
1989 GOST
1997 XTEA
1998 AES
2005 mCrypton, STEA
2006 Hight, SEA
2007 Clefia, Kasumi, DESL, DESXL, Present
2008 Puffin
2009 Katan, Ktantan, Hummingbird, MIBS
2010 PRINT
2011 Klein, LED, Twine, EPCBC, Vitamin-B, Piccolo

slide-5
SLIDE 5

Lightweight Crypto

Stream ciphers (3): only the eStream finalists:

2005 Grain, Trivium, Mickey

Block ciphers (25):

1977 DES
1989 GOST
1997 XTEA
1998 AES
2005 mCrypton, STEA
2006 Hight, SEA
2007 Clefia, Kasumi, DESL, DESXL, Present
2008 Puffin
2009 Katan, Ktantan, Hummingbird, MIBS
2010 PRINT
2011 Klein, LED, Twine, EPCBC, Vitamin-B, Piccolo

Hash functions (10):

2007 MAME
2008 Squash, DM-Present, H-Present, Keccak
2010 Quark, Armadillo
2011 Spongent, Vitamin-H, Photon

slide-6
SLIDE 6

Overview

  • Lightweight crypto - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • TI implementations
  • Area, Power or Throughput?
  • Conclusions
slide-7
SLIDE 7

Standard vs. Lightweight

Modules          AES-T   Present
Memory [GE]       2040       887
Mix Column [GE]    306         -
S-box [GE]         408        32
FSM + Rest [GE]    646       192
Total Area [GE]   3400      1111
Cycles            1000       547
Power [μA]         3.0      1.34

AES-T (TUG) 2005: 0.35 µm CMOS Technology of Philips, 100 kHz @1.5V, Encryption + Decryption
Present (RUB/DTU/ORANGE) 2007: UMC L180 0.18 μm 1P6M, 100 kHz @1.8V, Encryption only

Difficult to compare:

  • Different technology: GE differs.
  • Power depends even more on the technology used.
  • Here 0.35 µm vs. 0.18 μm is a big technology difference!
  • Cycles do not depend on technology.
  • But AES = 128 bits while Present = 64 bits.
  • Some implementations are encryption only; others include decryption too.
  • Including decryption in AES adds cost in MixColumn and FSM.
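The block-size caveat can be made concrete by normalising the cycle counts per processed bit. The following is a minimal sketch that uses only the cycle counts and block sizes quoted on this slide; the function name is illustrative and not taken from the presentation.

```python
def cycles_per_bit(cycles: int, block_bits: int) -> float:
    """Clock cycles spent per bit of processed data."""
    return cycles / block_bits


# Figures from the slide: AES-T needs 1000 cycles for a 128-bit block,
# Present needs 547 cycles for a 64-bit block.
aes_t = cycles_per_bit(1000, 128)
present = cycles_per_bit(547, 64)

print(f"AES-T:   {aes_t:.2f} cycles/bit")
print(f"Present: {present:.2f} cycles/bit")
```

Normalised this way, the two serialised designs are closer in speed than the raw cycle counts suggest.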

slide-8
SLIDE 8

Standard (hits back) vs. Lightweight

Modules          AES-T   AES-B   Present
Memory [GE]       2040    1678       887
Mix Column [GE]    306     373         -
S-box [GE]         408     233        32
FSM + Rest [GE]    646     317       192
Total Area [GE]   3400    2601      1111
Cycles            1000     226       547
Power [μA]         3.0     3.7      1.34

AES-T: 0.35 µm CMOS Technology of Philips, 100 kHz @1.5V, Encryption + Decryption
Present and AES-B: UMC L180 0.18 μm 1P6M, 100 kHz @1.8V, Encryption only

Fair comparison is now possible.

slide-9
SLIDE 9

Standard (hits back) vs. Lightweight

Standard designs hit back: AES-B (RUB/NTU), 2011

slide-10
SLIDE 10

Standard (hits back) vs. Lightweight

Modules          AES-T   AES-B   Present
Memory [GE]       2040    1678       887
Mix Column [GE]    306     373         -
S-box [GE]         408     233        32
FSM + Rest [GE]    646     317       192
Total Area [GE]   3400    2601      1111
Cycles            1000     226       547
Power [μA]         3.0     3.7      1.34

  • Memory becomes smaller in GE due to the technology change.
  • MixColumns becomes bigger, but this is the trade-off in order to gain more in the FSM.
  • Canright's S-box is used, which is smaller, but not by as much as indicated (again because of the technology change).
  • It is difficult to compare the FSM since AES-T also contains the decryption; still, the AES-B state machine is smaller.

slide-11
SLIDE 11

Standard vs. Lightweight (updated)

Modules          AES-B   Present
Memory [GE]       1678       887
Mix Column [GE]    373         -
S-box [GE]         233        32
FSM + Rest [GE]    317       192
Total Area [GE]   2601      1111
Cycles             226       547
Power [μA]         3.7      1.34

Smaller key and block size

  • 128 bit: too much
  • 80 bit key and 64 bit data: ok
  • 32, 48 bit data might be acceptable?

Key + data [bits]   Relative storage
128 + 128           100 %
80 + 64             56.25 %
80 + 48             50 %
80 + 32             43.75 %

  • Memory share of the total area:
  • 65% for AES-B
  • 80% for Present
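The percentages on this slide follow directly from the bit counts and the area figures. A small sketch, assuming the reference point is the 128 + 128 bit AES state and that the memory share is memory area divided by total area (both readings are inferred from the numbers, and the function names are illustrative):

```python
def relative_storage(key_bits: int, data_bits: int, ref_bits: int = 128 + 128) -> float:
    """Key + data storage as a percentage of the AES reference (128 + 128 bits)."""
    return 100.0 * (key_bits + data_bits) / ref_bits


def memory_share(memory_ge: int, total_ge: int) -> float:
    """Percentage of the total area (in GE) that is spent on memory."""
    return 100.0 * memory_ge / total_ge


for key, data in [(128, 128), (80, 64), (80, 48), (80, 32)]:
    print(f"{key} + {data}: {relative_storage(key, data):.2f} %")

print(f"AES-B memory share:   {memory_share(1678, 2601):.1f} %")
print(f"Present memory share: {memory_share(887, 1111):.1f} %")
```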

slide-12
SLIDE 12

Standard vs. Lightweight (updated)

Modules          AES-B   Present
Memory [GE]       1678       887
Mix Column [GE]    373         -
S-box [GE]         233        32
FSM + Rest [GE]    317       192
Total Area [GE]   2601      1111
Cycles             226       547
Power [μA]         3.7      1.34

  • P-layer costs 0 for Present.
  • Simple FSM can save a lot.
  • An 8x8 S-box costs ~300 GE, or at least 200.
  • A 4x4 S-box costs ~50 GE, or at least 30.
  • Saving of 6 to 7 times in the S-box.
  • Weaker S-box and P-layer are compensated by a larger number of rounds: 31 vs. 10.

slide-13
SLIDE 13

Standard vs. Lightweight (updated)

The lightweight cipher is still less than half the size of the standard one (1111 GE vs. 2601 GE), and its power consumption is roughly 3 times lower (1.34 μA vs. 3.7 μA).

slide-14
SLIDE 14

Overview

  • Lightweight crypto - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • TI implementations
  • Area, Power or Throughput?
  • Conclusions
slide-15
SLIDE 15

Side-Channel Attacks

A device executing the cryptographic algorithm leaks information on its internal state.

Instantaneous leakage depends on intermediate variables, which results in equations

  • that have lower nonlinearity
  • that may contain noise

Power consumption depends on:

  • Instructions executed
  • Data processed

Signal is noisy; multiple measurements are needed.

slide-16
SLIDE 16

SCA countermeasures at different levels

Hardware logic style

  • Relieves cryptographers
  • Places burden on hardware designers

Algorithms and implementations

  • Probably lowest feasible level

Ciphers and Protocols

  • New standards, takes time

slide-17
SLIDE 17

Lightweight SCA protection

Simple masking is vulnerable due to glitches.
Private circuits [Ishai et al.]: too expensive, not a realistic model.
Multi-party computation (TI) made practical.

For a function z = f(x):

1. Correctness: the shares of z sum to z = f(x)
2. Non-completeness: z1 = f1(x1, x2), z2 = f2(x1, x3), z3 = f3(x2, x3)
3. Independent uniform distribution of input

Power consumption of each fi is independent of x1, x2, x3.
Secure in the presence of glitches (transition count model) against 1st order SCA.

slide-18
SLIDE 18

Example: multiplier

  • Multiplier = secure AND gate
  • 3 shares

  • Secure in the presence of glitches
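The slide's figure is not preserved in this transcript, so the following is only a sketch of one well-known 3-share AND that matches the non-completeness pattern used earlier in the deck (f1 sees shares 1 and 2, f2 sees shares 1 and 3, f3 sees shares 2 and 3). The function names are illustrative; the code checks correctness exhaustively and does not test the uniformity of the output sharing.

```python
from itertools import product


def shared_and(x, y):
    """3-share AND gate; x and y are tuples (x1, x2, x3) of single bits."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    z1 = (x1 & y1) ^ (x1 & y2) ^ (x2 & y1)  # uses shares 1 and 2 only
    z2 = (x3 & y3) ^ (x1 & y3) ^ (x3 & y1)  # uses shares 1 and 3 only
    z3 = (x2 & y2) ^ (x2 & y3) ^ (x3 & y2)  # uses shares 2 and 3 only
    return z1, z2, z3


def is_correct():
    """Check z1 ^ z2 ^ z3 == x & y for every possible sharing of x and y."""
    for bits in product((0, 1), repeat=6):
        x, y = bits[:3], bits[3:]
        z1, z2, z3 = shared_and(x, y)
        unshared_x = x[0] ^ x[1] ^ x[2]
        unshared_y = y[0] ^ y[1] ^ y[2]
        if z1 ^ z2 ^ z3 != unshared_x & unshared_y:
            return False
    return True


print(is_correct())
```

Non-completeness is visible by construction: each output share is computed without one of the three input shares, so no single component function ever sees a complete unshared value.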
slide-19
SLIDE 19

Lightweight SCA protection

Protecting Arbitrary Functions:

  • Multiplication of n elements needs at least n+1 shares.
  • Hardware size increases about quadratically with the number of shares.
  • Can we reduce the number of shares?
  • Hence we can apply 3 shares only to quadratic functions.

Pipelining:

  • Registers are insensitive to glitches.
  • Split functions into parts with less non-linearity.
  • Use registers between combinatorial parts.

Problem:

  • Property 3: the inputs of each step need to be independent and uniformly distributed.
  • Pipelining: the output of each step is the input of the next step.
  • We need Property 3 for the output as well.

slide-20
SLIDE 20

Lightweight SCA protection

Which functions can we protect?

The number of shares depends on the degree of the function. Hence we can apply 3 shares only to quadratic functions.

  • The multiplications in GF(2) (AND gate) and GF(4).
  • The Boolean functions with 2 and 3 inputs.
  • Noekeon (KUL) 2000, S-box.

S(x) = NL(L(NL(x))), pipelined implementation
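To see how the degree drives the share count, the algebraic degree of a small function can be computed from its truth table. This is a sketch, not part of the presentation: it uses the binary Moebius transform to obtain the algebraic normal form, takes the Present S-box table from the PRESENT specification as a test input, and applies the "degree + 1 shares" lower bound stated on the previous slide. All function names are illustrative.

```python
# Present S-box lookup table (from the PRESENT specification).
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def anf(truth_table):
    """Algebraic normal form coefficients via the binary Moebius transform."""
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        for j in range(len(coeffs)):
            if j & (1 << i):
                coeffs[j] ^= coeffs[j ^ (1 << i)]
    return coeffs


def degree(truth_table):
    """Algebraic degree: most variables appearing in any ANF monomial."""
    return max((bin(m).count("1") for m, c in enumerate(anf(truth_table)) if c),
               default=0)


def sbox_degree(sbox, out_bits=4):
    """Degree of an S-box: maximum degree over its output bits."""
    return max(degree([(y >> b) & 1 for y in sbox]) for b in range(out_bits))


def min_shares(deg):
    """Lower bound from the slides: a degree-d function needs at least d + 1 shares."""
    return deg + 1


and_gate = [0, 0, 0, 1]  # truth table of x0 AND x1
print("AND gate:", degree(and_gate), "-> shares:", min_shares(degree(and_gate)))
print("Present S-box:", sbox_degree(PRESENT_SBOX),
      "-> shares:", min_shares(sbox_degree(PRESENT_SBOX)))
```

The Present S-box is cubic, so a direct sharing would need at least 4 shares; decomposing it into quadratic steps, as done for Noekeon with S(x) = NL(L(NL(x))), is what brings it back to 3.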

slide-21
SLIDE 21

Noekeon Implementation Results

  • Implementation using Austria Microsystems Standard Cell Library, CMOS 0.35 μm
  • S-Box:
      • 54 GE (implementation of 2 quadratic mappings)
      • correlation
  • Protected S-Box:
      • 188 GE (excluding 12 bit register)
      • no correlation between shares and unshared values
  • Less than 4x increase (actually 3.5x) in size. Note: nonlinear part only.

slide-22
SLIDE 22

Noekeon Implementation Results

  • A 4x4 S-box costs ~50 GE, or at least 30 GE, but the S-box of Noekeon is 54 GE when decomposed into two quadratic mappings.
  • Since the shared mappings are less efficient than the originals, we get a slightly larger increase (3.5x) instead of the theoretically expected 3x.

slide-23
SLIDE 23

Lightweight SCA protection

Which 4x4 S-boxes can we protect?

A. Poschmann et al. 2010: the Present S-box can also be decomposed. Hence, similar to Noekeon, Present can be shared with 3 shares only.

There are 302 affine-equivalent classes for the 4 x 4 bijections: 295 cubic classes, 6 quadratic classes and 1 affine class.

Bijections (permutations) in GF(2)^4 belong to the symmetric group S16.

  • Theorem. A 4 x 4 bijection can be decomposed using quadratic bijections if and only if it belongs to the alternating group A16 (151 classes).

slide-24
SLIDE 24

Lightweight SCA protection

Which 4x4 S-boxes can we protect?

There are 302 affine-equivalent classes for the 4 x 4 bijections: 295 cubic classes, 6 quadratic classes and 1 affine class.

Bijections (permutations) in GF(2)^4 belong to the symmetric group S16.

  • Theorem. A 4 x 4 bijection can be decomposed using quadratic bijections if and only if it belongs to the alternating group A16 (151 classes).

Hence there are 302/2 - 6 - 1 = 144 cubic classes in A16 which can be decomposed.

  • 30 classes can be decomposed with length 2,
  • the remaining 114 classes can be decomposed with length 3.

Thus 144 classes can be masked using only 3 shares.

Decomposable S-boxes: Noekeon; Present; Serpent 0, 1, 2, 6; Khazad P, Q.
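The criterion in the theorem is easy to test for a concrete S-box, because membership in A16 is just the parity of the permutation. A minimal sketch, assuming the Present S-box table from the PRESENT specification; the helper name is illustrative, and the code checks only the parity condition, not an actual decomposition.

```python
# Present S-box lookup table (from the PRESENT specification).
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def is_even_permutation(perm):
    """True if perm (a list mapping i -> perm[i]) lies in the alternating group."""
    seen = [False] * len(perm)
    transpositions = 0
    for start in range(len(perm)):
        length = 0
        i = start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        if length:
            # A cycle of length k is a product of k - 1 transpositions.
            transpositions += length - 1
    return transpositions % 2 == 0


print("Present S-box in A16:", is_even_permutation(PRESENT_SBOX))

# Class counts quoted on the slide.
cubic_in_a16 = 302 // 2 - 6 - 1
print("Decomposable cubic classes:", cubic_in_a16)
```

The Present S-box is an even permutation, which is consistent with its appearance in the list of decomposable S-boxes.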

slide-25
SLIDE 25

Overview

  • Lightweight crypto - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • TI implementations
  • Area, Power or Throughput?
  • Conclusion
slide-26
SLIDE 26

PRESENT ‐ Implementation Results

A. Poschmann et al. 2010
UMC L180 0.18 μm 1P6M, 100 kHz @1.8V

Modules          Present   Present TI   Ratio
Memory [GE]          887         2635    300%
Mix Column [GE]        -            -       -
S-box [GE]            32          355     11x
FSM + Rest [GE]      192          592    308%
Total Area [GE]     1111         3582    322%
Cycles               547          578    106%
Power [μA]          1.34         5.02    375%

  • Memory: 80 + 64 bits. Shared: 3x increase.
  • Efficient S-box: only 32 GE. Shared: 8.8x increase + 12 bit register (pipelined).
  • The FSM increases 3 times. Pipelining increases the cycles and, slightly, the control.
  • In total the increase is ~3x.
  • Cycles: small increase only.
  • But the power increases ~4x.

slide-27
SLIDE 27

Lightweight SCA protection

Can we protect the AES 8x8 S-box, or only 4x4 S-boxes?

  • Nonlinear part = inversion over GF(256)
  • Tower field approach
  • Need to ensure Property 3 in every step
  • No efficient method
  • Large search space
  • Ongoing research to make it efficient.

slide-28
SLIDE 28

Lightweight SCA protection

Can we protect the AES S-box?

Remember we have the sharing of the multiplications in GF(4), and this multiplication is the only non-linear operation in the AES (Canright) S-box.

The S-box is transformed from GF(2^8) to GF(2^8)/GF(2^4)/GF(2^2): tower field approach.

RUB/NTU 2011 [MPLPW2011]

slide-29
SLIDE 29

Lightweight SCA protection

Can we protect the AES S-box?

The multiplication in GF(4) is the only non-linear operation in the AES S-box.

Recall that our countermeasure requires registers between different stages of shared functions. Thus Canright's S-box representation requires five pipelining stages in total. This implies that in total one needs to store 174 bits.

slide-30
SLIDE 30

AES Implementation Results

RUB/NTU 2011 [MPLPW2011]
UMC L180 0.18 μm 1P6M, 100 kHz @1.8V

Modules          AES-B   AES TI   Ratio
Memory [GE]       1678     5055    300%
Mix Column [GE]    373     1120    300%
S-box [GE]         233     4244     18x
FSM + Rest [GE]    317      695    219%
Total Area [GE]   2601    11114    427%
Cycles             226      266    118%
Power [μA]         3.7     13.4    362%

  • Memory: 2 x 128 bits. Shared: 3x increase.
  • Complex S-box: only 233 GE. Shared: 13.7x increase + 174 bit register (pipelined).
  • The FSM increases only 2 times.
  • In total the increase is ~4x.
  • Cycles: small increase only.
  • But the power increases ~4x.

slide-31
SLIDE 31

Threshold Implementation [MPLPW2011]

  • Present TI: first-order DPA fails with 5 million measurements (data masking, key masking, random data and key permutations).
  • AES TI: with 5 million traces a correlation collision attack succeeds because uniformity fails; resharing is required.
  • With resharing, 100 million traces are still insufficient for CPA using an HD model and for MIA using an HD model; even third-order CPA with 400 million traces fails.

slide-32
SLIDE 32

Threshold Implementation Results

UMC L180 0.18 μm 1P6M, 100 kHz @1.8V

Modules          AES-B   AES TI     Ratio   Present   Present TI     Ratio
Memory [GE]       1678     5055      300%       887         2635      300%
Mix Column [GE]    373     1120      300%         -            -         -
S-box [GE]         233     4244   1821.5%        32          355   1109.4%
FSM + Rest [GE]    317      695      219%       192          592      308%
Total Area [GE]   2601    11114      427%      1111         3582      322%
Cycles             226      266      118%       547          578      106%
Power [μA]         3.7     13.4      362%      1.34         5.02      375%
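The ratio columns can be reproduced from the raw figures. A short sketch using the total area, cycle and power values from the table; the dictionary layout and function name are illustrative only.

```python
# (unprotected, threshold implementation) pairs taken from the table.
RESULTS = {
    "AES": {
        "Total Area [GE]": (2601, 11114),
        "Cycles": (226, 266),
        "Power [uA]": (3.7, 13.4),
    },
    "Present": {
        "Total Area [GE]": (1111, 3582),
        "Cycles": (547, 578),
        "Power [uA]": (1.34, 5.02),
    },
}


def overhead_percent(base, protected):
    """Protected figure as a percentage of the unprotected one."""
    return 100.0 * protected / base


for cipher, rows in RESULTS.items():
    for metric, (base, protected) in rows.items():
        print(f"{cipher:8s} {metric:16s} {overhead_percent(base, protected):6.0f} %")
```

Both ciphers pay roughly the same relative price in power, while the area penalty is clearly larger for AES because of its 8x8 S-box.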

slide-33
SLIDE 33

Overview

  • Lightweight crypto - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • Comparing different TI implementations


  • Area, Power or Throughput?
  • Conclusions
slide-34
SLIDE 34

Area, Power or Throughput

Technology: NXP 0.140 µm. Power consumption measured @ 1 MHz, 1.2V.

Cipher      Area [GE]   Power [µW]   Throughput [bit/cycle]
AES-T            3162         5.95                     0.12
Present          1173         3.45                     0.12
Present          1598         5.56                     2.06
Katan 64          984         7.62                     0.50
Katan 64         1102         8.63                     0.75
Grain             861         7.40                     1.00
Trivium          1298        12.02                     1.00
Crypto 1          306         2.57                     1.00

The two Present rows and the two Katan 64 rows are separate entries with different figures.

[Charts: Area, Power and Throughput per cipher for AES (Tina), Present, Present, Katan64, Katan64, Grain, Trivium, Crypto 1]
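One way to weigh power against throughput is energy per processed bit. This metric does not appear on the slide; it is an assumption introduced here for illustration, computed as power divided by (clock frequency x throughput). With power in µW and frequency in MHz the result comes out in pJ/bit. The variant labels in the table keys are also illustrative.

```python
# (area [GE], power [uW] at 1 MHz, throughput [bit/cycle]) from the table.
CIPHERS = {
    "AES-T": (3162, 5.95, 0.12),
    "Present (1173 GE)": (1173, 3.45, 0.12),
    "Present (1598 GE)": (1598, 5.56, 2.06),
    "Katan 64 (984 GE)": (984, 7.62, 0.50),
    "Katan 64 (1102 GE)": (1102, 8.63, 0.75),
    "Grain": (861, 7.40, 1.00),
    "Trivium": (1298, 12.02, 1.00),
    "Crypto 1": (306, 2.57, 1.00),
}


def energy_per_bit_pj(power_uw, bits_per_cycle, freq_mhz=1.0):
    """Energy per bit in pJ: uW / (MHz * bit/cycle) = pJ/bit."""
    return power_uw / (freq_mhz * bits_per_cycle)


for name, (_area, power, throughput) in CIPHERS.items():
    print(f"{name:20s} {energy_per_bit_pj(power, throughput):7.2f} pJ/bit")
```

Under this assumed metric the ranking differs from the ranking by area alone, which is the point the slide title raises.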

slide-35
SLIDE 35

Power and Throughput

Road tolling example: a car passing at high speed should authenticate with an antenna/reader at a certain (height) distance.

Requirements: Distance < 10-12 m; Time < 10 ms.

Why is power so important?

  • In that "extreme" example the power consumption is more important than the area.
  • The excess of power can be used to improve the throughput.

Can we do crypto on RFID 12 meters away?
Can we authenticate a tag in such a short time from so far away?

slide-36
SLIDE 36

Power and Throughput

Toll example requirements: Distance < 10-12 m; Time < 10 ms.

So we can not only do crypto, but we can also perform a full authentication, even with an SCA-protected lightweight implementation.

Cipher / Authentication   Time at fixed distance 10 m [ms]   Distance for fixed time 10 ms [m]
AES-T                                                    6                                  12
AES TI                                                  23                                   7
Present                                                  2                                  20
Present TI                                               8                                  11

[Charts: Distance for Fixed Time 10 ms; Time at Fixed Distance 10 m; for AES, AES TI, Present, Present TI]
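The toll requirement can be checked mechanically against these figures. A sketch, assuming the values as read from the slide's table and charts (time at a fixed distance of 10 m, distance for a fixed time of 10 ms); the names and the exact threshold are illustrative.

```python
# (time [ms] at 10 m, distance [m] within 10 ms) as read from the slide.
AUTHENTICATION = {
    "AES-T": (6, 12),
    "AES TI": (23, 7),
    "Present": (2, 20),
    "Present TI": (8, 11),
}

MAX_TIME_MS = 10  # toll requirement: authenticate in under 10 ms


def meets_requirement(time_at_10m_ms):
    """True if authentication at 10 m finishes within the time budget."""
    return time_at_10m_ms < MAX_TIME_MS


meets = [name for name, (t, _d) in AUTHENTICATION.items() if meets_requirement(t)]
print("Meet the 10 ms budget at 10 m:", meets)
```

With these figures the protected Present implementation stays within the budget while the protected AES does not, which matches the slide's statement about SCA-protected lightweight implementations.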

slide-37
SLIDE 37

Overview

  • Lightweight crypto - state of the art
  • Comparison Standard vs. Lightweight


  • SCA countermeasures - TI approach
  • TI implementations
  • Lightweight Area, Power or Throughput?
  • Conclusions
slide-38
SLIDE 38

Conclusions

  • Young and challenging research area
  • Already many interesting lightweight designs available
  • New lightweight primitives should be designed with SCA protection in mind
  • The semiconductor industry shows interest in implementing lightweight primitives with SCA countermeasures
  • Research should focus on all parameters, not only on area
slide-39
SLIDE 39

Thank you!