Better than Brute-Force: Optimized Hardware Architecture for Efficient Biclique Attacks on AES-128

SLIDE 1

Andrey Bogdanov*, Elif Bilge Kavun**, Christof Paar**, Christian Rechberger***, Tolga Yalcin**

* KU Leuven, Belgium, ** HGI-RUB, Germany, *** DTU, Denmark

Better than Brute-Force

Optimized Hardware Architecture for Efficient Biclique Attacks on AES-128

SLIDE 2
Overview

  • Meet-in-the-Middle with Bicliques
  • Low Data Complexity Biclique Cryptanalysis of AES-128
  • Optimized Brute Force Attack on AES-128
    – on FPGA
    – on ASIC
  • Biclique Attack on AES-128
    – on FPGA
    – on ASIC
  • Conclusion

SLIDE 3

MITM with Bicliques

  • Allow all key bits to affect a part of the cipher
  • Stick to a structure to enable efficient enumeration of keys and states in this part
  • Structure = biclique!

[Figure: MITM with a biclique. Labels: plaintexts {P}, ciphertexts {C}, encryption oracle, key parts K1 and K2, all key bits.]

SLIDE 4

MITM with Bicliques

SLIDE 5
Low Data Complexity Biclique Cryptanalysis of AES-128

  • Start modifications in the first round of AES-128
  • Divide the entire space of $2^{128}$ keys into $2^{124}$ non-overlapping groups of $2^4$ keys
  • Fix a base key and enumerate all other keys in the key group

$x = (x_0 x_1 x_2 x_3 x_4 x_5 0 0)_2$, $a = (0 0 0 0 0 0 a_1 a_0)_2$
$y = (y_0 y_1 y_2 y_3 y_4 y_5 0 0)_2$, $b = (0 0 0 0 0 0 b_1 b_0)_2$

  • Modify the base key at two byte positions independently (in $2^2$ ways each)
  • Follow propagation of modifications forwards and backwards
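The key-group construction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the byte positions `POS_A` and `POS_B` are hypothetical placeholders (the slide does not say which two key bytes are modified), the helper name `key_group` is invented here, and the two free bits are taken to be the low-order bits of each byte, following the binary notation for a and b.

```python
from itertools import product

# Hypothetical byte positions: the slide does not name the two modified bytes.
POS_A, POS_B = 0, 1


def key_group(base_key: bytes, pos_a: int = POS_A, pos_b: int = POS_B) -> list[bytes]:
    """Enumerate the 2^4 keys that share one base key.

    The base key must have the two low bits of both chosen bytes cleared;
    a and b (2 bits each) then fill those bits independently.
    """
    if len(base_key) != 16:
        raise ValueError("AES-128 keys are 16 bytes long")
    if base_key[pos_a] & 0b11 or base_key[pos_b] & 0b11:
        raise ValueError("base key must have the two low bits of both bytes cleared")
    keys = []
    for a, b in product(range(4), repeat=2):  # 2^2 values for a, 2^2 for b
        key = bytearray(base_key)
        key[pos_a] ^= a
        key[pos_b] ^= b
        keys.append(bytes(key))
    return keys


# Arbitrary example base key: 0xA4 and 0x38 both end in two zero bits.
base = bytes([0xA4, 0x38] + [0x11] * 14)
group = key_group(base)
print(len(group), len(set(group)))
```

Every 128-bit key falls into exactly one such group (clear the four free bits to obtain its base key), which is why $2^{128} / 2^4 = 2^{124}$ groups cover the key space without overlap.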

SLIDE 6

Low Data Complexity Biclique Cryptanalysis of AES-128

SLIDE 7

Low Data Complexity Biclique Cryptanalysis of AES-128

Recomputation at matching

SLIDE 8

Low Data Complexity Biclique Cryptanalysis of AES-128

Complexities:

  • Computational complexity to precompute all states Sa and Sb in each key group: 0.3 AES-128 runs (first step).
  • About 7.12 AES-128 runs to test all 16 keys in the key group (second step).
  • Negligible computational complexity ($2^{-32}$) for false positives.
  • Overall computational complexity: $2^{124}(0.3 + 7.12) = 2^{126.89}$ AES executions.
  • Data complexity: only 16 chosen plaintexts!
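The overall figure can be checked with a short calculation. The sketch below only re-derives the number from the per-group costs quoted on this slide; the constant and function names are made up for the illustration, and the comparison baseline of $2^{128}$ AES executions for exhaustive search is an assumption added here, not a figure from the slide.

```python
import math

KEY_GROUPS_LOG2 = 124   # 2^124 key groups
PRECOMPUTE_COST = 0.3   # AES-128 runs per key group (first step)
TEST_COST = 7.12        # AES-128 runs per key group (second step)
KEYS_PER_GROUP = 16


def biclique_cost_log2() -> float:
    """log2 of the total number of AES-128 executions over all key groups."""
    return KEY_GROUPS_LOG2 + math.log2(PRECOMPUTE_COST + TEST_COST)


total_log2 = biclique_cost_log2()

# Average cost of testing one key, in AES-128 executions.
cost_per_key = (PRECOMPUTE_COST + TEST_COST) / KEYS_PER_GROUP

# Assumed baseline: exhaustive search costs 2^128 AES-128 executions.
advantage = 2 ** (128 - total_log2)

print(f"total: 2^{total_log2:.2f} AES executions")
print(f"per key: {cost_per_key:.3f} AES executions")
print(f"advantage over exhaustive search: {advantage:.2f}x")
```

Under these assumptions each key costs a little under half an AES execution on average, which is where the roughly twofold gain over exhaustive search comes from.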

SLIDE 9

Implementation

  • FPGA target platform: RIVYERA Computing Cluster
    – 128 Xilinx Spartan-3 XC3S5000 high-performance FPGAs
    – Equivalent computing power of 640 million system gates
  • ASIC target technology: NANGATE 45 nm Generic Library

SLIDE 10

Optimized Brute-Force Attack on AES-128

  • Highly pipelined architecture for highest possible speed (11-stage pipeline within each AES round)
  • Composite field inverters over $GF(((2^2)^2)^2)$ for S-boxes
  • Register-based (RAMless) design
    – suitable for both FPGA and ASIC implementation

[Figure: brute-force pipeline. Labels: Fixed Plaintext, Key Gen, Round-1 to Round-10 with states S1 to S10 and round keys K1 to K9, ORACLE, Byte Match, "= FF ?", Output.]

SLIDE 11

Optimized Brute-Force Attack on AES-128

  • Design implemented in two flavors:
    – All identical rounds (for a fair comparison with respect to the original biclique advantage figures)
    – Partial matching in the last three rounds (for better area utilization; makes no difference for FPGA)
  • Smaller and faster than the fastest reported design (362 KGE vs 660 KGE and 2.5 GHz vs 2 GHz)

[Figure: brute-force pipeline, same block diagram as on slide 10.]

SLIDE 12

Optimized Brute-Force Attack on AES-128

* Pipeline register cost negligible for FPGA implementation – already part of the slice!

SLIDE 13

Optimized Brute-Force Attack on AES-128

[Figure: composite-field S-box inverter. Labels: Input, nibbles [7:4] and [3:0], four $GF((2^2)^2)$ multipliers, P, $GF((2^2)^2)$ inverter, Output.]

SLIDE 14

Optimized Brute-Force Attack on AES-128

FPGA Performance

  • Slice utilization: 26949 / 33278
  • FPGA utilization: 80.98 %
  • Maximum frequency: 263.16 MHz
  • Keys tested/sec/FPGA: $526 \times 10^6$

ASIC Performance

  • Core area: 362181 GE
  • Maximum frequency: 2480 MHz
  • Average power: 622.937 mW
  • Keys tested/mW: $3.98 \times 10^6$

SLIDE 15

Biclique Attack on AES-128

Starting Point: Conceptual design

  • Maps theory one-to-one to implementation
  • Based on precomputation of all base and biclique states
  • Not feasible for hardware implementation
    – Requires too many RAMs
    – Interconnection and control logic too complex to allow an area- and speed-efficient design

[Figure: conceptual design. Labels: Key RAM(s), State RAM(s), Plaintext Memory, Plaintext + Key, S, MixColumns, Round-4 to Round-10 with states S3 to S10 and round keys K3 to K9, ORACLE, Match; legend: Regular (Full) Rounds, Partial Rounds.]

SLIDE 16

Biclique Attack on AES-128

New Approach: Recomputation

  • On-the-fly calculation of base and biclique states
  • Pipeline registers act as state storage media
    – No additional RAMs/registers required (virtual storage)
  • Similar to the optimized brute-force attack in structure
    – Simpler control logic and interconnections

[Figure: recomputation-based design. Labels: Key Gen, Ptxt, P, K, Round-1 to Round-10 with states S1 to S10 and round keys K1 to K9, ORACLE, Match; legend: Biclique Rounds, Regular (Full) Rounds, Partial Rounds.]

SLIDE 17

Biclique Attack on AES-128

First “Biclique” Round:

  • Serial AES implementation
  • 8-bit (!) datapath
  • Single S-Box
SLIDE 18

Biclique Attack on AES-128

Second “Biclique” Round:

  • Slightly modified serial AES implementation
  • Still 8-bit (!) datapath
  • Two S-Boxes
  • Limited additional storage (shift registers) for biclique states
SLIDE 19

Biclique Attack on AES-128

Third “Biclique” Round:

SLIDE 20

Biclique Attack on AES-128

Third “Biclique” Round:

  • Serial AES implementation on 4 separate paths
  • Still 8-bit (!) datapath (on each path)
  • Four S-Boxes
  • Slightly more complex control logic
  • More registers for double-buffering of biclique states (still shift registers with minimal cost)
  • Only covers the “SubBytes” stage of a full AES round; the rest is implemented as in a regular round

SLIDE 21

Biclique Attack on AES-128

FPGA Performance

  • Slice utilization: 30720 / 33278
  • FPGA utilization: 92.31 %
  • Maximum frequency*: 236.22 MHz
  • Keys tested/sec/FPGA: $945 \times 10^6$

ASIC Performance

  • Core area: 163912 GE
  • Maximum frequency: 1548 MHz
  • Average power: 211.545 mW
  • Keys tested/mW: $7.32 \times 10^6$

* Slower than the brute-force attack due to the reduced number of pipeline stages
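The two sets of performance figures (brute force on slide 14, biclique on this slide) can be cross-checked against each other. The sketch below is only a sanity check on the published numbers: the `Result` type and helper names are invented for the illustration, and the assumption that the ASIC core tests one key per clock cycle is an inference from the figures, not something the slides state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    fpga_freq_mhz: float
    fpga_keys_per_sec: float
    asic_freq_mhz: float
    asic_power_mw: float


# Values copied from the two performance slides.
BRUTE_FORCE = Result(263.16, 526e6, 2480.0, 622.937)
BICLIQUE = Result(236.22, 945e6, 1548.0, 211.545)


def fpga_keys_per_cycle(r: Result) -> float:
    """Keys tested per clock cycle on one FPGA."""
    return r.fpga_keys_per_sec / (r.fpga_freq_mhz * 1e6)


def asic_keys_per_mw(r: Result) -> float:
    """Keys per second per mW, assuming one key per ASIC clock cycle."""
    return r.asic_freq_mhz * 1e6 / r.asic_power_mw


fpga_speedup = BICLIQUE.fpga_keys_per_sec / BRUTE_FORCE.fpga_keys_per_sec
asic_gain = asic_keys_per_mw(BICLIQUE) / asic_keys_per_mw(BRUTE_FORCE)

print(f"FPGA keys/cycle: {fpga_keys_per_cycle(BRUTE_FORCE):.2f} vs {fpga_keys_per_cycle(BICLIQUE):.2f}")
print(f"ASIC keys/mW: {asic_keys_per_mw(BRUTE_FORCE):.3e} vs {asic_keys_per_mw(BICLIQUE):.3e}")
print(f"FPGA speedup: {fpga_speedup:.2f}x, ASIC efficiency gain: {asic_gain:.2f}x")
```

Under the one-key-per-cycle assumption the computed keys/mW values agree with the tabulated ones, and both ratios come out a little below 2.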

SLIDE 22

Conclusion

  • The fastest brute-force attack implementation on AES-128
  • The first biclique attack implementation on AES-128
    – Almost a factor of 2 speed and cost gain
    – Only 16 chosen plaintexts (compared with $2^{88}$ in the original biclique attack paper)
  • Suitable for both FPGA and ASIC implementation
  • Applicable to AES-192 and AES-256 as well