Better than Brute-Force: Optimized Hardware Architecture for Efficient Biclique Attacks on AES-128
Andrey Bogdanov*, Elif Bilge Kavun**, Christof Paar**, Christian Rechberger***, Tolga Yalcin**
* KU Leuven, Belgium, ** HGI-RUB, Germany, *** DTU, Denmark
– on FPGA – on ASIC
[Figure: attack setting, showing plaintexts {P}, ciphertexts {C}, the encryption oracle, and the keys K, K1, K2.]
all key bits
groups of $2^4$ keys
$x = (x_0 x_1 x_2 x_3 x_4 x_5 0 0)_2$, $a = (0 0 0 0 0 0 a_1 a_0)_2$
$y = (y_0 y_1 y_2 y_3 y_4 y_5 0 0)_2$, $b = (0 0 0 0 0 0 b_1 b_0)_2$
ways each)
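The bit layout above can be illustrated with a minimal Python sketch that enumerates one key group. It assumes, purely for illustration, that each of the two varying key bytes is formed as x XOR a and y XOR b; the function name `key_group` and the restriction to these two bytes are hypothetical and not taken from the slides.

```python
from itertools import product


def key_group(x, y):
    """Enumerate the 16 (byte, byte) pairs belonging to one key group.

    x and y hold the six group-index bits in bit positions 7..2, with the
    two low bits cleared. a and b run over the two low bits. Combining
    them by XOR is an assumption made for this sketch.
    """
    if x & 0b11 or y & 0b11:
        raise ValueError("x and y must have their two low bits cleared")
    return [(x ^ a, y ^ b) for a, b in product(range(4), repeat=2)]


group = key_group(0b10110100, 0b00101000)
print(len(group))  # 16 keys per group, i.e. 2^4
```

Every pair in the group shares the same upper six bits in each byte, so the group index (x, y) stays fixed while a and b sweep the 2^4 keys.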
Recomputation at matching
Complexities:
in each key group: 0.3 AES-128 runs (first step).
in each key group: 7.12 AES-128 runs (second step).
Total: $2^{124}(0.3 + 7.12) = 2^{126.89}$ AES executions.
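The total can be checked with a few lines of Python. This is only a worked arithmetic sketch: it reads the two terms in the sum as the per-group costs of the first and second step, and the constant and function names are illustrative.

```python
import math

GROUPS_LOG2 = 124        # number of key groups is 2^124
FIRST_STEP_RUNS = 0.3    # AES-128 runs per key group, first step
SECOND_STEP_RUNS = 7.12  # AES-128 runs per key group, second step


def total_complexity_log2(groups_log2, *per_group_runs):
    """Return log2 of (2^groups_log2 * sum of per-group costs)."""
    return groups_log2 + math.log2(sum(per_group_runs))


total = total_complexity_log2(GROUPS_LOG2, FIRST_STEP_RUNS, SECOND_STEP_RUNS)
keys_per_group = 2 ** (128 - GROUPS_LOG2)
print(f"2^{total:.2f} AES executions, {keys_per_group} keys per group")
# prints "2^126.89 AES executions, 16 keys per group"
```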
Only 16 chosen plaintexts!!
128 Xilinx Spartan-3 XC3S5000 high-performance FPGAs
Equivalent computing power of 640 million system gates
45 nm Generic Library
highest possible speed (11-stage pipeline within each AES round)
$GF(((2^2)^2)^2)$ for S-boxes
– suitable for both FPGA and ASIC implementation
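To show what tower-field arithmetic for the S-box inversion looks like, here is a minimal Python sketch of multiplication and inversion in $GF(((2^2)^2)^2)$. The constants `PHI` and `LAMBDA` and all function names are assumptions chosen for this sketch, not values from the slides. The sketch also leaves out the basis change to the AES polynomial and the affine transformation, so it models only the inversion core and does not reproduce AES S-box outputs.

```python
# Assumed constants (not from the slides); both give irreducible polynomials.
PHI = 0b10       # GF((2^2)^2)     = GF(2^2)[y]     / (y^2 + y + PHI)
LAMBDA = 0b1100  # GF(((2^2)^2)^2) = GF((2^2)^2)[z] / (z^2 + z + LAMBDA)


def mul2(a, b):
    """Multiply in GF(2^2) = GF(2)[x] / (x^2 + x + 1); 2-bit operands."""
    a1, a0 = a >> 1, a & 1
    b1, b0 = b >> 1, b & 1
    hh = a1 & b1
    return ((hh ^ (a1 & b0) ^ (a0 & b1)) << 1) | (hh ^ (a0 & b0))


def inv2(a):
    """Invert in GF(2^2): a^3 = 1 for nonzero a, so a^-1 = a^2 (0 maps to 0)."""
    return mul2(a, a)


def mul4(a, b):
    """Multiply in GF((2^2)^2); 4-bit operands split into two 2-bit halves."""
    ah, al = a >> 2, a & 0b11
    bh, bl = b >> 2, b & 0b11
    hh = mul2(ah, bh)
    hi = hh ^ mul2(ah, bl) ^ mul2(al, bh)
    lo = mul2(hh, PHI) ^ mul2(al, bl)
    return (hi << 2) | lo


def inv4(a):
    """Invert in GF((2^2)^2) using a single GF(2^2) inversion."""
    ah, al = a >> 2, a & 0b11
    delta = mul2(mul2(ah, ah), PHI) ^ mul2(ah, al) ^ mul2(al, al)
    d = inv2(delta)
    return (mul2(ah, d) << 2) | mul2(ah ^ al, d)


def mul8(a, b):
    """Multiply in GF(((2^2)^2)^2); 8-bit operands split into nibbles."""
    ah, al = a >> 4, a & 0xF
    bh, bl = b >> 4, b & 0xF
    hh = mul4(ah, bh)
    hi = hh ^ mul4(ah, bl) ^ mul4(al, bh)
    lo = mul4(hh, LAMBDA) ^ mul4(al, bl)
    return (hi << 4) | lo


def inv8(a):
    """Invert in GF(((2^2)^2)^2) using a single GF((2^2)^2) inversion."""
    ah, al = a >> 4, a & 0xF
    delta = mul4(mul4(ah, ah), LAMBDA) ^ mul4(ah, al) ^ mul4(al, al)
    d = inv4(delta)
    return (mul4(ah, d) << 4) | mul4(ah ^ al, d)


# Sanity check: every nonzero element times its inverse gives 1.
print(all(mul8(a, inv8(a)) == 1 for a in range(1, 256)))
```

The point of the tower construction is that the 8-bit inversion reduces to a handful of 4-bit multiplications and one 4-bit inversion, which in turn reduces to 2-bit operations. Such small blocks map to compact logic and are easy to pipeline.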
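[Figure: brute-force architecture. A fixed plaintext passes through Round-1 to Round-10 (states S1 to S10), with round keys K1 to K9 supplied by the key generator; the output is compared against the oracle by a byte match (= FF ?).]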
favors:
All identical rounds (for a fair comparison with respect to the
Partial matching in the last three rounds (for better area utilization – makes no difference for FPGA)
reported fastest design (362 kGE vs 660 kGE and 2.5 GHz vs 2 GHz)
* Pipeline register cost negligible for FPGA implementation – already part of the slice!
[Figure: S-box datapath. The 8-bit input is split into nibbles [7:4] and [3:0], processed by GF((2^2)^2) multipliers and a GF((2^2)^2) inverter (with a block labeled P), and recombined into the 8-bit output.]
FPGA Performance
Slice Utilization: 26949 / 33278
% FPGA Utilization: 80.98
Maximum Freq (MHz): 263.16
Keys tested/sec/FPGA: $526 \times 10^6$
ASIC Performance
Core Area (GE): 362181
Maximum Freq (MHz): 2480
Average Power (mW): 622.937
Keys tested/mW: $3.98 \times 10^6$
Starting Point: Conceptual design
implementation
base and biclique states
implementation
Requires too many RAMs
Interconnection and control logic too complex to allow an area- and speed-efficient design
[Figure: conceptual design. Round-4 to Round-10 (states S3 to S10, round keys K3 to K9), grouped into regular (full) rounds and partial rounds, are fed from key RAM(s), state RAM(s) and a plaintext memory (plaintext + key); the result is matched against the oracle.]
New Approach: Recomputation
and biclique states
storage media
No additional RAMs/registers required – virtual storage
attack in structure
simpler control logic and interconnections
[Figure: recomputation-based design. Key generation and the plaintext (P, K) feed Round-1 to Round-10 (states S1 to S10, round keys K1 to K9), grouped into biclique rounds, regular (full) rounds and partial rounds; the result is matched against the oracle.]
First “Biclique” Round:
Second “Biclique” Round:
Third “Biclique” Round:
shift registers with minimal cost
rest implemented as in a regular round
FPGA Performance
Slice Utilization: 30720 / 33278
% FPGA Utilization: 92.31
Maximum Freq* (MHz): 236.22
Keys tested/sec/FPGA: $945 \times 10^6$
ASIC Performance
Core Area (GE): 163912
Maximum Freq (MHz): 1548
Average Power (mW): 211.545
Keys tested/mW: $7.32 \times 10^6$
* Slower than the brute-force attack due to reduced number of pipeline stages
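The reported figures can be cross-checked with a short Python sketch. It assumes that the ASIC core tests one key per clock cycle, which is an assumption of this sketch that happens to reproduce the reported keys/mW values. The dictionary layout and function names are illustrative only.

```python
# Figures copied from the brute-force and biclique performance tables.
BRUTE_FORCE = {
    "fpga_keys_per_sec": 526e6,
    "asic_area_ge": 362181,
    "asic_freq_mhz": 2480,
    "asic_power_mw": 622.937,
}
BICLIQUE = {
    "fpga_keys_per_sec": 945e6,
    "asic_area_ge": 163912,
    "asic_freq_mhz": 1548,
    "asic_power_mw": 211.545,
}


def keys_per_mw(design):
    """Keys tested per mW, assuming one key tested per clock cycle."""
    return design["asic_freq_mhz"] * 1e6 / design["asic_power_mw"]


def gain(metric):
    """Ratio of a metric for the biclique design over the brute-force design."""
    return metric(BICLIQUE) / metric(BRUTE_FORCE)


fpga_gain = gain(lambda d: d["fpga_keys_per_sec"])
asic_gain = gain(keys_per_mw)
area_ratio = BRUTE_FORCE["asic_area_ge"] / BICLIQUE["asic_area_ge"]

print(f"FPGA throughput gain: {fpga_gain:.2f}")
print(f"ASIC keys/mW gain:    {asic_gain:.2f}")
print(f"ASIC area ratio:      {area_ratio:.2f}")
```

Under these assumptions the FPGA throughput gain comes out at about 1.8 and the ASIC keys/mW gain at about 1.84, while the biclique core occupies less than half the area of the brute-force core.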
AES-128
Almost a factor of 2 speed and cost gain
Only 16 chosen plaintexts (w.r.t. $2^{88}$ in the