Generalized Polynomial Decomposition for S-boxes with Application to Side-Channel Countermeasures
Dahmun Goudarzi, Matthieu Rivain, Srinivas Vivek, and Damien Vergnaud
Background
Secure Software S-box Implementations
Higher-Order Masking: $x = x_1 + x_2 + \cdots + x_d$
Main challenge: S-box evaluations
Linear operations: $O(d)$
Non-linear operations: $O(d^2)$
Goal: find an S-box representation with fewer non-linear operations
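The two costs above can be illustrated with a minimal Python sketch. It assumes Boolean masking of single bits with $d$ shares and uses the ISW AND gadget for the non-linear operation; the function names (`share`, `masked_xor`, `masked_and`) are hypothetical and chosen only for this illustration.

```python
import secrets
from functools import reduce
from operator import xor

def share(x, d):
    """Split bit x into d shares whose XOR equals x."""
    shares = [secrets.randbits(1) for _ in range(d - 1)]
    shares.append(reduce(xor, shares, x))
    return shares

def unshare(shares):
    """Recombine shares (only for checking; never done in a masked implementation)."""
    return reduce(xor, shares, 0)

def masked_xor(a, b):
    """Linear operation: applied share by share, d elementary operations."""
    return [ai ^ bi for ai, bi in zip(a, b)]

def masked_and(a, b):
    """ISW multiplication: visits every pair of shares, about d^2 operations."""
    d = len(a)
    c = [a[i] & b[i] for i in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            r = secrets.randbits(1)                    # fresh random bit r_ij
            r_ji = (r ^ (a[i] & b[j])) ^ (a[j] & b[i])
            c[i] ^= r
            c[j] ^= r_ji
    return c
```

The quadratic loop in `masked_and` is what makes S-box representations with few non-linear operations attractive.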
Polynomial Methods
S-box seen as a polynomial over $\mathbb{F}_{2^n}$: $S(x) = \sum_{i=0}^{n} a_i x^i$
Generic Methods: $S(x) = \sum_i (p_i \star q_i)(x)$
  CRV decomposition: $\star = \times$
  Algebraic decomposition: $\star = \circ$
Specific Methods: example on AES, $S_{\mathrm{AES}}(x) = \mathrm{Aff}(x^{254})$
  RP 4-mult chain on $\mathbb{F}_{2^8}$
  KHL 5-mult chain on $\mathbb{F}_{2^4}$
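As an illustration of the AES example, the sketch below evaluates $\mathrm{Aff}(x^{254})$ in Python with four non-linear multiplications. The addition chain ($x^3$, $x^{15}$, $x^{252}$, $x^{254}$) is one 4-multiplication chain in the spirit of the RP method; whether it matches the exact chain used by the authors is an assumption. Squarings are counted separately because they are linear over $\mathbb{F}_2$.

```python
def gf_mul(a, b):
    """Multiply in F_{2^8} modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return p

def gf_pow2k(a, k):
    """Apply k squarings (linear over F_2, hence cheap to mask)."""
    for _ in range(k):
        a = gf_mul(a, a)
    return a

def inv_254(x):
    """Compute x^254 with 4 non-linear multiplications."""
    x2 = gf_pow2k(x, 1)
    x3 = gf_mul(x2, x)          # multiplication 1
    x12 = gf_pow2k(x3, 2)
    x15 = gf_mul(x12, x3)       # multiplication 2
    x240 = gf_pow2k(x15, 4)
    x252 = gf_mul(x240, x12)    # multiplication 3
    return gf_mul(x252, x2)     # multiplication 4

def rotl8(v, s):
    """Rotate an 8-bit value left by s positions."""
    return ((v << s) | (v >> (8 - s))) & 0xFF

def aes_sbox(x):
    """Affine layer of the AES S-box applied to x^254."""
    y = inv_254(x)
    return y ^ rotl8(y, 1) ^ rotl8(y, 2) ^ rotl8(y, 3) ^ rotl8(y, 4) ^ 0x63
```

For instance, `aes_sbox(0x00)` gives `0x63` and `aes_sbox(0x53)` gives `0xED`, matching the standard AES table.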
Bitslice Methods
S-box seen as a Boolean circuit: $S : x \mapsto (f_1(x), f_2(x), \ldots, f_m(x))$
Generic Methods: based on Boolean functions
Specific Methods: example on AES, based on a Boolean circuit (BMP13)
Polynomial vs Bitslice
[Figure: clock cycles (up to 400000) versus masking order (2 to 10), comparing the polynomial and bitslice methods]
Generic 8-bit S-box evaluation
Full field: CRV decomposition (1 8-bit function)
Boolean field: Boolean decomposition
Intermediate field (this work): 4-bit functions, $S(x) \mapsto (S_1(y, z), S_2(y, z))$
Motivation
Working on smaller fields
Degree of parallelisation increased (32-bit architecture)
Example: 16 AES S-boxes with the polynomial method
[Figure: clock cycles (1 to 5, in units of 10^4) versus d (2 to 10) for KHL and RP (ISW-HT)]

                     Boolean case   4-bit field   8-bit field
Mult. in parallel    32             8             4
Our results
Generalized decomposition method for any S-box w.r.t. 3 parameters:
  $n$: number of inputs
  $m$: number of output elements
  $\lambda$: bit-size of the elements
Study of the median case, example on 8-bit S-boxes: $S(x, y) = (f_1(x, y), f_2(x, y))$ with $x, y \in \mathbb{F}_{2^4}$
Implementation in ARM assembly to compare with the state of the art
Generalized Decomposition Method
S-box Characterization
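S-box seen as an $n\lambda$-bit to $m\lambda$-bit polynomial over $\mathbb{F}_{2^\lambda}$:
$S(x) = (f_1(x), f_2(x), \ldots, f_m(x))$
where $f_1, f_2, \ldots, f_m \in \mathcal{F}_{n,\lambda}$ (set of functions from $\mathbb{F}_{2^\lambda}^n$ to $\mathbb{F}_{2^\lambda}$)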
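To make the characterization concrete, the following Python sketch cuts a lookup table into its $m$ coordinate functions over $\mathbb{F}_{2^\lambda}$. The helper names, the most-significant-first ordering of the elements, and the toy table are assumptions made for this illustration only.

```python
def split_elements(value, count, lam):
    """Cut an integer into `count` elements of `lam` bits, most significant first."""
    mask = (1 << lam) - 1
    return tuple((value >> (lam * (count - 1 - k))) & mask for k in range(count))

def coordinate_functions(table, n, m, lam):
    """Return m lookup tables f_1..f_m, each mapping an n-tuple of
    lam-bit elements to a single lam-bit element."""
    assert len(table) == 1 << (n * lam)
    coords = [dict() for _ in range(m)]
    for x, s in enumerate(table):
        inputs = split_elements(x, n, lam)
        outputs = split_elements(s, m, lam)
        for f, out in zip(coords, outputs):
            f[inputs] = out
    return coords

# Hypothetical 8-bit table (a toy permutation, not a real S-box).
toy_table = [(7 * x + 3) % 256 for x in range(256)]

# Median case: n = 2, m = 2, lam = 4 gives two functions of two 4-bit inputs.
f1, f2 = coordinate_functions(toy_table, n=2, m=2, lam=4)

# Boolean case: n = 8, m = 8, lam = 1 gives eight Boolean functions.
boolean_coords = coordinate_functions(toy_table, n=8, m=8, lam=1)
```

The same table therefore yields one 8-bit function, two 4-bit functions, or eight Boolean functions depending on the chosen $(n, m, \lambda)$.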
Coordinate Function Decomposition
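$f(x) = \sum_{i=0}^{t} g_i(x) \cdot h_i(x)$
$g_i$: random linear combinations from a basis $\langle \bar{\mathcal{B}} \rangle$ with
$\langle \bar{\mathcal{B}} \rangle = \Big\{ g : g = \sum_{i=0}^{\lambda-1} \sum_{\phi \in \mathcal{B}} c_{\phi,i} \times \phi^{2^i} \Big\}$
Find $c_{i,j}$ s.t. $h_i = \sum_j c_{i,j} \phi_j$ by solving:
$f(x) = \sum_i \Big( \sum_j a_{i,j} \phi_j(x) \Big) \Big( \sum_j c_{i,j} \phi_j(x) \Big), \ \forall x$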
Solving a Linear System
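$f(x) = \sum_i \Big( \sum_j a_{i,j} \phi_j(x) \Big) \Big( \sum_j c_{i,j} \phi_j(x) \Big), \ \forall x$
With $\{e_i\}_{i=1}^{2^{n\lambda}} = \mathbb{F}_{2^\lambda}^n$:
$A_0 c_0 + A_1 c_1 + \cdots + A_t c_t = (f(e_1), f(e_2), \ldots, f(e_{2^{n\lambda}}))$
$A_i = \begin{pmatrix}
\phi_1(e_1) \cdot g_i(e_1) & \phi_2(e_1) \cdot g_i(e_1) & \cdots & \phi_{|\mathcal{B}|}(e_1) \cdot g_i(e_1) \\
\phi_1(e_2) \cdot g_i(e_2) & \phi_2(e_2) \cdot g_i(e_2) & \cdots & \phi_{|\mathcal{B}|}(e_2) \cdot g_i(e_2) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_1(e_{2^{n\lambda}}) \cdot g_i(e_{2^{n\lambda}}) & \phi_2(e_{2^{n\lambda}}) \cdot g_i(e_{2^{n\lambda}}) & \cdots & \phi_{|\mathcal{B}|}(e_{2^{n\lambda}}) \cdot g_i(e_{2^{n\lambda}})
\end{pmatrix}$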
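The linear system can be sketched in Python for the Boolean case ($\lambda = 1$, $n = 4$), where each function is a 16-bit truth table, a product is a bitwise AND, and the columns of $A_i$ are the truth tables of $g_i \cdot \phi_j$. The basis `BASIS`, the target function `f`, the choice `t=4`, and the retry loop over fresh random $g_i$ are assumptions of this sketch, not the authors' implementation.

```python
import random

N = 4
POINTS = 1 << N                  # the points e_1..e_16 of F_2^4
ONE = (1 << POINTS) - 1          # truth table of the constant function 1

def var(i):
    """Truth table of x_i as a bitmask: bit e holds x_i(e)."""
    return sum(1 << e for e in range(POINTS) if (e >> i) & 1)

X = [var(i) for i in range(N)]
# Hypothetical basis whose pairwise products cover all 16 monomials.
BASIS = [ONE, X[0], X[1], X[2], X[3], X[0] & X[1], X[2] & X[3]]

def random_combination(basis, rng):
    """Random F_2-linear combination of the basis functions."""
    g = 0
    for phi in basis:
        if rng.getrandbits(1):
            g ^= phi
    return g

def solve_gf2(columns, target):
    """Return 0/1 coefficients c with XOR of the selected columns == target, or None."""
    pivots = {}                  # leading bit -> (reduced vector, columns used)
    for idx, col in enumerate(columns):
        vec, used = col, 1 << idx
        while vec:
            top = vec.bit_length() - 1
            if top not in pivots:
                pivots[top] = (vec, used)
                break
            vec ^= pivots[top][0]
            used ^= pivots[top][1]
    sol = 0
    while target:
        top = target.bit_length() - 1
        if top not in pivots:
            return None          # inconsistent system for this draw of g_i
        target ^= pivots[top][0]
        sol ^= pivots[top][1]
    return [(sol >> k) & 1 for k in range(len(columns))]

def decompose(f, basis, t, rng, attempts=1000):
    """Find g_0..g_t (random) and h_0..h_t (solved) with f = XOR of g_i & h_i."""
    for _ in range(attempts):
        gs = [random_combination(basis, rng) for _ in range(t + 1)]
        columns = [g & phi for g in gs for phi in basis]   # columns of A_0..A_t
        c = solve_gf2(columns, f)
        if c is None:
            continue
        hs = []
        for i in range(t + 1):
            h = 0
            for j, phi in enumerate(basis):
                if c[i * len(basis) + j]:
                    h ^= phi
            hs.append(h)
        return gs, hs
    return None

def recombine(gs, hs):
    """Evaluate the sum of products g_i * h_i as a truth table."""
    out = 0
    for g, h in zip(gs, hs):
        out ^= g & h
    return out

rng = random.Random(1)
f = X[0] & X[1] & X[2] ^ X[1] & X[3] ^ X[2] ^ ONE   # hypothetical target function
result = decompose(f, BASIS, t=4, rng=rng)
```

A draw of the $g_i$ can yield an inconsistent system, which is why the sketch retries with fresh random combinations.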
Conditions
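$(t+1)|\mathcal{B}|$ unknowns, $2^{n\lambda}$ equations
Condition on the sum: $(t+1)|\mathcal{B}| \ge 2^{n\lambda}$, i.e. $t \ge \left\lceil \frac{2^{n\lambda}}{|\mathcal{B}|} \right\rceil - 1$
Condition on the basis: $\langle \bar{\mathcal{B}} \times \bar{\mathcal{B}} \rangle$ has to span the entire space $\mathcal{F}_{n,\lambda}$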
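As a worked instance of the counting condition, assume the Boolean case $n = 4$, $\lambda = 1$ and a hypothetical basis of $|\mathcal{B}| = 7$ functions:

```latex
\[
(t+1)\,|\mathcal{B}| \ge 2^{n\lambda}
\;\Longrightarrow\;
t \ge \left\lceil \frac{2^{4 \cdot 1}}{7} \right\rceil - 1
  = \lceil 2.29 \rceil - 1
  = 3 - 1
  = 2 .
\]
```

With $t = 2$ there are $3 \cdot 7 = 21$ unknowns for $16$ equations. This count alone does not guarantee a solution, since the condition on the basis must hold as well.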
Spanning Property
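$\langle \bar{\mathcal{B}} \times \bar{\mathcal{B}} \rangle = \mathcal{F}_{n,\lambda} \iff \mathrm{rank}(\mathrm{Mat}(\bar{\mathcal{B}} \times \bar{\mathcal{B}})) = 2^{n\lambda}$
with
$\mathrm{Mat}(S) = \begin{pmatrix}
\varphi_1(e_1) & \varphi_2(e_1) & \cdots & \varphi_{|S|}(e_1) \\
\varphi_1(e_2) & \varphi_2(e_2) & \cdots & \varphi_{|S|}(e_2) \\
\vdots & \vdots & \ddots & \vdots \\
\varphi_1(e_{2^{n\lambda}}) & \varphi_2(e_{2^{n\lambda}}) & \cdots & \varphi_{|S|}(e_{2^{n\lambda}})
\end{pmatrix}$
where $\{\varphi_1, \varphi_2, \ldots, \varphi_{|S|}\} = S$ and $S = \langle \bar{\mathcal{B}} \times \bar{\mathcal{B}} \rangle$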
Basis Construction
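Start with $\mathcal{B} = \{1, x_1, x_2, \ldots, x_n\}$
Pick $\phi, \psi$ in $\langle \bar{\mathcal{B}} \rangle$ at random, where $\langle \bar{\mathcal{B}} \rangle = \Big\{ g : g = \sum_{i=0}^{\lambda-1} \sum_{\phi \in \mathcal{B}} c_{\phi,i} \times \phi^{2^i} \Big\}$
Compute $\mathrm{rank}(\mathrm{Mat}(S \times S))$ with $S = \mathcal{B} \cup \{\phi \cdot \psi\}$
Redo $N$ times and choose the $(\phi, \psi)$ that increases the rank most
Repeat until the rank is at least $2^{n\lambda}$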
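The construction steps can be sketched in Python for the Boolean case ($\lambda = 1$, $n = 4$), where $\langle \bar{\mathcal{B}} \rangle$ reduces to the $\mathbb{F}_2$-linear span of $\mathcal{B}$ and functions are 16-bit truth tables. The names, the seed, and the parameters `trials` (the $N$ of the slide) and `max_rounds` are assumptions of this sketch.

```python
import random
from itertools import combinations_with_replacement

N = 4
POINTS = 1 << N                  # 2^(n*lambda) evaluation points
ONE = (1 << POINTS) - 1          # truth table of the constant function 1

def var(i):
    """Truth table of x_i as a bitmask: bit e holds x_i(e)."""
    return sum(1 << e for e in range(POINTS) if (e >> i) & 1)

def gf2_rank(vectors):
    """Rank over F_2 of bitmask vectors, by elimination on the leading bit."""
    pivots = {}
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def product_rank(s):
    """rank(Mat(S x S)): rank of all pairwise products of functions in S."""
    return gf2_rank(a & b for a, b in combinations_with_replacement(s, 2))

def random_combination(basis, rng):
    """Random F_2-linear combination of the basis functions."""
    g = 0
    for phi in basis:
        if rng.getrandbits(1):
            g ^= phi
    return g

def build_basis(trials=20, seed=0, max_rounds=100):
    """Greedily add products phi * psi until the pairwise products have full rank."""
    rng = random.Random(seed)
    basis = [ONE] + [var(i) for i in range(N)]      # B = {1, x_1, ..., x_n}
    rank = product_rank(basis)
    rounds = 0
    while rank < POINTS:
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError("rank target not reached")
        best_rank, best = rank, None
        for _ in range(trials):                      # redo N times
            cand = random_combination(basis, rng) & random_combination(basis, rng)
            r = product_rank(basis + [cand])
            if r > best_rank:
                best_rank, best = r, cand
        if best is not None:                         # keep the best candidate
            basis.append(best)
            rank = best_rank
    return basis

basis = build_basis()
```

The initial basis $\{1, x_1, \ldots, x_4\}$ only reaches rank 11 (the monomials of degree at most 2), so at least two products have to be added before the rank can reach 16.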
Random Basis
Improvements w.r.t. previous methods: Boolean case: initial basis from 25 to 17

            4-bit S-boxes            8-bit S-boxes
(λ, n)      (1,4)   (2,2)   (4,1)    (1,8)   (2,4)   (4,2)   (8,1)
|B_1|       7       4       3        26      14      8       5
r           2       1       1        17      9       5       3
Decomposition of the S-box
S-box: $S : x \mapsto (f_1(x), f_2(x), \ldots, f_m(x))$
Apply $m$ coordinate decompositions on the $f_i$'s
Basis update: start with a basis $\mathcal{B}_1$
At each step: $\mathcal{B}_{i+1} \leftarrow \mathcal{B}_i \cup \{g_{i,j} \cdot h_{i,j}\}_{j=0}^{t_i}$
Decomposition example
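$S : \mathbb{F}_{2^8} \to \mathbb{F}_{2^8}$, $n = 2$, $m = 2$, $\lambda = 4$ $\rightarrow$ $S(x, y) = (f_1(x, y), f_2(x, y))$
$\mathcal{B}_1$ such that $\mathcal{B}_1 \times \mathcal{B}_1$ spans $\mathcal{F}_{n,\lambda}$
$f_1(x) = \sum_{i=0}^{t_1} g_{1,i}(x) \cdot h_{1,i}(x)$
$\mathcal{B}_2 = \mathcal{B}_1 \cup \{g_{1,i} \cdot h_{1,i}\}_{i=0}^{t_1}$
$f_2(x) = \sum_{i=0}^{t_2} g_{2,i}(x) \cdot h_{2,i}(x)$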
Experimental Results and Implementations
Optimal Parameters
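Cost of the decomposition: $r + \sum_{i=1}^{m} t_i$ with $t_i \ge \left\lceil \frac{2^{n\lambda} - 1}{\lambda |\mathcal{B}_i| + 1} \right\rceil - 1$
Optimal parameters: $t_i = \left\lceil \frac{2^{n\lambda} - 1}{\lambda |\mathcal{B}_i| + 1} \right\rceil - 1$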
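The formulas above are easy to evaluate numerically. The short Python sketch below computes the optimal $t_i$ for a given basis size and the total cost $r + \sum_i t_i$; the function names are hypothetical, and the basis sizes passed in are inputs to the sketch rather than values derived by it.

```python
def t_opt(n, lam, basis_size):
    """ceil((2^(n*lam) - 1) / (lam * |B_i| + 1)) - 1, in integer arithmetic."""
    numerator = (1 << (n * lam)) - 1
    denominator = lam * basis_size + 1
    return -(-numerator // denominator) - 1   # -(-a // b) is ceil(a / b)

def decomposition_cost(r, ts):
    """Cost of the decomposition: r plus the sum of the t_i."""
    return r + sum(ts)

# (lam, n) = (4, 2) with |B_1| = 10: ceil(255 / 41) - 1 = 6
t1_median = t_opt(n=2, lam=4, basis_size=10)

# (lam, n) = (2, 2) with |B_1| = 5: ceil(15 / 11) - 1 = 1
t1_small = t_opt(n=2, lam=2, basis_size=5)

# With r = 7 and (t_1, t_2) = (6, 4) the cost is 17.
cost_median = decomposition_cost(7, [6, 4])
```

The same formula shows why a larger $\lambda$ or a larger basis lowers $t_i$: both increase the denominator $\lambda |\mathcal{B}_i| + 1$.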
Achievable Results for Median Cases
                             (λ, n)   |B_1|   r    t_1, t_2, ..., t_n   C*
4-bit S-boxes   Optimal      (2,2)    5       2    1,1                  4
                Achievable   (2,2)    5       2    1,1                  4
8-bit S-boxes   Optimal      (2,4)    16      11   8,5,4,3              31
                Achievable   (2,4)    16      11   9,6,5,3              34
                Optimal      (4,2)    10      7    6,4                  17
                Achievable   (4,2)    10      7    7,4                  18
Implementation Results
[Figure: clock cycles (1 to 4, in units of 10^5) versus d (2 to 10) for Bitslice, 16 CRV (4 × 4), and our implementations]

               Code size   RAM
CRV            27.5 KB     80d B
Boolean Dec    4.6 KB      644d B
Our impl.      8.7 KB      92d B
Conclusion
Generalized decomposition method well suited to any S-box and target architecture, as a countermeasure against side-channel attacks
[Figure: decomposition field size versus number of coordinate functions, placing [GR16], [CRV14], the extension based on [PV16], and this work]
Conclusion
Case study on 32-bit ARM
Median case: 8-bit S-box => 2 4-bit functions
Implementation comparison with the state of the art
Memory trade-off
Can suit low-end devices with smaller architectures
Parallelisation level decreased => poor bitslice performance
Low memory requirements