An Algorithm for Inversion in GF(2^m) Suitable for Implementation Using a Polynomial Multiply Instruction on GF(2)



SLIDE 1

An Algorithm for Inversion in GF(2^m) Suitable for Implementation Using a Polynomial Multiply Instruction on GF(2)

K. Kobayashi, N. Takagi, and K. Takagi

Graduate School of Information Science, Nagoya University

SLIDE 2

Outline

  • Background and objective
  • Preliminaries
  • GF(2^m)
  • A polynomial multiply instruction on GF(2)
  • A conventional algorithm for inversion in GF(2^m)
  • A new algorithm for inversion in GF(2^m)
  • Evaluation
  • Concluding remarks

SLIDE 3

Background and Objective

  • GF(2^m) plays important roles in error-correcting codes and cryptography.
  • A fast algorithm for inversion in GF(2^m) is required.
  • A polynomial multiply instruction on GF(2) accelerates multiplication in GF(2^m).
  • We propose a fast algorithm for inversion in GF(2^m) that is suitable for implementation using a polynomial multiply instruction on GF(2).

SLIDE 4

GF(2^m) (1/2)

  • GF(2^m) is an extension field of GF(2).
  • Any element A(x) ∈ GF(2^m) is written as

    $A(x) = a_{m-1}x^{m-1} + \cdots + a_1 x + a_0 \quad (a_i \in \{0, 1\})$

  • Addition in GF(2^m) is polynomial addition on GF(2):

    $A(x) + B(x) = ((a_{m-1} + b_{m-1}) \bmod 2)\,x^{m-1} + \cdots + ((a_0 + b_0) \bmod 2)$

    It is executed by an exclusive-OR operation on every coefficient.
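As a minimal sketch of the addition rule above: assuming (this encoding is not stated on the slides) that a polynomial is stored as a Python integer whose bit i holds the coefficient a_i, addition in GF(2^m) reduces to one bitwise XOR. The function name `gf2m_add` is hypothetical.

```python
def gf2m_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements stored as bit vectors.

    Assumed encoding: bit i of the integer is the coefficient of x^i.
    Adding coefficients mod 2 is the same as bitwise exclusive-OR.
    """
    return a ^ b


# (x^2 + 1) + (x^2 + x + 1): the x^2 and constant terms cancel, leaving x.
total = gf2m_add(0b101, 0b111)
```

Because every element is its own additive inverse, the same function also serves as subtraction.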

SLIDE 5

GF(2^m) (2/2)

  • Multiplication in GF(2^m) is polynomial multiplication modulo G(x) on GF(2):

    $A(x) \cdot B(x) = A(x) \times B(x) \bmod G(x)$

    G(x): the irreducible polynomial of degree m that defines the field
    · : multiplication in GF(2^m)
    × : polynomial multiplication on GF(2)

  • The multiplicative inverse of A(x) is the element A^{-1}(x) such that

    $A(x) \cdot A^{-1}(x) = 1$

    Computing it is a time-consuming operation.
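The definition of multiplication above can be sketched in a few lines. This is an illustrative shift-and-XOR version, not the MULGF2-based method the slides go on to discuss. It assumes polynomials are stored as integers with bit i holding the coefficient of x^i, and that `a` is already reduced (degree below m). The name `gf2m_mul` is hypothetical.

```python
def gf2m_mul(a: int, b: int, g: int) -> int:
    """Multiply a and b in GF(2^m), where g encodes the irreducible G(x).

    Assumed encoding: bit i is the coefficient of x^i; a must have degree < m.
    """
    m = g.bit_length() - 1      # degree of G(x)
    result = 0
    while b:
        if b & 1:               # current coefficient of B(x) is 1
            result ^= a         # add the current multiple of A(x)
        b >>= 1
        a <<= 1                 # a := x * a
        if (a >> m) & 1:        # degree reached m, so subtract G(x)
            a ^= g
    return result


# In GF(2^3) with G(x) = x^3 + x^2 + 1: (x^2 + 1) * (x^2 + x + 1) = 1
product = gf2m_mul(0b101, 0b111, 0b1101)
```

The example shows that x^2 + x + 1 is the multiplicative inverse of x^2 + 1 in this small field.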

SLIDE 6

MULGF2

  • MULGF2 instruction: a typical polynomial multiply instruction on GF(2)
  • It calculates the 2-word polynomial product from two 1-word polynomial operands.

    [Figure: operands rs and rt, product in HI and LO]

  • It accelerates multiplication in GF(2^m).
  • A multiplier for MULGF2 can be realized very easily as a "carry-free" version of an integer multiplier.
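To make the instruction's behavior concrete, the following is a software emulation sketch only. The function signature, the `word_bits` parameter, and the return order (HI, LO) are assumptions for illustration and do not describe any particular processor's interface.

```python
def mulgf2(rs: int, rt: int, word_bits: int = 32) -> tuple:
    """Emulate a one-word polynomial (carry-less) multiply on GF(2).

    Returns (hi, lo): the upper and lower words of the 2-word product.
    Partial products are combined with XOR, so no carries propagate.
    """
    mask = (1 << word_bits) - 1
    rs &= mask
    rt &= mask
    product = 0
    for i in range(word_bits):
        if (rt >> i) & 1:
            product ^= rs << i   # XOR instead of integer addition
    return product >> word_bits, product & mask


# (x^2 + 1) * (x^2 + x + 1) = x^4 + x^3 + x + 1, which fits in the low word.
hi, lo = mulgf2(0b101, 0b111, word_bits=16)
```

Replacing `^=` with `+=` would turn this into an ordinary integer multiplier, which is the sense in which the hardware is a "carry-free" variant.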

SLIDES 7-8

Algorithm for Inversion in GF(2^m)

By extending Euclid's algorithm for polynomials, we can execute inversion in GF(2^m).

    R_{-1}(x) := G(x); U_{-1}(x) := 0;
    R_0(x) := A(x); U_0(x) := 1;
    j := 0;
    repeat
        j := j + 1;
        Q_j(x) := R_{j-2}(x) ÷ R_{j-1}(x);
        R_j(x) := R_{j-2}(x) − Q_j(x) × R_{j-1}(x);
        U_j(x) := U_{j-2}(x) − Q_j(x) × U_{j-1}(x);
    until R_j(x) = 0;

  • outputs R_{j-1}(x) as GCD(A(x), G(x))
  • outputs U_{j-1}(x) as A^{-1}(x), where A(x) × A^{-1}(x) mod G(x) = 1

The R_j recurrence alone is Euclid's algorithm; the U_j recurrence is the extension that yields the inverse.
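The extended Euclid recurrence above can be sketched as follows. The integer encoding (bit i is the coefficient of x^i) and all helper names (`deg`, `clmul`, `poly_divmod`, `gf2m_inverse`) are assumptions made for this illustration. Subtraction on GF(2) is written as XOR.

```python
def deg(p: int) -> int:
    """Degree of a GF(2) polynomial; by convention here deg(0) = -1."""
    return p.bit_length() - 1


def clmul(a: int, b: int) -> int:
    """Polynomial multiplication on GF(2) (the x operator of the slides)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p


def poly_divmod(n: int, d: int):
    """Quotient and remainder of n / d on GF(2); d must be nonzero."""
    q = 0
    while deg(n) >= deg(d):
        shift = deg(n) - deg(d)
        q ^= 1 << shift
        n ^= d << shift
    return q, n


def gf2m_inverse(a: int, g: int) -> int:
    """Return A^{-1}(x) mod G(x) using the R_j / U_j recurrences."""
    r_prev, r_cur = g, a          # R_{j-2}, R_{j-1}
    u_prev, u_cur = 0, 1          # U_{j-2}, U_{j-1}
    while r_cur != 0:
        q, rem = poly_divmod(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        u_prev, u_cur = u_cur, u_prev ^ clmul(q, u_cur)
    if r_prev != 1:               # GCD must be 1 for an inverse to exist
        raise ValueError("element has no inverse modulo g")
    return u_prev


# In GF(2^3) with G(x) = x^3 + x^2 + 1, the inverse of x^2 + 1 is x^2 + x + 1.
inv = gf2m_inverse(0b101, 0b1101)
```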

SLIDES 9-15

Software Implementation of EA

Software implementation of Euclid's algorithm:

    S(x) := G(x); R(x) := A(x);
    while R(x) ≠ 0 do
        δ := deg(S(x)) − deg(R(x));
        if deg(S(x)) < deg(R(x)) then
            R(x) ↔ S(x); δ := −δ;
        end if
        S(x) := S(x) − x^δ × R(x);
    end while

Example with G(x) = x^3 + x^2 + 1 and A(x) = x^2 + 1 (values at the start of each iteration, followed by the operation performed):

  • 1st iteration: S = x^3 + x^2 + 1, R = x^2 + 1; S(x) := S(x) − x^{3−2} R(x)
  • 2nd iteration: S = x^2 + x + 1, R = x^2 + 1; S(x) := S(x) − x^{2−2} R(x)
  • 3rd iteration: S = x, R = x^2 + 1; S(x) ↔ R(x), then S(x) := S(x) − x^{2−1} R(x)
  • 4th iteration: S = 1, R = x

The 1st and 2nd iterations correspond to one polynomial division.

SLIDE 16

Main Idea

  • Key point: the conventional algorithm cannot use MULGF2 efficiently.

    S(x) := S(x) − x^δ × R(x);

  • The new algorithm
      - is based on Brunner's hardware algorithm for inversion
      - uses MULGF2 efficiently
      - is executed with regularity

SLIDES 17-23

HW implementation

Hardware algorithm for inversion [Brunner et al., '93]:

    S(x) := G(x); R(x) := A(x); δ := 0;
    for i = 1 to 2m do
        if r_m = 0 then
            R(x) := x × R(x); δ := δ + 1;
        else
            if s_m = 1 then
                S(x) := S(x) − R(x);
            end if
            S(x) := x × S(x);
            if δ = 0 then
                R(x) ↔ S(x); δ := δ + 1;
            else
                δ := δ − 1;
            end if
        end if
    end for

Here r_m and s_m are the coefficients of x^m in R(x) and S(x).

Example with G(x) = x^3 + x^2 + 1 and A(x) = x^2 + 1 (values at the start of each iteration, followed by the operation performed):

  • 1st iteration: δ = 0, S = x^3 + x^2 + 1, R = x^2 + 1; R(x) := x R(x); δ := δ + 1
  • 2nd iteration: δ = 1, S = x^3 + x^2 + 1, R = x^3 + x; S(x) := x (S(x) − R(x)); δ := δ − 1
  • 3rd iteration: δ = 0, S = x^3 + x^2 + x, R = x^3 + x; S(x) := x (S(x) − R(x)); S(x) ↔ R(x); δ := δ + 1
  • 4th iteration: δ = 1, S = x^3 + x, R = x^3
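The loop on this slide can be traced with a short sketch. It follows only the R(x), S(x), and δ updates shown here. The U(x) and V(x) bookkeeping needed to actually produce the inverse is not given on this slide and is left out rather than guessed. The integer encoding (bit i is the coefficient of x^i) and the name `brunner_rs_trace` are assumptions.

```python
def brunner_rs_trace(a: int, g: int):
    """Trace (delta, S, R) at the start of each of the 2m iterations.

    Returns (trace, final_state). Only the R/S/delta part of the hardware
    algorithm is modelled; x * P(x) is a left shift and subtraction is XOR.
    """
    m = g.bit_length() - 1        # degree of G(x)
    top = 1 << m                  # mask selecting the coefficient of x^m
    s, r, delta = g, a, 0
    trace = []
    for _ in range(2 * m):
        trace.append((delta, s, r))
        if not r & top:           # r_m = 0
            r <<= 1
            delta += 1
        else:
            if s & top:           # s_m = 1
                s ^= r
            s <<= 1
            if delta == 0:
                r, s = s, r
                delta += 1
            else:
                delta -= 1
    return trace, (delta, s, r)


trace, final = brunner_rs_trace(0b0101, 0b1101)
for step, (delta, s, r) in enumerate(trace, start=1):
    print(f"iteration {step}: delta = {delta}, S = {s:04b}, R = {r:04b}")
```

Unlike the software loop, every iteration here does a fixed, small amount of work regardless of the operand degrees, which is the regularity the slides refer to. For this example the loop ends with S = 0 and R = x^3.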

SLIDE 24

Main Idea 2

  • Operations corresponding to k contiguous iterations of Brunner's algorithm can be represented as

    $\begin{bmatrix} R(x) & U(x) \\ S(x) & V(x) \end{bmatrix} := H(x) \times \begin{bmatrix} R(x) & U(x) \\ S(x) & V(x) \end{bmatrix}$

  • Each element of the matrix H(x) is a polynomial on GF(2) with degree less than or equal to k.

SLIDES 25-27

The Matrix H(x) (1/2)

The operations of the first three iterations of the example are represented in matrices as follows.

  • 1st iteration (δ = 0, S = x^3 + x^2 + 1, R = x^2 + 1): R(x) := x R(x); δ := δ + 1

    $\begin{bmatrix} R(x) \\ S(x) \end{bmatrix} := \begin{bmatrix} x & 0 \\ 0 & 1 \end{bmatrix} \times \begin{bmatrix} R(x) \\ S(x) \end{bmatrix}$

  • 2nd iteration (δ = 1, S = x^3 + x^2 + 1, R = x^3 + x): S(x) := x (S(x) − R(x)); δ := δ − 1

    $\begin{bmatrix} R(x) \\ S(x) \end{bmatrix} := \begin{bmatrix} 1 & 0 \\ x & x \end{bmatrix} \times \begin{bmatrix} R(x) \\ S(x) \end{bmatrix}$

  • 3rd iteration (δ = 0, S = x^3 + x^2 + x, R = x^3 + x): S(x) := x (S(x) − R(x)); S(x) ↔ R(x); δ := δ + 1

    $\begin{bmatrix} R(x) \\ S(x) \end{bmatrix} := \begin{bmatrix} x & x \\ 1 & 0 \end{bmatrix} \times \begin{bmatrix} R(x) \\ S(x) \end{bmatrix}$

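SLIDE 28

The Matrix H(x) (2/2)

The operations in these three iterations can be represented as

    $\begin{bmatrix} R(x) \\ S(x) \end{bmatrix} := \begin{bmatrix} x & x \\ 1 & 0 \end{bmatrix} \times \begin{bmatrix} 1 & 0 \\ x & x \end{bmatrix} \times \begin{bmatrix} x & 0 \\ 0 & 1 \end{bmatrix} \times \begin{bmatrix} R(x) \\ S(x) \end{bmatrix}$

    $\begin{bmatrix} x & x \\ 1 & 0 \end{bmatrix} \times \begin{bmatrix} 1 & 0 \\ x & x \end{bmatrix} \times \begin{bmatrix} x & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} x^3 + x^2 & x^2 \\ x & 0 \end{bmatrix} = H(x)$

By using H(x):

  • We can calculate the operations in these three iterations at once.
  • We can use the MULGF2 instruction efficiently.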
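The matrix product above can be checked mechanically. In this sketch the three per-iteration matrices are written out by hand from the operations of the example. Polynomials are integers with bit i holding the coefficient of x^i, and the helper names `clmul`, `mat_mul`, and `apply_matrix` are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Polynomial multiplication on GF(2)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p


def mat_mul(p, q):
    """Product of two 2x2 matrices whose entries are GF(2) polynomials."""
    return tuple(
        tuple(clmul(p[i][0], q[0][j]) ^ clmul(p[i][1], q[1][j]) for j in range(2))
        for i in range(2)
    )


def apply_matrix(h, r: int, s: int):
    """Compute H(x) * [R(x), S(x)]^T."""
    return (clmul(h[0][0], r) ^ clmul(h[0][1], s),
            clmul(h[1][0], r) ^ clmul(h[1][1], s))


X = 0b10                          # the polynomial x
M1 = ((X, 0), (0, 1))             # 1st iteration: R := x R
M2 = ((1, 0), (X, X))             # 2nd iteration: S := x (S - R)
M3 = ((X, X), (1, 0))             # 3rd iteration: S := x (S - R), then swap

H = mat_mul(M3, mat_mul(M2, M1))  # later iterations multiply from the left

# Apply H to the initial R = x^2 + 1 and S = x^3 + x^2 + 1 of the example.
r_new, s_new = apply_matrix(H, 0b0101, 0b1101)
```

The result matches the state at the start of the 4th iteration of the example (R = x^3, S = x^3 + x), reached in one matrix application instead of three separate steps.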

SLIDE 29

New Algorithm

  1. calculates H(x) from the most significant words of R(x) and S(x) with only single-word operations

  2. calculates

    $\begin{bmatrix} R(x) & U(x) \\ S(x) & V(x) \end{bmatrix} := H(x) \times \begin{bmatrix} R(x) & U(x) \\ S(x) & V(x) \end{bmatrix}$

     efficiently by using MULGF2

  3. continues the process until R(x) becomes 0
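Step 2 is where a polynomial multiply instruction pays off: each entry of H(x) fits in one word, so one row of the product is a pair of multi-word by single-word multiplications. The sketch below illustrates that idea only and is not the authors' implementation. The 16-bit word size, the little-endian word lists, and the names `mulgf2`, `mul_by_word`, and `apply_row` are all assumptions.

```python
WORD_BITS = 16
MASK = (1 << WORD_BITS) - 1


def mulgf2(rs: int, rt: int):
    """Emulated one-word carry-less multiply; returns (hi, lo)."""
    product = 0
    for i in range(WORD_BITS):
        if (rt >> i) & 1:
            product ^= rs << i
    return product >> WORD_BITS, product & MASK


def mul_by_word(words, h: int):
    """Multiply a multi-word polynomial (least significant word first) by h.

    Uses one mulgf2 per input word; partial products are merged with XOR.
    """
    out = [0] * (len(words) + 1)
    for i, w in enumerate(words):
        hi, lo = mulgf2(w, h)
        out[i] ^= lo
        out[i + 1] ^= hi
    return out


def apply_row(h0: int, h1: int, r_words, s_words):
    """Compute h0 * R(x) + h1 * S(x) for equal-length word lists."""
    a = mul_by_word(r_words, h0)
    b = mul_by_word(s_words, h1)
    return [x ^ y for x, y in zip(a, b)]


# First row of H(x) = [x^3 + x^2, x^2] applied to R = x^2 + 1, S = x^3 + x^2 + 1.
new_r = apply_row(0b1100, 0b100, [0b0101], [0b1101])
```

With k iterations folded into H(x), each word of R(x), S(x), U(x), and V(x) is touched once per batch instead of once per iteration.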

SLIDE 30

Evaluation

  • We compared the # of MULGF2 and XOR instructions of the proposed algorithm with that of the conventional one.
  • Assumptions:
      - We compared the average # of instructions for executing inversion of 1,000 random elements.
      - We counted instructions for multi-word operations in the two algorithms.
      - MULGF2 has single-cycle latency.

SLIDES 31-32

Comparison of # of instructions (1/2)

  • Word size of the processor = 16

    [Figure: # of instructions (axis ticks 5000 to 55000) versus m (163, 203, 283, 409, 571) for the proposed and conventional algorithms]

  • The proposed algorithm is fast for large m.

SLIDES 33-35

Comparison of # of instructions (2/2)

  • Word size of the processor = 32

    [Figure: # of instructions (axis ticks 5000 to 30000) versus m (163, 203, 283, 409, 571) for the proposed and conventional algorithms]

  • The proposed algorithm is fast for large m: about half the # of instructions.
  • The proposed algorithm is very fast when both m and the word size are large.

SLIDE 36

Concluding Remarks

  • We have proposed a new algorithm for inversion in GF(2^m).
  • The matrix H(x)
      - represents operations corresponding to several contiguous iterations of Brunner's algorithm
      - is obtained with only single-word operations
  • The algorithm is suitable for implementation using MULGF2 and is executed with regularity.
  • When both m and the word size of a processor are large, the proposed algorithm can execute inversion very fast.

SLIDE 37

Thank you for listening!
