SLIDE 1

On the use of polynomial matrix approximant in the block Wiedemann algorithm

Pascal Giorgi

ECOLE NORMALE SUPERIEURE DE LYON

Laboratoire LIP, ENS Lyon, France
Symbolic Computation Group, School of Computer Science, University of Waterloo, Canada

In collaboration with C.-P. Jeannerod and G. Villard, ENS Lyon (France)
June 6, 2005, CMS/CSHPM Summer 2005 Meeting, Mathematics of Computer Algebra and Analysis

SLIDE 2

Motivations

Large sparse linear systems are involved in many mathematical applications

  • over a field:
    ◮ integer factorization [Odlyzko 1999],
    ◮ discrete logarithm [Odlyzko 1999; Thomé 2003].

  • over the integers:
    ◮ number theory [Cohen 1993],
    ◮ group theory [Newman 1972],
    ◮ integer programming [Aardal, Hurkens, Lenstra 1999].

SLIDE 3

Solving sparse systems over finite fields

Matrices arising in practice have dimensions around a few million, with a few million nonzero entries.
The use of classical Gaussian elimination is ruled out due to fill-in.

SLIDE 4

Solving sparse systems over finite fields

Matrices arising in practice have dimensions around a few million, with a few million nonzero entries.
The use of classical Gaussian elimination is ruled out due to fill-in.
⇒ algorithms must preserve the sparsity of the matrix

SLIDE 5

Solving sparse systems over finite fields

Matrices arising in practice have dimensions around a few million, with a few million nonzero entries.
The use of classical Gaussian elimination is ruled out due to fill-in.
⇒ algorithms must preserve the sparsity of the matrix

Iterative methods have proved successful over a finite field:
    ◮ Krylov/Wiedemann method [Wiedemann 1986],
    ◮ conjugate gradient [LaMacchia, Odlyzko 1990],
    ◮ Lanczos method [LaMacchia, Odlyzko 1990; Lambert 1996],
    ◮ block Lanczos method [Coppersmith 1993; Montgomery 1995],
    ◮ block Krylov/Wiedemann method [Coppersmith 1994; Thomé 2002].

SLIDE 6

Scheme of the Krylov/Wiedemann method

Let A ∈ 𝔽^(N×N) be of full rank and b ∈ 𝔽^N. Then x = A^(-1) b can be expressed as a linear combination of the Krylov subspace {b, Ab, ..., A^N b}.

SLIDE 7

Scheme of the Krylov/Wiedemann method

Let A ∈ 𝔽^(N×N) be of full rank and b ∈ 𝔽^N. Then x = A^(-1) b can be expressed as a linear combination of the Krylov subspace {b, Ab, ..., A^N b}.

Let Π_{A,b}(λ) = c_0 + c_1 λ + ... + λ^d ∈ 𝔽[λ] be the minimal polynomial of the sequence {A^i b}_{i≥0}.

SLIDE 8

Scheme of the Krylov/Wiedemann method

Let A ∈ 𝔽^(N×N) be of full rank and b ∈ 𝔽^N. Then x = A^(-1) b can be expressed as a linear combination of the Krylov subspace {b, Ab, ..., A^N b}.

Let Π_{A,b}(λ) = c_0 + c_1 λ + ... + λ^d ∈ 𝔽[λ] be the minimal polynomial of the sequence {A^i b}_{i≥0}.

Then A × [ (−1/c_0) (c_1 b + c_2 Ab + ... + A^(d−1) b) ] = b.

SLIDE 9

Scheme of the Krylov/Wiedemann method

Let A ∈ 𝔽^(N×N) be of full rank and b ∈ 𝔽^N. Then x = A^(-1) b can be expressed as a linear combination of the Krylov subspace {b, Ab, ..., A^N b}.

Let Π_{A,b}(λ) = c_0 + c_1 λ + ... + λ^d ∈ 𝔽[λ] be the minimal polynomial of the sequence {A^i b}_{i≥0}.

Then A × [ (−1/c_0) (c_1 b + c_2 Ab + ... + A^(d−1) b) ] = b, so the bracketed vector is the solution x.

SLIDE 10

Scheme of the Krylov/Wiedemann method

Let A ∈ 𝔽^(N×N) be of full rank and b ∈ 𝔽^N. Then x = A^(-1) b can be expressed as a linear combination of the Krylov subspace {b, Ab, ..., A^N b}.

Let Π_{A,b}(λ) = c_0 + c_1 λ + ... + λ^d ∈ 𝔽[λ] be the minimal polynomial of the sequence {A^i b}_{i≥0}.

Then A × [ (−1/c_0) (c_1 b + c_2 Ab + ... + A^(d−1) b) ] = b, so the bracketed vector is the solution x.

Choose u ∈ 𝔽^N uniformly at random and compute the minimal polynomial Π_{u,A,b} of the scalar sequence {u^T A^i b}_{i≥0}.
With probability greater than 1 − deg(Π_{A,b}) / Card(𝔽), we have Π_{A,b} = Π_{u,A,b}.

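A minimal illustration of this identity (my own sketch in Python, not the talk's code): over a small prime field GF(p) the minimal-polynomial coefficients are obtained here by dense linear algebra on the Krylov matrix (assuming the generic degree d = N, not by Berlekamp/Massey), and the solution is then read off as x = −c_0^(−1)(c_1 b + c_2 Ab + ... + A^(d−1) b).

p = 10007  # a small prime; GF(p) arithmetic is done with Python ints modulo p

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) % p for row in A]

def solve_mod_p(M, rhs):
    """Solve M y = rhs over GF(p) by Gauss-Jordan elimination (M assumed nonsingular)."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] % p != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [a * inv % p for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % p for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Toy instance: a 4x4 invertible matrix A and a right-hand side b over GF(p).
A = [[2, 1, 0, 5],
     [0, 3, 1, 0],
     [4, 0, 1, 2],
     [1, 1, 0, 3]]
b = [1, 2, 3, 4]
N = len(A)

# Krylov vectors b, Ab, ..., A^N b
krylov = [b]
for _ in range(N):
    krylov.append(mat_vec(A, krylov[-1]))

# Coefficients c_0..c_{N-1} with c_0 b + c_1 Ab + ... + c_{N-1} A^(N-1) b + A^N b = 0
# (assumes the minimal polynomial of the sequence has the generic degree d = N).
K = [[krylov[j][i] for j in range(N)] for i in range(N)]   # columns are b, Ab, ..., A^(N-1) b
c = solve_mod_p(K, [(-v) % p for v in krylov[N]])
c.append(1)                                                # leading coefficient of lambda^N

# x = -c_0^(-1) * (c_1 b + c_2 Ab + ... + A^(N-1) b)
inv_c0 = pow(c[0], -1, p)
x = [(-inv_c0) * sum(c[i + 1] * krylov[i][j] for i in range(N)) % p for j in range(N)]

assert mat_vec(A, x) == [v % p for v in b]
print("x =", x)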

SLIDE 11

Wiedemann algorithm

Three steps:

  1. compute 2N + ε elements of the sequence {u^T A^i b}_{i≥0},
  2. compute the minimal polynomial of the scalar sequence,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

SLIDE 12

Wiedemann algorithm

Three steps:

  1. compute 2N + ε elements of the sequence {u^T A^i b}_{i≥0},
  2. compute the minimal polynomial of the scalar sequence,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

SLIDE 13

Wiedemann algorithm

Three steps:

  1. compute 2N + ε elements of the sequence {u^T A^i b}_{i≥0},
  2. compute the minimal polynomial of the scalar sequence,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: 2N matrix-vector products + 2N dot products
⇒ cost 2Nγ + O(N^2) field operations

SLIDE 14

Wiedemann algorithm

Three steps:

  1. compute 2N + ε elements of the sequence {u^T A^i b}_{i≥0},
  2. compute the minimal polynomial of the scalar sequence,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: 2N matrix-vector products + 2N dot products
⇒ cost 2Nγ + O(N^2) field operations

step 2: use of the Berlekamp/Massey algorithm [Berlekamp 1968; Massey 1969]
⇒ cost O(N^2) field operations

SLIDE 15

Wiedemann algorithm

Three steps:

  1. compute 2N + ε elements of the sequence {u^T A^i b}_{i≥0},
  2. compute the minimal polynomial of the scalar sequence,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: 2N matrix-vector products + 2N dot products
⇒ cost 2Nγ + O(N^2) field operations

step 2: use of the Berlekamp/Massey algorithm [Berlekamp 1968; Massey 1969]
⇒ cost O(N^2) field operations

step 3: d − 1 matrix-vector products + d − 1 vector operations
⇒ cost at most Nγ + O(N^2) field operations

Total cost: O(Nγ + N^2) field operations, with O(N) additional space.

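Step 2 is the Berlekamp/Massey stage. The following is a compact textbook sketch over a prime field GF(p) (an illustration of the technique, not the LinBox implementation); it returns the minimal polynomial in the convention used on the previous slides, c_0 + c_1 λ + ... + λ^d.

def berlekamp_massey(s, p):
    """Berlekamp/Massey over GF(p), p prime.
    Input: a linearly recurrent sequence s (at least 2*degree terms).
    Output: coefficients [c_0, ..., c_{d-1}, 1] of the minimal polynomial
    c_0 + c_1*x + ... + x^d, i.e. c_0*s_j + c_1*s_{j+1} + ... + s_{j+d} = 0 for all j."""
    C, B = [1], [1]          # connection polynomials, constant term first
    L, m, b = 0, 1, 1
    for n in range(len(s)):
        # discrepancy d = s_n + sum_{i=1}^{L} C_i * s_{n-i}
        d = s[n] % p
        for i in range(1, L + 1):
            d = (d + C[i] * s[n - i]) % p
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d * pow(b, -1, p) % p
        if len(B) + m > len(C):
            C += [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    # reverse the connection polynomial to obtain the monic minimal polynomial
    C = (C + [0] * (L + 1 - len(C)))[:L + 1]
    return [C[L - i] % p for i in range(L + 1)]

# tiny check: the Fibonacci sequence mod 97 satisfies s_{j+2} = s_{j+1} + s_j
print(berlekamp_massey([1, 1, 2, 3, 5, 8, 13, 21], 97))   # -> [96, 96, 1]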

SLIDE 16

Block Wiedemann method

Replace the projection vectors by blocks of vectors. Let U ∈ 𝔽^(m×N) and V = [b V̄] ∈ 𝔽^(N×n). We now consider the matrix sequence {U A^i V}_{i≥0}.

Main interest:
    ◮ parallel coarse- and fine-grain implementation (on the columns of V),
    ◮ better probability of success [Villard 1997],
    ◮ (1 + ε)N matrix-vector products (sequential) [Kaltofen 1995].

Difficulty: computing a minimal generating matrix polynomial of a matrix sequence.

SLIDE 17

Minimal generating matrix polynomial

Let {S_i}_{i≥0} be an m × m matrix sequence.

Let P ∈ 𝔽^(m×m)[λ] be minimal of degree k such that, for all j > 0:

    Σ_{i=0}^{k} S_{i+j} P^[i] = 0_{m×m}

The cost to compute P is:
    ◮ O(m^3 k^2) field operations [Coppersmith 1994],
    ◮ O˜(m^3 k log k) field operations [Beckermann, Labahn 1994; Thomé 2002],
    ◮ O˜(m^ω k log k) field operations [Giorgi, Jeannerod, Villard 2003],

where ω is the exponent in the complexity of matrix multiplication.

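For illustration (my own sketch, not from the talk), the defining relation above can be checked directly on a candidate generator given by its coefficient matrices:

# A matrix sequence S = [S_0, S_1, ...] and a candidate right generator given by its
# coefficient matrices P = [P^[0], ..., P^[k]] (all over GF(p)) are accepted when
#     sum_{i=0}^{k} S_{i+j} P^[i] = 0  for every shift j that the available terms allow.

import numpy as np

def is_right_generator(S, P, p):
    k = len(P) - 1
    for j in range(len(S) - k):
        acc = sum(np.asarray(S[i + j]) @ np.asarray(P[i]) for i in range(k + 1)) % p
        if np.any(acc):
            return False
    return True

# scalar (m = 1) toy case: S_i = 2^i satisfies -2*S_j + S_{j+1} = 0 for all j
S = [np.array([[pow(2, i, 97)]]) for i in range(8)]
print(is_right_generator(S, [np.array([[-2 % 97]]), np.array([[1]])], 97))   # True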

SLIDE 18

Minimal generating matrix polynomial

Let {S_i}_{i≥0} be an m × m matrix sequence.

Let P ∈ 𝔽^(m×m)[λ] be minimal of degree k such that, for all j > 0:

    Σ_{i=0}^{k} S_{i+j} P^[i] = 0_{m×m}

The cost to compute P is:
    ◮ O(m^3 k^2) field operations [Coppersmith 1994],
    ◮ O˜(m^3 k log k) field operations [Beckermann, Labahn 1994; Thomé 2002],
    ◮ O˜(m^ω k log k) field operations [Giorgi, Jeannerod, Villard 2003],

where ω is the exponent in the complexity of matrix multiplication.

The latter complexity is based on σ-basis computation with:
  • a divide-and-conquer approach (idea from [Beckermann, Labahn 1994]),
  • matrix-product-based Gaussian elimination [Ibarra et al. 1982].

SLIDE 19

Minimal approximant basis: σ-basis

Problem: given a matrix power series G ∈ 𝔽^(m×n)[[λ]] and an approximation order d, find a minimal nonsingular polynomial matrix M ∈ 𝔽^(m×m)[λ] such that

    M G = λ^d R ∈ 𝔽^(m×n)[[λ]]

M is called an order d σ-basis of G.

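As an illustration of the order condition (my own sketch, with polynomial matrices represented as lists of numpy coefficient matrices over GF(p), constant term first), a direct check that M·G vanishes modulo λ^d; minimality is a separate degree condition that this check does not test.

import numpy as np

def is_order_d_approximant(M, G, d, p):
    m, n = M[0].shape[0], G[0].shape[1]
    for k in range(d):                      # k-th coefficient of M*G, for k < d
        acc = np.zeros((m, n), dtype=np.int64)
        for i in range(k + 1):
            if i < len(M) and k - i < len(G):
                acc = (acc + M[i] @ G[k - i]) % p
        if np.any(acc):
            return False
    return True

# toy example over GF(97):  G = [[1],[1]] + lambda*[[1],[0]]
# and M = [[1, -1-lambda], [lambda, -lambda]] satisfies M*G = lambda^2 * R
G = [np.array([[1], [1]]), np.array([[1], [0]])]
M = [np.array([[1, -1], [0, 0]]) % 97, np.array([[0, -1], [1, -1]]) % 97]
print(is_order_d_approximant(M, G, 2, 97))   # True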

SLIDE 20

Minimal approximant basis: σ-basis

Problem: given a matrix power series G ∈ 𝔽^(m×n)[[λ]] and an approximation order d, find a minimal nonsingular polynomial matrix M ∈ 𝔽^(m×m)[λ] such that

    M G = λ^d R ∈ 𝔽^(m×n)[[λ]]

M is called an order d σ-basis of G.

Minimality: let f(λ) = G(λ^n) [1, λ, λ^2, ..., λ^(n−1)]^T ∈ 𝔽[[λ]]^m. Every v ∈ 𝔽^(1×m)[λ] such that v(λ^n) f(λ) = λ^r w(λ) ∈ 𝔽[[λ]] with r ≥ nd has a unique decomposition

    v = Σ_{i=1}^{m} c^(i) M^(i,∗)   with   deg c^(i) + deg M^(i,∗) ≤ deg v.

SLIDE 21

Sketch of the reduction

Divide and conquer: [Beckermann, Labahn 1994: Theorem 6.1]

Given M′ and M″, two order d/2 σ-bases of G and of λ^(−d/2) M′ G respectively, the polynomial matrix M = M″ M′ is an order d σ-basis of G.

Base case (order 1 σ-basis):
    ◮ compute Δ = G mod λ,
    ◮ compute the LSP factorization of πΔ, with π a permutation,
    ◮ return M = D L^(−1) π, where D is a diagonal matrix (with 1's and λ's).

SLIDE 22

Sketch of the reduction

Divide and conquer: [Beckermann, Labahn 1994: Theorem 6.1]

Given M′ and M″, two order d/2 σ-bases of G and of λ^(−d/2) M′ G respectively, the polynomial matrix M = M″ M′ is an order d σ-basis of G.

Base case (order 1 σ-basis):
    ◮ compute Δ = G mod λ,
    ◮ compute the LSP factorization of πΔ, with π a permutation,
    ◮ return M = D L^(−1) π, where D is a diagonal matrix (with 1's and λ's).

Cost:
  • C(m, n, d) = 2 C(m, n, d/2) + 2 MM(m, d/2)
  • C(m, n, 1) = O(MM(m))

⇒ reduction to polynomial matrix multiplication

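This recursion is what LinBox implements as PM-Basis. Below is a deliberately simplified, self-contained Python sketch of the same structure: the base case uses plain Gaussian elimination with a recorded transform in place of the LSP factorization, and the row-degree bookkeeping that makes the output minimal is omitted, so the code only guarantees the order property M·G ≡ 0 mod λ^d and the composition M = M″ M′. Function names are mine, not LinBox's.

import numpy as np

def polmat_mul(A, B, p, trunc=None):
    """Product of polynomial matrices over GF(p) (lists of coefficient matrices,
    constant term first); if trunc is given, exactly `trunc` coefficients are returned."""
    length = len(A) + len(B) - 1 if trunc is None else trunc
    m, n = A[0].shape[0], B[0].shape[1]
    C = [np.zeros((m, n), dtype=np.int64) for _ in range(length)]
    for i, Ai in enumerate(A):
        for j, Bj in enumerate(B):
            if i + j < length:
                C[i + j] = (C[i + j] + Ai @ Bj) % p
    return C

def gauss_transform(Delta, p):
    """Invertible E over GF(p) such that E @ Delta has only rank(Delta) nonzero rows."""
    A = Delta.copy() % p
    m, n = A.shape
    E = np.eye(m, dtype=np.int64)
    row = 0
    for col in range(n):
        piv = next((r for r in range(row, m) if A[r, col]), None)
        if piv is None:
            continue
        A[[row, piv]] = A[[piv, row]]
        E[[row, piv]] = E[[piv, row]]
        inv = pow(int(A[row, col]), -1, p)
        A[row] = A[row] * inv % p
        E[row] = E[row] * inv % p
        for r in range(m):
            if r != row and A[r, col]:
                f = int(A[r, col])
                A[r] = (A[r] - f * A[row]) % p
                E[r] = (E[r] - f * E[row]) % p
        row += 1
    return E

def order1_basis(Delta, p):
    """Base case: rows of E that do not already cancel Delta get a factor lambda
    (the role of D * L^(-1) * pi on the slide)."""
    m = Delta.shape[0]
    E = gauss_transform(Delta, p)
    R = E @ Delta % p
    M0 = np.zeros((m, m), dtype=np.int64)
    M1 = np.zeros((m, m), dtype=np.int64)
    for i in range(m):
        (M1 if R[i].any() else M0)[i] = E[i]
    return [M0, M1]

def approx_basis(G, d, p):
    """Order-d approximant basis of G by divide and conquer (order property only)."""
    if d == 1:
        return order1_basis(G[0] % p, p)
    d1 = d // 2
    M1 = approx_basis(G, d1, p)
    residual = polmat_mul(M1, G, p, trunc=d)[d1:]     # lambda^(-d1) * (M1*G), d - d1 terms
    M2 = approx_basis(residual, d - d1, p)
    return polmat_mul(M2, M1, p)                      # M = M'' * M'

# quick check on a random instance: the first d coefficients of M*G vanish
rng = np.random.default_rng(0)
p, m, n, d = 97, 4, 2, 8
G = [rng.integers(0, p, size=(m, n)) for _ in range(d)]
M = approx_basis(G, d, p)
assert all(not c.any() for c in polmat_mul(M, G, p, trunc=d))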

SLIDE 23

σ-bases and the minimal generating matrix polynomial

Consider the matrix power series

    G(λ) = Σ_{i≥0} U A^i V λ^i ∈ 𝔽^(m×n)[[λ]].

Let P, T ∈ 𝔽^(n×n)[λ] and Q, S ∈ 𝔽^(m×m)[λ] define the right order 2N/m σ-basis

    [G  −I_m] · [P  S; Q  T] = λ^(2N/m) R ∈ 𝔽^(m×(m+n))[[λ]]

such that deg P > deg Q and P has full rank.

SLIDE 24

σ-bases and the minimal generating matrix polynomial

Consider the matrix power series

    G(λ) = Σ_{i≥0} U A^i V λ^i ∈ 𝔽^(m×n)[[λ]].

Let P, T ∈ 𝔽^(n×n)[λ] and Q, S ∈ 𝔽^(m×m)[λ] define the right order 2N/m σ-basis

    [G  −I_m] · [P  S; Q  T] = λ^(2N/m) R ∈ 𝔽^(m×(m+n))[[λ]]

such that deg P > deg Q and P has full rank.

The reversal matrix polynomial of P with respect to its column degrees defines a right minimal generating matrix polynomial for the matrix sequence {U A^i V}_{i≥0}.

Proof: for all k ∈ {deg P, ..., 2N/m},

    Σ_{i=0}^{deg P} G^[k−i] P^[i] = 0_{m×n}.

SLIDE 25

Block Wiedemann algorithm

Three steps:

  1. compute 2N/m + ε elements of the sequence {U A^i V}_{i≥0},
  2. compute the right minimal generating matrix polynomial through σ-basis computation,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

SLIDE 26

Block Wiedemann algorithm

Three steps:

  1. compute 2N/m + ε elements of the sequence {U A^i V}_{i≥0},
  2. compute the right minimal generating matrix polynomial through σ-basis computation,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

SLIDE 27

Block Wiedemann algorithm

Three steps:

  1. compute 2N/m + ε elements of the sequence {U A^i V}_{i≥0},
  2. compute the right minimal generating matrix polynomial through σ-basis computation,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: with m processors ⇒ cost 2Nγ/m + O(N^2) field operations

SLIDE 28

Block Wiedemann algorithm

Three steps:

  1. compute 2N/m + ε elements of the sequence {U A^i V}_{i≥0},
  2. compute the right minimal generating matrix polynomial through σ-basis computation,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: with m processors ⇒ cost 2Nγ/m + O(N^2) field operations
step 2: with 1 processor ⇒ cost O˜(m^(ω−1) N) field operations

SLIDE 29

Block Wiedemann algorithm

Three steps:

  1. compute 2N/m + ε elements of the sequence {U A^i V}_{i≥0},
  2. compute the right minimal generating matrix polynomial through σ-basis computation,
  3. compute the linear combination of {b, Ab, ..., A^(d−1) b}.

Let γ be the cost of applying A to a vector.

step 1: with m processors ⇒ cost 2Nγ/m + O(N^2) field operations
step 2: with 1 processor ⇒ cost O˜(m^(ω−1) N) field operations
step 3: with m processors ⇒ cost at most Nγ/m + O(N^2) field operations

Total: O(Nγ/m) + O(N^2) + O˜(n^(ω−1) N) field operations with m processors.

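For step 1, a dense sequential stand-in (illustration only; in LinBox A is applied as a sparse blackbox, and in the parallel version the columns of V are distributed over the processors):

import numpy as np

def block_sequence(A, U, V, count, p):
    """Return [U V, U A V, ..., U A^(count-1) V] modulo p (dense illustration)."""
    S, W = [], V % p
    for _ in range(count):
        S.append(U @ W % p)
        W = A @ W % p          # one blackbox application per column of V
    return S

# toy usage: 2N/m + 2 terms of the m x n block sequence
rng = np.random.default_rng(1)
p, N, m, n = 97, 60, 4, 4
A = rng.integers(0, p, size=(N, N))
U = rng.integers(0, p, size=(m, N))
V = rng.integers(0, p, size=(N, n))
seq = block_sequence(A, U, V, 2 * N // m + 2, p)
print(len(seq), seq[0].shape)    # 32 terms of size m x n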

SLIDE 30

Implementation within the LinBox library

◮ LinBox project (Canada-France-USA): www.linalg.org
◮ Generic implementation with respect to: finite field, blackbox.
◮ σ-basis implementation:
  • hybrid dense linear algebra over finite fields [Dumas, Giorgi, Pernet 2004],
  • polynomial matrix multiplication: Karatsuba algorithm + BLAS-based matrix multiplication,
  • Karatsuba polynomial matrix middle product [Hanrot et al. 2003].

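A sketch of the "Karatsuba + BLAS" idea on this slide (my own Python illustration, not the LinBox code): the Karatsuba recursion runs on the polynomial variable, and every coefficient product is a single dense matrix multiplication, which is the role the BLAS plays in LinBox.

import numpy as np

def _add(A, B, p):
    """Coefficient-wise sum of two polynomial matrices (lists may have different lengths)."""
    if len(A) < len(B):
        A, B = B, A
    return [(A[i] + (B[i] if i < len(B) else 0)) % p for i in range(len(A))]

def karatsuba_polmat_mul(A, B, p):
    """Karatsuba product of two polynomial matrices over GF(p)
    (lists of numpy coefficient matrices, constant term first)."""
    m, k_, n = A[0].shape[0], A[0].shape[1], B[0].shape[1]
    # pad to a common length so the recursion only ever sees equal-length operands
    L = max(len(A), len(B))
    A = A + [np.zeros((m, k_), dtype=np.int64)] * (L - len(A))
    B = B + [np.zeros((k_, n), dtype=np.int64)] * (L - len(B))
    if L == 1:
        return [A[0] @ B[0] % p]          # one dense matrix product
    h = (L + 1) // 2
    A0, A1, B0, B1 = A[:h], A[h:], B[:h], B[h:]
    P0 = karatsuba_polmat_mul(A0, B0, p)                              # low part
    P2 = karatsuba_polmat_mul(A1, B1, p)                              # high part
    P1 = karatsuba_polmat_mul(_add(A0, A1, p), _add(B0, B1, p), p)    # middle part
    # result = P0 + lambda^h * (P1 - P0 - P2) + lambda^(2h) * P2
    out = [np.zeros((m, n), dtype=np.int64) for _ in range(2 * L - 1)]
    for i, c in enumerate(P0):
        out[i] = (out[i] + c) % p
        out[i + h] = (out[i + h] - c) % p
    for i, c in enumerate(P2):
        out[i + 2 * h] = (out[i + 2 * h] + c) % p
        out[i + h] = (out[i + h] - c) % p
    for i, c in enumerate(P1):
        out[i + h] = (out[i + h] + c) % p
    return out

# quick consistency check against the schoolbook product
rng = np.random.default_rng(2)
p = 97
A = [rng.integers(0, p, size=(3, 3)) for _ in range(5)]
B = [rng.integers(0, p, size=(3, 2)) for _ in range(4)]
C = karatsuba_polmat_mul(A, B, p)
naive = [sum(A[i] @ B[k - i] for i in range(max(0, k - len(B) + 1), min(k + 1, len(A)))) % p
         for k in range(len(A) + len(B) - 1)]
assert all((C[k] == naive[k]).all() for k in range(len(naive)))
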
SLIDE 31

Performance: minimal generating matrix polynomial

  • over GF(17), matrix sparsity is 99%, block dimension is 20

[Plot: minimal generating matrix polynomial vs minimal polynomial; time in seconds against matrix size (5,000 to 30,000) for PM-Basis (block), M-Basis (block) and Berlekamp/Massey (scalar).]

N = 30,000: in practice, block/scalar time ratio ≈ O(1); scalar sequence computation: ≈ 12.4 h.

SLIDE 32

Conclusions

  • O˜(n^ω d) algorithm for the Padé approximation problem.
    → advantage for solving sparse linear systems (block Wiedemann)
    → O˜(n^ω d) algorithm for column reduction [Giorgi, Jeannerod, Villard 2003]

  • Still unclear: characteristic polynomial, Hermite and Frobenius forms.

Into practice:

  • fast polynomial matrix multiplication [Cantor, Kaltofen 1991; Bostan, Schost 2004],
  • compare LinBox with Magma (implementation of Thomé's algorithm),
  • specialization for GF(2) and parallel implementation (SMP, grid).