Symmetric Indefinite Triangular Factorization Revealing the Rank - - PowerPoint PPT Presentation

symmetric indefinite triangular factorization revealing
SMART_READER_LITE
LIVE PREVIEW

Symmetric Indefinite Triangular Factorization Revealing the Rank - - PowerPoint PPT Presentation

Symmetric Indefinite Triangular Factorization Revealing the Rank Profile Matrix Jean-Guillaume Dumas, Cl ement Pernet Universit e Grenoble Alpes, Laboratoire Jean Kuntzmann, UMR CNRS ISSAC18, New York, USA July 17, 2017 Supported by


slide-1
SLIDE 1

Symmetric Indefinite Triangular Factorization Revealing the Rank Profile Matrix

Jean-Guillaume Dumas, Cl´ ement Pernet

Universit´ e Grenoble Alpes, Laboratoire Jean Kuntzmann, UMR CNRS

ISSAC’18, New York, USA July 17, 2017

Supported by OpenDreamKit Horizon 2020 European RI project (#676541)

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 1 / 16

slide-2
SLIDE 2

Introduction

Context

Applications of symmetric Gaussian elimination Symmetric linear system solving Signature LLL: R factor of rational QR [Villard’12])

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 2 / 16

slide-3
SLIDE 3

Introduction

Context

Applications of symmetric Gaussian elimination Symmetric linear system solving Signature LLL: R factor of rational QR [Villard’12]) Compared to unsymmetric Gaussian elimination Save a factor of 2 in time complexity Invariants specific to symmetric matrices (signature)

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 2 / 16

slide-4
SLIDE 4

Introduction

Context

Applications of symmetric Gaussian elimination Symmetric linear system solving Signature LLL: R factor of rational QR [Villard’12]) Compared to unsymmetric Gaussian elimination Save a factor of 2 in time complexity Invariants specific to symmetric matrices (signature) Motivation here fsytrf: finite field dense symmetric elimination in fflas-ffpack to be lifted for LinBox signature over Z reduction to matrix product: O(n2r ω−2) and BLAS3 investigate symmetric rank profile matrix and related pivoting

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 2 / 16

slide-5
SLIDE 5

Introduction

Outline

1

State of the art on symmetric factorizations

2

Rank profile and pivoting

3

Algorithms

4

The characteristic 2 case

5

Performance

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 3 / 16

slide-6
SLIDE 6

State of the art on symmetric factorizations

Symmetric factorizations

Symmetric Decomposition Exists for

A B BT =

Field with sqrt & Generic rank profile

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 4 / 16

slide-7
SLIDE 7

State of the art on symmetric factorizations

Symmetric factorizations

Symmetric Decomposition Exists for

A B BT =

Field with sqrt & Generic rank profile

D A L LT =

Generic rank profile

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 4 / 16

slide-8
SLIDE 8

State of the art on symmetric factorizations

Symmetric factorizations

Symmetric Decomposition Exists for

A B BT =

Field with sqrt & Generic rank profile

D A L LT =

Generic rank profile

L P P D A LT

T

=

No [ 0 1

1 0 ]-like blocks

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 4 / 16

slide-9
SLIDE 9

State of the art on symmetric factorizations

Symmetric factorizations

Symmetric Decomposition Exists for

A B BT =

Field with sqrt & Generic rank profile

D A L LT =

Generic rank profile

L P P D A LT

T

=

No [ 0 1

1 0 ]-like blocks

L P P T A LT

T

=

Any [Parlett-Reid 1970] T tridiagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 4 / 16

slide-10
SLIDE 10

State of the art on symmetric factorizations

Symmetric factorizations

Symmetric Decomposition Exists for

A B BT =

Field with sqrt & Generic rank profile

D A L LT =

Generic rank profile

L P P D A LT

T

=

No [ 0 1

1 0 ]-like blocks

L P P T A LT

T

=

Any [Parlett-Reid 1970] T tridiagonal . . .

L P P A LT

T

= Y

Any [Bunch-Kaufmann 1977] Y with 1×1 and 2×2 blocks

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 4 / 16

slide-11
SLIDE 11

State of the art on symmetric factorizations

State of the art

Form Properties

T

[Parlett-Reid 1970] [Bunch-Parlett 1971] [Aasen 1971]

Diagonal Pivoting Full pivoting Partial pivoting Iterative Iterative Iterative

2 3n3 2 3n3 1 3n3

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 5 / 16

slide-12
SLIDE 12

State of the art on symmetric factorizations

State of the art

Form Properties

T

[Parlett-Reid 1970] [Bunch-Parlett 1971] [Aasen 1971]

Diagonal Pivoting Full pivoting Partial pivoting Iterative Iterative Iterative

2 3n3 2 3n3 1 3n3

Y

[Bunch-Kaufmann 1977]

Partial pivoting Iterative

1 3n3

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 5 / 16

slide-13
SLIDE 13

State of the art on symmetric factorizations

State of the art

Form Properties

T

[Parlett-Reid 1970] [Bunch-Parlett 1971] [Aasen 1971]

Diagonal Pivoting Full pivoting Partial pivoting Iterative Iterative Iterative

2 3n3 2 3n3 1 3n3

Y

[Bunch-Kaufmann 1977] [Shklarski-Toledo 2011]

Partial pivoting Partial pivoting Iterative Recursive (GRP hyp.)

1 3n3 1 3n3

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 5 / 16

slide-14
SLIDE 14

State of the art on symmetric factorizations

State of the art

Form Properties

T

[Parlett-Reid 1970] [Bunch-Parlett 1971] [Aasen 1971]

Diagonal Pivoting Full pivoting Partial pivoting Iterative Iterative Iterative

2 3n3 2 3n3 1 3n3

Y

[Bunch-Kaufmann 1977] [Shklarski-Toledo 2011] [Yamazaki-Dongarra 2017]

Partial pivoting Partial pivoting Partial pivoting Iterative Recursive (GRP hyp.) Block Iterative

1 3n3 1 3n3 1 3n3

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 5 / 16

slide-15
SLIDE 15

State of the art on symmetric factorizations

State of the art

Form Properties

T

[Parlett-Reid 1970] [Bunch-Parlett 1971] [Aasen 1971]

Diagonal Pivoting Full pivoting Partial pivoting Iterative Iterative Iterative

2 3n3 2 3n3 1 3n3

Y

[Bunch-Kaufmann 1977] [Shklarski-Toledo 2011] [Yamazaki-Dongarra 2017]

Partial pivoting Partial pivoting Partial pivoting Iterative Recursive (GRP hyp.) Block Iterative

1 3n3 1 3n3 1 3n3

Here Pivoting revealing the rank profile matrix Recursive for any matrix O(n2r ω−2) (gives 1

3n3 when rank=n & ω=3)

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 5 / 16

slide-16
SLIDE 16

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-17
SLIDE 17

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-18
SLIDE 18

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-19
SLIDE 19

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-20
SLIDE 20

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-21
SLIDE 21

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal ⇒ L∆LT with ∆ block diagonal, 1 × 1

  • r 2 × 2 [ 0 1

1 0 ] blocks

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-22
SLIDE 22

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal ⇒ L∆LT with ∆ block diagonal, 1 × 1

  • r 2 × 2 [ 0 1

1 0 ] blocks

Off-diagonal pivoting with non-zero diagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-23
SLIDE 23

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal ⇒ L∆LT with ∆ block diagonal, 1 × 1

  • r 2 × 2 [ 0 1

1 0 ] blocks

Off-diagonal pivoting with non-zero diagonal

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-24
SLIDE 24

Rank profile and pivoting

Symmetric pivoting

Diagonal pivoting ⇒ LDLT with D diagonal Off-diagonal pivoting with zero diagonal ⇒ L∆LT with ∆ block diagonal, 1 × 1

  • r 2 × 2 [ 0 1

1 0 ] blocks

Off-diagonal pivoting with non-zero diagonal ⇒ LDLT with D diagonal ⇒ requires division by 2

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 6 / 16

slide-25
SLIDE 25

Rank profile and pivoting

The rank profile matrix

Rank Profiles Given a matrix A of rank r: Example A =     2 3 4 5    

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 7 / 16

slide-26
SLIDE 26

Rank profile and pivoting

The rank profile matrix

Rank Profiles Given a matrix A of rank r: RRP (Row Rank Profile): first r linearly independant rows Example A =     2 3 4 5    

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 7 / 16

slide-27
SLIDE 27

Rank profile and pivoting

The rank profile matrix

Rank Profiles Given a matrix A of rank r: RRP (Row Rank Profile): first r linearly independant rows CRP (Column Rank Profile): first r linearly independant columns Example A =     2 3 4 5    

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 7 / 16

slide-28
SLIDE 28

Rank profile and pivoting

The rank profile matrix

Rank Profiles Given a matrix A of rank r: RRP (Row Rank Profile): first r linearly independant rows CRP (Column Rank Profile): first r linearly independant columns Example A =     2 3 4 5    

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 7 / 16

slide-29
SLIDE 29

Rank profile and pivoting

The rank profile matrix

Rank Profiles Given a matrix A of rank r: RRP (Row Rank Profile): first r linearly independant rows CRP (Column Rank Profile): first r linearly independant columns Example A =     2 3 4 5     RPM (Rank Profile Matrix) The unique RA such that any pair of (i, j)-leading sub-matrix of RA and of A have the same rank.

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 7 / 16

slide-30
SLIDE 30

Rank profile and pivoting

Pivoting revealing the rank profile matrix (unsymmetric)

[D., P., Sultan. ISSAC’15.] Computing the Rank Profile Matrix. Definition A = PLUQ reveals the rank profile matrix if P Ir

  • Q = RA

Search Row perm.

  • Col. perm.

RowRP ColRP RA Instance Row order Transposition Transposition ⋆ [IMH82] [JPS13]

  • Col. order

Transposition Transposition ⋆ [KG85] [JPS13] Lexico. Transposition Transposition ⋆ [Sto00] Lexico. Transposition Rotation ⋆ ⋆ ⋆ [DPS15] Lexico. Rotation Rotation ⋆ ⋆ ⋆ [DPS15]

  • Rev. lex.

Transposition Transposition ⋆ [Sto00]

  • Rev. lex.

Rotation Transposition ⋆ ⋆ ⋆ [DPS15]

  • Rev. lex.

Rotation Rotation ⋆ ⋆ ⋆ [DPS15] Product Rotation Transposition ⋆ [DPS15] Product Transposition Rotation ⋆ [DPS15] Product Rotation Rotation ⋆ ⋆ ⋆ [DPS13]

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 8 / 16

slide-31
SLIDE 31

Rank profile and pivoting

Pivoting revealing the rank profile matrix (symmetric case)

Definition A = PL∆LTPT reveals the rank profile matrix RA if P∆PT = Diag(d1, . . . , dn)RA

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 9 / 16

slide-32
SLIDE 32

Rank profile and pivoting

Pivoting revealing the rank profile matrix (symmetric case)

Definition A = PL∆LTPT reveals the rank profile matrix RA if P∆PT = Diag(d1, . . . , dn)RA Pivoting revealing the rank profile matrix

1

Find pivot with minimal coordinates w.r.t.

lexicographic, reverse lexicographic or product order

2

Move it using cyclic rotations

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 9 / 16

slide-33
SLIDE 33

Rank profile and pivoting

Pivoting revealing the rank profile matrix (symmetric case)

Definition A = PL∆LTPT reveals the rank profile matrix RA if P∆PT = Diag(d1, . . . , dn)RA Pivoting revealing the rank profile matrix

1

Find pivot with minimal coordinates w.r.t.

lexicographic, reverse lexicographic or product order

2

Move it using cyclic rotations ⇒ Off-diagonal pivoting with non-zero diagonal ⇒ Requires division by 2.

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 9 / 16

slide-34
SLIDE 34

Algorithms

The recursive algorithm: full rank case

1

Split

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 10 / 16

slide-35
SLIDE 35

Algorithms

The recursive algorithm: full rank case

1

Split

2

Recursive call: A11 = L1LT

1

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 10 / 16

slide-36
SLIDE 36

Algorithms

The recursive algorithm: full rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 10 / 16

slide-37
SLIDE 37

Algorithms

The recursive algorithm: full rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

SYRK: H ← A22 − GG T

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 10 / 16

slide-38
SLIDE 38

Algorithms

The recursive algorithm: full rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

SYRK: H ← A22 − GG T

5

Recursive call: H = L2LT

2

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 10 / 16

slide-39
SLIDE 39

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-40
SLIDE 40

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-41
SLIDE 41

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-42
SLIDE 42

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-43
SLIDE 43

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-44
SLIDE 44

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-45
SLIDE 45

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-46
SLIDE 46

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

8

TRMM: H21 ← H21 − M2X

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-47
SLIDE 47

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

8

TRMM: H21 ← H21 − M2X

9

TRSM: H21 ← H21L−T

2

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-48
SLIDE 48

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

8

TRMM: H21 ← H21 − M2X

9

TRSM: H21 ← H21L−T

2

10 SYR2K:

H22 ← H22 − M2HT

21 − H21MT 2

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-49
SLIDE 49

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

8

TRMM: H21 ← H21 − M2X

9

TRSM: H21 ← H21L−T

2

10 SYR2K:

H22 ← H22 − M2HT

21 − H21MT 2

11 Recursive call: H22 = L3LT

3

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-50
SLIDE 50

Algorithms

The recursive algorithm: arbitrary rank case

1

Split

2

Recursive call: A11 = L1LT

1

3

TRSM: G ← A21L−1

1

4

GEMM: K ← A22 − GMT

5

SYRK: H ← A22 − GG T

6

PLUQ: K = P2L2U2Q2

7

TRSYR2K: find X s.t. L2X T + XLT

2 = H11

8

TRMM: H21 ← H21 − M2X

9

TRSM: H21 ← H21L−T

2

10 SYR2K:

H22 ← H22 − M2HT

21 − H21MT 2

11 Recursive call: H22 = L3LT

3

12 Permutation J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 11 / 16

slide-51
SLIDE 51

Algorithms

The trsyr2k routine

Problem Given C symmetric and L lower triangular Find X lower triangular such that LX T + XLT = C L1 L2 L3 X T

1

X T

2

X T

3

  • +

X1 X2 X3 LT

1

LT

2

LT

3

  • =
  • C1

C T

2

C2 C3

  • .

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 12 / 16

slide-52
SLIDE 52

Algorithms

The trsyr2k routine

Problem Given C symmetric and L lower triangular Find X lower triangular such that LX T + XLT = C L1 L2 L3 X T

1

X T

2

X T

3

  • +

X1 X2 X3 LT

1

LT

2

LT

3

  • =
  • C1

C T

2

C2 C3

  • .

L1X1

T + X1LT 1 = C1

1

Find X1 s.t. L1X T

1 + X1LT 1 = C1

recurse

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 12 / 16

slide-53
SLIDE 53

Algorithms

The trsyr2k routine

Problem Given C symmetric and L lower triangular Find X lower triangular such that LX T + XLT = C L1 L2 L3 X T

1

X T

2

X T

3

  • +

X1 X2 X3 LT

1

LT

2

LT

3

  • =
  • C1

C T

2

C2 C3

  • .

L2X T

1 + X2LT 1 = C2

1

Find X1 s.t. L1X T

1 + X1LT 1 = C1

recurse

2

Y ← C2 − L2X T

1

trmm

3

X2 ← (LT

1 )−1

trsm

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 12 / 16

slide-54
SLIDE 54

Algorithms

The trsyr2k routine

Problem Given C symmetric and L lower triangular Find X lower triangular such that LX T + XLT = C L1 L2 L3 X T

1

X T

2

X T

3

  • +

X1 X2 X3 LT

1

LT

2

LT

3

  • =
  • C1

C T

2

C2 C3

  • .

L2X T

2 + X2LT 2 + L3X3 T + X3LT 3 = C3

1

Find X1 s.t. L1X T

1 + X1LT 1 = C1

recurse

2

Y ← C2 − L2X T

1

trmm

3

X2 ← (LT

1 )−1

trsm

4

Z ← C3 − L2X T

2 + X2L2

syr2k

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 12 / 16

slide-55
SLIDE 55

Algorithms

The trsyr2k routine

Problem Given C symmetric and L lower triangular Find X lower triangular such that LX T + XLT = C L1 L2 L3 X T

1

X T

2

X T

3

  • +

X1 X2 X3 LT

1

LT

2

LT

3

  • =
  • C1

C T

2

C2 C3

  • .

L2X T

2 + X2LT 2 + L3X3 T + X3LT 3 = C3

1

Find X1 s.t. L1X T

1 + X1LT 1 = C1

recurse

2

Y ← C2 − L2X T

1

trmm

3

X2 ← (LT

1 )−1

trsm

4

Z ← C3 − L2X T

2 + X2L2

syr2k

5

Find X3 s.t. L3X T

3 + X3LT 3 = Z

recurse

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 12 / 16

slide-56
SLIDE 56

The characteristic 2 case

Problem in characteristic 2

Consider c c d

  • (with R =

1 1

  • )

Diagonal pivoting:

  • c

c d

  • =
  • 1

1 1

c d

1

  • d

− c2

d

1

c d

1

  • 1

1

  • J-G. Dumas, C. Pernet (UGA)

P · L · ∆ · LT · PT ISSAC’18, New York 13 / 16

slide-57
SLIDE 57

The characteristic 2 case

Problem in characteristic 2

Consider c c d

  • (with R =

1 1

  • )

Diagonal pivoting:

  • c

c d

  • =
  • 1

1 1

c d

1

  • d

− c2

d

1

c d

1

  • 1

1

  • Off-diagonal pivoting:

c c d

  • =

1

d 2c

1 c c

  • 1

d 2c

1

  • J-G. Dumas, C. Pernet (UGA)

P · L · ∆ · LT · PT ISSAC’18, New York 13 / 16

slide-58
SLIDE 58

The characteristic 2 case

Problem in characteristic 2

Consider c c d

  • (with R =

1 1

  • )

Diagonal pivoting:

  • c

c d

  • =
  • 1

1 1

c d

1

  • d

− c2

d

1

c d

1

  • 1

1

  • Off-diagonal pivoting:

c c d

  • =

1

d 2c

1 c c

  • 1

d 2c

1

  • Diagonal pivoting breaks RPM-revealing property

⇒ off-diagonal pivoting required ⇒ division by 2 required

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 13 / 16

slide-59
SLIDE 59

The characteristic 2 case

Problem in characteristic 2

Consider c c d

  • (with R =

1 1

  • )

Diagonal pivoting:

  • c

c d

  • =
  • 1

1 1

c d

1

  • d

− c2

d

1

c d

1

  • 1

1

  • Off-diagonal pivoting:

c c d

  • =

1

d 2c

1 c c

  • 1

d 2c

1

  • Diagonal pivoting breaks RPM-revealing property

⇒ off-diagonal pivoting required ⇒ division by 2 required In characteristic 2 ⇒ In general there is no RPM-revealing P · L · ∆ · LT · PT!

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 13 / 16

slide-60
SLIDE 60

The characteristic 2 case

Characteristic 2: Preserve [ 0 1

1 1 ] factors Compute instead A = P · L · D · Ψ · LT · PT: ⇒ D diagonal, Ψ block diagonal with 1×1, [ 0 1

1 0 ] and [ 0 1 1 1 ] blocks

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 14 / 16

slide-61
SLIDE 61

The characteristic 2 case

Characteristic 2: Preserve [ 0 1

1 1 ] factors Compute instead A = P · L · D · Ψ · LT · PT: ⇒ D diagonal, Ψ block diagonal with 1×1, [ 0 1

1 0 ] and [ 0 1 1 1 ] blocks

From this intermediate form:

Either recover RA = P · RΨ · PT; Or use:

  • c

c d

  • =
  • 1

1

  • ·

1 c/d 1

  • ·

d −c2/d

  • ·

1 c/d 1

  • ·
  • 1

1

  • + commuting

⇒ ˜ P · ˜ L · ∆ · ˜ LT · ˜ PT symmetric factorization (but RA is lost)

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 14 / 16

slide-62
SLIDE 62

Performance

Iterative base case

Practical performance:

Stop recursion (less mod reductions/data move on small matrices)

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 15 / 16

slide-63
SLIDE 63

Performance

Iterative base case

Practical performance:

Stop recursion (less mod reductions/data move on small matrices)

⇒ iterative base case and cascading

pivot search minimizing the lexicographic order cyclic shifts on the row and columns Crout elimination schedule

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 15 / 16

slide-64
SLIDE 64

Performance

Iterative base case

Practical performance:

Stop recursion (less mod reductions/data move on small matrices)

⇒ iterative base case and cascading

pivot search minimizing the lexicographic order cyclic shifts on the row and columns Crout elimination schedule

5 10 15 20 200 400 600 800 1000 1200 1400 1600 Effective Gfops matrix dimension Rank n/2, random rank profile matrix, 1-core i5-4690, @3.5GHz Cascade Base case Recursive

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 15 / 16

slide-65
SLIDE 65

Performance

LAPACK vs FFPACK modulo 8 388 593

LAPACK FFPACK n dgetrf (LU) dsytrf (LDLT) fgetrf (LU) fsytrf (LDLT) 5000 2.01s 1.60s 3.90s 1.59s 10000 14.95s 11.98s 24.12s 10.90s

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 16 / 16

slide-66
SLIDE 66

Performance

LAPACK vs FFPACK modulo 8 388 593

LAPACK FFPACK n dgetrf (LU) dsytrf (LDLT) fgetrf (LU) fsytrf (LDLT) 5000 2.01s 1.60s 3.90s 1.59s 10000 14.95s 11.98s 24.12s 10.90s

5 10 15 20 25 30 35 40 45 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 5 10 15 20 25 30 35 40 45 Effective Gfops matrix dimension Full rank, 1-core Intel Haswell i5-4690, @3.5GHz LAPACK dgetrf LAPACK dsytrf FFPACK fgetrf FFPACK fsytrf

J-G. Dumas, C. Pernet (UGA) P · L · ∆ · LT · PT ISSAC’18, New York 16 / 16