SLIDE 1

On fast multiplication of a matrix by its transpose

Jean-Guillaume Dumas, Clément Pernet, Alexandre Sedoglavic

Luminy, March 3, 2020

Centre de Recherche en Informatique, Signal et Automatique de Lille

SLIDE 2

Strassen-Winograd fast multiplication algorithm

Outline

1. Strassen-Winograd fast multiplication algorithm
2. Fast matrix product by its transpose
3. Skew orthogonal matrices
4. Complexity bounds for block algorithms
5. Space and time efficient implementation
6. Minimality

Dumas-Pernet-Sedoglavic On fast multiplication of a matrix by its transpose JNCF 2020 2 / 23

SLIDE 6

Strassen-Winograd fast multiplication algorithm

2×2 matrix multiplication

[A11 A12; A21 A22] × [B11 B12; B21 B22] = [(A11·B11 + A12·B21) (A11·B12 + A12·B22); (A21·B11 + A22·B21) (A21·B12 + A22·B22)] = [C11 C12; C21 C22]

Classical algorithm: 8 multiplications, 4 additions
[Strassen 1969]: 7 multiplications, 18 additions
[Winograd 1973? 1977]: 7 multiplications, 15 additions
[Hopcroft-Kerr 1969]: 7 multiplications is minimal
[Bshouty 1995]: 15 additions is minimal (for a bilinear algorithm with 7 multiplications)
SLIDE 8

Strassen-Winograd fast multiplication algorithm

Matrix multiplication by its transpose A·A⊺

[A11 A12; A21 A22] × [A11⊺ A21⊺; A12⊺ A22⊺] = [(A11·A11⊺ + A12·A12⊺) (A21·A11⊺ + A22·A12⊺)⊺; (A21·A11⊺ + A22·A12⊺) (A21·A21⊺ + A22·A22⊺)] = [C11 C21⊺; C21 C22]

Divide & conquer algorithm: 6 multiplications, 3 additions
Here (over C, or over any finite field): 5 multiplications, 7.5 additions
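As a sketch, one divide & conquer step above (6 block products, 3 block additions, with C12 = C21⊺ for free) can be written with NumPy, assuming an even dimension; the four products involving a block and its own transpose would become recursive calls in the full algorithm:

```python
import numpy as np

def syrk_dc(A):
    """One divide & conquer step of C = A·Aᵀ: 6 block products, 3 block additions."""
    n = A.shape[0] // 2
    A11, A12 = A[:n, :n], A[:n, n:]
    A21, A22 = A[n:, :n], A[n:, n:]
    C11 = A11 @ A11.T + A12 @ A12.T   # recursive calls in the full algorithm
    C21 = A21 @ A11.T + A22 @ A12.T   # 2 general products
    C22 = A21 @ A21.T + A22 @ A22.T   # recursive calls in the full algorithm
    return np.block([[C11, C21.T], [C21, C22]])  # C12 = C21ᵀ by symmetry
```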

SLIDE 9

Fast matrix product by its transpose

SLIDE 10

Fast matrix product by its transpose

From the Strassen-Winograd fast multiplication algorithm

Require: A = [a11 a12; a21 a22] and B = [b11 b12; b21 b22]
Ensure: C = A·B
1. 8 additions: s1 ← a11 − a21, s2 ← a21 + a22, s3 ← s2 − a11, s4 ← a12 − s3; t1 ← b22 − b12, t2 ← b12 − b11, t3 ← b11 + t1, t4 ← b21 − t3.
2. 7 recursive multiplications: p1 ← a11·b11, p2 ← a12·b21, p3 ← a22·t4, p4 ← s1·t1, p5 ← s3·t3, p6 ← s4·b22, p7 ← s2·t2.
3. 7 final additions: c1 ← p1 + p5, c2 ← c1 + p4, c3 ← p1 + p2, c4 ← c2 + p3, c5 ← c2 + p7, c6 ← c1 + p7, c7 ← c6 + p6.
4. return C = [c3 c7; c4 c5].
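The schedule above transcribes directly to code; this sketch handles power-of-two sizes and falls back to the classical product below a threshold `leaf` (an illustrative parameter, as in the deck's threshold discussion):

```python
import numpy as np

def strassen_winograd(A, B, leaf=64):
    """Strassen-Winograd product: 7 multiplications, 15 additions per step."""
    n = A.shape[0]
    if n <= leaf or n % 2:
        return A @ B                      # classical product at the leaves
    m = n // 2
    a11, a12, a21, a22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    b11, b12, b21, b22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    # 8 additions
    s1, s2 = a11 - a21, a21 + a22
    s3, s4 = (a21 + a22) - a11, a12 - ((a21 + a22) - a11)
    t1, t2 = b22 - b12, b12 - b11
    t3, t4 = b11 + t1, b21 - (b11 + t1)
    # 7 recursive multiplications
    p1 = strassen_winograd(a11, b11, leaf)
    p2 = strassen_winograd(a12, b21, leaf)
    p3 = strassen_winograd(a22, t4, leaf)
    p4 = strassen_winograd(s1, t1, leaf)
    p5 = strassen_winograd(s3, t3, leaf)
    p6 = strassen_winograd(s4, b22, leaf)
    p7 = strassen_winograd(s2, t2, leaf)
    # 7 final additions
    c1 = p1 + p5
    c2 = c1 + p4
    c3 = p1 + p2
    c4 = c2 + p3
    c5 = c2 + p7
    c6 = c1 + p7
    c7 = c6 + p6
    return np.block([[c3, c7], [c4, c5]])
```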

SLIDE 12

Fast matrix product by its transpose

Matrix product by its transpose

Require: A = [a11 a12; a21 a22], with A⊺ = [a11⊺ a21⊺; a12⊺ a22⊺]
Ensure: C = A·A⊺
1. 6 additions: s1 ← a11 − a21, s2 ← a21 + a22, s3 ← s2 − a11, t1 ← a22⊺ − a21⊺, t3 ← a11⊺ + t1, t4 ← a12⊺ − t3 (s4 and t2 are crossed out: t2 = −s1⊺, and all variants have sign discrepancies).
2. 6 multiplications (2 recursive, 4 general): p1 ← a11·a11⊺, p2 ← a12·a12⊺, p3 ← a22·t4, p4 ← s1·t1, p5 ← s3·t3, p7 ← s2·s1⊺ (p6 is crossed out).
3. 5 final additions: c1 ← p1 + p5, c2 ← c1 + p4, c3 ← p1 + p2, c4 ← c2 + p3, c5 ← c2 − p7 (c6 and c7 are crossed out).
4. return C = [c3 c4⊺; c4 c5] (c7 is crossed out: the upper-right block is c4⊺ by symmetry).

SLIDE 13

Fast matrix product by its transpose

Parameterized matrix product by its transpose

Require: A = [a11 a12; a21 a22] and Y such that Y·Y⊺ = −In
Ensure: C = A·A⊺
1. 4 additions and 2 multiplications by Y: s1 ← (a21 − a11)·Y, s2 ← a22 − a21·Y, s3 ← −a11·Y − s2, s4 ← s3 + a12 (t1, t3 and t4 are crossed out).
2. 5 multiplications (3 recursive, 2 general): p1 ← a11·a11⊺, p2 ← a12·a12⊺, p3 ← a22·s4⊺, p4 ← s1·s2⊺, p5 ← s3·s3⊺ (p7 is crossed out).
3. 5 final additions: c1 ← p1 + p5, c2 ← c1 + p4, c3 ← p1 + p2, c4 ← c2 + p3, c5 ← c2 + p4⊺.
4. return C = [c3 c4⊺; c4 c5].

SLIDE 14

Fast matrix product by its transpose

Fast matrix product by its transpose, using symmetries

Require: A = [a11 a12; a21 a22] and Y such that Y·Y⊺ = −In
Ensure: C = A·A⊺
1. 4 additions and 2 multiplications by Y: s1 ← (a21 − a11)·Y, s2 ← a22 − a21·Y, s3 ← −a11·Y − s2, s4 ← s3 + a12.
2. 5 multiplications (3 recursive, 2 general): p1 ← a11·a11⊺, p2 ← a12·a12⊺, p3 ← a22·s4⊺, p4 ← s1·s2⊺, p5 ← s3·s3⊺.
3. 2 complete and 3 symmetric additions: Low(c1) ← Low(p1) + Low(p5), c2 ← c1 + p4, Low(c3) ← Low(p1) + Low(p2), Low(c5) ← Low(c2) + Low(p4⊺), c4 ← c2 + p3.
4. return C = [c3 c4⊺; c4 c5].
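A one-level sketch of the 5-multiplication scheme, instantiated over C with the skew orthogonal matrix Y = i·I (so multiplying a block by Y is just a scalar multiplication by i), for a real input of even dimension; for brevity this sketch does plain full additions rather than the triangular Low(...) updates:

```python
import numpy as np

def syrk_5mult(A):
    """One level of the 5-multiplication C = A·Aᵀ scheme, with Y = i·I."""
    n = A.shape[0] // 2
    A11, A12 = A[:n, :n], A[:n, n:]
    A21, A22 = A[n:, :n], A[n:, n:]
    # 4 additions and 2 multiplications by Y = i·I
    s1 = 1j * (A21 - A11)        # (a21 - a11)·Y
    s2 = A22 - 1j * A21          # a22 - a21·Y
    s3 = -1j * A11 - s2          # -a11·Y - s2
    s4 = s3 + A12
    # 5 multiplications (3 would be recursive calls, 2 are general products)
    p1 = A11 @ A11.T
    p2 = A12 @ A12.T
    p3 = A22 @ s4.T
    p4 = s1 @ s2.T
    p5 = s3 @ s3.T
    # final additions
    c2 = p1 + p5 + p4
    c3 = p1 + p2                 # C11
    c4 = c2 + p3                 # C21
    c5 = c2 + p4.T               # C22
    # imaginary parts cancel for real A (up to roundoff)
    return np.block([[c3, c4.T], [c4, c5]]).real
```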

SLIDE 15

Skew orthogonal matrices

SLIDE 23

Skew orthogonal matrices

Skew orthogonal matrices?

Definition (skew orthogonal matrix): Y such that Y·Y⊺ = −In.

✓ C: Y = i·In ➡ no-op: swap real/imaginary parts & 1 sign
✗ R
✗ Q
✓ F_{2^k} (p = 2): 1 = −1 is a square ➡ Y = In
✓ F_{p^k} with p ≡ 1 mod 4, or k even: −1 is a square ➡ Y = i·In
✓ Other finite fields, n ≥ 4: Y = [a b; −b a] ⊗ I_{n/2} = [a·I_{n/2} b·I_{n/2}; −b·I_{n/2} a·I_{n/2}] ∈ F_q^{n×n}

Proof. [a b; −b a]·[a −b; b a] = [a²+b² 0; 0 a²+b²] = [−1 0; 0 −1] if a² + b² = −1.
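A small sketch of the finite-field construction, over a prime field where −1 is not a square (p ≡ 3 mod 4): brute-force a pair with a² + b² ≡ −1 (the deck gives a fast algorithm for this later) and assemble the block matrix Y above:

```python
import numpy as np

p = 7  # prime with p ≡ 3 (mod 4), so -1 is not a square and Y = i·I is unavailable
# brute-force a, b with a² + b² ≡ -1 (mod p)
a, b = next((a, b) for a in range(p) for b in range(1, p)
            if (a * a + b * b) % p == p - 1)
n = 4
I = np.eye(n // 2, dtype=int)
# Y = [a b; -b a] ⊗ I_{n/2}, entries reduced mod p
Y = np.block([[a * I, b * I], [(p - b) * I, a * I]]) % p
# skew orthogonality: Y·Yᵀ ≡ -I_n (mod p)
assert ((Y @ Y.T) % p == (p - 1) * np.eye(n, dtype=int)).all()
```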

SLIDE 24

Complexity bounds for block algorithms

SLIDE 25

Complexity bounds for block algorithms

Leading terms of the complexity bounds for matrix multiplication

Definition (leading term of a complexity bound for matrix multiplication): n × n matrices can be multiplied at a cost dominated by MMω(n), where MMω(n) = cω·n^ω is such that matrix multiplication can be done in MMω(n) + o(MMω(n)) operations.

Classical method: MM3(n) = 2n³.
Strassen-Winograd, calling the classical algorithm below a given threshold:
➡ C(n) ≤ 7·C(⌈n/2⌉) + 15·⌈n/2⌉², with C(k) = 2k³ for k ≲ 1 000
➡ MMlog2(7)(n) = O(n^log2(7)), with log2(7) ≈ 2.807
. . .
[Le Gall 2014]: ω < 2.3728639

SLIDE 27

Complexity bounds for block algorithms

Leading terms of the complexity bounds for A·A⊺ over finite fields

Classical method: (1/2)·MM3(n) = n³.
Our algorithm:
➡ S(n) ≤ 3·S(⌈n/2⌉) + 2·MMω(⌈n/2⌉) + (7.5 + {2 or 6})·⌈n/2⌉², with S(k) = k³ for k ≲ 2 000
➡ leading term 2/(2^ω − 3)·MMω(n)

Problem: A·A⊺ ∈ F^{n×n}
Algorithm | O(n³) | O(n^log2(7)) | O(n^ω)
D & C | n³ | (2/3)·MMlog2(7)(n) | 2/(2^ω − 4)·MMω(n)
here | 0.8·n³ | (1/2)·MMlog2(7)(n) | 2/(2^ω − 3)·MMω(n)
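A quick numeric check of the leading constant in the O(n³) column, assuming MM3(m) = 2m³ and taking the additive term as 9.5·⌈n/2⌉² (7.5 additions plus 2 multiplications by a cheap Y, one of the deck's two cases):

```python
def S(n):
    """Operation count from S(n) = 3·S(n/2) + 2·MM3(n/2) + 9.5·(n/2)²,
    with MM3(m) = 2·m³ and a scalar base case; n a power of two."""
    if n == 1:
        return 1.0
    m = n // 2
    return 3 * S(m) + 2 * (2 * m**3) + 9.5 * m * m

n = 2**20
ratio = S(n) / n**3
# leading term 2/(2³ - 3)·MM3(n) = (2/5)·2·n³ = 0.8·n³
assert abs(ratio - 0.8) < 1e-3
```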

SLIDE 29

Complexity bounds for block algorithms

Leading terms of the complexity bounds over C

Problem | Algorithm | O(n³) | O(n^log2(7)) | O(n^ω)
A·B ∈ C^{n×n}, (U + iV)·(W + iX) | naive | 8n³ | 4·RRlog2(7)(n) | 4·RRω(n)
  | Karatsuba: 3M | 6n³ | 3·RRlog2(7)(n) | 3·RRω(n)
A·A⊺ ∈ C^{n×n} | 2M | 4n³ | 2·RRlog2(7)(n) | 2·RRω(n) (*)
  | D & C | 3n³ | 2·RRlog2(7)(n) | 6/(2^ω − 4)·RRω(n)
  | here | 2.4·n³ | (3/2)·RRlog2(7)(n) | 6/(2^ω − 3)·RRω(n)

(*) If ω < log2(6) ≈ 2.585, then 2 < 6/(2^ω − 3).

SLIDE 30

Space and time efficient implementation

SLIDE 32

Space and time efficient implementation

Scheduling: C ← α·A·A⊺

(Figure: task graph of the schedule over C22, C12, C21, C11 and the temporaries S1–S4, P1–P5, C1–C5.)

SYRK: C ← α·A·A⊺ + β·C

Operation | Location
S1 = (A21 − A11)·Y | tmp (n/2 × n/2)
S2 = A22 − A21·Y | C12
Up(C11) = Low(C22)⊺ | C11
P4⊺ = α·S2·S1⊺ | C22
S3 = S1 − A22 | tmp
P5 = α·S3·S3⊺ | C12
S4 = S3 + A12 | tmp
P3 = α·A22·S4⊺ + β·C21 | C21
P1 = α·A11·A11⊺ | tmp
U1 = P1 + P5 | C12
Up(U1) = Low(U1)⊺ | C12
U2 = U1 + P4 | C12
U4 = U2 + P3 | C21
U5 = U2 + P4⊺ + β·Up(C11)⊺ | C22
P2 = α·A12·A12⊺ + β·C11 | C11
U3 = P1 + P2 | C11

SLIDE 34

Space and time efficient implementation

Speed: LinBox/FFLAS-FFPACK

(Figure: effective Gfops, n³/(10⁹ × time), against n, for FSYRK on an i7-6700 @ 3.4 GHz; curves: classic OpenBLAS DSYRK, classic FSYRK modulo 131071, divide & conquer FSYRK modulo 131071, fast FSYRK modulo 131071, fast FSYRK modulo 131041; timing marks at 1 minute and 1'28''.)

SLIDE 35

Minimality

SLIDE 38

Minimality

On minimality of bilinear algorithms computing A·A⊺

Conjecture (5 multiplications): the best block variant has 3 recursive calls and 2 generic multiplications.
Conjecture (skew orthogonal): all variants with 5 multiplications require a skew orthogonal matrix.
Theorem (non-commutative): non-commutative 2×2 variants with 5 multiplications require at least 11 additions.
Theorem (9 additions): all block variants with 5 multiplications require at least 9 block additions.
➡ With the symmetries of the blocks this is reduced to (4 + 2 + 3/2)·n² = 7.5·n² additions.

SLIDE 40

Minimality

Strassen tensor

Example: Winograd's variant
p1 = a11·b11
p2 = a12·b21
p3 = (a11 − a21)·(b22 − b12)
p4 = (a21 + a22)·(b12 − b11)
p5 = (a11 + a12 − a21 − a22)·b22
p6 = a22·(b12 − b11 + b21 − b22)
p7 = (a21 − a11 + a22)·(b11 − b12 + b22)
c11 = p1 + p2
c12 = p1 + p4 + p5 + p7
c21 = p1 + p3 + p6 + p7
c22 = p1 + p3 + p4 + p7

Σ_{i=1}^{7} S_{i1} ⊗ S_{i2} ⊗ S_{i3} =
  [1 0; 0 0] ⊗ [1 0; 0 0] ⊗ [1 1; 1 1]
+ [0 1; 0 0] ⊗ [0 0; 1 0] ⊗ [1 0; 0 0]
+ [1 0; −1 0] ⊗ [0 −1; 0 1] ⊗ [0 0; 1 1]
+ [0 0; 1 1] ⊗ [−1 1; 0 0] ⊗ [0 1; 0 1]
+ [1 1; −1 −1] ⊗ [0 0; 0 1] ⊗ [0 1; 0 0]
+ [0 0; 0 1] ⊗ [−1 1; 1 −1] ⊗ [0 0; 1 0]
+ [−1 0; 1 1] ⊗ [1 −1; 0 1] ⊗ [0 1; 1 1]

Theorem ([de Groot 1978]): all the 7-multiplication variants (U, V, W unimodular) are given by Σ_i (U⁻¹·S_{i1}·V) ⊗ (V⁻¹·S_{i2}·W) ⊗ (W⁻¹·S_{i3}·U).

SLIDE 41

Minimality

Isotropy group and Gröbner basis: multiplications

5 multiplications for A·A⊺? 12 unknowns: U = [u11 u12; u21 u22], V = [v11 v12; v21 v22], W = [w11 w12; w21 w22].

31 equations of degree 4:
- 3 × 4 equations: recursive calls p_i = X·X⊺
- 2 × 2 × 4 equations: two symmetries p_j = X·Y⊺ = (Y·X⊺)⊺ = p_k⊺
- 3 equations: unimodularity of U, V, W

SLIDE 42

Minimality

Isotropy group and Gröbner basis: multiplications (cont.)

5 multiplications for A·A⊺? Gröbner basis via FGb:

With z = v11 − v21 (so v11 = z + v21), the basis expresses v22 and v12 rationally in z and v21, and
u11 = −((z + v21)² + v21²)·(w21 + w22)
u21 = −((z + v21)² + v21²)·(w11 + w12)
u12 = −((z + v21)² + v21²)·w22
u22 = ((z + v21)² + v21²)·w12
1 = w11·w22 − w12·w21 = ((z + v21)² + v21²)² + 1 (*)

Specializing z² = −1, v11 = 1, v21 = 0, w11 = 1, w12 = 0, w21 = 0, then scaling by 1/z ➡ our algorithm.

(*) 0 = ((z + v21)² + v21²)² + 1 imposes a square root of −1.

SLIDE 45

Conclusion

Perspective

Minimality-of-multiplications conjectures: reduce the polynomial system?
✓ 5 multiplications are required
✓ except: 4 multiplications for (p = 2) & (2×2) & (commutative)
Conjectures verified for the isotropies of Strassen-Winograd ...

[Schwartz et al. 2017, 2019] for A·A⊺? O(n²·log(n)) non-matrix-based pre-computations ➡ only 12 additions; from ≥ 5 recursive levels on, fewer operations than Strassen-Winograd for A·B. Probably for matrices larger than 64k ...

Accelerating LDL⊺ in practice for non-generic rank profiles?

SLIDE 51

Skew orthogonal matrices

Sum of squares in finite fields

Require: p ∈ P\{2} and k ∈ Z, a quadratic non-residue mod p.
Ensure: (a, b) ∈ Z², such that a² + b² ≡ k mod p.
1. Find s such that s is not a square and s − 1 is a square ➡ randomize (or the lowest quadratic non-residue is O(log²(p)) under ERH).
2. Find c such that c² ≡ s − 1 mod p ➡ c = ModSquareRoot(s − 1, p).
3. Let r ≡ k·s⁻¹ mod p.
4. Find a such that a² ≡ r mod p (a product of two non-squares is a square, so r is a square).
➡ Now k ≡ r·s ≡ a²·s ≡ a²·(1 + (s − 1)) ≡ a²·(1 + c²) mod p, i.e. k ≡ a² + (a·c)².

Dumas-Pernet-Sedoglavic On fast multiplication of a matrix by its transpose Séminaire CAS3C3 12 / 30
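A sketch of this method, restricted to primes p ≡ 3 (mod 4) so that modular square roots are a single exponentiation (a restriction of this sketch, not of the algorithm), and taking s as the smallest non-residue, whose predecessor is then automatically a residue:

```python
def sum_of_squares(k, p):
    """Write a quadratic non-residue k as a² + b² mod p (p prime, p ≡ 3 mod 4)."""
    def is_square(x):
        return pow(x, (p - 1) // 2, p) == 1       # Euler's criterion
    def sqrt_mod(x):                              # x a residue; valid since p ≡ 3 (mod 4)
        return pow(x, (p + 1) // 4, p)
    s = 2
    while is_square(s):                           # smallest non-residue: s - 1 is a residue
        s += 1
    c = sqrt_mod(s - 1)                           # c² ≡ s - 1
    r = k * pow(s, p - 2, p) % p                  # r ≡ k·s⁻¹, a residue (two non-residues)
    a = sqrt_mod(r)                               # a² ≡ r
    b = a * c % p                                 # k ≡ r·s ≡ a²·(1 + c²) ≡ a² + (a·c)²
    return a, b

p = 131071                                        # 2¹⁷ - 1, with p ≡ 3 (mod 4)
a, b = sum_of_squares(p - 1, p)                   # k = -1: a skew orthogonal pair
assert (a * a + b * b) % p == p - 1
```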

SLIDE 53

Complexity bounds for block algorithms

2M method over C

Usually, complex numbers are implemented as pairs of floats.

(U + iV)·(W + iX) takes 4 floating-point multiplications; 3M (Karatsuba) takes 3: U·W − V·X + i·((U + V)·(W + X) − U·W − V·X).

➡ 2M method: A·A⊺ = (U + iV)·(U⊺ + iV⊺)
1. G = (U + V)·(U⊺ − V⊺)
2. H = U·V⊺
3. Re = G − H⊺ + H
4. Im = H + H⊺
Then Re + i·Im = (U·U⊺ − V·V⊺) + i·(U·V⊺ + V·U⊺) = (U + iV)·(U⊺ + iV⊺).
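The four steps above can be sketched directly with NumPy (note this computes A·A⊺, not the conjugate-transpose product):

```python
import numpy as np

def syrk_2m(A):
    """2M method: A·Aᵀ for complex A = U + iV in two real matrix products."""
    U, V = A.real, A.imag
    G = (U + V) @ (U - V).T   # U·Uᵀ - U·Vᵀ + V·Uᵀ - V·Vᵀ
    H = U @ V.T
    Re = G - H.T + H          # U·Uᵀ - V·Vᵀ
    Im = H + H.T              # U·Vᵀ + V·Uᵀ
    return Re + 1j * Im
```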

SLIDE 56

Minimality

Brent equations for Strassen-like algorithms for 2×2 matrices

Cij = Σℓ Aiℓ·Bℓj: 4 equations, 16 monomials (1)
pk = (Σij αijk·Aij)·(Σij βijk·Bij): (4 + 4) × 7 unknowns (2)
Cij = Σ_{k=1}^{7} γijk·pk: 4 × 7 unknowns (3)

Substitute (2) in (3) and equate with (1): 84 unknowns in 64 equations of degree 3.
[Strassen 1969]: ∃ a solution.

A·A⊺ in 5 multiplications? ➡ still 55 unknowns in 48 equations of degree 3.

SLIDE 58

Minimality

Isotropy group and Gröbner basis: additions

9 additions for A·A⊺? Adapting [Bshouty 1995], we prove minimality of 9 additions.
1. As in the generic case, 4 is minimal for the pre-additions on each of A and B ➡ from 15 to 11.
2. 7 post-additions to compute C11, C12, C21, C22 is minimal: this is (4,1,1,1), (3,2,1,1) or (2,2,2,1) additions. Symmetry of the result: the best possible saving from C21 = C12⊺ is 2, in the (2,2,2,1) case ➡ from 11 to 9. Symmetry of the diagonal blocks C11 = C11⊺ & C22 = C22⊺ gives at most ((2+1)/2)·n² further savings.

(Alternatively, adding to the Gröbner basis the equations that each Sij ∈ { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }.)

SLIDE 59

Application to LDL⊺

Outline

1. Strassen-Winograd fast multiplication algorithm
2. Fast matrix product by its transpose
3. Skew orthogonal matrices
4. Complexity bounds for block algorithms
5. Space and time efficient implementation
6. Minimality
7. Application to LDL⊺

SLIDE 67

Application to LDL⊺

SYRDK: A·D·A⊺

SYRDK is a main source of speed in LDL⊺ factorizations; D is block diagonal with diagonal/anti-diagonal/anti-triangular blocks.

To compute A·D·A⊺ faster over finite fields:
1. D → ∆·∆⊺
2. Ã ← A·∆, once and for all
3. Ã·Ã⊺

➡ How to factor D into ∆·∆⊺?

[α 0; 0 β]: squares ➡ ok; otherwise, sums of squares ➡ [a b; c d]·[a c; b d] = [a²+b² ac+bd; ac+bd c²+d²] = [α 0; 0 β]

[0 γ; γ 0]: [1 1; 1 −1]·[γ/2 0; 0 −γ/2]·[1 1; 1 −1]⊺ = [0 γ; γ 0], reducing to the diagonal case; or, mod 2: ([1 0; 0 γ]·[1 0 1; 0 1 1])·([1 0; 0 γ]·[1 0 1; 0 1 1])⊺ = [0 γ; γ 0]

Char. 2, [0 γ; γ β]: [0 γ; γ β] = [γ/√β γ/√β; √β 0]·[γ/√β γ/√β; √β 0]⊺ = ([γ/√β 0; 0 √β]·[1 1; 1 0])·([γ/√β 0; 0 √β]·[1 1; 1 0])⊺.