
Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required) - PowerPoint PPT Presentation

Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required). Ryan Williams, Carnegie Mellon University. Matrix-vector multiplication is a fundamental operation in scientific computing.


  1. Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required). Ryan Williams, Carnegie Mellon University.

  2. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing.

  3. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be?

  4. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be? Θ(n^2) steps just to read the matrix!

  5. Introduction. Matrix-Vector Multiplication: Fundamental Operation in Scientific Computing. How fast can n × n matrix-vector multiplication be? Θ(n^2) steps just to read the matrix! Main Result: If we allow O(n^{2+ε}) preprocessing, then matrix-vector multiplication over any finite semiring can be done in O(n^2 / (ε log n)^2) time.

  6. Better Algorithms for Matrix Multiplication. Three of the major developments:

  7. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives.

  8. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives. • Strassen (1969): n^{log 7 / log 2} = O(n^{2.81}) operations. Asymptotically fast, but overhead in the big-O. Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006).

  9. Better Algorithms for Matrix Multiplication. Three of the major developments: • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3 / log n) operations. Uses table lookups. Good for hardware with short vector operations as primitives. • Strassen (1969): n^{log 7 / log 2} = O(n^{2.81}) operations. Asymptotically fast, but overhead in the big-O. Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006). • Coppersmith and Winograd (1990): O(n^{2.376}) operations. Not yet practical.

  10. Focus: Combinatorial Matrix Multiplication Algorithms

  11. Focus: Combinatorial Matrix Multiplication Algorithms. • Also called non-algebraic; let’s call them non-subtractive. E.g., Four Russians is combinatorial, Strassen isn’t.

  12. Focus: Combinatorial Matrix Multiplication Algorithms. • Also called non-algebraic; let’s call them non-subtractive. E.g., Four Russians is combinatorial, Strassen isn’t. More Non-Subtractive Boolean Matrix Mult. Algorithms: • Atkinson and Santoro: O(n^3 / log^{3/2} n) on a (log n)-word RAM. • Rytter and Basch-Khanna-Motwani: O(n^3 / log^2 n) on a RAM. • Chan: Four Russians can be implemented in O(n^3 / log^2 n) time on a pointer machine.

  13. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”.

  14. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: Preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time.

  15. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: Preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line.

  16. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: Preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line. Can be implemented on a pointer machine.

  17. Main Result. The O(n^3 / log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can: Preprocess an n × n matrix A over a finite semiring in O(n^{2+ε}) time, such that vector multiplications with A can be done in O(n^2 / (ε log n)^2) time. Allows “non-subtractive” matrix multiplication to be done on-line. Can be implemented on a pointer machine. This Talk: The Boolean case.
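
To give a rough feel for what these bounds mean, here is a short worked example (my own illustration, not from the talk): plugging a concrete n and ε into the quantities above gives the block side length b = ⌈ε log2 n⌉, the query cost ~ n^2 / b^2, and the preprocessing cost ~ (n/b)^2 · 2^b · b^2 = O(n^{2+ε}).

```python
import math

# Hypothetical concrete parameters, chosen only for illustration.
n, eps = 1 << 16, 0.5                      # n = 65536, epsilon = 1/2
b = math.ceil(eps * math.log2(n))          # chunk / block side length, here 8
k = math.ceil(n / b)                       # number of block rows = block columns

print(f"b = {b}, blocks per side = {k}")
print(f"naive multiply      ~ n^2                 = {n**2:.2e}")
print(f"query after preproc ~ n^2 / b^2           = {n**2 / b**2:.2e}")
print(f"preprocessing       ~ (n/b)^2 * 2^b * b^2 = {k**2 * 2**b * b**2:.2e}")
```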

  18. Preprocessing Phase: The Boolean Case. Partition the input matrix A into blocks of size ⌈ε log n⌉ × ⌈ε log n⌉, viewing A as an (n / (ε log n)) × (n / (ε log n)) array of blocks A_{i,j}. [Figure: A drawn as the grid of blocks A_{1,1}, A_{1,2}, ..., A_{n/(ε log n), n/(ε log n)}.]
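
A minimal Python sketch of this partition step (the function name and the zero-padding convention are my own, not the talk's): pad the 0/1 matrix so its side is a multiple of b = ⌈ε log n⌉ and cut it into b × b blocks, with blocks[j][i] playing the role of A_{j,i} (block row j, block column i).

```python
def partition_into_blocks(A, b):
    """Return blocks[j][i] = the b x b block in block-row j, block-column i of A."""
    n = len(A)
    m = -(-n // b) * b                                 # n rounded up to a multiple of b
    padded = [row + [0] * (m - n) for row in A]
    padded += [[0] * m for _ in range(m - n)]          # zero rows at the bottom
    k = m // b                                         # number of block rows = columns
    return [[[padded[j * b + r][i * b:(i + 1) * b] for r in range(b)]
             for i in range(k)]
            for j in range(k)]
```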

  19. Preprocessing Phase: The Boolean Case. Build a graph G with parts P_1, ..., P_{n/(ε log n)} and Q_1, ..., Q_{n/(ε log n)}. Each part has 2^{ε log n} vertices, one for each possible Boolean vector of length ε log n. [Figure: the n/(ε log n) parts P_i on the left and Q_j on the right, each of size 2^{ε log n}.]

  20. Preprocessing Phase: The Boolean Case. Edges of G: each vertex v in each P_i has exactly one edge into each Q_j, namely to the vertex A_{j,i} · v. [Figure: the edge from v in P_i to A_{j,i} · v in Q_j.]

  21. Preprocessing Phase: The Boolean Case. Edges of G: each vertex v in each P_i has exactly one edge into each Q_j, namely to the vertex A_{j,i} · v. [Figure: the edge from v in P_i to A_{j,i} · v in Q_j.] Time to build the graph: (n / (ε log n)) · (n / (ε log n)) · 2^{ε log n} · (ε log n)^2 = O(n^{2+ε}); the factors are the number of parts Q_j, the number of parts P_i, the number of nodes in each P_i, and the cost of one matrix-vector multiplication of A_{j,i} and v.
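
The graph can be stored as a lookup table, which is how the sketch below (my own rendering of the slides' construction) realizes it: table[i][j][u] is the unique neighbor in Q_j of vertex u of P_i, i.e. the Boolean product A_{j,i} · u, with each length-b 0/1 chunk encoded as a b-bit integer (bit r holds coordinate r). The triple loop performs exactly the (n/(ε log n))^2 · 2^{ε log n} block products counted on this slide.

```python
def preprocess(blocks, b):
    """table[i][j][u]: code of A_{j,i} * u for every b-bit chunk code u."""
    k = len(blocks)
    table = [[[0] * (1 << b) for _ in range(k)] for _ in range(k)]
    for j in range(k):                     # part Q_j (block row)
        for i in range(k):                 # part P_i (block column)
            block = blocks[j][i]
            for u in range(1 << b):        # every possible chunk value in P_i
                out = 0
                for r in range(b):         # Boolean product: OR of AND terms
                    if any(block[r][c] and (u >> c) & 1 for c in range(b)):
                        out |= 1 << r
                table[i][j][u] = out
    return table
```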

  22. How to Do Fast Vector Multiplications. Let v be a column vector. Want: A · v.

  23. How to Do Fast Vector Multiplications. Let v be a column vector. Want: A · v. (1) Break up v into (ε log n)-sized chunks: v = (v_1, v_2, ..., v_{n/(ε log n)})^T.
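
Step (1) in the same encoding (a sketch; the padding and the bit order are my choices): split v into chunks of length b and encode each chunk as a b-bit integer. The resulting code is exactly the index of v_i's vertex inside P_i, so the lookups of step (2) on the next slides become constant-time array indexing.

```python
def encode_chunks(v, b):
    """Split the 0/1 vector v into length-b chunks, each encoded as a b-bit integer."""
    n = len(v)
    m = -(-n // b) * b                     # pad v with zeros up to a multiple of b
    v = v + [0] * (m - n)
    return [sum(v[i * b + c] << c for c in range(b)) for i in range(m // b)]
```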

  24. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i.

  25. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i. [Figure: each chunk v_i located at its vertex in the part P_i.] Takes Õ(n) time.

  26. How to Do Fast Vector Multiplications. (2) For each i = 1, ..., n/(ε log n), look up v_i in P_i. [Figure: each chunk v_i located at its vertex in the part P_i.] Takes Õ(n) time.

  27. How to Do Fast Vector Multiplications. (3) Look up the neighbors of each v_i, marking each neighbor found.

  28. How to Do Fast Vector Multiplications. (3) Look up the neighbors of each v_i, marking each neighbor found. [Figure: the neighbors of v_1 are marked: A_{1,1} · v_1 in Q_1, A_{2,1} · v_1 in Q_2, ..., A_{n/(ε log n),1} · v_1 in Q_{n/(ε log n)}.]

  29. How to Do Fast Vector Multiplications. (3) Look up the neighbors of each v_i, marking each neighbor found. [Figure: the neighbors of v_2 are marked: A_{1,2} · v_2 in Q_1, A_{2,2} · v_2 in Q_2, ..., A_{n/(ε log n),2} · v_2 in Q_{n/(ε log n)}.]

  30. How to Do Fast Vector Multiplications. (3) Look up the neighbors of each v_i, marking each neighbor found. [Figure: the neighbors of v_{n/(ε log n)} are marked: A_{1,n/(ε log n)} · v_{n/(ε log n)} in Q_1, ..., A_{n/(ε log n),n/(ε log n)} · v_{n/(ε log n)} in Q_{n/(ε log n)}.] Takes O((n / (ε log n))^2) time.
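
A sketch of step (3) using the table from the preprocessing sketch above (the helper names are mine): for each chunk code chunks[i], follow its unique edge into every Q_j and record the neighbor. These (n/(ε log n))^2 constant-time lookups are the dominant cost of a query, matching the bound on this slide.

```python
def mark_neighbors(table, chunks):
    """marked[j] = set of codes marked in part Q_j, i.e. {A_{j,i} * v_i : all i}."""
    k = len(chunks)
    marked = [set() for _ in range(k)]
    for i in range(k):                     # chunk v_i, vertex chunks[i] of P_i
        for j in range(k):                 # its one neighbor in each Q_j
            marked[j].add(table[i][j][chunks[i]])
    return marked
```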

  31. How to Do Fast Vector Multiplications. (4) For each Q_j, define v′_j as the OR of all marked vectors in Q_j. [Figure: within each Q_j, the marked vectors are OR-ed together into v′_j.] Takes Õ(n^{1+ε}) time.

  32. How to Do Fast Vector Multiplications. (4) For each Q_j, define v′_j as the OR of all marked vectors in Q_j. [Figure: within each Q_j, the marked vectors are OR-ed together into v′_j.] Takes Õ(n^{1+ε}) time.
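
Step (4) in the same sketch: OR together the marked codes inside each Q_j. Because a code is a b-bit integer, one bitwise OR is the coordinatewise OR of the corresponding length-b vectors, and each part contains at most 2^{ε log n} = n^ε vertices, which is where the Õ(n^{1+ε}) bound on this step comes from.

```python
def combine_marked(marked):
    """v'_j = bitwise OR of all codes marked in Q_j (coordinatewise OR of the vectors)."""
    result = []
    for codes in marked:                   # one entry per part Q_j
        acc = 0
        for code in codes:
            acc |= code
        result.append(acc)
    return result
```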

  33. How to Do Fast Vector Multiplications. (5) Output v′ := (v′_1, v′_2, ..., v′_{n/(ε log n)})^T. Claim: v′ = A · v.

  34. How to Do Fast Vector Multiplications. (5) Output v′ := (v′_1, v′_2, ..., v′_{n/(ε log n)})^T. Claim: v′ = A · v. Proof: By definition of the edges, v′_j = Σ_{i=1}^{n/(ε log n)} A_{j,i} · v_i (an OR of Boolean products), which is exactly the j-th block of A · v.
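
Putting the sketches together (assuming the hypothetical helpers partition_into_blocks, preprocess, encode_chunks, mark_neighbors, and combine_marked defined after the earlier slides are in scope), the check below decodes v′ and compares it with the naive Boolean product on a small random instance. It is a sanity check of the sketches, not part of the talk.

```python
import random

def multiply(table, b, v):
    """A * v via the precomputed table: chunk, look up, OR, then decode the bits."""
    chunks = encode_chunks(v, b)
    out = combine_marked(mark_neighbors(table, chunks))
    bits = [(out[j] >> r) & 1 for j in range(len(out)) for r in range(b)]
    return bits[:len(v)]

n, b = 12, 3                               # tiny sizes chosen only for the test
A = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
v = [random.randint(0, 1) for _ in range(n)]
table = preprocess(partition_into_blocks(A, b), b)
naive = [int(any(A[r][c] and v[c] for c in range(n))) for r in range(n)]
assert multiply(table, b, v) == naive
```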
