

SLIDE 1

Matrix-Vector Multiplication in Sub-Quadratic Time (Some Preprocessing Required)

Ryan Williams, Carnegie Mellon University

SLIDE 5

Introduction

Matrix-Vector Multiplication: a fundamental operation in scientific computing.

How fast can n × n matrix-vector multiplication be?

Θ(n^2) steps just to read the matrix!

Main Result: If we allow O(n^(2+ε)) preprocessing, then matrix-vector multiplication over any finite semiring can be done in O(n^2/(ε log n)^2) time.

SLIDE 9

Better Algorithms for Matrix Multiplication

Three of the major developments:

  • Arlazarov et al., a.k.a. “Four Russians” (1960s): O(n^3/log n) operations
    Uses table lookups. Good for hardware with short vector operations as primitives.

  • Strassen (1969): n^(log 7/log 2) = O(n^2.81) operations
    Asymptotically fast, but overhead in the big-O. Experiments in practice are inconclusive about Strassen vs. Four Russians for Boolean matrix multiplication (Bard, 2006).

  • Coppersmith and Winograd (1990): O(n^2.376) operations
    Not yet practical.
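The “table lookups” behind Four Russians can be made concrete with a short sketch (illustrative Python, not from the talk; matrix rows are encoded as Python ints used as bit vectors, and k would be chosen near log2 n so each table has about n entries):

```python
def four_russians_boolmm(A, B, n, k):
    """Boolean product C = A x B; A, B are lists of n ints (row bitmasks)."""
    groups = [(s, min(s + k, n)) for s in range(0, n, k)]
    # For each group of k rows of B, precompute the OR of every subset of them.
    tables = []
    for (s, e) in groups:
        t = [0] * (1 << (e - s))
        for p in range(1, 1 << (e - s)):
            low = p & (-p)                       # lowest set bit of the pattern
            t[p] = t[p ^ low] | B[s + low.bit_length() - 1]
        tables.append(t)
    C = []
    for i in range(n):
        row = 0
        for g, (s, e) in enumerate(groups):
            pattern = (A[i] >> s) & ((1 << (e - s)) - 1)   # bits s..e-1 of row i
            row |= tables[g][pattern]                      # one table lookup
        C.append(row)
    return C
```

With k ≈ log2 n there are n/k tables of 2^k ≈ n entries of n bits each, which is where the O(n^3/log n) bit-operation count comes from.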

SLIDE 12

Focus: Combinatorial Matrix Multiplication Algorithms

  • Also called non-algebraic; let’s call them non-subtractive
    E.g. Four Russians is combinatorial; Strassen isn’t.

More Non-Subtractive Boolean Matrix Multiplication Algorithms:

  • Atkinson and Santoro: O(n^3/log^(3/2) n) on a (log n)-word RAM
  • Rytter and Basch–Khanna–Motwani: O(n^3/log^2 n) on a RAM
  • Chan: Four Russians can be implemented in O(n^3/log^2 n) on a pointer machine

SLIDE 17

Main Result

The O(n^3/log^2 n) matrix multiplication algorithm can be “de-amortized”. More precisely, we can preprocess an n × n matrix A over a finite semiring in O(n^(2+ε)) time, such that vector multiplications with A can be done in O(n^2/(ε log n)^2) time.

This allows “non-subtractive” matrix multiplication to be done on-line, and can be implemented on a pointer machine.

This Talk: The Boolean case.

SLIDE 18

Preprocessing Phase: The Boolean Case

Partition the input matrix A into blocks of size ⌈ε log n⌉ × ⌈ε log n⌉:

[Figure: A drawn as an (n/(ε log n)) × (n/(ε log n)) grid of blocks, from A_{1,1} at the top-left to A_{n/(ε log n), n/(ε log n)} at the bottom-right; each block A_{i,j} is ε log n × ε log n.]
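As a concrete (hypothetical) illustration of this partition step, a Python sketch that cuts a 0/1 matrix into ⌈ε log n⌉-sized blocks, padding the boundary blocks with zeros:

```python
import math

def partition_blocks(A, n, eps):
    """Partition an n x n 0/1 matrix A (list of lists) into b x b blocks,
    b = ceil(eps * log2 n); boundary blocks are padded with zeros."""
    b = max(1, math.ceil(eps * math.log2(n)))
    m = math.ceil(n / b)  # number of block rows / block columns
    blocks = [[[[A[i*b + r][j*b + c] if i*b + r < n and j*b + c < n else 0
                 for c in range(b)] for r in range(b)]
                for j in range(m)] for i in range(m)]
    return blocks, b, m
```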

SLIDE 19

Preprocessing Phase: The Boolean Case

Build a graph G with parts P_1, ..., P_{n/(ε log n)} and Q_1, ..., Q_{n/(ε log n)}.

[Figure: the parts P_1, P_2, ..., P_{n/(ε log n)} drawn in a row above the parts Q_1, Q_2, ..., Q_{n/(ε log n)}, each part containing 2^(ε log n) vertices.]

Each part has 2^(ε log n) = n^ε vertices, one for each possible (ε log n)-bit vector.

SLIDE 21

Preprocessing Phase: The Boolean Case

Edges of G: Each vertex v in each Pi has exactly one edge into each Qj

2ε log n

Pi

2ε log n Qj v Aj,iv

Time to build the graph:

n ε log n · n ε log n · 2ε log n · (ε log n)2 = O(n2+ε)

number

  • f Qj

number

  • f Pi

number

  • f nodes

in Pi matrix-vector mult

  • f Aj,i and v

7-a
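A minimal Python sketch of this preprocessing, assuming A is a 0/1 list-of-lists; instead of an explicit graph, the edge “v in P_i → A_{j,i} · v in Q_j” is stored as a lookup table per block pair (all names are illustrative, not from the paper):

```python
import math

def preprocess(A, n, eps):
    """For every block pair (j, i), build a table mapping each of the 2^b
    possible b-bit chunks v to the b-bit Boolean product A_{j,i} * v."""
    b = max(1, math.ceil(eps * math.log2(n)))
    m = math.ceil(n / b)

    def block_rows(j, i):
        # Rows of block A_{j,i} as b-bit column masks, zero-padded at edges.
        return [sum(1 << c for c in range(b)
                    if j*b + r < n and i*b + c < n and A[j*b + r][i*b + c])
                for r in range(b)]

    tables = {}
    for j in range(m):
        for i in range(m):
            rows = block_rows(j, i)
            tab = [0] * (1 << b)
            for v in range(1 << b):          # every possible chunk value
                out = 0
                for r in range(b):
                    if rows[r] & v:          # row r of A_{j,i} hits a 1 of v
                        out |= 1 << r
                tab[v] = out
            tables[(j, i)] = tab
    return tables, b, m
```

The loop structure mirrors the count on the slide: m^2 block pairs × 2^b chunk values × work per small product (here O(b) thanks to word-level ANDs, versus the slide’s naive (ε log n)^2 bit count), giving O(n^(2+ε)) for b = ⌈ε log n⌉.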

SLIDE 23

How to Do Fast Vector Multiplications

Let v be a column vector. Want: A · v.

(1) Break up v into chunks of size ε log n: v = (v_1, v_2, ..., v_{n/(ε log n)})^T.

SLIDE 25

How to Do Fast Vector Multiplications

(2) For each i = 1, ..., n/(ε log n), look up v_i in P_i.

[Figure: the chunks v_1, v_2, ..., v_{n/(ε log n)} located as vertices of the parts P_1, P_2, ..., P_{n/(ε log n)}.]

Takes Õ(n) time.

SLIDE 30

How to Do Fast Vector Multiplications

(3) Look up the neighbors of vi, mark each neighbor found.

2ε log n

P2 . . . . . . . . . . . . P

n ε log n

P1

2ε log n 2ε log n 2ε log n 2ε log n 2ε log n

Q1 Q2 Q

n ε log n

v1 v2 vn/(ε log n) A1,

n ε log n · vn/(ε log n)

A2,

n ε log n · vn/(ε log n)

A

n ε log n, n ε log n · vn/(ε log n)

Takes O

  • n

ε log n

2

13

SLIDE 31

How to Do Fast Vector Multiplications

(4) For each Q_j, define v′_j as the OR of all marked vectors in Q_j.

[Figure: in each Q_j, the marked vectors are ORed together, producing v′_1, v′_2, ..., v′_{n/(ε log n)}.]

Takes Õ(n^(1+ε)) time.

SLIDE 36

How to Do Fast Vector Multiplications

(5) Output v′ := (v′_1, v′_2, ..., v′_{n/(ε log n)})^T.

Claim: v′ = A · v.

Proof: By definition, v′_j = ⋁_{i=1}^{n/(ε log n)} A_{j,i} · v_i. Writing A as the block matrix (A_{j,i}) and v as (v_1, ..., v_{n/(ε log n)})^T,

  A · v = ( ⋁_{i=1}^{n/(ε log n)} A_{1,i} · v_i, ..., ⋁_{i=1}^{n/(ε log n)} A_{n/(ε log n),i} · v_i )^T = v′.
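Putting steps (1)–(5) together, a self-contained Python sketch of the Boolean case (the table build at the top repeats the preprocessing phase so the example runs on its own; all names are illustrative, not from the paper):

```python
import math

def matvec_with_tables(A, v, n, eps):
    """Answer A * v (Boolean) via table lookups, as in steps (1)-(5)."""
    b = max(1, math.ceil(eps * math.log2(n)))
    m = math.ceil(n / b)
    # Preprocessing: tables[(j, i)][chunk] = A_{j,i} * chunk (b-bit masks).
    def block_rows(j, i):
        return [sum(1 << c for c in range(b)
                    if j*b + r < n and i*b + c < n and A[j*b + r][i*b + c])
                for r in range(b)]
    tables = {}
    for j in range(m):
        for i in range(m):
            rows = block_rows(j, i)
            tables[(j, i)] = [sum(1 << r for r in range(b) if rows[r] & chunk)
                              for chunk in range(1 << b)]
    # Step 1: break v into b-sized chunks (bit c of chunk i = entry i*b + c).
    chunks = [sum(1 << c for c in range(b) if i*b + c < n and v[i*b + c])
              for i in range(m)]
    # Steps 2-4: look up each chunk's "neighbor" in every Q_j, OR the marks.
    vprime = [0] * m
    for i, ch in enumerate(chunks):
        for j in range(m):
            vprime[j] |= tables[(j, i)][ch]
    # Step 5: concatenate the v'_j chunks into the output vector.
    return [1 if (vprime[k // b] >> (k % b)) & 1 else 0 for k in range(n)]
```

A quick sanity check against the direct O(n^2) product confirms the claim v′ = A · v on small instances.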

SLIDE 39

Some Applications

Can quickly compute the neighbors of arbitrary vertex subsets. Let A be the adjacency matrix of G = (V, E), and let v_S be the indicator vector for a set S ⊆ V.

Proposition: A · v_S is the indicator vector for N(S), the neighborhood of S.

Corollary: After O(n^(2+ε)) preprocessing, can determine the neighborhood of any vertex subset in O(n^2/(ε log n)^2) time. (One level of BFS in o(n^2) time.)
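The proposition is a one-liner in code (plain Python sketch; here the product is computed naively in O(n^2), the talk’s point being that after preprocessing it drops to O(n^2/(ε log n)^2)):

```python
def neighborhood(adj, S):
    """N(S) for a graph given by a 0/1 adjacency matrix `adj`,
    computed as the Boolean product adj * v_S."""
    n = len(adj)
    vS = [1 if u in S else 0 for u in range(n)]
    return {u for u in range(n)
            if any(adj[u][w] and vS[w] for w in range(n))}
```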

SLIDE 45

Graph Queries

Corollary: After O(n^(2+ε)) preprocessing, can determine if a given vertex subset is an independent set, a vertex cover, or a dominating set, all in O(n^2/(ε log n)^2) time.

Proof: Let S ⊆ V.

  S is dominating ⟺ S ∪ N(S) = V.
  S is independent ⟺ S ∩ N(S) = ∅.
  S is a vertex cover ⟺ V − S is independent.

Each can be quickly determined from knowing S and N(S).
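A sketch of the three checks (illustrative Python for a simple undirected graph; N(S) is computed naively here, standing in for the fast preprocessed product):

```python
def classify_subset(adj, S):
    """Return (dominating, independent, vertex_cover) for S, using only
    S and neighborhoods, as in the proof. `adj` is a 0/1 matrix."""
    n = len(adj)
    def nbrs(T):
        return {u for u in range(n) if any(adj[u][w] for w in T)}
    NS = nbrs(S)
    dominating = (S | NS) == set(range(n))      # S union N(S) = V
    independent = not (S & NS)                  # S disjoint from N(S)
    comp = set(range(n)) - S
    vertex_cover = not (comp & nbrs(comp))      # V - S is independent
    return dominating, independent, vertex_cover
```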

SLIDE 50

Triangle Detection

Problem: Triangle Detection
Given: Graph G and vertex i.
Question: Does i participate in a 3-cycle, a.k.a. triangle?

Worst Case: Can take Θ(n^2) time to check all pairs of neighbors of i.

Corollary: After O(n^(2+ε)) preprocessing on G, can solve triangle detection for arbitrary vertices in O(n^2/(ε log n)^2) time.

Proof: Given vertex i, let S be its set of neighbors (obtained in O(n) time). S is not independent ⟺ i participates in a triangle.
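The reduction in the proof is short (illustrative Python; `adj` is a 0/1 adjacency matrix, and the independence test stands in for the fast preprocessed query):

```python
def has_triangle_through(adj, i):
    """Does vertex i lie on a triangle? Take S = N(i); i is on a triangle
    iff S is not independent (two neighbors of i are adjacent)."""
    n = len(adj)
    S = [w for w in range(n) if adj[i][w]]
    # S not independent  <=>  some pair in S is joined by an edge
    return any(adj[u][w] for u in S for w in S if u != w)
```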

SLIDE 54

Conclusion

A preprocessing/multiplication algorithm for matrix-vector multiplication that builds on lookup-table techniques.

  • Is there a preprocessing/multiplication algorithm for sparse matrices? Can we do multiplication in e.g. O(m/poly(log n) + n) time, where m = number of nonzeroes?

  • Can the algebraic matrix multiplication algorithms (Strassen, etc.) be applied to this problem?

  • Can our ideas be extended to achieve non-subtractive Boolean matrix multiplication in o(n^3/log^2 n)?

SLIDE 55

Thank you!