
The Input/Output Complexity of Sparse Matrix Multiplication - PowerPoint PPT Presentation



  1. The Input/Output Complexity of Sparse Matrix Multiplication. Rasmus Pagh¹, Morten Stöckel². ¹IT University of Copenhagen, ²University of Copenhagen. SIAM LA, October 26, 2015.

  2. Outline
     - Sparse matrix multiplication: problem description
     - Upper bound: size estimation, partitioning, outputting from partitions, summary
     - Lower bound: technique used, bounding the number of phases

  3. Sparse matrix multiplication / Problem description: Overview
     - Let A and C be matrices over a semiring R with N nonzero entries in total.
     - The problem: compute the matrix product $[AC]_{i,j} = \sum_k A_{i,k} C_{k,j}$, which has Z nonzero entries.
     - Central result: this can be done (for most of the parameter space) in an optimal $\tilde{O}\!\left(\frac{N\sqrt{Z}}{B\sqrt{M}}\right)$ I/Os.

  4. Sparse matrix multiplication / Problem description: Cancellation of elementary products
     [Figure: A is n × p, C is p × q, and AC = A × C is n × q; entry $ac_{22}$ is formed as the sum $a_{21} c_{12} + a_{22} c_{22} + \dots + a_{2p} c_{p2}$.]
     We say that we have cancellation when two or more summands of $[AC]_{i,j} = \sum_k A_{i,k} C_{k,j}$ are nonzero but the sum is zero. Our algorithm handles such cases.
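
     A minimal numeric sketch of cancellation (the matrices below are made up for illustration): over the integers, both elementary products contributing to entry (0, 0) are nonzero, yet they cancel, so that entry does not count toward Z.

```python
import numpy as np

# Hypothetical 2x2 example over the semiring (Z, +, *):
# both elementary products for entry (0, 0) are nonzero, but they cancel.
A = np.array([[1, -1],
              [0,  2]])
C = np.array([[3, 0],
              [3, 4]])

AC = A @ C
elementary = [A[0, k] * C[k, 0] for k in range(2)]   # [3, -3], both nonzero
print("elementary products for (0,0):", elementary)
print("AC =\n", AC)                                   # entry (0,0) is 0
print("Z = nnz(AC) =", np.count_nonzero(AC))          # 3, not 4
```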

  5. Sparse matrix multiplication / Problem description: Motivation
     Lots of applications, among them:
     - Computing determinants and inverses of matrices.
     - Bioinformatics.
     - Graphs: counting cycles, computing matchings.

  6. Sparse matrix multiplication / Problem description: The semiring I/O model, 1
     - A word is big enough to hold a matrix element plus its coordinates.
     - Internal memory holds M words; the disk is of infinite size.
     - One I/O: transfer B words from disk to internal memory.
     - Cost of an algorithm: the number of I/Os used.
     - Operations allowed: semiring operations, copy, and equality check.

  7. Sparse matrix multiplication / Problem description: The semiring I/O model, 2
     - We make no assumptions about cancellation.
     - To produce output, the algorithm must invoke emit(·) once on every nonzero output entry.
     - Matrices are of size U × U.
     - $\tilde{O}$ suppresses polylog factors in U and N.

  8. Sparse matrix multiplication / Problem description: Our results, 1
     - Let A and C be U × U matrices over a semiring R with N nonzero input entries and Z nonzero output entries. There exist algorithms 1 and 2 such that:
       1. Algorithm 1 emits the set of nonzero entries of AC with probability at least $1 - 1/U$, using $\tilde{O}\!\left(\frac{N\sqrt{Z}}{B\sqrt{M}}\right)$ I/Os.
       2. Algorithm 2 emits the set of nonzero entries of AC, using $O\!\left(\frac{N^2}{MB}\right)$ I/Os.
     - Previous best [Amossen-Pagh '09]: $\tilde{O}\!\left(\frac{N\sqrt{Z}}{B M^{1/8}}\right)$ I/Os (Boolean matrices, hence no cancellation).

  9. Sparse matrix multiplication / Problem description: Our results, 2
     - Let A and C be U × U matrices over a semiring R with N nonzero input entries and Z nonzero output entries. There exist algorithms 1 and 2 such that:
       1. Algorithm 1 emits the set of nonzero entries of AC with probability at least $1 - 1/U$, using $\tilde{O}\!\left(\frac{N\sqrt{Z}}{B\sqrt{M}}\right)$ I/Os.
       2. Algorithm 2 emits the set of nonzero entries of AC, using $O\!\left(\frac{N^2}{MB}\right)$ I/Os.
     - There exist matrices that require $\Omega\!\left(\min\left\{\frac{N^2}{MB},\ \frac{N\sqrt{Z}}{B\sqrt{M}}\right\}\right)$ I/Os to compute all nonzero entries of AC.
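
     To get a feel for how the two upper bounds relate, a quick back-of-the-envelope comparison; all parameter values below are illustrative assumptions, not from the paper.

```python
from math import sqrt

# Illustrative parameters (assumptions):
N = 10**8        # nonzero input entries
Z = 10**8        # nonzero output entries
M = 10**7        # words of internal memory
B = 10**3        # words per I/O block

bound_randomized    = N * sqrt(Z) / (B * sqrt(M))   # ~O(N sqrt(Z) / (B sqrt(M)))
bound_deterministic = N**2 / (M * B)                # O(N^2 / (MB))
lower_bound = min(bound_deterministic, bound_randomized)

print(f"randomized upper bound    ~ {bound_randomized:.2e} I/Os")
print(f"deterministic upper bound ~ {bound_deterministic:.2e} I/Os")
print(f"lower bound (min of both) ~ {lower_bound:.2e} I/Os")
```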

  10. Upper bound / Size estimation: Output size estimation
     - Size estimation tool: given matrices A and C with N nonzero entries, compute an ε-estimate of the number of nonzeroes of each column of AC using $\tilde{O}(\varepsilon^{-3} N/B)$ I/Os.
     - Fact (Bender et al. '07): for a dense 1 × U vector y and a sparse U × U matrix S, we can compute yS in $\tilde{O}(\mathrm{nnz}(S)/B)$ I/Os.

  11. Upper bound / Size estimation: Distinct elements and matrix size
     - Distinct elements: given a frequency vector x of size n, where $x_i$ denotes the number of times element i occurs, $F_0 = \sum_i |x_i|^0$.
     - Fundamental problem in streaming: estimate $F_0$ without materializing x.
     - Observation: the number of distinct elements of AC is $\mathrm{nnz}(AC)$.
     - Good news: we can use existing machinery. There exists a matrix F of size $O(\varepsilon^{-3} \log n \log \delta^{-1}) \times n$ such that Fx gives $F_0$ w.h.p. [Flajolet-Martin '85].
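
     To illustrate why estimating nnz(AC) is a distinct-elements problem, here is a minimal Python sketch that streams the output coordinate of every elementary product and feeds it to a bottom-k (KMV) distinct-count estimator. This is only a stand-in for the linear Flajolet-Martin-style sketch F on the slide, and unlike the paper's method it does not handle cancellation; all function names below are hypothetical.

```python
import heapq
import random

def kmv_estimate_distinct(stream, k=256, seed=0):
    """Bottom-k (KMV) estimate of the number of distinct items in a stream.
    Stand-in for the paper's linear F_0 sketch; it ignores cancellation,
    since it only sees which output coordinates are 'touched'."""
    salt = random.Random(seed).getrandbits(64)
    kept = {}                        # the k smallest hash values seen, item -> h
    heap = []                        # max-heap over kept hashes (negated)
    for item in stream:
        if item in kept:
            continue
        h = hash((salt, item)) % (2**61 - 1) / (2**61 - 1)  # pseudo-uniform in [0,1)
        if len(heap) < k:
            kept[item] = h
            heapq.heappush(heap, (-h, item))
        elif h < -heap[0][0]:
            _, evicted = heapq.heappop(heap)
            del kept[evicted]
            kept[item] = h
            heapq.heappush(heap, (-h, item))
    if len(heap) < k:
        return len(heap)             # fewer than k distinct items: exact count
    return (k - 1) / (-heap[0][0])   # standard KMV estimator

def elementary_output_coords(A, C):
    """Stream the output coordinate (i, j) of every elementary product
    A[i,k] * C[k,j], with A and C given as dicts {(row, col): value}."""
    C_by_row = {}
    for (k, j) in C:
        C_by_row.setdefault(k, []).append(j)
    for (i, k) in A:
        for j in C_by_row.get(k, ()):
            yield (i, j)

# Tiny usage example with made-up sparse inputs:
A = {(0, 0): 1, (0, 1): 2, (1, 1): 3}
C = {(0, 0): 4, (1, 0): 5, (1, 2): 6}
est = kmv_estimate_distinct(elementary_output_coords(A, C), k=16)
print("estimated nnz(AC), ignoring cancellation:", est)   # exact here: 4
```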

  12. Upper bound / Size estimation: Output estimation
     F is $(\varepsilon^{-3} \log\delta^{-1} \log U) \times U$. A and C are U × U. To get the size estimate we must compute F × A × C.

  13. Upper bound / Size estimation: Output estimation (continued)
     F is $(\varepsilon^{-3} \log\delta^{-1} \log U) \times U$. A and C are U × U. To get the size estimate we must compute (F × A) × C.
     - Due to associativity: pick the cheap order.
     - Analysis: $\varepsilon^{-3} \log\delta^{-1} \log U$ invocations of the dense-vector sparse-matrix black box, i.e. $\tilde{O}(\varepsilon^{-3} N/B)$ I/Os.
     - Note: this works with cancellation, in contrast to previous size estimation methods.
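
     A small illustration of why the multiplication order matters, using scipy.sparse (an assumption for this sketch; the actual algorithm runs in the semiring I/O model, and F here is just a toy sign matrix, not a real $F_0$ sketch): the cheap order (F × A) × C only ever produces an intermediate with at most (sketch rows) × U entries, while the other order would first materialize the possibly much denser product A × C.

```python
import numpy as np
from scipy import sparse

U = 2000          # universe size (illustrative)
rows_F = 32       # stands in for eps^-3 * log(1/delta) * log U sketch rows

rng = np.random.default_rng(0)
A = sparse.random(U, U, density=0.005, format="csr", random_state=42)
C = sparse.random(U, U, density=0.005, format="csr", random_state=43)
F = sparse.csr_matrix(rng.integers(-1, 2, size=(rows_F, U)))  # toy sketch matrix

# Cheap order: FA has at most rows_F * U entries, so the next multiply is small.
FA = F.dot(A)
cheap = FA.dot(C)

# Expensive order: AC can have far more nonzeros than A or C.
AC = A.dot(C)
expensive = F.dot(AC)

print("nnz(A) =", A.nnz, " nnz(AC) =", AC.nnz)       # AC is typically much denser
print("intermediate nnz, cheap order:", FA.nnz)
print("results equal:", np.allclose(cheap.toarray(), expensive.toarray()))
```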

  14. Upper bound / Partitioning: Matrix mult partitioning, 1
     [Figure: the product A × C, before partitioning.]

  15. Upper bound / Partitioning: Matrix mult partitioning, 1 (second step of the same figure)
     [Figure: the product A × C, with the partitioning highlighted.]

  16. Upper bound / Partitioning: Matrix mult partitioning, 2
     [Figure: the product AC = A × C decomposed into a sum of smaller block products.]

  17. Upper bound / Partitioning: Partitioning the matrices
     - What we want: split the matrices into disjoint colored groups such that every color combination has at most M nonzero output entries.
     - Problem: this can't be done directly.
     - Instead: color the rows of A using c colors. For each of the c groups of rows, do an independent coloring of the columns of C with c colors.
     [Figure: block products formed from one row group of A and one column group of C.]

  18. Upper bound / Partitioning: Partitioning the matrices, 2
     Overview of how to partition matrices A and C (a numeric example follows this list):
     1. Pick the number of colors $c = \sqrt{\mathrm{nnz}(AC)\log U / M} + O(1)$.
     2. Recurse: split A into $A_1$ and $A_2$ such that $\mathrm{nnz}(A_1 C) \approx \mathrm{nnz}(AC)/2$ and $\mathrm{nnz}(A_2 C) \approx \mathrm{nnz}(AC)/2$.
     3. After $\log c + O(1)$ recursive levels we have O(c) disjoint colored groups of rows of A.
     4. For each of those groups: repeat the procedure for the columns of C.
     5. The key point: $O(c^2)$ subproblems, each of output size $\mathrm{nnz}(AC)/c^2 = O(M/\log U)$.
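
     For a rough feel of these quantities, a tiny numeric example; all parameter values are illustrative assumptions.

```python
from math import sqrt, log2

# Illustrative parameters (assumptions):
M = 10**7              # words of internal memory
U = 10**6              # matrix dimension
nnz_AC = 10**10        # nonzero output entries, nnz(AC)

c = sqrt(nnz_AC * log2(U) / M)       # number of colors, up to the +O(1) term
subproblems = c * c                  # O(c^2) color combinations
subproblem_out = nnz_AC / (c * c)    # output size per subproblem

print(f"c ~ {c:.0f}, subproblems ~ {subproblems:.0f}")
print(f"output per subproblem ~ {subproblem_out:.2e}  (target M/log U ~ {M/log2(U):.2e})")
```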

  19. Upper bound / Partitioning: Getting the correct subproblem size
     Say we can split A into $A_1$, $A_2$ such that
     1. $\mathrm{nnz}(A_1 C) \in \left[(1 - \log^{-1} U)\,\mathrm{nnz}(AC)/2,\ (1 + \log^{-1} U)\,\mathrm{nnz}(AC)/2\right]$,
     2. $\mathrm{nnz}(A_2 C) \in \left[(1 - \log^{-1} U)\,\mathrm{nnz}(AC)/2,\ (1 + \log^{-1} U)\,\mathrm{nnz}(AC)/2\right]$.
     Assume the biggest possible positive error: after q recursions a subproblem has output size $\mathrm{nnz}(AC)\left(\tfrac{1}{2} + \tfrac{1}{2\log U}\right)^q$. Then after $\log c^2 + O(1)$ recursions:
     $\mathrm{nnz}(AC)\left(\tfrac{1}{2} + \tfrac{1}{2\log U}\right)^{\log c^2} \le \mathrm{nnz}(AC)\, 2^{-\log c^2}\, e^{\log c^2 / \log U} \le \mathrm{nnz}(AC)\, O(1)/c^2 = O(M/\log U).$
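
     A quick numeric sanity check of the chain of inequalities above, with illustrative parameter values (the c and U here are assumptions carried over from the previous example).

```python
from math import log2, exp

# Illustrative parameters (assumptions):
U = 10**6
c = 141                      # number of colors from the earlier example
logU = log2(U)
q = log2(c * c)              # number of recursion levels, up to O(1)

worst_factor = (0.5 + 1 / (2 * logU)) ** q    # worst-case per-level growth
upper_bound  = 2 ** (-q) * exp(q / logU)      # the bound used on the slide

print(f"(1/2 + 1/(2 log U))^q = {worst_factor:.3e}")
print(f"2^-q * e^(q/log U)    = {upper_bound:.3e}")
print(f"1/c^2                 = {1 / (c * c):.3e}")
print("bound holds:", worst_factor <= upper_bound)
```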

  20. Upper bound / Partitioning: How to compute the split
     How to do relative-error 1/log U splits: use the size estimation tool. For any set r of rows we have access to estimates $\hat z_i$ such that
     $(1 - \log^{-1} U)\,\mathrm{nnz}\!\left(\sum_{i \in r} [AC]_{i*}\right) \le \sum_{i \in r} \hat z_i \le (1 + \log^{-1} U)\,\mathrm{nnz}\!\left(\sum_{i \in r} [AC]_{i*}\right).$
     Splitting A into $A_1$ and $A_2$ (a code sketch follows this list):
     1. Let $\hat Z = \sum_i \hat z_i$.
     2. Add rows from A to $A_1$ until $\sum_{i \in A_1} \hat z_i \ge \hat Z / 2$.
     3. For the row y that overflows $A_1$: compute y × C directly.
     4. Add the remaining rows to $A_2$.
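
     A minimal Python sketch of this splitting step, assuming the per-row estimates $\hat z_i$ are already available as a list (on the slide they come from the size estimation tool):

```python
def split_rows(z_hat):
    """Split row indices 0..n-1 into (A1, overflow_row, A2) based on the
    estimated output sizes z_hat[i] of the rows of AC: fill A1 until its
    estimated mass would reach half of the total, handle the overflowing
    row separately (compute y x C directly), and put the rest in A2."""
    Z_hat = sum(z_hat)
    A1, A2 = [], []
    overflow = None
    mass = 0.0                        # estimated output mass assigned to A1
    for i, z in enumerate(z_hat):
        if overflow is not None:
            A2.append(i)              # everything after the overflow row goes to A2
        elif mass + z >= Z_hat / 2:
            overflow = i              # this row would push A1 past Z_hat / 2
        else:
            A1.append(i)
            mass += z
    return A1, overflow, A2

# Tiny usage example with made-up per-row estimates:
z_hat = [5, 1, 7, 2, 4, 6]
print(split_rows(z_hat))              # ([0, 1], 2, [3, 4, 5])
```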

  21. Upper bound / Partitioning: I/O cost of splitting
     - Initial size estimation: $\tilde{O}(N/B)$.
     - Partitioning A: c dense-vector sparse-matrix multiplications: $\tilde{O}(cN/B)$.
     - For the c partitions of A: one size estimation of total cost $\tilde{O}(N/B)$ and c DVSM multiplications of total cost $\tilde{O}(cN/B)$.
     - Total: $\tilde{O}(cN/B) = \tilde{O}\!\left(\frac{N\sqrt{\mathrm{nnz}(AC)}}{B\sqrt{M}}\right)$, since $c = \sqrt{\mathrm{nnz}(AC)\log U / M}$.
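
     For completeness, the substitution behind the last bullet (the $\sqrt{\log U}$ factor is absorbed by $\tilde{O}$):
     $\frac{cN}{B} = \frac{N}{B}\sqrt{\frac{\mathrm{nnz}(AC)\,\log U}{M}} = \frac{N\sqrt{\mathrm{nnz}(AC)}}{B\sqrt{M}}\,\sqrt{\log U} = \tilde{O}\!\left(\frac{N\sqrt{\mathrm{nnz}(AC)}}{B\sqrt{M}}\right).$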

  22. Upper bound / Outputting from partitions: Are we done?
     [Figure: the block products produced by the partitioning, as on the earlier partitioning slide.]
