SLIDE 1

Matrix Multiplication

Rasmus Pagh IT University of Copenhagen ITCS, January 10, 2012


SLIDE 3

Outline

  • Algorithm and analysis
  • Related work
  • Case study: Correlations
  • Open problems

SLIDE 4

Informal problem statement

  • Input: n-by-n matrices A and B, a parameter b.
  • Output: An approximation of AB that is good if AB is dominated by its b largest entries (“compressible”).

SLIDE 6

Basic algorithm

  • 1. Take hash functions s1, s2: [n] → {-1,1} and h1, h2: [n] → [b].
  • 2. Compute the polynomial

      Σ_i c_i x^i  =  Σ_{k=1..n} ( Σ_{i=1..n} A_{ik} s1(i) x^{h1(i)} ) · ( Σ_{j=1..n} B_{kj} s2(j) x^{h2(j)} )

  • 3. Extract the unbiased estimator

      (AB)_{ij} ≈ s1(i) s2(j) c_{h1(i)+h2(j)}

Observation: Each coefficient c_i is a sum of entries of AB with random signs.
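As a concrete sketch, the three steps can be written in plain Python (the `compressed_mm` helper name is made up for this illustration; fully random signs and hashes stand in for the pairwise/3-wise independent families the analysis assumes, and naive convolution stands in for FFT-based polynomial multiplication):

```python
import random

def compressed_mm(A, B, b, rng):
    """One run of the basic algorithm: returns an n-by-n estimate of AB."""
    n = len(A)
    # Step 1: random sign functions s1, s2: [n] -> {-1, +1}
    # and hash functions h1, h2: [n] -> [b].
    s1 = [rng.choice((-1, 1)) for _ in range(n)]
    s2 = [rng.choice((-1, 1)) for _ in range(n)]
    h1 = [rng.randrange(b) for _ in range(n)]
    h2 = [rng.randrange(b) for _ in range(n)]
    # Step 2: coefficients of the degree-(2b-2) polynomial.
    c = [0.0] * (2 * b - 1)
    for k in range(n):
        # Sketch column k of A and row k of B as degree-(b-1) polynomials.
        p = [0.0] * b
        q = [0.0] * b
        for i in range(n):
            p[h1[i]] += A[i][k] * s1[i]
            q[h2[i]] += B[k][i] * s2[i]
        # Multiply the two polynomials (naive convolution here).
        for d1 in range(b):
            if p[d1]:
                for d2 in range(b):
                    c[d1 + d2] += p[d1] * q[d2]
    # Step 3: unbiased estimator (AB)_ij ~ s1(i) s2(j) c_{h1(i)+h2(j)}.
    return [[s1[i] * s2[j] * c[h1[i] + h2[j]] for j in range(n)]
            for i in range(n)]
```

Averaging many independent runs converges to AB, reflecting the unbiasedness argued on the following slides.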

SLIDE 7

Why unbiased?

Lemma: If s1 and s2 are pairwise independent,

      E[s1(i1) s1(i2) s2(j1) s2(j2)] = 1 if i1 = i2 and j1 = j2, and 0 otherwise.

Using the lemma, the expected value of s1(i) s2(j) Σ_i c_i x^i is:

      E[ s1(i) s2(j) Σ_{k=1..n} ( Σ_{i'=1..n} A_{i'k} s1(i') x^{h1(i')} ) · ( Σ_{j'=1..n} B_{kj'} s2(j') x^{h2(j')} ) ]

        = Σ_{k=1..n} s1(i)² A_{ik} x^{h1(i)} s2(j)² B_{kj} x^{h2(j)}

        = (AB)_{ij} x^{h1(i)+h2(j)}
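The claim can be verified mechanically on toy inputs: with the hash values held fixed, averaging the estimator over all sign assignments yields exactly (AB)_ij (a brute-force enumeration; the matrices and hash values below are arbitrary choices for illustration):

```python
from itertools import product

n, b = 2, 2
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
h1, h2 = [0, 1], [1, 0]   # arbitrary fixed hash values in [b]

def estimate(s1, s2, i, j):
    """s1(i) s2(j) c_{h1(i)+h2(j)} for given sign vectors s1, s2."""
    d = h1[i] + h2[j]
    c = sum(A[i2][k] * s1[i2] * B[k][j2] * s2[j2]
            for k in range(n) for i2 in range(n) for j2 in range(n)
            if h1[i2] + h2[j2] == d)
    return s1[i] * s2[j] * c

def sign_average(i, j):
    """Exact expectation over all 2^n * 2^n sign assignments."""
    vals = [estimate(s1, s2, i, j)
            for s1 in product((-1, 1), repeat=n)
            for s2 in product((-1, 1), repeat=n)]
    return sum(vals) / len(vals)
```

Uniform enumeration of all sign vectors computes the expectation exactly, so the output equals AB with no sampling error.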

SLIDE 8

What is the variance?

  • Consider the “noise” in the estimator caused by (AB)_{i'j'}:

      X_{i'j'} = s1(i') s2(j') (AB)_{i'j'} if h1(i) + h2(j) = h1(i') + h2(j'), and 0 otherwise.

  • If h1, h2 are 3-wise independent, these random variables are uncorrelated, so:

      Var( Σ_{i',j'} X_{i'j'} ) = Σ_{i',j'} Var(X_{i'j'}) = Σ_{i',j'} E[X_{i'j'}²] ≤ Σ_{i',j'} (AB)_{i'j'}² / b = ||AB||²_F / b
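The bound can likewise be checked exactly on toy inputs by enumerating all sign vectors and all hash functions (fully independent hashes, which are in particular 3-wise independent; the small matrices are arbitrary choices):

```python
from itertools import product

n, b = 2, 2
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
AB = [[19, 22], [43, 50]]
frob_sq = sum(AB[i][j] ** 2 for i in range(n) for j in range(n))

def estimate(s1, s2, h1, h2, i, j):
    """s1(i) s2(j) c_{h1(i)+h2(j)} for given signs and hashes."""
    d = h1[i] + h2[j]
    c = sum(A[i2][k] * s1[i2] * B[k][j2] * s2[j2]
            for k in range(n) for i2 in range(n) for j2 in range(n)
            if h1[i2] + h2[j2] == d)
    return s1[i] * s2[j] * c

def variance(i, j):
    """Exact variance of the estimator over all signs and hash functions."""
    vals = [estimate(s1, s2, h1, h2, i, j)
            for s1 in product((-1, 1), repeat=n)
            for s2 in product((-1, 1), repeat=n)
            for h1 in product(range(b), repeat=n)
            for h2 in product(range(b), repeat=n)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)
```

For every entry, the exact variance comes out below ||AB||²_F / b, matching the slide's bound.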

SLIDE 9

Sparse outputs

  • Suppose AB has at most b/3 nonzero entries.
  • Then with probability 2/3 there is no noise in a given estimator.
  • Repeat O(log n) times and take the median estimate, to get the exact result whp.
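The median trick can be illustrated under a toy noise model (each run is exact with probability 2/3, arbitrary noise otherwise; the noise distribution below is made up purely for illustration):

```python
import random

def median_estimate(true_value, repetitions, rng):
    """Median of independent runs; each run is exact with probability 2/3."""
    estimates = []
    for _ in range(repetitions):
        if rng.random() < 2 / 3:
            estimates.append(true_value)                         # no collision: exact
        else:
            estimates.append(true_value + rng.uniform(-10, 10))  # noisy run
    estimates.sort()
    return estimates[len(estimates) // 2]
```

Whenever more than half of the runs are exact, the median equals the true value; with c log n repetitions a Chernoff bound makes the failure probability polynomially small.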

SLIDE 10

Time analysis

  • Construct 2n degree-b polynomials: O(n² + nb).
  • Multiply n pairs of degree-b polynomials using FFT: O(nb log b).
  • Extracting estimates: O(n²).

Total time: O(n² + nb log b).
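The FFT-based multiplication of two degree-b polynomials can be sketched with a textbook Cooley–Tukey routine (a standalone illustration; in practice a library FFT would be used):

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def poly_multiply(p, q):
    """Multiply integer-coefficient polynomials in O(b log b) time."""
    m = len(p) + len(q) - 1
    size = 1
    while size < m:
        size *= 2
    fp = fft([complex(x) for x in p] + [0j] * (size - len(p)))
    fq = fft([complex(x) for x in q] + [0j] * (size - len(q)))
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    # Divide by size to complete the inverse FFT; round to recover integers.
    return [round((x / size).real) for x in prod[:m]]
```

For example, (1 + 2x + 3x²)(4 + 5x) = 4 + 13x + 22x² + 15x³.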

SLIDE 13

Background

  • The polynomial computed is in fact a Count-Sketch [Charikar et al. ’04], an early compressed sensing method.
  • Polynomial multiplication combines the Count-Sketches of a column vector of A and a row vector of B into a Count-Sketch of their outer product.
  • Add up the outer product sketches to get a sketch for AB.
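The second bullet is an exact algebraic identity that is easy to check: convolving the Count-Sketch polynomials of u and v gives the Count-Sketch of the outer product u vᵀ under the combined hash (i, j) ↦ h1(i) + h2(j) and sign s1(i)·s2(j) (a toy check with arbitrary random inputs):

```python
import random

def sketch(vec, h, s, b):
    """Count-Sketch of a vector, viewed as polynomial coefficients."""
    c = [0.0] * b
    for i, v in enumerate(vec):
        c[h[i]] += s[i] * v
    return c

def convolve(p, q):
    """Naive polynomial multiplication."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pv in enumerate(p):
        for j, qv in enumerate(q):
            out[i + j] += pv * qv
    return out

rng = random.Random(42)
n, b = 5, 4
u = [rng.randint(-3, 3) for _ in range(n)]
v = [rng.randint(-3, 3) for _ in range(n)]
h1 = [rng.randrange(b) for _ in range(n)]
h2 = [rng.randrange(b) for _ in range(n)]
s1 = [rng.choice((-1, 1)) for _ in range(n)]
s2 = [rng.choice((-1, 1)) for _ in range(n)]

# Sketch of the outer product u v^T, hashing entry (i, j) to h1[i] + h2[j]
# with sign s1[i] * s2[j]:
direct = [0.0] * (2 * b - 1)
for i in range(n):
    for j in range(n):
        direct[h1[i] + h2[j]] += s1[i] * s2[j] * u[i] * v[j]

# The same sketch, obtained by convolving the two vector sketches:
via_conv = convolve(sketch(u, h1, s1, b), sketch(v, h2, s2, b))
```

The two computations agree exactly, by distributivity, for any choice of vectors, hashes, and signs.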

SLIDE 17

Some related results

  • Folklore: Computing AB with b nonzeros in time O(nb) if there are no cancellations.
  • Cohen and Lewis ’99: For nonnegative matrices, estimate AB with low relative error.
  • Iwen and Spencer ’09: Computing AB with ≤ b/n nonzeros in each column in time Õ(nb).
  • Drineas, Kannan, Mahoney ’06; Sarlós ’06: Computing AB with low total error in terms of ||A||F and ||B||F.

SLIDE 18

Case study: Correlations

Two rows of A are correlated. Which ones?

A = [matrix shown as an image in the original slides]

SLIDE 19

Sample covariance matrix

AAT = [matrix shown as an image in the original slides]

SLIDE 21

Sample covariance matrix

AAT ≈ [matrix shown as an image in the original slides]

estimated using compressed matrix multiplication

SLIDE 22

Sample covariance matrix

f(AAT) = [matrix shown as an image in the original slides]

Showing large values not explained by hash collisions; estimated using compressed matrix multiplication.
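A toy version of this case study (with made-up data; the exact product A Aᵀ stands in for the compressed estimate): plant two correlated rows and find them via the largest off-diagonal entry of A Aᵀ.

```python
import random

rng = random.Random(7)
d, rows = 400, 6

# Six random rows of length d; row 1 is planted as a noisy copy of row 0.
A = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(rows)]
A[1] = [x + 0.3 * rng.gauss(0, 1) for x in A[0]]

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

# The largest off-diagonal entry of A A^T reveals the correlated pair:
# the planted pair has inner product about d, the rest about sqrt(d).
best = max((abs(inner(A[i], A[j])), (i, j))
           for i in range(rows) for j in range(i + 1, rows))
```

In the compressed setting one would read the same large entries off the sketch of A Aᵀ instead of computing the full product.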

SLIDE 23

Some open problems

  • Can other problems with “sparse solutions” be solved efficiently using compressed sensing techniques?
  • Matrix inversion?
  • Linear systems with a sparse solution?
  • Sparse transitive closure of a graph?
  • Product of > 2 matrices?

SLIDE 25

Discussion: Combinatorial algorithms

  • Compressed MM can be considered “combinatorial”.
  • Another view: No large hidden constants (in contrast to “algebraic” approaches leading to ω < 2.3727).
  • It is interesting to consider what other subclasses of matrix products can be computed in time, say, n^(2+ε), using algorithms with these properties.

SLIDE 27

Hidden slide: Extra application

[Comic shown in the original slides]

http://xkcd.com/651/