Sparse Linear Algebra for the R Language: SparseM
(Details appeared in the Journal of Statistical Software, Vol. 8, Issue 6)
Roger Koenker, University of Illinois at Urbana-Champaign
Pin Ng, Northern Arizona University

Outline
Motivations
Design Philosophy
Incomplete List of Existing Algorithms for Sparse Matrices
Functionality
  Coercion -- Storage Modes
  Visualization
  Indexing and Binding
  Linear Algebra
  Linear Equation Solving
  Least Squares Problems
Some Potential Refinements
Performance
Conclusion
Motivations
Many Applications in Statistics Involve Large Sparse Matrices
Parametric linear regression involving longitudinal data with fixed effects
Indicator variables consist of a few ones and a large number of zero elements
Nonparametric regression
Design matrices in smoothing splines often contain fewer than 1% nonzero entries
[Figure: image plot of the sparse design matrix; axes: column (1-200) by row (1-700)]
A Typical Design Matrix of the Penalized Triogram in Koenker and Mizera (2003)
Significant Performance Improvements Can Be Achieved by Exploiting the Sparse Structure of the Matrices
Memory Utilization Computational Speed
Design Philosophy
Make the Usage as Transparent as Possible by Adhering to the Method-Dispatch Convention of R
Behave Like the Dense Matrix Counterparts Whenever Possible
Adhere to the GPL License
Incomplete List of Existing Sparse Matrix Utilities
Sparse Basic Linear Algebra Subroutines (Duff, Heroux and Pozo, 2002)
Handles only sparse matrix times dense matrix multiplication at Level 3
Limited matrix utilities
Sparskit (Saad, 1994)
Wide collection of matrix utilities, e.g. masking, sorting, permuting, extracting and filtering
Building blocks for SparseM
Algorithms for Solving Sparse Systems of Linear Equations
Iterative Methods (Sparskit)
Conjugate gradient method (CGM)
CGM on normal residual equation
Bi-conjugate gradient method (BCG)
BCG with partial pivoting
Transpose-free quasi-minimum residual method
Much Slower than Direct Methods for Our Applications
Reaffirmed by Haunschmid and Ueberhuber (2000), “Iterative Solvers for Sparse Systems”
Direct Methods
Watson Sparse Matrix Package (WSMP), Gupta (2000)
Proprietary license
MUltifrontal Massively Parallel Solver (MUMPS Version 4.2), Amestoy, Duff, L’Excellent and Koster (2002)
LU or Cholesky factorization
Written in Fortran 90
UMFPACK Version 4.0, Davis (2002)
LU factorization
General sparse solver, not written specifically for symmetric positive definite systems
Ng and Peyton (1993) Left-Looking Block Sparse Cholesky Algorithm
Specialized for symmetric positive definite matrices
Building block for PCx and SparseM
Functionality -- Coercion
Four Storage Modes and Four Sparse Classes
CSR (Compressed sparse row), "matrix.csr"
  ra: real array of the nnz non-zero elements
  ja: integer array of the nnz column indices for ra
  ia: integer array of (n+1) pointers to the beginning of each row in ra and ja
CSC (Compressed sparse column), "matrix.csc"
SSR (Symmetric sparse row), "matrix.ssr"
  Only the lower triangular non-zero entries are stored
SSC (Symmetric sparse column), "matrix.ssc"
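The ra/ja/ia triple is the standard compressed-row layout used across sparse libraries. A minimal sketch in Python with scipy.sparse (an analogue for illustration, not SparseM itself; note scipy uses 0-based indices where SparseM's Fortran storage is 1-based):

```python
import numpy as np
from scipy.sparse import csr_matrix

A = np.array([[4., 0., 0.],
              [0., 5., 2.],
              [1., 0., 3.]])
S = csr_matrix(A)

# ra: the nnz non-zero values, stored row by row
print(S.data)      # [4. 5. 2. 1. 3.]
# ja: the column index of each value (0-based here, 1-based in SparseM)
print(S.indices)   # [0 1 2 0 2]
# ia: n+1 pointers marking where each row begins in ra/ja
print(S.indptr)    # [0 1 3 5]
```

Row i's entries occupy positions ia[i] through ia[i+1]-1 of ra and ja, which is what makes row-wise operations cheap in CSR.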
Functionality -- Coercion
Usage:
as.matrix.csr(x, …)
as.matrix.csc(x, …)
as.matrix.ssr(x, …)
as.matrix.ssc(x, …)
as.matrix(x)
is.matrix.csr(x, …), is.matrix.csc(x, …), is.matrix.ssr(x, …), is.matrix.ssc(x, …)
nrow(x), ncol(x), dim(x), class(x)
Functionality -- Visualization
Allows Exploration of the Sparse Structure
Works on Matrices of Class "matrix.csr"
Usage:
image(x, col = c("white", "gray"), xlab = "column", ylab = "row", …)
Functionality -- Indexing and Binding
Operate on Matrices of Class "matrix.csr"
Work Just Like Their Dense Matrix Counterparts but Return Matrices of Class "matrix.csr"
Usage:
x[i,j]
x[i,j] <- value
cbind(…)
rbind(…)
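The sparse-in, sparse-out behavior can be sketched with scipy's analogous operations, where hstack and vstack play the roles of cbind and rbind (an assumed analogy for illustration, not SparseM's own functions):

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack, vstack, issparse

S = csr_matrix(np.array([[1., 0.],
                         [0., 2.]]))

row = S[0, :]              # slicing returns a sparse matrix, not dense
wide = hstack([S, S])      # column binding (cbind analogue)
tall = vstack([S, S])      # row binding (rbind analogue)
print(issparse(row), wide.shape, tall.shape)
```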
Functionality -- Linear Algebra
Operations on the "matrix.csr" Class Yield Objects in the "matrix.csr" Class, with a Few Exceptions
Group Arith: "+", "-", "*", "/", "^", "%%", "%/%"
Group Compare: "==", "!=", ">", ">=", "<", "<="
Work and behave just like their dense matrix counterparts
“t” and “diff” Produce “matrix.csr” Class
Usage: t(x), diff(x, lag = 1, differences = 1)
“%*%” Produces “matrix.csr” Class if Both x and y Are “matrix.csr” Classes; Produces Dense “matrix” Class if y Is a Dense Vector
Usage: x%*%y
“diag” Returns a Dense Vector with Appropriate Zeros Reintroduced “diag<-” Returns a “matrix.csr” Class with the Diagonal Entries Replaced
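A sketch of the analogous class behavior in scipy (standing in for "%*%" dispatch, an illustrative substitute rather than SparseM itself): multiplying two sparse matrices keeps the result sparse, while multiplying by a dense vector yields a dense result.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse

S = csr_matrix(np.array([[2., 0.],
                         [0., 3.]]))

SS = S @ S                     # sparse %*% sparse stays sparse
v = S @ np.array([1., 1.])     # sparse %*% dense vector gives a dense vector
print(issparse(SS), issparse(v), v)
```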
Functionality -- Linear Equation Solving
“chol” Performs Cholesky Decomposition
Usage: chol(x, pivot = FALSE, nsubmax, nnzlmax, tmpmax, ...)
“backsolve” Performs a Triangular Back-Substitution
Usage: backsolve(r, x, k, upper.tri, transpose)
“solve” Combines “chol” and “backsolve”, and Will Compute the Inverse of a Matrix if the Right-Hand Side Is Missing
Usage: solve(a, b, ...)
For Systems of Linear Equations that Only Vary on the Right-hand-side, the Result from “chol” Can Be Reused
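The factor-once, solve-many pattern that "chol" plus "backsolve" supports can be sketched with scipy's sparse LU factorization standing in for SparseM's sparse Cholesky (a substitution for illustration only; scipy.sparse.linalg has no sparse Cholesky):

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# A symmetric positive definite system
A = csc_matrix(np.array([[4., 1., 0.],
                         [1., 3., 1.],
                         [0., 1., 2.]]))

lu = splu(A)                           # factor once
x1 = lu.solve(np.array([1., 0., 0.]))
x2 = lu.solve(np.array([0., 0., 1.]))  # reuse the factor for a new right-hand side
```

Only the cheap triangular solves are repeated; the expensive factorization is amortized across all right-hand sides.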
Functionality -- Least Squares Problem
“slm” Emulates “lm” to Fit Linear Models
“slm” Processes a Formula Object the Same Way as “lm”, then Calls “slm.fit” for the Actual Fitting Using “chol” and “backsolve”
Usage: slm(formula, data, weights, na.action, method = "csr", contrasts = NULL, ...)
Methods “summary”, “print”, “fitted”, “residuals”, and “coef” Are Available for Class “slm” Objects
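The least-squares path through "chol" and "backsolve" amounts to solving the normal equations; a sketch with scipy, where spsolve stands in (as an assumed substitute) for the Cholesky factor-and-backsolve step:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

# Sparse design matrix for the model y = b0 + b1 * x
X = csr_matrix(np.array([[1., 0.],
                         [1., 1.],
                         [1., 2.],
                         [1., 3.]]))
y = np.array([1., 3., 5., 7.])

# Normal equations X'X b = X'y; spsolve replaces chol + backsolve here
b = spsolve((X.T @ X).tocsc(), X.T @ y)
print(b)   # coefficients of the exact fit y = 1 + 2x
```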
Potential Refinements
Replace model.matrix(Terms, m, contrasts) in “slm” with a Specialized Version for the “matrix.csr” Class
Add “crossprod”, “row”, “col”, “eigen”, “det”, “svd” and the Set of Functions for “qr” Decomposition
Performance
“solve” Applied to Koenker and Mizera’s (2003) Penalized Triograms Test Function
The penalized triogram problem solved is

  \min_{\alpha} \sum_{i=1}^{n} \rho_\tau(z_i - \alpha_i) + \lambda \sum_{j=1}^{n} |d_j' \alpha|

with test surface

  f(x, y) = \frac{40 \exp\{8[(x - .5)^2 + (y - .5)^2]\}}{\exp\{8[(x - .2)^2 + (y - .7)^2]\} + \exp\{8[(x - .7)^2 + (y - .2)^2]\}}

and simulated observations

  z_i = f(x_i, y_i) + u_i, \quad i = 1, \ldots, n
Median Execution Time (in Seconds) of 50 Replications
[Figure: log-log plot of median execution time vs. n (n = 100, 200, 500, 1000); fitted lines: rqfn: log(Time) = -11.49 + 2.53 log(n); srqfn: log(Time) = -11.16 + 1.51 log(n)]
rqfn: dense matrix version
srqfn: sparse matrix version
Storage (in Bytes) Requirement
[Figure: log-log plot of storage vs. n (n = 50 to 5000); fitted lines: rqfn: log(Storage) = 3.6 + 2 log(n), quadratic growth; srqfn: log(Storage) = 7.5 + 1 log(n), linear growth]