SLIDE 1

Motivation Gilbert Evaluation Conclusion

Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Systems

Till Rohrmann¹, Sebastian Schelter², Tilmann Rabl², Volker Markl²

¹Apache Software Foundation  ²Technische Universität Berlin

March 8, 2017

SLIDE 2

Motivation

SLIDE 3

Information Age

• Collected data grows exponentially
• Valuable information is stored in that data
• Need for scalable analytical methods

SLIDE 4

Distributed Computing and Data Analytics

• Writing parallel algorithms is tedious and error-prone
• Huge existing code base in the form of libraries
• Need for a parallelization tool

SLIDE 5

Requirements

• Linear algebra is the lingua franca of analytics
• Parallelize programs automatically to simplify development
• Sparse operations to support sparse problems efficiently

Goal: development of a distributed sparse linear algebra system

SLIDE 6

Gilbert

SLIDE 7

Gilbert in a Nutshell

SLIDE 8

System architecture

SLIDE 9

Gilbert Language

• Subset of the MATLAB® language
• Support of basic linear algebra operations
• Fixpoint operator serves as a side-effect-free loop abstraction
• Expressive enough to implement a wide variety of machine learning algorithms

    A = rand(10, 2);
    B = eye(10);
    A' * B;
    f = @(x) x .^ 2.0;
    eps = 0.1;
    c = @(p, c) norm(p - c, 2) < eps;
    fixpoint(1/2, f, 10, c);
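The semantics of the fixpoint operator can be sketched in a few lines of Python (an illustrative stand-in, not Gilbert's distributed runtime; the deck's own examples are MATLAB): apply f repeatedly from an initial value, stopping after a maximum number of iterations or when the optional convergence predicate fires.

```python
def fixpoint(initial, f, max_iterations, converged=None):
    """Side-effect-free loop: repeatedly apply f, stopping after
    max_iterations or when converged(previous, current) is True.
    A sketch of the operator's semantics only."""
    current = initial
    for _ in range(max_iterations):
        previous, current = current, f(current)
        if converged is not None and converged(previous, current):
            break
    return current

# Mirrors the MATLAB example above: f = @(x) x .^ 2.0, eps = 0.1,
# c = @(p, c) norm(p - c, 2) < eps, fixpoint(1/2, f, 10, c)
eps = 0.1
result = fixpoint(0.5,
                  lambda x: x ** 2.0,
                  10,
                  lambda p, c: abs(p - c) < eps)
```

Starting from 0.5, the iterates 0.25, 0.0625, 0.00390625 converge quickly; the predicate stops the loop once consecutive values differ by less than eps.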

SLIDE 10

Gilbert Typer

• MATLAB is dynamically typed
• Dataflow systems require type knowledge at compile time
• Automatic type inference using the Hindley-Milner type inference algorithm
• Matrix dimensions are inferred as well, enabling optimizations

    A = rand(10, 2)                    : Matrix(Double, 10, 2)
    B = eye(10)                        : Matrix(Double, 10, 10)
    A' * B                             : Matrix(Double, 2, 10)
    f = @(x) x .^ 2.0                  : N -> N
    eps = 0.1                          : Double
    c = @(p, c) norm(p - c, 2) < eps   : (N, N) -> Boolean
    fixpoint(1/2, f, 10, c)            : Double
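The dimension part of this inference can be illustrated with a toy shape checker (a hypothetical Python sketch, not Gilbert's actual typer): given the inferred shapes of the operands, it derives the shape of a transpose or product and rejects mismatches before anything runs.

```python
def infer_transpose(shape):
    """Shape of A' given the shape of A."""
    rows, cols = shape
    return (cols, rows)

def infer_matmul(left, right):
    """Shape of A * B; raises at 'compile time' when the inner
    dimensions disagree."""
    if left[1] != right[0]:
        raise TypeError(f"dimension mismatch: {left} * {right}")
    return (left[0], right[1])

# A = rand(10, 2), B = eye(10): A' * B is inferred as a 2 x 10 matrix
a_shape, b_shape = (10, 2), (10, 10)
result_shape = infer_matmul(infer_transpose(a_shape), b_shape)
```

Catching `A * B` (shapes (10, 2) and (10, 10)) as an error before execution is exactly what the dataflow backends need, since they must know operand types and sizes when the plan is built.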

SLIDE 11

Intermediate Representation & Gilbert Optimizer

• Language-independent representation of linear algebra programs
• Abstraction layer facilitates easy extension with new programming languages (such as R)
• Enables language-independent optimizations:
  • Transpose push-down
  • Matrix multiplication re-ordering
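Transpose push-down rewrites (A·B)' into B'·A', so the plan transposes the (smaller) operands instead of materializing the product and then transposing it. The identity behind the rewrite, checked numerically (a NumPy sketch; Gilbert applies the rule on its intermediate representation, not on arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 5))

# (A B)' == B' A': the transpose is pushed down to the inputs
pushed_down = B.T @ A.T
materialized = (A @ B).T
```

Both expressions yield the same 5 × 3 result, but the pushed-down form never creates the un-transposed product as a separate intermediate.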

SLIDE 12

Distributed Matrices

[Figure: (a) row partitioning, (b) quadratic block partitioning]

Which partitioning is better suited for matrix multiplication?

    io_cost_row   = O(n³)
    io_cost_block = O(n² · √n)

Quadratic block partitioning communicates asymptotically less data, so it is the better fit for matrix multiplication.
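Quadratic block partitioning can be sketched locally (a hypothetical helper in Python, not Gilbert's distributed matrix representation): the matrix is cut into square blocks keyed by block coordinates, which is the layout that lets multiplication ship O(√n) blocks per result block instead of entire rows.

```python
import numpy as np

def to_blocks(matrix, block_size):
    """Partition a dense matrix into square blocks keyed by
    (block_row, block_col) -- a toy stand-in for a distributed,
    block-partitioned matrix."""
    rows, cols = matrix.shape
    blocks = {}
    for i in range(0, rows, block_size):
        for j in range(0, cols, block_size):
            blocks[(i // block_size, j // block_size)] = \
                matrix[i:i + block_size, j:j + block_size]
    return blocks

# A 4 x 4 matrix with block size 2 yields four 2 x 2 blocks;
# block (1, 0) holds rows 2-3, columns 0-1
matrix = np.arange(16.0).reshape(4, 4)
blocks = to_blocks(matrix, 2)
```

In the distributed setting each block would live on some worker, and operations pair up blocks by their (row, col) key.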
SLIDE 13

Distributed Operations: Addition

Apache Flink and Apache Spark offer a MapReduce-like API with additional operators: join, coGroup, cross
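Blockwise addition maps naturally onto a join: blocks sharing the same (row, col) key are paired and summed. A minimal local sketch (plain dicts standing in for Flink/Spark datasets; names are hypothetical):

```python
import numpy as np

def blockwise_add(a_blocks, b_blocks):
    """Join two block-partitioned matrices on their block key and add
    the paired blocks -- the dataflow version expresses this pairing
    with a join or coGroup operator."""
    return {key: a_blocks[key] + b_blocks[key] for key in a_blocks}

a = {(0, 0): np.eye(2), (0, 1): np.zeros((2, 2))}
b = {(0, 0): np.eye(2), (0, 1): np.ones((2, 2))}
c = blockwise_add(a, b)
```

Because addition only ever touches blocks with identical keys, no data re-partitioning is needed when both operands use the same blocking.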

SLIDE 14

Evaluation

SLIDE 15

Gaussian Non-Negative Matrix Factorization

• Given V ∈ R^(d×w), find W ∈ R^(d×t) and H ∈ R^(t×w) such that V ≈ WH
• Used in many fields: computer vision, document clustering, topic modeling
• Efficient distributed implementation for MapReduce systems

Algorithm:

    H ← randomMatrix(t, w)
    W ← randomMatrix(d, t)
    while ‖V − WH‖² > eps do
        H ← H · (WᵀV / WᵀWH)
        W ← W · (VHᵀ / WHHᵀ)
    end while
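The multiplicative updates translate almost line-for-line into NumPy. The following is a local sketch of the algorithm, not Gilbert's distributed execution; the small epsilon guarding the divisions is an added safeguard against division by zero:

```python
import numpy as np

def gnmf(V, t, iterations=100, seed=0):
    """Gaussian NMF via the slide's multiplicative updates:
    H <- H * (W'V / W'WH),  W <- W * (VH' / WHH')."""
    rng = np.random.default_rng(seed)
    d, w = V.shape
    W = rng.random((d, t))
    H = rng.random((t, w))
    tiny = 1e-12  # avoid division by zero
    for _ in range(iterations):
        H *= (W.T @ V) / (W.T @ W @ H + tiny)
        W *= (V @ H.T) / (W @ H @ H.T + tiny)
    return W, H

# Factor a small random matrix; the reconstruction error should be
# well below the norm of V itself
rng = np.random.default_rng(42)
V = rng.random((20, 30))
W, H = gnmf(V, t=5)
error = np.linalg.norm(V - W @ H)
```

Since the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is the defining property of NMF.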

SLIDE 16

Testing Setup

• t = 10, w = 100000
• V ∈ R^(d×100000) with sparsity 0.001
• Block size: 500 × 500
• Number of cores: 64
• Flink 1.1.2 & Spark 2.0.0
• Gilbert implementation: 5 lines
• Distributed GNMF on Flink: 70 lines

    V = rand($rows, 100000, 0, 1, 0.001);
    H = rand(10, 100000, 0, 1);
    W = rand($rows, 10, 0, 1);
    nH = H .* ((W' * V) ./ (W' * W * H))
    nW = W .* (V * nH') ./ (W * nH * nH')

SLIDE 17

Gilbert Optimizations

[Plot: execution time t (s) over rows d of V; series: Optimized Spark, Optimized Flink, Non-optimized Spark, Non-optimized Flink]

SLIDE 18

Optimizations Explained

Matrix updates:

    H ← H · (WᵀV / (WᵀW)H)
    W ← W · (VHᵀ / (WH)Hᵀ)

Non-optimized matrix multiplications (evaluated left to right):

    (WᵀW) ∈ R^(10×10) times H ∈ R^(10×100000)   — small intermediate
    (WH) ∈ R^(d×100000) times Hᵀ                — huge intermediate

Optimized matrix multiplications (re-ordered):

    W ∈ R^(d×10) times (HHᵀ) ∈ R^(10×10)        — small intermediate
SLIDE 19

GNMF Step: Scaling Problem Size

[Plot: execution time t (s) over number of rows of matrix V; series: Flink, SP Flink, Spark, SP Spark, Local]

• Distributed Gilbert execution handles much larger problem sizes than local execution
• The specialized implementation is slightly faster than Gilbert

SLIDE 20

GNMF Step: Weak Scaling

[Plot: execution time t (s) over number of cores; series: Flink, Spark]

Both distributed backends show good weak scaling behaviour

SLIDE 21

PageRank

• Ranking between entities with reciprocal quotations and references

    PR(p_i) = d · Σ_{p_j ∈ L(p_i)} PR(p_j) / D(p_j) + (1 − d) / N

  N — number of pages
  d — damping factor
  L(p_i) — set of pages that link to p_i
  D(p_j) — number of outgoing links of p_j
  M — transition matrix derived from the adjacency matrix

• Vector form: R = d · M · R + ((1 − d) / N) · 𝟙
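The vector form can be checked on a tiny graph (a NumPy sketch with a hypothetical three-page link structure; the deck's own implementation on the next slide is MATLAB):

```python
import numpy as np

# Adjacency: A[i, j] = 1 when page i links to page j (hypothetical graph)
A = np.array([[0., 1., 1.],
              [0., 0., 1.],
              [1., 0., 0.]])
n, d = A.shape[0], 0.85

# Column-stochastic transition matrix, as in M = (diag(1 ./ outdeg) * A)'
M = (np.diag(1.0 / A.sum(axis=1)) @ A).T

# Power iteration on R = d * M R + (1 - d) / N
r = np.ones(n) / n
for _ in range(50):
    r = d * (M @ r) + (1 - d) / n

total = r.sum()
```

Because M is column-stochastic, each iteration preserves the total mass: the ranks remain a probability distribution summing to 1.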

SLIDE 22

PageRank Implementation

MATLAB®:

    it = 10;
    d = sum(A, 2);
    M = (diag(1 ./ d) * A)';
    r = ones(n, 1) / n;
    e = ones(n, 1) / n;
    for i = 1:it
        r = .85 * M * r + .15 * e;
    end

Gilbert

    it = 10;
    d = sum(A, 2);
    M = (diag(1 ./ d) * A)';
    r_0 = ones(n, 1) / n;
    e = ones(n, 1) / n;
    fixpoint(r_0,
             @(r) .85 * M * r + .15 * e,
             it)

SLIDE 23

PageRank: 10 Iterations

[Plot: execution time t (s) over number of vertices n; series: Spark, Flink, SP Flink, SP Spark]

• Gilbert backends show similar performance
• The specialized implementation is faster because it can fuse operations

SLIDE 24

Conclusion

SLIDE 25

Conclusion

• Easy-to-use sparse linear algebra environment for people familiar with MATLAB®
• Scales to data sizes exceeding a single computer
• High-level linear algebra optimizations improve runtime
• Slower than specialized implementations due to abstraction overhead
