Moving Toward a Concurrent Computing Grammar Michael Kane and - PowerPoint PPT Presentation

Moving Toward a Concurrent Computing Grammar Michael Kane and Bryan Lewis

"Make the easy things easy and the hard things possible" - Larry Wall

Workshop on Distributed Computing in R

Indrijit and Michael Identified a Few Themes For high-performance computing writing MPI at a low-level is a good option. For in-memory processing, adding some form of distributed objects in R can potentially improve performance. Using simple parallelism constructs, such as lapply , that operate on distributed data structures may make it easier to program in R. Any high level API should support multiple backends, each of which can be optimized for a speci fi c platform, much like R’s snow and foreach package run on any available backend.

Evaluating a Distributed Computing Grammar A grammar provides a su ff icient set of composable constructs for distributed computing tasks. R has great grammars for - for connecting to storage - manipulating data - data visualization - models It should be agnostic to the underlying technology but it needs to be able to use it in a way that's consistent.

Map/Reduce Integrated Fault Comm. Only Data Storage Tolerance hmr x x x RHIPE x x x rmr x x x sparkr x* x x parallel x Rdsm x Rmpi pdbMPI scidb x x distributedR x x

High-Level Distributed Computing Packages snow: Tierney, Rossini, Li, and Sevcikova (2003) foreach: Revo and Steve Weston (2009) ddR: Ma, Roy, and Lawrence (2015) datadr: Hafen and Sego (2016)

Is "good enough" good enough? We built pirls and irlba on top of ddR to answer: 1. Can we do it? 2. Does ddR provide the right constructs to do this in a natural way?

"Cheap" irls implementation irls = function(x, y, family=binomial, maxit=25, tol=1e-08) { b = rep(0, ncol(x)) for(j in 1:maxit) { eta = drop(x %*% b) g = family()$linkinv(eta) gprime = family()$mu.eta(eta) z = eta + (y - g) / gprime W = drop(gprime^2 / family()$variance(g)) bold = b b = solve(crossprod(x, W), crossprod(x, W * z), tol) if(sqrt(drop(crossprod(b - bold))) < tol) break } list(coefficients=b, iterations=j) }

Notation model matrix: X ∈ R n × p dependent data: y ∈ R n ^ slope coefficient estimates: ∈ R p β penalty parameter: λ ∈ [0, ∞ ) elastic net parameter: α ∈ [0, 1] a weight matrix: W ∈ R n × n a link function: g

pirls ^ ^ ^ λ β 1/2 2 2 min ∣∣ W ( y − g ( X )) ∣∣ + (1 − α ) ∣∣ ∣∣ + αλ ∣∣ ∣∣ β β ^ 1 2 β The Ridge Part The Least Absolute Shrinkage and Selection Operator (LASSO) Part

Data are Partitioned by Row If you have too many columns there's: SAFE - discard the jth variable if ∣∣ x ∣∣∣∣ y ∣∣ − λ λ ∣ x y ∣ / n > λ − T 2 2 max j n λ max STRONG - discard if KKT conditions are met and ∣ x y ∣ / n > 2 λ − λ T max j

Simple ddR implementation almost the same dirls = function(x, y, family=binomial, maxit=25, tol=1e-08) { b = rep(0, ncol(x)) for(j in 1:maxit) { eta = drop(x %*% b) g = family()$linkinv(eta) gprime = family()$mu.eta(eta) z = eta + (y - g) / gprime W = drop(gprime^2 / family()$variance(g)) bold = b b = solve(wcross(x, W), cross(x, W * z), tol) if(sqrt(crossprod(b - bold)) < tol) break } list(coefficients=b, iterations=j) }

cross = function(a, b) { Reduce(`+`, Map(function(j) { collect(dmapply(function(x, y) crossprod(x, y), parts(a), split(b, rep(1:nparts(a)[1], psize(a)[,1])), output.type="darray", combine="rbind", nparts=nparts(a)), j) }, seq(1,totalParts(a)))) } wcross = function (a, w) { Reduce(`+`, Map(function(j) { collect(dmapply(function(x, y) crossprod(x, y*x), parts(a), split(w, rep(1:nparts(a)[1], psize(a)[,1])), output.type="darray", combine="rbind", nparts=nparts(a)), j) }, seq(1,totalParts(a)))) }

A Toy Example > x = dmapply(function(x) matrix(runif(4), 2, 2), + 1:4, + output.type="darray", + combine="rbind", + nparts=c(4, 1)) > y = 1:8 > > print(coef(dirls(x, y, gaussian))) [,1] [1,] 6.2148108 [2,] 0.4186009 > print(coef(lm.fit(collect(x), y))) x1 x2 6.2148108 0.4186009

Another Algorithm: IRLBA setMethod("%*%", signature(x="ParallelObj", y="numeric"), function(x ,y) { stopifnot(ncol(x) == length(y)) collect( dmapply(function(a, b) a %*% b, parts(x), replicate(totalParts(x), y, FALSE), output.type="darray", combine="rbind", nparts=nparts(x))) }) setMethod("%*%", signature(x="numeric", y="ParallelObj"), function(x ,y) { stopifnot(length(x) == nrow(y)) colSums( dmapply(function(x, y) x %*% y, split(x, rep(1:nparts(y)[1], psize(y)[, 1])), parts(y), output.type="darray", combine="rbind", nparts=nparts(y))) })

> x = dmapply(function(x) matrix(runif(4), 25, 25), 1:4, + output.type="darray", combine="rbind", nparts=c(4, 1)) > print(irlba(x, nv=3)$d) [1] 32.745771 6.606732 6.506343 > print(svd(collect(x))$d[1:3]) [1] 32.745771 6.606732 6.506343 Applied this approach to compute 1st three principal components of the 1000 Genomes variant data: 2,504 x 81,271,844 (about 10^9 nonzero elements) < 10 minutes across 16 R processes (on EC2)

What we like about ddR The idea of distributed list, matrix, data.frame containers feels right. Pretty easy to use productively.

Ideas we're thinking about Separation of data and container API from execution Simpler and more general data and container API data provider "backends" belong to chunks freedom to work chunk by chunk or more generally containers that infer chunk grid from chunk metadata (allowing non-uniform grids)

Modified chunk data API idea Generalized/simpli fi ed ddR high-level containers get_values(chunk, indices, ...) get_attributes(chunk) get_attr(chunk, x) get_length(chunk) get_object.size(chunk) get_typeof(chunk) new_chunk(backend, ...) as.chunk(backend, value)

Simplified ddR container object ideas chunk1 = as.chunk(backend, matrix(1, 10, 10)) chunk2 = as.chunk(backend, matrix(2, 10, 2)) chunk3 = as.chunk(backend, 1:12) # A 20 x 12 darray x = darray(list(chunk1, chunk1, chunk2, chunk2), nrow=2, ncol=2) # A 12 x 1 darray y = darray(list(chunk3), ncol=1) # 12 x 1 array

Examples Sequential (data API only) z = darray(list(as.chunk(backend, x[1:10,] %*% y[]), as.chunk(backend, x[11:20,] %*% y[])), ncol=1) With a parallel execution framework z = darray(mclapply(list(1:10, 11:20), function(i) { as.chunk(backend, x[i, ] %*% y[]) }), ncol=1)

If you want to know more... cnidaria: A Generative Communication Approach to Scalable, Distributed Learning (2013) Generative communication in Linda, Gelernter (1985)

Moving Toward a Concurrent Computing Grammar Michael Kane and - PowerPoint PPT Presentation

Moving Toward a Concurrent Computing Grammar Michael Kane and Bryan Lewis "Make the easy things easy and the hard things possible" - Larry Wall Workshop on Distributed Computing in R Indrijit and Michael Identified a Few Themes

Working Together What does his future hold? Carres Grammar School Carres Grammar School

Grammar and word order Grammar and word order Grammar Grammar Includes morphology and syntax

DDR solution Sprites overview Moving right arrow Moving left arrow Moving down arrow Moving up

GRAMMAR THROUGH HUMOR BRANDY SHOOKS & WHITNEY SCHARER TEACHING GRAMMAR THROUGH HUMOR Having

General Context-Free Grammar Parsing: Application of grammar rewrite rules A phrase structure

Introduction to English Linguistics 4: Grammar and Syntax I Grammar and Syntax Grammar The

Grammar: The Heart of Numeracy 18 Nov, 2017 0B 2017 NNN2 Grammar: The Heart of Numeracy 1 0B

Introduction to English Linguistics 4: Grammar and Syntax Grammar and Syntax Grammar The rules

General Context-Free Grammar Parsing Application of grammar rewrite rules A phrase structure

Concurrent Enrollment A Guide for Parents and Students What is Concurrent Enrollment? Concurrent

Concurrent Message Service M. Clemencic CERN - LHCb Forum on Concurrent Programming Models and

Concurrent Programming in Scala 1 / 7 Concurrent Programming 1 Concurrent programming:

Lexical Grammar Unicorns I: The Passive Voice Alex Walls Director of Studies, Windsor English

APPR-NN-Sequences and their HPSGs view on lexicon and grammar grammar sign

Surface Reasoning Lecture 2: Logic and Grammar Thomas Icard June 18-22, 2012 Thomas Icard:

Grammar Formalisms: C-structures are represented with trees. Lexical Functional Grammar (LFG)

Can Non-Volatile Memory Benefit MapReduce Applications on HPC Clusters? Md. Wasi-ur- Rahman,

Rigid Body Dynamics CSE169: Computer Animation Instructor: Steve Rotenberg UCSD, Winter 2020

A 3 NCF: An Adaptive Aspect Attention Model for Rating Prediction Zhiyong Cheng 1 , Ying Ding 2 ,

Using BARRA to assess climate related hazards Chun-Hsu Su 1 , Nathan Eizenberg 1 , Peter Steinle 1

On the Complexity of Rationalizing Behavior Jose Apesteguia and Miguel A. Ballester Universitat

MUSer2: An Efficient MUS Extractor SYSTEM DESCRIPTION Anton Belov and Joao Marques-Silva Complex

Quantum circuits: From Structure to Software Aleks Kissinger Quantum Natural Language Processing,

Basic Todaism Carlos Tomei, PUC-Rio Geometry, Dynamics and Mechanics Seminar Universit degli

Sambuz

Useful Links

Newsletter

Mail Us

Moving Toward a Concurrent Computing Grammar Michael Kane and - PowerPoint PPT Presentation

Moving Toward a Concurrent Computing Grammar Michael Kane and Bryan Lewis "Make the easy things easy and the hard things possible" - Larry Wall Workshop on Distributed Computing in R Indrijit and Michael Identified a Few Themes

Working Together What does his future hold? Carres Grammar School Carres Grammar School

Grammar and word order Grammar and word order Grammar Grammar Includes morphology and syntax

DDR solution Sprites overview Moving right arrow Moving left arrow Moving down arrow Moving up

GRAMMAR THROUGH HUMOR BRANDY SHOOKS &amp; WHITNEY SCHARER TEACHING GRAMMAR THROUGH HUMOR Having

General Context-Free Grammar Parsing: Application of grammar rewrite rules A phrase structure

Introduction to English Linguistics 4: Grammar and Syntax I Grammar and Syntax Grammar The

Grammar: The Heart of Numeracy 18 Nov, 2017 0B 2017 NNN2 Grammar: The Heart of Numeracy 1 0B

Introduction to English Linguistics 4: Grammar and Syntax Grammar and Syntax Grammar The rules

General Context-Free Grammar Parsing Application of grammar rewrite rules A phrase structure

Concurrent Enrollment A Guide for Parents and Students What is Concurrent Enrollment? Concurrent

Concurrent Message Service M. Clemencic CERN - LHCb Forum on Concurrent Programming Models and

Concurrent Programming in Scala 1 / 7 Concurrent Programming 1 Concurrent programming:

Lexical Grammar Unicorns I: The Passive Voice Alex Walls Director of Studies, Windsor English

APPR-NN-Sequences and their HPSGs view on lexicon and grammar grammar sign

Surface Reasoning Lecture 2: Logic and Grammar Thomas Icard June 18-22, 2012 Thomas Icard:

Grammar Formalisms: C-structures are represented with trees. Lexical Functional Grammar (LFG)

Can Non-Volatile Memory Benefit MapReduce Applications on HPC Clusters? Md. Wasi-ur- Rahman,

Rigid Body Dynamics CSE169: Computer Animation Instructor: Steve Rotenberg UCSD, Winter 2020

A 3 NCF: An Adaptive Aspect Attention Model for Rating Prediction Zhiyong Cheng 1 , Ying Ding 2 ,

Using BARRA to assess climate related hazards Chun-Hsu Su 1 , Nathan Eizenberg 1 , Peter Steinle 1

On the Complexity of Rationalizing Behavior Jose Apesteguia and Miguel A. Ballester Universitat

MUSer2: An Efficient MUS Extractor SYSTEM DESCRIPTION Anton Belov and Joao Marques-Silva Complex

Quantum circuits: From Structure to Software Aleks Kissinger Quantum Natural Language Processing,

Basic Todaism Carlos Tomei, PUC-Rio Geometry, Dynamics and Mechanics Seminar Universit degli

Sambuz

Useful Links

Newsletter

Mail Us

GRAMMAR THROUGH HUMOR BRANDY SHOOKS & WHITNEY SCHARER TEACHING GRAMMAR THROUGH HUMOR Having