CrowdCL: Web-Based Volunteer Computing with WebCL


CrowdCL: Web-Based Volunteer Computing with WebCL. Tommy MacWilliam, Cris Cecka. Computer Science, Institute for Applied Computational Science, School of Engineering and Applied Sciences, Harvard University. September 11, 2013.


SLIDE 1

CrowdCL

Web-Based Volunteer Computing with WebCL Tommy MacWilliam, Cris Cecka

Computer Science Institute for Applied Computational Science School of Engineering and Applied Sciences Harvard University

September 11, 2013

Tommy MacWilliam CrowdCL 1 / 26

SLIDE 2

Volunteer Computing

"Donation" of CPU cycles to scientific problems

Folding@home

300,000 contributors... right now.

5 PetaFLOPS sustained

SETI@home

3 million participants

PrimeGrid, GPUGRID, NFS@Home, NSA@Home (j/k)

SLIDE 3

High Throughput Science

SLIDE 4

High Throughput Science

SLIDE 5

High Throughput Science

SLIDE 6

High Throughput Science

SLIDE 7

Goals

Bring volunteer computing to the web browser

"Volunteer"

Reduce downloading/installing friction.

Web browser as a high-performance distributed computing platform.

Develop a robust library for GPU computing in JavaScript.

Enable GPU development and metaprogramming on the web.

"Windows and Linux present a near-infinite combination of hardware, software, and drivers that would not be encountered in a local setting. This means that a significant amount of time is spent dealing with incompatibilities when the clients are developed, and every time a new version of the operating system is shipped such as Windows 7, or the latest version of a Linux distribution."

– Beberg et al., Folding@home: Lessons From Eight Years of Volunteer Distributed Computing

SLIDE 8

WebCL

Experimental cross-platform JS binding for OpenCL

Available for Firefox, WebKit, and Node.js

API is verbose, procedural, and difficult to use

SLIDE 9

Contributions

KernelContext, KernelUtils

PyCUDA inspired abstraction layer for WebCL

CrowdCL

Framework for developing and deploying high performance, web-based volunteer computing projects.

Application to an existing crowd-generated data project.

Comparison with existing cross-platform solutions.

SLIDE 10

KernelContext

Abstraction layer for WebCL inspired by PyCUDA

Minimizes WebCL/OpenCL boilerplate

OpenCL kernels are first-class citizens

Lazy evaluation utilizes the OpenCL command queue.
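
One way to picture the lazy evaluation point: commands are recorded in a queue and only run when a result is read back. The sketch below is plain JavaScript with no WebCL dependency. `LazyQueue`, `enqueue`, and `read` are hypothetical names chosen for illustration; they are not part of KernelContext, and the real library may defer work differently.

```javascript
// Hypothetical illustration of deferred execution; not the KernelContext API.
class LazyQueue {
  constructor() {
    this.pending = [];   // commands recorded but not yet executed
    this.executed = 0;   // number of commands that have actually run
  }

  // Record a command; nothing runs yet.
  enqueue(fn) {
    this.pending.push(fn);
    return this;
  }

  // Reading a buffer forces every pending command to run, in order.
  read(buffer) {
    while (this.pending.length > 0) {
      const fn = this.pending.shift();
      fn();
      this.executed += 1;
    }
    return buffer;
  }
}

const queue = new LazyQueue();
const data = new Uint32Array([1, 2, 3]);

queue
  .enqueue(() => { for (let i = 0; i < data.length; i++) data[i] += 1; })
  .enqueue(() => { for (let i = 0; i < data.length; i++) data[i] *= 2; });

const before = Array.from(data);             // snapshot taken before any command runs
const after = Array.from(queue.read(data));  // the read triggers both commands
```

Deferring work this way lets several launches and copies be submitted together, so the host only waits when it actually needs a result.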

SLIDE 11

KernelContext

    var ctx = new KernelContext();
    var source_str = "__kernel void FN_NAME(...) {...}";
    var kernel = ctx.compile(source_str, 'FN_NAME');

    var data = new Uint32Array(10);
    var d_data = ctx.toGPU(data);                // copy the host array to the device
    kernel({ local: 32, global: 32 }, d_data);   // launch with local/global work sizes
    ctx.fromGPU(d_data, data);                   // copy the result back into data

SLIDE 12

KernelUtils

Dynamically generate kernels following common patterns

mapKernel, reduceKernel

Generate a re-usable map or reduce kernel

"Templated" on map/reduce operation

Hides complexity – job size, multiple launches, etc.

map, reduce

Generate and launch a single-use map or reduce kernel

SLIDE 13

KernelUtils

    var ctx = new KernelContext();
    var util = new KernelUtils(ctx);

    // map over a small array
    var a1 = new Uint32Array(10);
    var result1 = util.map('x', 'x[i] + 1', a1);

    // the same call handles a much larger array
    var a2 = new Uint32Array(100000);
    var result2 = util.map('x', 'x[i] * 0.43', a2);
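
To make the "templated on the map operation" idea concrete, here is a minimal sketch of how a map expression such as `'x[i] + 1'` could be spliced into OpenCL C source. `mapKernelSource`, `CL_TYPES`, and the generated kernel signature are assumptions made for illustration; the slides do not show how KernelUtils builds its kernels.

```javascript
// Hypothetical generator; the real KernelUtils implementation may differ.
const CL_TYPES = new Map([
  [Uint32Array, 'uint'],
  [Int32Array, 'int'],
  [Float32Array, 'float'],
]);

// Build OpenCL C source for an elementwise map over one array.
// `arg` is the name the expression uses for the input, e.g. 'x'.
// `expr` is an expression written in terms of arg[i], e.g. 'x[i] + 1'.
function mapKernelSource(arrayType, arg, expr) {
  const clType = CL_TYPES.get(arrayType);
  if (clType === undefined) {
    throw new Error('unsupported array type');
  }
  return [
    `__kernel void map_kernel(__global const ${clType}* ${arg},`,
    `                         __global ${clType}* out,`,
    `                         const uint n) {`,
    `  size_t i = get_global_id(0);`,
    `  if (i < n) {  // guard: the global size may be rounded up past n`,
    `    out[i] = ${expr};`,
    `  }`,
    `}`,
  ].join('\n');
}

const source = mapKernelSource(Uint32Array, 'x', 'x[i] + 1');
```

The resulting string is the kind of source a call like `ctx.compile(source, 'map_kernel')` would take; the bounds check is what lets one generated kernel serve both the 10-element and the 100,000-element array.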

SLIDE 14

KernelUtils

    var ctx = new KernelContext();
    var util = new KernelUtils(ctx);

    // reduce a small array with two different operations
    var a1 = new Uint32Array(10);
    var sum1 = util.reduce('a + b', a1);
    var max1 = util.reduce('(a > b) ? a : b', a1);

    // the same calls handle a much larger array
    var a2 = new Uint32Array(100000);
    var prd2 = util.reduce('a * b', a2);
    var min2 = util.reduce('(a < b) ? a : b', a2);
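
The "multiple launches" that a reduce helper hides can be emulated on the CPU: each pass collapses every group of elements to one partial result, and passes repeat until a single value remains. This is a sketch only. `reducePass`, `multiPassReduce`, and the group size of 256 are hypothetical, and a real GPU reduction would run each group in parallel rather than in a loop.

```javascript
// CPU emulation of a multi-pass reduction; all names are hypothetical.
// `op` must be associative, e.g. (a, b) => a + b.
function reducePass(op, input, groupSize) {
  const groups = Math.ceil(input.length / groupSize);
  const output = new input.constructor(groups);
  for (let g = 0; g < groups; g++) {
    const start = g * groupSize;
    const end = Math.min(start + groupSize, input.length);
    let acc = input[start];
    for (let i = start + 1; i < end; i++) acc = op(acc, input[i]);
    output[g] = acc;  // one partial result per group
  }
  return output;
}

function multiPassReduce(op, input, groupSize = 256) {
  if (input.length === 0) throw new Error('cannot reduce an empty array');
  if (groupSize < 2) throw new Error('groupSize must be at least 2');
  let current = input;
  let launches = 0;
  while (current.length > 1) {
    current = reducePass(op, current, groupSize);
    launches += 1;  // each pass stands in for one kernel launch
  }
  return { value: current[0], launches };
}

const values = Uint32Array.from({ length: 100000 }, (_, i) => i % 7);
const sum = multiPassReduce((a, b) => a + b, values);
const max = multiPassReduce((a, b) => (a > b ? a : b), values);
```

With 100,000 elements and groups of 256, the array shrinks to 391, then 2, then 1 element, which is why the caller-facing API can stay a single call.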

SLIDE 15

KernelUtils

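    var ctx = new KernelContext();
    var util = new KernelUtils(ctx);

    // build re-usable reduction kernels for Uint32Array data
    var sum_kernel = util.reduceKernel(Uint32Array, 'a + b');
    var max_kernel = util.reduceKernel(Uint32Array, '(a > b) ? a : b');

    var a1 = new Uint32Array(100000);
    var d_a1 = ctx.toGPU(a1);

    // both kernels run on the same device buffer
    var sum2 = sum_kernel(d_a1);
    var max2 = max_kernel(d_a1);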

SLIDE 16

CrowdCL

Built on KernelContext to provide a re-usable framework for volunteer computing applications

CrowdCLient

Client library – generate results via WebCL

CrowdServer

Server library – collect results, aggregate data

SLIDE 17

CrowdCL Architecture

SLIDE 18

CrowdCLient

Execute code in the background of a web page

Send batched results to CrowdServer

Acts like a Thread class:

Define a run method that generates results for a problem

API to pause, resume, and sleep execution.
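
The thread-like behaviour described here can be sketched as a small class that calls a user-supplied `run` function, buffers results, and hands full batches to a sender. Everything below is hypothetical (`BatchingTask`, `step`, `batchSize`, and the callbacks are illustration only, not the CrowdCLient API), and sleep is left out for brevity.

```javascript
// Hypothetical sketch; class and method names are not the CrowdCLient API.
class BatchingTask {
  constructor({ run, send, batchSize }) {
    this.run = run;              // produces one result per call
    this.send = send;            // receives an array of results
    this.batchSize = batchSize;  // results to collect before sending
    this.buffer = [];
    this.paused = false;
  }

  pause() { this.paused = true; }
  resume() { this.paused = false; }

  // One unit of work. In a page, a timer would call this repeatedly
  // so the computation yields to the UI between steps.
  step() {
    if (this.paused) return false;
    this.buffer.push(this.run());
    if (this.buffer.length >= this.batchSize) {
      this.send(this.buffer);
      this.buffer = [];
    }
    return true;
  }
}

const sent = [];
let counter = 0;
const task = new BatchingTask({
  run: () => counter++,
  send: (batch) => sent.push(batch),
  batchSize: 3,
});

for (let i = 0; i < 7; i++) task.step();  // seven results, two full batches sent
task.pause();
const ranWhilePaused = task.step();       // a paused task does no work
```

Batching keeps network traffic low: the server sees one request per `batchSize` results instead of one per result.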

SLIDE 19

CrowdServer

RESTful Node.js application to aggregate CrowdCLient results

Supports both MongoDB and MySQL to store data
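
As a rough picture of what a result-collecting endpoint involves, the sketch below uses only Node's built-in `http` module and an in-memory object in place of MongoDB or MySQL. The `/results` route, the `aggregate` helper, and the response shape are all assumptions for illustration; they are not taken from CrowdServer.

```javascript
const http = require('http');

// In-memory stand-in for the database; hypothetical.
const store = { count: 0, results: [] };

// Append one batch of results and return the running total.
function aggregate(target, batch) {
  if (!Array.isArray(batch)) throw new TypeError('batch must be an array');
  target.results.push(...batch);
  target.count += batch.length;
  return target.count;
}

// POST /results with a JSON array body appends the batch to the store.
const server = http.createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/results') {
    res.writeHead(404);
    res.end();
    return;
  }
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    try {
      const total = aggregate(store, JSON.parse(body));
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ total }));
    } catch (err) {
      res.writeHead(400);  // malformed JSON or a non-array body
      res.end();
    }
  });
});

// server.listen(3000);  // uncomment to serve; the port is arbitrary
```

Keeping `aggregate` separate from the request handler makes it straightforward to swap the in-memory store for a database write.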

SLIDE 20

Thomson Problem

Thomson problem: nonlinear optimization problem, useful in many problems in biology, math, physics, and computer science

Lowest energy configurations of N repelling charges on a sphere

Force (gradient) and energy require O(N^2) computation.

Number of local minima grows exponentially with N

SLIDE 21

Thomson Problem

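Let $\omega_N = \{x_1, \ldots, x_N\}$ with $|x_i| = 1$ and

$$E_s(\omega_N) = \sum_{\substack{x, y \in \omega_N \\ x \neq y}} \frac{1}{|x - y|^s}$$

Gradient descent:

Compute the gradient (force on each point $x \in \omega_N$)

$$G(\omega_N)[x] = \sum_{\substack{y \in \omega_N \\ y \neq x}} \frac{x - y}{|x - y|^3}$$

Compute a (heuristic) step-length

$$ds := f(\omega_N, G(\omega_N))$$

Update all points in $\omega_N$ and renormalize

$$x := x + ds \cdot G(\omega_N)[x], \qquad x := \frac{x}{|x|}$$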
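
The iteration on this slide can be tried on the CPU with a few lines of plain JavaScript. This is a sketch under stated assumptions: it takes s = 1, uses a fixed step length `ds` in place of the heuristic f (which the slide does not specify), and all function names are hypothetical. It is not the WebCL kernel from the talk.

```javascript
// Sketch of the update for s = 1 with a fixed step length.
// The fixed `ds` replaces the unspecified heuristic f; that is an assumption.
function norm(v) {
  return Math.hypot(v[0], v[1], v[2]);
}

// E_s summed over ordered pairs x != y, as in the energy definition.
function energy(points, s = 1) {
  let e = 0;
  for (let i = 0; i < points.length; i++) {
    for (let j = 0; j < points.length; j++) {
      if (i === j) continue;
      const d = [0, 1, 2].map((k) => points[i][k] - points[j][k]);
      e += 1 / Math.pow(norm(d), s);
    }
  }
  return e;
}

// G[x] = sum over y != x of (x - y) / |x - y|^3, an O(N^2) double loop.
function forces(points) {
  return points.map((x, i) => {
    const g = [0, 0, 0];
    points.forEach((y, j) => {
      if (i === j) return;
      const d = [0, 1, 2].map((k) => x[k] - y[k]);
      const r3 = Math.pow(norm(d), 3);
      for (let k = 0; k < 3; k++) g[k] += d[k] / r3;
    });
    return g;
  });
}

// Move every point along its force, then project back onto the sphere.
function step(points, ds) {
  const g = forces(points);
  return points.map((x, i) => {
    const moved = x.map((xk, k) => xk + ds * g[i][k]);
    const len = norm(moved);
    return moved.map((mk) => mk / len);
  });
}

let points = [[1, 0, 0], [0, 1, 0], [0, 0, 1]];
const e0 = energy(points);
for (let iter = 0; iter < 50; iter++) points = step(points, 0.05);
const e1 = energy(points);  // lower than e0: the charges have spread apart
```

The double loop in `forces` is the O(N^2) cost that makes the problem a natural fit for a GPU kernel, with one work-item per point.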

SLIDE 22

NVIDIA 320M

SLIDE 23

NVIDIA Tesla K20

SLIDE 24

Kernel Performance (NVIDIA 320M)

SLIDE 25

WebCL only available on Firefox 19 + plugin

Nice, big security issues for general deployment

SLIDE 26

Thank you

https://github.com/tmacwill/webcl-kernelcontext

https://github.com/tmacwill/crowdcl
