Dandelion: Review for R212, 24th November 2014




SLIDE 1

Dandelion

Review for R212: 24th November 2014

SLIDE 2

Motivation

GPU, FPGA, Vector processors becoming increasingly common (data parallel, power requirements, SIMD, etc.)

SLIDE 3

What is Dandelion?

Compiler

– Compiler for native .NET-based LINQ code (in C# or F#) for GPU programming

Runtime

– Abstracts scheduling details from the programmer: multi {machine, CPU, GPU}

SLIDE 4

Compiler

Clean interface to CUDA
Deal with CUDA complexities

– e.g. dynamic memory allocation

Bytecode compilation: benefits
Static analysis

SLIDE 5

Runtime

Needs to consider three scenarios:

– Machine-machine
– CPU-local
– GPU

SLIDE 6

Runtime

Needs to consider three scenarios:

– Machine-machine
– CPU-local
– GPU

SLIDE 7

GPU dataflow

SLIDE 8

GPU dataflow

SLIDE 9

Compute cluster

Two techniques:

– Dryad: persistent storage, high availability
– Moxie (developed for Dandelion): Spark-like in-memory storage and checkpoints

SLIDE 10

Compute cluster

Two techniques:

– Dryad: persistent storage, high availability
– Moxie (developed for Dandelion): Spark-like in-memory storage and checkpoints

[Figure: diagram labelled Master (×3) and Container (×3)]

SLIDE 11

Evaluation

SLIDE 12

Single machine performance

SLIDE 13

K-means

20x less code
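
To make the "less code" point concrete, here is a minimal sketch of one k-means iteration written in the declarative, group-then-aggregate style that LINQ queries encourage. It is an illustration only: it is plain Python rather than C# or F#, it is not Dandelion's code, and the names `nearest_center`, `kmeans_step` and `kmeans` are hypothetical. It says nothing about how Dandelion compiles or schedules such a query.

```python
from collections import defaultdict
from math import dist


def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def kmeans_step(points, centers):
    """One k-means iteration: group points by nearest center, then average each group."""
    # Group-by stage: key each point by the index of its nearest center.
    groups = defaultdict(list)
    for p in points:
        groups[nearest_center(p, centers)].append(p)

    # Aggregate stage: the new center is the per-dimension mean of its group.
    new_centers = []
    for i, old in enumerate(centers):
        members = groups.get(i)
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        dim = len(old)
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        )
    return new_centers


def kmeans(points, centers, iterations=10):
    """Run a fixed number of iterations (no convergence test in this sketch)."""
    for _ in range(iterations):
        centers = kmeans_step(points, centers)
    return centers
```

The whole algorithm is a group-by followed by a per-group average, with no explicit device memory management or kernel launches, which is the kind of high-level description the slides contrast with hand-written CUDA.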

SLIDE 14

Criticisms

No discussion of inter-machine scheduling and associated overheads.
Claims to support FPGAs, but gives no evaluation of this (cost reasons perhaps?).
Still suffers garbage-collection overheads due to the managed runtime.
More evaluation beyond k-means?

SLIDE 15

Summary

Data-parallel hardware becoming mainstream; need high-level programming support.
Dandelion schedules work onto GPUs (and others) from a high-level C# or F# implementation.

Achieves noticeable (30x+) speed improvements through use of GPUs, without the learning overhead of CUDA or similar.