SLIDE 1

Towards Jungle Computing with Ibis/Constellation

Jason Maassen, Niels Drost, Henri Bal, Frank Seinstra

Department of Computer Science, VU University, Amsterdam, The Netherlands

SLIDE 2

Introduction

  • HPC is entering many domains
  • Not just: physics / chemistry / climate modelling
  • Also: semantic web / medical / multimedia analysis / neuroinformatics / remote sensing / astronomy / ...
  • HPC is becoming more complex
  • Not just large SMPs or clusters, instead:
  • Clusters of SMPs / Grids / Clouds / Supercomputers / ...
  • Heterogeneous machines using GPU / Cell / FPGA
  • “It's a jungle out there”

3DAPAS Workshop 2011 2

SLIDE 3

Example Domain

Computational Astrophysics (amusecode.org)

SLIDE 4

Jungle Computing

  • Worst case computing ... as required by users
  • Arbitrary combination of distributed, hierarchical, and heterogeneous computing

SLIDE 5

Many Task Computing

According to Raicu, Foster, et al. [SC'08]:

“High-performance computations comprising multiple distinct activities, coupled via file system operations or message passing. Tasks may be small or large, uni-processor or multi-processor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large.”

  • Applications are dynamic and heterogeneous workflows / DAGs of activities

SLIDE 6

MTC in the Jungle

  • MTC has advantages for Jungle Computing
  • Many distinct activities
  • Can be implemented independently, using the tools and targeting the HPC architecture that best suit them
  • Reduced programming complexity
  • Complete applications are constructed using sequences and combinations of activities

SLIDE 7

Constellation

  • MTC system for Jungle Computing
  • Model based on:
  • activities (tasks)
  • executors (resources)
  • contexts (matchmaking)
  • events (communication)

SLIDE 8

Constellation Model

Application

  • Application: set of activities
  • Distinct tasks
  • Size and complexity may vary
  • Targeted at specific HPC platform
  • (Loosely) coupled using events
  • Often wrapper around existing code
  • Similar to workflow or DAG of tasks
  • Dynamic and unlimited in size
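
How activities might be coupled through events can be sketched as follows. This is a minimal illustration in Python, not the Constellation API: `MiniRuntime`, `Collector`, `submit`, `send` and `on_event` are hypothetical names, and events are delivered synchronously to keep the example short.

```python
import itertools


class MiniRuntime:
    """Hypothetical stand-in for a runtime: hands out ids and delivers events."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._activities = {}

    def submit(self, activity):
        # Register the activity and return its unique identifier.
        activity_id = next(self._ids)
        self._activities[activity_id] = activity
        return activity_id

    def send(self, target_id, payload):
        # Deliver an event to the activity that owns target_id.
        self._activities[target_id].on_event(payload)


class Collector:
    """Hypothetical activity that waits for a fixed number of result events."""

    def __init__(self, expected):
        self.expected = expected
        self.results = []

    def on_event(self, payload):
        self.results.append(payload)

    @property
    def done(self):
        return len(self.results) >= self.expected


runtime = MiniRuntime()
collector = Collector(expected=2)
collector_id = runtime.submit(collector)

# Two independent tasks report back to the collector by its identifier.
runtime.send(collector_id, "result of task A")
runtime.send(collector_id, "result of task B")
```

The tasks only need the collector's identifier, which is what keeps them loosely coupled.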

SLIDE 9

Constellation Model

Hardware

  • Hardware: set of executors
  • Capable of running activities
  • May represent anything from a single core to an entire cluster, a GPU, etc.
  • May be application-specific
  • Provides an application-specific heterogeneous resource pool

SLIDE 10

Constellation Model

Context

  • Both activities and executors are tagged with a context
  • Application-defined label (+ rank)
  • Used to define the relationship between activities and executors, e.g.:
  • Data dependencies, hardware requirements, ...
  • May combine contexts
  • Executors may have a preference for label or rank
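
One way to picture a context is as a set of labels plus a rank. The sketch below is an assumption-laden illustration in Python, not Constellation's own types: `ActivityContext`, `ExecutorContext` and `matches` are hypothetical names, and "match" is taken to mean that the two sides share at least one label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivityContext:
    labels: tuple  # application-defined labels, e.g. ("VU3", "any")
    rank: int = 0  # application-defined rank, e.g. the input size


@dataclass(frozen=True)
class ExecutorContext:
    labels: tuple                # labels this executor accepts, most preferred first
    prefer_biggest: bool = True  # rank preference (assumed to be a simple flag)


def matches(activity: ActivityContext, executor: ExecutorContext) -> bool:
    """An executor may run an activity if they share at least one label."""
    return any(label in activity.labels for label in executor.labels)


local_job = ActivityContext(labels=("VU3", "any"), rank=50)
remote_job = ActivityContext(labels=("VU4",), rank=50)
vu3_executor = ExecutorContext(labels=("VU3",))
```

Here `matches(local_job, vu3_executor)` holds, while `remote_job` would have to wait for an executor that accepts the label "VU4".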

SLIDE 11

Constellation Model

Matchmaking

  • RTS (runtime system) performs load balancing and matchmaking
  • Ensures activities are forwarded to a suitable executor
  • Tries to keep all executors busy
  • Uses context-aware work stealing
  • RTS also performs event routing
  • Based on unique activity identifier
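
Context-aware work stealing can be illustrated with a small sketch. It assumes, for illustration only, that each activity carries a single label and that an idle executor scans one victim's queue; the class `Executor` and the method `steal_from` are hypothetical and do not describe the actual runtime system.

```python
from collections import deque


class Executor:
    """Hypothetical executor with a set of accepted labels and a work queue."""

    def __init__(self, name, labels):
        self.name = name
        self.labels = set(labels)
        self.queue = deque()  # pending (activity_id, label) pairs

    def steal_from(self, victim):
        """Take the first activity in the victim's queue that this executor can run."""
        for item in list(victim.queue):  # iterate over a copy, then mutate the queue
            _activity_id, label = item
            if label in self.labels:
                victim.queue.remove(item)
                return item
        return None  # nothing suitable: the thief stays idle or tries another victim


busy = Executor("busy-node", labels=["cpu", "gpu"])
busy.queue.extend([(1, "gpu"), (2, "cpu"), (3, "cpu")])

gpu_thief = Executor("gpu-node", labels=["gpu"])
fpga_thief = Executor("fpga-node", labels=["fpga"])

stolen_by_gpu = gpu_thief.steal_from(busy)    # takes activity 1, the only GPU job
stolen_by_fpga = fpga_thief.steal_from(busy)  # finds nothing it can run
```

Unlike plain work stealing, the thief skips work it cannot execute, so an activity never ends up on unsuitable hardware.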

ComplexHPC Spring School 2011 11

SLIDE 12

Constellation API

SLIDE 13

Constellation API

SLIDE 14

DACH 2008

Data Challenge in conjunction with IEEE Cluster/Grid 2008

  • Supernova detection
  • Analyse 1052 image pairs on 11 clusters (Intrigger)
  • 'Sequential' executable provided

SLIDE 15

DACH 2008

Problem

  • Main problems:
  • Data distribution
  • Heterogeneity of work and hardware

  • Load balancing

SLIDE 16

DACH 2008

Workflow

  • Winning approach in 2008:
  • Parallelize workflow to improve hardware utilization
  • Create hierarchical master worker framework
  • Scheduling heuristics using data location and size

SLIDE 17

Constellation Version

Option 1: Monolithic

  • Wrap entire application in a single activity
  • One activity per image pair
  • Wrap each machine in one executor
  • Multiple cores per executor
  • Use context to influence order and placement of each of the activities

SLIDE 18

Evaluation

  • Intrigger not available
  • Instead we use DAS3+DAS4
  • 5+6 clusters in the Netherlands
  • Mix of 2/4/8/12/48 core machines
  • Various types of GPUs
  • Three Scenarios
  • Data locality
  • (Executor granularity)
  • Heterogeneous processing

SLIDE 19

Scenario 1

Data Locality

  • Data distributed over 4 clusters of DAS3 + DAS4
  • Use context to express data locality and preferred processing order

  • Adapt context to tune application
  • No change in application

SLIDE 20

Scenario 1

Results

Activity                    Executor                  Effect
"any"                       "any"                     Random order
"any", 50                   "any", biggest            Sorted by size
"VU3", "VU4", 50            "VU3", biggest            Local only, sorted by size
"VU3", "VU4", "any", 50     "VU3", "any", biggest     Preference for local, fallback to any, sorted by size
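
The effect of these context settings can be mimicked with a short sketch. It rests on assumptions: an executor is taken to try its labels in the order listed and, within a label, to pick the activity with the highest rank. The function `pick_next`, the tuple layout and the activity names are hypothetical, and the ranks are made-up sizes.

```python
def pick_next(executor_labels, pending, biggest_first=True):
    """Return the next activity for an executor, or None if nothing matches.

    pending holds (labels, rank, name) tuples; executor_labels is ordered by preference.
    """
    for label in executor_labels:
        candidates = [activity for activity in pending if label in activity[0]]
        if candidates:
            chooser = max if biggest_first else min
            return chooser(candidates, key=lambda activity: activity[1])
    return None


# Illustrative activities: data location labels, a size used as rank, and a name.
pending = [
    (("VU3", "any"), 50, "pair-a"),
    (("VU4", "any"), 90, "pair-b"),
    (("VU3", "any"), 70, "pair-c"),
]

local_first = pick_next(("VU3", "any"), pending)      # prefers local data
anywhere = pick_next(("any",), pending)               # sorted by size only
local_only = pick_next(("VU3",), [pending[1]])        # no local work left
fallback = pick_next(("VU3", "any"), [pending[1]])    # falls back to "any"
```

With the executor labelled "VU3", "any", the largest local pair is chosen before the larger remote one, and remote work is only picked up once no local work remains.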

SLIDE 21

Constellation Version

Option 2: Workflow

  • Wrap each stage in an activity
  • Wrap each core in an executor
  • Use context to influence order and placement of each of the jobs

SLIDE 22

Scenario 3

Heterogeneous System

  • 18 node GPU cluster
  • 8 cores + 1 GPU per node
  • Activity: single task
  • Executor: 1 core (top), 1 core or GPU (bottom)
  • Replaced activity 7.2 with GPU version
  • Label activities and executors accordingly
  • Significant performance gain
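
Labelling activities and executors for this setup could look roughly like the sketch below. The labels "cpu" and "gpu", the stage name "7.1" and all function names are hypothetical; only the node shape (8 cores + 1 GPU) and the GPU version of activity 7.2 come from the slide.

```python
GPU_STAGES = {"7.2"}  # the activity that was replaced with a GPU version


def label_for(stage):
    """Label a workflow stage by the kind of executor it needs."""
    return "gpu" if stage in GPU_STAGES else "cpu"


def node_executors(cores=8, gpus=1):
    """One executor per core plus one per GPU, as (name, label) pairs."""
    core_executors = [(f"core-{i}", "cpu") for i in range(cores)]
    gpu_executors = [(f"gpu-{i}", "gpu") for i in range(gpus)]
    return core_executors + gpu_executors


def eligible(stage, executors):
    """Names of the executors whose label matches the stage's label."""
    wanted = label_for(stage)
    return [name for name, label in executors if label == wanted]


executors = node_executors()
gpu_targets = eligible("7.2", executors)  # only the GPU executor
cpu_targets = eligible("7.1", executors)  # "7.1" is a made-up CPU stage
```

Because placement follows from the labels alone, moving a stage to the GPU only requires changing its label and providing a matching executor.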

SLIDE 23

Conclusions

  • We think Jungle Computing is a necessity for some application areas.
  • Constellation offers a suitable model (MTC) to create such applications.
  • Initial experiments show that Constellation works well for a wide range of hardware configurations
  • Easy to reconfigure applications to match resources
  • Allows integration of specialized accelerator codes
  • Suitable basis for a Jungle Computing model

SLIDE 24

Future Work

  • Application development
  • AMUSE
  • Remote Sensing
  • Climate modelling
  • Platform improvements
  • Easier integration of existing codes
  • Smart/automatic deployment/tuning of executors
  • Improve data handling
  • Better monitoring

SLIDE 25

Questions?

jason@cs.vu.nl
www.cs.vu.nl/ibis

SLIDE 26

Scenario 2

Executor Granularity

  • 30 largest images only
  • Single 48 core machine
  • Activity: entire application (a-c), single task (d)
  • Executor: [n]-cores
  • No change in application for experiment (a-c)
  • Only change executor config.
  • Completely ported application in (d)
  • Significant performance gain!
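
Changing executor granularity amounts to splitting the same 48 cores into executors of n cores each. The helper below is a hypothetical sketch of that arithmetic; the values of n are illustrative only, since the slide does not list the ones used in experiments (a-c).

```python
def executor_layout(total_cores, cores_per_executor):
    """Number of executors when a machine is split into equally sized executors."""
    if cores_per_executor <= 0 or total_cores % cores_per_executor != 0:
        raise ValueError("cores_per_executor must evenly divide total_cores")
    return total_cores // cores_per_executor


# Illustrative granularities for a single 48 core machine.
layouts = {n: executor_layout(48, n) for n in (1, 6, 48)}
```

Fewer, larger executors leave parallelism to the wrapped application, while many small executors leave it to the runtime. The application code stays the same either way; only the executor configuration changes.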
