Task Farming For Embarrassingly Parallel Processing - Ivan Girotto

SLIDE 1

Ivan Girotto - igirotto@ictp.it

Information & Communication Technology Section (ICTS), International Centre for Theoretical Physics (ICTP)

Task Farming For Embarrassingly Parallel Processing

SLIDE 2

Xeon E5650 hex-core processors (12GB - RAM)

Multi-core System vs Serial Programming

SLIDE 3

Xeon E5650 hex-core processors (12GB - RAM)

Multi-core System vs Parallel (//) Programming

SLIDE 4

NETWORK

SLIDE 5

I don’t know about parallel (//) programming

Deadline: 15/05!!

SLIDE 6

... but I’m lucky!!

  • I am working on an embarrassingly parallel problem
  • I can divide the work into independent tasks (no communication) that can be performed in parallel
  • Quite common in Computer Graphics, Bioinformatics, Genomics, HEP, and anything else requiring processing of large data sets, sampling, or ensemble modeling

SLIDE 7

Single Program on Multiple Data

  • Running the same program (set of instructions) on different data
  • Same model adopted by the MPI library
  • A parallel tool is needed to handle the different processes working in parallel
  • The MPI library provides the mpirun application to execute parallel instances of the same program

SLIDE 8

$ mpirun -np 12 my_program.x

[Figure: the processes run on mynode01 and mynode02]

SLIDE 9

[igirotto@mynode01 ~]$ mpirun -np 12 /bin/hostname
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02

SLIDE 10

Parallel Operations in Practice

  • Reading and computing in parallel are always allowed
  • Parallel writing is extremely dangerous!
  • To control the parallel flow, each process should be unique and identifiable (ID)
  • The Open MPI implementation of the MPI library provides a series of environment variables defined for each MPI process
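
One common way to sidestep the parallel-writing hazard above is to give every process its own output file, named after its ID. The sketch below is only an illustration under stated assumptions: the helper names, the "result" prefix, and the temporary output directory are hypothetical, and the rank falls back to 0 when the script is not started through Open MPI's mpirun.

```python
import os
import tempfile


def rank_output_path(directory, rank, prefix="result"):
    """Build a per-process file name so that no two ranks share a file."""
    return os.path.join(directory, f"{prefix}.{rank:04d}.out")


def write_result(directory, rank, text):
    """Write text to the file owned by this rank and return its path."""
    path = rank_output_path(directory, rank)
    with open(path, "w") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    # Assumption: rank 0 is used when OMPI_COMM_WORLD_RANK is not set.
    rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    # Hypothetical location; a real job would likely use a shared directory.
    outdir = tempfile.mkdtemp()
    print(write_result(outdir, rank, f"hello from rank {rank}\n"))
```

Because each rank only ever touches its own file, no locking is needed; the per-rank files can be merged serially once all processes have finished.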

SLIDE 11

  • OMPI_COMM_WORLD_SIZE - the number of processes in this process's MPI_COMM_WORLD
  • OMPI_COMM_WORLD_RANK - the MPI rank of this process
  • OMPI_COMM_WORLD_LOCAL_RANK - the relative rank of this process on this node within its job. For example, if four processes in a job share a node, they will each be given a local rank ranging from 0 to 3.
  • OMPI_UNIVERSE_SIZE - the number of process slots allocated to this job. Note that this may be different than the number of processes in the job.
  • OMPI_COMM_WORLD_LOCAL_SIZE - the number of ranks from this job that are running on this node.
  • OMPI_COMM_WORLD_NODE_RANK - the relative rank of this process on this node looking across ALL jobs.

http://www.open-mpi.org
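
As a rough illustration of how a script might collect several of these variables at once, the sketch below reads four of them into a small record. The OmpiInfo class, the read_ompi_info helper, and the fallback values (a single process with rank 0) are assumptions made here so that the same script also runs serially; they are not part of Open MPI.

```python
import os
from dataclasses import dataclass


@dataclass
class OmpiInfo:
    size: int        # OMPI_COMM_WORLD_SIZE
    rank: int        # OMPI_COMM_WORLD_RANK
    local_rank: int  # OMPI_COMM_WORLD_LOCAL_RANK
    local_size: int  # OMPI_COMM_WORLD_LOCAL_SIZE


def read_ompi_info(env=None):
    """Read the Open MPI variables, assuming a serial run when they are absent."""
    env = os.environ if env is None else env
    return OmpiInfo(
        size=int(env.get("OMPI_COMM_WORLD_SIZE", "1")),
        rank=int(env.get("OMPI_COMM_WORLD_RANK", "0")),
        local_rank=int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", "0")),
        local_size=int(env.get("OMPI_COMM_WORLD_LOCAL_SIZE", "1")),
    )


if __name__ == "__main__":
    print(read_ompi_info())
```

Passing a plain dictionary instead of os.environ makes the helper easy to try out without launching mpirun.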

SLIDE 12

In Python

import os
myid = int(os.environ['OMPI_COMM_WORLD_RANK'])  # environment values are strings
[...]

In BASH

#!/bin/bash
myid=${OMPI_COMM_WORLD_RANK}
[...]

[igirotto@mynode01 ~]$ mpirun ./myprogram.[py/sh...]
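
Once a process knows its ID, it can use it to pick its share of the work. The following is a minimal sketch of one possible scheme (round-robin over a task list), not necessarily the one used in the original course material. The my_tasks helper and the input file names are hypothetical, and the script assumes a single process when the Open MPI variables are not set.

```python
import os


def my_tasks(tasks, rank, size):
    """Return the tasks owned by this rank: every size-th item, starting at rank."""
    return tasks[rank::size]


if __name__ == "__main__":
    # Assumption: behave like a 1-process job when not started by mpirun.
    rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    size = int(os.environ.get("OMPI_COMM_WORLD_SIZE", "1"))
    inputs = [f"input_{i:03d}.dat" for i in range(10)]  # hypothetical file names
    for name in my_tasks(inputs, rank, size):
        print(f"rank {rank} of {size} would process {name}")
```

Every task lands on exactly one rank and no communication is needed, because each process derives its own share from the rank and the size alone.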

SLIDE 13

Possible Applications

  • Executing multiple instances of the same program with different inputs/initial conditions
  • Reading large binary files by splitting the workload among processes
  • Searching for elements in large data sets
  • Other parallel execution of embarrassingly parallel problems (no communication among tasks)
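
To make the binary-file case concrete, here is a hedged sketch in which each process computes a contiguous byte range from its rank and reads only that part of the file. The helper names and the record_bytes parameter are assumptions for illustration; trailing bytes that do not form a whole record are ignored, and the demo file is a throwaway temporary file.

```python
import os
import tempfile


def byte_range(total_bytes, rank, size, record_bytes=1):
    """Return (offset, length) of the block of whole records owned by rank."""
    n_records = total_bytes // record_bytes
    base, extra = divmod(n_records, size)
    # The first `extra` ranks take one additional record each.
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra)
    return first * record_bytes, count * record_bytes


def read_my_block(path, rank, size, record_bytes=1):
    """Seek to this rank's offset and read only its share of the file."""
    offset, length = byte_range(os.path.getsize(path), rank, size, record_bytes)
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


if __name__ == "__main__":
    rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    size = int(os.environ.get("OMPI_COMM_WORLD_SIZE", "1"))
    # Demo only: a real job would read an existing shared file instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(10)))
    print(rank, read_my_block(tmp.name, rank, size))
    os.remove(tmp.name)
```

Since the processes only read, they can all open the same file at the same time without coordination.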

SLIDE 14

Conclusions

  • Task farming is a simple model for parallelizing simple problems that can be divided into independent tasks
  • The mpirun application makes it easy to launch multiple processes and sets up their environment
  • Load balancing remains a major problem, but moving from serial to parallel processing can substantially reduce simulation time
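
The load-balancing remark can be illustrated with a small sketch that compares the per-process workload under two static assignments. It assumes, hypothetically, that the cost of each task is known in advance, which is often not the case in practice. The greedy "longest task first" rule is a technique chosen here for illustration and is not taken from the slides.

```python
import heapq


def round_robin_loads(costs, size):
    """Total cost per rank when task i goes to rank i % size."""
    loads = [0.0] * size
    for i, cost in enumerate(costs):
        loads[i % size] += cost
    return loads


def greedy_loads(costs, size):
    """Total cost per rank when each task, longest first, goes to the least-loaded rank."""
    heap = [(0.0, rank) for rank in range(size)]
    heapq.heapify(heap)
    for cost in sorted(costs, reverse=True):
        load, rank = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, rank))
    loads = [0.0] * size
    for load, rank in heap:
        loads[rank] = load
    return loads


if __name__ == "__main__":
    costs = [8, 1, 1, 1, 7, 1, 1, 1]  # hypothetical task durations
    print("round robin:", round_robin_loads(costs, 4))
    print("greedy     :", greedy_loads(costs, 4))
```

The job finishes only when the most heavily loaded process does, so the maximum of each list is the figure to compare.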
