Speedup for Multi-Level Parallel Computing - Shanjiang Tang, Bu-Sung Lee, Bingsheng He - PowerPoint PPT Presentation


Speedup for Multi-Level Parallel Computing. Shanjiang Tang, Bu-Sung Lee, Bingsheng He. School of Computer Engineering, Nanyang Technological University. 21st May 2012.


slide-1
SLIDE 1

Speedup for Multi-Level Parallel Computing

Shanjiang Tang, Bu-Sung Lee, Bingsheng He
School of Computer Engineering, Nanyang Technological University
21st May 2012

slide-2
SLIDE 2

Outline

  • Background & Motivation
  • Multi-Level Parallel Speedup
  • Evaluation
  • Conclusion
slide-3
SLIDE 3

Multi-Level Computing Architecture and Paradigm

slide-4
SLIDE 4

Multi-Level Computing Architecture and Paradigm

  • MPI+OpenMP
  • MPI+CUDA
  • MPI+OpenMP+CUDA

  • …

slide-5
SLIDE 5

Multi-Level Parallel Computing Model

[Figure: multi-level parallel computing model. Parallelism levels L1, L2, L3, ..., Lm; processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8. Legend: sequential part, parallel part.]

slide-6
SLIDE 6

Parallel Speedup

  • Definition

$$\text{Speedup} = \frac{\text{SequentialExecutionTime}}{\text{ParallelExecutionTime}}$$

  • Classification

Ø Absolute Speedup

$$\text{Speedup} = \frac{\text{BestSequentialALGExecutionTime}}{\text{ParallelALGExecutionTime}}$$

Ø Relative Speedup

$$\text{Speedup} = \frac{\text{ParallelALGSequentialExecutionTime}}{\text{ParallelALGExecutionTime}}$$

slide-7
SLIDE 7

Relative Speedup Model

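  • Fixed-size Speedup

Ø Amdahl’s Law

$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{1}{(1 - \alpha) + \dfrac{\alpha}{p}}$$

where $\alpha$ is the parallel fraction of the program's workload and $p$ is the number of processors.

  • Fixed-time Speedup

Ø Gustafson’s Law

$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{(1 - \alpha) + \alpha p}{(1 - \alpha) + \alpha} = (1 - \alpha) + \alpha p$$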
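The two single-level laws above can be compared numerically. The following is a minimal sketch, not code from the presentation: the function names and the sample value of the parallel fraction are assumptions chosen for illustration.

```python
def amdahl_speedup(alpha: float, p: int) -> float:
    """Fixed-size speedup: 1 / ((1 - alpha) + alpha / p)."""
    return 1.0 / ((1.0 - alpha) + alpha / p)


def gustafson_speedup(alpha: float, p: int) -> float:
    """Fixed-time speedup: (1 - alpha) + alpha * p."""
    return (1.0 - alpha) + alpha * p


if __name__ == "__main__":
    alpha = 0.9  # hypothetical parallel fraction, for illustration only
    for p in (1, 8, 64):
        print(p, amdahl_speedup(alpha, p), gustafson_speedup(alpha, p))
```

Under Amdahl's Law the speedup is bounded by 1 / (1 - α) however large p becomes, while under Gustafson's Law it grows linearly in p.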

slide-8
SLIDE 8

Motivation Example—NAS Benchmark (MPI+OpenMP)

slide-9
SLIDE 9

Motivation Example—NAS Benchmark (MPI+OpenMP)

Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing

slide-10
SLIDE 10

Outline

  • Background & Motivation
  • Multi-Level Parallel Speedup
  • Evaluation
  • Conclusion
slide-11
SLIDE 11

E-Amdahl’s Law

  • Awareness of Parallelism at Different Granularity Levels

[Figure: multi-level parallel computing model. Parallelism levels L1, L2, L3, ..., Lm; processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8. Legend: sequential part, parallel part.]

$$
sp(i) =
\begin{cases}
\dfrac{1}{(1 - f(m)) + \dfrac{f(m)}{p(m)}}, & i = m \\[2ex]
\dfrac{1}{(1 - f(i)) + \dfrac{f(i)}{p(i)\, sp(i+1)}}, & 1 \le i < m
\end{cases}
$$

slide-12
SLIDE 12

E-Amdahl’s Law

  • Two-Level Parallelism Speedup Model (MPI+OpenMP)

$$sp(\alpha, \beta, p, t) = \frac{1}{(1 - \alpha) + \dfrac{\alpha \left[ (1 - \beta) + \dfrac{\beta}{t} \right]}{p}}$$

where $\alpha$ is the parallel fraction of coarse-grained (MPI-level) parallelism, $\beta$ is the parallel fraction of fine-grained (OpenMP-level) parallelism, $p$ is the number of processes spawned, and $t$ is the number of threads spawned per process.
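The two-level model is the m = 2 case of the recursive E-Amdahl form sp(i) = 1 / ((1 - f(i)) + f(i)/(p(i)·sp(i+1))). The derivation below is a sketch under the assumption that level 1 is the MPI level (f(1) = α, p(1) = p) and level 2 is the OpenMP level (f(2) = β, p(2) = t).

```latex
\begin{aligned}
sp(2) &= \frac{1}{(1-\beta) + \dfrac{\beta}{t}} \\[1ex]
sp(1) &= \frac{1}{(1-\alpha) + \dfrac{\alpha}{p \cdot sp(2)}}
       = \frac{1}{(1-\alpha) + \dfrac{\alpha\left[(1-\beta) + \dfrac{\beta}{t}\right]}{p}}
       = sp(\alpha, \beta, p, t)
\end{aligned}
```

As an illustration with hypothetical values α = 0.9, β = 0.8, p = 8, t = 4: sp(2) = 1/(0.2 + 0.2) = 2.5, and sp(1) = 1/(0.1 + 0.9/(8 × 2.5)) = 1/0.145 ≈ 6.90.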

slide-13
SLIDE 13

E-Gustafson’s Law

  • Awareness of Parallelism at Different Granularity Levels

[Figure: multi-level parallel computing model. Parallelism levels L1, L2, L3, ..., Lm; processing elements PE1,1; PE2,1 and PE2,2; PE3,1 to PE3,8. Legend: sequential part, parallel part.]

$$
sp(i) =
\begin{cases}
(1 - f(m)) + f(m)\, p(m), & i = m \\[1ex]
(1 - f(i)) + f(i)\, p(i)\, sp(i+1), & 1 \le i < m
\end{cases}
$$

slide-14
SLIDE 14

Outline

  • Background & Motivation
  • Multi-Level Parallel Speedup
  • Evaluation
  • Conclusion
slide-15
SLIDE 15

Experiment Setup

  • Platform and Configuration

Ø A linux cluster consisting of eight computing nodes each with two quad-core chips Ø Configuration: One thread per CPU core

  • Benchmarks

NAS Parallel Benchmark (NPB) Multi-Zone (MZ) Version:

Ø BT-MZ (Unbalanced Workload Partitioning)
Ø SP-MZ (Balanced Workload Partitioning)
Ø LU-MZ (Balanced Workload Partitioning)
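The platform described above has 8 × 2 × 4 = 64 cores, so with one thread per core an MPI+OpenMP run can split them into p processes of t threads each. The sketch below enumerates such splits and evaluates the two-level E-Amdahl formula for each. It is an illustration only: the slides do not list the configurations actually used, the values of α and β are hypothetical, and the sketch assumes that p × t fills the whole machine and that the threads of one process stay within one node.

```python
NODES = 8
CORES_PER_NODE = 2 * 4                 # two quad-core chips per node
TOTAL_CORES = NODES * CORES_PER_NODE   # 64 cores, one thread per core


def two_level_e_amdahl(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level E-Amdahl speedup for p processes with t threads each."""
    return 1.0 / ((1.0 - alpha) + alpha * ((1.0 - beta) + beta / t) / p)


def full_machine_configs() -> list[tuple[int, int]]:
    """(p, t) pairs with p * t == TOTAL_CORES and t <= CORES_PER_NODE."""
    return [
        (TOTAL_CORES // t, t)
        for t in range(1, CORES_PER_NODE + 1)
        if TOTAL_CORES % t == 0
    ]


if __name__ == "__main__":
    alpha, beta = 0.98, 0.90  # hypothetical fractions, not measured values
    for p, t in full_machine_configs():
        print(f"p={p:2d} t={t} predicted speedup={two_level_e_amdahl(alpha, beta, p, t):.2f}")
```

In practice α and β would have to be estimated from measurements of the benchmark before such predictions mean anything.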

slide-16
SLIDE 16

Performance Prediction

slide-17
SLIDE 17

Prediction Result Comparison

slide-18
SLIDE 18

Outline

  • Background & Motivation
  • Multi-Level Parallel Speedup
  • Evaluation
  • Conclusion
slide-19
SLIDE 19

Conclusion

  • Traditional speedup models are unsuitable for multi-level parallelism

– They are unaware of the different granularities of parallelism in multi-level parallel computing.

  • Multi-level Parallelism Model

– A guidance model for multi-level optimization.
– A prediction model for multi-level parallelism.

slide-20
SLIDE 20
slide-21
SLIDE 21

Argument Estimation

slide-22
SLIDE 22

Speedup Under E-Amdahl’s Law

slide-23
SLIDE 23

Speedup Under E-Gustafson’s Law