
Toward Understanding Heterogeneity in Computing Arnold L. Rosenberg - PowerPoint PPT Presentation



  1. Toward Understanding Heterogeneity in Computing Arnold L. Rosenberg Ron C. Chiang Department of Electrical and Computer Engineering Colorado State University Fort Collins, CO, USA {rsnbrg, ron.chiang@colostate.edu}

  2. Motivation
     • Goal – to increase our understanding of heterogeneity in computing platforms

  3. Motivation
     • Goal – to increase our understanding of heterogeneity in computing platforms
     • Heterogeneous computing platforms
       – different computing speeds

  4. Motivation
     • Goal – to increase our understanding of heterogeneity in computing platforms
     • Heterogeneous computing platforms
       – different computing speeds
       – architecturally balanced

  5. “Understanding” Heterogeneity
     Suppose we have
     • $n + 1$ computers:
       – the server $C_0$
       – a “cluster” $C$ comprising $n$ computers, $C_1, \ldots, C_n$
     • Heterogeneity profile of $C$:
       – $C_i$ can complete one unit of work in time $\rho_i$
       – $\langle \rho_1, \ldots, \rho_n \rangle$
       – $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_n$

  6. The Cluster-Exploitation Problem (CEP)
     • $C_0$ must complete as many units of work as possible on cluster $C$ within a given lifespan of $L$ time units

  7. The Cluster-Exploitation Problem (CEP)
     • $C_0$ must complete as many units of work as possible on cluster $C$ within a given lifespan of $L$ time units
     • A worksharing protocol – a schedule that solves the CEP

  8. Architectural Parameters
     Fixed communication cost (negligible over a long lifespan):
     – $\sigma$ – setup time
     – $\lambda$ – latency

  9. Architectural Parameters and Sample Values
     Common parameters:
     – $\tau$ – transmission rate (e.g. 1 μsec / work unit)
     – $\delta$ – output-to-input length ratio (= 1)
     For computer $i$:
     – $\pi_i$ – packaging rate (e.g. 10 μsec / work unit)
     – $\bar{\pi}_i$ – unpackaging rate (e.g. 10 μsec / work unit)
     – $w_i$ – workload (work units)

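  10. Worksharing Protocols
      [Diagram: $C_0$ sends work $w_1$ to $C_1$, taking time $(\pi_0 + \tau) w_1$.]

  11. Worksharing Protocols
      [Diagram: $C_1$ processes its work in time $(\bar{\pi} + 1)\rho_1 w_1$; $C_0$ sends work $w_n$ to $C_n$, taking time $(\pi_0 + \tau) w_n$.]

  12. Worksharing Protocols
      [Diagram: $C_n$ processes its work in time $(\bar{\pi} + 1)\rho_n w_n$; $C_1$ returns results $\delta w_1$ to $C_0$, taking time $(\pi\rho_1 + \tau)\delta w_1$.]

  13. Worksharing Protocols
      [Diagram: $C_n$ returns results $\delta w_n$ to $C_0$, taking time $(\pi\rho_n + \tau)\delta w_n$.]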

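  14. The FIFO Protocol
      [Timing diagram, not to scale]
      • $C_0$ sends work to $C_1$, then to $C_2$, then to $C_3$, taking times $(\pi_0 + \tau) w_1$, $(\pi_0 + \tau) w_2$, $(\pi_0 + \tau) w_3$
      • $C_1$ waits, processes for time $(1 + \bar{\pi})\rho_1 w_1$, then returns results in time $(\pi\rho_1 + \tau)\delta w_1$
      • $C_2$ waits, processes for time $(1 + \bar{\pi})\rho_2 w_2$, then returns results in time $(\pi\rho_2 + \tau)\delta w_2$
      • $C_3$ waits, processes for time $(1 + \bar{\pi})\rho_3 w_3$, then returns results in time $(\pi\rho_3 + \tau)\delta w_3$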

  15. The FIFO Protocol is Optimal
      • Theorem [Adler-Gong-Rosenberg]: Over any sufficiently long lifespan $L$, for any heterogeneous cluster $C$, no matter what its heterogeneity profile:
        – FIFO worksharing protocols provide optimal solutions to the cluster-exploitation problem
        – $C$ is equally productive under every FIFO protocol, i.e., under all startup orderings

  16. The Work-Production of FIFO
      Let
      $$X = \sum_{i=1}^{n} \frac{1}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_i} \prod_{j=1}^{i-1} \left( 1 - \frac{\pi_0 + \tau - \tau\delta}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_j} \right)$$

  17. The Work-Production of FIFO
      Let
      $$X = \sum_{i=1}^{n} \frac{1}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_i} \prod_{j=1}^{i-1} \left( 1 - \frac{\pi_0 + \tau - \tau\delta}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_j} \right)$$
      Then,
      $$W = \frac{1}{\tau\delta + 1/X} \cdot L$$

  18. The Work-Production of FIFO
      Let
      $$X = \sum_{i=1}^{n} \frac{1}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_i} \prod_{j=1}^{i-1} \left( 1 - \frac{\pi_0 + \tau - \tau\delta}{(\pi_0 + \tau) + (1 + \bar{\pi} + \pi\delta)\rho_j} \right)$$
      To simplify, let $A = \pi_0 + \tau$ and $B = 1 + \bar{\pi} + \pi\delta$. Then
      $$X = \sum_{i=1}^{n} \frac{1}{A + B\rho_i} \prod_{j=1}^{i-1} \frac{B\rho_j + \tau\delta}{A + B\rho_j}$$
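The simplified expression for X and the work production W = L / (τδ + 1/X) can be evaluated with a single pass over the profile. The following is a minimal sketch, not the authors' code: the function names `x_value` and `work_production` are hypothetical, and the numeric values in the usage lines are toy values chosen only to make the arithmetic easy to check by hand.

```python
def x_value(profile, A, B, tau, delta):
    """Evaluate X = sum_i 1/(A + B*rho_i) * prod_{j<i} (B*rho_j + tau*delta)/(A + B*rho_j)."""
    total = 0.0
    running_product = 1.0  # empty product for the first computer
    for rho in profile:
        total += running_product / (A + B * rho)
        running_product *= (B * rho + tau * delta) / (A + B * rho)
    return total


def work_production(profile, lifespan, A, B, tau, delta):
    """Work completed over the lifespan: W = L / (tau*delta + 1/X)."""
    return lifespan / (tau * delta + 1.0 / x_value(profile, A, B, tau, delta))


if __name__ == "__main__":
    # Toy parameters (assumptions for illustration, not the slides' sample values).
    print(x_value([4.0, 1.0], A=2.0, B=1.0, tau=1.0, delta=1.0))
    print(work_production([4.0, 1.0], lifespan=13.0, A=2.0, B=1.0, tau=1.0, delta=1.0))
```

Reordering the profile leaves the value of X unchanged, which matches the statement that the cluster is equally productive under all startup orderings.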

  19. On Comparing Heterogeneity Profiles
      • For any cluster $C$ with heterogeneity profile $P = \langle \rho_1, \ldots, \rho_n \rangle$

  20. On Comparing Heterogeneity Profiles
      • For any cluster $C$ with heterogeneity profile $P = \langle \rho_1, \ldots, \rho_n \rangle$
      • $C$’s homogeneous-equivalent computing rate (HECR) is
        $$\rho_C = \max \{ \rho : X(P^{(\rho)}) \ge X(P) \}$$
        where $P^{(\rho)} = \langle \rho, \ldots, \rho \rangle$
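One way to compute the HECR numerically is bisection: search for the largest ρ whose homogeneous profile ⟨ρ, ..., ρ⟩ has an X value at least as large as that of the given profile. The sketch below assumes that X decreases as every computer gets slower, so that the search interval [min ρ_i, max ρ_i] brackets the answer. The names `x_value` and `hecr` and the iteration count are hypothetical choices, not taken from the slides.

```python
def x_value(profile, A, B, tau, delta):
    """Evaluate X for a profile using the simplified formula with A and B."""
    total, running_product = 0.0, 1.0
    for rho in profile:
        total += running_product / (A + B * rho)
        running_product *= (B * rho + tau * delta) / (A + B * rho)
    return total


def hecr(profile, A, B, tau, delta, iterations=100):
    """Approximate max{rho : X(<rho,...,rho>) >= X(profile)} by bisection."""
    n = len(profile)
    target = x_value(profile, A, B, tau, delta)
    lo, hi = min(profile), max(profile)  # assumed to bracket the HECR
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if x_value([mid] * n, A, B, tau, delta) >= target:
            lo = mid  # the homogeneous cluster is still at least as productive
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Toy parameters (assumptions for illustration only).
    print(hecr([1.0, 0.5, 0.25], A=2.0, B=1.0, tau=1.0, delta=1.0))
```

A fixed iteration count is used instead of a tolerance so that the loop always terminates, whatever the scale of the ρ values.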

  21. Heterogeneity Profiles
      Profile 1: $\rho_i = \frac{n - i + 1}{n}$, which spreads evenly in a range;
      when $n = 8$: $\frac{8}{8}, \frac{7}{8}, \frac{6}{8}, \ldots, \frac{1}{8}$

      Number of Computers   8       16      32
      HECR                  0.362   0.297   0.251

      Recall: a faster cluster has a smaller HECR value

  22. Heterogeneity Profiles
      Profile 2: $\rho_i = \frac{1}{i}$;
      when $n = 8$: $\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{8}$

      Number of Computers   8       16      32
      HECR                  0.216   0.116   0.061
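The two sample profiles are simple to generate for any n. The sketch below uses exact fractions so the entries print as they appear on the slides; the function names `evenly_spread_profile` and `harmonic_profile` are hypothetical labels for Profile 1 and Profile 2.

```python
from fractions import Fraction


def evenly_spread_profile(n):
    """Profile 1: rho_i = (n - i + 1) / n for i = 1, ..., n."""
    return [Fraction(n - i + 1, n) for i in range(1, n + 1)]


def harmonic_profile(n):
    """Profile 2: rho_i = 1 / i for i = 1, ..., n."""
    return [Fraction(1, i) for i in range(1, n + 1)]


if __name__ == "__main__":
    print(evenly_spread_profile(8))
    print(harmonic_profile(8))
```

Both lists come out in nonincreasing order, which is the ordering the heterogeneity-profile definition assumes.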

  23. Avg. Speed vs. Std-Dev of Speed
      [Figure: HECR of 8 computers (y-axis, 0 to 0.8) for Avg. Speed = 0.75, 0.5, 0.25 (x-axis), with one series each for Std-Dev = 0.2, 0.1, 0.05.]
      Randomly generate 100 profiles for each combination

  24. Avg. Speed vs. Std-Dev of Speed

      8 computers’ HECR       Std-Dev
                              0.2     0.1     0.05
      Avg. Speed 0.75         0.681   0.735   0.759
      Avg. Speed 0.5          0.411   0.482   0.501
      Avg. Speed 0.25         0.113   0.208   0.239

      The probability that these two groups have the same mean is $2 \times 10^{-10}$

  25. Avg. Speed vs. Std-Dev of Speed

      8 computers’ HECR       Std-Dev
                              0.2     0.1     0.05
      Avg. Speed 0.75         0.681   0.735   0.759
      Avg. Speed 0.5          0.411   0.482   0.501
      Avg. Speed 0.25         0.113   0.208   0.239

      Trials with 16 and 32 computers show a similar pattern

  26. Speeding Up Clusters Optimally under FIFO Protocols
      • Which one computer should you speed up, if you can speed up only one?

  27. Speeding Up Clusters Optimally under FIFO Protocols
      • Which one computer should you speed up, if you can speed up only one?
      • We study two variants of this question

  28. Speeding Up Clusters Optimally under FIFO Protocols
      For convenience,
      – let cluster $C$ have heterogeneity profile $P = \langle \rho_1, \ldots, \rho_n \rangle$, where $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_n$
      – let $i$ and $j > i$ be two computer indices

  29. Fixed and Proportional Speed-up
      • Fixed-speedup scenario (by a fixed amount $\phi < \rho_n$)
        $P^{(i)} = \langle \rho_1, \ldots, \rho_{i-1}, \rho_i - \phi, \rho_{i+1}, \ldots, \rho_{j-1}, \rho_j, \rho_{j+1}, \ldots, \rho_n \rangle$
        $P^{(j)} = \langle \rho_1, \ldots, \rho_{i-1}, \rho_i, \rho_{i+1}, \ldots, \rho_{j-1}, \rho_j - \phi, \rho_{j+1}, \ldots, \rho_n \rangle$

  30. Fixed and Proportional Speed-up
      • Fixed-speedup scenario (by a fixed amount $\phi < \rho_n$)
        $P^{(i)} = \langle \rho_1, \ldots, \rho_{i-1}, \rho_i - \phi, \rho_{i+1}, \ldots, \rho_{j-1}, \rho_j, \rho_{j+1}, \ldots, \rho_n \rangle$
        $P^{(j)} = \langle \rho_1, \ldots, \rho_{i-1}, \rho_i, \rho_{i+1}, \ldots, \rho_{j-1}, \rho_j - \phi, \rho_{j+1}, \ldots, \rho_n \rangle$
      • Proportional-speedup scenario (by a relative amount $\psi < 1$)
        $P^{[i]} = \langle \rho_1, \ldots, \rho_{i-1}, \psi\rho_i, \rho_{i+1}, \ldots, \rho_{j-1}, \rho_j, \rho_{j+1}, \ldots, \rho_n \rangle$
        $P^{[j]} = \langle \rho_1, \ldots, \rho_{i-1}, \rho_i, \rho_{i+1}, \ldots, \rho_{j-1}, \psi\rho_j, \rho_{j+1}, \ldots, \rho_n \rangle$
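Both scenarios modify exactly one entry of the profile, which can be sketched as two small helpers. The function names are hypothetical, indices are 1-based to match the slides, and the lower bounds φ > 0 and ψ > 0 are assumptions added here (the slides state only φ < ρ_n and ψ < 1).

```python
def fixed_speedup(profile, k, phi):
    """Return a copy of the profile with rho_k replaced by rho_k - phi (k is 1-based)."""
    # The profile is nonincreasing, so rho_n is its minimum entry.
    if not 0 < phi < min(profile):
        raise ValueError("phi must satisfy 0 < phi < rho_n")
    sped_up = list(profile)
    sped_up[k - 1] -= phi
    return sped_up


def proportional_speedup(profile, k, psi):
    """Return a copy of the profile with rho_k replaced by psi * rho_k (k is 1-based)."""
    if not 0 < psi < 1:
        raise ValueError("psi must satisfy 0 < psi < 1")
    sped_up = list(profile)
    sped_up[k - 1] *= psi
    return sped_up


if __name__ == "__main__":
    base = [1.0, 0.5]
    print(fixed_speedup(base, 2, 0.0625))
    print(proportional_speedup(base, 1, 0.5))
```

Returning a copy keeps the original profile available for before-and-after comparisons.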

  31. Proposition for Fixed-Speedup
      • Under the fixed-speedup scenario, the most advantageous single computer to speed up is $C$’s fastest computer

  32. Terms for following figures
      • Recall: work production $W = \frac{1}{\tau\delta + 1/X} \cdot L$
      • Work ratio – the ratio of work production after speedup to work production before speedup
      • Speedup computer – the single computer that is sped up

  33. Fixed-Speedup Scenario
      [Figure: work ratio (y-axis, 0.9 to 1.5) versus speedup computer (x-axis, 1 to 4) for profiles $\langle 1, 1/2, 1/3, 1/4 \rangle$ and $\langle 1/2, 1/4, 1/6, 1/8 \rangle$, with $\phi = 1/16$.]
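The work ratio for each choice of speedup computer can be computed directly from W = L / (τδ + 1/X), since the lifespan L cancels in the ratio. The sketch below is an illustration, not a reproduction of the figure: the function names are hypothetical, and the values of A, B, τ, and δ are the slides' sample values expressed in seconds, which is an assumption about units.

```python
def x_value(profile, A, B, tau, delta):
    """Evaluate X for a profile using the simplified formula with A and B."""
    total, running_product = 0.0, 1.0
    for rho in profile:
        total += running_product / (A + B * rho)
        running_product *= (B * rho + tau * delta) / (A + B * rho)
    return total


def work_per_time_unit(profile, A, B, tau, delta):
    """W / L = 1 / (tau*delta + 1/X)."""
    return 1.0 / (tau * delta + 1.0 / x_value(profile, A, B, tau, delta))


def fixed_speedup_ratios(profile, phi, A, B, tau, delta):
    """Work ratio (after / before) when each computer in turn is sped up by phi."""
    before = work_per_time_unit(profile, A, B, tau, delta)
    ratios = []
    for k in range(len(profile)):
        sped_up = list(profile)
        sped_up[k] -= phi  # speed up computer k+1 only
        ratios.append(work_per_time_unit(sped_up, A, B, tau, delta) / before)
    return ratios


if __name__ == "__main__":
    # Assumed units: seconds per work unit.
    params = dict(A=11e-6, B=1.000011, tau=1e-6, delta=1.0)
    for profile in ([1, 1 / 2, 1 / 3, 1 / 4], [1 / 2, 1 / 4, 1 / 6, 1 / 8]):
        print(fixed_speedup_ratios(profile, 1 / 16, **params))
```

With these inputs the ratio grows with the index of the speedup computer, in line with the proposition that the fastest computer is the best one to speed up by a fixed amount.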

  34. Proposition for Proportional-Speedup
      (Recall: $A = \pi_0 + \tau$, $B = 1 + \bar{\pi} + \pi\delta$, and $\rho_i > \rho_j$)
      • If $\psi\rho_i\rho_j > A\tau\delta / B^2$
        – speeding up $C_j$ (faster) is better
      • If $\psi\rho_i\rho_j < A\tau\delta / B^2$
        – speeding up $C_i$ (slower) is better

  35. Proposition for Proportional-Speedup
      (Recall: $A = \pi_0 + \tau$, $B = 1 + \bar{\pi} + \pi\delta$, and $\rho_i > \rho_j$)
      • If $\psi\rho_i\rho_j > A\tau\delta / B^2 = 1.0 \times 10^{-5}$
        – speeding up $C_j$ (faster) is better
      • If $\psi\rho_i\rho_j < A\tau\delta / B^2 = 1.0 \times 10^{-5}$
        – speeding up $C_i$ (slower) is better

      Parameter                                   Rate
      A                                           11 μsecond / work unit
      B with coarse tasks (1 sec / task)          1.000011 second / work unit

  36. Proposition for Proportional-Speedup
      (Recall: $A = \pi_0 + \tau$, $B = 1 + \bar{\pi} + \pi\delta$, and $\rho_i > \rho_j$)
      • If $\psi\rho_i\rho_j > A\tau\delta / B^2 = 1.0 \times 10^{-5}$
        – speeding up $C_j$ (faster) is better
      • If $\psi\rho_i\rho_j < A\tau\delta / B^2 = 1.0 \times 10^{-5}$
        – speeding up $C_i$ (slower) is better
      That is, it is more advantageous to speed up the faster one unless either both computers are already “very fast” or the speedup factor is “very large.”
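The proportional-speedup rule is a single comparison of ψρ_iρ_j against Aτδ/B², and it can be cross-checked against X itself. This is a sketch under stated assumptions: the function names are hypothetical, the "tie" outcome for exact equality is an addition (the slides do not cover that case), and the parameters in the usage lines are toy values rather than the slides' sample values.

```python
def x_value(profile, A, B, tau, delta):
    """Evaluate X for a profile using the simplified formula with A and B."""
    total, running_product = 0.0, 1.0
    for rho in profile:
        total += running_product / (A + B * rho)
        running_product *= (B * rho + tau * delta) / (A + B * rho)
    return total


def better_speedup_target(rho_i, rho_j, psi, A, B, tau, delta):
    """Apply the threshold rule for a slower computer (rho_i) and a faster one (rho_j)."""
    if not (rho_i > rho_j > 0 and 0 < psi < 1):
        raise ValueError("need rho_i > rho_j > 0 and 0 < psi < 1")
    threshold = A * tau * delta / B ** 2
    product = psi * rho_i * rho_j
    if product > threshold:
        return "faster"  # speed up C_j
    if product < threshold:
        return "slower"  # speed up C_i
    return "tie"  # assumption: equality is not addressed on the slides


if __name__ == "__main__":
    toy = dict(A=2.0, B=1.0, tau=1.0, delta=1.0)  # threshold A*tau*delta/B^2 = 2
    print(better_speedup_target(4.0, 1.0, 0.9, **toy))
    # Cross-check: X after speeding up the faster vs. the slower computer.
    print(x_value([4.0, 0.9], **toy), x_value([3.6, 1.0], **toy))
```

A larger X means more work completed over the same lifespan, so the cross-check compares the X values of the two sped-up profiles.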
