SLIDE 1
Toward Understanding Heterogeneity in Computing
Arnold L. Rosenberg, Ron C. Chiang
Electrical & Computer Engineering, Colorado State University, Fort Collins, CO 80523, USA
SLIDE 2
SLIDE 3
Heterogeneity in Computing One encounters heterogeneity in virtually all modern computing systems
- Computers in clusters/grids differ in power (NODE-HETEROGENEITY).
SLIDE 4
Heterogeneity in Computing One encounters heterogeneity in virtually all modern computing systems
- Computers in clusters/grids differ in power (node-heterogeneity).
- Computers intercommunicate across varied networks (LINK-HETEROGENEITY).
SLIDE 5
Heterogeneity in Computing One encounters heterogeneity in virtually all modern computing systems
- Computers in clusters/grids differ in power (node-heterogeneity).
- Computers intercommunicate across varied networks (link-heterogeneity).
WE FOCUS ON NODE-HETEROGENEITY.
SLIDE 6
“Big” Questions about Heterogeneity Heterogeneity complicates the efficient use of multicomputer platforms
SLIDE 7
“Big” Questions about Heterogeneity Heterogeneity complicates the efficient use of multicomputer platforms — BUT CAN IT ENHANCE THEIR PERFORMANCE?
SLIDE 8
“Big” Questions about Heterogeneity Heterogeneity complicates the efficient use of multicomputer platforms — but can it enhance their performance? HOW DOES ONE STUDY THIS QUESTION RIGOROUSLY?
SLIDE 9
Detailed Questions about Heterogeneity
- WHAT MAKES ONE CLUSTER MORE POWERFUL THAN ANOTHER?
SLIDE 10
Detailed Questions about Heterogeneity
- What makes one cluster more powerful than another?
- ARE YOU BETTER OFF . . .
— WITH ONE SUPER-FAST COMPUTER AND MANY “AVERAGE” ONES?
SLIDE 11
Detailed Questions about Heterogeneity
- What makes one cluster more powerful than another?
- ARE YOU BETTER OFF . . .
— WITH ONE SUPER-FAST COMPUTER AND MANY “AVERAGE” ONES? — WITH ALL COMPUTERS “MODERATELY” FAST?
SLIDE 12
Detailed Questions about Heterogeneity
- What makes one cluster more powerful than another?
- Are you better off with
— one super-fast computer and many “average” ones — or with all computers “moderately” fast?
- IF YOU COULD “SPEED UP” JUST ONE COMPUTER . . .
WHICH ONE WOULD YOU CHOOSE?
SLIDE 13
Detailed Questions about Heterogeneity
- What makes one cluster more powerful than another?
- Are you better off with
— one super-fast computer and many “average” ones — or with all computers “moderately” fast?
- IF YOU COULD “SPEED UP” JUST ONE COMPUTER . . .
WHICH ONE WOULD YOU CHOOSE? — THE FASTEST ONE?
SLIDE 14
Detailed Questions about Heterogeneity
- What makes one cluster more powerful than another?
- Are you better off with
— one super-fast computer and many “average” ones — or with all computers “moderately” fast?
- IF YOU COULD “SPEED UP” JUST ONE COMPUTER . . .
WHICH ONE WOULD YOU CHOOSE? — THE FASTEST ONE? — THE SLOWEST ONE?
SLIDE 15
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn
SLIDE 16
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units.
SLIDE 17
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn
SLIDE 18
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn One finds in
- M. Adler, Y. Gong, A.L. Rosenberg (2008): On “exploiting” node-heterogeneous
clusters optimally. Theory of Computing Systems 42, 465–487
a solution to the CLUSTER-EXPLOITATION PROBLEM . . . — a search for a schedule that maximizes C’s rate of completing work
SLIDE 19
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn One finds in
- M. Adler, Y. Gong, A.L. Rosenberg (2008): On “exploiting” node-heterogeneous
clusters optimally. Theory of Computing Systems 42, 465–487
a solution to the CLUSTER-EXPLOITATION PROBLEM THE OPTIMAL SCHEDULE FOR C DEPENDS ONLY ON PC
SLIDE 20
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn One finds in
- M. Adler, Y. Gong, A.L. Rosenberg (2008): On “exploiting” node-heterogeneous
clusters optimally. Theory of Computing Systems 42, 465–487
a solution to the CLUSTER-EXPLOITATION PROBLEM The optimal schedule for C depends only on PC THE WORK COMPLETED UNDER THIS SCHEDULE IS OUR MEASURE OF C’s “POWER”
SLIDE 21
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn C’s “power”: the work completed by the optimal solution to the CLUSTER-EXPLOITATION PROBLEM The expression for this work is complicated . . . — so we also measure C’s “power” by its HECR: Homogeneous Equivalent Computing Rate
SLIDE 22
A Formal Framework for Studying the Questions Cluster C has computers C1, C2, . . . , Cn Ci completes one unit of work in ρi time units. C’s heterogeneity profile: PC = ρ1, ρ2, . . . , ρn C’s HECR (Homogeneous Equivalent Computing Rate) . . . the computing rate ρ(C) such that the HOMOgeneous cluster with profile ρ(C), ρ(C), . . . , ρ(C) completes work at the same rate as C.
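The HECR definition can be written as a single equation. In the sketch below, W(P) is assumed notation (it does not appear on the slides) for the rate at which a cluster with profile P completes work under its optimal schedule, and the homogeneous cluster is assumed to have the same number n of computers as C. The slides do not give a closed form for W, so none is supplied here.

```latex
\[
  W\bigl(\langle \underbrace{\rho(C),\, \rho(C),\, \ldots,\, \rho(C)}_{n \text{ copies}} \rangle\bigr)
  \;=\;
  W\bigl(\langle \rho_1,\, \rho_2,\, \ldots,\, \rho_n \rangle\bigr)
\]
```

Read this way, ρ(C) is the single per-unit work time that, shared by all n computers, would match the work rate of C's actual profile.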
SLIDE 23
ON TO OUR QUESTIONS!
SLIDE 24
Which ONE Computer Should You Speed UP?
SLIDE 25
Which Computer to Speed Up: Additive Speedup Speeding up computer Ci additively by the amount ϕ . . . replaces profile PC = ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn by profile PC = ρ1, . . . , ρi−1, ρi − ϕ, ρi+1, . . . , ρn Say that 0 < ϕ < min{ρ1, . . . , ρn}, so every Ci can be sped up.
SLIDE 26
Which Computer to Speed Up: Additive Speedup Speeding up computer Ci additively by the amount ϕ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ρi − ϕ, ρi+1, . . . , ρn Theorem. Under the additive-speedup scenario, the most advantageous single computer to speed up is C’s fastest computer.
SLIDE 27
Which Computer to Speed Up: Additive Speedup Speeding up computer Ci additively by the amount ϕ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ρi − ϕ, ρi+1, . . . , ρn Theorem. Under the additive-speedup scenario, the most advantageous single computer to speed up is C’s fastest computer.
Initial profile: 1, 1/2, 1/3, 1/4
Speedup amount: ϕ = 1/16
i   Speed up computer Ci    Work ratio (OLD ÷ NEW)
1   15/16, 1/2, 1/3, 1/4    1.008
2   1, 7/16, 1/3, 1/4       1.014
3   1, 1/2, 13/48, 1/4      1.034
4   1, 1/2, 1/3, 3/16       1.159
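The profiles listed in this slide's table follow mechanically from the additive-speedup rule. The sketch below reproduces them with exact fractions. The function name `additive_speedup` and the variable names are illustrative assumptions, not taken from the slides, and the code does not compute the work ratios, whose formula the slides do not state.

```python
from fractions import Fraction


def additive_speedup(profile, i, phi):
    """Return a copy of `profile` in which rho_i (1-indexed) is reduced by phi."""
    # The slides require 0 < phi < min rho_i so that every computer can be sped up.
    if not 0 < phi < min(profile):
        raise ValueError("phi must satisfy 0 < phi < min(rho_i)")
    sped_up = list(profile)
    sped_up[i - 1] -= phi
    return sped_up


# Initial profile and speedup amount as given on the slide.
initial = [Fraction(1), Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
phi = Fraction(1, 16)

# One candidate profile per choice of the computer to speed up.
candidates = {
    i: additive_speedup(initial, i, phi) for i in range(1, len(initial) + 1)
}

for i, profile in candidates.items():
    print(i, ", ".join(str(rho) for rho in profile))
```

Because ρi is the time per unit of work, the fastest computer is the one with the smallest ρ-value, here C4 with ρ4 = 1/4.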
SLIDE 28
Which Computer to Speed Up: Additive Speedup Speeding up computer Ci additively by the amount ϕ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ρi − ϕ, ρi+1, . . . , ρn Theorem. Under the additive-speedup scenario, the most advantageous single computer to speed up is C’s fastest computer.
i   Speed up computer Ci    Work ratio (OLD ÷ NEW)
1   15/16, 1/2, 1/3, 1/4    1.008
2   1, 7/16, 1/3, 1/4       1.014
3   1, 1/2, 13/48, 1/4      1.034
4   1, 1/2, 1/3, 3/16       1.159
INTUITION: MORE BANG FOR THE BUCK
SLIDE 29
Which Computer to Speed Up: Multiplicative Speedup Speeding up computer Ci multiplicatively by factor ψ . . . replaces profile PC = ρ1, . . . , ρi−1, ρi , ρi+1, . . . , ρn by profile PC = ρ1, . . . , ρi−1, ψρi , ρi+1, . . . , ρn Say that 0 < ψ < 1, so every Ci can be sped up.
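As a minimal sketch of the multiplicative rule, the code below scales one ρ-value by ψ and lists the resulting candidate profiles. The profile 1, 1/2, 1/3, 1/4 is reused from the additive example, and ψ = 1/2 is chosen for illustration. The function and variable names are assumptions, not taken from the slides, and no work rates are computed.

```python
from fractions import Fraction


def multiplicative_speedup(profile, i, psi):
    """Return a copy of `profile` in which rho_i (1-indexed) is scaled by psi."""
    # The slides require 0 < psi < 1 so that every computer can be sped up.
    if not 0 < psi < 1:
        raise ValueError("psi must satisfy 0 < psi < 1")
    sped_up = list(profile)
    sped_up[i - 1] *= psi
    return sped_up


# Illustrative inputs (assumed for this sketch).
initial = [Fraction(1), Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
psi = Fraction(1, 2)

# One candidate profile per choice of the computer to speed up.
candidates = {
    i: multiplicative_speedup(initial, i, psi)
    for i in range(1, len(initial) + 1)
}

# Absolute reduction in rho_i for each choice: (1 - psi) * rho_i.
reductions = [rho * (1 - psi) for rho in initial]

for i, profile in candidates.items():
    print(i, ", ".join(str(rho) for rho in profile), "reduction:", reductions[i - 1])
```

Unlike the additive case, where every choice removes the same amount ϕ, the absolute reduction (1 − ψ)ρi here is largest for the slowest computer and smallest for the fastest.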
SLIDE 30
Which Computer to Speed Up: Multiplicative Speedup Speeding up computer Ci multiplicatively by factor ψ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ψρi, ρi+1, . . . , ρn Say that 0 < ψ < 1, so every Ci can be sped up finitely. “Theorem.” Under the multiplicative-speedup scenario: The most advantageous single computer to speed up is C’s fastest computer . . .
SLIDE 31
Which Computer to Speed Up: Multiplicative Speedup Speeding up computer Ci multiplicatively by factor ψ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ψρi, ρi+1, . . . , ρn Say that 0 < ψ < 1, so every Ci can be sped up finitely. “Theorem.” Under the multiplicative-speedup scenario: The most advantageous single computer to speed up is C’s fastest computer . . . — UNLESS
SLIDE 32
Which Computer to Speed Up: Multiplicative Speedup Speeding up computer Ci multiplicatively by factor ψ: ρ1, . . . , ρi−1, ρi, ρi+1, . . . , ρn → ρ1, . . . , ρi−1, ψρi, ρi+1, . . . , ρn Say that 0 < ψ < 1, so every Ci can be sped up finitely. “Theorem.” Under the multiplicative-speedup scenario: The most advantageous single computer to speed up is C’s fastest computer . . . — UNLESS either this computer is already “very fast” or the speedup factor ψ is “very small.”
SLIDE 33
Which Computer to Speed Up: Multiplicative Speedup At least one computer is not “very fast”:
- A 4-computer cluster
— HOMOgeneous (before any speedups)
- Bar height is ρ-value . . .
— a lower bar is a faster computer
SLIDE 34
Which Computer to Speed Up: Multiplicative Speedup At least one computer is not “very fast”:
- A 4-computer cluster
— HOMOgeneous (before any speedups)
- Bar height is ρ-value . . .
— a lower bar is a faster computer START SPEEDING UP ONE COMPUTER OPTIMALLY . . . — BY THE FACTOR ψ = 1/2
SLIDE 35
Which Computer to Speed Up: Multiplicative Speedup At least one computer is not “very fast”:
[Slides 35 to 49: figure sequence under this title and caption; images not included.]
SLIDE 50
Which Computer to Speed Up: Multiplicative Speedup At least one computer is not “very fast”: When all computers are very fast:
[Slides 50 to 54: figure sequence under this title and these captions; images not included.]
SLIDE 55
What Makes Clusters Powerful? Absolute and Relative Answers
SLIDE 56
What Makes Clusters Powerful: Variance in Computer Speeds Say that cluster C1, with profile P1, and cluster C2, with profile P2, share the same mean speed. Theorem. Say that C1 and C2 each has 2 computers. Then C1 outperforms C2 if and only if VAR(P1) > VAR(P2).
SLIDE 57
What Makes Clusters Powerful: Variance in Computer Speeds Say that cluster C1, with profile P1, and cluster C2, with profile P2, share the same mean speed. Say that C1 and C2 each has 2 computers. Then C1 outperforms C2 if and only if VAR(P1) > VAR(P2). Corollary. HETEROGENEITY CAN ACTUALLY LEND POWER TO A CLUSTER . . . if 2-computer clusters C1 and C2 share the same mean speed, and C1 is heterogeneous while C2 is homogeneous, then C1 outperforms C2.
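The 2-computer criterion can be checked numerically. The sketch below makes two assumptions that the slides do not spell out: "mean speed" is taken to be the mean of the ρ-values in a profile, and VAR is taken to be the population variance. The example profiles and all function names are hypothetical. The code only evaluates the condition VAR(P1) > VAR(P2). It does not compute work rates.

```python
from fractions import Fraction


def mean(profile):
    """Mean of the rho-values in a profile."""
    return sum(profile) / len(profile)


def variance(profile):
    """Population variance of the rho-values (an assumed reading of VAR)."""
    m = mean(profile)
    return sum((rho - m) ** 2 for rho in profile) / len(profile)


def predicted_to_outperform(p1, p2):
    """Evaluate the 2-computer criterion: equal means and VAR(p1) > VAR(p2)."""
    if len(p1) != 2 or len(p2) != 2:
        raise ValueError("the criterion is stated only for 2-computer clusters")
    if mean(p1) != mean(p2):
        raise ValueError("the criterion assumes equal mean speeds")
    return variance(p1) > variance(p2)


# Hypothetical profiles with the same mean, 1/2.
heterogeneous = [Fraction(1, 4), Fraction(3, 4)]
homogeneous = [Fraction(1, 2), Fraction(1, 2)]

print(variance(heterogeneous), variance(homogeneous))
print(predicted_to_outperform(heterogeneous, homogeneous))
```

A homogeneous profile always has variance 0, so any heterogeneous 2-computer profile with the same mean satisfies the condition, which is the content of the corollary.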
SLIDE 58
What Makes Clusters Powerful: Variance in Computer Speeds Say that cluster C1, with profile P1, and cluster C2, with profile P2, share the same mean speed. Say that C1 and C2 each has 2 computers. Then C1 outperforms C2 if and only if VAR(P1) > VAR(P2). Unfortunately: THIS RESULT DOES NOT EXTEND TO 3-COMPUTER CLUSTERS
SLIDE 59
What Makes Clusters Powerful: Variance in Computer Speeds Say that cluster C1, with profile P1, and cluster C2, with profile P2, share the same mean speed. Say that C1 and C2 each has 2 computers. Then C1 outperforms C2 if and only if VAR(P1) > VAR(P2). Unfortunately: This result does not extend to 3-computer clusters BUT . . .
SLIDE 60
What Makes Clusters Powerful: Variance in Computer Speeds Say that cluster C1, with profile P1, and cluster C2, with profile P2, share the same mean speed. Theorem. Say that C1 and C2 each has 3 computers. There exists a threshold θ > 0 such that: if VAR(P1) ≥ VAR(P2) + θ then C1 outperforms C2.
SLIDE 61