Static Worksharing Strategies for Heterogeneous Computers with Unrecoverable Failures

  1. Framework | Homogeneous+coms | Heterogeneous | Heterogeneous+coms | Conclusion
     Static Worksharing Strategies for Heterogeneous Computers with Unrecoverable Failures
     Anne Benoit, Yves Robert, Arnold Rosenberg and Frédéric Vivien
     École Normale Supérieure de Lyon, France
     Anne.Benoit@ens-lyon.fr
     http://graal.ens-lyon.fr/~abenoit
     HeteroPar’2009, August 25
     Anne.Benoit@ens-lyon.fr, August 25, 2009, Worksharing with Unrecoverable Interruptions, 1/25

  2. Problem
     - Large divisible computational workload
     - Single-round distribution, one-port model
     - Assemblage of $p$ different-speed computers
     - Unrecoverable interruptions
     - A-priori knowledge of risk (failure probability)
     - Goal: maximize expected amount of work done

  3. Related work
     - Landmark paper by Bhatt, Chung, Leighton & Rosenberg on cycle stealing
     - Hardware failures
       - Fault-tolerant computing (hence scheduling) becomes unavoidable
       - Well, the same story has been told for a very long time!

  8. Cycle-stealing scenario
     - Big job of size $W$ to execute during the weekend
     - Enroll $p$ computers $P_1$ to $P_p$
     - Assign a load fraction to each $P_i$
       - How to compute these load fractions?
       - How to order communications?
     - Risk increases linearly with time
     - Machines reclaimed at 8am on Monday with probability 1

  9. Outline
     1. Technical framework
     2. Homogeneous computers, with communication costs
     3. Heterogeneous computers, no communication costs
     4. Heterogeneous computers, with communication costs
     5. Conclusion

  11. Interruption model

      $$d\Pr = \begin{cases} \kappa \, dt & \text{for } t \in [0, 1/\kappa] \\ 0 & \text{otherwise} \end{cases}$$

      $$\Pr(w) = \min\left\{1, \int_0^w \kappa \, dt\right\} = \min\{1, \kappa w\}$$

      Goal: maximize expected work production
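
The interruption model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the names `interruption_probability` and `expected_chunk_work` are hypothetical, and the second helper assumes that a chunk only counts as done if no interruption has occurred by the time it completes, which is how the two-computer slides write their expectations.

```python
def interruption_probability(w: float, kappa: float) -> float:
    """Pr(w) = min(1, kappa * w): risk grows linearly and reaches 1 at w = 1/kappa."""
    if w < 0 or kappa <= 0:
        raise ValueError("w must be non-negative and kappa positive")
    return min(1.0, kappa * w)


def expected_chunk_work(w: float, kappa: float) -> float:
    """Expected work from one chunk that needs w time units (assumption:
    the chunk is lost entirely if an interruption occurs before it finishes)."""
    return w * (1.0 - interruption_probability(w, kappa))
```

With `kappa = 2.0`, any chunk lasting `0.5` time units or longer is interrupted with probability 1, so its expected contribution under this assumption is zero.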

  13. Rules of the game
      - Single-round, no overlap, one-port communications
      - Homogeneous network
      - Different-speed computers
      - Failure rate per unit-load communication: $z = \kappa / \mathrm{bw}$
      - Failure rate per unit-load computation by computer $P_i$: $x_i = \kappa / \mathrm{speed}_i$
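
As a small sketch of how the two failure rates follow from the platform parameters, the helper below divides the risk rate by the bandwidth and by each computer's speed. The function name `failure_rates` and its argument names are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def failure_rates(kappa: float, bandwidth: float,
                  speeds: Sequence[float]) -> Tuple[float, List[float]]:
    """Return (z, [x_1, ..., x_p]) with z = kappa / bw and x_i = kappa / speed_i."""
    if bandwidth <= 0 or any(s <= 0 for s in speeds):
        raise ValueError("bandwidth and speeds must be positive")
    z = kappa / bandwidth                 # risk per unit of load communicated
    xs = [kappa / s for s in speeds]      # risk per unit of load computed on P_i
    return z, xs
```

A faster computer gets a smaller $x_i$: it accrues less risk per unit of load because it spends less time on it.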

  17. With two computers (1/2)

      [Timeline figure: $P_1$ accrues risk $zY$ (communication), then $x_1 Y$ (computation); $P_2$ accrues risk $z(W - Y)$ (communication), then $x_2 (W - Y)$ (computation)]

      First send $P_1$ a chunk of size $Y$:

      $$E_1 = Y \left(1 - (z + x_1) Y\right)$$

      Then send $P_2$ the remaining load (of size $W - Y$):

      $$E_2 = (W - Y) \left(1 - \left(z W + x_2 (W - Y)\right)\right)$$

      Total expectation: $E(Y) = E_1 + E_2$
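
The two expectations on this slide translate directly into code. The sketch below is only an illustration; the function name `expected_work_two` is hypothetical, and it assumes $W$ is small enough that neither risk term exceeds 1.

```python
def expected_work_two(Y: float, W: float, z: float, x1: float, x2: float) -> float:
    """E(Y) = E1 + E2 for two computers served one after the other."""
    # P1 receives Y first: risk z*Y for the transfer plus x1*Y for the computation.
    e1 = Y * (1.0 - (z + x1) * Y)
    # P2 waits for both transfers (z*W in total), then computes W - Y.
    e2 = (W - Y) * (1.0 - (z * W + x2 * (W - Y)))
    return e1 + e2
```

Note that $P_2$ pays the communication risk for the whole load $W$, because under the one-port model its chunk arrives only after $P_1$'s transfer has finished.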

  18. With two computers (2/2)

      $$E(Y) = Y \left(1 - (z + x_1) Y\right) + (W - Y) \left(1 - \left(z W + x_2 (W - Y)\right)\right)$$

      $$E(Y) = W - (z + x_2) W^2 - (z + x_1 + x_2) Y^2 + (z + 2 x_2) W Y$$

      $$Y^{(opt)} = \frac{z + 2 x_2}{2 (z + x_1 + x_2)} \, W$$

      $$E_{opt}(W, 2) = E\left(Y^{(opt)}\right) = W - \frac{4 x_1 x_2 + 4 (x_1 + x_2) z + 3 z^2}{4 (x_1 + x_2 + z)} \, W^2$$

      Symmetric in $x_1$ and $x_2$ ⇒ ordering of the communications has no impact
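
The closed forms on this slide can be checked numerically. The sketch below is an illustration under the slide's formulas, with hypothetical function names: `optimal_split` evaluates $Y^{(opt)}$ and $E_{opt}(W, 2)$, and `expected_work_two` re-evaluates $E(Y)$ so the two can be compared.

```python
from typing import Tuple


def expected_work_two(Y: float, W: float, z: float, x1: float, x2: float) -> float:
    """E(Y) = Y(1 - (z + x1)Y) + (W - Y)(1 - (zW + x2(W - Y)))."""
    return Y * (1.0 - (z + x1) * Y) + (W - Y) * (1.0 - (z * W + x2 * (W - Y)))


def optimal_split(W: float, z: float, x1: float, x2: float) -> Tuple[float, float]:
    """Return (Y_opt, E_opt) from the closed-form expressions."""
    y_opt = (z + 2.0 * x2) / (2.0 * (z + x1 + x2)) * W
    numerator = 4.0 * x1 * x2 + 4.0 * (x1 + x2) * z + 3.0 * z ** 2
    e_opt = W - numerator / (4.0 * (x1 + x2 + z)) * W ** 2
    return y_opt, e_opt


def grid_maximum(W: float, z: float, x1: float, x2: float, steps: int = 1000) -> float:
    """Brute-force maximum of E(Y) over a uniform grid of Y in [0, W]."""
    return max(expected_work_two(W * k / steps, W, z, x1, x2) for k in range(steps + 1))
```

Swapping `x1` and `x2` changes `y_opt` but leaves `e_opt` unchanged, which is the symmetry the slide points out.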

  21. Extra rule: distribute entire load
      - Total load $W$ small enough so that we distribute it entirely
      - Quite reasonable, but with a dramatic impact on the solution

      Definition. Distrib($p$): compute $E_{opt}(W, p)$, the optimal value of the expected total amount of work done when distributing the entire workload $W \le \dfrac{1}{z + \max(x_i)}$ to the $p$ remote computers.
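
A short sketch of the load bound in this definition follows. Both function names are hypothetical. `expected_work` is an assumption rather than a formula from the slides: it generalizes the two-computer expressions by charging each computer the communication risk for all load sent up to and including its own chunk, plus its own computation risk. Under that reading, the bound $W \le 1/(z + \max(x_i))$ keeps every risk term at or below 1.

```python
from typing import Sequence


def can_distribute_entirely(W: float, z: float, xs: Sequence[float]) -> bool:
    """Check the Distrib(p) precondition W <= 1 / (z + max(x_i))."""
    return W <= 1.0 / (z + max(xs))


def expected_work(chunks: Sequence[float], z: float, xs: Sequence[float]) -> float:
    """Expected work when chunks[i] is sent to computer i, in list order
    (assumed generalization of the two-computer formulas)."""
    sent = 0.0
    total = 0.0
    for w, x in zip(chunks, xs):
        sent += w                            # one-port: transfers are serialized
        risk = min(1.0, z * sent + x * w)    # interruption probability at completion
        total += w * (1.0 - risk)
    return total
```

For two computers, `expected_work([Y, W - Y], z, [x1, x2])` reduces to $E_1 + E_2$ as written on the two-computer slides.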
