
How to Exploit a Heterogeneous Cluster of Computers (Asymptotically) Optimally



  1. How to “Exploit” a Heterogeneous Cluster of Computers (Asymptotically) Optimally
     Arnold L. Rosenberg, Electrical & Computer Engineering, Colorado State University, Fort Collins, CO 80523, USA (rsnbrg@colostate.edu)
     Joint work with Micah Adler and Ying Gong

  2–5. The Computational Environment
     • A “master” computer $C_0$. (This is our computer.)
     • A cluster $C$ of $n$ heterogeneous computers $C_1, C_2, \ldots, C_n$ that are available for dedicated “rental”. (The $C_i$ differ in processor and memory speeds, and may be geographically dispersed.)
     • A large “bag” of (arbitrarily but) equally complex tasks.

  6–7. Two Simple Worksharing Problems
     The Cluster-Exploitation Problem
     • One has access to cluster $C$ for $L$ time units.
     • One wants to accomplish as much work as possible during that time.
     The Cluster-Rental Problem
     • One has $W$ units of work to complete.
     • One wishes to “rent” cluster $C$ for as short a period of time as necessary to complete that work.

  8–10. Our Contributions
     Within HiHCoHP (a heterogeneous, long-message analog of the LogP architectural model), we offer:
     A Generic Worksharing Protocol:
     • works predictably for many variants of our model.
     • determines all work allocations and all communication times.
     An Asymptotically Optimal Worksharing Protocol:
     • solves the Cluster-Exploitation and Cluster-Rental Problems optimally, as long as $L$ is sufficiently long.

  11–13. Our Contributions: Details
     Worksharing protocols:
     • $C_0$ supplies work to each “rented” $C_i$, in some order, in a single message for each $C_i$.
     • $C_i$ does the work and returns its results, in a single message from each $C_i$.
     Asymptotically optimal worksharing protocols:
     • Computers start and finish computing in the same order: first started ⇒ first finished.
     • Optimality is independent of the computers’ starting order, even if each $C_i$ is $10^{10}$ times faster than $C_{i+1}$.

  14. The Model Calibration
     • All units (time and packet size) are calibrated to the slowest computer’s computation rate: this computer does one “unit” of work in one “unit” of time.
     • Each unit of work produces $\delta$ units of results (for simplicity).

  15. Computation Rates
     $\rho_i$ is the per-unit work time for computer $C_i$.
     • $\rho_1 \le \rho_2 \le \cdots \le \rho_n$ (by convention). [The smaller the index, the faster the computer.]
     • $\rho_n = 1$ (by our calibration).
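The calibration and indexing conventions above can be illustrated with a small sketch. The function name `calibrate` and the raw per-task timings are hypothetical, chosen only for illustration; the slides do not specify how rates are measured.

```python
def calibrate(raw_times):
    """Normalize hypothetical raw per-task times to the slowest computer.

    Returns rho_1 <= rho_2 <= ... <= rho_n with rho_n == 1, matching the
    convention that a smaller index means a faster computer.
    """
    slowest = max(raw_times)
    return sorted(t / slowest for t in raw_times)


# Assumed example: three computers taking 2, 8 and 4 seconds per task.
rho = calibrate([2.0, 8.0, 4.0])
print(rho)  # [0.25, 0.5, 1.0]
```

After calibration, one time unit is the time the slowest computer needs for one unit of work.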

  16. The Costs of Communication, 1
     Message Processing time for $C_i$:
     • Transmission setup: $\sigma$ time units per communication
     • Transmission packaging: $\pi_i$ time units per packet
     • Reception unpackaging: $\pi_i$ time units per packet
     Subscripts reflect the computers’ heterogeneity.

  17. The Costs of Communication, 2
     Message Transmission Time:
     • Latency: $\lambda$ time units, for the first packet
     • Bandwidth limitation: $\tau \stackrel{\text{def}}{=} 1/\beta$ time units per packet, for the remaining packets
     • $\beta \stackrel{\text{def}}{=}$ the network’s end-to-end bandwidth
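As a minimal sketch of the transmission-time rule on this slide: the first packet costs the latency $\lambda$, and each remaining packet costs $\tau = 1/\beta$. The function name and the numeric values are assumptions made for illustration.

```python
def transmission_time(packets, lam, beta):
    """Network time for one message of `packets` packets.

    The first packet costs the latency `lam`; each of the remaining
    packets costs tau = 1 / beta, where beta is the end-to-end bandwidth.
    """
    if packets < 1:
        raise ValueError("a message must contain at least one packet")
    tau = 1.0 / beta
    return lam + tau * (packets - 1)


# Assumed values: latency 0.5, bandwidth 4 packets per time unit.
print(transmission_time(9, lam=0.5, beta=4.0))  # 2.5
```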

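  18. The timeline as $C_0$ shares work with $C_i$
     Phases, in order, when $C_0$ sends $w_i$ units of work to $C_i$:
     • $C_0$ prepares work for $C_i$: $\pi_0 w_i$ (in $C_0$)
     • $C_0$, $C_i$ setup: $\sigma$ (in $C_0$, $C_i$ and network)
     • $C_0$ transmits work: $\lambda + \tau (w_i - 1)$ (in network)
     • $C_i$ unpacks work: $\pi_i w_i$ (in $C_i$)
     • $C_i$ does work: $\rho_i w_i$ (in $C_i$)
     • $C_i$ prepares results for $C_0$: $\pi_i \delta w_i$ (in $C_i$)
     • $C_i$, $C_0$ setup: $\sigma$ (in $C_i$, $C_0$ and network)
     • $C_i$ transmits results: $\lambda + \tau (\delta w_i - 1)$ (in network)
     • $C_0$ unpacks results: $\pi_0 \delta w_i$ (in $C_0$)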
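The following sketch adds up one work-sharing round trip, reading the timeline as nine consecutive phases: prepare, setup, transmit, unpack, compute, prepare results, setup, transmit results, unpack results. The class `Params`, the function `round_trip_phases`, and all numeric values are hypothetical, and the phase durations are an interpretation of the slide rather than a verified formula.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Model parameters shared by all computers (names are assumptions)."""
    sigma: float  # setup time per communication
    lam: float    # latency for the first packet
    tau: float    # time per remaining packet (1 / bandwidth)
    delta: float  # units of results produced per unit of work
    pi0: float    # per-packet packaging/unpackaging time of C_0


def round_trip_phases(w, rho_i, pi_i, p):
    """List (phase, duration) pairs for C_0 sharing w units with C_i."""
    return [
        ("C_0 prepares work", p.pi0 * w),
        ("setup", p.sigma),
        ("C_0 transmits work", p.lam + p.tau * (w - 1)),
        ("C_i unpacks work", pi_i * w),
        ("C_i does work", rho_i * w),
        ("C_i prepares results", pi_i * p.delta * w),
        ("setup", p.sigma),
        ("C_i transmits results", p.lam + p.tau * (p.delta * w - 1)),
        ("C_0 unpacks results", p.pi0 * p.delta * w),
    ]


# Assumed parameter values, chosen only to keep the arithmetic simple.
params = Params(sigma=1.0, lam=2.0, tau=0.5, delta=0.5, pi0=0.25)
phases = round_trip_phases(w=8, rho_i=1.0, pi_i=0.5, p=params)
total = sum(duration for _, duration in phases)
print(total)  # 28.0
```

Of the 28 time units in this example, only 8 are computation; the rest is packaging, setup, and transmission.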

  19. A Generic Worksharing Protocol
     Specifying a worksharing protocol:
     • $C_0$ sends work to $C_1, C_2, \ldots, C_n$ in the startup order $C_{s_1}, C_{s_2}, \ldots, C_{s_n}$. (Note the subscript sequence $s_1, s_2, \ldots, s_n$.)
     • $C_1, C_2, \ldots, C_n$ return results to $C_0$ in the finishing order $C_{f_1}, C_{f_2}, \ldots, C_{f_n}$. (Note the subscript sequence $f_1, f_2, \ldots, f_n$.)
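One way to picture this specification is as a pair of index sequences. The sketch below uses hypothetical helper names: it checks that each sequence is a permutation of $1, \ldots, n$, and whether the two orders coincide, which is the "first started, first finished" property the slides attribute to the asymptotically optimal protocols.

```python
def is_order(seq, n):
    """True if seq is a permutation of the computer indices 1..n."""
    return sorted(seq) == list(range(1, n + 1))


def is_fifo(startup, finishing):
    """True if computers finish in the same order in which they started."""
    return list(startup) == list(finishing)


# Assumed example with n = 3 rented computers.
startup = [2, 3, 1]     # s_1, s_2, s_3
finishing = [2, 3, 1]   # f_1, f_2, f_3
print(is_order(startup, 3), is_order(finishing, 3), is_fifo(startup, finishing))
```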

  20. The timeline for three “rented” computers, $C_1, C_2, C_3$
     [Figure: timeline over the lifespan $L$. The row for $C_0$ shows a Prepare phase and a Transmit phase for each of $C_{s_1}, C_{s_2}, C_{s_3}$ in turn. The row for each rented computer shows Receive, Compute, Prepare, and Transmit phases.]
     NOTE: Only one message is in transit at a time.

  21. Some Useful Abbreviations
     Quantity: Meaning
     • $\tilde{\tau} = \tau (1 + \delta)$: 2-way network transmission rate
     • $\tilde{\pi}_i = \pi_i + \pi_i \delta$: $C_i$’s 2-way message-packaging rate (workload + results)
     • $F = (\sigma + \lambda - \tau)$: fixed communication overhead (becomes invisible as $L$ grows)
     • $V_i = \pi_0 + \tilde{\tau} + \tilde{\pi}_i$: $C_i$’s variable communication overhead rate
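The abbreviations can be computed directly from the model parameters. The sketch below also shows, under the simplifying assumption that a computer's communication cost for $w$ units of work looks like $F + V_i w$, why the fixed overhead matters less as allocations grow. Function names and numeric values are hypothetical.

```python
def abbreviations(sigma, lam, tau, delta, pi0, pi_i):
    """Compute the four abbreviations from the model parameters."""
    tau_tilde = tau * (1 + delta)    # 2-way network transmission rate
    pi_tilde = pi_i + pi_i * delta   # C_i's 2-way message-packaging rate
    fixed = sigma + lam - tau        # F: fixed communication overhead
    variable = pi0 + tau_tilde + pi_tilde  # V_i: variable overhead rate
    return {"tau_tilde": tau_tilde, "pi_tilde": pi_tilde,
            "F": fixed, "V_i": variable}


def fixed_share(fixed, variable, w):
    """Fraction of an assumed cost F + V_i * w that is fixed overhead."""
    return fixed / (fixed + variable * w)


# Assumed parameter values, chosen only to keep the arithmetic simple.
abbr = abbreviations(sigma=1.0, lam=2.0, tau=0.5, delta=0.5, pi0=0.25, pi_i=0.5)
print(abbr)
for w in (1, 10, 1000):
    print(w, fixed_share(abbr["F"], abbr["V_i"], w))
```

The printed share of the fixed term shrinks as $w$ increases, which is one way to read the remark that $F$ "becomes invisible" for long lifespans.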
