Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks


  1. Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks
     T. Hoefler, A. Lichei, W. Rehm
     Open Systems Lab, Indiana University, Bloomington, USA
     Computer Architecture Group, Technical University of Chemnitz, Chemnitz, Germany
     IPDPS’07 - PMEO-PDS’07 Workshop, Long Beach, CA, USA, 30th March 2007
     Footer: T. Hoefler, A. Lichei, W. Rehm, LogGP Parameter Assessment

  2. Introduction
     Network performance prediction is important:
     - assess the runtime of parallel algorithms
     - optimize communication patterns (e.g., collective)
     - runtime message scheduling (heterogeneous interfaces)
     Our approach: We propose a contention-free LogGP parameter assessment method to be used in (changing) runtime environments.

  4. The LogGP Model
     [Figure: timing diagram of a message transfer between sender and receiver over time, at the CPU level and the network level, annotated with $o_s$, $o_r$, $L$, $g$, and $G$.]

  5. The Tools to Measure
     No central clock → measurements on one host only.
     [Figure: client/server timing diagrams of the measurement message exchanges, annotated with $o$, $L$, and $g$.]

  6. Related Work
     Culler et al. / Iannello et al.:
     - differentiate between $o_s$ and $o_r$
     - $o_s$: issue a small number ($n$) of sends and divide by $n$
     - $o_r$: delay between messages, larger than the RTT; subtract $o_s$
     - $g$: flood the network
     - $L$: $\mathrm{RTT}/2 - o_r - o_s$ (third-order errors)
     Kielmann et al.:
     - change the model to pLogP
     - $o_s$: time for a single send
     - $o_r$: time to copy the message from the receive buffer
     - $g$: flood the network (if accurate)
     - $L$: $(\mathrm{RTT}(0) - 2g(0))/2$ (second-order errors)

  8. Related Work
     Bell et al.:
     - differentiate between $o_s$ and $o_r$
     - $o_s$: uses a delay $d$ between message sends; adjust the delay until $d + o_s = g + (s-1)G$ (multiple measurements) $\Rightarrow o_s = g + (s-1)G - d$ (second-order errors)
     - $o_r$: similar to Culler et al.
     - $g$: flood the network (similar to Kielmann et al.)
     - $L$: not measured (network effects)
     - EEL: end-to-end latency (RTT)

  9. Our Approach
     Definition of $\mathrm{PRTT}(n, d, s)$, the parametrized round-trip time:
     - $n$: number of successive messages
     - $d$: delay between messages
     - $s$: message size
     $\mathrm{PRTT}(n, d, s)$ and LogGP:
     - $\mathrm{PRTT}(1, 0, s) = 2 \cdot (L + 2o + (s-1)G)$
     - $G_{all} = g + (s-1)G$
     - $\mathrm{PRTT}(n, 0, s) = 2 \cdot (L + 2o + (s-1)G) + (n-1) \cdot G_{all}$
     - $\mathrm{PRTT}(n, 0, s) = \mathrm{PRTT}(1, 0, s) + (n-1) \cdot G_{all}$
     - $\mathrm{PRTT}(n, d, s) = \mathrm{PRTT}(1, 0, s) + (n-1) \cdot \max\{o + d, G_{all}\}$
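
The PRTT expressions on this slide can be evaluated directly. The following is a minimal sketch of that model, not the authors' implementation; the function name `prtt_model` and its argument order are illustrative assumptions, and all times are assumed to be in the same unit (for example microseconds).

```python
def prtt_model(n, d, s, L, o, g, G):
    """Predicted parametrized round-trip time PRTT(n, d, s) under LogGP.

    n: number of successive messages, d: delay between messages,
    s: message size in bytes; L, o, g, G are the LogGP parameters.
    """
    if n < 1 or s < 1:
        raise ValueError("n and s must be at least 1")
    g_all = g + (s - 1) * G                      # G_all = g + (s - 1) G
    prtt_single = 2 * (L + 2 * o + (s - 1) * G)  # PRTT(1, 0, s)
    # Each further message costs the larger of (o + d) and G_all.
    return prtt_single + (n - 1) * max(o + d, g_all)
```

For `n = 1` the second term vanishes and the function returns PRTT(1, 0, s); for `d = 0` it only reproduces the slide's PRTT(n, 0, s) when `o` does not exceed `G_all`, which is the case the slide's formula covers.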

  11. Assessment of the Overhead
      Assessing $o$:
      - $\frac{\mathrm{PRTT}(n, d, s) - \mathrm{PRTT}(1, 0, s)}{n-1} = \max\{o + d, G_{all}\}$
      - we chose $d > G_{all}$: $\frac{\mathrm{PRTT}(n, d, s) - \mathrm{PRTT}(1, 0, s)}{n-1} = o + d$
      - we chose $d = \mathrm{PRTT}(1, 0, s)$ (fall back to $d = \mathrm{PRTT}(2, 0, s)$ for high gaps)
      [Figure: client/server timing diagram of PRTT(n,d,s), with labels PRTT(1,0,s), d, 2d+2o, o, L, and (s−1)G.]
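
Solving the second relation on this slide for $o$ gives $o = \frac{\mathrm{PRTT}(n, d, s) - \mathrm{PRTT}(1, 0, s)}{n-1} - d$. A minimal sketch of that arithmetic follows; the function name `estimate_overhead` is an assumption, and the numbers in the example are synthetic, not measurements from the talk.

```python
def estimate_overhead(prtt_n, prtt_1, n, d):
    """Estimate o from PRTT(n, d, s) and PRTT(1, 0, s).

    Only valid when d > G_all, so that max{o + d, G_all} = o + d.
    """
    if n < 2:
        raise ValueError("need n >= 2 messages to isolate o")
    return (prtt_n - prtt_1) / (n - 1) - d


# Synthetic example: pretend o = 1.0 and PRTT(1, 0, s) = 14.0,
# and choose d = PRTT(1, 0, s) as suggested on the slide.
prtt_1 = 14.0
d = prtt_1
prtt_n = prtt_1 + (3 - 1) * (1.0 + d)
print(estimate_overhead(prtt_n, prtt_1, n=3, d=d))
```

Choosing `d` equal to a full round-trip time is what makes the `d > G_all` condition safe in most cases; the caller remains responsible for checking it.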

  12. Assessment of the Gaps
      Assessing $g$ and $G$:
      - $G \cdot (s-1) + g = \frac{\mathrm{PRTT}(n, 0, s) - \mathrm{PRTT}(1, 0, s)}{n-1}$
      - polynomial of degree 1 (linear function) $\Rightarrow$ measure $\mathrm{PRTT}(n, 0, s)$ and $\mathrm{PRTT}(1, 0, s)$ for different $s$
      - more measurement values increase accuracy ($\Rightarrow$ least-squares linear fit)
      Detecting Protocol Changes:
      - communication subsystems use data-size dependent protocols
      - different parameters
      - autodetection possible
      - changes in the mean least squares deviation
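
The linear fit and the protocol-change detection described on this slide can be sketched as below. This is one possible reading, not the method implemented in Netgauge: the function names, the `factor`, `floor`, and `min_points` parameters, and the rule "report a change when the mean squared deviation jumps by more than `factor`" are all assumptions made for illustration.

```python
def fit_gap_parameters(sizes, gaps):
    """Least-squares fit of gaps ~ g + (s - 1) * G.

    sizes: message sizes s in bytes
    gaps:  (PRTT(n,0,s) - PRTT(1,0,s)) / (n - 1) for each size
    Returns (g, G, mean squared deviation of the fit).
    """
    if len(sizes) != len(gaps) or len(sizes) < 2:
        raise ValueError("need at least two (size, gap) pairs")
    xs = [s - 1 for s in sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(gaps) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct message sizes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, gaps))
    G = sxy / sxx
    g = mean_y - G * mean_x
    msd = sum((y - (g + G * x)) ** 2 for x, y in zip(xs, gaps)) / count
    return g, G, msd


def find_protocol_change(sizes, gaps, factor=10.0, floor=1e-6, min_points=3):
    """Index of the first measurement that no longer fits the current line.

    Grows the fitted range one point at a time and reports the point whose
    inclusion makes the mean squared deviation jump by more than `factor`
    (hypothetical threshold). Returns None if no such jump occurs.
    """
    for k in range(min_points, len(sizes)):
        msd_before = fit_gap_parameters(sizes[:k], gaps[:k])[2]
        msd_after = fit_gap_parameters(sizes[:k + 1], gaps[:k + 1])[2]
        if msd_after > factor * max(msd_before, floor):
            return k
    return None
```

Sizes are expected in increasing order. Once a change is reported at index `k`, fitting `sizes[:k]` and `sizes[k:]` separately would give one `(g, G)` pair per protocol range; `floor` keeps a near-perfect fit from triggering on rounding noise and would need tuning for real measurements.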

  14. Assessment of the Gaps - Example Fit
      [Figure: (PRTT(n,0,s)-PRTT(1,0,s))/(n-1) and its linear fit (g=4.408, G=0.00099); x-axis: Datasize in bytes (s), 0 to 10000; y-axis: Time in microseconds.]

  15. Results
      Netgauge:
      - support for multiple low-level interfaces, e.g., TCP, UDP, IB, GM, SCI, MPI
      - ⇒ low-level and MPI measurements (lib overhead)
      - different communication patterns (different measurements)
      - implemented new pattern: loggp
      MPI Benchmark Environment:
      - MPICH2 1.0.3 for Gigabit Ethernet
      - NMPI for SCI
      - Open MPI 1.1 for InfiniBand™/OFED
      - Open MPI 1.1 for Myrinet/GM

  17. TCP/IP over Gigabit Ethernet
      [Figure: curves "MPICH2 - G*s+g", "MPICH2 - o", "TCP - G*s+g", and "TCP - o"; x-axis: Datasize in bytes (s), 0 to 60000; y-axis: Time in microseconds, 0 to 600.]

  18. InfiniBand™/OFED
      [Figure: curves "Open MPI - G*s+g", "Open MPI - o", "OpenIB - G*s+g", and "OpenIB - o"; x-axis: Datasize in bytes (s), 0 to 60000; y-axis: Time in microseconds, 0 to 90.]

  19. Myrinet/GM
      [Figure: curves "Open MPI - G*s+g", "Open MPI - o", "Myrinet/GM - G*s+g", and "Myrinet/GM - o"; x-axis: Datasize in bytes (s), 0 to 60000; y-axis: Time in microseconds, 0 to 350.]

  20. SCI (MPI only)
      [Figure: curves "NMPI - G*s+g" and "NMPI - o"; x-axis: Datasize in bytes (s), 0 to 60000; y-axis: Time in microseconds, 0 to 250.]

  21. Numeric Results
      Transport  Protocol interval  L (µs)  o(1) (µs)  g (µs)  G (µs/byte)
      TCP        1 ≤ s              45.74   3.46       0.915   0.00849
      SCI        1 ≤ s < 12289      5.48    6.10       7.78    0.0045
      SCI        12289 ≤ s          5.48    6.10       13.34   0.0037
      OFED       1 ≤ s < 12289      5.96    4.72       5.14    0.00073
      OFED       12289 ≤ s          5.96    4.72       21.39   0.00103
      GM         1 ≤ s < 32769      10.53   1.27       9.44    0.0092
      GM         32769 ≤ s          10.53   1.27       52.01   0.0042
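
As a worked example of reading these numbers, the sketch below looks up the protocol interval for a message size and evaluates $G_{all} = g + (s-1)G$ with the listed $g$ and $G$. The dictionary layout and the function name `gap_per_message` are assumptions for illustration; only the numeric values come from the slide.

```python
# (lower bound of s in bytes, g in microseconds, G in microseconds per byte),
# copied from the numeric results; intervals are listed in increasing order.
GAP_PARAMETERS = {
    "TCP": [(1, 0.915, 0.00849)],
    "SCI": [(1, 7.78, 0.0045), (12289, 13.34, 0.0037)],
    "OFED": [(1, 5.14, 0.00073), (12289, 21.39, 0.00103)],
    "GM": [(1, 9.44, 0.0092), (32769, 52.01, 0.0042)],
}


def gap_per_message(transport, s):
    """G_all = g + (s - 1) * G for the protocol interval that contains s."""
    if s < 1:
        raise ValueError("message size must be at least 1 byte")
    g, G = None, None
    for lower, g_i, G_i in GAP_PARAMETERS[transport]:
        if s >= lower:
            g, G = g_i, G_i  # keep the last interval whose lower bound fits
    return g + (s - 1) * G


# Just below and at the OFED protocol switch point.
print(gap_per_message("OFED", 12288))
print(gap_per_message("OFED", 12289))
```

The two printed values differ noticeably even though the sizes differ by one byte, which is the jump at a protocol change that the fit deviation is meant to reveal.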

  22. Conclusions and Future Work
      Conclusions:
      - new measurement technique for LogGP parameters
      - congestion-free
      - low overhead
      - accurate (no higher-order errors, multiple datapoints)
      - detects protocol changes automatically
      Future Work:
      - apply scheme to heterogeneous NIC scheduling
      - analyze communication schemes
      - measure non-blocking MPI communication
