Sparse Flat Neighborhood Networks (SFNNs): Scalable Guaranteed Pairwise Bandwidth & Unit Latency



  1. Sparse Flat Neighborhood Networks (SFNNs): Scalable Guaranteed Pairwise Bandwidth & Unit Latency
     Timothy I. Mattox, Henry G. Dietz, & William R. Dieter
     Electrical and Computer Engineering Department, University of Kentucky, Lexington, KY 40506-0046
     tmattox@engr.uky.edu, hankd@engr.uky.edu, dieter@engr.uky.edu

  2. Flat Neighborhood Networks
     • Single switch-hop = low latency
     • No shared links = guaranteed bandwidth
     • Multiple Network Interfaces (NIs) per node
     • Can be lower cost and faster than a fat-tree
     • Designed by GA (genetic algorithm)
     • Incorporate program requirements
     • Search includes asymmetric designs

  3. Example Universal FNN
     [Figure: eight PEs (A to H), each with three NIs, wired to six switches (0 to 5)]
     PE   Switches
     A    0 1 2
     B    0 1 2
     C    0 3 4
     D    0 3 4
     E    1 3 5
     F    1 3 5
     G    2 4 5
     H    2 4 5
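The defining property of a universal FNN (every PE pair shares at least one switch) can be checked mechanically for this example. The sketch below assumes the figure's labels read as PE A to H, each followed by the three switch numbers its NIs plug into; the function names are illustrative and not from the slides.

```python
from itertools import combinations

# PE -> set of switches its three NIs connect to (as read from the figure).
wiring = {
    "A": {0, 1, 2}, "B": {0, 1, 2},
    "C": {0, 3, 4}, "D": {0, 3, 4},
    "E": {1, 3, 5}, "F": {1, 3, 5},
    "G": {2, 4, 5}, "H": {2, 4, 5},
}

def uncovered_pairs(wiring):
    """Return the PE pairs that share no switch (empty for a universal FNN)."""
    return [(a, b) for a, b in combinations(sorted(wiring), 2)
            if not (wiring[a] & wiring[b])]

def ports_used(wiring):
    """Count how many PEs attach to each switch."""
    counts = {}
    for switches in wiring.values():
        for s in switches:
            counts[s] = counts.get(s, 0) + 1
    return counts

print(uncovered_pairs(wiring))  # []
print(ports_used(wiring))
```

Under this reading, every one of the 28 PE pairs has a common switch, and each of the six switches needs only four ports.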

  4. KLAT2's Universal FNN
     [Figure: wiring diagram, PE00 through PE63]
     Specification:
     • All PE pairs want low latency
     • 3D torus 8x4x2 ±1 offsets want extra bandwidth
     Solution:
     • All PE pairs have unit latency
     • 3D torus 8x4x2 ±1 offsets: 153 of 160 pairs have at least two units of bandwidth
     • 4 NIs per PE
     • 9 switches (32 ports each)
     Note: KLAT2 was the first supercomputer under $1000/GFLOPS, 2000
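The figure of 160 pairs is the number of distinct ±1-offset neighbor pairs in an 8x4x2 torus. A minimal sketch of that count follows; the function name is an assumption for illustration, and wraparound in the size-2 dimension is treated as a single link because +1 and -1 reach the same PE.

```python
from itertools import product

def torus_neighbor_pairs(dims):
    """Unordered PE pairs differing by +/-1 (with wraparound) in one dimension."""
    pairs = set()
    for coord in product(*(range(d) for d in dims)):
        for axis, size in enumerate(dims):
            for step in (1, -1):
                other = list(coord)
                other[axis] = (coord[axis] + step) % size
                other = tuple(other)
                if other != coord:
                    # frozenset makes (a, b) and (b, a) the same pair
                    pairs.add(frozenset((coord, other)))
    return pairs

print(len(torus_neighbor_pairs((8, 4, 2))))  # 160
```

The total splits as 64 pairs along the size-8 dimension, 64 along the size-4 dimension, and 32 along the size-2 dimension.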

  5. Universal FNN?
     • All node pairs are equidistant
     • All permutations pass in one hop
     • Many non-permutations also are single-hop
     • A PE's list of neighbors contains every other PE
     • Scalability is only ~2-5x of a single switch
     Relax these to get better scaling...
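One way to see why a universal FNN cannot grow far beyond a single switch is a simple counting bound. This argument is not on the slides: each NI sits on one switch and so reaches at most (ports - 1) other PEs, which means a PE with k NIs has at most k * (ports - 1) neighbors, while a universal FNN needs N - 1 of them. The function name below is hypothetical, and the bound is only a necessary condition; it does not say a wiring of that size exists.

```python
def max_universal_fnn_pes(nis_per_pe, ports_per_switch):
    """Upper bound on PE count for a universal FNN (necessary condition only).

    Each NI shares its switch with at most ports_per_switch - 1 other PEs,
    and every other PE must appear among those neighbors.
    """
    return nis_per_pe * (ports_per_switch - 1) + 1

# 3 NIs on 4-port switches, as in the 8-PE example
print(max_universal_fnn_pes(3, 4))    # 10
# 4 NIs on 32-port switches, as in KLAT2 (64 PEs)
print(max_universal_fnn_pes(4, 32))   # 125
```

Since the bound grows only with the number of NIs per PE, the system size stays within a small multiple of the switch width, which is in line with the ~2-5x figure above.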

  6. Sparse FNN Idea
     • Select a target suite of parallel programs
     • Find the set of communication patterns used
     • Take the union of the important patterns
     • Construct a desired neighbor list for each PE
     • Satisfy the FNN property for these neighbors
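The "union of patterns" and "neighbor list" steps can be sketched as follows. This is an illustration only, assuming a power-of-two PE count and three hand-picked patterns (1D torus ±1, hypercube, bit-reversal); the function names are hypothetical, and the search that actually assigns NIs to switches (the genetic algorithm) is not shown.

```python
def torus_1d(n):
    """Pairs at +/-1 offset on a ring of n PEs."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}

def hypercube(n):
    """Pairs whose PE numbers differ in exactly one bit (n a power of two)."""
    bits = n.bit_length() - 1
    return {frozenset((i, i ^ (1 << b))) for i in range(n) for b in range(bits)}

def bit_reversal(n):
    """Pairs (i, reversed-bits(i)), skipping PEs that map to themselves."""
    bits = n.bit_length() - 1
    pairs = set()
    for i in range(n):
        j = int(format(i, f"0{bits}b")[::-1], 2)
        if i != j:
            pairs.add(frozenset((i, j)))
    return pairs

def neighbor_lists(n, patterns):
    """Union the requested pairs, then list each PE's desired neighbors."""
    wanted = set().union(*patterns)
    lists = {pe: set() for pe in range(n)}
    for pair in wanted:
        a, b = tuple(pair)
        lists[a].add(b)
        lists[b].add(a)
    return lists

n = 128
lists = neighbor_lists(n, [torus_1d(n), hypercube(n), bit_reversal(n)])
print(max(len(v) for v in lists.values()), "neighbors at most, out of", n - 1)
```

Pairs requested by more than one pattern are stored once in the union, so overlapping patterns cost nothing extra in the neighbor lists.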

  7. Communication Patterns
     • O(1): shuffle, bit-reversal, mesh/tori neighbor (±1 offsets in 2D and 3D)
     • O(log(N)): hypercube, reductions (± power-of-2 offsets in any dimension)
     • O(N^(1/D)): scatter, gather, all-to-all, i.e., not permutations
     • Overlap between patterns: pair synergy

  8. Sparse FNN Properties:
     • Single switch latency for chosen patterns
     • Full bisection bandwidth for chosen patterns
     • Scales better than Universal FNNs
     • Neighbor lists scale as ~O(log(N)) vs. O(N)
     • Lower cost (uses narrower switches)
     • Design solutions found for over 10K PEs
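The O(log(N)) versus O(N) contrast in neighbor-list size can be made concrete with one pattern. The sketch below assumes a 1D torus with ± power-of-2 offsets as the only requested pattern and a power-of-two PE count; the function name is illustrative.

```python
def power_of_two_offset_neighbors(pe, n):
    """Neighbors of `pe` at +/-2**k offsets on a 1D torus of n PEs."""
    neighbors = set()
    k = 0
    while (1 << k) < n:
        neighbors.add((pe + (1 << k)) % n)
        neighbors.add((pe - (1 << k)) % n)
        k += 1
    neighbors.discard(pe)
    return neighbors

# Sparse neighbor-list size next to the universal requirement of N - 1.
for n in (64, 128, 1024, 8192):
    sparse = len(power_of_two_offset_neighbors(0, n))
    print(n, sparse, n - 1)
```

For N = 1024 this pattern asks for 19 neighbors per PE (the +512 and -512 offsets coincide), compared with 1023 for a universal FNN.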

  9. KASY0's Sparse FNN
     [Figure: wiring diagram, PE000 through PE127]
     Specification:
     • 1D torus 128, ±1 offsets
     • 2D torus 16x8, all in same dim.
     • 3D torus 8x4x4, all in same dim.
     • bit-reversal
     • 7D hypercube
     Solution:
     • All requested PE pairs have unit latency
     • All requested PE pairs have at least 1 unit of bandwidth
     • 3 NIs per PE
     • 17 switches (24 ports each)
     Note: KASY0 was the first supercomputer under $100/GFLOPS, 2003

  10. 1024-PE Sparse FNN Example
     [Figure: wiring diagram, with PE000 and PE383 labeled]
     Specification:
     • 1D torus, ±2^k offsets
     • 2D tori, ±2^k offsets
     • 3D tori, ±2^k offsets
     • shuffle
     • bit-reversal
     • 10D hypercube
     • 2D transpose
     Solution:
     • All requested PE pairs have unit latency
     • All requested PE pairs have at least 1 unit of bandwidth
     • 2-6 NIs per PE
     • 101 switches (48 ports each)
     523,776 possible PE pairs:
     • 2.78% requested
     • 19.6% covered in this solution
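The pair statistics can be put in absolute terms with a short calculation. The number of possible PE pairs is N(N-1)/2, which is 523,776 for N = 1024. The sketch below reads "covered" as the share of all PE pairs that end up sharing a switch; that reading is an assumption, and the rounded counts are derived from the slide's percentages rather than taken from it.

```python
from math import comb

n_pes = 1024
possible = comb(n_pes, 2)              # N * (N - 1) / 2 unordered PE pairs

# Percentages quoted on the slide; the absolute counts are approximations.
requested = round(possible * 0.0278)
covered = round(possible * 0.196)

print(possible)    # 523776
print(requested, covered)
```

On that reading, the solution connects roughly seven times as many pairs in a single hop as the specification asked for.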

  11. FNN Runtime Support
     • Support coming to the Warewulf cluster toolset
     • Modified Linux 2.4 & 2.6 Bonding Driver
     • Run any IP layer software, unmodified
     • Compressed routing table (4KB for 1024 PEs)
     • MAC addresses locally administered by driver

  12. Conclusion: Sparse FNNs
     • Give more control over cost/performance trade-offs
     • Take account of what the parallel programs actually need
     • Can achieve single-switch latency for very large systems
     • Can guarantee pairwise bandwidth for very large systems
