HPC and I/O Subsystems
Ratan K. Guha
School of Electrical Engineering and Computer Science
University of Central Florida, Orlando, FL 32816

Overview
- My Experience and Current Projects
- Top 10 Supercomputers
- Cluster Computers
- Node
My Experience
- 1990-1997: BBN Butterfly, DEC MPP (NSF Grant)
- 2004: Sun Cluster (ARO Grant W911NF04110100)
Cluster Computing Facilities
- Nodes: 48 (Sun Fire V20z)
- CPU: dual AMD Opteron 242 1.6 GHz processors
- Memory: 2 GB
- Network: Gigabit Ethernet
- Disk: 2 x 36 GB internal
- OS: SunOS 5.9
Ariel
Current Projects
- Composite cathodes for Intermediate Temperature SOFCs: a comprehensive approach to designing materials for superior functionality. PIs: N. Orlovskaya, A. Sleiti, J. Kapat (MMAE), A. Masunov (NSTC), R. Guha (CS) [CPMD, Fluent]. NASA Grant.
- VCluster: A Thread-Based Java Middleware for SMP and Heterogeneous Clusters (Ph.D. dissertation work)
- Parallel Simulation: ARO Grants DAAD19-01-1-0502 and W911NF04110100
Top 5 Supercomputers
1. DOE/NNSA/LLNL, USA: BlueGene/L - eServer Blue Gene Solution, IBM
2. NNSA/Sandia National Laboratories, USA: Red Storm - Sandia/Cray Red Storm, Opteron 2.4 GHz dual core, Cray Inc.
3. IBM Thomas J. Watson Research Center, USA: BGW - eServer Blue Gene Solution, IBM
4. DOE/NNSA/LLNL, USA: ASC Purple - eServer pSeries p5 575 1.9 GHz, IBM
5. Barcelona Supercomputing Center, Spain: MareNostrum - BladeCenter JS21 Cluster, PPC 970 2.3 GHz, Myrinet, IBM
Top 6 -10 Supercomputers
6. NNSA/Sandia National Laboratories, USA: Thunderbird - PowerEdge 1850, 3.6 GHz, InfiniBand, Dell
7. Commissariat a l'Energie Atomique (CEA), France: Tera-10 - NovaScale 5160, Itanium2 1.6 GHz, Quadrics, Bull SA
8. NASA/Ames Research Center/NAS, USA: Columbia - SGI Altix 1.5 GHz, Voltaire InfiniBand, SGI
9. GSIC Center, Tokyo Institute of Technology, Japan: TSUBAME Grid Cluster - Sun Fire x4600 Cluster, Opteron 2.4/2.6 GHz and ClearSpeed Accelerator, InfiniBand, NEC/Sun
10. Oak Ridge National Laboratory, USA: Jaguar - Cray XT3, 2.6 GHz dual core, Cray Inc.
Some Statistics
- 261 systems with Intel processors
- 113 with AMD Opteron family processors
- 93 with IBM Power processors
Cluster Computing
Became popular with the availability of:
- High-performance microprocessors
- High-speed networks
- Distributed computing tools
Provides performance comparable to supercomputers at a much lower price
Cluster Computing
To run a parallel program on a cluster:
- Processes must be created on every machine in the cluster
- Processes must be able to communicate with each other (see the MPI sketch below)
[Diagram: Process 1, Process 2, Process 3, and Process 4 connected over Ethernet]
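As a concrete illustration (not part of the original slides), the minimal C sketch below shows both requirements using MPI, the standard message-passing interface for clusters. It assumes an MPI implementation such as Open MPI or MPICH is installed; the message value and process ranks are just examples.

/*
 * Minimal sketch: one process per cluster node, exchanging a message with MPI.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                  /* join the parallel job            */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id                */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total processes in the cluster   */

    if (rank == 0) {
        int msg = 42;
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* send to process 1 */
    } else if (rank == 1) {
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process %d of %d received %d\n", rank, size, msg);
    }

    MPI_Finalize();
    return 0;
}

Launched with something like "mpirun -np 4 -hostfile nodes ./a.out", the MPI runtime creates one process per listed machine and lets them exchange messages over the cluster network (Gigabit Ethernet, Myrinet, or InfiniBand).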
Communications
- Fiber Channel
- Gigabit Ethernet
- Myrinet
- InfiniBand
Myrinet
- Designed by Myricom
- High-speed LAN used to interconnect machines
- Lightweight protocol (2 Gb/s)
- Low latency for short messages
- Sustained data rate for large messages
Gigabit Ethernet
- Standardized by IEEE 802.3
- Data rates in gigabits per second
- Deployed in high-capacity backbone network links
- High end-to-end throughput, and less expensive than Myrinet
- Four physical layer standards: 1000BASE-SX and 1000BASE-LX over optical fiber, 1000BASE-T over twisted-pair cable, and 1000BASE-CX over balanced copper cable
Fiber Channel
- Gigabit-speed network technology used for storage networking
- Runs on both twisted-pair copper and optical fiber
- Reliable and scalable
- 4 Gb/s bandwidth
- Supports many topologies and protocols
- Efficient
Cons:
- Although initially used for supercomputing, it is now more popular in storage markets
- A growing number of standard definitions is increasing the complexity of the protocol
InfiniBand (IB)
- High-performance, low-latency I/O interconnect architecture for channel-based, switched-fabric servers
- Replacement for the shared PCI bus
- First version released in October 2000 by the InfiniBand Trade Association (IBTA), formed by Compaq, Dell, HP, IBM, Intel, Microsoft, and Sun, which is responsible for compliance and interoperability testing of commercial products
- June 2001: version 1.0a released
Why is it different?
Unlike the present I/O subsystem, IB is a network:
- Uses IPv6-style 128-bit addresses
IB's revolutionary approach:
- Instead of sending data in parallel across the backplane bus (data path), IB uses a serial (bit-at-a-time) link
- Fewer pins save cost and add reliability
- A serial link can multiplex a signal
- Supports multiple memory areas, which can be accessed by processors and storage devices
Advantages of InfiniBand
High performance:
- 20 Gb/s node-to-node
- 60 Gb/s switch-to-switch
- Defined roadmap to 120 Gb/s (fastest specification for any interconnect)
Reduced complexity:
- Multiple I/Os on one cable
- Consolidates clustering transmissions, communications, storage, and management data types over a single connection
Advantages Contd.
Efficient interconnect:
- Communication processing is done in hardware, not by the CPU, so resources at each node remain fully available
- Employs Remote DMA (RDMA), an efficient data-transfer protocol (see the sketch below)
Reliability, stability, scalability:
- Reliable end-to-end data connections
- Virtualization allows multiple applications to run on the same interconnect
- IB fabrics have multiple paths, and a fault is limited to a link
- Can support tens of thousands of nodes in a single subnet
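As an illustrative sketch (not from the slides), the C fragment below shows the first step of RDMA on InfiniBand with the libibverbs API: registering a memory region so a remote adapter can later read or write it directly, without involving the host CPU. It assumes an InfiniBand HCA and libibverbs are present; buffer size and access flags are example choices, and queue-pair setup, key exchange, and error handling are omitted.

/*
 * Hedged sketch: registering a buffer for RDMA with libibverbs.
 */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) { fprintf(stderr, "no IB device\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);   /* first HCA          */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);                /* protection domain  */

    size_t len = 4096;                                     /* example size       */
    void *buf = malloc(len);

    /* Pin and register the buffer; the returned keys (mr->lkey, mr->rkey)
     * let the local and remote adapters access it without CPU involvement. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    printf("registered %zu bytes, rkey=0x%x\n", len, mr->rkey);

    /* A queue pair would be created next and the rkey exchanged with the
     * peer so it can post RDMA READ/WRITE work requests against this region. */

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    free(buf);
    return 0;
}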
Integrating into a data center
Connecting Fiber Channel storage fabrics to an IB infrastructure:
Bridges:
- Somewhat costly
- Create a bottleneck that gates Fiber Channel access to speeds below what the array can typically deliver
Native interconnects:
- More cost-effective, easier-to-manage solution
- Integrate the arrays directly into the IB fabric
InfiniBand and Gigabit Ethernet?
- IB is complementary to GE or Fiber Channel; the cost of FC is quite high
- GE and Fiber Channel are expected to connect into the IB fabric to access IB-enabled compute resources
- Helps IT managers better balance I/O and processing resources within an IB fabric
- Allows applications to use IB's RDMA to fetch data, compute, and put intermediate results, which is good for HPC
Current and Future Trends
- HPC and I/O subsystem communication will become faster and easier to manage
- HPC scientific applications will continue to grow
- New multidisciplinary work will increase
- Financial businesses will use HPC systems
References
- http://www.infinibandta.org
- http://www.cray.com
- http://www.myri.com
- http://www.fibrechannel.org/
- Jens Mache, "An Assessment of Gigabit Ethernet as Cluster Interconnect," IWCC, p. 36, 1999.
- http://www.supercomp.org/sc2002/paperpdfs/pap.pap207.pdf
- http://compnetworking.about.com/cs/clustering/g/bldef_infiniban.htm
- Dave Ellis, "InfiniBand Today," http://www.wwpi.com/index.php?option=com_content&task=view&id=1163&Itemid=44
- http://www.mellanox.com/pdf/presentations/Top500_Nov_06.pdf
- http://www.mellanox.com/applications/top_500.php
- "Fiber Channel vs. InfiniBand vs. Ethernet," http://www.processor.com/editorial/article.asp?article=articles%2Fp2911%2F31p11%2F31p11%2Easp&guid=934C81176D3D40969DF5ABA3E28DC8CF&searchtype=&WordList=&bJumpTo=True