Specializing General-Purpose Computing A New Approach to Designing - - PowerPoint PPT Presentation



SLIDE 1

Specializing General-Purpose Computing

A New Approach to Designing Clusters for High-Performance Technical Computing

Win Treese
SiCortex, Inc.

SLIDE 2

What the heck does that mean?

  • High-performance computing often uses specialized hardware
      • Supercomputers
      • Experiments with graphics processors
  • General-purpose computing doesn’t optimize for technical computing
SLIDE 3

With some problems...

  • Supercomputers
      • Expensive
      • Not on the same technology curve
      • Different programming environment
  • General-purpose computing
      • Amazing technology curve
      • Optimized for desktop and enterprise applications

SLIDE 4

A Challenge: The Best of Both

  • Use general-purpose hardware components
  • With a standard programming environment
  • And SYSTEM DESIGN for technical computing

SLIDE 5

The Roadmap

  • A bit of history
  • A bit about high-performance technical computing (aka “HPTC”)
  • Linux clusters for HPTC
  • Designing a new system for HPTC
  • What we are building

SLIDE 6

A Bit of History

The SUPERCOMPUTER

SLIDE 7

But all is not well in supercomputer land...

  • You have to pay a lot for them
  • You have to write your programs differently
  • You have to find some high priests to take care of them
  • Supercomputer companies don’t make money

SLIDE 8

...so let’s use lots of little computers

  • PCs are cheap
  • Linux is free
  • Commodity interconnect (Ethernet) is cheap
  • The (Beowulf) Cluster is born

SLIDE 9

A Small Visualization Cluster

SLIDE 10

Some characteristics of high-performance technical computing

SLIDE 11

Some typical applications

  • Climate and weather models
  • Geophysics
  • Complex financial modeling
  • Mechanical design
  • Finite element analysis
  • Fluid dynamics
  • Life sciences analysis and simulation
  • Top-secret stuff
  • ...and many others

SLIDE 12

What are they like?

  • Can run for weeks
  • Consume all the cycles you can afford
  • Not very cache-friendly
  • Parallelism often demands good communications
  • Large data sets (input and output)
  • Many are in Fortran!
  • ...but also in C, C++, Java, Perl, Python, etc.

SLIDE 13

The Market for HPTC

  • HPTC is now mainstream computing!
  • Over $6 billion in Linux cluster hardware sales in 2006
  • Petascale computing is hot for research, but there is a real market now for teraflops

SLIDE 14

Linux Clusters and High-Performance Technical Computing

SLIDE 15

So clusters are great, right?

  • Cheap, because they use cheap PCs
  • Expandable
  • Easy to get started
  • Software is free
  • They ride the desktop/server technology curve
  • Interconnect (Ethernet) is cheap
  • Emerging de facto standards
      • Linux
      • Message Passing Interface (MPI)
      • C, Fortran, etc.

SLIDE 16

...but not perfect

  • Computational efficiency is often low
  • Use lots of power
  • Generate lots of heat
  • Many parts to fail
  • ...with a desktop MTBF design
  • Interconnect is slow: XXX microseconds for MPI on Ethernet
  • ...or expensive: using InfiniBand can increase the price of a node by 50%
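
The "many parts to fail" point can be made concrete with a standard reliability estimate. The sketch below assumes independent nodes with constant failure rates, so that the rates add up. The function name and the 50,000-hour per-node MTBF are hypothetical values chosen for illustration; they are not figures from the talk.

```python
def system_mtbf_hours(node_mtbf_hours: float, node_count: int) -> float:
    """Estimate the MTBF of a system that is hit whenever any one node fails.

    Assumes independent nodes with constant (exponential) failure rates,
    so failure rates add and the system MTBF is node MTBF divided by N.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_mtbf_hours / node_count


if __name__ == "__main__":
    NODE_MTBF_HOURS = 50_000.0  # hypothetical per-node figure, illustration only
    for nodes in (1, 10, 100, 1000):
        hours = system_mtbf_hours(NODE_MTBF_HOURS, nodes)
        print(f"{nodes:5d} nodes: one failure about every {hours:,.0f} hours")
```

Under these assumptions the time between failures shrinks in direct proportion to the node count, which is why a reliability design meant for a single desktop becomes a problem at cluster scale.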

SLIDE 17

And software rules!

  • Software investment is the significant cost
  • Replace the cluster, but keep the software
  • What if we redesign the system with the same programming interface?

SLIDE 18

Designing a New System for High-Performance Technical Computing

SLIDE 19

A Design Challenge

  • 1000 nodes in this box
  • ...all running Linux
  • Near-microsecond MPI latency
  • Air-cooled

Box dimensions (figure labels): 5' × 5' × 6'

SLIDE 20

The logic of low power

  • Low power ⇒ less heat
  • Less heat ⇒ parts closer together
  • Parts closer together ⇒ shorter wires ⇒ easier high-performance interconnect
  • Less heat ⇒ greater reliability
  • Burn less power waiting for memory

SLIDE 21

The SC5832

  • 5832 Gigaflops
  • 7776 Gigabytes ECC memory
  • 972 six-core 64-bit nodes
  • 2916 fabric links at 2 GByte/s each
  • About 1 microsecond MPI latency
  • 108 eight-lane PCI-Express
  • 18 kW
  • 1 cabinet, 5' × 5' × 6'
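
The headline numbers for the SC5832 are easier to read per node. The sketch below only divides the figures quoted on this slide, plus the 27-node module size given on a later slide. The class and function names are made up for this example, and the watts-per-node value is simply total cabinet power divided by node count, not a measured chip figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemSpec:
    """Headline figures as quoted on the slide."""
    name: str
    gigaflops: int
    memory_gbytes: int
    nodes: int
    cores_per_node: int
    fabric_links: int
    power_watts: int


SC5832 = SystemSpec("SC5832", gigaflops=5832, memory_gbytes=7776, nodes=972,
                    cores_per_node=6, fabric_links=2916, power_watts=18_000)

NODES_PER_MODULE = 27  # from the "27-Node Module" slide


def per_node_figures(spec: SystemSpec) -> dict:
    """Derive per-core and per-node values by plain division."""
    cores = spec.nodes * spec.cores_per_node
    return {
        "cores": cores,
        "gigaflops_per_core": spec.gigaflops / cores,
        "gbytes_per_node": spec.memory_gbytes / spec.nodes,
        "links_per_node": spec.fabric_links / spec.nodes,
        "watts_per_node": spec.power_watts / spec.nodes,
        "modules": spec.nodes / NODES_PER_MODULE,
    }


if __name__ == "__main__":
    for key, value in per_node_figures(SC5832).items():
        print(f"{key:20s} {value:g}")
```

The division works out to 5832 cores at 1 Gigaflop each, 8 Gigabytes and 3 fabric links per node, 36 modules, and roughly 18.5 W of cabinet power per node.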

SLIDE 22
SLIDE 23
SLIDE 24

The SC648

  • 648 Gigaflops
  • 864 Gigabytes ECC RAM
  • 108 six-core 64-bit nodes
  • 324 fabric links at 2 GByte/s each
  • About 1 microsecond MPI latency
  • 12 eight-lane PCI-Express
  • 2 kW
  • Half of a standard 19” rack
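
A quick way to compare the two machines is peak Gigaflops per watt, using only the figures quoted on the SC5832 and SC648 slides. This is a rough sketch: the quoted power values look rounded, the Gigaflops are peak numbers, and the dictionary and function names are invented for the example.

```python
# (peak Gigaflops, power in watts) as quoted on the two system slides
SYSTEMS = {
    "SC5832": (5832, 18_000),
    "SC648": (648, 2_000),
}


def gigaflops_per_watt(gigaflops: float, watts: float) -> float:
    """Peak Gigaflops delivered per watt of quoted system power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gigaflops / watts


if __name__ == "__main__":
    for name, (gflops, watts) in SYSTEMS.items():
        print(f"{name}: {gigaflops_per_watt(gflops, watts):.3f} Gigaflops/W")
```

Both systems come out at 0.324 peak Gigaflops per watt, and every quoted count for the SC5832 is nine times the SC648 value, which is consistent with the smaller machine being the same design at one-ninth the scale.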

SLIDE 25

Software

  • It’s just Linux
  • gcc
  • MPI
  • etc.
  • ...even Emacs!
  • All open source

SLIDE 26

Interconnect fabric

  • Log diameter
  • Multiple paths
  • Cost-effective
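
The slide does not name the fabric topology. As an illustration of what "log diameter" means, the sketch below builds a Kautz digraph, a topology assumed here because out-degree 3 with word length 6 reproduces the 972 nodes and 2916 links quoted for the SC5832. The function names are made up for this example, and the brute-force breadth-first search is for checking the numbers, not a routing algorithm.

```python
from collections import deque
from itertools import product


def kautz_nodes(degree: int, length: int) -> list:
    """Words of `length` symbols over degree+1 symbols, no two adjacent equal."""
    symbols = range(degree + 1)
    return [w for w in product(symbols, repeat=length)
            if all(a != b for a, b in zip(w, w[1:]))]


def successors(word: tuple, degree: int) -> list:
    """Shift the word left by one and append any symbol unlike the last one."""
    return [word[1:] + (s,) for s in range(degree + 1) if s != word[-1]]


def diameter(nodes: list, degree: int) -> int:
    """Longest shortest path, found by breadth-first search from every node."""
    worst = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in successors(current, degree):
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        if len(dist) != len(nodes):
            raise ValueError("graph is not strongly connected")
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    for length in (4, 6):
        nodes = kautz_nodes(3, length)
        links = sum(len(successors(w, 3)) for w in nodes)
        print(f"{len(nodes)} nodes, {links} links, diameter {diameter(nodes, 3)}")
```

With degree 3, word length 6 gives 972 nodes, 2916 links and a diameter of 6 hops, while word length 4 gives 108 nodes, 324 links and a diameter of 4. The diameter grows with the logarithm of the node count, and each node has three outgoing links to choose from.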

SLIDE 27

A Cluster Node Chip

Block diagram labels:

  • On the chip: six CPUs, L2 Coherence Engine, two memory controllers, DMA engine, PCI-Express, fabric switch
  • Off the chip: two RAM blocks, I/O

SLIDE 28

27-Node Module

Figure labels:

  • Interconnect fabric
  • Compute nodes
  • Memory
  • PCIe modules

SLIDE 29

Design for reliability

  • Lower parts count
  • Lower power = less heat = less stress
  • All RAMs have ECC
  • Redundancy in interconnect

SLIDE 30

Parallel I/O

  • Integrated Lustre cluster filesystem
      • Open source
      • POSIX-compliant
  • Multiple uses
      • Direct-connect storage
      • External Lustre servers
      • RAM-based filesystem

SLIDE 31

What have we learned?

  • Take general computing techniques
  • ...with some knowledge about the applications
  • Mix well
  • Powerful and usable computing

SLIDE 32

Specializing General-Purpose Computing

Win Treese
SiCortex, Inc.
win.treese@sicortex.com or treese@acm.org