Future Trends in Hardware and Software for use in Simulation - PowerPoint PPT Presentation



SLIDE 1

Future Trends in Hardware and Software for use in Simulation

Steve Feldman, VP/IT, CD-adapco, April 2009

SLIDE 2

High Performance Computing Building Blocks

  • CPU
  • I/O
  • Interconnect
  • Software
SLIDE 3

General CPU

  • Maximum clock speeds have remained relatively constant. Higher speeds require too much power and cooling. CPUs simply are not getting faster.
  • Manufacturers have been and will be able to put more cores on each chip (dual-core, quad-core, many-core).
  • Memory bandwidth will likely be the constraint for HPC to efficiently employ many-core architectures.
  • CD-adapco products treat each core as a single CPU. No special tricks are required to run in parallel. Each core requires an HPC (or Power Session) license just like any other CPU, whether it resides on the same node or a different one.

SLIDE 4

GPGPU

  • Single precision floating point speed is exceptional, but double precision is far less interesting.
  • Programming is difficult. There is no (current) standard for uniform programming; OpenCL has just been approved.
  • Questionable memory access speed from main memory.
SLIDE 5

I/O Performance

  • Many CPUs all writing at once can overwhelm a cheap I/O system.
  • Transient analyses in particular write lots of data.
  • RAID systems allow for the possibility of disk failures and/or disk striping for better performance. SATA in RAID seems to offer good performance at reasonable prices.
  • I/O hardware can be hung off of Infiniband, GigE, or fibre channel, or directly connected to a dedicated I/O server.
  • There are a number of parallel file systems and associated hardware that can handle the intensive I/O demands of a large HPC cluster.

SLIDE 6

Interconnect

  • Allows CPUs to exchange information.
  • Important characteristics:

– Bandwidth: how quickly can I stream large arrays?
– Latency: how long does it take a node to signal back that the data (of any size) was received?

  • CD-adapco solvers have some loops that are bandwidth bound and others that are latency bound.
  • Transient analyses are more interconnect sensitive than steady-state analyses.
  • Small problems (fewer cells per CPU) are more interconnect sensitive than large ones. Higher node counts require better interconnects.

SLIDE 7

Interconnect Characteristics

Interconnect               Bandwidth (MB/s)   Latency (µs)
GigE                       100                40-60
10GigE                     1000               40-60
Myrinet "D" (IP)           50                 30
Myrinet "D" (native)       162                10
Infiniband (TCP/IP)        100                50
Infiniband (native-SDR)    <1000              2-5
Infiniband (native-DDR)    <2000              1.8-5
Infiniband (native-QDR)    <4000              ?
Cray Rapid Array           1000               2
Shared Mem (IBM pwr4)      2012               3

SLIDE 8

Software Components

  • OS
  • OS Management (Cluster Software)
  • Job Management (Queueing)
  • I/O Management
SLIDE 9

OS

  • CD-adapco supports all the major Unix flavors

– AIX, HP-UX, Solaris (a shorter list every year)

  • CD-adapco supports the major Linux flavors

– Red Hat Enterprise
– SUSE Enterprise
– Others may or may not work but are not supported

  • Microsoft Windows

– Windows Server 2008 HPC support is complete for STAR-CD and STAR-CCM+
– WinXP64 and Vista are supported, but not for multi-node clusters

SLIDE 10

Cluster Management Software

  • Propagates the OS, upgrades, changes to all nodes
  • Provides views of all nodes from one location
  • A cluster is never as easy to maintain as a single instance of an OS, but neither should it be N times harder for N individual nodes.

SLIDE 11

Queuing Software

  • Submits jobs in an orderly fashion
  • Resource managers apply open CPUs to queued jobs
SLIDE 12

I/O Management

  • May be as simple as NFS (Unix/Linux) or standard Windows shared drives.
  • May be much more complicated, dealing with multiple I/O nodes, parallel file servers (on I/O nodes), and clients (on compute nodes).

SLIDE 13

Questions for Panel (I)

  • CPU: Intel and AMD (and IBM Power) have all solved the problem of memory bandwidth to quad cores. Will this approach scale to higher core counts?
  • How will higher core counts affect interconnect needs? Will Infiniband be able to keep up with increasing numbers of cores connected to a single path? Will something else emerge?
  • How will higher core counts affect I/O streams? What will high performance I/O systems look like in the future? Will pNFS work, or will we continue to rely on proprietary parallel file systems?

SLIDE 14

Questions for Panel (II)

  • GPUs have weak double precision performance, memory throughput is questionable, and there is no current single environment to program all makes. Will they ever become part of the CAE/HPC world? How?
  • Clusters are not easy to manage. Is there anything to look forward to in software that will make them look more like a single machine to the IT admins?
  • What will a supercomputer look like 5 or more years from now?

– Is there a "breakthrough" architecture on the horizon that will allow faster clock rates?
– Are there any other disruptive technologies to look for?