Short Summary: Taxonomy of Parallel Computers (PowerPoint Presentation)



SLIDE 1

Short Summary

  • Taxonomy of parallel computers
    – SISD: von Neumann model
    – SIMD: Single Instruction, Multiple Data
    – MIMD: Multiple Instruction, Multiple Data
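The three classes above can be mimicked in ordinary Python as a loose software analogy. This is illustrative only: the taxonomy classifies hardware instruction and data streams, and none of these snippets actually executes in parallel.

```python
# Loose SOFTWARE analogy of Flynn's taxonomy (illustrative, not hardware).
data = [1, 2, 3, 4]

# SISD: a single instruction stream walks the data one element at a time.
sisd = []
for x in data:
    sisd.append(x * 2)

# SIMD: one instruction conceptually applied to all data elements at once.
simd = list(map(lambda x: x * 2, data))

# MIMD: independent instruction streams operating on independent data.
programs = [lambda x: x * 2, lambda x: x + 10, lambda x: x ** 2, lambda x: -x]
mimd = [f(x) for f, x in zip(programs, data)]
```

SISD and SIMD compute the same result here; the difference is that SIMD expresses the operation once for the whole data set, which is what vector hardware exploits.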

SLIDE 2

A different taxonomy

  • SISD, SIMD, MIMD refer to the processor organization
  • With respect to the memory organization, the two fundamental models are:
    – Distributed memory architecture
      • each processor has its own private memory
    – Shared address space architecture
      • all processors have access to the same address space
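The two memory models can be sketched with standard-library concurrency primitives: threads naturally share an address space, while explicit message passing models the distributed case. A minimal sketch (the worker functions and the dict-based counter are my own illustrative choices):

```python
import threading
import queue

# Shared address space (threads): all workers read/write the SAME variable,
# so shared data needs synchronization.
counter = {"value": 0}
lock = threading.Lock()

def shared_worker():
    with lock:
        counter["value"] += 1

threads = [threading.Thread(target=shared_worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Distributed memory (analogy): each worker keeps private state and
# communicates only by explicit messages, modeled here with a queue.
inbox = queue.Queue()

def distributed_worker(rank):
    private = rank * rank          # private memory: invisible to other workers
    inbox.put((rank, private))     # explicit "send" instead of a shared write

threads = [threading.Thread(target=distributed_worker, args=(r,)) for r in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

results = dict(inbox.get() for _ in range(4))
```

The contrast to notice: the shared-memory version needs a lock around the shared counter, while the message-passing version needs no locking because no state is shared.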
SLIDE 3

Memory Organizations

SLIDE 4
Memory Organizations II

  • The pure shared-memory model (fig. (a)) needs substantial interconnection bandwidth
  • Shared-address-space computers can have a local memory to speed up access to non-shared data
    – Figures (b) and (c) in the previous slide
    – So-called Non-Uniform Memory Access (NUMA), as opposed to Uniform Memory Access (UMA): access times differ depending on the location of the data
  • To reduce the speed differential, local memory can also be used to cache frequently used shared data (example: Stanford DASH)
    – The use of caches introduces the issue of cache coherence
    – In some architectures the local memory is used entirely as cache, so-called Cache-Only Memory Architecture (COMA). Example: KSR-1
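The NUMA speed differential and the effect of caching shared data can be made concrete with a toy average-access-time model. The latency numbers below are illustrative assumptions, not figures from the slides:

```python
# Toy NUMA access-time model. Latencies are ILLUSTRATIVE assumptions.
LOCAL_NS = 100    # latency to this node's local memory
REMOTE_NS = 400   # latency to another node's memory

def avg_access_ns(local_fraction, cache_hit_rate=0.0, cache_ns=10):
    """Average access time when `local_fraction` of references hit local
    memory and a local cache absorbs `cache_hit_rate` of all references."""
    miss = 1.0 - cache_hit_rate
    return (cache_hit_rate * cache_ns
            + miss * (local_fraction * LOCAL_NS
                      + (1.0 - local_fraction) * REMOTE_NS))
```

With no cache and only half the data local, the average access costs 250 ns; an 80%-effective cache (as in DASH-style designs) pulls that down near 58 ns, which is why caching frequently used shared data matters so much.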

SLIDE 5

Shared vs Distributed Memory

  • By general consensus, the shared address space model is easier to program but much harder to build
    – Caching in local memory is critical for performance, but it makes the design much harder and introduces inefficiencies
  • There is a growing trend toward hybrid designs
    – i.e. clusters of SMPs, or NUMA machines with physically distributed memory

SLIDE 6
Interconnection Networks

  • The interconnect is the crucial component of any parallel computer
  • Static vs. dynamic networks
    – Static
      • Built out of point-to-point communication links between processors (also known as direct networks)
      • Usually associated with message-passing architectures
      • Examples: completely connected, star-connected, linear array, ring, mesh, hypercube
    – Dynamic
      • Built out of links and switches (also known as indirect networks)
      • Usually associated with shared address space architectures
      • Examples: crossbar, bus-based, multistage
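Two of the static topologies named above have simple neighbor rules that are worth spelling out: a ring links each node to its two cyclic neighbors, and a d-dimensional hypercube links nodes whose labels differ in exactly one bit. A small sketch (the function names are my own):

```python
# Neighbor sets for two static (direct) topologies.

def ring_neighbors(node, p):
    """In a p-node ring, each node links to its two cyclic neighbors."""
    return sorted({(node - 1) % p, (node + 1) % p})

def hypercube_neighbors(node, dim):
    """In a dim-dimensional hypercube (2**dim nodes), neighbors differ
    from `node` in exactly one bit position."""
    return [node ^ (1 << b) for b in range(dim)]
```

The hypercube rule is why its diameter is only log2(p): correcting one differing bit per hop reaches any node in at most dim hops.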

SLIDE 7

Crossbar Switching Networks

  • Crossbar switch
    – Digital analogue of a telephone switchboard
    – Allows connection of any of p processors to any of b memory banks
    – Examples: Sun Ultra HPC 1000, Fujitsu VPP 500, Myrinet switch
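The any-processor-to-any-bank property, and the p*b crosspoint cost that comes with it, can be captured in a minimal sketch; the class and method names are my own illustration, not any real machine's API:

```python
# Minimal sketch of a p x b crossbar: any processor may connect to any
# memory bank, provided the bank is not already claimed by another processor.
class Crossbar:
    def __init__(self, p, b):
        self.p, self.b = p, b
        self.links = {}                     # processor -> bank

    def connect(self, proc, bank):
        if bank in self.links.values():     # bank already in use: conflict
            return False
        self.links[proc] = bank
        return True

    def switch_count(self):
        return self.p * self.b              # cost grows as p*b crosspoints
```

The quadratic crosspoint count is the crossbar's weakness and is exactly what the multistage networks later in the deck trade away for blocking behavior.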

SLIDE 8

Bus-based Networks

  • Very simple concept; its major drawback is that bandwidth does not scale up with the number of processors
    – Caches can alleviate the problem because they reduce traffic to memory
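Why caches help a shared bus: only cache misses generate bus traffic, so the effective demand on the fixed bus bandwidth shrinks with the hit rate. A one-line model (illustrative numbers, not from the slides):

```python
# Only cache MISSES reach the shared bus, so total bus demand scales with
# (1 - hit_rate). Parameters are illustrative.
def bus_demand(processors, accesses_per_proc, hit_rate):
    """Total memory accesses that actually travel over the shared bus."""
    return processors * accesses_per_proc * (1.0 - hit_rate)
```

With 8 processors issuing 1000 accesses each, a 90% hit rate cuts bus traffic from 8000 accesses to 800, i.e. caches buy roughly a 10x headroom before the bus saturates.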

SLIDE 9

Multistage Interconnection Network

  • Multistage networks are a good compromise between cost and performance
    – More scalable in terms of cost than a crossbar, more scalable in terms of performance than a bus
    – Popular schemes include omega and butterfly networks
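The cost compromise can be quantified by counting switching elements. Assuming the usual constructions (p x p crosspoints for a crossbar, log2(p) stages of p/2 two-by-two switches for a multistage network, one tap per processor for a bus):

```python
import math

# Rough switch-count scaling for the three dynamic networks discussed.
def crossbar_cost(p):
    return p * p                      # p x p crosspoint grid

def multistage_cost(p):
    stages = int(math.log2(p))
    return (p // 2) * stages          # p/2 2x2 switches per stage, log2(p) stages

def bus_cost(p):
    return p                          # roughly one tap per processor
```

At p = 64 the ordering is already stark: 64 taps for a bus, 192 switches for a multistage network, 4096 crosspoints for a crossbar.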

SLIDE 10

Omega Network
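An omega network on p = 2^k inputs uses the classic destination-tag routing rule: each of the k stages performs a perfect shuffle (a left rotation of the k-bit address) followed by a 2x2 switch that is set pass-through or crossed according to the next bit of the destination, most significant bit first. A sketch of one packet's path (the function name `omega_route` is my own):

```python
# Destination-tag routing through a 2^k-input omega network.
def omega_route(src, dst, k):
    """Return the sequence of positions a packet visits: perfect shuffle,
    then a 2x2 exchange driven by the destination bits (MSB first)."""
    mask = (1 << k) - 1
    pos, trace = src, [src]
    for i in range(k - 1, -1, -1):
        pos = ((pos << 1) | (pos >> (k - 1))) & mask   # perfect shuffle: rotate left
        pos = (pos & ~1) | ((dst >> i) & 1)            # switch: set low bit to dst bit
        trace.append(pos)
    return trace
```

After the k stages every source bit has been rotated out and replaced by a destination bit, so the packet always arrives, though two packets may contend for the same switch output (the network is blocking).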