Winter 2006 CSE 548 - Multiprocessors 1

Issues in Multiprocessors

Which programming model for interprocessor communication

  • shared memory
      • regular loads & stores
  • message passing
      • explicit sends & receives

Which execution model

  • control parallel
      • identify & synchronize different asynchronous threads
  • data parallel
      • same operation on different parts of the shared data space
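The two execution models can be contrasted in code. This is an illustrative sketch only (the slides discuss C and Fortran programs; Python threads are used here for brevity): `control_parallel` runs two *different* asynchronous tasks and synchronizes them with `join`, while `data_parallel` applies the *same* operation to different chunks of a shared array.

```python
import threading

# Control parallelism: different asynchronous threads execute different
# work and must be identified & synchronized explicitly (here via join).
def control_parallel():
    results = {}
    def produce():
        results["data"] = [i * i for i in range(4)]
    def audit():
        results["checked"] = True
    threads = [threading.Thread(target=produce), threading.Thread(target=audit)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Data parallelism: the same operation applied to different parts of a
# shared data space, one contiguous chunk per thread.
def data_parallel(xs, nthreads=2):
    out = [0] * len(xs)
    def square(lo, hi):
        for i in range(lo, hi):
            out[i] = xs[i] * xs[i]
    chunk = (len(xs) + nthreads - 1) // nthreads
    threads = [
        threading.Thread(target=square,
                         args=(t * chunk, min((t + 1) * chunk, len(xs))))
        for t in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```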

Issues in Multiprocessors

How to express parallelism

  • language support
      • HPF, ZPL
  • runtime library constructs
      • coarse-grain, explicitly parallel C programs
  • automatic (compiler) detection
      • implicitly parallel C & Fortran programs, e.g., SUIF & PTRANS compilers

Algorithm development

  • embarrassingly parallel programs can be easily parallelized
  • development of different algorithms for the same problem
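An embarrassingly parallel program is one whose tasks are fully independent, so parallelizing is just distributing work. A minimal sketch (illustrative Python, not from the slides; `collatz_steps` is a hypothetical example workload):

```python
from concurrent.futures import ThreadPoolExecutor

def collatz_steps(n):
    """Number of steps for n to reach 1 under the Collatz rule."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

def parallel_map(f, xs, workers=4):
    # Each task is fully independent: no shared state, no communication,
    # no synchronization beyond collecting the results -- the defining
    # property of an embarrassingly parallel computation.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, xs))
```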

Issues in Multiprocessors

How to get good parallel performance

  • recognize parallelism
  • transform programs to increase parallelism without decreasing processor locality
  • decrease sharing costs

Flynn Classification

SISD: single instruction stream, single data stream

  • single-context uniprocessors

SIMD: single instruction stream, multiple data streams

  • exploits data parallelism
  • example: Thinking Machines CM

MISD: multiple instruction streams, single data stream

  • systolic arrays
  • example: Intel iWarp, streaming processors

MIMD: multiple instruction streams, multiple data streams

  • multiprocessors
  • multithreaded processors
  • parallel programming & multiprogramming
  • relies on control parallelism: execute & synchronize different asynchronous threads of control

  • example: most processor companies have MP configurations
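The SIMD/MIMD distinction can be sketched in code (illustrative Python only; real SIMD hardware executes one instruction across many lanes in lockstep, which a list comprehension merely models):

```python
import threading

def simd_style(xs, ys):
    # SIMD: one instruction stream, many data streams -- conceptually,
    # every lane executes this same add in lockstep on its own element.
    return [x + y for x, y in zip(xs, ys)]

def mimd_style(xs):
    # MIMD: multiple independent instruction streams -- two threads run
    # *different* code asynchronously, then synchronize via join
    # (control parallelism).
    out = {}
    def total():
        out["sum"] = sum(xs)
    def biggest():
        out["max"] = max(xs)
    ts = [threading.Thread(target=total), threading.Thread(target=biggest)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return out
```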

CM-1


Systolic Array


MIMD

Low-end

  • bus-based
      • simple, but a bottleneck
      • simple cache coherency protocol
  • physically centralized memory
      • uniform memory access (UMA machine)
  • examples: Sequent Symmetry, SPARCCenter, Alpha-, PowerPC- or SPARC-based servers


Low-end MP


MIMD

High-end

  • higher bandwidth, multiple-path interconnect
      • more scalable
      • more complex cache coherency protocol (if shared memory)
      • longer latencies
  • physically distributed memory
      • non-uniform memory access (NUMA machine)
  • could have processor clusters
  • examples: SGI Challenge, Convex Exemplar, Cray T3D, IBM SP-2, Intel Paragon


High-end MP


Comparison of Issue Capabilities


MIMD Programming Models

Address space organization for physically distributed memory

  • distributed shared memory
      • 1 global address space
  • multicomputers
      • private address space per processor

Inter-processor communication

  • shared memory
      • accessed via load/store instructions
      • examples: SPARCCenter, SGI Challenge, Cray T3D, Convex Exemplar, KSR-1 & 2
  • message passing
      • explicit communication by sending/receiving messages
      • examples: TMC CM-5, Intel Paragon, IBM SP-2
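The two communication styles can be contrasted in a sketch (illustrative Python; an `Event` stands in for a synchronization primitive and a `Queue` for the machine's message network):

```python
import threading
import queue

def via_shared_memory():
    # Shared memory: communicate through ordinary loads & stores to a
    # shared location; the Event only orders the store before the load
    # (synchronization, not data transfer).
    shared = {}
    ready = threading.Event()
    received = []
    def producer():
        shared["msg"] = 42              # store
        ready.set()
    def consumer():
        ready.wait()
        received.append(shared["msg"])  # load
    ts = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return received[0]

def via_message_passing():
    # Message passing: explicit send/receive; the threads share no
    # variables, only the channel.
    chan = queue.Queue()
    received = []
    def producer():
        chan.put(42)                    # send
    def consumer():
        received.append(chan.get())     # receive
    ts = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return received[0]
```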

Shared Memory vs. Message Passing

Shared memory

  + simple parallel programming model
      • global shared address space
      • no need to worry about data locality, but
          • get better performance when programming for data placement
          • lower latency when data is local
      • can do data placement if it is crucial, but don’t have to
  + hardware maintains data coherence
      • synchronize to order processors’ accesses to shared data
      • like uniprocessor code, so parallelizing by programmer or compiler is easier ⇒ can focus on program semantics, not interprocessor communication


Shared Memory vs. Message Passing

Shared memory

  + low latency (no message-passing software), but
      • no overlap of communication & computation; latency-hiding techniques can be applied to message-passing machines
  + higher bandwidth for small transfers, but usually the only choice


Shared Memory vs. Message Passing

Message passing

  + abstraction in the programming model encapsulates the communication costs, but
      • more complex programming model
      • additional language constructs
      • need to program for nearest-neighbor communication
  + no coherency hardware
  + good throughput on large transfers, but what about small transfers?
  + more scalable (memory latency doesn’t scale with the number of processors), but large-scale SM has distributed memory also
      • hah! so you’re going to adopt the message-passing model?


Shared Memory vs. Message Passing

Why there was a debate

  • little experimental data
  • implementation was not separated from the programming model
  • can emulate one paradigm with the other
      • MP on an SM machine: message buffers in local (to each processor) memory; copy messages by ld/st between buffers
      • SM on an MP machine: each ld/st becomes a message copy... sloooooooooow

Who won?
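Emulating message passing on a shared-memory machine can be sketched directly (illustrative Python; the `Channel` class and `ping_pong` helper are hypothetical names). The message buffer lives in memory visible to both threads, and send/receive copy data into and out of it with ordinary loads & stores:

```python
import threading
from collections import deque

class Channel:
    """Message passing emulated on shared memory: the message buffer is
    shared, and send/recv copy messages between buffers via ld/st."""
    def __init__(self):
        self._buf = deque()                  # shared message buffer
        self._nonempty = threading.Condition()

    def send(self, msg):
        with self._nonempty:
            self._buf.append(list(msg))      # copy into the buffer (stores)
            self._nonempty.notify()

    def recv(self):
        with self._nonempty:
            while not self._buf:
                self._nonempty.wait()
            return self._buf.popleft()       # copy out of the buffer (loads)

def ping_pong():
    # Same program structure as on a real message-passing machine:
    # one thread sends, the other receives; no other shared variables.
    ch = Channel()
    out = []
    sender = threading.Thread(target=lambda: ch.send([1, 2, 3]))
    receiver = threading.Thread(target=lambda: out.append(ch.recv()))
    receiver.start()
    sender.start()
    for t in (sender, receiver):
        t.join()
    return out[0]
```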