CSL 860: Modern Parallel Computation


SLIDE 1

CSL 860: Modern Parallel Computation

SLIDE 2

Categories of Processing

  • Flynn's classification (SISD, SIMD, MISD, MIMD)
  • Granularity

– Coarse grain: Cray C90, Fujitsu

  • small number of very powerful processors

– Fine grain: CM-2, Quadrics

  • Large number of relatively less powerful processors

– Medium grain: IBM SP2, CM-5

  • between the two extremes.

– Communication cost >> computation cost → coarse grain
– Communication cost << computation cost → fine grain

  • Address Space Organization

– Single/shared address space

  • Uniform Memory Access (UMA): SMP
  • Non-Uniform Memory Access (NUMA)

– Distributed memory

  • Message passing
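To make the message-passing model concrete, here is a minimal sketch in C using standard MPI point-to-point calls (MPI_Send/MPI_Recv); the example is illustrative and not part of the original slides:

```c
#include <mpi.h>
#include <stdio.h>

/* Message-passing sketch: each process owns a private address space,
 * so all data sharing is explicit. Rank 0 sends one int to rank 1. */
int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Build with mpicc and run with at least two processes (e.g. mpirun -np 2).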
SLIDE 3

Modern Multi-Processor

[Diagram: several CPUs, each with ALU, FPU, L1 cache, and state, connected by a bus / crossbar switch to shared memory (possibly with an L2 cache)]

Multi-CPU

[Diagram: one chip with several cores (ALU, FPU, L1 cache, state) sharing a system bus, bus-request logic, and shared memory]

Multi-core
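For contrast with message passing, a minimal shared-address-space sketch using OpenMP in C (illustrative, not from the slides): every thread, whether on a separate CPU or a core of one chip, reads and writes the same array through the shared memory pictured above.

```c
#include <stdio.h>

#define N 1000000

/* Shared-memory sketch: threads divide the loop among themselves and
 * all touch the same array; no explicit communication is required. */
int main(void) {
    static double a[N];
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = i * 0.5;   /* each thread writes its chunk */
        sum += a[i];      /* per-thread partial sums are combined */
    }

    printf("sum = %f\n", sum);
    return 0;
}
```

Compile with gcc -fopenmp; without the flag the same code still runs, just serially.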

SLIDE 4

n-dim Grid/Mesh

SLIDE 5

Torus
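The only difference between the grid/mesh of the previous slide and the torus is the wraparound links. A small C sketch of neighbor computation for both (the 4×4 dimensions and function names are illustrative assumptions):

```c
#include <stdio.h>

#define ROWS 4
#define COLS 4

/* Torus: indices wrap around in every dimension. */
int torus_neighbor(int r, int c, int dr, int dc) {
    int nr = (r + dr + ROWS) % ROWS;
    int nc = (c + dc + COLS) % COLS;
    return nr * COLS + nc;            /* flattened node id */
}

/* Mesh: no links past the boundary. */
int mesh_neighbor(int r, int c, int dr, int dc) {
    int nr = r + dr, nc = c + dc;
    if (nr < 0 || nr >= ROWS || nc < 0 || nc >= COLS)
        return -1;                    /* edge node: neighbor missing */
    return nr * COLS + nc;
}

int main(void) {
    /* Node (0,0): the mesh has no "up" neighbor; the torus wraps. */
    printf("mesh : %d\n", mesh_neighbor(0, 0, -1, 0));   /* -1 */
    printf("torus: %d\n", torus_neighbor(0, 0, -1, 0));  /* 12 */
    return 0;
}
```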

SLIDE 6

Hypercube
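In a d-dimensional hypercube (as in the CM-2 and nCube later in these slides), node labels are d-bit integers and two nodes are linked exactly when their labels differ in one bit, so each node's d neighbors fall out of a single XOR. A tiny sketch (the dimension chosen here is illustrative):

```c
#include <stdio.h>

#define DIM 4   /* 2^4 = 16 nodes, each with 4 neighbors */

int main(void) {
    int node = 5;   /* binary 0101 */
    /* Flipping bit i with XOR yields the neighbor along dimension i. */
    for (int i = 0; i < DIM; i++)
        printf("dim %d neighbor: %d\n", i, node ^ (1 << i));
    return 0;
}
```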

SLIDE 7

Tree Network

SLIDE 8

Fat Tree Network

SLIDE 9

Butterfly
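Under one common labeling of a butterfly network (2^k rows of switches in k+1 ranks), the switch at (rank i, row j) has a straight edge to (i+1, j) and a cross edge to (i+1, j XOR 2^i). The sketch below prints that wiring; the labeling convention is an assumption, as some texts flip the bits in the opposite order:

```c
#include <stdio.h>

#define K 3   /* 2^3 = 8 rows, K+1 = 4 ranks of switches */

int main(void) {
    for (int i = 0; i < K; i++)            /* rank i wires to rank i+1 */
        for (int j = 0; j < (1 << K); j++)
            printf("(%d,%d) -> (%d,%d) straight, (%d,%d) cross\n",
                   i, j, i + 1, j, i + 1, j ^ (1 << i));
    return 0;
}
```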

SLIDE 10

Current Computer Speed

  • ~15 Gflop/core
  • ~60 Gflop for Quad-core
  • ~3GHz clocks


  • ~$1000
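Taking the slide's round numbers at face value, price/performance works out to roughly (a back-of-the-envelope estimate, not from the slides):

\[
\frac{\$1000}{\sim 60\ \text{Gflop/s}} \approx \$17\ \text{per Gflop/s},
\]

over three orders of magnitude cheaper than the ~$40,000/Gflop nCube3 figure quoted a few slides later.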
SLIDE 11

Cray

  • Late 70s
  • Small # vector processors
  • $9 million


  • 80 MHz clock
  • Later (Early 80s)

– 105 to 117 MHz clock
– 800 megaflops for 4-processor machine
– $15-20 million

SLIDE 12

Connection Machine

  • CM-2 (SIMD)

– Host connected
– ~1989
– 64K single-bit SIMD processors connected in a hypercube, plus 2K Weitek floating-point units
– 8 MHz clock
– 6 GFLOPS
– 400 MFLOPS per million dollars
– $15 million

  • CM-5 (MIMD)

– ~1991
– Fat tree network of 896 SPARC RISC processors

SLIDE 13

nCube

  • nCube 2 costs between $500,000 and $2m
  • $2m for 27 GFLOPS machine

nCube3 (1994)

  • 50 MHz
  • Processor Module: 512 nodes and 32 GB memory
  • Up to 20 Modules for a 1.0 TFLOP system of 10,240 nodes

  • $40 million
  • $40,000/Gflop
SLIDE 14

Maspar

  • Host + Array Control Unit (SIMD)
  • PEs connected to 8 neighbors; also a slow global router
  • 32-bit ALUs
  • 32 PEs per chip, up to 16K processors overall
  • 12.5 MHz clock
  • 1.2 Gflops
  • $1.5 million (~1000 flops/dollar-second)
  • Early 90s

SLIDE 15

Cray T90

  • 1995
  • 450 MHz
  • 4-32 vector processors

– Peak 1.8 Gflops per processor
– 57.6 Gflops total (32 processors)

  • Shared (up to) 8 GB memory
  • Multiple ports

– 3 × 64-bit words per cycle per CPU; × 32 CPUs → over 300 GB/s aggregate

  • 32-processor version cost $39 million.
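The >300 GB/s figure follows directly from the numbers on this slide (a quick check, assuming 64-bit = 8-byte words):

\[
3\ \text{words} \times 8\ \text{B} \times 450\ \text{MHz} \times 32\ \text{CPUs} \approx 345\ \text{GB/s}.
\]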
SLIDE 16

Roadrunner

  • $133 million
  • Multi-stage InfiniBand interconnect

– InfiniBand: 2-level fat tree; each leaf switch has 180 down links and 96 up links (18 such CUs); 12 up links from each CU connect to each of the 2nd-level switches

  • Cluster of 122,400 cores

– 6912 dual-core Opterons
– 12960 PowerXCell eDP: 116640 cores

  • peak 1.45 PetaFlops
SLIDE 17

IBM Cell Processor

SLIDE 18

NVIDIA GF8800

[Block diagram of the GF8800 (G80): Host → Data Assembler → vertex/geometry/pixel thread issue and Setup/Raster/ZCull; an array of streaming processors (SP) grouped with texture fetch (TF) units and L1 caches under a thread processor; L2 caches paired with frame buffer (FB) partitions]