CSCI341 Lecture 38, Introduction to Multicore Architectures

Jul 11, 2023

GOAL: PERFORMANCE Recall: Power as the overriding issue. Performance, heat, power efficiency.
PIPELINING “Exploits potential parallelism among instructions.” “Instruction-level parallelism”
PROCESS-LEVEL PARALLELISM Utilizing multiple processors by running independent programs simultaneously.
PARALLEL PROCESSING PROGRAM Executing one program on multiple processors simultaneously.
MULTI-PROCESSOR ARCHITECTURES A system with at least two processors.
MULTI-CORE ARCHITECTURES A system with multiple processors (“cores”) within a single integrated circuit.
SEQUENTIAL VS. CONCURRENT
THE PROBLEM (not about the hardware) It is difficult to write software that uses multiple processors to complete tasks faster. Why?
MUST YIELD THE BENEFIT The parallel implementation must be faster, especially as the number of processors increases. Otherwise, what’s the point? Single-processor instruction-level parallelism has evolved too. (see superscalar & out-of-order execution)
COMPLICATIONS • scheduling • load balancing • time for synchronization • communication overhead • Amdahl’s law Example: multiple journalists writing a story.
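Amdahl's law, listed above, can be made concrete with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `amdahl_speedup` and the 90% parallel fraction are assumptions chosen for illustration. It computes the overall speedup 1 / ((1 - p) + p / n) when a fraction p of the run time can be spread over n processors.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Overall speedup when only `parallel_fraction` of the run time scales.

    Hypothetical helper for illustration; the remaining (serial) fraction
    still runs on a single processor no matter how many are available.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed workload: 90% of the run time is parallelizable.
    for n in (1, 2, 4, 8, 16, 1024):
        print(f"{n:5d} processors -> {amdahl_speedup(0.9, n):.2f}x")
```

With 10% of the work left serial, the speedup can never reach 10x, however many processors are added. In the journalists example, that serial part would be whatever cannot be split among writers, such as the final edit.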
SMP Shared Memory Multiprocessor Multiple processors, single memory address space. All cores have access to all data. (Multi-core architectures generally use this approach)
SMP
SYNCHRONIZATION Coordinating operations on shared data between multiple processors. Common solution: locks.
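As a rough illustration of the lock idea, the sketch below has several threads increment one shared counter. It is not from the slides: `run_counter` and its parameters are hypothetical names, and Python threads stand in for the processors. The lock ensures that only one thread at a time performs the read-modify-write on the shared value.

```python
import threading


def run_counter(num_threads: int = 4, increments: int = 10_000) -> int:
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Only one thread may hold the lock, so the read-modify-write
            # on the shared counter cannot interleave with another thread's.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_counter())  # prints 40000
```

Acquiring and releasing the lock on every increment is also an example of the "time for synchronization" cost: the threads spend part of their time waiting on each other rather than doing work.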
MESSAGE PASSING What if each processor has its own address space?
MESSAGE PASSING In practice, this manifests as clusters of individual machines. But there’s a cost to administering these individual physical machines.
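The message-passing style can be imitated on a single machine. The sketch below is an assumption-laden illustration, not the lecture's code: `worker` and `parallel_sum` are hypothetical names, and Python threads technically share one address space, so the code only mimics the discipline by letting each worker touch nothing except the messages it receives and sends. On a real cluster the two queues would be replaced by network sends and receives.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive chunks of numbers, send back partial sums; stop on None."""
    while True:
        chunk = inbox.get()
        if chunk is None:  # sentinel message: no more work
            break
        outbox.put(sum(chunk))


def parallel_sum(data: list, num_workers: int = 4) -> int:
    """Sum `data` by mailing chunks to workers and collecting their replies."""
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()

    chunk_size = max(1, -(-len(data) // num_workers))  # ceiling division
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for chunk in chunks:
        inbox.put(chunk)   # "send" a work message
    for _ in workers:
        inbox.put(None)    # one stop message per worker

    total = sum(outbox.get() for _ in chunks)  # "receive" one reply per chunk
    for w in workers:
        w.join()
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(101))))  # prints 5050
```

Every chunk is copied into a message and every partial sum is copied back, which is the communication overhead that has to be outweighed by the work done in parallel.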
VIRTUAL MACHINES An additional layer of abstraction on top of hardware. Multiple cluster nodes on top of hardware, each capable of sending/receiving messages.
SO MUCH MORE... • Multithreading • MIMD (Multiple Instruction / Multiple Data Streams) • Vector architectures (see Cray) • GPUs
AND MORE... Storage & I/O (Chapter 6) One simple approach: memory-mapped I/O
AND MORE... Many instructions are loads/stores... how can we exploit the memory hierarchy?
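PRINCIPLE OF LOCALITY • Temporal (recently used data is likely to be used again soon) • Spatial (data near recently used data is likely to be used soon)
PRINCIPLE OF LOCALITY Memory closest to the processor is the fastest (and the most expensive).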
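To see why locality matters, the following sketch counts hits in a toy direct-mapped cache for two traversal orders of the same 256 x 256 array. Everything here is an assumption made for illustration and does not come from the slides: the helper `count_hits`, the cache geometry (64 lines of 8 words each), and the array size.

```python
def count_hits(addresses, num_lines: int = 64, words_per_line: int = 8) -> int:
    """Count hits in a toy direct-mapped cache (assumed geometry, see above)."""
    lines = [None] * num_lines          # block number currently held per line
    hits = 0
    for addr in addresses:
        block = addr // words_per_line  # which memory block holds this word
        index = block % num_lines       # the one cache line it may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block        # miss: fetch the block, evict the old one
    return hits


ROWS, COLS = 256, 256
# Word addresses of a row-major 2D array, visited in two different orders.
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

if __name__ == "__main__":
    print(count_hits(row_major))  # prints 57344 (7 of every 8 accesses hit)
    print(count_hits(col_major))  # prints 0 (each access evicts a block before reuse)
```

Walking along a row touches neighboring words, so each fetched block serves the next seven accesses (spatial locality). Walking down a column jumps 256 words at a time, and in this particular geometry every block is evicted before any of its other words are needed.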
HIERARCHY Access time and cost per GB, fastest level first: • < 3 ns, $2000/GB • < 70 ns, $20/GB • < 20,000,000 ns, $0.25/GB
HIERARCHY
HOMEWORK • Reading 32 • Final exam program No more homework!
