CSCI341 Lecture 38, Introduction to Multicore Architectures



SLIDE 1

CSCI341

Lecture 38, Introduction to Multicore Architectures

SLIDE 2

GOAL: PERFORMANCE

Recall: Power as the overriding issue. Performance, heat, power efficiency.

SLIDE 3

PIPELINING

“Exploits potential parallelism among instructions.” “Instruction-level parallelism”

SLIDE 4

PROCESS-LEVEL PARALLELISM

Utilizing multiple processors by running independent programs simultaneously.
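A minimal sketch of process-level parallelism (Python is my choice here, not the lecture's): two independent programs are launched at once, and the OS is free to schedule each on a different processor.

```python
import subprocess
import sys

# Two independent programs (here, two Python one-liners) running
# simultaneously; neither shares any state with the other.
procs = [subprocess.Popen([sys.executable, "-c", f"print({n} * {n})"],
                          stdout=subprocess.PIPE)
         for n in (3, 4)]
# Collect each program's output after both have been started.
outputs = [p.communicate()[0].decode().strip() for p in procs]
print(outputs)  # ['9', '16']
```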

SLIDE 5

PARALLEL PROCESSING PROGRAM

Executing one program on multiple processors simultaneously.

SLIDE 6

MULTI-PROCESSOR ARCHITECTURES

A system with at least two processors.

SLIDE 7

MULTI-CORE ARCHITECTURES

A system with multiple processors (“cores”) within a single integrated circuit.

SLIDE 8

SEQUENTIAL VS. CONCURRENT

SLIDE 9

THE PROBLEM

(not about the hardware) It is difficult to write software that uses multiple processors to complete tasks faster. Why?

SLIDE 10

MUST YIELD THE BENEFIT

The parallel implementation must be faster, especially as the number of processors increases. Otherwise, what’s the point? Single-processor instruction-level parallelism has evolved. (see superscalar & out-of-order execution)

SLIDE 11

COMPLICATIONS

  • scheduling
  • load balancing
  • time for synchronization
  • communication overhead
  • Amdahl’s law

Example: multiple journalists writing a story.
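The last complication in the list can be made concrete. The standard form of Amdahl’s law (assumed here; it is not written out on the slide) caps the speedup by the fraction of work that cannot be parallelized:

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Maximum speedup when only `parallel_fraction` of the work
    can be spread across `n_processors` (Amdahl's law)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# Even with 95% of the work parallelizable, 100 processors
# give well under a 100x speedup.
print(round(amdahl_speedup(0.95, 100), 2))  # 16.81
```

Like multiple journalists writing one story: the parts only one person can do (the serial fraction) dominate once enough writers are added.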

SLIDE 12

SMP

Shared Memory Multiprocessor: multiple processors, single memory address space. All cores have access to all data. (Multi-core architectures generally use this approach.)

SLIDE 13

SMP

SLIDE 14

SYNCHRONIZATION

Coordinating operations on shared data between multiple processors. Common solution: locks.
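A small Python sketch of the lock idea (thread-based, since Python threads share an address space like SMP cores): four threads increment one shared counter, and the lock serializes each read-modify-write so no update is lost.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:          # only one thread may update at a time
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 -- without the lock, updates could be lost
```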

SLIDE 15

MESSAGE PASSING

What if each processor has its own address space?

SLIDE 16

MESSAGE PASSING

Pragmatically, manifests as clusters of individual machines. But, there’s a cost to administering these individual physical machines.

SLIDE 17

VIRTUAL MACHINES

An additional layer of abstraction on top of hardware: multiple cluster nodes run on one physical machine, each capable of sending/receiving messages.
SLIDE 18

SO MUCH MORE...

  • Multithreading
  • MIMD (Multiple Instruction / Multiple Data Streams)
  • Vector architectures (see Cray)
  • GPUs
SLIDE 19

AND MORE...

Storage & I/O (Chapter 6). One simple approach: memory-mapped I/O.

SLIDE 20

AND MORE...

Many instructions are loads/stores... how can we exploit the memory hierarchy?

SLIDE 21

PRINCIPLE OF LOCALITY

  • Temporal
  • Spatial
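The two kinds of locality above can be illustrated with matrix traversal order. This is a conceptual Python sketch (Python lists are not truly contiguous in memory; the point is the access pattern a C-style row-major array would see):

```python
# A 4x4 matrix stored row by row, as a C array would be.
matrix = [[r * 4 + c for c in range(4)] for r in range(4)]

# Row-major traversal touches neighboring elements in turn:
# good spatial locality.
row_order = [matrix[r][c] for r in range(4) for c in range(4)]

# Column-major traversal strides across rows between accesses:
# poor spatial locality on a row-major layout.
col_order = [matrix[r][c] for c in range(4) for r in range(4)]

print(row_order[:4])  # [0, 1, 2, 3] -- adjacent, likely one cache line
print(col_order[:4])  # [0, 4, 8, 12] -- strided accesses
```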
SLIDE 22

PRINCIPLE OF LOCALITY

Memory closest to the processor is the fastest (and the most expensive).

SLIDE 23

HIERARCHY

SRAM:  < 3 ns           $2000/GB
DRAM:  < 70 ns          $20/GB
Disk:  < 20,000,000 ns  $0.25/GB
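The payoff of the hierarchy can be worked out with the standard average-memory-access-time formula (AMAT is assumed here, not shown on the slide; the hit/miss figures below are hypothetical, chosen in the spirit of the table):

```python
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus miss rate
    times the penalty of going to the next level."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# A 3 ns cache that misses 5% of the time, backed by 70 ns DRAM:
# most accesses see near-SRAM speed despite DRAM's latency.
print(amat(3, 0.05, 70))  # approximately 6.5 ns
```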

SLIDE 24

HIERARCHY

SLIDE 25

HOMEWORK

  • Reading 32
  • Final exam program

No more homework!