Multicore curriculum
SLIDE 1
SLIDE 2
Motivation
Moore’s Law: the number of transistors on a chip doubles about every 18 months
Source: Intel
SLIDE 3
Memory capacity also increases
SLIDE 4
The Memory Wall
[Figure: performance (log scale, 1 to 100,000) versus year, with one curve for CPU and one for memory]
SLIDE 5
How to go parallel?
- VLIW Processors
- Superscalar Processors
– Hyperthreading
- Multi-core
SLIDE 6
Very Long Instruction Word
[Figure: time axis; a processor with 8 functional units; each instruction carries 8 operations]
SLIDE 7
VLIW
- Advantages
– Easy to implement in hardware
- Several similar tiles
- Does not require much control logic
- Disadvantages
– Difficult to generate good code
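To illustrate why generating good VLIW code is hard, here is a minimal sketch of a greedy packer that groups operations into instructions of at most 8 operations. It is a toy model, not a real compiler pass: the function names, the unit-latency assumption, and the dependency graph are all hypothetical. When the code has long dependency chains, most of the 8 slots stay empty.

```python
from typing import Dict, List, Set


def pack_bundles(deps: Dict[str, Set[str]], width: int = 8) -> List[List[str]]:
    """Greedily pack operations into VLIW instructions (bundles).

    deps maps each operation to the set of operations it depends on.
    An operation can only go into a bundle that comes after the bundles
    holding all of its dependencies (unit latency is assumed).
    """
    done: Set[str] = set()
    remaining = list(deps)  # dict order is used as program order
    bundles: List[List[str]] = []
    while remaining:
        ready = [op for op in remaining if deps[op] <= done]
        if not ready:
            raise ValueError("cyclic or unknown dependency")
        bundle = ready[:width]  # at most `width` operations per instruction
        bundles.append(bundle)
        done.update(bundle)
        remaining = [op for op in remaining if op not in bundle]
    return bundles


def utilization(bundles: List[List[str]], width: int = 8) -> float:
    """Fraction of functional-unit slots that hold a real operation."""
    if not bundles:
        return 0.0
    return sum(len(b) for b in bundles) / (width * len(bundles))


# Hypothetical dependency graph: c needs a and b, d needs c, f needs d and e.
example = {
    "a": set(),
    "b": set(),
    "c": {"a", "b"},
    "d": {"c"},
    "e": set(),
    "f": {"d", "e"},
}

bundles = pack_bundles(example)
print(bundles)               # [['a', 'b', 'e'], ['c'], ['d'], ['f']]
print(utilization(bundles))  # 0.1875
```

In this example only 6 of the 32 available slots do useful work, which is the situation the compiler has to avoid by finding more independent operations.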
SLIDE 8
Multicore curiculum
8
Superscalar Processor
[Figure: time axis; a processor with 8 functional units; incoming instructions and a "?" mark]
SLIDE 9
Superscalar Processor
- Advantage
– Transparent to the software
– The processor is able to use dynamic information to find the parallelism
– Speculative code execution
- Disadvantage
– Cannot always find an instruction for each functional unit
– Detecting parallelism in hardware requires a lot of area
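The trade-off above can be sketched with a toy issue model. It assumes unit latency, and the function name, the `window` parameter (how many pending instructions the hardware inspects per cycle), and the sample program are all hypothetical. A larger window finds more parallelism, but in real hardware it costs more area.

```python
from typing import List, Set, Tuple

Instruction = Tuple[str, Set[str]]  # (name, names it depends on)


def cycles_to_issue(program: List[Instruction], width: int, window: int) -> int:
    """Count cycles needed to issue every instruction.

    Each cycle the hardware looks at the first `window` pending
    instructions (program order) and issues up to `width` of them whose
    dependencies were issued in an earlier cycle.
    """
    done: Set[str] = set()
    pending = list(program)
    cycles = 0
    while pending:
        candidates = pending[:window]
        issued = [name for name, deps in candidates if deps <= done][:width]
        if not issued:
            raise ValueError("no instruction can issue; check dependencies")
        done.update(issued)
        pending = [(n, d) for n, d in pending if n not in issued]
        cycles += 1
    return cycles


# Two independent dependency chains, written one after the other.
program = [
    ("a1", set()), ("a2", {"a1"}), ("a3", {"a2"}), ("a4", {"a3"}),
    ("b1", set()), ("b2", {"b1"}), ("b3", {"b2"}), ("b4", {"b3"}),
]

small = cycles_to_issue(program, width=4, window=2)
large = cycles_to_issue(program, width=4, window=8)
print(small, large)  # 7 4
```

Even with the large window, only 2 of the 4 functional units are busy each cycle, because the program contains just two independent chains.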
SLIDE 10
Hyperthreading Technology
[Figure: over time, the functional-unit usage of P1 (4 FU) plus that of P2 (4 FU) is merged into a single P1 + P2 schedule]
SLIDE 11
Hyperthreading Technology
- Requirements
– Duplicated (one copy per thread)
- Program counter
- Register banks
- Status registers
– Shared between the two threads
- Functional units
- Caches
SLIDE 12
Hyperthreading Technology
- Advantage
– Uses the available functional units to execute a second thread
– Capable of executing code during a stall of the other thread (cache miss, etc.)
- Disadvantage
– Threads usually need the same functional unit
– Two threads run at the same time, but the typical speedup is only about 30%
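As a worked example of what a 30% speedup means, the sketch below compares the time to finish two equal threads in three ways: one after the other, together on one hyperthreaded core, and on two separate cores. It reads the 30% figure as a 1.3x throughput gain over running the threads back to back, which is an assumption. The function name and the 10-second thread length are hypothetical.

```python
def time_for_two_threads(t_single: float, speedup: float) -> float:
    """Time to finish two equal threads, each taking t_single alone.

    `speedup` is the throughput gain relative to running the two
    threads one after the other (which takes 2 * t_single).
    """
    return 2 * t_single / speedup


t = 10.0  # hypothetical: each thread takes 10 s when run alone

sequential = time_for_two_threads(t, 1.0)      # no overlap at all
hyperthreaded = time_for_two_threads(t, 1.3)   # about 30% speedup
two_cores = time_for_two_threads(t, 2.0)       # ideal: one core per thread

print(sequential)               # 20.0
print(round(hyperthreaded, 2))  # 15.38
print(two_cores)                # 10.0
```

Under these assumptions the hyperthreaded core saves about 4.6 s out of 20 s, while a second full core would save 10 s.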
SLIDE 13
Chip Multiprocessing (CMP)
[Figure: 2 cores; Core 1 and Core 2 each have their own L1 cache and share one L2 cache]
[Figure: 4 cores; Core 1 to Core 4 each have their own L1 cache and share one L2 cache]
The L2 cache can also be split!
SLIDE 14
Pentium D Processor Diagram
SLIDE 15
Intel Dual Core Pentium
SLIDE 16
Intel Roadmap
SLIDE 17