Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
The Memory Hierarchy
CS140: Assembly Language and Computer Organization
Slides provided by: Randal E. Bryant and David R. O’Hallaron
Storage technologies and trends
Locality of reference
Caching in the memory hierarchy
Key features
RAM comes in two varieties: SRAM (static RAM) and DRAM (dynamic RAM).
DRAM and SRAM are volatile memories
Nonvolatile memories retain value even if powered off
Uses for Nonvolatile Memories
A bus is a collection of parallel wires that carry address, data, and control signals.
Buses are typically shared by multiple devices.
[Figure: CPU chip (register file, ALU, bus interface) connected by the system bus to the I/O bridge, which is connected by the memory bus to main memory]
Load operation: movq A, %rax
CPU places address A on the memory bus.
Main memory reads A from the memory bus, retrieves word x, and places it on the bus.
CPU reads word x from the bus and copies it into register %rax.
[Figure, three slides: address A and then word x travel between the bus interface, the I/O bridge, and main memory]
Store operation: movq %rax, A
[Figure, three slides: the register file holds y in %rax; the CPU chip (ALU, register file, bus interface) connects through the I/O bridge to main memory, where y is stored at address A]
Image courtesy of Seagate Technology
Disks consist of platters, each with two surfaces. Each surface consists of concentric rings called tracks. Each track consists of sectors separated by gaps.
[Figure: disk surface showing the spindle, tracks, track k, sectors, and gaps]
[Figure: multiple-platter view with platters 0 to 2 on a common spindle, surfaces 0 to 5, and cylinder k]
Capacity: maximum number of bits that can be stored.
Capacity is determined by these technology factors:
Modern disks partition tracks into disjoint subsets called recording zones.
The disk surface spins at a fixed rotational rate.
By moving radially, the arm can position the read/write head over any track.
The read/write head is attached to the end of the arm and flies over the disk surface on a thin cushion of air.
Read/write heads move in unison from cylinder to cylinder.
Average time to access some target sector is approximated by the sum of three components:
Seek time (Tavg seek)
Rotational latency (Tavg rotation)
Transfer time (Tavg transfer)
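The three components above can be combined in a small calculation. The sketch below is an illustration rather than part of the original slides: the function names are hypothetical, and it assumes the usual approximations that average rotational latency is half of one full rotation and that transferring one sector takes one rotation divided by the average number of sectors per track.

```c
#include <assert.h>

/* Average rotational latency in ms: half of one full rotation (assumption). */
double avg_rotation_ms(double rpm) {
    assert(rpm > 0.0);
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Average time in ms to transfer one sector: one full rotation divided by
 * the average number of sectors per track (assumption). */
double avg_transfer_ms(double rpm, double sectors_per_track) {
    assert(rpm > 0.0 && sectors_per_track > 0.0);
    return (60.0 / rpm) * 1000.0 / sectors_per_track;
}

/* Access time = seek time + rotational latency + transfer time, all in ms. */
double avg_access_ms(double seek_ms, double rpm, double sectors_per_track) {
    return seek_ms + avg_rotation_ms(rpm)
                   + avg_transfer_ms(rpm, sectors_per_track);
}
```

With assumed values of a 9 ms average seek, 7,200 RPM, and 400 sectors per track, avg_access_ms(9.0, 7200.0, 400.0) comes to roughly 13.19 ms. Seek time and rotational latency account for almost all of it; the transfer itself takes about 0.02 ms.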
Given:
Derived:
Important points:
Modern disks present a simpler abstract view of the
Mapping between logical blocks and actual (physical)
Allows controller to set aside spare cylinders for each
[Figure: I/O bus connecting the CPU chip (register file, ALU, bus interface), I/O bridge, and main memory to the USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk), and expansion slots for other devices such as network adapters]
[Figure, three slides: I/O bus diagram with the CPU chip (register file, ALU, bus interface), main memory, USB controller (mouse, keyboard), graphics adapter (monitor), and disk controller (disk)]
Pages: 512 B to 4 KB. Blocks: 32 to 128 pages.
Data is read and written in units of pages.
A page can be written only after its block has been erased.
A block wears out after about 100,000 repeated writes.
[Figure: solid state disk that receives requests to read and write logical disk blocks; its flash memory is organized as blocks, each holding pages 0 to P-1]
Sequential access faster than random access
Random writes are somewhat slower
Advantages
Disadvantages
Applications
[Figure: time in ns (log scale, 0.1 to 100,000,000) versus year (1985 to 2015) for disk seek time, SSD access time, DRAM access time, SRAM access time, CPU cycle time, and effective CPU cycle time]
Locality of reference
Principle of Locality: Programs tend to use data and instructions with addresses near or equal to those they have used recently.
Temporal locality:
Spatial locality:
Data references
Instruction references
sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;
Claim: Being able to look at code and get a qualitative
Question: Does this function have good locality with
Question: Does this function have good locality with
Question: Can you permute the loops so that the function
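The functions these questions refer to are not reproduced in this transcript. As a stand-in, the sketch below shows two hypothetical functions (the names and the array dimensions are assumptions) that sum the same 2-D array with the two possible loop orders. Because C stores arrays in row-major order, only the first order visits memory with a stride-1 pattern.

```c
#include <assert.h>

#define M 3   /* number of rows (hypothetical size) */
#define N 4   /* number of columns (hypothetical size) */

/* Row-wise traversal: the inner loop walks consecutive addresses
 * (stride-1), which gives good spatial locality. */
int sum_rows_first(int a[M][N]) {
    int sum = 0;
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Column-wise traversal: the inner loop jumps N elements at a time
 * (stride-N), so consecutive accesses are far apart in memory. */
int sum_cols_first(int a[M][N]) {
    int sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same sum; only the order of memory accesses differs. Permuting the loops so that the rightmost index changes fastest is what turns the second pattern into the first.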
Some fundamental and enduring properties of hardware
These fundamental properties complement each other
They suggest an approach for organizing memory and
Caching in the memory hierarchy
Top of the hierarchy: smaller, faster, and costlier (per byte) storage devices
CPU registers hold words retrieved from the L1 cache.
L1 cache holds cache lines retrieved from the L2 cache.
L2 cache holds cache lines retrieved from the L3 cache.
L3 cache holds cache lines retrieved from main memory.
Main memory holds disk blocks retrieved from local disks.
Local disks hold files retrieved from disks on remote servers.
Bottom of the hierarchy: larger, slower, and cheaper (per byte) storage devices
Cache: A smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device.
Fundamental idea of a memory hierarchy:
Why do memory hierarchies work?
Big Idea: The memory hierarchy creates a large pool of
Larger, slower, cheaper memory is viewed as partitioned into “blocks”.
Data is copied in block-sized transfer units.
Smaller, faster, more expensive memory caches a subset of the blocks.
Request: 14
Request: 12
Cold (compulsory) miss
Conflict miss
Capacity miss
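To make the miss types concrete, here is a minimal sketch of a four-slot cache. It is not taken from the slides: the type and function names are hypothetical, and the placement policy (block b may only live in slot b % 4) is an assumption chosen because it makes conflict misses easy to trigger.

```c
#include <assert.h>

#define NUM_SLOTS 4    /* hypothetical cache size, in blocks */
#define EMPTY (-1)     /* marks a slot that holds no block yet */

typedef struct {
    int slot[NUM_SLOTS];   /* block number held in each slot, or EMPTY */
    int hits;
    int misses;
} cache_t;

void cache_init(cache_t *c) {
    for (int i = 0; i < NUM_SLOTS; i++)
        c->slot[i] = EMPTY;
    c->hits = 0;
    c->misses = 0;
}

/* Request one block. Returns 1 on a hit and 0 on a miss.
 * Assumed placement policy: block b may only live in slot b % NUM_SLOTS. */
int cache_access(cache_t *c, int block) {
    assert(block >= 0);
    int s = block % NUM_SLOTS;
    if (c->slot[s] == block) {
        c->hits++;
        return 1;
    }
    c->slot[s] = block;    /* fetch from the level below, evicting any old block */
    c->misses++;
    return 0;
}
```

Under these assumptions, the first request for any block is a cold miss because the cache starts empty. Requesting blocks 0, 8, 0, 8 misses every time even though three slots stay unused: both blocks map to slot 0 and keep evicting each other, which is a conflict miss. A capacity miss would instead arise when the set of blocks in active use is larger than the four slots.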
Cache type            What is cached?       Where is it cached?   Latency (cycles)  Managed by
Registers             4-8 byte words        CPU core              0                 Compiler
TLB                   Address translations  On-Chip TLB           0                 Hardware MMU
L1 cache              64-byte blocks        On-Chip L1            4                 Hardware
L2 cache              64-byte blocks        On-Chip L2            10                Hardware
Virtual Memory        4-KB pages            Main memory           100               Hardware + OS
Buffer cache          Parts of files        Main memory           100               OS
Disk cache            Disk sectors          Disk controller       100,000           Disk firmware
Network buffer cache  Parts of files        Local disk            10,000,000        NFS client
Browser cache         Web pages             Local disk            10,000,000        Web browser
Web cache             Web pages             Remote server disks   1,000,000,000     Web proxy server
The speed gap between CPU, memory and mass storage continues to widen.
Well-written programs exhibit a property called locality.
Memory hierarchies based on caching close the gap by exploiting locality.
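d x w DRAM: d supercells of w bits each, for a total of d x w bits.
[Figure: 16 x 8 DRAM chip with supercells arranged in 4 rows and 4 columns (each numbered 0 to 3) and an internal row buffer; supercell (2,1) is highlighted. The memory controller (to/from CPU) connects to the chip through a 2-bit addr line and an 8-bit data line.]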
[Figure: the memory controller sends RAS = 2 over the 2-bit addr line; row 2 of the 16 x 8 DRAM chip is copied into the internal row buffer]
[Figure: the memory controller sends CAS = 1 over the 2-bit addr line; supercell (2,1) is read from the internal row buffer and returned to the memory controller over the 8-bit data line]
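The two-step read of supercell (2,1), with RAS = 2 followed by CAS = 1, can be modeled in a few lines. This is only a sketch of the idea for the 16 x 8 chip in the figures: the struct layout and function names are hypothetical, and real DRAM timing, refresh, and write-back are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define ROWS 4
#define COLS 4

typedef struct {
    uint8_t cells[ROWS][COLS];   /* 16 supercells of 8 bits each */
    uint8_t row_buffer[COLS];    /* internal row buffer */
} dram_t;

/* Step 1, RAS: copy the selected row into the internal row buffer. */
void dram_ras(dram_t *d, unsigned row) {
    assert(row < ROWS);
    for (unsigned j = 0; j < COLS; j++)
        d->row_buffer[j] = d->cells[row][j];
}

/* Step 2, CAS: return the selected supercell from the row buffer. */
uint8_t dram_cas(const dram_t *d, unsigned col) {
    assert(col < COLS);
    return d->row_buffer[col];
}

/* Read supercell (row, col) as two requests over the same address pins:
 * a RAS request followed by a CAS request. */
uint8_t dram_read(dram_t *d, unsigned row, unsigned col) {
    dram_ras(d, row);
    return dram_cas(d, col);
}
```

Calling dram_read(&d, 2, 1) mirrors the figures: row 2 lands in the row buffer, then column 1 selects the byte that goes back to the memory controller. Sending the row and column separately is why a 2-bit address line is enough for 16 supercells.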
[Figure: 64 MB memory module consisting of eight 8M x 8 DRAMs (DRAM 0 to DRAM 7). The memory controller sends addr (row = i, col = j) to the DRAMs, and each returns its supercell (i,j). DRAM 0 supplies bits 0-7, DRAM 1 bits 8-15, DRAM 2 bits 16-23, DRAM 3 bits 24-31, DRAM 4 bits 32-39, DRAM 5 bits 40-47, DRAM 6 bits 48-55, and DRAM 7 bits 56-63 of the 64-bit word at main memory address A.]
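The way eight 8-bit supercells form one 64-bit word can be written out directly. In the sketch below, the function name and the array-of-bytes interface are assumptions made for illustration; the bit assignment (DRAM k supplies bits 8k to 8k+7) follows the bit ranges listed above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DRAMS 8

/* Combine the eight 8-bit supercell outputs into one 64-bit word.
 * bytes[k] is the output of DRAM k and supplies bits 8k .. 8k+7. */
uint64_t assemble_word(const uint8_t bytes[NUM_DRAMS]) {
    uint64_t word = 0;
    for (int k = 0; k < NUM_DRAMS; k++)
        word |= (uint64_t)bytes[k] << (8 * k);
    return word;
}
```

For instance, if DRAM 0 through DRAM 7 return 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, and 0x88, the assembled word is 0x8877665544332211.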
Basic DRAM cell has not changed since its invention in 1966.
DRAM cores with better interface logic and faster I/O: