Chapter 4: Memory Management Part 1: Mechanisms for Managing Memory
Chapter 4
CS 1550, cs.pitt.edu (originally modified by Ethan
Memory management
- Basic memory management
- Swapping
- Virtual memory
- Page replacement algorithms
- Modeling page replacement algorithms
- Design issues for paging systems
- Implementation issues
- Segmentation
In an ideal world…
- The ideal world has memory that is
  - Very large
  - Very fast
  - Non-volatile (doesn’t go away when power is turned off)
- The real world has memory that is:
  - Very large
  - Very fast
  - Affordable!
  => Pick any two…
- Memory management goal: make the real world look as much like the ideal world as possible
Memory hierarchy
- What is the memory hierarchy?
  - Different levels of memory
  - Some are small & fast
  - Others are large & slow
- What levels are usually included?
  - Cache: small amount of fast, expensive memory
    - L1 (level 1) cache: usually on the CPU chip
    - L2 & L3 cache: off-chip, made of SRAM
  - Main memory: medium-speed, medium price memory (DRAM)
  - Disk: many gigabytes of slow, cheap, non-volatile storage
- Memory manager handles the memory hierarchy
Basic memory management
- Components include
  - Operating system (perhaps with device drivers)
  - Single process
- Goal: lay these out in memory
  - Memory protection may not be an issue (only one program)
  - Flexibility may still be useful (allow OS changes, etc.)
- No swapping or paging
[Figure: three single-process layouts of a memory whose top address is 0xFFFF: operating system in RAM with the user program in RAM; operating system in ROM with the user program in RAM; device drivers in ROM with the user program and the operating system in RAM.]
Fixed partitions: multiple programs
- Fixed memory partitions
  - Divide memory into fixed spaces
  - Assign a process to a space when it’s free
- Mechanisms
  - Separate input queues for each partition
  - Single input queue: better ability to optimize CPU usage
[Figure: memory divided into the OS and Partitions 1–4, with address marks at 100K, 500K, 600K, 700K, and 900K; drawn twice, once for each queueing mechanism.]
How many programs is enough?
- Several memory partitions (fixed or variable size)
- Lots of processes wanting to use the CPU
- Tradeoff
  - More processes utilize the CPU better
  - Fewer processes use less memory (cheaper!)
- How many processes do we need to keep the CPU fully utilized?
  - This will help determine how much memory we need
  - Is this still relevant with memory costing less than $1/GB?
Modeling multiprogramming
- More I/O wait means less processor utilization
  - At 20% I/O wait, 3–4 processes keep the CPU almost fully utilized
  - At 80% I/O wait, even 10 processes aren’t enough
- This means that the OS should have more processes if they’re I/O bound
- More processes => memory management & protection more important!
[Figure: CPU utilization (0.1 to 1) versus degree of multiprogramming (1 to 10), with one curve each for 80%, 50%, and 20% I/O wait.]
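The curves on this slide are consistent with the usual probabilistic model of multiprogramming: if every process waits for I/O a fraction p of the time, and the waits are assumed to be independent, the CPU is idle only when all n processes wait at once, so utilization is 1 - p^n. The sketch below is an illustration under that independence assumption; the function name is made up for this example.

```python
def cpu_utilization(io_wait: float, n_processes: int) -> float:
    """Fraction of time the CPU is busy, assuming independent I/O waits."""
    return 1.0 - io_wait ** n_processes


# Tabulate utilization for the three I/O wait fractions in the figure.
for p in (0.2, 0.5, 0.8):
    row = [round(cpu_utilization(p, n), 2) for n in (1, 2, 3, 4, 10)]
    print(f"{p:.0%} I/O wait, n = 1, 2, 3, 4, 10: {row}")
```

With 20% I/O wait, three processes already give about 99% utilization, while with 80% I/O wait ten processes still leave the CPU idle roughly 11% of the time.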
Multiprogrammed system performance
- Arrival and work requirements of 4 jobs
- CPU utilization for 1–4 jobs with 80% I/O wait
- Sequence of events as jobs arrive and finish
  - Numbers show amount of CPU time jobs get in each interval
  - More processes => better utilization, less time per process

Job   Arrival time   CPU needed
1     10:00          4
2     10:10          3
3     10:15          2
4     10:20          2

Jobs          1      2      3      4
CPU idle      0.80   0.64   0.51   0.41
CPU busy      0.20   0.36   0.49   0.59
CPU/process   0.20   0.18   0.16   0.15

[Figure: timeline of jobs 1–4, with time marks at 10, 15, 20, 22, 27.6, 28.2, and 31.7.]
Memory and multiprogramming
- Memory needs two things for multiprogramming
  - Relocation
  - Protection
- The OS cannot be certain where a program will be loaded in memory
  - Variables and procedures can’t use absolute locations in memory
  - Several ways to guarantee this
- The OS must keep processes’ memory separate
  - Protect a process’s memory from being read or modified by other processes
  - Protect a process from modifying its own memory in undesirable ways (such as writing to program code)
Base and limit registers
- Special CPU registers: base & limit
  - Access to the registers limited to system mode
  - Registers contain
    - Base: start of the process’s memory partition
    - Limit: length of the process’s memory partition
- Address generation
  - Physical address: location in actual memory
  - Logical address: location from the process’s point of view
  - Physical address = base + logical address
  - Logical address >= limit => error (valid logical addresses run from 0 to limit - 1)
[Figure: memory up to 0xFFFF holding the OS and the process partition, with Base = 0x9000 and Limit = 0x2000.]
Example: logical address 0x1204 => physical address 0x1204 + 0x9000 = 0xa204
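The base-and-limit check can be sketched in a few lines of Python. This is an illustration only, not how any particular CPU implements it: the function and exception names are hypothetical, and because the limit register holds the partition length, the sketch treats any logical address at or beyond the limit as an error.

```python
class ProtectionFault(Exception):
    """Raised when a logical address falls outside the process's partition."""


def translate(logical: int, base: int, limit: int) -> int:
    # The limit is the partition length, so valid addresses are 0 .. limit - 1.
    if not 0 <= logical < limit:
        raise ProtectionFault(
            f"logical address {logical:#x} is outside limit {limit:#x}"
        )
    return base + logical


# The slide's example: base 0x9000, limit 0x2000, logical address 0x1204.
print(hex(translate(0x1204, base=0x9000, limit=0x2000)))  # prints 0xa204
```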
Swapping
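- Memory allocation changes as
  - Processes come into memory
  - Processes leave memory
    - Swapped to disk
    - Complete execution
- Gray regions are unused memory
[Figure: seven snapshots of memory over time, each with the OS resident; the processes in memory are, in turn: A; A, B; A, B, C; B, C; B, C, D; C, D; C, D, A.]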
Swapping: leaving room to grow
- Need to allow for programs to grow
  - Allocate more memory for data
  - Larger stack
- Handled by allocating more space than is necessary at the start
  - Inefficient: wastes memory that’s not currently in use
  - What if the process requests too much memory?
[Figure: memory holding the OS, Process A, and Process B; each process has code, data, and stack areas, with room for A to grow and room for B to grow.]
Tracking memory usage: bitmaps
- Keep track of free / allocated memory regions with a bitmap
  - One bit in map corresponds to a fixed-size region of memory
  - Bitmap is a constant size for a given amount of memory regardless of how much is allocated at a particular time
- Chunk size determines efficiency
  - At 1 bit per 4KB chunk, we need just 256 bits (32 bytes) per MB of memory
  - For smaller chunks, we need more memory for the bitmap
  - Can be difficult to find large contiguous free areas in bitmap
[Figure: 32 memory regions (marks at 8, 16, 24, 32) holding processes A, B, C, and D, with the corresponding bitmap:]
11111100 00111000 01111111 11111000
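A minimal sketch of bitmap-based tracking, assuming one list element per bit for readability (a real allocator would pack the bits into words). The class and method names are hypothetical. It shows why finding a contiguous free area needs a linear scan, and it reproduces the 32-bytes-per-MB figure.

```python
def bitmap_bytes(memory_bytes, chunk_bytes):
    """Bytes of bitmap needed at one bit per chunk."""
    return memory_bytes // chunk_bytes // 8


class BitmapAllocator:
    def __init__(self, bits):
        # bits is a string such as "1100"; 1 = allocated, 0 = free.
        self.bits = [int(c) for c in bits if c in "01"]

    def find_free_run(self, length):
        """Return the index of the first run of `length` free chunks, or None."""
        run_start, run_len = 0, 0
        for i, bit in enumerate(self.bits):
            if bit:
                run_len = 0            # an allocated chunk breaks the run
            else:
                if run_len == 0:
                    run_start = i
                run_len += 1
                if run_len == length:
                    return run_start
        return None

    def allocate(self, length):
        start = self.find_free_run(length)
        if start is not None:
            self.bits[start:start + length] = [1] * length
        return start

    def free(self, start, length):
        self.bits[start:start + length] = [0] * length


print(bitmap_bytes(1 << 20, 4 << 10))   # prints 32
alloc = BitmapAllocator("11111100 00111000 01111111 11111000")
print(alloc.find_free_run(4))           # prints 6
print(alloc.find_free_run(5))           # prints None
```

In the slide's bitmap the longest free run is four chunks, so a request for five fails even though eleven chunks are free in total.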
Tracking memory usage: linked lists
- Keep track of free / allocated memory regions with a linked list
  - Each entry in the list corresponds to a contiguous region of memory
  - Entry can indicate either allocated or free (and, optionally, owning process)
  - May have separate lists for free and allocated areas
- Efficient if chunks are large
  - Fixed-size representation for each region
  - More regions => more space needed for free lists
[Figure: 32 memory regions (marks at 8, 16, 24, 32) holding A, B, C, and D, described by a list of owner, start, and length entries, where "-" marks a free region:]
Owner   Start   Length
A       0       6
-       6       4
B       10      3
-       13      4
C       17      9
D       26      3
-       29      3
Allocating memory
- Search through region list to find a large enough space
- Suppose there are several choices: which one to use?
  - First fit: the first suitable hole on the list
  - Next fit: the first suitable hole after the previously allocated hole
  - Best fit: the smallest hole that is at least as large as the desired region (wastes least space?)
  - Worst fit: the largest available hole (leaves largest fragment)
- Option: maintain separate queues for different-size holes
[Figure: a free list of holes, given as start and length:]
Start   Length
6       5
19      14
52      25
102     30
135     16
202     10
302     20
350     30
411     19
510     3
Successive allocations from this list, with the fragment left in the chosen hole:
- Allocate 20 blocks first fit: 5 left
- Allocate 12 blocks next fit: 18 left
- Allocate 13 blocks best fit: 1 left
- Allocate 15 blocks worst fit: 15 left
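The four placement policies can be sketched over a list of (start, length) holes. This is an illustrative sketch, not a production allocator: all function names are hypothetical, allocations are carved from the front of the chosen hole, and holes that shrink to length zero are left in the list for simplicity. The demo replays the slide's four allocations in order.

```python
def first_fit(holes, size):
    """Index of the first hole with length >= size, or None."""
    for i, (_, length) in enumerate(holes):
        if length >= size:
            return i
    return None


def next_fit(holes, size, last_index):
    """Like first fit, but start scanning at the most recently used hole."""
    n = len(holes)
    for k in range(n):
        i = (last_index + k) % n       # wrap around the end of the list
        if holes[i][1] >= size:
            return i
    return None


def best_fit(holes, size):
    """Index of the smallest hole that is large enough, or None."""
    fits = [i for i, (_, length) in enumerate(holes) if length >= size]
    return min(fits, key=lambda i: holes[i][1], default=None)


def worst_fit(holes, size):
    """Index of the largest hole that is large enough, or None."""
    fits = [i for i, (_, length) in enumerate(holes) if length >= size]
    return max(fits, key=lambda i: holes[i][1], default=None)


def carve(holes, i, size):
    """Take `size` blocks from the front of hole i; return the leftover length."""
    start, length = holes[i]
    holes[i] = (start + size, length - size)
    return length - size


holes = [(6, 5), (19, 14), (52, 25), (102, 30), (135, 16),
         (202, 10), (302, 20), (350, 30), (411, 19), (510, 3)]
leftovers = []
i = first_fit(holes, 20)
leftovers.append(carve(holes, i, 20))
leftovers.append(carve(holes, next_fit(holes, 12, last_index=i), 12))
leftovers.append(carve(holes, best_fit(holes, 13), 13))
leftovers.append(carve(holes, worst_fit(holes, 15), 15))
print(leftovers)  # prints [5, 18, 1, 15]
```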
Freeing memory
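- Allocation structures must be updated when memory is freed
- Easy with bitmaps: just clear the appropriate bits in the bitmap
- Linked lists: modify adjacent elements as needed
  - Merge adjacent free regions into a single region
  - May involve merging two regions with the just-freed area
[Figure: the four cases when region X is freed: between allocated regions A and B; after A with a free region following; before B with a free region preceding; with free regions on both sides.]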
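Merging on free can be sketched as follows, assuming the region list is kept sorted by start address and stored as a Python list of [start, length, owner] entries (owner None meaning free). The representation and the function name are assumptions made for this example; the region data is illustrative.

```python
def free_region(regions, index):
    """Mark regions[index] free and merge it with any free neighbors."""
    regions[index][2] = None
    # Merge with the following region first so that `index` stays valid.
    if index + 1 < len(regions) and regions[index + 1][2] is None:
        regions[index][1] += regions[index + 1][1]
        del regions[index + 1]
    # Then fold the (possibly enlarged) region into a free predecessor.
    if index > 0 and regions[index - 1][2] is None:
        regions[index - 1][1] += regions[index][1]
        del regions[index]
    return regions


regions = [[0, 6, "A"], [6, 4, None], [10, 3, "B"], [13, 4, None],
           [17, 9, "C"], [26, 3, "D"], [29, 3, None]]
free_region(regions, 2)   # free B, which has a free region on both sides
print(regions[:3])        # prints [[0, 6, 'A'], [6, 11, None], [17, 9, 'C']]
```

Freeing B here is the case with free regions on both sides: three entries collapse into a single free region of length 11.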
Limitations of swapping
- Problems with swapping
  - Process must fit into physical memory (impossible to run larger processes)
  - Memory becomes fragmented
    - External fragmentation: lots of small free areas
    - Compaction needed to reassemble larger free areas
  - Processes are either in memory or on disk: half and half doesn’t do any good
- Overlays solved the first problem
  - Bring in pieces of the process over time (typically data)
  - Still doesn’t solve the problem of fragmentation or partially resident processes
Virtual memory
- Basic idea: allow the OS to hand out more memory than exists on the system
  - Keep recently used stuff in physical memory
  - Move less recently used stuff to disk
  - Keep all of this hidden from processes
    - Processes still see an address space from 0 – max address
    - Movement of information to and from disk handled by the OS without process help
- Virtual memory (VM) especially helpful in multiprogrammed system
  - CPU schedules process B while process A waits for its memory to be retrieved from disk
Virtual and physical addresses
- Program uses virtual addresses
  - Addresses local to the process
  - Hardware translates virtual address to physical address
- Translation done by the Memory Management Unit
  - Usually on the same chip as the CPU
  - Only physical addresses leave the CPU/MMU chip
- Physical memory indexed by physical addresses
[Figure: the CPU and MMU sit together on the CPU chip; virtual addresses go from the CPU to the MMU, and physical addresses appear on the bus and in memory, which is shared with the disk controller.]
Paging and page tables
- Virtual addresses mapped to physical addresses
  - Unit of mapping is called a page
  - All addresses in the same virtual page are in the same physical page
  - Page table entry (PTE) contains translation for a single page
- Table translates virtual page number to physical page number
  - Not all virtual memory has a physical page
  - Not every physical page need be used
- Example:
  - 64 KB virtual memory
  - 32 KB physical memory
[Figure: a 64 KB virtual address space (sixteen 4 KB pages, 0–4K through 60–64K) mapped onto 32 KB of physical memory (eight 4 KB page frames, 0–4K through 28–32K); each virtual page is labeled with its frame number, or "-" if it has no physical page.]
What’s in a page table entry?
- Each entry in the page table contains
  - Valid bit: set if this logical page number has a corresponding physical frame in memory
    - If not valid, remainder of PTE is irrelevant
  - Page frame number: page in physical memory
  - Referenced bit: set if data on the page has been accessed
  - Dirty (modified) bit: set if data on the page has been modified
  - Protection information
[Figure: PTE layout with fields for protection, D (dirty bit), R (referenced bit), V (valid bit), and page frame number.]
Mapping logical => physical address
- Split address from CPU into two pieces
  - Page number (p)
  - Page offset (d)
- Page number
  - Index into page table
  - Page table contains base address of page in physical memory
- Page offset
  - Added to base address to get actual physical memory address
- Page size = 2^d bytes
Example:
- 4 KB (= 4096 byte) pages
- 32 bit logical addresses
- 2^d = 4096, so d = 12: the 32 bit logical address splits into a 20 bit (32 - 12) page number p and a 12 bit page offset d
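The split into page number and offset is a shift and a mask. The sketch below uses the slide's 4 KB pages; the dict-based page table, the function names, and the sample mapping are assumptions made for illustration only.

```python
PAGE_SIZE = 4096                            # 4 KB pages, as in the example
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # d = 12, since 2**12 == 4096


def split(logical):
    """Return (page number p, page offset d) for a logical address."""
    return logical >> OFFSET_BITS, logical & (PAGE_SIZE - 1)


def translate(logical, page_table):
    """page_table maps page number -> frame number (hypothetical dict)."""
    p, d = split(logical)
    frame = page_table[p]    # a KeyError here stands in for a page fault
    return (frame << OFFSET_BITS) | d


p, d = split(0x12345678)
print(hex(p), hex(d))                       # prints 0x12345 0x678
print(hex(translate(0x3ABC, {3: 7})))       # prints 0x7abc
```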
Address translation architecture
[Figure: the CPU issues a logical address made of page number p and page offset d; p indexes the page table to obtain page frame number f, and f together with d forms the physical address used to access physical memory.]
Memory & paging structures
[Figure: logical memory and page tables for two processes, P0 (pages 0–4) and P1 (pages 0–1), mapped onto numbered page frames of physical memory; the frames hold Page 3 (P0), Page 0 (P1), Page 0 (P0), Page 2 (P0), Page 1 (P0), Page 4 (P0), and Page 1 (P1), and the remaining frames are free pages.]
Two-level page tables
- Problem: page tables can be too large
  - 2^32 bytes in 4KB pages need 1 million PTEs
- Solution: use multi-level page tables
  - “Page size” in first page table is large (megabytes)
  - PTE marked invalid in first page table needs no 2nd level page table
- 1st level page table has pointers to 2nd level page tables
- 2nd level page table has actual physical page numbers in it
[Figure: a 1st level page table whose entries point to 2nd level page tables; the 2nd level entries hold page frame numbers (such as 884, 960, and 955) that locate pages in main memory.]
More on two-level page tables
- Tradeoffs between 1st and 2nd level page table sizes
  - Total number of bits indexing 1st and 2nd level table is constant for a given page size and logical address length
  - Tradeoff between number of bits indexing 1st and number indexing 2nd level tables
    - More bits in 1st level: fine granularity at 2nd level
    - Fewer bits in 1st level: maybe less wasted space?
- All addresses in table are physical addresses
- Protection bits kept in 2nd level table
Two-level paging: example
- System characteristics
  - 8 KB pages
  - 32-bit logical address divided into 13 bit page offset, 19 bit page number
- Page number divided into:
  - 10 bit index into the 1st level page table (p1)
  - 9 bit index into the 2nd level page table (p2)
- Logical address looks like this:
  page number: p1 = 10 bits, p2 = 9 bits | page offset: offset = 13 bits
  - p1 is an index into the 1st level page table
  - p2 is an index into the 2nd level page table pointed to by p1
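A sketch of the two-level lookup with this example's 10/9/13 bit split. Nested Python dicts stand in for the page tables, which is purely an assumption for illustration; real tables are arrays in physical memory, and the names below are hypothetical.

```python
P1_BITS, P2_BITS, OFFSET_BITS = 10, 9, 13   # field widths from the example


def split_two_level(logical):
    """Return (p1, p2, offset) for a 32-bit logical address."""
    offset = logical & ((1 << OFFSET_BITS) - 1)
    p2 = (logical >> OFFSET_BITS) & ((1 << P2_BITS) - 1)
    p1 = (logical >> (OFFSET_BITS + P2_BITS)) & ((1 << P1_BITS) - 1)
    return p1, p2, offset


def translate(logical, first_level):
    """first_level maps p1 -> {p2 -> frame number} (hypothetical layout)."""
    p1, p2, offset = split_two_level(logical)
    second_level = first_level[p1]   # missing key: no 2nd level table exists
    frame = second_level[p2]         # missing key: page is not mapped
    return (frame << OFFSET_BITS) | offset


address = (5 << 22) | (3 << 13) | 0x1FF
print(split_two_level(address))               # prints (5, 3, 511)
print(translate(address, {5: {3: 42}}) == (42 << 13) | 0x1FF)  # prints True
```

Only the 2nd level tables that are actually referenced by a valid 1st level entry need to exist, which is where the space saving comes from.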
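2-level address translation example
[Figure: the logical address (p1 = 10 bits, p2 = 9 bits, offset = 13 bits) is translated in two steps: the page table base locates the 1st level page table, p1 selects the entry pointing to a 2nd level page table, and p2 selects the entry holding the 19-bit frame number, which is combined with the 13-bit offset to form the physical address in main memory.]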
Implementing page tables in hardware
- Page table resides in main (physical) memory
- CPU uses special registers for paging
  - Page table base register (PTBR) points to the page table
  - Page table length register (PTLR) contains length of page table: restricts maximum legal logical address
- Translating an address requires two memory accesses
  - First access reads page table entry (PTE)
  - Second access reads the data / instruction from memory
- Reduce number of memory accesses
  - Can’t avoid second access (we need the value from memory)
  - Eliminate first access by keeping a hardware cache (called a translation lookaside buffer or TLB) of recently used page table entries
Translation Lookaside Buffer (TLB)
- Search the TLB for the desired logical page number
  - Search entries in parallel
  - Use standard cache techniques
- If desired logical page number is found, get frame number from TLB
- If desired logical page number isn’t found
  - Get frame number from page table in memory
  - Replace an entry in the TLB with the logical & physical page numbers from this reference
[Figure: example TLB, a table pairing each logical page # with a physical frame #; the logical page column reads 8, unused, 2, 3, 12, 29, 22, 7.]
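The lookup-and-replace behavior can be simulated in software. This sketch assumes a fully associative TLB with least-recently-used replacement; the slide only says "standard cache techniques", so LRU is an assumption, as are the class name and the dict that stands in for the in-memory page table. Hardware searches all entries in parallel, which the dictionary lookup merely imitates.

```python
from collections import OrderedDict


class TLB:
    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table      # page -> frame, the in-memory table
        self.entries = OrderedDict()      # page -> frame, least recent first
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)      # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]           # extra memory access on a miss
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        self.entries[page] = frame
        return frame


tlb = TLB(capacity=2, page_table={0: 5, 1: 9, 2: 3})
frames = [tlb.lookup(page) for page in (0, 1, 0, 2, 1)]
print(frames, tlb.hits, tlb.misses)   # prints [5, 9, 5, 3, 9] 1 4
```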
Handling TLB misses
- If PTE isn’t found in TLB, it must be looked up in the page table
- Lookup can be done in hardware or software
- Hardware TLB replacement
  - CPU hardware does page table lookup
  - Can be faster than software
  - Less flexible than software, and more complex hardware
- Software TLB replacement
  - OS gets TLB exception
  - Exception handler does page table lookup & places the result into the TLB
  - Program continues after return from exception
  - Larger TLB (lower miss rate) can make this feasible
How long do memory accesses take?
- Assume the following times:
  - TLB lookup time = a (often zero: overlapped in CPU)
  - Memory access time = m
- Hit ratio (h) is the fraction of references for which the logical page number is found in the TLB
  - Larger TLB usually means higher h
  - TLB structure can affect h as well
- Effective access time (an average) is calculated as:
  - EAT = (m + a)h + (m + m + a)(1 - h)
  - EAT = a + (2 - h)m
- Interpretation
  - Reference always requires TLB lookup, 1 memory access
  - TLB misses also require an additional memory reference
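A quick numeric check of the effective access time formula, written as a small Python function. The timings used below (100 ns memory access, 98% hit ratio, zero TLB lookup time) are illustrative assumptions, not figures from the slides.

```python
def effective_access_time(m, h, a=0.0):
    """EAT = (m + a)h + (2m + a)(1 - h), which simplifies to a + (2 - h)m."""
    return (m + a) * h + (2 * m + a) * (1 - h)


m, h = 100.0, 0.98          # assumed: 100 ns memory access, 98% TLB hit ratio
eat = effective_access_time(m, h)
print(f"{eat:.1f} ns")      # prints 102.0 ns

# The expanded and simplified forms agree (up to floating-point rounding).
print(abs(eat - (2 - h) * m) < 1e-9)   # prints True
```

With these numbers a 2% miss rate costs only 2 ns on average, which is why even a small TLB recovers most of the cost of the extra page table access.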
Inverted page table
- Reduce page table size further: keep one entry for each frame in memory
- PTE contains
  - Virtual address pointing to this frame
  - Information about the process that owns this page
- Search page table by
  - Hashing the virtual page number and process ID
  - Starting at the entry corresponding to the hash result
  - Search until either the entry is found or a limit is reached
- Page frame number is index of PTE
- Improve performance by using more advanced hashing algorithms
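The search procedure can be sketched as a hash table with linear probing in which the index of an entry is the frame number. This is a minimal sketch: the class, its method names, the use of Python's built-in hash, and linear probing are all assumptions made for illustration rather than a description of any real MMU.

```python
class InvertedPageTable:
    def __init__(self, n_frames, max_probes=None):
        # Entry i describes frame i: a (pid, virtual page number) pair or None.
        self.entries = [None] * n_frames
        self.max_probes = max_probes or n_frames

    def _probe(self, pid, vpn):
        """Yield candidate frames, starting at the hash of (pid, vpn)."""
        start = hash((pid, vpn)) % len(self.entries)
        for k in range(self.max_probes):
            yield (start + k) % len(self.entries)

    def insert(self, pid, vpn):
        """Claim a free frame for (pid, vpn); return it, or None if none found."""
        for frame in self._probe(pid, vpn):
            if self.entries[frame] is None:
                self.entries[frame] = (pid, vpn)
                return frame
        return None

    def lookup(self, pid, vpn):
        """Search until the entry is found or the probe limit is reached."""
        for frame in self._probe(pid, vpn):
            if self.entries[frame] == (pid, vpn):
                return frame    # the frame number is the index of the entry
        return None


ipt = InvertedPageTable(n_frames=8)
frame = ipt.insert(pid=1, vpn=7)
print(ipt.lookup(1, 7) == frame)   # prints True
print(ipt.lookup(2, 7))            # prints None
```

The table has one entry per physical frame regardless of how many processes exist or how large their address spaces are, at the price of a search on every lookup.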
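Inverted page table architecture
[Figure: a logical address made of a process ID, a 19 bit page number p, and a 13 bit offset is looked up in an inverted page table whose entries are tagged with process IDs (pid0, pid1, …, pidk).]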