Main Memory
Prof. Bracy and Van Renesse
CS 4410 Cornell University
Based on slides designed by Prof. Sirer
Agenda
– Review
– Address Translation
– Caching and Virtual Memory
Virtualizing Resources
Physical Reality: different
Source program:
data1:   dw 32
         …
start:   lw r1,0(data1)
         jal checkit
loop:    addi r1, r1, -1
         bnz r1,r0, loop
         …
checkit: …

Memory image (address, contents):
0x300  00000020
…
0x900  8C2000C0
0x904  0C000340
0x908  2021FFFF
0x90C  1420FFFF
…
0xD00  …
– Compile time (gcc)
– Link/Load time (unix “ld” does link)
– Execution time (dynamic libs)
– depends on hardware & OS
– Linking postponed until execution
– Small piece of code (stub) used to locate the appropriate memory-resident library routine
– OS checks if the routine is in the process’s memory address space
– Stub replaces itself with the address of the routine
0x00000000 0xFFFFFFFF Application Operating System Valid 32-bit Addresses
0x00000000 0xFFFFFFFF Application1 Operating System Application2 0x00020000
0x00000000 0xFFFFFFFF Application1 Operating System Application2 0x00020000 Base=0x20000 Limit=0x10000
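The Base/Limit pair shown above can be illustrated with a minimal sketch of base-and-limit relocation. The struct and function names (`base_limit`, `bl_translate`) are hypothetical and chosen only for this example; real hardware performs the check in the MMU and raises a fault rather than returning a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical relocation registers for one process. */
struct base_limit {
    uint32_t base;   /* physical address where the process's region starts */
    uint32_t limit;  /* size of the region in bytes */
};

/* Translate a process-relative address to a physical address.
 * Returns false when the address lies outside the region
 * (real hardware would raise a protection fault instead). */
bool bl_translate(const struct base_limit *r, uint32_t vaddr, uint32_t *paddr)
{
    if (vaddr >= r->limit)
        return false;
    *paddr = r->base + vaddr;
    return true;
}
```

With Base=0x20000 and Limit=0x10000, address 0x1234 would map to 0x21234, while 0x10000 would be rejected.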
places in user’s address space
segments (called “pages”)
Registers
1 cycle 4-36 cycles 50-70 ns 5-20 ms
Memory snapshots over time:
OS | process 5 | process 8 | process 2
OS | process 5 | process 2
OS | process 5 | process 2 | process 9
OS | process 5 | process 9 | process 2 | process 10
Code Non-zero Init’d Data Zero Init’d Data + Heap Stack Code Non-zero Init’d Data Zero Init’d Data + Heap Stack Device Registers Kernel User Virtual Address Space
Physical Memory Processor’s View
Code Data Heap 1 Code 1 Heap Data 1 Heap 2 Stack 1 Stack 0 Code Data Heap Stack VPage 0 VPage 1 VPage N Frame 0 Frame M
Virtual address layout: page number p (m − n bits), then page offset d (n bits)
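struct PTE {
    int frame;                            /* physical frame number */
    unsigned is_valid : 1, is_dirty : 1;  /* , … */
};
struct PTE page_table[NUM_VIRTUAL_PAGES];

/* Return the frame that holds virtual page vpn. */
int translate(int vpn) {
    if (page_table[vpn].is_valid)
        return page_table[vpn].frame;
    else …  /* invalid entry: handling elided in the source */
}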
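To connect the page-number/offset split with the table lookup, here is a small self-contained sketch of a full virtual-to-physical translation. The 4 KiB page size, the 16-entry table, the `TRANSLATE_FAULT` marker, and the `demo_*` names are assumptions made for this example and are not taken from the slides.

```c
#include <stdint.h>

#define OFFSET_BITS       12u         /* assumed: 4 KiB pages */
#define NUM_DEMO_PAGES    16u         /* assumed: tiny virtual address space */
#define TRANSLATE_FAULT   UINT32_MAX  /* hypothetical "page fault" marker */

struct demo_pte {
    uint32_t frame;          /* physical frame number */
    unsigned is_valid : 1;   /* 1 if the mapping is present */
};

static struct demo_pte demo_table[NUM_DEMO_PAGES];

/* Install a mapping from virtual page vpn to physical frame. */
void demo_map(uint32_t vpn, uint32_t frame)
{
    demo_table[vpn].frame = frame;
    demo_table[vpn].is_valid = 1;
}

/* Split the address, look up the frame, and glue the offset back on. */
uint32_t demo_translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> OFFSET_BITS;
    uint32_t offset = vaddr & ((1u << OFFSET_BITS) - 1u);

    if (vpn >= NUM_DEMO_PAGES || !demo_table[vpn].is_valid)
        return TRANSLATE_FAULT;
    return (demo_table[vpn].frame << OFFSET_BITS) | offset;
}
```

The offset passes through unchanged; only the page number is replaced by a frame number.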
Frame Access
Physical Memory
Page Table Processor Frame 0 Frame 1 Frame M Page # Offset Virtual Address Page # Offset Virtual Address Frame Offset Physical Address Frame Offset Physical Address
32-byte memory 4-byte frames
Before allocation After allocation
Two-level virtual address layout: page number (p1, p2), then page offset d
p1 (12 bits) | p2 (10 bits) | d (10 bits)
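As a rough illustration of how a two-level address is taken apart, the sketch below extracts the outer index, inner index, and offset with shifts and masks. The field widths follow the 12, 10, 10 figures in the order printed on the slide, which is an assumption about how they pair with p1, p2, and d; the names `two_level_index` and `split_address` are hypothetical.

```c
#include <stdint.h>

/* Assumed widths, read off the slide in printed order. */
#define P1_BITS 12u   /* index into the outer page table */
#define P2_BITS 10u   /* index into the inner page table */
#define D_BITS  10u   /* offset within the page */

struct two_level_index {
    uint32_t p1, p2, d;
};

/* Split a 32-bit virtual address into its three fields. */
struct two_level_index split_address(uint32_t vaddr)
{
    struct two_level_index ix;
    ix.d  = vaddr & ((1u << D_BITS) - 1u);
    ix.p2 = (vaddr >> D_BITS) & ((1u << P2_BITS) - 1u);
    ix.p1 = (vaddr >> (D_BITS + P2_BITS)) & ((1u << P1_BITS) - 1u);
    return ix;
}
```

Changing the three macros is enough to model a different split, such as 10/10/12 with 4 KiB pages.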
– And replace an existing entry
– But which? (stay tuned)
– Throws exceptions on illegal accesses
– Often also tracks R/W/X accesses
– Shared code (typically) must appear in same location in the logical address space of all processes
– Particularly useful for libraries
Physical memory P1 virtual memory
R/W
P2 virtual memory
R → R/W
– Loads entire process in memory, runs it, exits
– Is slow (for big, long-lived processes)
– Is wasteful (might not require everything)
– Runs all processes concurrently, taking only pieces of memory (specifically, pages) away from each process
– Finer granularity, higher performance
– Paging completes the separation between logical memory and physical memory: a large virtual memory can be provided
– Allocate space and initialize page table for program and data
– Allocate and initialize swap area
– Info about PT and “swap space” is recorded in process table
– Reset MMU for new process
– Flush the TLB
– Bring the process’s pages into memory
– Release pages
– Check to see if a frame with the code or data already exists
– If not, allocate a frame and read content from executable file
– Map page for R/X only
– Return from interrupt
– Check to see if a frame with the code or data already exists
– If not, allocate a frame and read data from executable file
– Map page for R/W access
– Return from interrupt
– Allocate a frame and fill page with zero bytes
– Map page for R/W access
– Return from interrupt
– Select a frame and deallocate it
– Unmap any pages that map to this frame
– If the frame is “dirty” (modified), save it on disk so it can be restored later if needed
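The three eviction steps above can be sketched as a single routine. This is only an illustration under simplifying assumptions: each frame is mapped by at most one virtual page, the page table is reduced to an array of valid bits, and the write to disk is represented by a boolean return value. The names `frame_info`, `evict_frame`, and `NO_PAGE` are hypothetical.

```c
#include <stdbool.h>

#define NO_PAGE (-1)

/* Hypothetical per-frame bookkeeping. */
struct frame_info {
    int  mapped_vpn;  /* virtual page mapped to this frame, or NO_PAGE */
    bool dirty;       /* true if the frame was modified since it was loaded */
};

/* Deallocate the selected frame.
 * valid_bits[vpn] stands in for the valid bit of each page table entry.
 * Returns true when the caller must save the contents to disk first. */
bool evict_frame(struct frame_info *f, bool *valid_bits)
{
    bool needs_writeback = false;

    if (f->mapped_vpn != NO_PAGE) {
        valid_bits[f->mapped_vpn] = false;  /* unmap the page */
        needs_writeback = f->dirty;         /* dirty frames must be saved */
    }
    f->mapped_vpn = NO_PAGE;
    f->dirty = false;
    return needs_writeback;
}
```

A clean frame can simply be reused, which is why tracking the dirty bit saves disk writes.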
back from the program image stored on disk
– Used mainly for comparison
– Ignores usage
– Select page not used for longest time
– Past could be a good predictor of the future
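Reference string: 1 2 3 4 1 2 5 1 2 3 4 5 (3 frames)
reference   contents of frames at time of reference   result
1           1                                         page fault
2           2 1                                       page fault
3           3 2 1                                     page fault
4           3 2 4                                     page fault
1           3 1 4                                     page fault
2           2 1 4                                     page fault
5           2 1 5                                     page fault
1           2 1 5                                     hit
2           2 1 5                                     hit
3           2 3 5                                     page fault
4           4 3 5                                     page fault
5           4 3 5                                     hit
Legend: page fault / hit; the mark on a frame shows arrival time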
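Reference string: 1 2 3 4 1 2 5 1 2 3 4 5 (4 frames)
reference   contents of frames at time of reference   result
1           1                                         page fault
2           2 1                                       page fault
3           3 2 1                                     page fault
4           4 3 2 1                                   page fault
1           4 3 2 1                                   hit
2           4 3 2 1                                   hit
5           4 3 2 5                                   page fault
1           4 3 1 5                                   page fault
2           4 2 1 5                                   page fault
3           3 2 1 5                                   page fault
4           3 2 1 4                                   page fault
5           3 2 5 4                                   page fault
Legend: page fault / hit; the mark on a frame shows arrival time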
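The two traces above evict the page that arrived earliest, which is consistent with FIFO replacement. The sketch below counts FIFO page faults for a reference string so the 3-frame and 4-frame cases can be compared. The function name `fifo_faults` and the `MAX_FRAMES` bound are assumptions made for this example.

```c
#include <stddef.h>

#define MAX_FRAMES 16

/* Count page faults under FIFO replacement.
 * Returns -1 if nframes is outside 1..MAX_FRAMES. */
int fifo_faults(const int *refs, size_t n, int nframes)
{
    int frames[MAX_FRAMES];
    int used = 0;    /* frames filled so far */
    int next = 0;    /* index of the oldest page, evicted next */
    int faults = 0;

    if (nframes <= 0 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++) {
            if (frames[j] == refs[i]) { hit = 1; break; }
        }
        if (hit)
            continue;

        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];      /* free frame available */
        } else {
            frames[next] = refs[i];        /* replace the oldest arrival */
            next = (next + 1) % nframes;
        }
    }
    return faults;
}
```

On the reference string 1 2 3 4 1 2 5 1 2 3 4 5 this counts 9 faults with 3 frames and 10 faults with 4 frames, so adding a frame makes FIFO worse here (Belady's anomaly).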
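Reference string: 1 2 3 4 1 2 5 1 2 3 4 5 (4 frames)
reference   contents of frames at time of reference   result
1           1                                         page fault
2           2 1                                       page fault
3           3 2 1                                     page fault
4           4 3 2 1                                   page fault
1           4 3 2 1                                   hit
2           4 3 2 1                                   hit
5           5 3 2 1                                   page fault
1           5 3 2 1                                   hit
2           5 3 2 1                                   hit
3           5 3 2 1                                   hit
4           5 3 2 4                                   page fault
5           5 3 2 4                                   hit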
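The trace above matches what an optimal policy would produce: on a fault with no free frame, it evicts the page whose next use lies farthest in the future. The following sketch counts faults under that rule. It needs the whole reference string in advance, which is why such a policy serves only as a yardstick. The name `opt_faults` is hypothetical, and ties between pages that are never used again are broken by lowest frame index, which is an arbitrary choice.

```c
#include <stddef.h>

#define MAX_FRAMES 16

/* Count page faults when evicting the page used farthest in the future.
 * Returns -1 if nframes is outside 1..MAX_FRAMES. */
int opt_faults(const int *refs, size_t n, int nframes)
{
    int frames[MAX_FRAMES];
    int used = 0;
    int faults = 0;

    if (nframes <= 0 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++) {
            if (frames[j] == refs[i]) { hit = 1; break; }
        }
        if (hit)
            continue;

        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];
            continue;
        }

        /* Find the resident page whose next reference is latest;
         * k == n means the page is never referenced again. */
        int victim = 0;
        size_t farthest = 0;
        for (int j = 0; j < used; j++) {
            size_t k = i + 1;
            while (k < n && refs[k] != frames[j])
                k++;
            if (k > farthest) { farthest = k; victim = j; }
        }
        frames[victim] = refs[i];
    }
    return faults;
}
```

On the reference string 1 2 3 4 1 2 5 1 2 3 4 5 with 4 frames this counts 6 faults, the same number as in the trace.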
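Reference string: 1 2 3 4 1 2 5 1 2 3 4 5 (4 frames)
reference   contents of frames at time of reference   result
1           1                                         page fault
2           2 1                                       page fault
3           3 2 1                                     page fault
4           4 3 2 1                                   page fault
1           4 3 2 1                                   hit
2           4 3 2 1                                   hit
5           4 5 2 1                                   page fault
1           4 5 2 1                                   hit
2           4 5 2 1                                   hit
3           3 5 2 1                                   page fault
4           3 4 2 1                                   page fault
5           3 4 2 5                                   page fault
Legend: page fault / hit; the mark on a frame shows most recent use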
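The trace above evicts the page whose most recent use is oldest, which is consistent with LRU. A minimal way to model this is to stamp each frame with the index of its last reference and evict the smallest stamp, as sketched below. The name `lru_faults` is hypothetical, and a per-reference timestamp is exactly the cost that makes exact LRU expensive in practice.

```c
#include <stddef.h>

#define MAX_FRAMES 16

/* Count page faults under exact LRU using last-use timestamps.
 * Returns -1 if nframes is outside 1..MAX_FRAMES. */
int lru_faults(const int *refs, size_t n, int nframes)
{
    int    frames[MAX_FRAMES];
    size_t last_used[MAX_FRAMES];  /* index of the latest reference */
    int used = 0;
    int faults = 0;

    if (nframes <= 0 || nframes > MAX_FRAMES)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int found = -1;
        for (int j = 0; j < used; j++) {
            if (frames[j] == refs[i]) { found = j; break; }
        }
        if (found >= 0) {
            last_used[found] = i;   /* hit: refresh the timestamp */
            continue;
        }

        faults++;
        int slot;
        if (used < nframes) {
            slot = used++;
        } else {
            slot = 0;               /* pick the least recently used frame */
            for (int j = 1; j < used; j++) {
                if (last_used[j] < last_used[slot])
                    slot = j;
            }
        }
        frames[slot] = refs[i];
        last_used[slot] = i;
    }
    return faults;
}
```

On the reference string 1 2 3 4 1 2 5 1 2 3 4 5 with 4 frames this counts 8 faults, matching the trace.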
– Large page lists
– Timestamps are costly
Q: “I thought LRU was already an approximation…” A: “It is... Oh well…”
– Set on use, reset periodically by the OS
– If no H/W, can be emulated in S/W
– FIFO + reference bit (keep pages in circular list)
– Implements “Not-Recently-Used”
– Low accuracy for large memory
– When to run
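The "FIFO + reference bit" idea with pages kept in a circular list can be sketched as a one-handed clock. This is one common variant (often called second chance), not necessarily the exact scheme on the slides: here the hand clears reference bits as it sweeps instead of the OS resetting them periodically, and a newly loaded page starts with its bit set. All names (`clock_state`, `clock_init`, `clock_access`) are hypothetical.

```c
#define MAX_FRAMES 16

struct clock_state {
    int           page[MAX_FRAMES];  /* page held by each frame */
    unsigned char ref[MAX_FRAMES];   /* reference bit per frame */
    int nframes;                     /* frames in the circular list */
    int used;                        /* frames filled so far */
    int hand;                        /* next frame the hand examines */
};

/* Prepare an empty list of nframes frames (1..MAX_FRAMES). */
void clock_init(struct clock_state *c, int nframes)
{
    c->nframes = nframes;
    c->used = 0;
    c->hand = 0;
}

/* Reference a page. Returns 1 on a page fault, 0 on a hit. */
int clock_access(struct clock_state *c, int page)
{
    for (int j = 0; j < c->used; j++) {
        if (c->page[j] == page) {
            c->ref[j] = 1;           /* set on use */
            return 0;
        }
    }
    if (c->used < c->nframes) {      /* free frame available */
        c->page[c->used] = page;
        c->ref[c->used] = 1;
        c->used++;
        return 1;
    }
    /* Sweep: recently used pages lose their bit and get a second
     * chance; the first frame found with ref == 0 is evicted. */
    while (c->ref[c->hand]) {
        c->ref[c->hand] = 0;
        c->hand = (c->hand + 1) % c->nframes;
    }
    c->page[c->hand] = page;
    c->ref[c->hand] = 1;
    c->hand = (c->hand + 1) % c->nframes;
    return 1;
}
```

Because only one bit per frame is kept, the policy distinguishes "used recently" from "not used recently" and nothing finer.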
– Leading edge clears ref bits
– Trailing edge evicts pages with ref bit 0
Figure labels: to be evicted; to be cleared
– Works well for data accessed only once, e.g. a movie file
– Not a good fit for most other data, e.g. frequently accessed items
– No record of when the page was referenced
– Use multiple bits. Shift right by 1 at regular intervals.
Valid Protection R/W/X Ref Dirty Index
– Number of pages touched in the interval (t − Δ, t]
– During periods of poor locality, you reference more pages
– During that period, you have a larger working set size
– If Σ |WSi| over all runnable processes i > |physical memory| → suspend a process
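The working set size defined above can be computed directly from a reference trace. In this sketch time is simply the index into the trace, so the interval (t − Δ, t] covers the last Δ references up to and including index t. The name `working_set_size`, the `MAX_PAGE_ID` bound, and the sample trace in the usage note are assumptions for illustration.

```c
#include <stddef.h>

#define MAX_PAGE_ID 64   /* assumed: page numbers are 0..63 */

/* refs[k] is the page touched at time k; the caller guarantees that
 * t is a valid index. Returns the number of distinct pages touched
 * in the interval (t - delta, t]. */
int working_set_size(const int *refs, size_t t, size_t delta)
{
    unsigned char seen[MAX_PAGE_ID] = {0};
    int count = 0;
    /* Clamp the window at the start of the trace. */
    size_t start = (delta > t) ? 0 : t - delta + 1;

    for (size_t k = start; k <= t; k++) {
        int p = refs[k];
        if (p < 0 || p >= MAX_PAGE_ID)
            continue;               /* ignore out-of-range page numbers */
        if (!seen[p]) {
            seen[p] = 1;
            count++;
        }
    }
    return count;
}
```

For the sample trace 1 2 1 3 1 4 5 1 6 1 7 8, the window of 4 references ending at index 5 touches pages 1, 3, and 4, so the working set size is 3. Summing this value over all runnable processes and comparing it with the number of physical frames gives the suspension test from the last bullet.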
– Cannot tell (within interval of 5000) where reference occurred
1 2 1 3 1 4 5 1 6 1 7 8