BS1 WS19/20 – topic-based slides
Paging
- Basic method
- Hardware support
- TLB
- Memory Protection
- Hierarchical Page Tables
BS1 WS19/20 topic-based slides Paging Basic method Hardware support TLB Memory Protection Hierarchical Page Tables Hashed Page Tables Inverted Page Tables Shared Pages Virtual Address space under different
Operating Systems 15
lead to fragmentation
a) Compaction: dynamic relocation of processes
b) Noncontiguous allocation of process memory in equally sized pages (this avoids the memory fitting problem)
frames from backing store (disk)
[Figure: paging hardware. The CPU issues a logical address (page number p, offset d); the page table maps p to frame number f, giving the physical address (f, d) in physical memory.]
Page frames are typically 2-4 KB in size
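The translation from logical to physical address can be sketched in a few lines. This is a minimal illustration, assuming 4 KB pages and a hypothetical page table held in a dictionary; the names `translate` and `page_table` are not from the slides.

```python
PAGE_SIZE = 4096  # assumed page/frame size of 4 KB


def translate(logical_address, page_table):
    """Split a logical address into (p, d) and map page p to frame f."""
    p, d = divmod(logical_address, PAGE_SIZE)  # page number, offset
    f = page_table[p]  # raises KeyError if page p is not mapped
    return f * PAGE_SIZE + d  # physical address (f, d)


# Hypothetical page table: page number -> frame number
page_table = {0: 5, 1: 6, 2: 1}
physical = translate(4100, page_table)  # page 1, offset 4 -> frame 6
```

The offset d is carried over unchanged; only the page number is replaced by a frame number.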
physical memory
frame number
Page # Frame #
[Figure: paging hardware with TLB. The page number p is first looked up in the TLB (page # / frame # pairs); on a TLB hit the frame number f comes from the TLB, on a TLB miss it is fetched from the page table.]
associative registers;
EAT = (1μs + ε) α + (2μs + ε) (1 – α)
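The effective access time formula can be evaluated directly. The sketch below assumes the usual reading of the symbols (α is the TLB hit ratio, ε the associative lookup time, 1 μs one memory access) and picks ε = 0.02 μs purely as an example value.

```python
def effective_access_time(hit_ratio, mem_us=1.0, tlb_us=0.02):
    """EAT = (mem + eps) * alpha + (2 * mem + eps) * (1 - alpha), in microseconds."""
    hit = (mem_us + tlb_us) * hit_ratio             # TLB hit: one memory access
    miss = (2 * mem_us + tlb_us) * (1 - hit_ratio)  # TLB miss: page table + data access
    return hit + miss


eat_80 = effective_access_time(0.80)  # about 1.22 microseconds
eat_98 = effective_access_time(0.98)  # about 1.04 microseconds
```

With a 1 μs memory access the expression simplifies to 2 + ε - α, so a higher hit ratio moves the result toward a single memory access.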
where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table
page number | page offset
p1 | p2 | d
10 | 10 | 12 (bits)
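The 10/10/12 split of a 32-bit logical address can be sketched with shifts and masks. The function name `split_two_level` is hypothetical; only the field widths come from the slide.

```python
def split_two_level(addr):
    """Split a 32-bit address into (p1, p2, d) with 10, 10 and 12 bits."""
    d = addr & 0xFFF           # low 12 bits: offset within the page
    p2 = (addr >> 12) & 0x3FF  # next 10 bits: index into the inner page table
    p1 = (addr >> 22) & 0x3FF  # top 10 bits: index into the outer page table
    return p1, p2, d


p1, p2, d = split_two_level(0x00403004)  # -> (1, 3, 4)
```

Each index selects one of 1024 entries, and the 12-bit offset addresses one byte in a 4 KB page.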
[Figure: address translation with a two-level page table. p1 (10 bits) indexes the page directory, p2 (10 bits) indexes the selected page table, and d (12 bits) is the offset within the page.]
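[Figure: hashed page table. A hash function maps the page number p to a chain of entries in the hash table; the entry matching p supplies the frame number, which is combined with the offset d to form the physical address.]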
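A hashed page table can be sketched as an array of buckets with chaining. This is an illustration only: the class name, the modulo hash function and the bucket count are assumptions, not details from the slides.

```python
class HashedPageTable:
    """Page table as a hash table with one chain of (page, frame) pairs per bucket."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _index(self, p):
        return p % len(self.buckets)  # stand-in for a real hash function

    def map(self, p, f):
        chain = self.buckets[self._index(p)]
        for i, (q, _) in enumerate(chain):
            if q == p:                # page already mapped: update in place
                chain[i] = (p, f)
                return
        chain.append((p, f))

    def lookup(self, p):
        for q, f in self.buckets[self._index(p)]:
            if q == p:                # compare page numbers along the chain
                return f
        return None                   # not mapped: would raise a page fault
```

Pages that hash to the same bucket end up on the same chain, so a lookup compares page numbers until it finds a match.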
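[Figure: inverted page table. The logical address consists of process ID pid, page number p and offset d; the table is searched for the pair (pid, p), and the position of the matching entry gives the frame number for the physical address.]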
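The search in an inverted page table can be sketched as follows, assuming one table entry per physical frame and a simple linear search; `ipt_lookup` and the sample table are hypothetical.

```python
def ipt_lookup(ipt, pid, p):
    """Return the frame whose entry equals (pid, p), or None if there is none."""
    for frame, entry in enumerate(ipt):
        if entry == (pid, p):
            return frame  # the index of the entry is the frame number
    return None           # no match: the page is not resident


# One entry per physical frame: (process ID, page number), or None if unused
ipt = [(1, 0), (2, 0), (1, 3), None]
frame = ipt_lookup(ipt, pid=1, p=3)  # -> 2
```

The table size depends on physical memory rather than on the number of processes, at the price of a search on every lookup.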
[Figure: shared pages. Process 1 and Process 2 virtual memory, each translated through its own page table (Process 1 page table, Process 2 page table).]
The innovative MULTICS operating system introduced:
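[Figure: MULTICS address translation. The logical address (s, d) selects an entry in the segment table (located via the segment table base register) that holds the segment length and the page-table base. If d >= segment length, a trap occurs; otherwise d is split into page number p and offset d', and the page table for segment s maps p to frame f, giving the physical address (f, d').]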
[Figure: Intel x86 address translation. An Intel logical address (selector s plus offset) is resolved through the descriptor table (base, limit) into a 32-bit Intel linear address, split 10/10/12 (bits 31-22, 21-12, 11-0). cr3 locates the page directory (1024 x 4-byte entries, one per process). A 4 KB PDE points to a page table with 1024 entries, whose PTE gives the physical address of a 4 KiB frame; a 4 MB PDE maps a 4 MiB frame directly, using a 22-bit offset.]
[Figure: virtual pages mapped to physical memory through page table entries]
x86 virtual address: bits 31-22 page directory index (10 bits), bits 21-12 page table index (10 bits), bits 11-0 byte index (12 bits)
x86:
user system user system
directly reachable from other processes
and appears in every process’s address space
(though there are processes that do things for the OS, more or less in “background”)
serious limitation (think compilers, browsers)
[Figure: x86 virtual address space layout]
00000000-7FFFFFFF: Code: EXE/DLLs. Data: EXE/DLL static storage, per-thread user mode stacks, process heaps, etc. Unique per process, accessible in user or kernel mode.
From 80000000: Code: NTOSKRNL, HAL, drivers. Data: kernel stacks, file system cache, non-paged pool, paged pool. System wide, accessible only in kernel mode.
From C0000000 (address space ends at FFFFFFFF): Process page tables, hyperspace. Per process, accessible only in kernel mode.
(for file mapping)
aware” flag in image header, or they’re limited to 2 GB (specify at link time or with imagecfg.exe from ResKit)
system cache
GB of physical RAM
Unique per process (= per appl.), user mode
[Figure: x86 virtual address space layout with 3 GB user space]
00000000-BFFFFFFF: .EXE code, globals, per-thread user mode stacks, .DLL code, process heaps. Unique per process, accessible in user or kernel mode.
C0000000-FFFFFFFF: Exec, kernel, HAL, drivers, etc. (system wide, accessible only in kernel mode); process page tables, hyperspace (per process, accessible only in kernel mode).
1. Although each process can only address 2 GB, many may be in memory at the same time (e.g. 5 * 2 GB processes = 10 GB)
2. Files in system cache remain in physical memory
– memory manager keeps unmapped data in physical memory
Working Set Assigned to Virtual Cache
Standby List 960 MB Other ~60 GB 64 GB Physical Memory
to allocate large amounts of physical memory and then map “windows” into that memory
large databases
embedded controllers
AWE memory Physical memory Process virtual memory AWE memory AWE memory
(16 exabytes) total
User mode space per process 00000000 00000000 E0000000 00000000 FFFFFF00 00000000 FFFFFFFF FFFFFFFF System space page tables 6FC 00000000 Kernel mode per process 1FFFFF00 00000000 Process page tables 20000000 00000000 Session space Session space page tables System space 3FFFFF00 00000000 E0000600 00000000
process contains same entries (with a few exceptions), which point to system-wide page tables
tables that map the process page tables
Page Directories (one per process) Sets of per-process page tables System-wide page tables
– individual entries take a multiple of machine word size
(phys. addr. in KPROCESS block, at 0xC0300000, in cr3 (x86))
describe state/location of page tables for this process
– Page tables are created on demand
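The slides give 0xC0300000 as the virtual address of the page directory and C0000000 as the start of the process page tables. Assuming the usual non-PAE x86 self-mapping (4 KB pages, 4-byte entries), the arithmetic behind these two addresses can be sketched as below; the function names are made up for illustration.

```python
PTE_BASE = 0xC0000000  # start of the self-mapped page tables (value from the slides)
PTE_SIZE = 4           # assumed: 4-byte entries, non-PAE x86


def pte_address(va):
    """Virtual address of the PTE that maps virtual address va."""
    return PTE_BASE + (va >> 12) * PTE_SIZE


def pde_address(va):
    """Virtual address of the page directory entry that covers va."""
    page_directory = pte_address(PTE_BASE)  # the self-map: 0xC0300000
    return page_directory + (va >> 22) * PTE_SIZE


page_directory_va = pte_address(PTE_BASE)  # 0xC0300000
```

The page directory shows up at 0xC0300000 because it is the page table that maps the page-table region itself.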
x86 hardware PTE layout (valid PTE):
bits 31-12: Page frame number
bit 11: Reserved (writable on MP systems)
bit 10: Reserved
bit 9: Reserved
bit 8: Global
bit 7: Reserved (large page if PDE)
bit 6: Dirty
bit 5: Accessed
bit 4: Cache disabled
bit 3: Write through
bit 2: Owner
bit 1: Write (writable on MP systems)
bit 0: Valid
Reserved bits are used only when PTE is not valid
Name of Bit | Meaning on x86
Accessed | Page has been read
Cache disabled | Disables caching for that page
Dirty | Page has been written to
Global | Translation applies to all processes (a translation buffer flush won't affect this PTE)
Large page | Indicates that PDE maps a 4 MB page (used to map kernel)
Owner | Indicates whether user-mode code can access the page or whether the page is limited to kernel mode access
Valid | Indicates whether translation maps to page in physical memory
Write through | Disables caching of writes; immediate flush to disk
Write | Uniprocessor: indicates whether page is read/write or read-only; multiprocessor: indicates whether page is writeable (write bit in reserved bit)
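Decoding these status bits from a 32-bit PTE value can be sketched as follows. The bit positions are the x86 architectural ones (bit 0 valid up to bit 8 global, frame number in bits 31-12); the dictionary keys and the function name are illustrative choices, not an existing API.

```python
# x86 (non-PAE) bit positions; names follow the table above
PTE_FLAGS = {
    "valid": 0,
    "write": 1,
    "owner": 2,
    "write_through": 3,
    "cache_disabled": 4,
    "accessed": 5,
    "dirty": 6,
    "large_page": 7,  # only meaningful in a page directory entry
    "global": 8,
}


def decode_pte(pte):
    """Return the flag bits and page frame number of a 32-bit PTE value."""
    fields = {name: bool((pte >> bit) & 1) for name, bit in PTE_FLAGS.items()}
    fields["pfn"] = (pte >> 12) & 0xFFFFF  # bits 31-12
    return fields


info = decode_pte(0x00ABC067)  # valid, writable, user, accessed, dirty; PFN 0xABC
```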
if list is empty, pager takes list from standby list and zeros it;
and offset are zero
Invalid PTE, page in page file: page file offset (bits 31-12), transition (bit 11), prototype (bit 10), protection (bits 9-5), page file number (bits 4-1), valid (bit 0)
write list
– examine virtual address space descriptors (VADs) to see whether this virtual address has been reserved
Invalid PTE, transition: page frame number (bits 31-12), transition (bit 11, set to 1), prototype (bit 10), protection (bits 9-5), cache disable, write through, owner, write, valid (bit 0)
[Figure: page list dynamics. Pages move between the process working sets and the standby, modified, free, zero and bad page lists. Labelled transitions: page read from disk or kernel allocations, demand zero page faults, working set replacement, modified page writer, zero page thread, “soft” page faults, private pages at process exit.]
page list
the bottom of:
– Decision made based on “D” (dirty = modified) bit in page table entry
maintained while the page is on either of these lists
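The dirty-bit decision can be sketched as a small helper: a page removed from a working set goes to the bottom of the modified list if its D bit is set, otherwise to the bottom of the standby list. The function name and the dictionary of lists are hypothetical.

```python
def trim_from_working_set(pfn, dirty, page_lists):
    """Append a trimmed page to the modified or standby list, based on its D bit."""
    target = "modified" if dirty else "standby"
    page_lists[target].append(pfn)  # appended at the bottom (tail) of the list
    return target


page_lists = {"standby": [], "modified": []}
trim_from_working_set(0x1A2, dirty=True, page_lists=page_lists)
trim_from_working_set(0x1A3, dirty=False, page_lists=page_lists)
```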
thread is awoken to write pages out
– References to private pages that have not been created yet
awoken to zero them
Status | Description
Active/valid | Page is part of working set (sys/proc), valid PTE points to it
Transition | Page not owned by a working set, not on any paging list; I/O is in progress on this page
Standby | Page belonged to a working set but was removed; not modified
Modified | Removed from working set, modified, not yet written to disk
Modified no write | Modified page, will not be touched by modified page writer; used by NTFS for pages containing log entries (explicit flushing)
Free | Page is free but has dirty data in it; cannot be given to user process (C2 security requirement)
Zeroed | Page is free and has been initialized by zero page thread
Bad | Page has generated parity or other hardware errors
[Figure: page states. Prototype PTEs and the page tables of processes 1, 2 and 3 hold entries that are valid, invalid with a disk address, or invalid in transition; the physical pages they refer to are in the active, standby, modified, modified no write, zeroed, free or bad state.]
Other threads in process may issue VM functions, but:
perform automatic stack expansion)
working set
(if paged pool expanded after process directory was created)
dismiss exception
A thread's user-mode stack is constructed using this 2-phase approach: initial reserved size is 1 MB,
– Cached files are faulted into system working set
– Not Memory->Page Faults/sec, as that includes soft page faults
function)
(“working set replacement”)
– approximately RAM minus 512 pages (2 MB on x86) minus min size of system working set (1.5 MB on x86)
– Interesting to view (gives you an idea how much memory you’ve “lost” to the OS)
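The approximate upper bound stated above is simple arithmetic. The sketch assumes 4 KB pages and takes 1 MB as 2^20 bytes; the function name and the 128 MB example machine are made up.

```python
PAGE_SIZE = 4096  # x86 page size in bytes
MB = 2 ** 20


def max_working_set_bytes(ram_bytes, reserve_pages=512, min_system_ws=1536 * 1024):
    """RAM minus 512 pages (2 MB) minus the minimum system working set (1.5 MB)."""
    return ram_bytes - reserve_pages * PAGE_SIZE - min_system_ws


limit_mb = max_working_set_bytes(128 * MB) / MB  # 124.5 for a 128 MB machine
```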
PerfMon Process “WorkingSet” newer pages
(and this time, it succeeds)
– Points to itself
– Map the page table of the hyperspace
– Map system paged and nonpaged areas
– Map system cache page table pages
[Figure: page replacement examples, showing frame contents over a page reference string]
pages
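[Figure: prototype PTEs. The page directory leads to a page table whose entries are either valid (PFN n) or invalid and pointing to a prototype PTE. The segment structure holds the prototype page tables, whose entries are valid (PFN n) or invalid (in page file). The PFN entry records the PTE address and a share count of 1 and refers to the page in physical memory.]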
– pager makes a copy of the page and allocates it privately
– e.g. writeable data initialized with C initializers
[Figure: copy-on-write, before the write. Two process address spaces map the same pages 1, 2 and 3 in physical memory.]
[Figure: copy-on-write, after the write. The process that modified data now maps a private copy of page 2.]
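Copy-on-write can be simulated with a shared frame that is copied on the first write. This is a simplified sketch under stated assumptions: physical memory is a Python list of bytearrays, each address space is a dictionary, and all names are hypothetical.

```python
frames = []  # simulated physical memory: one bytearray per frame


def alloc_frame(data):
    """Allocate a new frame initialised with a copy of data; return its index."""
    frames.append(bytearray(data))
    return len(frames) - 1


def map_shared(spaces, vpn, frame):
    """Map one frame into several address spaces, marked copy-on-write."""
    for space in spaces:
        space[vpn] = {"frame": frame, "cow": True}


def write_byte(space, vpn, offset, value):
    """Write one byte; the first write to a copy-on-write page makes a private copy."""
    entry = space[vpn]
    if entry["cow"]:
        entry["frame"] = alloc_frame(frames[entry["frame"]])  # private copy
        entry["cow"] = False
    frames[entry["frame"]][offset] = value


proc1, proc2 = {}, {}
shared = alloc_frame(b"abcd")
map_shared([proc1, proc2], vpn=2, frame=shared)
write_byte(proc1, vpn=2, offset=0, value=ord("X"))  # proc1 gets its own copy
```

After the write, proc1 sees the modified data in its private frame while proc2 still maps the original, unchanged frame.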
Process “WorkingSet” to standby
page list
– trims working sets of processes
– if thread in a long user-mode wait, marks kernel stack pages as pageable
– if process has no nonpageable kernel stacks, “outswaps” process
– triggers a separate thread to do the “outswap” by gradually reducing target process’s working set limit to zero
be allocated so it can be paged back in
“system working set”
the cache; in NT4, same as “File Cache” on Task Manager / Performance tab)
removed from the free, modified, or standby lists and made part of the process working set
processes’ working sets at one time (this case not illustrated here)
[Figure: pages in physical memory, each marked F, M, S or with a process number (1, 2, 3), shown against the user address spaces (00000000-7FFFFFFF, system space from 80000000) of processes 1, 2 and 3]
in the page file
memory is deleted (e.g. at process exit), even if the page is read back from disk
be?
(“Commit Charge Peak”)
– Hard disk space is cheap, so why not double this
– First time: Before pagefile expansion
– Second time: When committed bytes reach the commit limit
– Page files are full
– Run poolmon to see what object(s) are filling pool
– Could be a result of processes not closing handles - check process “handle count” in Task Manager