Operating Systems WT 2019/20 Memory Management
Operating Systems 2
Shared Memory
- Most modern OSs provide a way for processes to share memory
- e.g. code pages for executables
- Processes can create shared memory sections manually
[Figure: the compiler image in physical memory, mapped into Process 1 virtual memory and Process 2 virtual memory]
Shared Memory
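- Note: the shared region may be mapped at different addresses in the different processes
- Careful with pointers: an address stored in the shared region is generally only valid in the process that wrote it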
[Figure: the same physical memory pages mapped into the user-accessible virtual address space (00000000 to 7FFFFFFF) of Process A and Process B]
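Because the base address of the region can differ per process, a common workaround is to store offsets relative to the start of the region instead of raw pointers. The Python sketch below illustrates the idea under stated assumptions: the node layout, the helper names `write_list` and `read_list`, and the use of a plain `bytearray` as a stand-in for a mapped region are all hypothetical and not taken from the slides.

```python
import struct

# Each node is (value: int32, next_offset: int32); END marks the end of the list.
NODE = struct.Struct("<ii")
END = -1

def write_list(region, values):
    """Store values as a linked list in region; links are offsets from the region start."""
    for i, value in enumerate(values):
        next_off = (i + 1) * NODE.size if i + 1 < len(values) else END
        NODE.pack_into(region, i * NODE.size, value, next_off)

def read_list(region, head=0):
    """Follow the offset links starting at head and return the stored values."""
    out, off = [], head
    while off != END:
        value, off = NODE.unpack_from(region, off)
        out.append(value)
    return out

region = bytearray(64)            # stand-in for a shared memory region
write_list(region, [10, 20, 30])

# A second "process" sees the same bytes at a different base address;
# a copy of the bytes models that here.
other_view = bytes(region)
values_seen = read_list(other_view)
print(values_seen)
```

The links stay valid wherever the region is mapped because they never encode an absolute address.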
Shared Memory
HANDLE OpenFileMapping (DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpName);
- Open an existing mapping object
- Name comes from previous CreateFileMapping() call
- First process creates mapping, subsequent processes open mapping
- dwDesiredAccess: requested access, such as FILE_MAP_READ or FILE_MAP_WRITE; must be compatible with the fdwProtect given to CreateFileMapping()
- lpName: name created with CreateFileMapping()
- CloseHandle() destroys mapping handles
UNIX – Shared Memory (recap)
- Processes can share memory explicitly
int shm_open(const char *name, int oflag, mode_t mode)
- Shared memory segments are a named resource
- Represented within process resources through file descriptor
- Can be mapped to process address space using mmap
- Access has to be synchronized (critical sections)
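The create-then-attach pattern for a named segment can be tried out from Python with the standard `multiprocessing.shared_memory` module, which on POSIX systems is typically backed by shm_open and mmap. This is a minimal sketch rather than the C API from the slide. For brevity both handles live in one process, and no synchronization is shown because only one side writes.

```python
from multiprocessing import shared_memory

# First "process": create a named segment (the OS-visible name is generated).
creator = shared_memory.SharedMemory(create=True, size=64)
creator.buf[:5] = b"hello"

# Second "process": attach to the existing segment by its name.
# Normally creator.name would be handed to a different process.
attached = shared_memory.SharedMemory(name=creator.name)
received = bytes(attached.buf[:5])
print(received)

# Every handle is closed; the segment itself is removed once with unlink().
attached.close()
creator.close()
creator.unlink()
```

With concurrent writers, access to the buffer would need a lock or semaphore around each critical section.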
Prototype Page Table Entries (PTEs)
- Software structure to manage page table redundancy for potentially shared pages
- Array of prototype PTEs is created as part of shared memory object
- Provide reference count for shared pages
- Shared page valid:
- process & prototype PTE point to physical page
- Page invalidated:
- process PTE points to prototype PTE
- Prototype PTE describes 5 states for shared page:
- Active/valid, Transition, Demand zero, Page file, Mapped file
- Layer between page table and page frame number database
Prototype PTEs – the bigger picture
- Two virtual pages in a mapped view
- First page is valid; second page is invalid and in the page file
- Prototype PTE contains exact location
- Process PTE points to prototype PTE
[Figure: page directory and page table with one valid PTE (PFN n) and one invalid PTE that points to the prototype PTE; prototype page tables in the segment structure with one valid entry (PFN n) and one invalid entry (in page file); PFN entry with PTE address and share count = 1; physical memory]
Memory-Mapped Files
- Process file data without manual, buffered file I/O (read/write)
- Appears in memory as if read into a buffer in its entirety
- OS does the heavy lifting efficiently and reliably
- Data structures will be saved verbatim – be careful with pointers
- Convenient & efficient in-memory algorithms:
- Can process data much larger than physical memory
- Improved performance for file processing (prefetching)
- No need to manage buffers and file data
- No need to consume space in paging file
- Multiple processes can share memory
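As an illustration of processing file data without explicit read/write calls, the following sketch uses Python's standard `mmap` module. The temporary file and its contents are hypothetical demo data; the slides themselves describe the Win32 and UNIX C interfaces.

```python
import mmap
import os
import tempfile

# Create a small file to map (hypothetical demo data).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello memory-mapped world\n")
os.close(fd)

with open(path, "r+b") as f:
    # Length 0 maps the whole file; ACCESS_WRITE writes changes through to it.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as view:
        first_word = bytes(view[:5])   # read as if the file were a buffer
        view[:5] = b"HELLO"            # update in place, no write() call
        view.flush()                   # ask the OS to write dirty pages back

with open(path, "rb") as f:
    on_disk = f.read()
os.remove(path)
print(first_word, on_disk)
```

The program never manages a buffer of its own; the OS pages file contents in and out on demand.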
File Mapping Object
HANDLE CreateFileMapping (HANDLE hFile, LPSECURITY_ATTRIBUTES lpsa, DWORD fdwProtect, DWORD dwMaximumSizeHigh, DWORD dwMaximumSizeLow, LPCTSTR lpszMapName);
Parameters:
- hFile:
- handle to an open file with compatible access rights (fdwProtect)
- hFile == 0xFFFFFFFF (INVALID_HANDLE_VALUE): paging file, no need to create a separate file
- fdwProtect:
- PAGE_READONLY, PAGE_READWRITE, PAGE_WRITECOPY
- dwMaximumSizeHigh, dwMaximumSizeLow:
- Zero: current file size is used
- lpszMapName:
- Name of mapping object for sharing between processes, or NULL
Shared Memory File Mapping
[Figure: a file mapping in physical memory, shared between the user-accessible virtual address spaces (00000000 to 7FFFFFFF) of Process A and Process B]
Copy-On-Write Pages
- Pages are originally set up as shared, read-only
- Access violation on write attempt: the pager makes a copy of the page and allocates it privately to the process doing the write
- So, only need unique copies for the pages in the shared region that are actually written (example of “lazy evaluation”)
- Original values of data are still shared, e.g. writable data initialized with C initializers
- Data copied to new physical page, but virtual addresses unaltered
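Copy-on-write behavior can be observed from user space with a private file mapping. The sketch below uses Python's `mmap.ACCESS_COPY`, which the Python documentation describes as affecting memory without updating the underlying file. The temporary file and its contents are hypothetical demo data.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"original data")
os.close(fd)

with open(path, "rb") as f:
    private = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)  # copy-on-write view
    shared = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)   # read-only view

    private[:8] = b"MODIFIED"          # the write lands in a private copy of the page

    private_bytes = bytes(private[:])  # sees the modification
    shared_bytes = bytes(shared[:])    # still sees the original data
    private.close()
    shared.close()

os.remove(path)
print(private_bytes, shared_bytes)
```

Only the written page is duplicated; untouched pages of a larger mapping would remain shared.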
Copy-On-Write Pages
[Figure: before a write, two process address spaces holding the original data share pages 1, 2, and 3 in physical memory]
Copy-On-Write Pages
[Figure: after a write, one process address space still holds the original data while the other holds the modified data in a private copy of page 2; physical memory contains pages 1, 2, 3 and the copy of page 2]
Conclusions
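[Figure: x86 address translation. An Intel logical address (selector + offset) is resolved through the descriptor table (base, limit) into a 32-bit Intel linear address. With 4 KB pages the linear address splits into a 10-bit page directory index (bits 31-22), a 10-bit page table index (bits 21-12), and a 12-bit offset (bits 11-0); with 4 MB pages it splits into a 10-bit page directory index and a 22-bit offset. CR3 points to the page directory (1024 x 4-byte entries, one per process); a 4 KB PDE points to a page table with 1024 entries whose PTE yields the physical address of a 4 KiB frame, while a 4 MB PDE yields a 4 MiB frame directly.]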
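The 10/10/12 and 10/22 splits of the linear address amount to a few shifts and masks. The following sketch shows them; the function name `split_linear_address` and the example address are hypothetical.

```python
def split_linear_address(addr, large_page=False):
    """Split a 32-bit x86 linear address into its translation fields."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit address")
    directory = addr >> 22                  # bits 31..22: page directory index
    if large_page:
        return directory, addr & 0x3FFFFF   # bits 21..0: offset in a 4 MB page
    table = (addr >> 12) & 0x3FF            # bits 21..12: page table index
    offset = addr & 0xFFF                   # bits 11..0: offset in a 4 KB page
    return directory, table, offset

small = split_linear_address(0x00403025)
large = split_linear_address(0x00403025, large_page=True)
print(small, large)
```

For the address 0x00403025 this selects page directory entry 1, page table entry 3, and byte 0x25 within the 4 KB page.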
Conclusions
[Figure: page list dynamics. Pages move between the process working sets and the standby, modified, free, zero, and bad page lists through demand-zero page faults, page reads from disk or kernel allocations, working set replacement, “soft” page faults, the modified page writer, the zero page thread, and the release of private pages at process exit]
Working Set Replacement
- When the working set maximum is reached (or a working set trim occurs), the process must give up pages to make room for new pages
- A single process cannot take over all of physical memory unless other processes aren’t using it
- Question: Which pages to remove?
[Figure: pages removed from the process working set go to the standby or modified page list]
Balance Set Manager
- Balance set = sum of all working sets
- Balance Set Manager is a system thread
- Wakes up every second. If paging activity is high or memory is needed:
– trims working sets of processes
– if a thread is in a long user-mode wait, marks its kernel stack pages as pageable
– if a process has no nonpageable kernel stacks, “outswaps” the process
– triggers a separate thread to do the “outswap” by gradually reducing the target process’s working set limit to zero
- Evidence: look for threads in the “Transition” state in PerfMon
- Means that the kernel stack has been paged out, and the thread is waiting for memory to be allocated so it can be paged back in
Page Replacement: First In First Out (FIFO)
- Access to memory pages, in order: 3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4
- If a page needs to be replaced, evict the oldest page
Access:   3 2 1 0 3 2 4 3 2 1 0 4
Frame 1:  3 3 3 0 0 0 4 4 4 4 4 4
Frame 2:  - 2 2 2 3 3 3 3 3 1 1 1
Frame 3:  - - 1 1 1 2 2 2 2 2 0 0
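The FIFO policy can be replayed on the access sequence above with a short simulation. In this sketch the function name `fifo_trace` and the slot-based representation are assumptions for illustration. Because FIFO always evicts the oldest load, a round-robin index over the frame slots is enough to find the victim.

```python
def fifo_trace(accesses, num_frames):
    """Return (faults, rows); rows[i] holds the frame contents after access i."""
    frames = [None] * num_frames   # frame slots; None means still empty
    next_victim = 0                # slot holding the oldest loaded page
    faults = 0
    rows = []
    for page in accesses:
        if page not in frames:     # page fault: load into the oldest slot
            frames[next_victim] = page
            next_victim = (next_victim + 1) % num_frames
            faults += 1
        rows.append(tuple(frames))
    return faults, rows

ACCESSES = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
faults, rows = fifo_trace(ACCESSES, 3)
for slot in range(3):
    # Print one line per frame, one column per access.
    print(" ".join("-" if row[slot] is None else str(row[slot]) for row in rows))
print("page faults:", faults)
```

For three frames the simulation reports 9 page faults on this sequence.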
Bélády's anomaly – FIFO anomaly
- With FIFO, increasing the number of page frames can increase the number of page faults for a given memory access pattern
3 frames (9 page faults):
Access:   3 2 1 0 3 2 4 3 2 1 0 4
Frame 1:  3 3 3 0 0 0 4 4 4 4 4 4
Frame 2:  - 2 2 2 3 3 3 3 3 1 1 1
Frame 3:  - - 1 1 1 2 2 2 2 2 0 0
4 frames (10 page faults):
Access:   3 2 1 0 3 2 4 3 2 1 0 4
Frame 1:  3 3 3 3 3 3 4 4 4 4 0 0
Frame 2:  - 2 2 2 2 2 2 3 3 3 3 4
Frame 3:  - - 1 1 1 1 1 1 2 2 2 2
Frame 4:  - - - 0 0 0 0 0 0 1 1 1
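The anomaly can be checked by counting FIFO page faults for several frame counts on the same access sequence. This is a minimal sketch; `fifo_faults` and `faults_by_frames` are hypothetical names.

```python
from collections import deque

def fifo_faults(accesses, num_frames):
    """Count page faults under FIFO replacement with num_frames frames."""
    queue = deque()        # resident pages, oldest load first
    resident = set()
    faults = 0
    for page in accesses:
        if page in resident:
            continue
        faults += 1
        if len(queue) == num_frames:
            resident.discard(queue.popleft())   # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults

ACCESSES = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
faults_by_frames = {n: fifo_faults(ACCESSES, n) for n in range(1, 6)}
print(faults_by_frames)
```

Going from 3 to 4 frames raises the fault count from 9 to 10 for this sequence, which is the anomaly.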
Page Replacement: Least Recently Used (LRU)
- If a page needs to be replaced, evict the page that has been unused the longest
- expensive: every access, even a read, requires a write to update the timestamp
- expensive: finding the page to replace needs lots of comparisons
Access:   3 2 1 0 3 2 4 3 2 1 0 4
Frame 1:  3 3 3 0 0 0 4 4 4 1 1 1
Frame 2:  - 2 2 2 3 3 3 3 3 3 0 0
Frame 3:  - - 1 1 1 2 2 2 2 2 2 4
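A compact way to simulate LRU is an ordered dictionary that is reordered on every access. This sketch sidesteps the timestamp bookkeeping the slide calls expensive, so it models the policy rather than a hardware-realistic implementation. The name `lru_faults` is hypothetical.

```python
from collections import OrderedDict

def lru_faults(accesses, num_frames):
    """Count page faults under LRU replacement with num_frames frames."""
    frames = OrderedDict()   # keys ordered from least to most recently used
    faults = 0
    for page in accesses:
        if page in frames:
            frames.move_to_end(page)        # mark as most recently used
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)      # evict the least recently used page
        frames[page] = True
    return faults

ACCESSES = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
lru_3 = lru_faults(ACCESSES, 3)
lru_4 = lru_faults(ACCESSES, 4)
print(lru_3, lru_4)
```

On this sequence LRU incurs 10 faults with 3 frames and 8 with 4 frames, so adding a frame does not hurt here.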
Page Replacement: Second Chance
- Extension to FIFO, favoring frequently accessed pages
- set “accessed” bit to 1 on page access (not page load)
- If a page needs to be replaced, evict the oldest page with unset access bit, then unset all access bits
Access:   3 2 1 0 3 2 4 3 2 1 0 4
Frame 1:  3 3 3 0 0 0 4 4 4 1 1 1
Frame 2:  - 2 2 2 3 3 3 3 3 3 0 0
Frame 3:  - - 1 1 1 2 2 2 2 2 2 4
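The variant described on this slide can be sketched as follows. It follows the slide's rules literally: the bit is set on access but not on load, the oldest page with an unset bit is evicted, and all bits are then cleared. This differs from the textbook clock algorithm, which clears bits one at a time as it scans. Falling back to the oldest page when every bit is set is an assumption, since the slide does not cover that case; `second_chance_faults` is a hypothetical name.

```python
def second_chance_faults(accesses, num_frames):
    """Count page faults for the slide's second-chance variant."""
    queue = []        # resident pages, oldest load first
    accessed = {}     # page -> access bit
    faults = 0
    for page in accesses:
        if page in accessed:
            accessed[page] = True          # set on access, not on load
            continue
        faults += 1
        if len(queue) == num_frames:
            # Oldest page with an unset bit; assumed fallback: the oldest page.
            victim = next((p for p in queue if not accessed[p]), queue[0])
            queue.remove(victim)
            del accessed[victim]
            for p in queue:                # then unset all access bits
                accessed[p] = False
        queue.append(page)
        accessed[page] = False
    return faults

ACCESSES = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
sc_3 = second_chance_faults(ACCESSES, 3)
print(sc_3)
```

For this sequence and three frames the simulation makes the same eviction choices as LRU, giving 10 page faults.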
System Working Set
- Just as processes have working sets, Windows’ pageable system-space code and data live in the “system working set”
- Made up of 4 components:
- Paged pool
- Pageable code and data in the executive
- Pageable code and data in kernel-mode drivers, Win32K.Sys, graphics drivers, etc.
- Global file system data cache
- To get physical (resident) size of these with PerfMon, look at:
- Memory | Pool Paged Resident Bytes
- Memory | System Code Resident Bytes
- Memory | System Driver Resident Bytes
- Memory | System Cache Resident Bytes
- Memory | Cache Bytes counter is the total of these four “resident” (physical) counters (not just the cache; in NT4, same as “File Cache” on the Task Manager / Performance tab)
Working Sets in Memory
- As processes incur page faults, pages are removed from the free, modified, or standby lists and made part of the process working set
- A shared page may be resident in several processes’ working sets at one time (this case not illustrated here)
[Figure: pages in physical memory marked F, M, or S (free, modified, standby) or with the number of the owning process (1, 2, 3), next to the virtual address spaces of Process 1, Process 2, and Process 3 (00000000 to 7FFFFFFF, system space from 80000000)]
Page Files / Swap Space
- The page file is a hidden file on a Windows volume (pagefile.sys)
- What gets sent to the paging file?
- no code, only modified data
- code can be re-read from the image file at any time, so there is no benefit in creating a copy in the page file
- When do pages get paged out?
- Only when necessary
- Page file space is only reserved at the time pages are written out
- Once a page is written to the paging file, the space is occupied until the memory is deleted (e.g. at process exit), even if the page is read back from disk
Sizing the Page File / Swap Partition
- Given this understanding of page file usage, how big should the total paging file space be?
- Size should depend on total private virtual memory used by applications and drivers
- Therefore, not related to RAM size, except for taking a full memory dump
- Hibernation on Windows uses a separate file to persist memory (hiberfil.sys)
- Worst case: system has to page all private data out to make room for code pages
- To handle this, the minimum size should be the peak VM usage (“Commit Charge Peak”)
– Hard disk space is cheap, so why not double this?
- The page file can grow; a swap partition is statically sized
Contiguous Page Files
- Page File / Swap Space will fragment with use
- Pages not written contiguously
- Number of fragments will eventually impact paging performance
- Can defrag with the PageDefrag tool on Windows (freeware, www.sysinternals.com)
- Will defrag on reboot
When Page Files are Full
- When page file space runs low
- 1. “System running low on virtual memory”
– First time: before pagefile expansion
– Second time: when committed bytes reach the commit limit
- 2. “System out of virtual memory”
– Page files are full
- Look for who is consuming pagefile space:
- Process memory leak: check Task Manager, Processes tab, VM Size column
– or Perfmon “private bytes”, same counter
- Paged pool leak: check paged pool size
– Run poolmon to see what object(s) are filling the pool
– Could be a result of processes not closing handles; check process “handle count” in Task Manager