CSE 513 Introduction to Operating Systems Class 7 - Virtual Memory (2) - PowerPoint PPT Presentation

SLIDE 1

CSE 513 Introduction to Operating Systems Class 7 - Virtual Memory (2)

Jonathan Walpole

  • Dept. of Comp. Sci. and Eng.

Oregon Health and Science University

SLIDE 2

Key memory management issues

  • Utilization
  • Programmability
  • Performance
  • Protection
SLIDE 3

Memory utilization

  • What decreases memory utilization?
  • What is fragmentation and why does it occur?
  • How can fragmentation be reduced?

memory compaction? paging?

  • What problems are introduced by compaction and paging?

dynamic relocation

SLIDE 4

Supporting dynamic relocation

  • Can application programmers and compilers easily deal with dynamic relocation?
  • What new abstraction and associated hardware support is needed?

virtual address spaces
page tables for mapping virtual pages to physical page frames
MMU for automatic translation of virtual addresses to physical addresses

SLIDE 5

Paged virtual memory

SLIDE 6

Memory management using paging

Logical address: <page number, page offset>

[Figure: a virtual address space (1K pages) mapped through a page table into physical memory frames; the logical address splits into an (m-n)-bit page number and an n-bit page offset]
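The page-number/offset split above can be sketched in a few lines of Python. This is illustrative only (not from the slides); the 10-bit offset is an assumption chosen to match the figure's 1K page granularity, and the page table contents are made up.

```python
# Split a logical address into <page number, page offset> and translate
# it through a flat page table. With an n-bit offset the page size is
# 2**n; the page number is the remaining high bits.

OFFSET_BITS = 10                 # n: 1 KB pages, as in the figure
PAGE_SIZE = 1 << OFFSET_BITS     # 1024 bytes

def split(logical_addr):
    page_number = logical_addr >> OFFSET_BITS
    page_offset = logical_addr & (PAGE_SIZE - 1)
    return page_number, page_offset

def translate(logical_addr, page_table):
    """page_table is a simple list: page_table[page_number] == frame_number."""
    page, offset = split(logical_addr)
    frame = page_table[page]
    return (frame << OFFSET_BITS) | offset

page_table = [1, 4, 3]              # hypothetical mapping: page 0 -> frame 1, ...
print(split(2049))                  # page 2, offset 1
print(translate(2049, page_table))  # frame 3, so physical address 3073
```

The same shift-and-mask arithmetic is what the MMU does in hardware on every reference.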

SLIDE 7

Memory management unit (MMU)

SLIDE 8

Internal operation of an MMU

SLIDE 9

Page tables

A typical page table entry

SLIDE 10

Performance of memory translation

  • Why can’t memory address translation be done in software?
  • How often is translation done?
  • What work is involved in translating a virtual address to a physical address?

indexing into page tables
interpreting page descriptors
more memory references!

  • Do memory references to page tables hit in the cache?

if not, what is the impact on performance?

SLIDE 11

Memory hierarchy performance

The “memory” hierarchy consists of several types of memory:

L1 cache (typically on die): ~0.5 ns (1 cycle)
L2 cache (typically available): 0.5 ns - 20 ns (1 - 40 cycles)
Memory (DRAM, SRAM, RDRAM, …): 40 - 80 ns
Disk (lots of space available): 8 - 13 ms (longer than you want!)
Tape (even more space available …)

SLIDE 12

Performance of memory translation (2)

How can additional memory references be avoided?

TLB - translation look-aside buffer
an associative memory cache for page table entries
if there is locality of reference, performance is good

SLIDE 13

Translation lookaside buf f er

[Figure: the CPU issues a logical address (p, d); on a TLB hit the cached page#→frame# entry supplies frame f directly, otherwise the page table in physical memory is walked]
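A toy software model of the lookup in the figure may help: translation first consults a small cache of (page# → frame#) entries, and only on a miss walks the in-memory page table (the extra memory reference). This is purely illustrative — a real TLB is hardware searched in parallel — and the capacity, page table, and reference string are invented.

```python
# A tiny TLB model in front of a page table: hits avoid the page-table
# walk; misses walk the table and install the entry, evicting the
# least recently used entry when the TLB is full.

from collections import OrderedDict

class TLB:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = OrderedDict()   # page# -> frame#, in recency order
        self.hits = self.misses = 0

    def translate(self, page, page_table):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)        # refresh recency
        else:
            self.misses += 1                      # page-table walk needed
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict LRU entry
            self.entries[page] = page_table[page]
        return self.entries[page]

tlb = TLB(capacity=2)
page_table = {0: 5, 1: 7, 2: 9}
for p in [0, 0, 1, 0, 2, 0]:       # a reference string with locality
    tlb.translate(p, page_table)
print(tlb.hits, tlb.misses)
```

With locality of reference most translations hit in the cache, which is exactly why the TLB pays off.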

SLIDE 14

TLB entries

SLIDE 15

Page table organization

  • How big should a virtual address space be?

what factors influence its size?

  • How big are page tables?

what factors determine their size?

  • Can page tables be held entirely in cache?

can they be held entirely in memory even?

  • How should page tables be structured?

fast TLB miss handling (and write-back)? what about unused regions of memory?

SLIDE 16

Two- level page table organization

Second-level page tables Top-level page table

SLIDE 17

Inverted page table

SLIDE 18

Address space organization

  • How big should a virtual address space be?
  • Which regions of the address space should be allocated for different purposes - stack, data, instructions?
  • What if memory needs for a region increase dynamically?
  • What are segments?
  • What is the relationship between segments and pages?
  • Can segmentation and paging be used together?
  • If segments are used, how are segment selectors incorporated into addresses?

SLIDE 19

Memory protection

  • At what granularity should protection be implemented?

page-level? segment-level?

  • How is protection checking implemented?

compare page protection bits with process capabilities and operation types on every access
sounds expensive!

  • How can protection checking be done efficiently?

segment registers
protection look-aside buffers

SLIDE 20

Segmentation & paging in the Pentium

A Pentium segment selector

SLIDE 21

Segmentation & paging in the Pentium

Pentium segment descriptor

SLIDE 22

Segmentation & paging in the Pentium

Conversion of a (selector, offset) pair to a linear address

SLIDE 23

Segmentation & paging in the Pentium

Mapping of a linear address onto a physical address

SLIDE 24

Protection levels in the Pentium

Protection on the Pentium

[Figure: the Pentium’s protection rings, from level 0 (kernel) outward]

SLIDE 25

Other VM-related costs

  • What work must be done on a context switch?
  • What work must be done on process creation?
  • What work must be done on process termination?
SLIDE 26

Handling accesses to invalid pages

  • The page table is used to translate logical addresses to physical addresses
  • Pages that are not in memory are marked invalid
  • A page fault occurs when there is an access to an invalid page of a process
  • Page faults require the operating system to
  • suspend the process
  • find a free frame in memory
  • swap-in the page that had the fault
  • update the page table entry (PTE)
  • restart the process
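The steps above can be sketched as a short fault handler. This is a hedged illustration only: the PTE dict, free-frame list, and backing store are stand-ins for real kernel structures, and the example assumes a free frame exists (no replacement needed).

```python
# A sketch of the five page-fault steps: (the process is suspended by
# virtue of being in the handler), find a free frame, swap the page in
# from the backing store, update the PTE, and return so the process
# can be restarted.

def handle_page_fault(page, pte, free_frames, backing_store, memory):
    assert not pte[page]["valid"]        # fault: access to an invalid page
    frame = free_frames.pop()            # find a free frame in memory
    memory[frame] = backing_store[page]  # swap in the page that faulted
    pte[page] = {"valid": True, "frame": frame}   # update the PTE
    return frame                         # process can now be restarted

memory = [None] * 4
backing_store = {0: "page0-bytes", 1: "page1-bytes"}
pte = {0: {"valid": False, "frame": None}, 1: {"valid": False, "frame": None}}
free_frames = [3, 2]

f = handle_page_fault(0, pte, free_frames, backing_store, memory)
print(pte[0], memory[f])
```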
SLIDE 27

Anatomy of a page fault

[Figure: a reference to an invalid page traps to the O.S.; the handler finds a free frame, gets the page from the backing store, brings it into physical memory, updates the PTE, and restarts the process]

SLIDE 28

Page fault handling in more detail

  • Hardware traps to kernel
  • General registers saved
  • OS determines which virtual page needed
  • OS checks validity of address, seeks page frame
  • If selected frame is dirty, write it to disk

SLIDE 29

Page fault handling in more detail

  • OS brings new page in from disk
  • Page tables updated
  • Faulting instruction backed up to when it began
  • Faulting process scheduled
  • Registers restored
  • Program continues

SLIDE 30

Complexity of instruction backup

An instruction causing a page fault

SLIDE 31

Locking pages in memory

  • Virtual memory and I/O occasionally interact
  • Process issues call for read from device into buffer

while waiting for I/O, another process starts up
it has a page fault
the buffer for the first process may be chosen to be paged out

  • Need to specify some pages locked (pinned)

exempted from being target pages

SLIDE 32

Spare Slides

SLIDE 33

Memory management

  • Memory – a linear array of bytes

holds the O.S. and programs (processes)
each memory cell is accessed by a unique memory address

  • Recall, processes are defined by an address space, consisting of text, data, and stack regions
  • Process execution

CPU fetches instructions from memory according to the value of the program counter (PC)
Each instruction may request additional operands from the data or stack region

SLIDE 34

Memory hierarchy

The “memory” hierarchy consists of several types of memory:

L1 cache (typically on die)
L2 cache (typically available)
Memory (DRAM, SRAM, RDRAM, …)
Disk (lots of space available)
Tape (even more space available …)

SLIDE 36

Memory hierarchy

The “memory” hierarchy consists of several types of memory:

L1 cache (typically on die): ~0.5 ns (1 cycle)
L2 cache (typically available): 0.5 ns - 20 ns (1 - 40 cycles)
Memory (DRAM, SRAM, RDRAM, …): 40 - 80 ns
Disk (lots of space available): 8 - 13 ms (longer than you want!)
Tape (even more space available …)

SLIDE 37

Understanding the memory hierarchy

The memory hierarchy is extremely important in maximizing system performance

Ex: 2 GHz processor

  • If missing the L1 cache at all times you reduce it to a 50 MHz processor

The biggest “hits” in the memory system are currently:

Memory to cache interface (Hardware)
Disk to memory interface (OS)

SLIDE 38

Memory management overview

  • Memory management – dynamically manages memory between multiple processes

Keep track of which parts of memory are currently being used and by whom
Provide protection between processes
Decide which processes are to be loaded into memory when memory space becomes available
Allocate and deallocate memory space as needed
Hide the effects of slow disks

  • Consistency vs. performance
  • Maximize # of processes and throughput as well as minimize response times for requests

SLIDE 39

Simple memory management

[Figure: the operating system and processes i and j resident in physical memory, up to Max mem]

  • Load process into memory
  • Run process
SLIDE 40

Memory management issues

How should memory be partitioned?
How many processes?
Swapping
Relocation
Protection
Sharing
Logical vs. physical addresses

[Figure: the operating system and processes i and j resident in physical memory]

SLIDE 41

Degree of multiprogramming

SLIDE 42

Address generation

  • Processes generate logical addresses to physical memory when they are running
  • How/when do these addresses get generated?
  • Address binding - fixing a physical address to the logical address of a process’ address space
  • Compile time - if program location is fixed and known ahead of time
  • Load time - if program location in memory is unknown until run-time AND location is fixed
  • Execution time - if processes can be moved in memory during execution
  • Requires hardware support
SLIDE 43

Relocatable address generation

[Figure: the address of foo in program P at each stage — symbolic (_foo) after compilation, 75 after assembly, 175 after linking with library routines, 1175 after loading at base 1000]

Compilation   Assembly   Linking   Loading

SLIDE 44

[Figure: compile-time, load-time, and execution-time address binding of program P; with execution-time binding, a base register (1000) relocates addresses during execution]

SLIDE 45

Making systems more usable

  • Dynamic Loading - load only those routines that are accessed while running

+) does not load unused routines

  • Dynamic Linking - defer linking of shared code, such as system libraries and window code, until run-time

+) more efficient use of disk space

  • Overlays - allow procedures to “overlay” each other to decrease the memory size required to run the program

+) allows more programs to be run
+) programs can be larger than memory

SLIDE 46

Basics - logical and physical addressing

  • Memory Management Unit (MMU) - dynamically converts logical addresses into physical addresses
  • MMU stores base address register when loading process

[Figure: the MMU adds the relocation register for process i (1000) to each program-generated address to form the physical memory address]

SLIDE 47

Basics - swapping

  • Swapping - allows processes to be temporarily “swapped” out of main memory to a backing store (typically disk)
  • Swapping has several uses:

allows multiple programs to be run concurrently
allows the O.S. finer grain control of which processes can be run

[Figure: processes i, j, k, and m swapped in and out between physical memory and the backing store]

SLIDE 48

Basics - simple memory protection

  • “keep addresses in play”

Relocation register gives the starting address for the process
Limit register limits the offset accessible from the relocation register

[Figure: the logical address is compared against the limit register; if within bounds it is added to the relocation register to form the physical address, otherwise an addressing error is raised]
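The limit-check-then-relocate logic in the figure is small enough to state directly. A minimal sketch (illustrative values; real hardware does this comparison and addition on every reference):

```python
# Base/limit protection: a logical address is first checked against the
# limit register; only in-bounds addresses are relocated by adding the
# relocation (base) register, everything else is an addressing error.

class AddressingError(Exception):
    pass

def relocate(logical, base, limit):
    if logical >= limit:          # offset beyond the process's region
        raise AddressingError(f"address {logical} exceeds limit {limit}")
    return base + logical         # physical = relocation register + offset

print(relocate(200, base=1000, limit=500))   # in bounds -> 1200
```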

SLIDE 49

Basics - overlays

  • Overlays - allow different parts of the same program to “overlay” each other in memory to reduce the memory requirements
  • Example - scanner program

[Figure: overlay layout — 20k overlay driver, 100k window init and data structures, 120k capture code, 140k image editing code]

SLIDE 50

Memory management architectures

Fixed size allocation

Memory is divided into fixed partitions

Fixed Partitioning (partition > proc. size)

  • Different constant size partitions

Paging (partition < proc. size)

  • Constant size partitions

Dynamically sized allocation

Memory allocated to fit processes exactly

  • Dynamic Partitioning (partition > proc. size)
  • Segmentation
SLIDE 51

Multiprogramming with fixed partitions

  • Memory is divided into fixed size partitions
  • Processes loaded into partitions of equal or greater size

[Figure: per-partition job queues feeding fixed partitions at 500k, 1200k, 2800k, and 5000k holding P1-P3]

Internal Fragmentation

SLIDE 52

Multiprogramming with fixed partitions

SLIDE 53

Dynamic partitioning

  • Allocate contiguous memory equal to the process size
  • Corresponds to one job queue for memory

[Figure: a single job queue feeding dynamically sized partitions holding P1-P4]

External Fragmentation

SLIDE 54

[Figure: a sequence of allocations and terminations in a memory with a 128K O.S. and an initial 896K hole; as P1-P6 come and go, free holes of 64K-288K accumulate until a request can no longer be satisfied]

SLIDE 57

Dealing with external f ragmentation

  • Compaction – from time to time shift processes around to collect all free space into one contiguous block
  • Placement algorithms: first-fit, best-fit, worst-fit

[Figure: compaction merges the scattered 96K-128K holes around P3-P6 into a single 256K block]

SLIDE 58

Compaction examples

[Figure: best-fit vs. first-fit placement over a sequence of process arrivals and terminations (P1-P6), followed by a scan-and-compact pass]

  • 1. Scan
  • 2. Compact
SLIDE 59

Compaction algorithms

  • First-fit: place process in first hole that fits

Attempts to minimize scanning time by finding the first available hole.
Lower memory will get smaller and smaller segments (until the compaction algorithm is run)

  • Best-fit: smallest hole that fits the process

Attempts to minimize the number of compactions that need to be run

  • Worst-fit: largest hole in memory

Attempts to maximize the external fragment sizes

SLIDE 60

Memory management using paging

  • Fixed partitioning of memory suffers from internal fragmentation, due to coarse granularity of the fixed memory partitions
  • Memory management via paging:

Permit physical address space of a process to be noncontiguous
Break physical memory into fixed-size blocks called frames
Break a process’s address space into the same sized blocks called pages
Pages are relatively small compared to processes (reduces the internal fragmentation)

SLIDE 61

Memory management using paging

Logical address: <page number, page offset>

[Figure: a logical address space (1K pages) mapped through a page table into physical memory frames; the logical address splits into an (m-n)-bit page number and an n-bit page offset]

SLIDE 62

Hardware Support f or Paging

  • The page table needs to be stored somewhere

registers
main memory

  • Page Table Base Register (PTBR) - points to the in-memory location of the page table.
  • Translation Look-aside Buffers make translation faster
  • Paging Implementation Issues

Two memory accesses per address?
What if page table > page size?
How do we implement memory protection?
Can code sharing occur?

SLIDE 63

Paging system perf ormance

  • The page table is stored in memory, thus, every logical address access results in TWO physical memory accesses:
  • 1) Look up the page table
  • 2) Look up the true physical address for reference
  • To make logical to physical address translation quicker:

Translation Look-Aside Buffer - very small associative cache that maps logical page references to physical page references
Locality of Reference - a reference to an area of memory is likely to cause another access to the same area

SLIDE 64

Translation lookaside buf f er

[Figure: the CPU issues a logical address (p, d); on a TLB hit the cached page#→frame# entry supplies frame f directly, otherwise the page table in physical memory is walked]

SLIDE 65

TLB implementation

  • In order to be fast, TLBs must implement an associative search where the cache is searched in parallel.

EXPENSIVE
The number of entries varies (8 -> 2048)

  • Because the TLB translates logical pages to physical pages, the TLB must be flushed on every context switch in order to work

Can improve performance by associating process bits with each TLB entry

  • A TLB must implement an eviction policy which flushes old entries out of the TLB

Occurs when the TLB is full

SLIDE 66

Memory protection with paging

  • Associate protection bits with each page table entry

Read/Write access - can provide read-only access for re-entrant code
Valid/Invalid bits - tell the MMU whether or not the page exists in the process address space
Page Table Length Register (PTLR) - stores how long the page table is, to avoid an excessive number of unused page table entries

[Figure: a page table whose entries hold a frame number, an R/W protection bit, and a valid/invalid bit]

SLIDE 67

Multilevel paging

  • For modern computer systems, # frames << # pages
  • Example:

8 kbyte page/frame size
32-bit addresses
4 bytes per PTE

How many page table entries? How large is the page table?

  • Multilevel paging - page the page table itself
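The slide's example can be worked out directly (the numbers follow from the slide's parameters, not from any extra assumption):

```python
# 32-bit addresses with 8 KB pages give 2**32 / 2**13 = 2**19 pages,
# so a flat page table needs 2**19 entries; at 4 bytes per PTE that is
# 2 MB of table *per process* -- the motivation for paging the page
# table itself (multilevel paging).

ADDRESS_BITS = 32
PAGE_SIZE = 8 * 1024          # 8 KB pages/frames
PTE_SIZE = 4                  # bytes per page table entry

num_entries = (1 << ADDRESS_BITS) // PAGE_SIZE
table_bytes = num_entries * PTE_SIZE
print(num_entries)                      # 524288 entries
print(table_bytes // (1024 * 1024))     # 2 MB
```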
SLIDE 68

Multilevel paging

  • Page the page table
  • Logical address --> [Section #, Page #, Offset]
  • How do we calculate the size of section and page?

[Figure: a two-level lookup — p1 indexes the outer page table, p2 indexes the selected second-level page table, yielding frame f combined with offset d]
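The two-level lookup in the figure can be sketched as follows. The 10/10/12 bit split (4 KB pages) is an assumption for illustration, as is the outer-table content; the structure of the walk is the point.

```python
# Two-level translation: split a 32-bit logical address into
# [p1 | p2 | d], use p1 to find a second-level table in the outer
# (top-level) table, use p2 to find the frame, and append the offset d.

P1_BITS, P2_BITS, OFFSET_BITS = 10, 10, 12   # assumed split: 4 KB pages

def split(addr):
    d = addr & ((1 << OFFSET_BITS) - 1)
    p2 = (addr >> OFFSET_BITS) & ((1 << P2_BITS) - 1)
    p1 = addr >> (OFFSET_BITS + P2_BITS)
    return p1, p2, d

def translate(addr, outer_table):
    p1, p2, d = split(addr)
    second_level = outer_table[p1]   # only tables that are in use exist
    frame = second_level[p2]
    return (frame << OFFSET_BITS) | d

outer = {0: {1: 7}}                  # page (p1=0, p2=1) -> frame 7
print(translate((1 << OFFSET_BITS) + 5, outer))   # frame 7, offset 5
```

Representing the outer table sparsely (here a dict) mirrors the space saving: unused regions of the address space need no second-level tables.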

SLIDE 69

Virtual memory management overview

  • What have we learned about memory management?

Processes require memory to run
We have assumed that the entire process is resident during execution

  • Some functions in processes never get invoked

Error detection and recovery routines
In a graphics package, functions like smooth, sharpen, brighten, etc... may not get invoked

  • Virtual Memory - allows for the execution of processes that may not be completely in memory (extension of the paging technique from the last chapter)

Benefits?

SLIDE 70

Virtual memory overview

  • Hides physical memory from user
  • Allows higher degree of multiprogramming (only bring in pages that are accessed)
  • Allows large processes to be run on small amounts of physical memory
  • Reduces I/O required to swap in/out processes (makes the system faster)
  • Requires:

Pager - pages in / out pages as required
“Swap” space in order to hold processes that are partially complete
Hardware support to do address translation

SLIDE 71

Demand paging

  • Each process address space is broken into pages (as in the paged memory management technique)
  • Upon execution, swap in a page if it is not in memory (lazy swapping or demand paging)
  • Pager - is a process that takes care of swapping pages in/out to/from memory

[Figure: pages moved between memory and disk]

SLIDE 72

Demand paging implementation

  • One page-table entry (per page)
  • valid/invalid bit - tells whether the page is resident in memory
  • For each page brought in, mark the valid bit

[Figure: logical pages A, C, and E are resident (valid) in physical frames 9, 2, and 5; pages B and D are marked invalid]

SLIDE 73

Another example

[Figure: a second logical memory / page table / physical memory example, with pages A, B, and E referenced]

SLIDE 74

Memory Management

Chapter 4

4.1 Basic memory management
4.2 Swapping
4.3 Virtual memory
4.4 Page replacement algorithms
4.5 Modeling page replacement algorithms
4.6 Design issues for paging systems
4.7 Implementation issues
4.8 Segmentation

SLIDE 75

Memory Management

Ideally programmers want memory that is

large
fast
non-volatile

Memory hierarchy

small amount of fast, expensive memory – cache
some medium-speed, medium price main memory
gigabytes of slow, cheap disk storage

Memory manager handles the memory hierarchy

SLIDE 76

Basic Memory Management

Monoprogramming without Swapping or Paging

Three simple ways of organizing memory

  • an operating system with one user process
SLIDE 77

Analysis of Multiprogramming System Perf ormance

  • Arrival and work requirements of 4 jobs
  • CPU utilization for 1 – 4 jobs with 80% I/O wait
  • Sequence of events as jobs arrive and finish
  • note: numbers show amount of CPU time jobs get in each interval
SLIDE 78

Relocation and Protection

  • Cannot be sure where program will be loaded in memory

address locations of variables, code routines cannot be absolute
must keep a program out of other processes’ partitions

  • Use base and limit values

address locations added to base value to map to physical address
address locations larger than limit value are an error

SLIDE 79

Swapping (1)

Memory allocation changes as

processes come into memory
processes leave memory

Shaded regions are unused memory

SLIDE 80

Swapping (2)

  • Allocating space for growing data segment
  • Allocating space for growing stack & data segment
SLIDE 81

Memory Management with Bit Maps

Part of memory with 5 processes, 3 holes

tick marks show allocation units
shaded regions are free

Corresponding bit map
Same information as a list

SLIDE 82

Memory Management with Linked Lists

Four neighbor combinations for the terminating process X

SLIDE 83

Page Size (1)

Small page size

Advantages

less internal fragmentation
better fit for various data structures, code sections
less unused program in memory

Disadvantages

programs need many pages, larger page tables

SLIDE 84

Page Size (2)

Overhead due to page table and internal fragmentation:

overhead = (s · e) / p + p / 2

where

  • s = average process size in bytes
  • p = page size in bytes
  • e = size of a page table entry in bytes

The first term is the page table space; the second is the internal fragmentation (on average, half a page wasted per process).

Optimized when p = √(2 · s · e)
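The optimum can be checked numerically. The example values (s = 1 MB average process size, e = 8-byte PTE) are assumptions for illustration; they happen to give an exact answer:

```python
# overhead(p) = s*e/p + p/2, minimized at p = sqrt(2*s*e).
# With s = 1 MB and e = 8 bytes, the optimal page size is exactly 4 KB.

import math

def overhead(p, s, e):
    return s * e / p + p / 2      # page-table space + internal fragmentation

s, e = 1 << 20, 8                 # assumed: 1 MB process, 8-byte PTE
p_opt = math.sqrt(2 * s * e)
print(p_opt)                      # 4096.0

# sanity check: neighboring power-of-two page sizes cost more
assert overhead(p_opt, s, e) <= overhead(2048, s, e)
assert overhead(p_opt, s, e) <= overhead(8192, s, e)
```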
SLIDE 85

Separate Instruction and Data Spaces

One address space
Separate I and D spaces

SLIDE 86

Shared Pages

Two processes sharing the same program and its page table

SLIDE 87

Cleaning Policy

Need for a background process, the paging daemon

periodically inspects the state of memory

When too few frames are free

selects pages to evict using a replacement algorithm

It can use the same circular list (clock) as the regular page replacement algorithm, but with a different pointer

SLIDE 88

Implementation Issues

Operating System Involvement with Paging

Four times when the OS is involved with paging

  • Process creation

determine program size
create page table

  • Process execution

MMU reset for new process
TLB flushed

  • Page fault time

determine virtual address causing fault
swap target page out, needed page in

  • Process termination time

release page table, pages

SLIDE 89

Backing Store

(a) Paging to static swap area (b) Backing up pages dynamically

SLIDE 90

Separation of Policy and Mechanism

Page fault handling with an external pager

SLIDE 91

Segmentation (1)

  • One-dimensional address space with growing tables
  • One table may bump into another
SLIDE 92

Segmentation (2)

Allows each table to grow or shrink independently

SLIDE 93

Segmentation (3)

Comparison of paging and segmentation

SLIDE 94

Implementation of Pure Segmentation

(a)-(d) Development of checkerboarding
(e) Removal of the checkerboarding by compaction

SLIDE 95

Segmentation with Paging: MULTICS (1)

Descriptor segment points to page tables
Segment descriptor – numbers are field lengths

SLIDE 96

Segmentation with Paging: MULTICS (2)

A 34-bit MULTICS virtual address

SLIDE 97

Segmentation with Paging: MULTICS (3)

Conversion of a 2-part MULTICS address into a main memory address

SLIDE 98

Segmentation with Paging: MULTICS (4)

  • Simplified version of the MULTICS TLB
  • Existence of 2 page sizes makes actual TLB more complicated
SLIDE 99

Page Replacement Algorithms and Performance Modelling

SLIDE 100

Virtual memory perf ormance

  • What is the limiting factor in the performance of virtual memory systems?

In the above example, steps 5 and 6 require on the order of 10 milliseconds, while the rest of the steps require on the order of microseconds/nanoseconds.
Thus, disk accesses typically limit the performance of virtual memory systems.

  • Effective Access Time - mean memory access time from logical address to physical address retrieval

effective access time = (1 - p) * ma + p * page_fault_time
p = probability that a page fault will occur
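The EAT formula makes the disk's dominance concrete. A short check (the 100 ns memory access and 10 ms fault time are assumed figures for illustration):

```python
# EAT = (1-p)*ma + p*page_fault_time. Even a fault rate of 1-in-10000
# adds a millisecond-scale disk cost that swamps the nanosecond-scale
# memory access.

def effective_access_time(p, ma_ns, fault_ns):
    return (1 - p) * ma_ns + p * fault_ns

ma = 100                  # memory access, ns (assumed)
fault = 10_000_000        # page fault service time, ns (10 ms, assumed)
for p in (0.0, 0.0001, 0.01):
    print(p, effective_access_time(p, ma, fault))
```

At p = 0.0001 the mean access is already ~1100 ns, an 11x slowdown over the fault-free case.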

SLIDE 101

The great virtual memory struggle

  • Having the option to run programs that are partially in memory leads to a very interesting problem:
  • How many programs should we allow to run in memory at any given time?
  • We can make sure that all the pages of all processes can fit into memory

–) may be under-allocating memory

  • We can over-allocate the memory by assuming that processes will not access (or need) all their pages at the same time

–) may run out of pages in memory
+) can increase the throughput of the system

SLIDE 102

Page replacement

  • Page Replacement - is a technique which allows us to increase the degree of multiprogramming (i.e. over-allocate memory) by using the disk as “extended” memory
  • If a page fault occurs and no frames are free:

1) Find a frame in memory not currently being used
2) Select a victim page and swap it out to the swap-space on disk (changing its page table entry)
3) Use the freed frame to hold the page that caused the page fault
4) Restart the process

  • Requires two page transfers to memory (one in, one out)
SLIDE 103

Page replacement perf ormance

  • Because page replacement could potentially result in two disk transfers, a small number of page faults can greatly impact system performance
  • Ex. mem. access = 60 ns, disk access = 10 ms, page fault rate = 0.01
  • No page replacement (just demand paging)

– EAT = 0.99 * 60ns + 0.01 * 10ms = 100.060 us

  • Page replacement

– EAT = 0.99 * 60ns + 0.01 * 2 * 10ms = 200.060 us

  • Page replacement is key in virtual memory performance
  • Using a modified bit in the PTE helps by only swapping out pages that have been modified
  • Efficient page replacement and frame allocation algorithms are required for efficient virtual memory
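The slide's arithmetic checks out; a small helper reproduces both figures (the parameters are the slide's own):

```python
# EAT with 60 ns memory access, 10 ms disk access, fault rate 0.01.
# A dirty victim doubles the disk cost (swap out + swap in).

def eat_us(rate, mem_ns=60, disk_ms=10, transfers=1):
    ns = (1 - rate) * mem_ns + rate * transfers * disk_ms * 1e6
    return ns / 1000   # microseconds

print(eat_us(0.01, transfers=1))   # ~100.06 us: demand paging only
print(eat_us(0.01, transfers=2))   # ~200.06 us: swap out + swap in
```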

SLIDE 104

Page replacement algorithms

  • Page replacement algorithms determine what frame in memory should be used to handle a page fault
  • May require the frame to be swapped out to the swap-space if it has been modified
  • Page replacement algorithms are typically measured by their page fault rate, or the rate at which page faults occur.
  • Commonly referred-to algorithms:
  • First-in First-out (FIFO)
  • Optimal
  • Least-recently used (LRU)
  • Least-frequently used (LFU)
SLIDE 105

Page replacement algorithms

  • Which page should be replaced?
  • Local replacement - replace a page of the faulting process
  • Global replacement - possibly replace a page of another process in memory
  • Evaluation of algorithms?
  • Record traces of pages accessed by a process
  • Example: Virtual addresses (page, offset)
  • (3, 0), (1, 9), (4, 1), (2, 1), (5, 3), (2, 0), (1, 9), (2, 4)
  • . . . generate page trace
  • 3, 1, 4, 2, 5, 2, 1, 2
  • Simulate behavior of process and measure the number of page faults that occur

SLIDE 108

FIFO page replacement

  • Replace the page that was first brought into memory

+) Simple to implement (FIFO queue)
–) Performance not always good

  • Example: Memory system with 4 frames:

Time:     0  1  2  3  4  5  6  7  8  9  10
Requests:    c  a  d  b  e  b  a  b  c  d
Frame 0:  a  a  a  a  a  e  e  e  e  e  d
Frame 1:  b  b  b  b  b  b  b  a  a  a  a
Frame 2:  c  c  c  c  c  c  c  c  b  b  b
Frame 3:  d  d  d  d  d  d  d  d  d  c  c
Fault:                   X     X  X  X  X

(5 page faults)
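A FIFO simulator reproduces the table's fault count (the trace and preloaded frames are the slide's; the code itself is an illustrative sketch):

```python
# FIFO replacement: on a fault, evict the page that has been resident
# longest. Frames are preloaded a,b,c,d; requests are c a d b e b a b c d.

from collections import deque

def fifo_faults(requests, frames):
    queue = deque(frames)            # arrival order; leftmost is oldest
    resident = set(frames)
    faults = 0
    for page in requests:
        if page not in resident:
            faults += 1
            victim = queue.popleft() # replace the first page brought in
            resident.discard(victim)
            resident.add(page)
            queue.append(page)
    return faults

print(fifo_faults("cadbebabcd", "abcd"))   # 5 faults, as in the table
```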

SLIDE 111

Page replacement and # of pages

  • One would expect that with more memory the number of page faults would decrease
  • Belady’s Anomaly - more memory does not always mean better performance

Time:     1  2  3  4  5  6  7  8  9  10 11 12
Requests: a  b  c  d  a  b  e  a  b  c  d  e
Frame 0:  a  a  a  d  d  d  e  e  e  e  e  e
Frame 1:     b  b  b  a  a  a  a  a  c  c  c
Frame 2:        c  c  c  b  b  b  b  b  d  d
Fault:    X  X  X  X  X  X  X        X  X

(9 page faults with 3 frames)



slide-114
SLIDE 114

Belady's anomaly

4 frames

  Time      0  1  2  3  4  5  6  7  8  9  10 11 12
  Requests     a  b  c  d  a  b  e  a  b  c  d  e
  Frame 0   a  a  a  a  a  a  a  e  e  e  e  d  d
  Frame 1   b  b  b  b  b  b  b  b  a  a  a  a  e
  Frame 2   c  c  c  c  c  c  c  c  c  b  b  b  b
  Frame 3            d  d  d  d  d  d  d  c  c  c
  Fault                 X        X  X  X  X  X  X

  (7 page faults - one more than with 3 frames)
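A quick way to see the anomaly is to run the same FIFO simulation at both memory sizes (a sketch; the helper name is illustrative):

```python
from collections import deque

def fifo_faults(refs, n_frames, preload=()):
    frames, faults = deque(preload), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == n_frames:
                frames.popleft()         # evict the oldest resident page
            frames.append(page)
    return faults

refs = "abcdabeabcde"                    # reference string from the slide
# Frames a, b, c start out resident, as drawn on the slides:
print(fifo_faults(refs, 3, preload="abc"))   # 6
print(fifo_faults(refs, 4, preload="abc"))   # 7 -- more frames, more faults
```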


slide-116
SLIDE 116

Optimal page replacement

  • Replace the page that will not be needed for the longest period of time

+) Minimum number of page faults

-) How do you foresee the future?

  • Example:

  Time      0  1  2  3  4  5  6  7  8  9  10
  Requests     c  a  d  b  e  b  a  b  c  d
  Frame 0   a  a  a  a  a  a  a  a  a  a  a
  Frame 1   b  b  b  b  b  b  b  b  b  b  b
  Frame 2   c  c  c  c  c  c  c  c  c  c  c
  Frame 3   d  d  d  d  d  e  e  e  e  e  d
  Fault                    X              X

  (2 page faults)
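A sketch of the optimal (MIN) policy; the helper name is illustrative. Ties at the end of the string are broken arbitrarily, so the last victim may differ from the slide's drawing, but the fault count matches:

```python
def opt_faults(refs, n_frames, preload=()):
    """Belady's optimal (MIN) policy: on a fault, evict the resident
    page whose next use lies farthest in the future (or never comes)."""
    frames, faults = list(preload), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == n_frames:
            future = refs[i + 1:]
            # Pages never referenced again sort past every real distance.
            victim = max(frames,
                         key=lambda p: future.index(p) if p in future
                                       else len(future) + 1)
            frames.remove(victim)
        frames.append(page)
    return faults

print(opt_faults("cadbebabcd", 4, preload="abcd"))  # 2
```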


slide-118
SLIDE 118

LRU page replacement

  • Replace the page that hasn't been referenced in the longest time

  Uses recent past as a predictor for the future
  Quite widely used

  Time      0  1  2  3  4  5  6  7  8  9  10
  Requests     c  a  d  b  e  b  a  b  c  d
  Frame 0   a  a  a  a  a  a  a  a  a  a  a
  Frame 1   b  b  b  b  b  b  b  b  b  b  b
  Frame 2   c  c  c  c  c  e  e  e  e  e  d
  Frame 3   d  d  d  d  d  d  d  d  d  c  c
  Fault                    X           X  X

  (3 page faults)

slide-119
SLIDE 119

LRU implementation

  • LRU requires the O.S. to keep track of the accesses to the pages in memory

Exact LRU implementation

  1) Counters - save a "clock" value for each reference; the page with the
     smallest "clock" value is the victim page
  2) Stacks - every time a page is referenced, put it at the top of the stack;
     the bottom of the stack is the victim page
  Both require keeping fairly detailed information

Approximate LRU implementation

  1) The clock algorithm (second-chance algorithm)
  2) Two-handed clock algorithm
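The counter variant of exact LRU can be sketched as follows (illustrative names; a real system would stamp pages in hardware on each access):

```python
def lru_counter_faults(refs, n_frames, preload=()):
    """Exact LRU with per-page 'clock' counters: every reference
    stamps the page with the current clock value; on a fault the
    victim is the page with the smallest (oldest) stamp."""
    stamp = {p: i for i, p in enumerate(preload)}   # page -> last use
    clock = len(preload)
    faults = 0
    for page in refs:
        clock += 1
        if page not in stamp:
            faults += 1
            if len(stamp) == n_frames:
                victim = min(stamp, key=stamp.get)  # smallest clock value
                del stamp[victim]
        stamp[page] = clock
    return faults

print(lru_counter_faults("cadbebabcd", 4, preload="abcd"))  # 3
```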


slide-123
SLIDE 123

LRU implementation

Take the referenced page and put it on top of the stack

  Time      1  2  3  4  5  6  7  8  9  10
  Request   c  a  d  b  e  b  a  b  c  d
  top       C  A  D  B  E  B  A  B  C  D
            A  C  A  D  B  E  B  A  B  C
            B  B  C  A  D  D  E  E  A  B
  bottom    D  D  B  C  A  A  D  D  E  A
  Fault                 X           X  X

  (on each fault the victim is the page at the bottom of the stack)
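The stack bookkeeping in the table can be sketched directly (hypothetical helper); the final stack matches the last column, D C B A:

```python
def lru_stack(refs, preload, n_frames=4):
    """Stack LRU: a referenced page moves to the top of the stack;
    on a fault the victim is taken from the bottom."""
    stack = list(preload)            # index 0 = top (most recently used)
    faults = 0
    for page in refs:
        if page in stack:
            stack.remove(page)       # hit: pull it out and re-push on top
        else:
            faults += 1
            if len(stack) == n_frames:
                stack.pop()          # bottom of the stack is the victim
        stack.insert(0, page)
    return stack, faults

stack, faults = lru_stack("cadbebabcd", preload="abcd")
print(stack, faults)                 # ['d', 'c', 'b', 'a'] 3
```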

slide-124
SLIDE 124

Clock algorithm

  • Maintain a circular list of pages in memory
  • Set a clock bit for the page when a page is referenced
  • The clock hand sweeps over memory looking for a page that does not have
    the clock bit set
  • Replace pages that haven't been referenced for one complete clock revolution

  (Figure: circular list of frames 1-5, each with a clock bit)
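A minimal sketch of the standard second-chance clock (clearing bits as the hand passes). Note this is the textbook variant, not the slides' "clear one page per reference" variant, so its fault counts need not match the slide tables:

```python
def clock_faults(refs, preload):
    """Second-chance clock: frames form a circular list, each with a
    reference bit. On a fault the hand clears set bits as it passes
    and replaces the first page whose bit is already 0."""
    frames = list(preload)
    rbit = [1] * len(frames)              # all pages recently referenced
    hand, faults = 0, 0
    for page in refs:
        if page in frames:
            rbit[frames.index(page)] = 1  # hit: set the clock bit
            continue
        faults += 1
        while rbit[hand]:                 # referenced pages get a second chance
            rbit[hand] = 0
            hand = (hand + 1) % len(frames)
        frames[hand] = page               # bit was 0: replace this page
        rbit[hand] = 1
        hand = (hand + 1) % len(frames)
    return faults

print(clock_faults("cadbebabcd", "abcd"))  # 4
```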


slide-131
SLIDE 131

Clock algorithm

Example: clear one page per reference

  Time      0  1  2  3  4  5  6  7  8  9  10
  Requests     c  a  d  b  e  b  a  b  c  d
  Frame 0   a  a  a  a  a  a  a  a  a  a  a
  Frame 1   b  b  b  b  b  b  b  b  b  b  b
  Frame 2   c  c  c  c  c  e  e  e  e  c  c
  Frame 3   d  d  d  d  d  d  d  d  d  d  d
  Fault                    X           X

  (2 page faults: e replaces c at time 5, c replaces e at time 9)

slide-132
SLIDE 132

Theory and practice

  • Identifying a victim frame on each page fault typically requires two disk
    accesses per page fault

  • Alternatively, the O.S. can keep several pages free in anticipation of
    upcoming page faults. In Unix: low and high water marks

    low water mark < # free pages < high water mark

slide-133
SLIDE 133

Free pages and the clock algorithm

  • The rate at which the clock sweeps through memory determines the number
    of pages that are kept free:

    Too high a rate --> too many free pages marked
    Too low a rate  --> not enough (or no) free pages marked

  • Large memory system considerations

    As memory systems grow, it takes longer and longer for the hand to sweep
    through memory

    This washes out the effect of the clock somewhat
    Can use a two-handed clock to reduce the time between the passing of
    the hands

slide-134
SLIDE 134

The UNIX memory model

UNIX page replacement

Two-handed clock algorithm for page replacement

  • If a page has not been accessed, move it to the free list for use as an
    allocatable page
    - If modified/dirty, write it to disk (but still keep the contents in memory)
    - If unmodified, just move it to the free list

High and low water marks for free pages
Pages on the free list can be re-allocated if they are accessed again before
being overwritten

slide-135
SLIDE 135

VM and multiprogramming

  • Goal: Maximize the # of processes, minimize response time
  • Measurements of real operating systems have led to the following CPU
    utilization observations:

  • Thrashing - when the CPU is spending all of its time swapping pages
    in/out from/to disk

  (Figure: CPU utilization vs. degree of multiprogramming, falling off
  sharply at the onset of thrashing)

slide-136
SLIDE 136

Prevention of thrashing

  • In order to prevent thrashing, we really need to know how many pages a
    process needs at any given time

  • Given these numbers, we then allocate memory such that the sum of all
    the needs of the processes is less than the total memory available.
    Problem - each process' set of required pages dynamically changes during
    its execution!

  • Locality model
  • As processes execute, they move from locality to locality, each with a
    set of pages that are actively used together.

  • Programs consist of several localities
slide-137
SLIDE 137

Working set model

  • Based on the assumption of locality
  • Use parameter ∆ to define the working-set window
  • The set of pages in the most recent ∆ references is the working set

  • Working sets and virtual memory
  • Working sets change over time
  • Want to make sure that the sum of all the processes' working sets is
    less than the memory size

  • Prevents thrashing, while keeping the degree of multiprogramming high

  • Has a large amount of overhead
slide-138
SLIDE 138

Working set modeling

Given a fixed ∆, processes exhibit working sets similar to the graph below:

  (Figure: working-set size over time, with plateaus at each locality)

slide-139
SLIDE 139

Prevention of thrashing

  • The working-set model gives a reasonably accurate measurement of the
    number of pages needed by a process at any given time

  • Requires keeping track of the working set
  • Another method for preventing thrashing is dynamically measuring the
    page fault frequency

  • If the page fault frequency is high, we know that the process requires
    more frames

  • If the page fault frequency is low, then the process may have too many
    frames

  • Like the low and high water marks for memory, we can do the same for
    page fault frequencies
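The water-mark idea for fault frequency can be sketched as a simple controller. The thresholds below are made-up numbers, purely illustrative; a real system would tune them empirically:

```python
PFF_HIGH = 0.05   # assumed upper water mark: faults per reference
PFF_LOW = 0.01    # assumed lower water mark

def adjust_frames(n_faults, n_refs, n_frames):
    """Page-fault-frequency control: give a process more frames when
    it faults too often, reclaim one when it rarely faults."""
    rate = n_faults / n_refs
    if rate > PFF_HIGH:
        return n_frames + 1              # too many faults: grow allocation
    if rate < PFF_LOW:
        return max(1, n_frames - 1)      # hardly faulting: shrink allocation
    return n_frames                      # within the water marks: leave as is

print(adjust_frames(8, 100, 10))   # 11
print(adjust_frames(0, 100, 10))   # 9
```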

slide-140
SLIDE 140

Page Replacement Algorithms

Page fault forces a choice

  which page must be removed to make room for the incoming page

A modified page must first be saved

  an unmodified one is just overwritten

Better not to choose an often used page

  it will probably need to be brought back in soon

slide-141
SLIDE 141

Optimal Page Replacement Algorithm

Replace the page needed at the farthest point in the future

  Optimal but unrealizable

Estimate by …

  logging page use on previous runs of the process
  although this is impractical

slide-142
SLIDE 142

Not Recently Used Page Replacement Algorithm

  • Each page has a Referenced bit and a Modified bit
  • bits are set when the page is referenced or modified
  • Pages are classified

  Class 0: not referenced, not modified
  Class 1: not referenced, modified
  Class 2: referenced, not modified
  Class 3: referenced, modified

  • NRU removes a page at random
  • from the lowest numbered non-empty class
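The class-then-random selection can be sketched as follows (the page snapshot is hypothetical, made up for illustration):

```python
import random

def nru_victim(pages):
    """NRU: classify pages by their (R, M) bits and evict a random
    page from the lowest-numbered non-empty class.
    Class = 2*R + M, so class 0 is (not referenced, not modified)."""
    classes = {0: [], 1: [], 2: [], 3: []}
    for name, r, m in pages:
        classes[2 * r + m].append(name)
    for c in range(4):
        if classes[c]:
            return random.choice(classes[c])

# Hypothetical snapshot of (page, R bit, M bit):
pages = [("a", 1, 1), ("b", 0, 0), ("c", 1, 0), ("d", 0, 1)]
print(nru_victim(pages))   # 'b' -- the only class-0 page
```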
slide-143
SLIDE 143

FIFO Page Replacement Algorithm

Maintain a linked list of all pages

  in the order they came into memory

Page at the beginning of the list is replaced

Disadvantage

  the page in memory the longest may be often used

slide-144
SLIDE 144

Second Chance Page Replacement Algorithm

Operation of second chance

  pages sorted in FIFO order in a page list
  if a fault occurs at time 20 and A has its R bit set
  (numbers above pages are loading times)

slide-145
SLIDE 145

The Clock Page Replacement Algorithm

slide-146
SLIDE 146

Least Recently Used (LRU)

Assume pages used recently will be used again soon

  throw out the page that has been unused for the longest time

Must keep a linked list of pages

  most recently used at front, least at rear
  update this list on every memory reference!!

Alternatively keep a counter in each page table entry

  choose the page with the lowest valued counter
  periodically zero the counter

slide-147
SLIDE 147

Simulating LRU in Software (1)

LRU using a matrix – pages referenced in order

0, 1, 2, 3, 2, 1, 0, 3, 2, 3
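The matrix method can be sketched for the reference order above: on each reference the page's row is set and its column cleared, so after the last reference the row with the smallest binary value belongs to the least recently used page.

```python
def lru_matrix(refs, n_pages):
    """Hardware LRU via an n x n bit matrix: on a reference to page k,
    set all bits of row k to 1, then clear all bits of column k."""
    m = [[0] * n_pages for _ in range(n_pages)]
    for k in refs:
        m[k] = [1] * n_pages             # set row k
        for i in range(n_pages):
            m[i][k] = 0                  # clear column k (including m[k][k])
    return m

m = lru_matrix([0, 1, 2, 3, 2, 1, 0, 3, 2, 3], 4)
rows = [int("".join(map(str, row)), 2) for row in m]
print(rows.index(min(rows)))             # 1: page 1 is least recently used
```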

slide-148
SLIDE 148

Simulating LRU in Software (2)

  • The aging algorithm simulates LRU in software
  • Note 6 pages for 5 clock ticks, (a) – (e)
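The aging update can be sketched as follows; the per-tick R-bit history below is made up for illustration (not the figure's values):

```python
def aging_counters(r_history, n_bits=8):
    """Aging: on every clock tick each page's counter is shifted right
    one bit and that tick's R bit is inserted at the leftmost position.
    The page with the smallest counter is the replacement candidate."""
    counters = {}
    for page, rbits in r_history.items():
        c = 0
        for r in rbits:                  # one R bit per clock tick
            c = (c >> 1) | (r << (n_bits - 1))
        counters[page] = c
    return counters

# Hypothetical R bits over five ticks for three pages:
history = {0: [1, 0, 1, 1, 0], 1: [1, 1, 0, 0, 0], 2: [0, 1, 1, 0, 1]}
c = aging_counters(history)
print(min(c, key=c.get))   # 1: page 1 was referenced only early on
```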
slide-149
SLIDE 149

The Working Set Page Replacement Algorithm (1)

  • The working set is the set of pages used by the k most recent memory
    references

  • w(k,t) is the size of the working set at time t
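The definition can be sketched directly (the slide defines w(k,t) as the *size*; this illustrative helper returns the set itself, whose size is w(k,t)):

```python
def working_set(refs, t, k):
    """The set of distinct pages among the k most recent memory
    references up to and including time t (1-indexed)."""
    return set(refs[max(0, t - k):t])

refs = "cadbebabcd"        # reference string used earlier in the slides
ws = working_set(refs, t=8, k=4)
print(sorted(ws))          # ['a', 'b', 'e']
print(len(ws))             # w(4, 8) = 3
```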
slide-150
SLIDE 150

The Working Set Page Replacement Algorithm (2)

The working set algorithm

slide-151
SLIDE 151

The WSClock Page Replacement Algorithm

Operation of the WSClock algorithm

slide-152
SLIDE 152

Review of Page Replacement Algorithms

slide-153
SLIDE 153

Modeling Page Replacement Algorithms: Belady's Anomaly

  • FIFO with 3 page frames
  • FIFO with 4 page frames
  • P's show which page references cause page faults
slide-154
SLIDE 154

Stack Algorithms

State of memory array, M, after each item in the reference string is processed

(Figure: memory array M for the reference string beginning 7 4 6 5 …)

slide-155
SLIDE 155

The Distance String

Probability density functions for two hypothetical distance strings

slide-156
SLIDE 156

The Distance String

  • Computation of the page fault rate from the distance string

  the C vector
  the F vector

slide-157
SLIDE 157

Design Issues for Paging Systems

Local versus Global Allocation Policies (1)

Original configuration; local page replacement; global page replacement

slide-158
SLIDE 158

Local versus Global Allocation Policies (2)

Page fault rate as a function of the number of page frames assigned

slide-159
SLIDE 159

Load Control

Despite good designs, the system may still thrash

When the PFF algorithm indicates

  some processes need more memory
  but no processes need less

Solution:

  Reduce the number of processes competing for memory

  swap one or more to disk, divide up the pages they held
  reconsider the degree of multiprogramming