Chapter 4: Memory Management
Part 1: Mechanisms for Managing Memory
CS 1550, cs.pitt.edu


Slide 1: Chapter 4: Memory Management, Part 1: Mechanisms for Managing Memory

Slide 2: Memory management

- Basic memory management
- Swapping
- Virtual memory
- Page replacement algorithms
- Modeling page replacement algorithms
- Design issues for paging systems
- Implementation issues
- Segmentation

Slide 3: In an ideal world…

- The ideal world has memory that is:
  - Very large
  - Very fast
  - Non-volatile (doesn’t go away when power is turned off)
- The real world has memory that is:
  - Very large
  - Very fast
  - Affordable
  ⇒ Pick any two…
- Memory management goal: make the real world look as much like the ideal world as possible

Slide 4: Memory hierarchy

- What is the memory hierarchy?
  - Different levels of memory
  - Some are small & fast
  - Others are large & slow
- What levels are usually included?
  - Cache: small amount of fast, expensive memory
    - L1 (level 1) cache: usually on the CPU chip
    - L2 & L3 cache: made of SRAM (off-chip in older designs, on-chip in modern CPUs)
  - Main memory: medium-speed, medium-price memory (DRAM)
  - Disk: many gigabytes of slow, cheap, non-volatile storage
- The memory manager handles the memory hierarchy

Slide 5: Basic memory management

- Components include:
  - Operating system (perhaps with device drivers)
  - A single process
- Goal: lay these out in memory
  - Memory protection may not be an issue (only one program)
  - Flexibility may still be useful (allow OS changes, etc.)
- No swapping or paging

[Figure: three single-process layouts: (a) OS in RAM at the bottom, user program above it; (b) OS in ROM at the top of memory (0xFFFF), user program below; (c) device drivers in ROM at the top, user program in the middle, OS in RAM at the bottom]

Slide 6: Fixed partitions: multiple programs

- Fixed memory partitions
  - Divide memory into fixed spaces
  - Assign a process to a space when it’s free
- Mechanisms
  - Separate input queues for each partition
  - Single input queue: better ability to optimize CPU usage

[Figure: two variants of memory divided into the OS plus partitions 1–4 at 100K, 500K, 600K, 700K, and 900K; one with a separate input queue per partition, one with a single shared input queue]

Slide 7: How many programs is enough?

- Several memory partitions (fixed or variable size)
- Lots of processes wanting to use the CPU
- Tradeoff:
  - More processes utilize the CPU better
  - Fewer processes use less memory (cheaper!)
- How many processes do we need to keep the CPU fully utilized?
  - This will help determine how much memory we need
  - Is this still relevant with memory costing less than $1/GB?

Slide 8: Modeling multiprogramming

- More I/O wait means less processor utilization
  - At 20% I/O wait, 3–4 processes fully utilize the CPU
  - At 80% I/O wait, even 10 processes aren’t enough
- This means that the OS should have more processes if they’re I/O bound
- More processes => memory management & protection become more important (the utilization model is sketched below)

[Figure: CPU utilization (0–1) versus degree of multiprogramming (1–10 processes), one curve each for 80%, 50%, and 20% I/O wait]
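The shape of these curves comes from the standard independence model (an assumption here, since the slide doesn’t state it): if each of n processes spends a fraction p of its time waiting for I/O, the CPU is idle only when all n are waiting at once, so utilization ≈ 1 − p^n. A minimal sketch of that calculation:

```c
#include <math.h>
#include <stdio.h>

/* CPU utilization under the classic multiprogramming model:
 * n processes each wait on I/O a fraction p of the time,
 * independently, so the CPU idles only when all n wait at once. */
static double utilization(double p, int n) {
    return 1.0 - pow(p, n);
}

int main(void) {
    printf("20%% I/O wait, 4 processes:  %.3f\n", utilization(0.2, 4));  /* ~0.998 */
    printf("80%% I/O wait, 10 processes: %.3f\n", utilization(0.8, 10)); /* ~0.893 */
    return 0;
}
```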

Slide 9: Multiprogrammed system performance

- Arrival and work requirements of 4 jobs
- CPU utilization for 1–4 jobs with 80% I/O wait
- Sequence of events as jobs arrive and finish
  - Numbers show the amount of CPU time jobs get in each interval
  - More processes => better utilization, less time per process

Job   Arrival time   CPU needed
1     10:00          4
2     10:10          3
3     10:15          2
4     10:20          2

# of jobs      1     2     3     4
CPU idle     0.80  0.64  0.51  0.41
CPU busy     0.20  0.36  0.49  0.59
CPU/process  0.20  0.18  0.16  0.15

[Figure: timeline of the four jobs from 10:00 onward, with finish events at 22, 27.6, 28.2, and 31.7 minutes]

Slide 10: Memory and multiprogramming

- Memory needs two things for multiprogramming:
  - Relocation
  - Protection
- The OS cannot be certain where a program will be loaded in memory
  - Variables and procedures can’t use absolute locations in memory
  - Several ways to guarantee this
- The OS must keep processes’ memory separate
  - Protect a process from other processes reading or modifying its memory
  - Protect a process from modifying its own memory in undesirable ways (such as writing to program code)

Slide 11: Base and limit registers

- Special CPU registers: base & limit
  - Access to the registers is limited to system mode
  - Registers contain:
    - Base: start of the process’s memory partition
    - Limit: length of the process’s memory partition
- Address generation
  - Physical address: location in actual memory
  - Logical address: location from the process’s point of view
  - Physical address = base + logical address
  - Logical address larger than limit => error
- Example: base = 0x9000, limit = 0x2000; logical address 0x1204 maps to physical address 0x1204 + 0x9000 = 0xa204

[Figure: the process partition placed at base 0x9000 in physical memory, OS below it, top of memory at 0xFFFF]
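A minimal sketch of what the hardware does on every access, using the example values above (the struct and function names are illustrative, not a real API):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Base-and-limit relocation: values match the slide's example. */
typedef struct {
    uint32_t base;   /* start of the process's partition */
    uint32_t limit;  /* length of the partition */
} relocation_regs;

static uint32_t translate(relocation_regs r, uint32_t logical) {
    if (logical >= r.limit) {           /* beyond the partition: trap */
        fprintf(stderr, "protection fault at 0x%x\n", logical);
        exit(1);
    }
    return r.base + logical;            /* physical = base + logical */
}

int main(void) {
    relocation_regs r = { 0x9000, 0x2000 };
    printf("0x%x\n", translate(r, 0x1204));  /* prints 0xa204 */
    return 0;
}
```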

Slide 12: Swapping

- Memory allocation changes as:
  - Processes come into memory
  - Processes leave memory
    - Swapped to disk
    - Complete execution
- Gray regions are unused memory

[Figure: seven snapshots of memory over time as processes A, B, C, and D are loaded, swapped out, and swapped back in, leaving holes between allocations]

Slide 13: Swapping: leaving room to grow

- Need to allow for programs to grow
  - Allocate more memory for data
  - Larger stack
- Handled by allocating more space than is necessary at the start
  - Inefficient: wastes memory that’s not currently in use
- What if the process requests too much memory?

[Figure: processes A and B in memory, each laid out as code, data, and stack, with room to grow reserved between the data and stack regions]

Slide 14: Tracking memory usage: bitmaps

- Keep track of free / allocated memory regions with a bitmap
  - One bit in the map corresponds to a fixed-size region of memory
  - The bitmap is a constant size for a given amount of memory, regardless of how much is allocated at a particular time
- Chunk size determines efficiency
  - At 1 bit per 4KB chunk, we need just 256 bits (32 bytes) per MB of memory
  - For smaller chunks, we need more memory for the bitmap
  - Can be difficult to find large contiguous free areas in the bitmap (see the allocation sketch below)

[Figure: memory regions A–D laid out over 32 chunks, with the corresponding bitmap 11111100 00111000 01111111 11111000]
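To see why large contiguous areas are hard to find, here is a first-fit search over a bitmap; a sketch assuming 1 = allocated and a fixed chunk size (all names are hypothetical):

```c
#include <stdint.h>

/* Minimal bitmap allocator sketch: one bit per fixed-size chunk,
 * 0 = free, 1 = allocated. */
#define NCHUNKS 1024
static uint8_t bitmap[NCHUNKS / 8];

static int  bit(int i)     { return (bitmap[i / 8] >> (i % 8)) & 1; }
static void set_bit(int i) { bitmap[i / 8] |= 1 << (i % 8); }

/* First fit: return the index of the first run of `count` free
 * chunks, marking them allocated, or -1 if no such run exists.
 * The scan must examine every bit, which is the cost the slide
 * warns about. */
int bitmap_alloc(int count) {
    int run = 0;
    for (int i = 0; i < NCHUNKS; i++) {
        run = bit(i) ? 0 : run + 1;        /* extend or reset the free run */
        if (run == count) {
            for (int j = i - count + 1; j <= i; j++)
                set_bit(j);                /* mark the region allocated */
            return i - count + 1;          /* first chunk of the region */
        }
    }
    return -1;                             /* no contiguous region found */
}
```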

Slide 15: Tracking memory usage: linked lists

- Keep track of free / allocated memory regions with a linked list
  - Each entry in the list corresponds to a contiguous region of memory
  - An entry can indicate either allocated or free (and, optionally, the owning process)
  - May have separate lists for free and allocated areas
- Efficient if chunks are large
  - Fixed-size representation for each region
  - More regions => more space needed for free lists

[Figure: the same memory regions A–D represented as a linked list of entries, e.g. (A, start 0, length 6), (hole, 6, 4), (B, 10, 3), (hole, 13, 4), (C, 17, 9), (D, 26, 3), (hole, 29, 3)]

Slide 16: Allocating memory

- Search through the region list to find a large enough space
- Suppose there are several choices: which one to use? (First fit and best fit are sketched in code below.)
  - First fit: the first suitable hole on the list
  - Next fit: the first suitable hole after the previously allocated hole
  - Best fit: the smallest hole that is larger than the desired region (wastes the least space?)
  - Worst fit: the largest available hole (leaves the largest fragment)
- Option: maintain separate queues for different-size holes

Example free list (start, length): (6, 5), (19, 14), (52, 25), (102, 30), (135, 16), (202, 10), (302, 20), (350, 30), (411, 19), (510, 3)

- Allocate 20 blocks, first fit: leaves a fragment of 5
- Allocate 12 blocks, next fit: leaves a fragment of 18
- Allocate 13 blocks, best fit: leaves a fragment of 1
- Allocate 15 blocks, worst fit: leaves a fragment of 15
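A sketch of first fit and best fit over a free list of (start, length) holes; the `hole` type is illustrative, and a real allocator would also split the chosen hole and handle ties:

```c
#include <stddef.h>

/* Free-list node: one contiguous run of free blocks, as in the
 * example list above. Illustrative, not a production allocator. */
typedef struct hole {
    size_t start, len;
    struct hole *next;
} hole;

/* First fit: take the first hole that is big enough. */
hole *first_fit(hole *list, size_t want) {
    for (hole *h = list; h != NULL; h = h->next)
        if (h->len >= want)
            return h;
    return NULL;
}

/* Best fit: take the smallest hole that still fits; wastes the
 * least space in that hole but tends to leave tiny fragments. */
hole *best_fit(hole *list, size_t want) {
    hole *best = NULL;
    for (hole *h = list; h != NULL; h = h->next)
        if (h->len >= want && (best == NULL || h->len < best->len))
            best = h;
    return best;
}
```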

Slide 17: Freeing memory

- Allocation structures must be updated when memory is freed
  - Easy with bitmaps: just clear the appropriate bits (mark them free)
  - Linked lists: modify adjacent elements as needed
    - Merge adjacent free regions into a single region
    - May involve merging two regions with the just-freed area

[Figure: four cases of freeing region X: both neighbors allocated (X simply becomes a hole), a free neighbor on one side or the other (merge with it), or both neighbors free (merge all three into one hole)]
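Coalescing in the linked-list scheme can be sketched as follows, assuming the same illustrative `hole` type as above and a list kept sorted by start address:

```c
/* Merge a just-freed region with a free successor; running the same
 * check against the predecessor covers all four cases in the figure. */
void merge_with_next(hole *h) {
    hole *n = h->next;
    if (n != NULL && h->start + h->len == n->start) { /* adjacent? */
        h->len += n->len;    /* absorb the successor's blocks */
        h->next = n->next;   /* unlink it (free the node if heap-allocated) */
    }
}
```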

Slide 18: Limitations of swapping

- Problems with swapping
  - A process must fit into physical memory (impossible to run larger processes)
  - Memory becomes fragmented
    - External fragmentation: lots of small free areas
    - Compaction needed to reassemble larger free areas
  - Processes are either in memory or on disk: half and half doesn’t do any good
- Overlays solved the first problem
  - Bring in pieces of the process over time (typically data)
  - Still doesn’t solve the problem of fragmentation or partially resident processes

Slide 19: Virtual memory

- Basic idea: allow the OS to hand out more memory than exists on the system
  - Keep recently used stuff in physical memory
  - Move less recently used stuff to disk
- Keep all of this hidden from processes
  - Processes still see an address space from 0 to the max address
  - Movement of information to and from disk is handled by the OS without process help
- Virtual memory (VM) is especially helpful in multiprogrammed systems
  - The CPU schedules process B while process A waits for its memory to be retrieved from disk

Slide 20: Virtual and physical addresses

- Programs use virtual addresses
  - Addresses are local to the process
  - Hardware translates each virtual address to a physical address
- Translation is done by the Memory Management Unit (MMU)
  - Usually on the same chip as the CPU
  - Only physical addresses leave the CPU/MMU chip
- Physical memory is indexed by physical addresses

[Figure: CPU chip containing the CPU and MMU; virtual addresses pass from the CPU to the MMU, and physical addresses go out over the bus to memory and the disk controller]

Slide 21: Paging and page tables

- Virtual addresses are mapped to physical addresses
  - The unit of mapping is called a page
  - All addresses in the same virtual page are in the same physical page
  - A page table entry (PTE) contains the translation for a single page
- The table translates a virtual page number to a physical page number
  - Not all virtual memory has a physical page
  - Not every physical page need be used
- Example:
  - 64 KB virtual memory
  - 32 KB physical memory

[Figure: a 64 KB virtual address space in 4 KB pages mapped onto 32 KB of physical memory; some virtual pages map to frames (e.g., 0–4K => frame 7, 4–8K => frame 4, 28–32K => frame 3), while others are unmapped]

Slide 22: What’s in a page table entry?

- Each entry in the page table contains:
  - Valid bit: set if this logical page number has a corresponding physical frame in memory
    - If not valid, the remainder of the PTE is irrelevant
  - Page frame number: the page’s location in physical memory
  - Referenced bit: set if data on the page has been accessed
  - Dirty (modified) bit: set if data on the page has been modified
  - Protection information

[PTE layout: page frame number | V (valid) | R (referenced) | D (dirty) | protection]
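One way to picture such an entry is as a packed 32-bit structure. This layout is hypothetical; real PTE formats are architecture-specific:

```c
#include <stdint.h>

/* Hypothetical 32-bit PTE: a 20-bit frame number (enough for 4 KB
 * pages in a 32-bit physical address space) plus the flag bits
 * described above. Real layouts vary by architecture. */
typedef struct {
    uint32_t frame      : 20; /* physical page frame number */
    uint32_t valid      : 1;  /* translation present in memory */
    uint32_t referenced : 1;  /* set when the page is accessed */
    uint32_t dirty      : 1;  /* set when the page is written */
    uint32_t prot       : 3;  /* protection (e.g., r/w/x) */
    uint32_t unused     : 6;
} pte;
```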

Slide 23: Mapping logical => physical addresses

- Split the address from the CPU into two pieces:
  - Page number (p)
  - Page offset (d)
- Page number
  - Index into the page table
  - The page table contains the base address of the page in physical memory
- Page offset
  - Added to the base address to get the actual physical memory address
- Page size = 2^d bytes
- Example: 4 KB (= 4096-byte) pages with 32-bit logical addresses: 2^d = 4096, so d = 12 offset bits, leaving 32 - 12 = 20 bits of page number

[Address layout: page number p = 20 bits | page offset d = 12 bits]
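The split itself is just a shift and a mask; a sketch using the 4 KB / 32-bit numbers above (the address value is arbitrary):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT  12                       /* 2^12 = 4096-byte pages */
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

int main(void) {
    uint32_t logical = 0x12345678;           /* arbitrary example address */
    uint32_t p = logical >> PAGE_SHIFT;      /* page number: 0x12345 */
    uint32_t d = logical & OFFSET_MASK;      /* page offset: 0x678 */
    /* physical address = (frame_of(p) << PAGE_SHIFT) | d */
    printf("p = 0x%x, d = 0x%x\n", p, d);
    return 0;
}
```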

Slide 24: Address translation architecture

[Figure: the CPU issues a logical address split into (p, d); p indexes the page table to fetch frame number f, and (f, d) selects the byte within frame f of physical memory]

Slide 25: Memory & paging structures

[Figure: two processes, P0 (pages 0–4) and P1 (pages 0–1), each with its own page table mapping its logical pages to scattered frames of physical memory; the remaining frames are free]

Slide 26: Two-level page tables

- Problem: page tables can be too large
  - 2^32 bytes in 4 KB pages needs ~1 million PTEs
- Solution: use multi-level page tables
  - The “page size” covered by one first-level entry is large (megabytes)
  - A PTE marked invalid in the first-level table needs no 2nd-level page table
- The 1st-level page table holds pointers to 2nd-level page tables
- The 2nd-level page tables hold the actual physical page numbers

[Figure: a 1st-level page table whose entries point to 2nd-level page tables, whose entries in turn point to frames of main memory]

Slide 27: More on two-level page tables

- Tradeoffs between 1st- and 2nd-level page table sizes
  - The total number of bits indexing the 1st and 2nd levels is constant for a given page size and logical address length
  - Tradeoff between the number of bits indexing the 1st level and the number indexing the 2nd level
    - More bits in the 1st level: finer granularity at the 2nd level
    - Fewer bits in the 1st level: maybe less wasted space?
- All addresses in the tables are physical addresses
- Protection bits are kept in the 2nd-level table

Slide 28: Two-level paging: example

- System characteristics
  - 8 KB pages
  - 32-bit logical address divided into a 13-bit page offset and a 19-bit page number
- The page number is divided into:
  - A 10-bit index into the 1st-level page table (p1)
  - A 9-bit index into the 2nd-level page table (p2)
- Logical address layout: p1 = 10 bits | p2 = 9 bits | offset = 13 bits
  - p1 is an index into the 1st-level page table
  - p2 is an index into the 2nd-level page table pointed to by p1

Slide 29: 2-level address translation example

[Figure: the logical address (p1 = 10 bits, p2 = 9 bits, offset = 13 bits) is translated by using p1 to index the 1st-level page table (found via the page table base register), following that entry to a 2nd-level page table, indexing it with p2 to obtain a 19-bit frame number, and concatenating the frame number with the 13-bit offset to form the physical address]
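In software form, the walk might look like this sketch for the 10/9/13 split above; the table formats are simplified and hypothetical:

```c
#include <stdint.h>

#define P1_SHIFT 22          /* top 10 bits of the address */
#define P2_SHIFT 13          /* middle 9 bits */
#define P2_MASK  0x1ffu
#define OFF_MASK 0x1fffu     /* low 13 bits: 8 KB pages */

typedef struct { uint32_t frame : 19, valid : 1, flags : 12; } pte2;
typedef struct { pte2 *table; int valid; } pte1;  /* 1st-level entry */

/* Walk both levels; returns the physical address, or -1 if either
 * level is marked invalid (which would trigger a fault). */
int64_t translate(const pte1 *level1, uint32_t vaddr) {
    uint32_t p1  = vaddr >> P1_SHIFT;
    uint32_t p2  = (vaddr >> P2_SHIFT) & P2_MASK;
    uint32_t off = vaddr & OFF_MASK;
    if (!level1[p1].valid)
        return -1;                          /* no 2nd-level table */
    pte2 e = level1[p1].table[p2];
    if (!e.valid)
        return -1;                          /* page not resident */
    return ((int64_t)e.frame << P2_SHIFT) | off;
}
```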

Slide 30: Implementing page tables in hardware

- The page table resides in main (physical) memory
- The CPU uses special registers for paging
  - Page table base register (PTBR): points to the page table
  - Page table length register (PTLR): contains the length of the page table, restricting the maximum legal logical address
- Translating an address requires two memory accesses
  - The first access reads the page table entry (PTE)
  - The second access reads the data / instruction from memory
- Reducing the number of memory accesses
  - Can’t avoid the second access (we need the value from memory)
  - Eliminate the first access by keeping a hardware cache (called a translation lookaside buffer, or TLB) of recently used page table entries

Slide 31: Translation Lookaside Buffer (TLB)

- Search the TLB for the desired logical page number
  - Entries are searched in parallel
  - Uses standard cache techniques
- If the desired logical page number is found, get the frame number from the TLB
- If the desired logical page number isn’t found:
  - Get the frame number from the page table in memory
  - Replace an entry in the TLB with the logical & physical page numbers from this reference

[Figure: an example TLB with columns “Logical page #” and “Physical frame #”, one row per cached translation, some entries unused]

Slide 32: Handling TLB misses

- If a PTE isn’t found in the TLB, the lookup must be done in the page table
- The lookup can be done in hardware or software
- Hardware TLB replacement
  - The CPU hardware does the page table lookup
  - Can be faster than software
  - Less flexible than software, and more complex hardware
- Software TLB replacement
  - The OS gets a TLB exception
  - The exception handler does the page table lookup & places the result into the TLB
  - The program continues after returning from the exception
  - A larger TLB (lower miss rate) can make this feasible

Slide 33: How long do memory accesses take?

- Assume the following times:
  - TLB lookup time = a (often zero: overlapped in the CPU)
  - Memory access time = m
- Hit ratio (h) is the fraction of references whose logical page number is found in the TLB
  - A larger TLB usually means a higher h
  - TLB structure can affect h as well
- Effective access time (an average) is calculated as (a worked example follows below):
  - EAT = (m + a)h + (2m + a)(1 - h)
  - EAT = a + (2 - h)m
- Interpretation
  - Every reference requires the TLB lookup and one memory access
  - TLB misses also require an additional memory reference (to read the PTE)
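Plugging in some illustrative numbers (the values below are assumptions, not from the slide):

```c
#include <stdio.h>

/* EAT = (m + a)h + (2m + a)(1 - h) = a + (2 - h)m.
 * Illustrative values: a = 2 ns TLB lookup, m = 100 ns memory
 * access, h = 0.98 TLB hit ratio. */
int main(void) {
    double a = 2.0, m = 100.0, h = 0.98;
    double eat = a + (2.0 - h) * m;   /* 2 + 1.02 * 100 = 104 ns */
    printf("EAT = %.0f ns\n", eat);
    return 0;
}
```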

Slide 34: Inverted page tables

- Reduce page table size further: keep one entry for each frame in memory
- Each entry contains:
  - The virtual address mapped to this frame
  - Information about the process that owns this page
- Search the page table by (sketched in code below):
  - Hashing the virtual page number and process ID
  - Starting at the entry corresponding to the hash result
  - Searching until either the entry is found or a limit is reached
- The page frame number is the index of the matching PTE
- Improve performance by using more advanced hashing algorithms
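A sketch of that search, using a simple hash and linear probing; the table format and hash function are illustrative, not from the slide:

```c
#include <stdint.h>

#define NFRAMES 4096   /* one entry per physical frame */

/* Inverted page table: entry i describes what occupies frame i. */
typedef struct { uint32_t pid, vpage; int used; } ipte;
static ipte table[NFRAMES];

static uint32_t hash_entry(uint32_t pid, uint32_t vpage) {
    return ((pid * 2654435761u) ^ vpage) % NFRAMES;  /* simple mix */
}

/* Returns the frame holding (pid, vpage), or -1 if the page is not
 * resident; the matching entry's index IS the frame number. */
int lookup(uint32_t pid, uint32_t vpage) {
    uint32_t i = hash_entry(pid, vpage);
    for (int probes = 0; probes < NFRAMES; probes++) {
        if (table[i].used && table[i].pid == pid && table[i].vpage == vpage)
            return (int)i;
        i = (i + 1) % NFRAMES;   /* probe forward, up to a search limit */
    }
    return -1;                   /* page fault: not in memory */
}
```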

Slide 35: Inverted page table architecture

[Figure: a logical address (pid, page number p = 19 bits, offset = 13 bits) is hashed; the inverted page table is searched from the hashed slot for an entry matching (pid, p), and the index k of the matching entry becomes the frame number, concatenated with the 13-bit offset to form the physical address]