

SLIDE 1

CADSL

Memory System Design:

Virtual Memory

Virendra Singh

Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay

http://www.ee.iitb.ac.in/~viren/ E-mail: viren@ee.iitb.ac.in

EE-739: Processor Design

Lecture 20 (04 Mar 2013)

SLIDE 2


EE-739@IITB

Performance: Misses

  • Miss rate
    – Fraction of cache accesses that result in a miss
  • Causes of misses
    – Compulsory: first reference to a block
    – Capacity: blocks discarded and later retrieved
    – Conflict: the program makes repeated references to multiple addresses from different blocks that map to the same location in the cache


SLIDE 3


Memory Optimization

  • Reducing miss rate
    – Larger block size, larger cache size, higher associativity
  • Reducing miss penalty
    – Multi-level caches, read priority over writes
  • Reducing time to hit in the cache
    – Avoid address translation when indexing caches
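The effect of a second-level cache on miss penalty can be seen by plugging assumed numbers into the standard average-memory-access-time (AMAT) formula; the latencies and miss rates below are illustrative values, not measurements from the lecture.

```python
# AMAT = hit_time + miss_rate * miss_penalty; with an L2, the L1 miss
# penalty is itself the L2's AMAT. All values below are assumed.
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

MEM_LATENCY = 100.0  # cycles to main memory (assumed)

# Single-level cache: every L1 miss pays the full memory latency.
single = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=MEM_LATENCY)

# Two-level: an L1 miss first tries a slower but larger L2.
l2 = amat(hit_time=10.0, miss_rate=0.20, miss_penalty=MEM_LATENCY)
two_level = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=l2)

print(single)     # 6.0 cycles
print(two_level)  # 2.5 cycles
```

Even with these rough numbers, the L2 cuts the effective L1 miss penalty from 100 cycles to 30, which is why multi-level caches are listed under "reducing miss penalty".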

SLIDE 4


Memory Hierarchy Basics

  • Six basic cache optimizations:
    – Larger block size
      • Reduces compulsory misses
      • Increases capacity and conflict misses, increases miss penalty
    – Larger total cache capacity to reduce miss rate
      • Increases hit time, increases power consumption
    – Higher associativity
      • Reduces conflict misses
      • Increases hit time, increases power consumption

SLIDE 5


Memory Hierarchy Basics

  • Six basic cache optimizations (continued):
    – Higher number of cache levels
      • Reduces overall memory access time
    – Giving priority to read misses over writes
      • Reduces miss penalty
    – Avoiding address translation in cache indexing
      • Reduces hit time

SLIDE 6


Summary

  • Memory technology
  • Memory hierarchy
    – Temporal and spatial locality
  • Caches
    – Placement
    – Identification
    – Replacement
    – Write policy
  • Pipeline integration of caches

SLIDE 7


Memory Hierarchy

[Figure: memory hierarchy — CPU → I & D L1 caches → shared L2 cache → main memory → disk]

  • Temporal locality
    – Keep recently referenced items at higher levels
    – Future references are satisfied quickly
  • Spatial locality
    – Bring neighbors of recently referenced items to higher levels
    – Future references are satisfied quickly

SLIDE 8


Four Burning Issues

  • These are:
    – Placement
      • Where can a block of memory go?
    – Identification
      • How do I find a block of memory?
    – Replacement
      • How do I make space for new blocks?
    – Write policy
      • How do I propagate changes?
  • Consider these for registers and main memory
    – Main memory is usually DRAM

SLIDE 9


Placement

Memory type     Placement                  Comments
Registers       Anywhere (Int, FP, SPR)    Compiler/programmer manages
Cache (SRAM)    Fixed in H/W               Direct-mapped, set-associative, fully associative
DRAM            Anywhere                   O/S manages
Disk            Anywhere                   O/S manages

SLIDE 10


Register File

  • Registers managed by programmer/compiler
    – Assign variables and temporaries to registers
    – Limited name space matches available storage

Placement        Flexible (subject to data type)
Identification   Implicit (name == location)
Replacement      Spill code (store to stack frame)
Write policy     Write-back (store on replacement)

SLIDE 11


Main Memory and Virtual Memory

  • Use of virtual memory
    – Main memory becomes another level in the memory hierarchy
    – Enables programs with address spaces or working sets that exceed physically available memory
      • No need for the programmer to manage overlays, etc.
      • Sparse use of a large address space is OK
    – Allows multiple users or programs to timeshare a limited amount of physical memory and address space
  • Bottom line: efficient use of an expensive resource, and ease of programming

SLIDE 12


Virtual Memory

  • Enables
    – Using more memory than the system has
    – Letting each program think it is the only one running
      • No need to manage address space usage across programs
      • E.g. each can think it always starts at address 0x0
    – Memory protection
      • Each program has a private VA space: no one else can clobber it
    – Better performance
      • Start running a large program before all of it has been loaded from disk

SLIDE 13


Virtual Memory – Placement

  • Main memory managed in larger blocks
    – Page size typically 4 KB – 16 KB
  • Fully flexible placement; fully associative
  • Operating system manages placement
    – Indirection through a page table
    – Maintains the mapping between:
      • Virtual address (seen by the programmer)
      • Physical address (seen by main memory)

SLIDE 14


Virtual Memory – Placement

  • Fully associative implies expensive lookup?
    – In caches, yes: check multiple tags in parallel
  • In virtual memory, the expensive lookup is avoided by using a level of indirection
    – A lookup table or hash table
    – Called a page table
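The indirection can be sketched in a few lines: the virtual page number indexes the table directly, so no associative tag search is needed. The page-table contents below are an invented example, assuming 4 KB pages.

```python
# A page table replaces an associative search with a direct lookup:
# the virtual page number (VPN) indexes the table; no tag comparison.
# The mapping below is illustrative, not from the lecture.
page_table = {0x20004: 0x2000, 0x20005: 0x51F3}  # VPN -> physical page number

def translate(va, page_size=4096):
    vpn, offset = va // page_size, va % page_size
    if vpn not in page_table:
        raise LookupError("page fault")          # handled by the O/S
    return page_table[vpn] * page_size + offset

print(hex(translate(0x20004ABC)))   # 0x2000abc
```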

SLIDE 15


Virtual Memory – Identification

  • Similar to a cache tag array
    – A page table entry contains the VA, PA, and a dirty bit
  • Virtual address:
    – Matches the programmer's view; based on register values
    – Can be the same for multiple programs sharing the same system, without conflicts
  • Physical address:
    – Invisible to the programmer, managed by the O/S
    – Created/deleted on demand, can change

Virtual address   Physical address   Dirty bit
0x20004000        0x2000             Y/N

SLIDE 16


Virtual Memory – Replacement

  • Similar to caches:
    – FIFO
    – LRU; overhead too high
      • Approximated with reference-bit checks
      • Clock algorithm
    – Random
  • O/S decides and manages
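The clock algorithm mentioned above can be sketched as follows: resident pages sit on a circular list with a reference bit; the "hand" sweeps around, clearing set bits, and evicts the first page whose bit is already clear. Page numbers here are illustrative.

```python
# Minimal sketch of the clock (second-chance) replacement algorithm.
class ClockReplacer:
    def __init__(self, frames):
        self.frames = frames          # pages currently in memory
        self.ref = [0] * len(frames)  # reference bits, set on access
        self.hand = 0

    def access(self, page):
        self.ref[self.frames.index(page)] = 1

    def evict(self):
        # Sweep: give referenced pages a second chance by clearing
        # their bit; stop at the first page with a clear bit.
        while self.ref[self.hand]:
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.frames)
        victim = self.frames[self.hand]
        self.hand = (self.hand + 1) % len(self.frames)
        return victim

clock = ClockReplacer([10, 11, 12])
clock.access(10)
clock.access(12)
print(clock.evict())   # 11 — the only page not referenced recently
```

This is why the clock algorithm approximates LRU at a fraction of the bookkeeping cost: only one bit per page plus a single pointer.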

SLIDE 17


Virtual Memory – Write Policy

  • Write-back
    – Disks are too slow to write through
  • The page table maintains a dirty bit
    – Hardware sets the dirty bit on the first write
    – O/S checks the dirty bit on eviction
    – Dirty pages are written to the backing store
      • A disk write takes 10+ ms
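The dirty-bit check on eviction can be sketched as below: only pages modified since they were loaded pay the disk write; clean pages are simply dropped. The VPNs and the write-back callback are illustrative.

```python
# Write-back eviction: the dirty bit decides whether a page must
# reach the backing store. Data here is illustrative.
dirty = {0x20004: True, 0x20005: False}  # VPN -> dirty bit

def evict(vpn, write_to_disk):
    if dirty[vpn]:               # modified: must be written back (~10 ms)
        write_to_disk(vpn)
    del dirty[vpn]               # clean pages are simply dropped

writes = []
evict(0x20005, writes.append)    # clean: no disk traffic
evict(0x20004, writes.append)    # dirty: one write-back
print([hex(w) for w in writes])  # ['0x20004']
```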

SLIDE 18


Virtual Memory Implementation

  • Caches have fixed policies, a hardware FSM for control, and pipeline stalls
  • VM has very different miss penalties
    – Remember, disks take 10+ ms!
  • Hence VM is engineered differently

SLIDE 19


Page Faults

  • A virtual memory miss is a page fault
    – The physical memory location does not exist
    – An exception is raised; the PC is saved
    – The O/S page fault handler is invoked
      • Find a physical page (possibly evicting one)
      • Initiate the fetch from disk
    – Switch to another task that is ready to run
    – Interrupt when the disk access completes
    – Restart the original instruction
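The find-a-frame/evict/fetch part of the handler can be sketched with the disk modeled as a dictionary; everything here (one physical frame, the page contents) is an invented toy setup to show the control flow, not a real handler.

```python
# Minimal runnable sketch of page-fault handling: on a miss, grab a
# free frame or evict a resident page (writing it back if dirty),
# then load the requested page. The "disk" is a dict.
disk = {7: b"page7", 8: b"page8"}  # backing store: VPN -> contents
memory = {}                        # frame -> contents
page_table = {}                    # VPN -> (frame, dirty bit)
free_frames = [0]                  # just one physical frame

def touch(vpn):
    if vpn in page_table:          # hit: no fault
        return page_table[vpn][0]
    # --- page fault ---
    if free_frames:
        frame = free_frames.pop()
    else:                          # evict some resident page
        victim, (frame, is_dirty) = next(iter(page_table.items()))
        if is_dirty:
            disk[victim] = memory[frame]   # write-back before reuse
        del page_table[victim]
    memory[frame] = disk[vpn]      # fetch from "disk"
    page_table[vpn] = (frame, False)
    return frame

touch(7); touch(8)                 # second touch evicts page 7
print(sorted(page_table))          # [8]
```

A real handler additionally saves the PC, switches to a ready task during the multi-millisecond disk access, and restarts the faulting instruction afterwards, as the bullets above describe.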

SLIDE 20


Address Translation

  • O/S and hardware communicate via the PTE
  • How do we find a PTE?
    – &PTE = PTBR + page number * sizeof(PTE)
    – The PTBR is private to each program
      • A context switch replaces the PTBR contents

VA           PA       Dirty   Ref   Protection
0x20004000   0x2000   Y/N     Y/N   Read/Write/Execute
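The &PTE formula from the slide can be evaluated directly; the PTBR value, 4-byte PTE size, and 4 KB page size below are illustrative assumptions.

```python
# &PTE = PTBR + page_number * sizeof(PTE), with assumed constants.
PTBR = 0x0010_0000   # per-program page table base register (assumed)
PTE_SIZE = 4         # sizeof(PTE) in bytes (assumed)
PAGE_SIZE = 4096     # 4 KB pages

def pte_address(va):
    page_number = va // PAGE_SIZE
    return PTBR + page_number * PTE_SIZE

print(hex(pte_address(0x20004000)))   # 0x180010: PTBR + 0x20004 * 4
```

Because the formula depends only on the PTBR, swapping the PTBR contents on a context switch is enough to give each program its own translations.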

SLIDE 21


Address Translation

[Figure: address translation — the virtual page number indexes the page table at the PTBR; the selected PTE (with dirty bit D) supplies the PA, which is combined with the page offset]

SLIDE 22


Page Table Size

  • How big is the page table?
    – 2^32 / 4K × 4 B = 4 MB per program (!)
    – Much worse for 64-bit machines
  • To make it smaller
    – Use limit register(s)
      • If the VA exceeds the limit, invoke the O/S to grow the region
    – Use a multi-level page table
    – Make the page table itself pageable (use VM)
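The slide's arithmetic checks out: a flat table for a 32-bit address space with 4 KB pages and 4-byte PTEs needs one entry per page. The 48-bit figure below is an illustrative stand-in for "64-bit machines", since many such machines implement 48 virtual-address bits.

```python
# Flat page table size: (2**addr_bits / page_size) entries * PTE size.
ADDR_BITS, PAGE_SIZE, PTE_SIZE = 32, 4096, 4

entries = 2**ADDR_BITS // PAGE_SIZE    # 1,048,576 PTEs
table_bytes = entries * PTE_SIZE
print(table_bytes // 2**20, "MB")      # 4 MB per program

# Same flat scheme with 48 virtual-address bits (assumed):
print((2**48 // PAGE_SIZE * PTE_SIZE) // 2**30, "GB")   # 256 GB
```

The 256 GB figure is what makes multi-level and hashed page tables, discussed next, unavoidable on 64-bit machines.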

SLIDE 23


Multilevel Page Table

[Figure: multilevel page table — the virtual page number is split into index fields; starting from the PTBR, each level's entry points to the next-level table, and the final entry is combined with the page offset]
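A two-level walk can be sketched as below: the VPN is split into an upper and a lower index, and second-level tables exist only for address-space regions actually in use, which is what makes the scheme small for sparse address spaces. The field widths and mappings are illustrative.

```python
# Sketch of a two-level page-table walk (10 + 10 + 12 = 32-bit VA).
LEVEL_BITS, OFFSET_BITS = 10, 12

# Sparse: dicts stand in for tables; only one second-level table exists.
root = {0x080: {0x004: 0x2000}}    # level-1 index -> level-2 table

def walk(va):
    vpn = va >> OFFSET_BITS
    hi, lo = vpn >> LEVEL_BITS, vpn & ((1 << LEVEL_BITS) - 1)
    second = root[hi]              # a missing entry means a page fault
    return (second[lo] << OFFSET_BITS) | (va & ((1 << OFFSET_BITS) - 1))

print(hex(walk(0x20004ABC)))       # 0x2000abc
```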

SLIDE 24


Hashed Page Table

  • Use a hash table or inverted page table
    – The PT contains an entry for each real (physical) address
      • Instead of an entry for every virtual address
    – An entry is found by hashing the VA
    – Oversize the PT to reduce collisions: #PTE = 4 × (#physical pages)
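The scheme above can be sketched as follows: the bucket count is derived from the physical page count (oversized 4×, as the slide suggests), and each entry stores its VPN as a tag because several VPNs can hash to the same bucket. The sizes and mapping are illustrative.

```python
# Sketch of a hashed (inverted) page table: sized by physical pages,
# looked up by hashing the VPN and scanning a short chain for the tag.
PHYS_PAGES = 1024
NUM_BUCKETS = 4 * PHYS_PAGES           # oversize: #PTE = 4 x phys pages

buckets = [[] for _ in range(NUM_BUCKETS)]   # each entry: (vpn, ppn)

def insert(vpn, ppn):
    buckets[hash(vpn) % NUM_BUCKETS].append((vpn, ppn))

def lookup(vpn):
    for tag, ppn in buckets[hash(vpn) % NUM_BUCKETS]:
        if tag == vpn:                 # tag match resolves collisions
            return ppn
    raise LookupError("page fault")

insert(0x20004, 0x2000)
print(hex(lookup(0x20004)))   # 0x2000
```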

SLIDE 25


Hashed Page Table

[Figure: hashed page table — the virtual page number is hashed to select a group of entries (PTE0–PTE3) relative to the PTBR, which are searched for a matching tag; the offset is appended unchanged]

SLIDE 26


High-Performance VM

  • VA translation
    – Adds a memory reference to fetch the PTE
    – Each instruction fetch/load/store now takes two memory references
      • Or more, with a multilevel table or hash collisions
    – Even if PTEs are cached, this is still slow
  • Hence, use a special-purpose cache for PTEs
    – Called a TLB (translation lookaside buffer)
    – Caches PTE entries
    – Exploits temporal and spatial locality (it is just a cache)
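A small TLB can be sketched as a fully associative cache with LRU replacement: a hit skips the page-table walk entirely. The capacity, fake page-table walk, and mappings are illustrative; `OrderedDict` is used only to get LRU ordering cheaply.

```python
# Sketch of a fully associative TLB with LRU replacement.
from collections import OrderedDict

class TLB:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # vpn -> ppn, oldest first

    def translate(self, vpn, walk_page_table):
        if vpn in self.entries:        # TLB hit: no page-table walk
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        ppn = walk_page_table(vpn)     # TLB miss: walk once, then cache
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict LRU entry
        self.entries[vpn] = ppn
        return ppn

walks = []
def walk(vpn):
    walks.append(vpn)
    return vpn + 1                     # fake page table for the demo

tlb = TLB()
tlb.translate(0x20004, walk)
tlb.translate(0x20004, walk)           # second access hits the TLB
print(len(walks))                      # 1 — only one page-table walk
```

This is the sense in which the TLB is "just a cache": same hit/miss structure and replacement concerns as the data caches earlier in the lecture, but holding PTEs instead of data blocks.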

SLIDE 27


TLB
