Memory Virtualization: Paging Speed (Prof. Patrick G. Bridges)

SLIDE 1

University of New Mexico

Memory Virtualization: Paging Speed

  • Prof. Patrick G. Bridges
SLIDE 2

Speeding up Translation with a TLB

  • Page table entries (PTEs) are cached in L1 like any other memory word
    ▪ PTEs may be evicted by other data references
    ▪ A PTE hit still requires a small L1 delay
  • Solution: Translation Lookaside Buffer (TLB)
    ▪ Small set-associative hardware cache in the MMU
    ▪ Maps virtual page numbers to physical page numbers
    ▪ Contains complete page table entries for a small number of pages

SLIDE 3

  • Part of the chip's memory-management unit (MMU).
  • A hardware cache of popular virtual-to-physical address translations.

[Figure: Address Translation with MMU. The CPU issues a logical address and the MMU performs a TLB lookup. On a TLB hit, the TLB (popular v to p entries) supplies the physical address. On a TLB miss, the page table (all v to p entries) is consulted. Physical memory holds Page 0, Page 1, Page 2, ..., Page n.]

SLIDE 4

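Accessing the TLB

  • MMU uses the VPN portion of the virtual address to access the TLB:

[Figure: an n-bit virtual address. Bits n-1 to p+t are the TLB tag (TLBT), bits p+t-1 to p are the TLB index (TLBI), and bits p-1 to 0 are the VPO; TLBT and TLBI together form the VPN. The TLB has T = 2^t sets (Set 0, Set 1, ..., Set T-1), and each line in a set holds a valid bit (v), a tag, and a PTE.]

  • TLBI selects the set
  • TLBT matches the tag of a line within the set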
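
The field split described on this slide can be sketched in a few lines of C. This is only an illustration: the names `tlb_fields` and `split_va` are made up here, and 32-bit addresses are an assumption. `p` is the number of page-offset bits and `t` is the number of index bits, so the TLB has T = 2^t sets.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a virtual address as seen by a set-associative TLB.
 * (Illustrative names, not taken from the slides.) */
typedef struct {
    uint32_t vpo;   /* virtual page offset: low p bits            */
    uint32_t tlbi;  /* TLB index: next t bits, selects the set    */
    uint32_t tlbt;  /* TLB tag: remaining high bits of the VPN    */
} tlb_fields;

/* Split va, given p offset bits and t index bits (T = 2^t sets). */
static tlb_fields split_va(uint32_t va, unsigned p, unsigned t)
{
    assert(p + t < 32);              /* assumption: 32-bit addresses */
    tlb_fields f;
    uint32_t vpn = va >> p;          /* drop the page offset         */
    f.vpo  = va & ((1u << p) - 1u);
    f.tlbi = vpn & ((1u << t) - 1u); /* low t bits of the VPN        */
    f.tlbt = vpn >> t;               /* everything above the index   */
    return f;
}
```

For instance, with 64-byte pages (p = 6) and 4 sets (t = 2), `split_va(0x03D4, 6, 2)` yields VPO 0x14, TLBI 0x3, and TLBT 0x03.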

SLIDE 5

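TLB Hit

[Figure: the CPU chip contains the CPU, the MMU, and the TLB, and sits next to cache/memory. (1) The CPU sends the VA to the MMU; (2) the MMU sends the VPN to the TLB; (3) the TLB returns the PTE; (4) the MMU sends the PA to cache/memory; (5) cache/memory returns the data to the CPU.]

A TLB hit eliminates a memory access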

SLIDE 6

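TLB Miss

[Figure: the CPU chip (CPU, MMU, TLB) next to cache/memory. (1) The CPU sends the VA to the MMU; (2) the MMU sends the VPN to the TLB, which misses; (3) the MMU sends the PTE address (PTEA) to cache/memory; (4) cache/memory returns the PTE; (5) the MMU sends the PA to cache/memory; (6) cache/memory returns the data to the CPU.]

A TLB miss incurs an additional memory access (the PTE)

Fortunately, TLB misses are rare. Why?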

SLIDE 7

Example: Accessing An Array

  • How a TLB can improve performance.

[Figure: a 16-page virtual address space (VPN = 00 to VPN = 15), drawn with byte offsets 00, 04, 08, 12, 16 across each page. a[0], a[1], a[2] sit on VPN = 06; a[3], a[4], a[5], a[6] on VPN = 07; a[7], a[8], a[9] on VPN = 08.]

    int sum = 0;
    for (int i = 0; i < 10; i++) {
        sum += a[i];
    }

  • 3 misses and 7 hits, so the TLB hit rate is 70%.
  • The TLB improves performance due to spatial locality: only the first access to each page misses.
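
The 3-miss count can be checked with a small simulation. The sketch below makes several assumptions not stated on the slide: 4-byte ints, 16-byte pages, the array starting at virtual address 100 (VPN 06, offset 04, as the figure suggests), and an initially empty TLB that never evicts during the scan. The function name `count_tlb_misses` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count TLB misses for a sequential scan of n elements starting at
 * virtual address base. Assumes an initially empty TLB that does not
 * evict during the scan, so only the first touch of a page misses. */
static size_t count_tlb_misses(uint32_t base, size_t n,
                               uint32_t elem_size, uint32_t page_size)
{
    assert(elem_size > 0 && page_size > 0);
    size_t misses = 0;
    uint32_t last_vpn = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t vpn = (base + (uint32_t)i * elem_size) / page_size;
        if (i == 0 || vpn != last_vpn) {
            misses++;            /* first access to this page */
            last_vpn = vpn;
        }
    }
    return misses;
}
```

Under those assumptions, `count_tlb_misses(100, 10, 4, 16)` counts one miss each for VPN 06, 07, and 08. Larger pages would lower the count further, since more array elements would share each translation.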

SLIDE 8

Who Handles The TLB Miss?

  • Option 1: Hardware handles the TLB miss (x86, ARM).
    ▪ The hardware has to know exactly where the page tables are located in memory.
    ▪ The hardware "walks" the page table, finds the correct page-table entry, extracts the desired translation, updates the TLB, and retries the instruction.
    ▪ The hardware specifies the exact format of the page table!
    ▪ This is a hardware-managed TLB.

SLIDE 9

Who Handles The TLB Miss? (Cont.)

  • Option 2: Software-managed TLB (MIPS, some others)
    ▪ On a TLB miss, the hardware raises an exception, which transfers control to a trap handler.
    ▪ The trap handler is code within the OS written expressly to handle TLB misses.
    ▪ Allows for a much wider range of page table organizations
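
The software-managed flow can be sketched as a user-space simulation. Everything here is hypothetical scaffolding rather than real OS or MIPS code: a 4-entry TLB, a toy linear page table indexed by VPN, round-robin replacement, and the assumption that every page is present (no page faults).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_SIZE  4     /* hypothetical: tiny TLB for illustration  */
#define NUM_PAGES 16    /* hypothetical: pages in the toy space     */

typedef struct { bool valid; uint32_t vpn, pfn; } tlb_entry;

static tlb_entry tlb[TLB_SIZE];
static uint32_t  page_table[NUM_PAGES]; /* toy linear table: vpn -> pfn */
static unsigned  next_victim;           /* round-robin replacement      */
static unsigned  miss_count;

/* "Hardware" side: search the TLB; returns true on a hit. */
static bool tlb_lookup(uint32_t vpn, uint32_t *pfn)
{
    for (int i = 0; i < TLB_SIZE; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}

/* "OS" side: the trap handler consults the page table, in whatever
 * format the OS chose, and installs the translation in the TLB. */
static void tlb_miss_handler(uint32_t vpn)
{
    miss_count++;
    tlb[next_victim] = (tlb_entry){ true, vpn, page_table[vpn] };
    next_victim = (next_victim + 1) % TLB_SIZE;
}

/* One memory reference: on a miss, trap to the handler, then retry. */
static uint32_t translate(uint32_t vpn)
{
    assert(vpn < NUM_PAGES);
    uint32_t pfn;
    while (!tlb_lookup(vpn, &pfn))
        tlb_miss_handler(vpn);
    return pfn;
}
```

Because the handler is ordinary code, `page_table` could be swapped for a hashed or multi-level structure without changing `tlb_lookup`, which is the flexibility the last bullet refers to.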

SLIDE 10

TLB entry

  • A TLB is generally a small fully associative cache.
    ▪ A typical TLB might have 32, 64, or 128 entries.
    ▪ Hardware searches the entire TLB in parallel to find the desired translation.
    ▪ Other bits: valid bit, protection bits, address-space identifier, dirty bit

A typical TLB entry looks like this:

  VPN | PFN | other bits

SLIDE 11

Issue: Context Switching and Shared TLB

  • The TLB is a hardware structure shared by all processes

[Figure: Process A and Process B each have their own virtual memory (Page 0, Page 1, Page 2, ..., Page n). Process A accesses VPN 10, and an entry is inserted into the TLB table.]

  VPN | PFN | valid | prot
  10  | 100 | 1     | rwx

SLIDE 12

TLB Issue: Context Switching

[Figure: Process A and Process B each have their own virtual memory (Page 0, Page 1, Page 2, ..., Page n). After a context switch, Process B accesses VPN 10 and a second entry is inserted into the TLB table.]

  VPN | PFN | valid | prot
  10  | 100 | 1     | rwx
  10  | 170 | 1     | rwx

The hardware can't distinguish which entry is meant for which process.

SLIDE 13

Options

  1. Flush the TLB on context switch
  2. Provide an address space identifier (ASID) field in the TLB.

[Figure: Process A and Process B each have their own virtual memory (Page 0, Page 1, Page 2, ..., Page n); the TLB table now carries an ASID column.]

  VPN | PFN | valid | prot | ASID
  10  | 100 | 1     | rwx  | 1
  10  | 170 | 1     | rwx  | 2
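
A minimal sketch of the second option: the lookup matches on the ASID as well as the VPN, so both entries for VPN 10 can coexist. The struct layout and the numeric encoding of `prot` (7 standing for rwx) are assumptions made for this example; real hardware defines its own entry format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t vpn, pfn;
    bool     valid;
    uint8_t  prot;   /* hypothetical encoding: r = 4, w = 2, x = 1 */
    uint8_t  asid;   /* address space identifier                   */
} tlb_entry;

/* The two entries from the table on this slide. */
static const tlb_entry tlb[] = {
    { 10, 100, true, 7, 1 },
    { 10, 170, true, 7, 2 },
};

/* Hit only if VPN and ASID both match; sets *pfn on a hit. */
static bool tlb_lookup(uint32_t vpn, uint8_t current_asid, uint32_t *pfn)
{
    for (unsigned i = 0; i < sizeof tlb / sizeof tlb[0]; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn &&
            tlb[i].asid == current_asid) {
            *pfn = tlb[i].pfn;
            return true;
        }
    }
    return false;
}
```

With ASID 1 current, VPN 10 resolves to PFN 100; with ASID 2 current, it resolves to PFN 170. No flush is needed on the switch, at the cost of a few extra bits per entry.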

SLIDE 14

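Simple Memory System Example

  • Addressing
    ▪ 14-bit virtual addresses
    ▪ 12-bit physical addresses
    ▪ Page size = 64 bytes

[Figure: the 14-bit virtual address (bits 13 to 0) splits into the Virtual Page Number (VPN, bits 13 to 6) and the Virtual Page Offset (VPO, bits 5 to 0). The 12-bit physical address (bits 11 to 0) splits into the Physical Page Number (PPN, bits 11 to 6) and the Physical Page Offset (PPO, bits 5 to 0).]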
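
With 64-byte pages the offset is 6 bits wide, so the split and the final physical address are simple shifts and masks. The helper names below (`vpn_of`, `vpo_of`, `make_pa`) are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 6u                        /* 64-byte pages: 2^6 */
#define PAGE_MASK ((1u << PAGE_BITS) - 1u)

/* Upper 8 bits of the 14-bit virtual address. */
static uint32_t vpn_of(uint32_t va) { return va >> PAGE_BITS; }

/* Lower 6 bits of the virtual address. */
static uint32_t vpo_of(uint32_t va) { return va & PAGE_MASK; }

/* The offset passes through translation unchanged (PPO == VPO),
 * so the physical address is the PPN followed by the offset. */
static uint32_t make_pa(uint32_t ppn, uint32_t vpo)
{
    assert(vpo <= PAGE_MASK);
    return (ppn << PAGE_BITS) | vpo;
}
```

For example, virtual address 0x03D4 has VPN 0x0F and VPO 0x14; if that page maps to PPN 0x0D, the physical address is 0x354.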

SLIDE 15

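1. Simple Memory System TLB

  • 16 entries
  • 4-way associative

[Figure: virtual address bits 13 to 0. VPN = bits 13 to 6, split into TLBT (bits 13 to 8) and TLBI (bits 7 to 6); VPO = bits 5 to 0.]

  Set | Tag PPN Valid | Tag PPN Valid | Tag PPN Valid | Tag PPN Valid
  0   | 03  –   0     | 09  0D  1     | 00  –   0     | 07  02  1
  1   | 03  2D  1     | 02  –   0     | 04  –   0     | 0A  –   0
  2   | 02  –   0     | 08  –   0     | 06  –   0     | 03  –   0
  3   | 07  –   0     | 03  0D  1     | 0A  34  1     | 02  –   0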
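
A lookup against this TLB can be sketched as follows. The array transcribes the table on this slide, with 0 stored as the PPN of invalid lines (an arbitrary placeholder); the names `tlb_line` and `tlb_lookup` and the -1 miss value are conventions chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint8_t tag, ppn; bool valid; } tlb_line;

/* Contents of the 4-set, 4-way TLB; invalid lines carry ppn 0. */
static const tlb_line tlb[4][4] = {
    { {0x03, 0x00, false}, {0x09, 0x0D, true},
      {0x00, 0x00, false}, {0x07, 0x02, true} },
    { {0x03, 0x2D, true},  {0x02, 0x00, false},
      {0x04, 0x00, false}, {0x0A, 0x00, false} },
    { {0x02, 0x00, false}, {0x08, 0x00, false},
      {0x06, 0x00, false}, {0x03, 0x00, false} },
    { {0x07, 0x00, false}, {0x03, 0x0D, true},
      {0x0A, 0x34, true},  {0x02, 0x00, false} },
};

/* Returns the PPN on a TLB hit, or -1 on a miss. */
static int tlb_lookup(uint32_t vpn)
{
    assert(vpn < 256);            /* 8-bit VPN                       */
    uint32_t tlbi = vpn & 0x3;    /* low 2 bits pick one of 4 sets   */
    uint32_t tlbt = vpn >> 2;     /* remaining 6 bits are the tag    */
    for (int way = 0; way < 4; way++) {
        const tlb_line *l = &tlb[tlbi][way];
        if (l->valid && l->tag == tlbt)
            return l->ppn;
    }
    return -1;                    /* no valid line with this tag     */
}
```

VPN 0x0F (set 3, tag 0x03) hits and returns PPN 0x0D. VPN 0x00 maps to set 0, where the line with tag 0x00 is invalid, so the lookup misses even though the tag is present.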

SLIDE 16

2. Simple Memory System Page Table

Only the first 16 entries (out of 256) are shown

  VPN PPN Valid | VPN PPN Valid
  00  28  1     | 08  13  1
  01  –   0     | 09  17  1
  02  33  1     | 0A  09  1
  03  02  1     | 0B  –   0
  04  –   0     | 0C  –   0
  05  16  1     | 0D  2D  1
  06  –   0     | 0E  11  1
  07  –   0     | 0F  0D  1
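
On a TLB miss the translation comes from this page table instead. A minimal sketch, assuming a flat array indexed by VPN and using -1 (`NOT_PRESENT`, a name invented here) to stand for valid = 0; only the 16 entries shown on the slide are modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NOT_PRESENT (-1)   /* illustrative marker for valid = 0 */

/* First 16 PTEs from the slide, indexed by VPN. */
static const int page_table[16] = {
    0x28, NOT_PRESENT, 0x33, 0x02,
    NOT_PRESENT, 0x16, NOT_PRESENT, NOT_PRESENT,
    0x13, 0x17, 0x09, NOT_PRESENT,
    NOT_PRESENT, 0x2D, 0x11, 0x0D,
};

/* Returns the PPN, or NOT_PRESENT to signal a page fault. */
static int walk_page_table(uint32_t vpn)
{
    assert(vpn < 16);      /* only the 16 entries shown are modeled */
    return page_table[vpn];
}
```

VPN 0x00 resolves to PPN 0x28, while VPN 0x01 has no valid entry and would raise a page fault.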

SLIDE 17

3. Simple Memory System Cache

  • 16 lines, 4-byte block size
  • Physically addressed
  • Direct mapped

[Figure: physical address bits 11 to 0. PPN = bits 11 to 6, PPO = bits 5 to 0. For the cache, CT (tag) = bits 11 to 6, CI (index) = bits 5 to 2, CO (offset) = bits 1 to 0.]

  Idx Tag Valid B0 B1 B2 B3 | Idx Tag Valid B0 B1 B2 B3
  0   19  1     99 11 23 11 | 8   24  1     3A 00 51 89
  1   15  0     –  –  –  –  | 9   2D  0     –  –  –  –
  2   1B  1     00 02 04 08 | A   2D  1     93 15 DA 3B
  3   36  0     –  –  –  –  | B   0B  0     –  –  –  –
  4   32  1     43 6D 8F 09 | C   12  0     –  –  –  –
  5   0D  1     36 72 F0 1D | D   16  1     04 96 34 15
  6   31  0     –  –  –  –  | E   13  1     83 77 1B D3
  7   16  1     11 C2 DF 03 | F   14  0     –  –  –  –
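
Reading a byte from this direct-mapped cache can be sketched as below. The array transcribes the table on this slide, with zeroed data for invalid lines (an arbitrary placeholder); `cache_line`, `cache_read`, and the -1 miss value are names and conventions chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint8_t tag; bool valid; uint8_t b[4]; } cache_line;

/* The 16 lines from the slide; invalid lines hold zeroed data. */
static const cache_line cache[16] = {
    {0x19, true,  {0x99, 0x11, 0x23, 0x11}},
    {0x15, false, {0, 0, 0, 0}},
    {0x1B, true,  {0x00, 0x02, 0x04, 0x08}},
    {0x36, false, {0, 0, 0, 0}},
    {0x32, true,  {0x43, 0x6D, 0x8F, 0x09}},
    {0x0D, true,  {0x36, 0x72, 0xF0, 0x1D}},
    {0x31, false, {0, 0, 0, 0}},
    {0x16, true,  {0x11, 0xC2, 0xDF, 0x03}},
    {0x24, true,  {0x3A, 0x00, 0x51, 0x89}},
    {0x2D, false, {0, 0, 0, 0}},
    {0x2D, true,  {0x93, 0x15, 0xDA, 0x3B}},
    {0x0B, false, {0, 0, 0, 0}},
    {0x12, false, {0, 0, 0, 0}},
    {0x16, true,  {0x04, 0x96, 0x34, 0x15}},
    {0x13, true,  {0x83, 0x77, 0x1B, 0xD3}},
    {0x14, false, {0, 0, 0, 0}},
};

/* Returns the byte (0..255) on a hit, or -1 on a miss. */
static int cache_read(uint32_t pa)
{
    assert(pa < (1u << 12));          /* 12-bit physical address  */
    uint32_t co = pa & 0x3;           /* byte within 4-byte block */
    uint32_t ci = (pa >> 2) & 0xF;    /* one of 16 lines          */
    uint32_t ct = pa >> 6;            /* tag, same bits as PPN    */
    const cache_line *l = &cache[ci];
    return (l->valid && l->tag == ct) ? l->b[co] : -1;
}
```

Physical address 0x354 indexes line 5 with tag 0x0D, which hits and returns byte 0x36. Physical address 0xA20 indexes line 8 with tag 0x28, but that line holds tag 0x24, so the access misses and goes to memory.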

SLIDE 18

Address Translation Example #1

Virtual Address: 0x03D4

  Bit    13 12 11 10  9  8  7  6  5  4  3  2  1  0
  Value   0  0  0  0  1  1  1  1  0  1  0  1  0  0

  VPN = bits 13 to 6 (TLBT = bits 13 to 8, TLBI = bits 7 to 6); VPO = bits 5 to 0

  VPN: 0x0F   TLBI: 0x3   TLBT: 0x03   TLB Hit? Y   Page Fault? N   PPN: 0x0D

Physical Address

  Bit    11 10  9  8  7  6  5  4  3  2  1  0
  Value   0  0  1  1  0  1  0  1  0  1  0  0

  PPN = bits 11 to 6 (also CT); PPO = bits 5 to 0 (CI = bits 5 to 2, CO = bits 1 to 0)

  CO: 0x0   CI: 0x5   CT: 0x0D   Hit? Y   Byte: 0x36

SLIDE 19

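Address Translation Example #2

Virtual Address: 0x0020

  Bit    13 12 11 10  9  8  7  6  5  4  3  2  1  0
  Value   0  0  0  0  0  0  0  0  1  0  0  0  0  0

  VPN = bits 13 to 6 (TLBT = bits 13 to 8, TLBI = bits 7 to 6); VPO = bits 5 to 0

  VPN: 0x00   TLBI: 0x0   TLBT: 0x00   TLB Hit? N   Page Fault? N   PPN: 0x28

Physical Address

  Bit    11 10  9  8  7  6  5  4  3  2  1  0
  Value   1  0  1  0  0  0  1  0  0  0  0  0

  PPN = bits 11 to 6 (also CT); PPO = bits 5 to 0 (CI = bits 5 to 2, CO = bits 1 to 0)

  CO: 0x0   CI: 0x8   CT: 0x28   Hit? N   Byte: Mem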

SLIDE 20

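Address Translation Example #3

Virtual Address: 0x0020

  Bit    13 12 11 10  9  8  7  6  5  4  3  2  1  0
  Value   0  0  0  0  0  0  0  0  1  0  0  0  0  0

  VPN = bits 13 to 6 (TLBT = bits 13 to 8, TLBI = bits 7 to 6); VPO = bits 5 to 0

  VPN: 0x00   TLBI: 0x0   TLBT: 0x00   TLB Hit? N   Page Fault? N   PPN: 0x28

Physical Address

  Bit    11 10  9  8  7  6  5  4  3  2  1  0
  Value   1  0  1  0  0  0  1  0  0  0  0  0

  PPN = bits 11 to 6 (also CT); PPO = bits 5 to 0 (CI = bits 5 to 2, CO = bits 1 to 0)

  CO: 0x0   CI: 0x8   CT: 0x28   Hit? N   Byte: Mem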