TLBs 1 memory HW random memory image page tables with 1-byte page - - PowerPoint PPT Presentation



slide-1
SLIDE 1

TLBs

1

slide-2
SLIDE 2

memory HW

  • random memory image
  • page tables with 1-byte page table entries
  • answer: 2-byte values read (or replaced) or “fault”
  • 3 attempts per set of problems
  • submitting only right and blank answers doesn’t count as an attempt
  • keep getting new sets of problems until you get it right

2

slide-3
SLIDE 3

cache accesses and multi-level PTs

  • four-level page tables: four cache accesses per memory access
  • L1 cache hits: typically a couple of cycles each?
  • so add 8 cycles to each memory access?
  • not acceptable
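The arithmetic behind the 8-cycle estimate can be sketched as below. The 2-cycle L1 hit time is an assumption standing in for "a couple cycles", and `walk_overhead_cycles` is a hypothetical helper name, not something from the slides.

```python
def walk_overhead_cycles(levels: int, cycles_per_access: int) -> int:
    """Extra cycles spent reading page table entries for one memory access,
    assuming every page table read hits in the L1 cache."""
    return levels * cycles_per_access


# Assumption: "a couple cycles" is modeled as 2 cycles per L1 hit.
extra = walk_overhead_cycles(levels=4, cycles_per_access=2)
print(extra)  # 8
```

This is a best case: a page table read that misses in the L1 cache would cost more than the assumed 2 cycles.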

3

slide-4
SLIDE 4

program memory active sets

address space, highest addresses first:
  0xFFFF FFFF FFFF FFFF
    used by OS
  0xFFFF 8000 0000 0000
  0x7F…
    stack
    heap / other dynamic
    writable data
    code + constants
  0x0000 0000 0040 0000

small areas of memory active at a time
  • one or two pages in each area?

4

slide-5
SLIDE 5

page table entries and locality

page table entries have excellent temporal locality
  • typically one or two pages of the stack active
  • typically one or two pages of code active
  • typically one or two pages of heap/globals active

each page contains whole functions, arrays, stack frames, etc.
the set of needed page table entries is very small

5

slide-7
SLIDE 7

page table entry cache

called a TLB (translation lookaside buffer)
very small cache of page table entries

L1 cache                      TLB
physical addresses            virtual page numbers
bytes from memory             page table entries
tens of bytes per block       one page table entry per block
usually thousands of blocks   usually tens of entries

only caches the page table lookup itself
  (generally) just entries from the last-level page table
not much spatial locality between page table entries
  (they’re used for kilobytes of data already)
few active page table entries at a time
  enables highly associative cache designs

6

slide-11
SLIDE 11

TLB and the MMU (1)

[diagram: the address from the program goes to the TLB and the MMU (‘page table walk’ logic), which connect to the L1 cache/memory]

7

slide-12
SLIDE 12

TLB and the MMU (2)

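[diagram: the virtual address 11 0101 01 00 1101 1111 is split into a virtual page number (11 0101 01) and a page offset (00 1101 1111). The virtual page number is looked up in the TLB; on a miss it is multiplied by the PTE size and added to the page table base register (0x10000), and the PTE is read through the data or instruction cache. The PTE is split into parts, the valid and kernel bits are checked (cause fault?), and the physical page number 1101 0011 11 is joined with the page offset 00 1101 1111 to form the physical address.]

  • TLB hit: TLB access replaces page table access
  • TLB miss: page table access happens
  • TLB miss: TLB gets a copy of the page table entry
  • need to check permissions (read/kernel/etc.)
  • but no need to store invalid PTEs in TLB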
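The hit and miss paths above can be sketched in a few lines. This is a minimal model under stated assumptions, not the hardware's actual logic: the page table is a single-level Python dict, PTEs are dicts with hypothetical `valid`, `kernel`, and `ppn` fields, and the 10-bit page offset is taken from the example address in the diagram.

```python
PAGE_OFFSET_BITS = 10  # the example address above has a 10-bit page offset


class PageFault(Exception):
    """Raised when the PTE is invalid or the access is not permitted."""


def translate(vaddr, tlb, page_table, stats, kernel_mode=False):
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)

    pte = tlb.get(vpn)
    if pte is None:
        # TLB miss: the page table access happens
        stats["page_table_reads"] += 1
        pte = page_table.get(vpn)
        if pte is None or not pte["valid"]:
            raise PageFault(hex(vaddr))  # invalid PTEs are never stored in the TLB
        tlb[vpn] = pte  # the TLB gets a copy of the page table entry

    # permissions are checked on hits and misses alike
    if pte["kernel"] and not kernel_mode:
        raise PageFault(hex(vaddr))
    return (pte["ppn"] << PAGE_OFFSET_BITS) | offset


page_table = {0b11010101: {"valid": True, "kernel": False, "ppn": 0b1101001111}}
tlb, stats = {}, {"page_table_reads": 0}
vaddr = 0b11010101_0011011111
first = translate(vaddr, tlb, page_table, stats)   # miss, then fill
second = translate(vaddr, tlb, page_table, stats)  # hit, no page table read
print(bin(first), stats["page_table_reads"])
```

Both calls return the same physical address, but only the first one reads the page table.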

8

slide-18
SLIDE 18

TLB and multi-level page tables

TLB caches valid last-level page table entries
doesn’t matter which last-level page table
means TLB output can be used directly to form address

9

slide-19
SLIDE 19

TLB organization (2-way set associative)

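[diagram: 2-way set-associative TLB; each way stores valid, tag, physical page #, write, …; the example set holds (valid 1, tag 10, physical page # 0x123, write 1) and (valid 1, tag 11, physical page # 0x12F, write 1)]

program address 100 11 010110 = VPN (TLB tag and index bits) plus page offset

  • index selects a set
  • each way’s tag is compared (=) with the tag from the VPN and ANDed with its valid bit
  • OR of the two results: is hit?
  • the matching way supplies the page table entry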
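The compare, AND, and OR steps of the lookup can be sketched as follows. The two entries are the ones shown in the diagram; the choice of a single index bit, the second (empty) set, and the names `TLBEntry` and `tlb_lookup` are assumptions made only for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class TLBEntry(NamedTuple):
    valid: bool
    tag: int
    ppn: int      # physical page number
    write: bool


def tlb_lookup(sets: Sequence[Sequence[TLBEntry]], vpn: int,
               index_bits: int) -> Optional[TLBEntry]:
    """Return the matching entry on a hit, or None on a miss."""
    index = vpn & ((1 << index_bits) - 1)   # low VPN bits select the set
    tag = vpn >> index_bits                 # remaining VPN bits are the tag
    for entry in sets[index]:
        # hardware does these comparisons in parallel: (tag == tag) AND valid
        if entry.valid and entry.tag == tag:
            return entry
    return None                             # OR of all ways was 0: miss


INDEX_BITS = 1  # assumption: two sets, chosen only to keep the example small
sets = [
    [TLBEntry(True, 0b10, 0x123, True), TLBEntry(True, 0b11, 0x12F, True)],
    [TLBEntry(False, 0, 0, False), TLBEntry(False, 0, 0, False)],
]
hit = tlb_lookup(sets, 0b110, INDEX_BITS)   # tag 11, index 0
print(hex(hit.ppn))
```

A lookup whose tag matches neither way, or that lands in a set with only invalid entries, returns `None`, which is the miss case.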

10

slide-25
SLIDE 25

address splitting for TLBs (1)

my desktop: 4KB (2^12 byte) pages; 48-bit virtual address
64-entry, 4-way L1 data TLB

TLB index bits?
  64/4 = 16 sets: 4 bits

TLB tag bits?
  48 − 12 = 36-bit virtual page number: 36 − 4 = 32-bit TLB tag
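The same bit counting can be written as a small function. This is a sketch that assumes a power-of-two number of sets and a power-of-two page size; `tlb_address_bits` is a hypothetical helper name.

```python
def tlb_address_bits(entries, ways, virtual_address_bits, page_bytes):
    """Return (number of sets, TLB index bits, TLB tag bits)."""
    sets = entries // ways
    if sets < 1 or sets * ways != entries or sets & (sets - 1):
        raise ValueError("expected a power-of-two number of sets")
    index_bits = sets.bit_length() - 1            # log2(sets)
    page_offset_bits = page_bytes.bit_length() - 1  # log2(page size)
    vpn_bits = virtual_address_bits - page_offset_bits
    return sets, index_bits, vpn_bits - index_bits


print(tlb_address_bits(64, 4, 48, 4096))     # (16, 4, 32)
print(tlb_address_bits(1536, 12, 48, 4096))  # (128, 7, 29)
```

The second call uses the 1536-entry, 12-way L2 TLB from the next example.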

11

slide-27
SLIDE 27

address splitting for TLBs (2)

my desktop: 4KB (2^12 byte) pages; 48-bit virtual address
1536-entry (3 · 2^9), 12-way L2 TLB

TLB index bits?
  1536/12 = 128 sets: 7 bits

TLB tag bits?
  48 − 12 = 36-bit virtual page number: 36 − 7 = 29-bit TLB tag

12

slide-35
SLIDE 35

address splitting exercise (3)

384-entry, 3-way set-associative TLB
32-bit virtual address; 8KB pages
2-level page table; 4 byte PTEs
  256 entries in first level; 2048 in second

split the address 0x12345678

0x12345678 = 0001 0010 0011 0100 0101 0110 0111 1000

13-bit page offset            1 0110 0111 1000
32 − 13 = 19-bit VPN          0001 0010 0011 0100 010
8-bit first part of VPN       0001 0010
11-bit second part of VPN     0011 0100 010
7-bit TLB index               0100 010
19 − 7 = 12-bit TLB tag       0001 0010 0011
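The split can be checked with shifts and masks. The sketch below hard-codes the field widths from this exercise as default arguments; `split_address` and its field names are hypothetical.

```python
def split_address(vaddr, page_offset_bits=13, level2_bits=11, tlb_index_bits=7):
    """Split a virtual address into the fields used by the page table walk
    and by the TLB lookup. Both splits divide the same VPN differently."""
    offset = vaddr & ((1 << page_offset_bits) - 1)
    vpn = vaddr >> page_offset_bits
    return {
        "page_offset": offset,
        "vpn": vpn,
        "vpn_part1": vpn >> level2_bits,                # first-level index
        "vpn_part2": vpn & ((1 << level2_bits) - 1),    # second-level index
        "tlb_index": vpn & ((1 << tlb_index_bits) - 1),
        "tlb_tag": vpn >> tlb_index_bits,
    }


fields = split_address(0x12345678)
for name, value in fields.items():
    print(f"{name:12} {value:#x} = {value:b}")
```

Note that the TLB index and tag are cut from the whole VPN, independently of where the page table levels divide it.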

13

slide-36
SLIDE 36

TLB example: address splitting

16-bit virtual addresses
64-byte pages
8-entry, 2-way TLB

14

slide-37
SLIDE 37

TLB access pattern

64-byte pages
8 entries, 4 sets
address bits: VPN (tag, then index), then page offset

address (binary)   (hex)   hit?
000100000000       100     miss
110100000001       D01     miss
000100001010       10A     hit
110100100001       D21     hit
000011111100       0FC     miss
110011111000       CF8     miss
111100101000       F28     miss

initial TLB state: every entry invalid (V = 0, tag and PTE unknown), LRU unknown

step by step (LRU = the way that will be replaced next in that set):
  1. 100: miss; set 00, way 0 gets tag 0001 (PTE for VPN 000100); LRU becomes way 1
  2. D01: miss; set 00, way 1 gets tag 1101 (PTE for VPN 110100); LRU becomes way 0
  3. 10A: hit in set 00, way 0; LRU becomes way 1
  4. D21: hit in set 00, way 1; LRU becomes way 0
  5. 0FC: miss; set 11, way 0 gets tag 0000 (PTE for VPN 000011); LRU becomes way 1
  6. CF8: miss; set 11, way 1 gets tag 1100 (PTE for VPN 110011); LRU becomes way 0
  7. F28: miss; set 00, way 0 (the LRU way) is replaced with tag 1111 (PTE for VPN 111100); LRU becomes way 1

final TLB state:

index   way 0 (V, tag, PTE)         way 1 (V, tag, PTE)         LRU
00      1, 1111, for VPN 111100     1, 1101, for VPN 110100     way 1
01      0, ????                     0, ????                     ?
10      0, ????                     0, ????                     ?
11      1, 0000, for VPN 000011     1, 1100, for VPN 110011     way 0
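The access pattern can be replayed with a short simulation. This sketch assumes true LRU replacement and tracks each set as a list of tags ordered from least to most recently used, rather than tracking way numbers as the slide's table does; `simulate` is a hypothetical name.

```python
PAGE_OFFSET_BITS = 6   # 64-byte pages
INDEX_BITS = 2         # 4 sets
WAYS = 2               # 8 entries / 4 sets


def simulate(addresses):
    """Replay accesses against an initially empty 2-way LRU TLB."""
    # each set is a list of tags, least recently used first
    sets = [[] for _ in range(1 << INDEX_BITS)]
    results = []
    for addr in addresses:
        vpn = addr >> PAGE_OFFSET_BITS
        index = vpn & ((1 << INDEX_BITS) - 1)
        tag = vpn >> INDEX_BITS
        lru_order = sets[index]
        if tag in lru_order:
            lru_order.remove(tag)        # will be re-appended as most recent
            results.append("hit")
        else:
            if len(lru_order) == WAYS:
                lru_order.pop(0)         # evict the least recently used tag
            results.append("miss")
        lru_order.append(tag)
    return results, sets


# the seven addresses from the table, written in binary as on the slide
trace = [int(bits, 2) for bits in (
    "000100000000", "110100000001", "000100001010", "110100100001",
    "000011111100", "110011111000", "111100101000",
)]
results, sets = simulate(trace)
print(results)
print([[f"{tag:04b}" for tag in s] for s in sets])
```

The final access misses even though set 00 is full, and it evicts tag 0001 because the hit on tag 1101 was more recent.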

15

slide-47
SLIDE 47

changing page tables

what happens to TLB when page table base pointer is changed?
  e.g. context switch

most entries in TLB refer to things from wrong process
  • oops: read from the wrong process’s stack?

  • option 1: invalidate all TLB entries
      side effect on “change page table base register” instruction
  • option 2: TLB entries contain process ID
      set by OS (special register)
      checked by TLB in addition to TLB tag, valid bit
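The two options can be contrasted with a toy model. Both classes below are hypothetical sketches that use a Python dict in place of the real set-associative structure; the method names are invented for illustration.

```python
class FlushingTLB:
    """Option 1: changing the page table base invalidates every entry."""

    def __init__(self):
        self.entries = {}            # vpn -> ppn
        self.page_table_base = None

    def set_page_table_base(self, base):
        self.page_table_base = base
        self.entries.clear()         # side effect of the register write

    def fill(self, vpn, ppn):
        self.entries[vpn] = ppn

    def lookup(self, vpn):
        return self.entries.get(vpn)


class TaggedTLB:
    """Option 2: each entry also records the process ID it belongs to."""

    def __init__(self):
        self.entries = {}            # (pid, vpn) -> ppn
        self.current_pid = None      # stands in for the special register

    def set_current_pid(self, pid):
        self.current_pid = pid       # no flush needed

    def fill(self, vpn, ppn):
        self.entries[(self.current_pid, vpn)] = ppn

    def lookup(self, vpn):
        # the process ID is checked in addition to the tag and valid bit
        return self.entries.get((self.current_pid, vpn))


tagged = TaggedTLB()
tagged.set_current_pid(1)
tagged.fill(vpn=5, ppn=0x10)
tagged.set_current_pid(2)
print(tagged.lookup(5))   # None: process 2 cannot see process 1's entry
tagged.set_current_pid(1)
print(tagged.lookup(5))   # 16: the entry survived the context switches
```

With option 2, a process that is switched back in can still hit on entries it loaded earlier, which option 1 cannot offer.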

16

slide-50
SLIDE 50

editing page tables

what happens to TLB when OS changes a page table entry?

  • invalid to valid: nothing needed
      TLB doesn’t contain invalid entries
      MMU will check memory again
  • valid to invalid: OS needs to tell processor to invalidate it
      special instruction (x86: invlpg)
  • valid to other valid: OS needs to tell processor to invalidate it
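A toy model shows why only two of the three cases need an invalidation. Everything here is a hypothetical sketch: the page table and TLB are dicts, a missing key models an invalid PTE, and `invalidate_page` merely stands in for what an instruction like invlpg asks the processor to do.

```python
def lookup(tlb, page_table, vpn):
    """Return the physical page number, or None for an invalid PTE."""
    if vpn in tlb:
        return tlb[vpn]               # TLB hit: the page table is not consulted
    ppn = page_table.get(vpn)         # TLB miss: the MMU checks memory
    if ppn is not None:
        tlb[vpn] = ppn                # only valid entries are cached
    return ppn


def invalidate_page(tlb, vpn):
    """Hypothetical stand-in for a single-page TLB invalidation."""
    tlb.pop(vpn, None)


tlb, page_table = {}, {7: 0x42}

# invalid to valid: nothing needed, the earlier miss left nothing in the TLB
assert lookup(tlb, page_table, 8) is None
page_table[8] = 0x55
assert lookup(tlb, page_table, 8) == 0x55

# valid to other valid: the TLB keeps serving the old mapping until invalidated
assert lookup(tlb, page_table, 7) == 0x42
page_table[7] = 0x99
stale = lookup(tlb, page_table, 7)    # still 0x42
invalidate_page(tlb, 7)
fresh = lookup(tlb, page_table, 7)    # now 0x99

# valid to invalid: same problem, same fix
del page_table[7]
still_mapped = lookup(tlb, page_table, 7)   # stale 0x99
invalidate_page(tlb, 7)
gone = lookup(tlb, page_table, 7)           # None
print(hex(stale), hex(fresh), hex(still_mapped), gone)
```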

17

slide-51
SLIDE 51

TLB shootdown

[timeline diagram: program A pages and program B pages in memory; lanes for program B (on other core), program A, and the OS; events: page fault, start read, evicted, loaded, interrupt]

  • mark evicted page invalid in page table
  • interrupt to the other core: triggered to invalidate TLB
  • other processes can run while reading page
  • OS will get interrupt when disk is done
  • process A’s page table updated and restarted from point of fault

18

slide-56
SLIDE 56

TLBs and performance

[diagram: TLB lookup followed by L1 cache access]

extra time?

19

slide-57
SLIDE 57

L1 caches and page numbers (Intel Skylake)

physical address (48 bits):
  as the page table sees it:  PPN (36 bit), page offset (12 bit)
  as the L1 cache sees it:    L1 cache tag (36 bit), L1 index (6 bit), L1 offset (6 bit)

not a coincidence
why did Intel make this decision?

20

slide-59
SLIDE 59
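
overlapping TLB and cache access

[diagram: 2-way set-associative cache, each way storing valid, tag, data; example contents (valid, tag, data): (1, 10, 00 11), (1, 00, AA BB), (1, 11, B4 B5), (1, 01, 33 44). Address digits shown: 100 000111 1. The VPN goes to the TLB, which outputs 11, while the index selects a cache set. The TLB output is compared (=) with each way’s tag (tag = PPN), ANDed with valid, then ORed: is hit? (1). The offset selects the data byte (B5).]

perform TLB access while cache access is happening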

21

slide-62
SLIDE 62

virtually-indexed, physically-tagged

called virtually-indexed, physically-tagged cache

requirement: index contained entirely in page offset
  do not need to do translation to start cache access

tag overlaps with PPN
  example: tag=PPN (but tag could include part of page offset, too)

do TLB access while retrieving cache set
most common design in current processors
reason for highly associative (e.g. 8-way) L1 caches
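The requirement that the index fits inside the page offset can be turned into a quick check. This is a sketch with hypothetical helper names; it assumes power-of-two sizes, and the 32 KiB capacity is only what 64 sets, 64-byte blocks, and 8 ways multiply out to.

```python
def sets_in_cache(cache_bytes, ways, block_bytes):
    return cache_bytes // (ways * block_bytes)


def can_overlap_tlb_and_cache(sets, block_bytes, page_bytes):
    """True if the cache index and block offset bits all lie within the
    page offset, so the set can be read before translation finishes."""
    index_and_offset_bits = (sets * block_bytes).bit_length() - 1
    page_offset_bits = page_bytes.bit_length() - 1
    return index_and_offset_bits <= page_offset_bits


# 6-bit index and 6-bit offset with 4KB pages, as on the Skylake slide
print(can_overlap_tlb_and_cache(sets=64, block_bytes=64, page_bytes=4096))

# same capacity (64 * 64 * 8 bytes) with half the ways needs a 7-bit index
four_way_sets = sets_in_cache(64 * 64 * 8, ways=4, block_bytes=64)
print(can_overlap_tlb_and_cache(four_way_sets, block_bytes=64, page_bytes=4096))
```

The second call illustrates the last bullet: once the index is capped by the page offset, the only way to grow the cache is to add ways.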

22

slide-63
SLIDE 63

physical caches

so far: caches use physical addresses
means cache lookup can’t complete without TLB
  (and can’t start without index from physical address)

[diagram: the memory address goes to the TLB and then to the L1 cache; the page offset is passed along to the cache unchanged]

23

slide-64
SLIDE 64

virtual indexing (rarely used)

alternate option: have caches hold virtual addresses
alternate, rarely chosen design

[diagram: the memory address goes straight to the L1 cache; the TLB is used on miss]

faster lookup, but some things more complicated:
  • need to invalidate caches on page table changes
  • need to deal with multiple PTEs for some physical page (“aliasing”)

24

slide-65
SLIDE 65

address splitting

16-bit virtual addresses
64-byte pages
256B, 8-way L1 cache with 16B blocks

can TLB and cache access overlap?

25

slide-66
SLIDE 66

x86-64 page table entries (1)

present = valid
R/W = writes allowed?
U/S = kernel-only? (“user/supervisor”)
XD = execute-disable?
A = accessed? (MMU sets to 1 on page read/write)
  helps support replacement policies for swapping
D = dirty? (MMU sets to 1 on page write)
  helps support writeback policy for swapping

26

slide-69
SLIDE 69

x86-64 page table entries (2)

G = global? (shared between all page tables)
  CPU won’t evict TLB entries on most page table base register changes

PWT, PCD, PAT = control how caches work when accessing physical page:
  • can disable using the cache entirely
  • can disable write-back (use write-through instead)
  • multicore-related cache settings (and some other settings)
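The flags from the two x86-64 slides can be pulled out of a 64-bit entry with shifts and masks. The bit positions below are the ones for a last-level entry mapping a 4KB page; the function name, the dict keys, and the sample value are made up for illustration.

```python
# bit positions in an x86-64 page table entry that maps a 4KB page
FLAG_BITS = {
    "present": 0,           # valid
    "writable": 1,          # R/W: 1 means writes are allowed
    "user": 2,              # U/S: 1 means user-mode accesses are allowed
    "pwt": 3,
    "pcd": 4,
    "accessed": 5,
    "dirty": 6,
    "pat": 7,
    "global": 8,
    "execute_disable": 63,  # XD
}
PPN_SHIFT = 12
PPN_MASK = (1 << 40) - 1    # bits 12 through 51


def decode_pte(pte):
    fields = {name: bool((pte >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["physical_page_number"] = (pte >> PPN_SHIFT) & PPN_MASK
    return fields


# made-up entry: XD set, physical page 0x12345, low flag bits 0x067
decoded = decode_pte(0x8000_0000_1234_5067)
for name, value in decoded.items():
    print(name, hex(value) if name == "physical_page_number" else value)
```

Note the polarity of U/S: the bit being clear is what makes a page kernel-only.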

27

slide-71
SLIDE 71

book’s diagram

28

slide-72
SLIDE 72

“huge pages”

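[diagram, normal page: the address is VPN part 1, VPN part 2, (normal) page offset; the PTBR locates a page table entry (marked 0), which leads to a second page table entry, which leads to the physical page (normal)]

[diagram, huge page: the address is VPN part 1, “huge page” page offset; the PTBR locates a page table entry (marked 1), which leads directly to the physical page (huge)]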

29

slide-73
SLIDE 73

big pages on x86-64

  • option for 2MB or 1GB pages instead of 4KB pages
  • second- and third-level page table entries can point to either
      next page table (normal case), or
      a “huge page”
  • changes to TLB needed
  • processes can have mix of huge and normal pages

30

slide-74
SLIDE 74

why big pages?

TLB misses can create same sort of problems as cache misses
can do cache blocking to help with TLB misses, but…
  • big pages are relatively easy to implement
  • might dramatically reduce TLB misses
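One way to see why big pages might cut TLB misses is to compute how much memory a fixed number of entries can cover. The sketch below assumes the 64-entry TLB from the earlier address-splitting example and assumes, for simplicity, that the same number of entries is available for every page size, which real TLBs do not guarantee. `tlb_reach_bytes` is a hypothetical name.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30


def tlb_reach_bytes(entries, page_bytes):
    """Memory covered when every TLB entry maps a distinct page."""
    return entries * page_bytes


for label, page_bytes in (("4KB", 4 * KIB), ("2MB", 2 * MIB), ("1GB", GIB)):
    reach = tlb_reach_bytes(64, page_bytes)
    print(f"{label} pages: {reach // KIB} KiB covered")
```

With 4KB pages the 64 entries cover only 256 KiB, so a program touching a few megabytes of scattered data misses in the TLB constantly; with 2MB pages the same entries cover 128 MiB.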

31

slide-75
SLIDE 75

32

slide-76
SLIDE 76

two-level page tables

two-level page table for 65536 pages (16-bit VPN)

first-level page table: one entry per range of 256 VPNs
  for VPN 0x0-0xFF
  for VPN 0x100-0x1FF
  for VPN 0x200-0x2FF
  for VPN 0x300-0x3FF
  …
  for VPN 0xFF00-0xFFFF

second-level page tables: one entry per VPN, pointing to the actual data for the page (if PTE valid)
  PTE for VPN 0x00, PTE for VPN 0x01, PTE for VPN 0x02, PTE for VPN 0x03, …, PTE for VPN 0xFF
  PTE for VPN 0x300, PTE for VPN 0x301, PTE for VPN 0x302, PTE for VPN 0x303, …, PTE for VPN 0x3FF

invalid entries represent big holes

first-level page table
columns: VPN range; valid, kernel, write; physical page # (of next page table)
  0x0000-0x00FF   1 1   0x22343
  0x0100-0x01FF   1     0x00000
  0x0200-0x02FF         0x00000
  0x0300-0x03FF   1 1   0x33454
  0x0400-0x04FF   1 1   0xFF043
  …
  0xFF00-0xFFFF   1 1   0xFF045

a second-level page table
columns: VPN; valid, kernel, write; physical page # (of data)
  0x300   1 1   0x42443
  0x301   1 1   0x4A9DE
  0x302   1 1   0x5C001
  0x303         0x00000
  0x304   1 1   0x6C223
  …
  0x3FF         0x00000
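The lookup through the two tables can be sketched with nested dicts, using the physical page numbers from the tables above. Several things here are assumptions: rows whose physical page # is 0x00000 are treated as invalid, second-level tables the slide does not show are left empty (so lookups there fault in this sketch), and the permission bits are ignored.

```python
class PageFault(Exception):
    pass


# first-level index (top 8 bits of the VPN) -> physical page # of next table
# assumption: rows with physical page # 0x00000 are invalid and simply omitted
FIRST_LEVEL = {0x00: 0x22343, 0x03: 0x33454, 0x04: 0xFF043, 0xFF: 0xFF045}

# physical page # of a second-level table -> its valid entries
# only the table for VPNs 0x300-0x3FF is shown on the slide
SECOND_LEVEL = {
    0x33454: {0x00: 0x42443, 0x01: 0x4A9DE, 0x02: 0x5C001, 0x04: 0x6C223},
}


def lookup(vpn):
    """Translate a 16-bit VPN to the physical page # of its data."""
    first_index = vpn >> 8
    second_index = vpn & 0xFF
    table_ppn = FIRST_LEVEL.get(first_index)
    if table_ppn is None:
        raise PageFault(f"no second-level table for VPN {vpn:#x}")
    ppn = SECOND_LEVEL.get(table_ppn, {}).get(second_index)
    if ppn is None:
        raise PageFault(f"invalid PTE for VPN {vpn:#x}")
    return ppn


print(hex(lookup(0x301)))
```

A VPN such as 0x250 faults at the first level without any second-level table existing for it, which is how invalid first-level entries represent big holes cheaply.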

33

slide-77
SLIDE 77

two-level page tables

for VPN 0x0-0xFF for VPN 0x100-0x1FF for VPN 0x200-0x2FF for VPN 0x300-0x300 … for VPN 0xFF00-0xFFFF fjrst-level page table two-level page table for 65536 pages (16-bit VPN) PTE for VPN 0x00 PTE for VPN 0x01 PTE for VPN 0x02 PTE for VPN 0x03 … PTE for VPN 0xFF second-level page tables actual data for page (if PTE valid) PTE for VPN 0x300 PTE for VPN 0x301 PTE for VPN 0x302 PTE for VPN 0x303 … PTE for VPN 0x3FF invalid entries represent big holes VPN range valid kernel write physical page #

(of next page table)

0x0000-0x00FF 1 1 0x22343 0x0100-0x01FF 1 0x00000 0x0200-0x02FF 0x00000 0x0300-0x03FF 1 1 0x33454 0x0400-0x04FF 1 1 0xFF043 … … … … … 0xFF00-0xFFFF 1 1 0xFF045 fjrst-level page table VPN valid kernel write physical page #

(of data)

0x300 1 1 0x42443 0x301 1 1 0x4A9DE 0x302 1 1 0x5C001 0x303 0x00000 0x304 1 1 0x6C223 … … … … … 0x3FF 0x00000 a second-level page table

33

slide-78
SLIDE 78

two-level page tables

for VPN 0x0-0xFF for VPN 0x100-0x1FF for VPN 0x200-0x2FF for VPN 0x300-0x300 … for VPN 0xFF00-0xFFFF fjrst-level page table two-level page table for 65536 pages (16-bit VPN) PTE for VPN 0x00 PTE for VPN 0x01 PTE for VPN 0x02 PTE for VPN 0x03 … PTE for VPN 0xFF second-level page tables actual data for page (if PTE valid) PTE for VPN 0x300 PTE for VPN 0x301 PTE for VPN 0x302 PTE for VPN 0x303 … PTE for VPN 0x3FF invalid entries represent big holes VPN range valid kernel write physical page #

(of next page table)

0x0000-0x00FF 1 1 0x22343 0x0100-0x01FF 1 0x00000 0x0200-0x02FF 0x00000 0x0300-0x03FF 1 1 0x33454 0x0400-0x04FF 1 1 0xFF043 … … … … … 0xFF00-0xFFFF 1 1 0xFF045 fjrst-level page table VPN valid kernel write physical page #

(of data)

0x300 1 1 0x42443 0x301 1 1 0x4A9DE 0x302 1 1 0x5C001 0x303 0x00000 0x304 1 1 0x6C223 … … … … … 0x3FF 0x00000 a second-level page table

33

slide-79
SLIDE 79

two-level page tables

for VPN 0x0-0xFF for VPN 0x100-0x1FF for VPN 0x200-0x2FF for VPN 0x300-0x300 … for VPN 0xFF00-0xFFFF fjrst-level page table two-level page table for 65536 pages (16-bit VPN) PTE for VPN 0x00 PTE for VPN 0x01 PTE for VPN 0x02 PTE for VPN 0x03 … PTE for VPN 0xFF second-level page tables actual data for page (if PTE valid) PTE for VPN 0x300 PTE for VPN 0x301 PTE for VPN 0x302 PTE for VPN 0x303 … PTE for VPN 0x3FF invalid entries represent big holes VPN range valid kernel write physical page #

(of next page table)

0x0000-0x00FF 1 1 0x22343 0x0100-0x01FF 1 0x00000 0x0200-0x02FF 0x00000 0x0300-0x03FF 1 1 0x33454 0x0400-0x04FF 1 1 0xFF043 … … … … … 0xFF00-0xFFFF 1 1 0xFF045 fjrst-level page table VPN valid kernel write physical page #

(of data)

0x300 1 1 0x42443 0x301 1 1 0x4A9DE 0x302 1 1 0x5C001 0x303 0x00000 0x304 1 1 0x6C223 … … … … … 0x3FF 0x00000 a second-level page table

33

slide-87
SLIDE 87

two-level page table lookup

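MMU

virtual address: 11 0101 01 | 00 1011 00 | 00 1101 1111
(first-level VPN part | second-level VPN part | page offset)

VPN: split into two parts (one per level)

first-level page table lookup
  1st PTE addr. = page table base register (0x10000) + first-level VPN part × PTE size
  read the PTE through the data or instruction cache, then split PTE parts
  valid, etc? if not, cause fault

second-level page table lookup
  2nd PTE addr. = page number from 1st PTE × page size + second-level VPN part × PTE size
  read the PTE through the data or instruction cache, then split PTE parts
  valid, etc? if not, cause fault

physical address: 1101 0011 11 | 00 1101 1111
(page number from 2nd PTE | page offset, copied from the virtual address)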

34
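
The lookup on this slide can be traced in code. The sketch below is not the hardware's actual implementation. It assumes 8-bit indices per level and a 10-bit page offset (read off the example bit pattern), a 4-byte PTE, and a made-up PTE encoding with the valid bit in bit 0 and the page number above it. Memory is modeled as a dictionary from physical address to PTE value, and the second-level table's page number (`0x50`) is an arbitrary choice for the example. Only the page table base register value `0x10000` and the address bits come from the slide.

```python
PAGE_OFFSET_BITS = 10            # assumed from the 10-bit offset in the example
INDEX_BITS = 8                   # assumed: 8 VPN bits per level
PAGE_SIZE = 1 << PAGE_OFFSET_BITS
PTE_SIZE = 4                     # assumption: 4-byte page table entries
PTBR = 0x10000                   # page table base register value from the slide


class PageFault(Exception):
    """Raised when a PTE on the lookup path is not valid."""


def make_pte(ppn, valid=True):
    # Hypothetical encoding: bit 0 is the valid bit, page number sits above it.
    return (ppn << 1) | int(valid)


def split_pte(pte):
    """Split a PTE into (physical page number, valid flag)."""
    return pte >> 1, bool(pte & 1)


def translate(memory, va):
    """Walk the two-level page table and return the physical address."""
    offset = va & (PAGE_SIZE - 1)
    vpn = va >> PAGE_OFFSET_BITS
    idx1 = vpn >> INDEX_BITS                  # first-level VPN part
    idx2 = vpn & ((1 << INDEX_BITS) - 1)      # second-level VPN part

    # First-level lookup: base register + index × PTE size.
    pte1_addr = PTBR + idx1 * PTE_SIZE
    ppn1, valid1 = split_pte(memory.get(pte1_addr, 0))
    if not valid1:
        raise PageFault(f"invalid first-level PTE at {pte1_addr:#x}")

    # Second-level lookup: page number × page size + index × PTE size.
    pte2_addr = ppn1 * PAGE_SIZE + idx2 * PTE_SIZE
    ppn2, valid2 = split_pte(memory.get(pte2_addr, 0))
    if not valid2:
        raise PageFault(f"invalid second-level PTE at {pte2_addr:#x}")

    # Final physical address: page number from 2nd PTE, offset unchanged.
    return ppn2 * PAGE_SIZE + offset


# The slide's virtual address: 11010101 | 00101100 | 0011011111
EXAMPLE_VA = (0b11010101 << 18) | (0b00101100 << 10) | 0b0011011111

SECOND_LEVEL_PPN = 0x50          # arbitrary page holding the second-level table
example_memory = {
    PTBR + 0b11010101 * PTE_SIZE: make_pte(SECOND_LEVEL_PPN),
    SECOND_LEVEL_PPN * PAGE_SIZE + 0b00101100 * PTE_SIZE: make_pte(0b1101001111),
}

print(hex(translate(example_memory, EXAMPLE_VA)))  # prints 0xd3cdf
```

The result `0xd3cdf` is the bit pattern `1101 0011 11` followed by `00 1101 1111`, matching the physical address shown on the slide. Each translation costs two extra memory reads (one per level), which is the overhead a TLB is meant to hide.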
