
Elastic Cuckoo Page Tables: Rethinking Virtual Memory Translation for Parallelism

Elastic Cuckoo Page Tables: Rethinking Virtual Memory Translation for Parallelism. Dimitrios Skarlatos, Apostolos Kokolis, Tianyin Xu, Josep Torrellas. University of Illinois at Urbana-Champaign. skarlat2.web.engr.illinois.edu. ASPLOS 2020.


  1. Elastic Cuckoo Page Tables: Rethinking Virtual Memory Translation for Parallelism. Dimitrios Skarlatos, Apostolos Kokolis, Tianyin Xu, Josep Torrellas. University of Illinois at Urbana-Champaign. skarlat2.web.engr.illinois.edu. ASPLOS 2020.

  2. Virtual Memory Translation is Expensive. [Diagram: the core issues LD VA 1 and misses in the TLB. The application's page tables, which map VA 1 → PA 4 and VA 8 → PA 1, reside in main memory behind the L1, L2, and L3 caches.]

  3. Virtual Memory Translation is Expensive. TLB miss → “page walk” = fetch the entry (VA 1 → PA 4) from the page table.

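  4. x86-64 Radix Page Tables. [Diagram: main memory holding PA 1 and PA 4.]

  5. x86-64 Radix Page Tables. The virtual address is split into four 9-bit indices and a page offset: bits 47…39 index the PGD, bits 38…30 the PUD, bits 29…21 the PMD, bits 20…12 the PTE, and bits 11…0 are the page offset. The walk starts from CR3 and adds each index in turn (CR3 + pgd, then + pud, + pmd, + pte); the PTE supplies the TLB entry VA 1 → PA 4.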
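
The bit fields on slide 5 can be made concrete with a short sketch. It assumes 4 KB pages and the 48-bit layout shown on the slide; the function name `split_va` is illustrative and does not come from the presentation.

```python
def split_va(va: int) -> dict:
    """Split a 48-bit x86-64 virtual address into its four 9-bit
    radix indices and the 12-bit page offset (layout from slide 5)."""
    mask9 = (1 << 9) - 1
    return {
        "pgd": (va >> 39) & mask9,   # bits 47..39
        "pud": (va >> 30) & mask9,   # bits 38..30
        "pmd": (va >> 21) & mask9,   # bits 29..21
        "pte": (va >> 12) & mask9,   # bits 20..12
        "offset": va & 0xFFF,        # bits 11..0
    }


# Build an address from known fields and split it again.
va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
fields = split_va(va)
```

Each index selects one entry at its level, which is why the walk needs one memory access per level.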

  6. Virtual Memory Translation is Expensive. TLB miss → “page walk” = fetch the entry from the radix page table. Step 1: the pgd entry is fetched.

  7. Virtual Memory Translation is Expensive. TLB miss → “page walk” = fetch the entry from the radix page table. Step 2: with the pgd entry in hand, the pud entry is fetched.

  8. Virtual Memory Translation is Expensive. TLB miss → “page walk” = fetch the entry from the radix page table. Step 3: with the pud entry in hand, the pmd entry is fetched.

  9. Virtual Memory Translation is Expensive. TLB miss → “page walk” = fetch the entry from the radix page table. Step 4: with the pmd entry in hand, the pte entry is fetched.

  10. Virtual Memory Translation is Expensive. [Diagram: the walk is complete and the TLB now holds VA 1 → PA 4.]
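
To illustrate why the page walk is sequential, here is a minimal software model of a 4-level radix table built from nested dictionaries. It is a sketch, not the hardware walker: `map_page`, `walk`, and the dictionary representation are assumptions made for illustration. Each level's lookup needs the result of the previous one, so the four accesses cannot overlap.

```python
LEVEL_SHIFTS = (39, 30, 21, 12)  # pgd, pud, pmd, pte (4 KB pages assumed)


def map_page(root: dict, va: int, ppn: int) -> None:
    """Install a VA -> physical page number mapping, creating levels as needed."""
    node = root
    for shift in LEVEL_SHIFTS[:-1]:
        node = node.setdefault((va >> shift) & 0x1FF, {})
    node[(va >> LEVEL_SHIFTS[-1]) & 0x1FF] = ppn


def walk(root: dict, va: int):
    """Return (ppn or None, number of dependent table accesses)."""
    node = root
    accesses = 0
    for shift in LEVEL_SHIFTS:
        accesses += 1                       # one memory access per level
        node = node.get((va >> shift) & 0x1FF)
        if node is None:
            return None, accesses           # unmapped: the walk stops early
    return node, accesses


root = {}
map_page(root, 0x7F0000001000, 4)
hit = walk(root, 0x7F0000001000)
miss = walk(root, 0x0)
```

A mapped address always costs four dependent accesses in this model, which is the cost the TLBs and MMU caches on the following slides try to hide.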

  11. Multilevel TLBs. [Diagram: an L1 TLB and an L2 TLB sit next to the core; the radix page tables (pgd, pud, pmd, pte) reside in main memory behind the L1, L2, and L3 caches.]

  12. Memory Management Unit (MMU) Cache. [Diagram: an MMU cache is added alongside the L1 TLB and L2 TLB.]

  13. Translations in Data Caches. [Diagram: pgd, pud, pmd, and pte entries also appear in the L1, L2, and L3 data caches, in addition to the MMU cache and the TLBs.]

  14. NVM will Make the Problem Worse. Sunny Cove introduces 5-level radix page tables! [Diagram: the same hierarchy, with non-volatile memory technology added to main memory.]

  15. Contribution: Elastic Cuckoo Page Tables
      • Rethinking virtual memory translation for parallelism
      • Idea: dynamically resizable page tables based on cuckoo hashing
      • No sequential page table lookups → parallel single-step lookups
      • Application speedup over state-of-the-art:
        • 3-28% with 4KB pages
        • 3-18% with huge pages

  16. Alternative: A Global Hashed Page Table

  17. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM. [Diagram: VA 1 and VA 9 of an application are hashed (H) into tagged entries of one global hash table.] COLLISIONS: the OS is invoked to resolve them!

  18. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM. [Diagram: Application A (VA 1, VA 9) and Application B (VA 6) hash into the same global hash table.] How to share pages? Multiple page sizes?

  19. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM. How to share pages? New level of indirection!! Multiple page sizes?

  20. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM. How to share pages? New level of indirection!! Multiple page sizes? [Diagram: as on slide 19, with one more tagged entry in the global hash table.]

  21. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM. Three problems: COLLISIONS, PAGE SHARING, PAGE SIZES.

  22. Alternative: A Global Hashed Page Table. The old approach from Intel and IBM: DEAD END. Switched to radix page tables!

  23. Elastic Cuckoo Page Tables: Rethinking virtual memory translation for parallelism

  24. Cuckoo Hashing [Pagh 2001, Fotakis 2005]. [Diagram: a d-ary cuckoo hash table with ways T1, T2, T3 holding a, b, c, d, f, g; element d is hashed with H1, H2, H3, one hash function per way.]

  25. Insertions with Cuckoo Hashing. Element e is inserted using H1. [Table holds a, b, c, d, f, g.]

  26. Insertions with Cuckoo Hashing. e takes its slot in the table; f is displaced.

  27. Insertions with Cuckoo Hashing. The displaced element f is rehashed with H3.

  28. Insertions with Cuckoo Hashing. f takes its new slot; b is displaced.

  29. Insertions with Cuckoo Hashing. The displaced element b is rehashed with H2.

  30. Insertions with Cuckoo Hashing. b takes its new slot and the insertion is complete. [Table holds a, b, c, d, e, f, g.]
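
The displacement chain in slides 25-30 (insert e, displace f, rehash f, displace b, rehash b) can be sketched as a small d-ary cuckoo table. This is an illustrative model under stated assumptions: the per-way hash built on `zlib.crc32`, the random choice of the next way, the `max_kicks` limit, and the class name `CuckooTable` are not from the slides. Keys are assumed to be distinct and `d` at least 2.

```python
import random
import zlib


class CuckooTable:
    def __init__(self, d=3, size=8, max_kicks=32, seed=0):
        self.d, self.size, self.max_kicks = d, size, max_kicks
        self.ways = [[None] * size for _ in range(d)]   # T1..Td
        self.rng = random.Random(seed)

    def _slot(self, way, key):
        # Hypothetical hash H_way: salt the key with the way index.
        return zlib.crc32(f"{way}:{key}".encode()) % self.size

    def lookup(self, key):
        # A key can sit in exactly one slot per way, so d independent probes suffice.
        for way in range(self.d):
            entry = self.ways[way][self._slot(way, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        """Return None on success, or the (key, value) left without a slot."""
        item = (key, value)
        way = self.rng.randrange(self.d)
        for _ in range(self.max_kicks):
            s = self._slot(way, item[0])
            # Store the item and pick up whatever was there (the victim).
            item, self.ways[way][s] = self.ways[way][s], item
            if item is None:
                return None
            # Rehash the victim into a different way.
            way = (way + self.rng.randrange(1, self.d)) % self.d
        return item   # too many displacements; a real table would resize here


table = CuckooTable()
homeless = set()
for i, k in enumerate("abcdefg"):
    left = table.insert(k, i)
    if left is not None:
        homeless.add(left[0])
```

Lookups never follow a chain: whatever happened during insertion, a key is found by probing one slot in each of the d ways.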

  31. Insertions with Cuckoo Hashing. Checklist of problems: COLLISIONS, PAGE SHARING, PAGE SIZES. [Diagram: the d-ary cuckoo hash table (T1, T2, T3) holding a, b, c, d, e, f, g.]

  32. Private Hashed Page Tables. Checklist of problems: COLLISIONS, PAGE SHARING, PAGE SIZES. Approach: PRIVATE PAGE TABLES. [Diagram: the d-ary cuckoo hash table (T1, T2, T3) holding a, b, c, d, e, f, g.]

  33. Cannot Be Too Big → Waste Memory. Private page tables cannot be too big. [Diagram: main memory holds Page Tables A for App A and Page Tables B for App B.]

  34. Need to Dynamically Resize. Private page tables cannot be too big, so they need to be resized dynamically. [Diagram: Page Tables A hold VA 1 → PA 4 and VA 8 → PA 1.]

  35. Need to Dynamically Resize. Private page tables cannot be too big, so they need to be resized dynamically. [Diagram: New Page Tables A appear in main memory next to Page Tables A.]

  36. Cannot Rehash All Entries at Once. Private page tables cannot be too big, so they need to be resized dynamically. [Diagram: Page Tables A and New Page Tables A side by side.]

  37. Cannot Rehash All Entries at Once. Private page tables cannot be too big, so they need to be resized dynamically. [Diagram: Page Tables A and New Page Tables A side by side.]

  38. Cannot Rehash All Entries at Once. Private page tables cannot be too big and need to be resized dynamically, while the program is running: gradual resizing!

  39. Gradual Resizing Cuckoo Hash Tables. At every insert → rehash one element. [Diagram: the old d-ary cuckoo hash table (T1, T2, T3) holds a, b, c, e, f, g, k, l, m; the new d-ary cuckoo hash table (T1’, T2’, T3’) is empty.]

  40. Gradual Resizing Cuckoo Hash Tables. At every insert → rehash one element. [Diagram: one element of the old table is rehashed with H'1 into the new table.]

  41. Lookup During Gradual Resizing. [Diagram: a lookup of m applies H1, H2, H3 to the old table (T1, T2, T3) and H'1, H'2, H'3 to the new table (T1’, T2’, T3’).]

  42. Problem of Resizing: Double #Lookups. A lookup of m needs 2 × d lookups: H1, H2, H3 on the old table and H'1, H'2, H'3 on the new table.
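
The doubling of lookups during a plain gradual resize can be shown with a small model. It assumes d = 3 ways, an old table of 4 slots per way and a new table of 8 slots per way; the `slot` hash and the function name `lookup_during_resize` are illustrative assumptions. Because an element may sit in either table, every way of both tables has to be probed.

```python
import zlib


def slot(way, key, size):
    # Hypothetical per-way hash; the modulus makes H (old) and H' (new) differ.
    return zlib.crc32(f"{way}:{key}".encode()) % size


def lookup_during_resize(key, old, new):
    """Probe every way of both tables. Returns (value or None, probes issued).

    old and new are lists of d ways; each way is a list of slots holding
    (key, value) tuples or None.
    """
    found, probes = None, 0
    for table in (old, new):
        for way, entries in enumerate(table):
            probes += 1                      # hardware would issue these in parallel
            entry = entries[slot(way, key, len(entries))]
            if entry is not None and entry[0] == key:
                found = entry[1]
    return found, probes


d, old_size, new_size = 3, 4, 8
old = [[None] * old_size for _ in range(d)]
new = [[None] * new_size for _ in range(d)]
old[1][slot(1, "m", old_size)] = ("m", 42)   # m has not been migrated yet
result = lookup_during_resize("m", old, new)
```

With d = 3 the lookup issues 6 probes whether or not the key is present, which is the 2 × d cost named on slide 42.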

  43. Contribution: Elastic Cuckoo Hashing. Rehashing pointers P1, P2, P3. [Diagram: the old d-ary cuckoo hash table (T1, T2, T3) holds a, b, c, e, f, g, k, l, m; the new d-ary cuckoo hash table (T1’, T2’, T3’) is empty.]

  44. Elastic Cuckoo Migration. [Diagram: the rehashing pointers P1, P2, P3 delimit the migrated region of the old table (T1, T2, T3); the new table is T1’, T2’, T3’.]

  45. Elastic Cuckoo Migration. [Diagram: an element of the old table is rehashed with H'1 into the new table.]

  46. Elastic Cuckoo Migration. [Diagram: next migration step; elements a, b, c, e, f, g, k, l, m are spread across the old and new tables, and P1, P2, P3 delimit the migrated region.]

  47. Elastic Cuckoo Migration. [Diagram: next migration step; elements a, b, c, e, f, g, k, l, m are spread across the old and new tables, and P1, P2, P3 delimit the migrated region.]
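
A minimal sketch of how per-way rehashing pointers could keep lookups at d probes during a resize. The rule used here is an assumption drawn from the “Migrated Region” diagrams: in way i, a key whose old-table slot lies below pointer P_i lives in the new table, otherwise it is still in the old table. The `zlib.crc32` hash, the class name `ElasticCuckoo`, the table sizes, and the kick limit are all illustrative; keys are assumed distinct and `d` at least 2.

```python
import random
import zlib


def h(way, key, size):
    # Hypothetical per-way hash; the modulus makes old and new tables hash differently.
    return zlib.crc32(f"{way}:{key}".encode()) % size


class ElasticCuckoo:
    def __init__(self, d=3, old_size=4, new_size=8, seed=0):
        self.d = d
        self.old = [[None] * old_size for _ in range(d)]
        self.new = [[None] * new_size for _ in range(d)]
        self.ptr = [0] * d                     # rehashing pointers, one per way
        self.rng = random.Random(seed)

    def _home(self, way, key):
        """Return the single (table, slot) where key may live in this way."""
        s = h(way, key, len(self.old[way]))
        if s < self.ptr[way]:                  # migrated region: use the new table
            return self.new[way], h(way, key, len(self.new[way]))
        return self.old[way], s                # live region: still the old table

    def lookup(self, key):
        """One probe per way: d probes in total, even in the middle of a resize."""
        for way in range(self.d):
            table, s = self._home(way, key)
            if table[s] is not None and table[s][0] == key:
                return table[s][1]
        return None

    def _place(self, item, way, max_kicks=64):
        for _ in range(max_kicks):
            table, s = self._home(way, item[0])
            item, table[s] = table[s], item    # store the item, pick up any victim
            if item is None:
                return None
            way = (way + self.rng.randrange(1, self.d)) % self.d
        return item                            # left without a slot; caller decides

    def migrate_one(self, way):
        """Move the entry under this way's pointer into the new table."""
        p = self.ptr[way]
        if p >= len(self.old[way]):
            return None                        # this way is fully migrated
        item, self.old[way][p] = self.old[way][p], None
        self.ptr[way] = p + 1
        return self._place(item, way) if item is not None else None

    def insert(self, key, value):
        """Insert, then rehash one element. Returns items left without a slot."""
        lost = [self._place((key, value), self.rng.randrange(self.d)),
                self.migrate_one(self.rng.randrange(self.d))]
        return [item for item in lost if item is not None]


table = ElasticCuckoo()
homeless = set()
for i, k in enumerate("abcdefg"):
    homeless.update(item[0] for item in table.insert(k, i))
```

The design choice worth noting is that the pointer comparison replaces the second probe: the old-table hash alone decides which table to read, so no way is ever probed twice.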
