HashCache: Cache Storage for the Next Billion

Anirudh Badam, KyoungSoo Park, Vivek S. Pai, Larry L. Peterson (Princeton University)

Next Billion Internet Users: Schools, urban middle class in developing


  1. HashCache: Basic Policy
     [Diagram labels: URL; hash_value (H bits); % N; t; head; t-th block; data; N contiguous blocks (Disk Table); Circular Log; Filesystem]

  2. HashCache: Basic Policy
     [Diagram labels: URL; hash_value (H bits); % N; t; head; t-th block; N contiguous blocks (Disk Table); Circular Log; Filesystem]

  4. HashCache: Basic Policy
     • Advantages
       • No index memory needed
       • Tuned for one seek for most objects

  5. HashCache: Basic Policy
     • Advantages
       • No index memory needed
       • Tuned for one seek for most objects
     • Disadvantages
       • One seek per miss
       • No collision control
       • No cache replacement policy
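The Basic policy's lookup path can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the hash function (SHA-1 truncated to 64 bits), the block size, and the names `url_hash64` and `block_offset` are assumptions. The slides only state that the URL's hash value modulo N selects the t-th block of the Disk Table.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes; not given in the slides


def url_hash64(url: str) -> int:
    """Return a 64-bit hash of the URL (SHA-1 is an assumed choice)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def block_offset(url: str, n_blocks: int) -> int:
    """Byte offset of the one Disk Table block where this URL may live."""
    t = url_hash64(url) % n_blocks  # t-th of N contiguous blocks
    return t * BLOCK_SIZE
```

Because the location is computed rather than looked up, no index is held in memory. The cost is that a miss can only be detected by seeking to the block and reading it.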

  6. Collision Control

  8. Collision Control
     [Diagram label: Chaining]

  9. Collision Control
     [Diagram label: Chaining]
     • Does not transition well to disk-based
     • Multiple seeks per operation
     • Walking hash bin list
     • Global replacement policy crosses bins

  11. Collision Control
      [Diagram labels: Set Associativity; T-Ways]
      • Fixed locations where each object can be found
      • Allocated contiguously, read together

  12. Collision Control
      [Diagram labels: Set Associativity; T-Ways]
      • Fixed locations where each object can be found
      • Allocated contiguously, read together
      • Seek time dominates short read

  13. Collision Control
      [Diagram labels: Set Associativity; T-Ways]
      • Fixed locations where each object can be found
      • Allocated contiguously, read together
      • Seek time dominates short read
      • Eliminate global cache replacement policies
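The set-associative layout described above can be sketched as a single contiguous read followed by an in-memory scan. This is an illustrative sketch under assumptions: the block size, the SHA-1 based hash, the `(stored_url, data)` block model, and the names `set_extent` and `find_in_set` are hypothetical and do not come from the slides.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes


def url_hash64(url: str) -> int:
    """Return a 64-bit hash of the URL (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest()[:8], "big")


def set_extent(url: str, n_sets: int, ways: int):
    """Return (byte_offset, byte_length) of the contiguous T-way set."""
    t = url_hash64(url) % n_sets
    length = ways * BLOCK_SIZE
    return t * length, length  # one seek, then one read of all T blocks


def find_in_set(set_blocks, url: str):
    """Scan the T blocks of one set, each modeled as (stored_url, data) or None."""
    for block in set_blocks:
        if block is not None and block[0] == url:
            return block[1]
    return None
```

Reading all T ways together costs little extra because, as the slide notes, seek time dominates a short read.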

  14. Reducing Seeks

  15. Reducing Seeks
      [Diagram label: In-memory Hash Table]
      • In-memory hash table
      • Too much memory for pointers

      Field              Bits
      Bin pointers       32
      Chaining pointers  64
      Hash               32
      Total              128

  16. Reducing Seeks
      [Diagram label: Disk Table]
      • In-memory hash table
      • Too much memory for pointers
      • Disk is already a hash table
      • Pointers not needed
      • Large bitmap with the same layout as the disk

  17. Reducing Seeks
      [Diagram labels: Disk Block; H Bits; In-memory Bitmap; Disk Table]
      • In-memory hash table
      • Too much memory for pointers
      • Disk is already a hash table
      • Pointers not needed
      • Large bitmap with the same layout as the disk
      • Just store hash per URL

  19. Reducing Seeks
      • Original hash of the URL: 64 bits
      [Table: Original hash = 64 bits]

  20. Reducing Seeks
      • Original hash of the URL: 64 bits
      • Eliminate bits for (same) bin #: 64 - log(S) = 39 (2^28 objs, 8-way, #bins = 2^25 = S)
      [Table: Original hash = 64 bits; 64 - log(S) = 39 bits]

  21. Reducing Seeks
      • Original hash of the URL: 64 bits
      • Eliminate bits for (same) bin #: 64 - log(S) = 39 (2^28 objs, 8-way, #bins = 2^25 = S)
      • Shrink hash size: Just to eliminate most false positives (8 bits)
      [Table: Original hash = 64 bits; 64 - log(S) = 39 bits; low-FP hash = 8 bits]
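The bit budget on this slide can be checked with a short sketch. The constants (2^28 objects, 8 ways, 2^25 bins, an 8-bit stored hash) come from the slide. Which 8 of the remaining 39 bits are kept is not stated, so taking the lowest 8 is an assumption, as is the function name `split_hash`.

```python
HASH_BITS = 64                      # original hash of the URL
N_OBJECTS = 2**28                   # from the slide
WAYS = 8                            # 8-way sets
N_SETS = N_OBJECTS // WAYS          # S = 2**25 bins
SET_BITS = N_SETS.bit_length() - 1  # log2(S) = 25
TAG_BITS = 8                        # shrunken "low false positive" hash


def split_hash(h: int):
    """Split a 64-bit hash into (bin index, 8-bit tag)."""
    set_index = h % N_SETS     # these bits are implied by the bin itself
    remaining = h >> SET_BITS  # the 64 - log2(S) = 39 bits left over
    tag = remaining & ((1 << TAG_BITS) - 1)  # assumed: keep the lowest 8
    return set_index, tag
```

Every entry in a bin shares the same bin-index bits, so storing them again adds no information. The 8-bit tag only needs to distinguish the few objects that share a bin.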

  22. Cache Replacement
      [Table: Original hash = 64 bits; 64 - log(S) = 39 bits; low-FP hash = 8 bits]

  23. Cache Replacement
      • Large disks: 10-100+ million objects
      • Global caching relevant when disk size ≈ working set
      • When disk >> working set, local policies ≈ global policies
      [Table: Original hash = 64 bits; 64 - log(S) = 39 bits; low-FP hash = 8 bits]

  24. Cache Replacement
      • Large disks: 10-100+ million objects
      • Global caching relevant when disk size ≈ working set
      • When disk >> working set, local policies ≈ global policies
      • Local replacement benefits
        • 3 bits per URL
        • Performed on contiguous objects
        • False positives limited by set size
      [Table: Original hash = 64 bits; 64 - log(S) = 39 bits; low-FP hash = 8 bits; hash + rank = 11 bits]
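The 11 bits per URL (8-bit hash plus 3-bit rank) can be illustrated with a packing sketch and a rough memory estimate. The bit layout inside the entry and the helper names are assumptions. Applying the 2^28-object example to the 128-bit hash table entry from the earlier slide is also an illustrative assumption, used here only for comparison.

```python
N_OBJECTS = 2**28  # example object count from the slides
TAG_BITS = 8       # low false positive hash
RANK_BITS = 3      # enough to rank 8 ways within a set
ENTRY_BITS = TAG_BITS + RANK_BITS  # 11 bits per URL


def pack_entry(tag: int, rank: int) -> int:
    """Pack an 8-bit tag and a 3-bit rank into one 11-bit value."""
    if not (0 <= tag < 1 << TAG_BITS and 0 <= rank < 1 << RANK_BITS):
        raise ValueError("tag or rank out of range")
    return (tag << RANK_BITS) | rank  # assumed layout: tag above rank


def unpack_entry(entry: int):
    """Inverse of pack_entry: return (tag, rank)."""
    return entry >> RANK_BITS, entry & ((1 << RANK_BITS) - 1)


def index_bytes(bits_per_object: int, n_objects: int = N_OBJECTS) -> int:
    """Total index memory in bytes for a given per-object cost."""
    return bits_per_object * n_objects // 8
```

Under these assumptions, `index_bytes(128)` is 4 GiB for the pointer-based hash table, while `index_bytes(ENTRY_BITS)` is 352 MiB.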

  25. HashCache: SetMem Policy

  26. HashCache: SetMem Policy
      [Diagram labels: URL]

  27. HashCache: SetMem Policy
      [Diagram labels: URL; data]

  28. HashCache: SetMem Policy
      [Diagram labels: URL; hash_value; % S; t; data]

  29. HashCache: SetMem Policy
      [Diagram labels: URL; hash_value; % S; t; head; data; Memory; Filesystem]

  30. HashCache: SetMem Policy
      [Diagram labels: URL; hash_value; % S; 11 bits; t; head; t-th set; t-th set; data; Memory; Filesystem]

  32. HashCache: SetMem Policy
      [Diagram labels: URL; hash_value; % S; 11 bits; t; head; t-th set; t-th set; Memory; Filesystem]

  33. HashCache: SetMem Policy
      [Diagram labels: URL; hash_value; % S; 11 bits; LRU; LRU; t; head; t-th set; t-th set; Memory; Filesystem]

  34. HashCache: SetMem Policy
      • Advantages
        • No seeks for most misses
        • 1 seek per read, 1 seek per write
        • Good hash + replacement in 11 bits

  35. HashCache: SetMem Policy
      • Advantages
        • No seeks for most misses
        • 1 seek per read, 1 seek per write
        • Good hash + replacement in 11 bits
      • Disadvantages
        • Writes still need seeks
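A rough sketch of how an in-memory set index can answer most misses without touching the disk is shown below. It is not the authors' code: the class name `SetMemIndex`, its methods, and the choice of tag bits are hypothetical. The position of a tag in each per-set list stands in for the 3-bit LRU rank.

```python
class SetMemIndex:
    """In-memory tags per set, ordered most to least recently used."""

    def __init__(self, n_sets: int, ways: int = 8):
        self.n_sets = n_sets
        self.ways = ways
        self.sets = [[] for _ in range(n_sets)]

    def _locate(self, h: int):
        # Set index from the hash; tag from 8 of the remaining bits (assumed).
        return h % self.n_sets, (h // self.n_sets) & 0xFF

    def probably_cached(self, h: int) -> bool:
        """False means a definite miss (no seek); True may be a false positive."""
        s, tag = self._locate(h)
        return tag in self.sets[s]

    def touch(self, h: int):
        """Record an access; return the evicted tag, or None if none was evicted."""
        s, tag = self._locate(h)
        entries = self.sets[s]
        evicted = None
        if tag in entries:
            entries.remove(tag)            # already present: refresh its rank
        elif len(entries) == self.ways:
            evicted = entries.pop()        # drop the least recently used tag
        entries.insert(0, tag)             # most recently used goes first
        return evicted
```

A `True` from `probably_cached` still requires reading the set from disk to confirm the full URL, which is the one seek per read on the slide. Replacement stays local to the set, so no global LRU structure is needed.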

  36. Further Reducing Seeks
      [Diagram label: Disk Table]

  37. Further Reducing Seeks
      [Diagram label: Disk Table]
      • Storing objects by hash can produce random reads & writes

  38. Further Reducing Seeks
      [Diagram label: Disk Table]
      • Storing objects by hash can produce random reads & writes
      • Restructure on-disk table
