HashCache: Basic Policy
[Diagram: a URL is hashed to hash_value (H bits); t = hash_value % N selects the t-th block among N contiguous blocks in the Disk Table. Remaining labels: head, data, Circular Log, Filesystem.]
• Advantages
  • No index memory needed
  • Tuned for one seek for most objects
• Disadvantages
  • One seek per miss
  • No collision control
  • No cache replacement policy
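The lookup path of the basic policy can be sketched as a small in-memory simulation. This is an illustration under stated assumptions, not the authors' implementation: the class name `BasicCache`, the SHA-1-derived 64-bit hash, and the Python list standing in for the N contiguous disk blocks are all hypothetical, and the circular log from the diagram is not modeled.

```python
import hashlib


def url_hash(url: str) -> int:
    """64-bit hash of a URL (truncated SHA-1; the hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest()[:8], "big")


class BasicCache:
    """Toy model of the basic policy: the disk table itself is the hash table."""

    def __init__(self, n_blocks: int):
        self.n = n_blocks
        # Each slot stands in for one on-disk block: (stored hash, object data).
        self.table = [None] * n_blocks
        self.seeks = 0  # counts simulated disk seeks

    def get(self, url: str):
        h = url_hash(url)
        self.seeks += 1  # no in-memory index, so hits and misses both cost a seek
        block = self.table[h % self.n]
        if block is not None and block[0] == h:
            return block[1]
        return None

    def put(self, url: str, data: bytes):
        h = url_hash(url)
        self.seeks += 1
        # No collision control: whatever occupied this block is overwritten.
        self.table[h % self.n] = (h, data)
```

In this model a miss is only discovered after the block has been read, which is the "one seek per miss" disadvantage listed on the slide.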
Collision Control
• Chaining
  • Does not transition well to disk-based
  • Multiple seeks per operation
  • Walking hash bin list
  • Global replacement policy crosses bins
• Set Associativity (T-ways)
  • Fixed locations where each object can be found
  • Allocated contiguously, read together
  • Seek time dominates short read
  • Eliminate global cache replacement policies
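A set-associative table can be sketched as follows. This is a minimal model, assuming a hypothetical `SetAssociativeTable` class, a truncated SHA-1 hash, and a list of lists in place of contiguous disk blocks. The point it illustrates is that all T ways of a set are fetched by one simulated seek, so a lookup never walks a chain. The eviction choice (overwrite way 0 when the set is full) is a placeholder, not a policy taken from the slides.

```python
import hashlib


def url_hash(url: str) -> int:
    """64-bit hash of a URL (truncated SHA-1; the hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest()[:8], "big")


class SetAssociativeTable:
    """Toy T-way set-associative table; each set is read with a single seek."""

    def __init__(self, n_sets: int, ways: int):
        self.n_sets = n_sets
        self.ways = ways
        self.sets = [[None] * ways for _ in range(n_sets)]
        self.seeks = 0

    def _read_set(self, s: int):
        self.seeks += 1  # the T ways are contiguous, so one seek fetches them all
        return self.sets[s]

    def get(self, url: str):
        h = url_hash(url)
        for entry in self._read_set(h % self.n_sets):
            if entry is not None and entry[0] == h:
                return entry[1]
        return None

    def put(self, url: str, data: bytes):
        h = url_hash(url)
        ways = self._read_set(h % self.n_sets)
        # Prefer the slot already holding this hash, then the first empty slot.
        for i, entry in enumerate(ways):
            if entry is not None and entry[0] == h:
                ways[i] = (h, data)
                return
        for i, entry in enumerate(ways):
            if entry is None:
                ways[i] = (h, data)
                return
        ways[0] = (h, data)  # placeholder eviction when the set is full
```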
Reducing Seeks
• In-memory hash table
  • Too much memory for pointers
• Disk is already a hash table
  • Pointers not needed
  • Large bitmap with the same layout as the disk
  • Just store hash per URL

In-memory hash table, bits per entry:
  Bin pointers        32
  Chaining pointers   64
  Hash                32
  Total              128

[Diagram: in-memory bitmap mirroring the Disk Table, with H bits per disk block.]
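To make the memory argument concrete, the arithmetic can be worked through in a few lines. The per-entry bit counts come from the slide; the object count of 2**28 and the helper name `index_bytes` are illustrative assumptions.

```python
# Per-entry cost of a chained in-memory hash table, in bits (from the slide).
BIN_POINTER_BITS = 32
CHAINING_POINTER_BITS = 64
HASH_BITS = 32
CHAINED_BITS = BIN_POINTER_BITS + CHAINING_POINTER_BITS + HASH_BITS


def index_bytes(n_objects: int, bits_per_object: int) -> int:
    """Total index size in bytes for a given per-object cost."""
    return n_objects * bits_per_object // 8


N_OBJECTS = 2**28  # assumed cache size, for illustration only

chained = index_bytes(N_OBJECTS, CHAINED_BITS)   # pointers + hash
hash_only = index_bytes(N_OBJECTS, HASH_BITS)    # table mirrors the disk layout

print(f"chained table: {chained / 2**30:.0f} GiB")
print(f"hash only:     {hash_only / 2**30:.0f} GiB")
```

Because the in-memory table has the same layout as the disk table, an entry's position already identifies the disk block, which is why the pointers can be dropped.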
Reducing Seeks
• Original hash of the URL: 64 bits
• Eliminate bits for (same) bin #: 64 - log2(S) = 39 bits (2^28 objects, 8-way, #bins S = 2^25)
• Shrink hash size: just enough to eliminate most false positives (8 bits)
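The bit budget can be checked with a short calculation. The figures (2^28 objects, 8 ways, an 8-bit stored hash) are from the slide. Which hash bits select the set and which form the stored fingerprint is an assumption here, as is the name `split_hash`. The false-positive estimate further assumes a full set and uniformly distributed fingerprints.

```python
N_OBJECTS = 2**28
WAYS = 8
N_SETS = N_OBJECTS // WAYS          # S = 2**25 bins
SET_BITS = N_SETS.bit_length() - 1  # log2(S) = 25
REMAINING_BITS = 64 - SET_BITS      # bits not implied by the bin number
FP_BITS = 8                         # shortened hash kept per URL


def split_hash(h64: int):
    """Split a 64-bit hash into (set index, 8-bit fingerprint).

    Assumption: the low bits pick the set and the next 8 bits are stored.
    """
    set_index = h64 % N_SETS
    fingerprint = (h64 >> SET_BITS) & ((1 << FP_BITS) - 1)
    return set_index, fingerprint


# Chance that an absent URL matches at least one entry of a full set.
false_positive_rate = 1 - (1 - 1 / 2**FP_BITS) ** WAYS
```

Under these assumptions the false-positive rate comes out at roughly 3%, so most misses can be rejected without touching the disk.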
Cache Replacement
• Large disks: 10-100+ million objects
• Global caching relevant when disk size ≈ working set
• When disk >> working set, local policies ≈ global policies
• Local replacement benefits
  • 3 bits per URL
  • Performed on contiguous objects
  • False positives limited by set size

Bits per URL:
  Original hash     64
  64 - log2(S)      39
  Low-FP hash        8
  Hash + rank       11
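Local replacement within one 8-way set can be sketched with a 3-bit rank per entry, which together with an 8-bit hash gives the 11 bits per URL shown on the slide. The class name `LruSet` and the exact rank-update rule are assumptions; the slides only state that an LRU-style rank fits in 3 bits.

```python
WAYS = 8  # 8 ways, so a rank in 0..7 fits in 3 bits


class LruSet:
    """One set: an 8-bit fingerprint and a 3-bit LRU rank per way (0 = most recent)."""

    def __init__(self):
        self.fp = [None] * WAYS        # None marks an empty way
        self.rank = list(range(WAYS))  # always a permutation of 0..7

    def _touch(self, way: int):
        old = self.rank[way]
        # Entries that were more recent than this one each age by one step.
        for i in range(WAYS):
            if self.rank[i] < old:
                self.rank[i] += 1
        self.rank[way] = 0

    def lookup(self, fp: int):
        for way in range(WAYS):
            if self.fp[way] == fp:
                self._touch(way)
                return way
        return None

    def insert(self, fp: int) -> int:
        way = self.lookup(fp)
        if way is None:
            # Evict the least recently used way; untouched empty ways rank last,
            # so they are filled before any real entry is evicted.
            way = self.rank.index(WAYS - 1)
            self.fp[way] = fp
            self._touch(way)
        return way
```

Because the ranks only order entries inside a single set, the update touches contiguous entries and never needs a global list.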
HashCache: SetMem Policy
[Diagram: a URL is hashed to hash_value; t = hash_value % S selects the t-th set, which exists both in memory (11 bits per entry, LRU within the set) and on disk (Filesystem). Remaining labels: head, data.]
• Advantages
  • No seeks for most misses
  • 1 seek per read, 1 seek per write
  • Good hash + replacement in 11 bits
• Disadvantages
  • Writes still need seeks
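Putting the pieces together, the SetMem policy can be modeled as an in-memory table of fingerprints and ranks that mirrors an on-disk set-associative table. This is a sketch under assumptions, not the authors' code: the class `SetMemCache`, the truncated SHA-1 hash, the choice of fingerprint bits, and the nested lists standing in for the disk are all hypothetical.

```python
import hashlib

WAYS = 8
FP_BITS = 8


def url_hash(url: str) -> int:
    """64-bit hash of a URL (truncated SHA-1; the hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha1(url.encode("utf-8")).digest()[:8], "big")


class SetMemCache:
    """Toy SetMem model: memory holds fingerprint + rank, disk holds the objects."""

    def __init__(self, n_sets: int):
        self.n_sets = n_sets
        self.fp = [[None] * WAYS for _ in range(n_sets)]        # in memory
        self.rank = [list(range(WAYS)) for _ in range(n_sets)]  # in memory
        self.disk = [[None] * WAYS for _ in range(n_sets)]      # simulated disk
        self.seeks = 0

    def _locate(self, url: str):
        h = url_hash(url)
        return h, h % self.n_sets, (h // self.n_sets) % (1 << FP_BITS)

    def _touch(self, s: int, way: int):
        old = self.rank[s][way]
        for i in range(WAYS):
            if self.rank[s][i] < old:
                self.rank[s][i] += 1
        self.rank[s][way] = 0

    def get(self, url: str):
        h, s, fp = self._locate(url)
        for way in range(WAYS):
            if self.fp[s][way] == fp:
                self.seeks += 1  # only a fingerprint match costs a seek
                stored = self.disk[s][way]
                if stored[0] == h:
                    self._touch(s, way)
                    return stored[1]
        return None  # usually decided from memory alone, with no seek

    def put(self, url: str, data: bytes):
        h, s, fp = self._locate(url)
        ways = self.fp[s]
        # Reuse a way with the same fingerprint, else evict the set's LRU way.
        way = ways.index(fp) if fp in ways else self.rank[s].index(WAYS - 1)
        ways[way] = fp
        self.disk[s][way] = (h, data)
        self.seeks += 1  # writes still need a seek
        self._touch(s, way)
```

In this model a miss whose fingerprint matches nothing in the set costs no seek, a hit costs one, and every write costs one, matching the advantages and the remaining disadvantage listed on the slide.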
Further Reducing Seeks
• Storing objects by hash can produce random reads & writes
• Restructure on-disk table
[Diagram: Disk Table.]