HashCache: Cache Storage for the Next Billion
Anirudh Badam KyoungSoo Park Vivek S. Pai Larry L. Peterson
Princeton University
Next Billion Internet Users
Schools, urban middle class in developing regions
near future
$200
$1500 per month
resource
bandwidth requirement
cheap
Internet
Cache
70 seeks per sec
Figure: Performance (req/sec/disk) plotted against Gigabytes/Dollar; labels HashCache, Current, Squid, Tiger; tick values 1 to 4 and 70 to 350; "Better" marks the preferred direction.
In-memory Hashtable
Circular Log
Functionality             Implementation Choice                          Squid (Bits)  Tiger (Bits)
Existence Identification  Hashtable Chaining Pointers                    96            96
                          Hash                                           160           32
Replacement Policy        LRU List Pointers                              64            64
Location Information      Disk Offset, Version Number, etc.                            40
Other                     Expiration Date, Size, HTTP header info, etc.  240
Total                                                                    560           232
In-memory Index
reduce the dependency
for seeks
information
key lookup
the keys
Diagram: hash_value % N selects the t-th block among the N contiguous blocks (Disk Table). Other labels: data, H Bits, Circular Log, head.
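The lookup labelled on this slide (hash_value % N selecting one block of the Disk Table) can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `DiskTable` class, the SHA-1 based `hash_value` helper, and the in-memory list standing in for on-disk blocks are all assumptions, and the Circular Log is left out.

```python
import hashlib


def hash_value(url: str) -> int:
    """64-bit hash of a URL (truncated SHA-1; the hash function is an assumption)."""
    return int.from_bytes(hashlib.sha1(url.encode()).digest()[:8], "big")


class DiskTable:
    """In-memory stand-in for N contiguous on-disk blocks (hypothetical helper)."""

    def __init__(self, n_blocks: int):
        self.n = n_blocks
        # Each block holds (stored_hash, data) or None when empty.
        self.blocks = [None] * n_blocks

    def put(self, url: str, data: bytes) -> int:
        h = hash_value(url)
        t = h % self.n                 # the t-th block is the only candidate location
        self.blocks[t] = (h, data)     # whatever was stored in block t is replaced
        return t

    def get(self, url: str):
        h = hash_value(url)
        t = h % self.n
        entry = self.blocks[t]         # on a real disk this is a single seek
        if entry is not None and entry[0] == h:
            return entry[1]            # stored hash matches: cache hit
        return None                    # empty block or a different object: miss
```

Because the block number is derived from the hash alone, this sketch needs no in-memory index; the cost is that every lookup, hit or miss, touches the disk.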
to disk-based
policy crosses bins
read together
read
replacement policies
pointers
Bin Pointers       32
Chaining Pointers  64
Hash               32
Total (bits)       128
same layout as the disk
Disk Table
In-memory Bitmap
H Bits Disk Block
64 bits: Original Hash
bin # (2^28 objs, 8-way, #bins = 2^25 (S))
39: 64 - log(S)
eliminate most false positives (8 bits)
8: low FP hash
disk size working set
local policies global policies
≈ ≈
64: Original Hash
39: 64 - log(S)
8: low FP hash
11: hash + rank
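The bit counts on these slides can be checked with a few lines of arithmetic. The sketch below assumes that log(S) is a base-2 logarithm and that the rank needs log2(8) = 3 bits for an 8-way set; the slides give only the totals (39, 8, and 11), so both readings are inferences. The false-positive estimate further assumes uniformly random hash bits and a full set.

```python
HASH_BITS = 64            # original hash width
OBJECTS = 2 ** 28         # objects in the cache
WAYS = 8                  # 8-way set associativity

bins = OBJECTS // WAYS                    # S = 2**25 bins
bin_bits = bins.bit_length() - 1          # log2(S) = 25, exact for powers of two
remaining_bits = HASH_BITS - bin_bits     # 64 - log(S) = 39

LOW_FP_BITS = 8                           # bits kept to reject most false positives
rank_bits = WAYS.bit_length() - 1         # assumption: 3 bits rank 8 entries
per_object_bits = LOW_FP_BITS + rank_bits # hash + rank = 11

# Chance that a lookup for an absent object matches at least one of the
# 8 stored 8-bit hashes in its set (assumes uniformly random hash bits).
false_positive_rate = 1 - (1 - 2 ** -LOW_FP_BITS) ** WAYS

print(bins, remaining_bits, per_object_bits, round(false_positive_rate, 4))
```

The bin number never has to be stored because it is implied by where the entry sits, which is why only the 8-bit hash and the rank remain per object.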
Diagram: hash_value % S selects the t-th set. Labels: data, Filesystem, head, Memory, 11 Bits, LRU.
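One way to picture the set-based in-memory index on this slide is the Python sketch below. It is an illustration under stated assumptions, not the authors' code: `SetMemIndex`, `hash64`, the choice of the top 8 hash bits as the in-memory hash, and the nested lists standing in for the on-disk sets are all hypothetical. Each in-memory entry holds only a short hash and an LRU rank; the full hash is checked only after a disk read.

```python
import hashlib


def hash64(url: str) -> int:
    """64-bit hash of a URL (truncated SHA-1; an assumption)."""
    return int.from_bytes(hashlib.sha1(url.encode()).digest()[:8], "big")


class SetMemIndex:
    def __init__(self, num_sets: int, ways: int = 8):
        self.num_sets, self.ways = num_sets, ways
        self.mem = [[None] * ways for _ in range(num_sets)]   # (tag, rank) per slot
        self.disk = [[None] * ways for _ in range(num_sets)]  # (full_hash, data) per slot
        self.disk_reads = 0

    def _touch(self, t: int, slot: int) -> None:
        """Give `slot` rank 0 (most recent) and age the entries that were newer."""
        tag, old_rank = self.mem[t][slot]
        for i, e in enumerate(self.mem[t]):
            if e is not None and e[1] < old_rank:
                self.mem[t][i] = (e[0], e[1] + 1)
        self.mem[t][slot] = (tag, 0)

    def _find(self, t: int, tag: int, h: int):
        """Return the slot holding hash h in set t, reading disk only on a tag match."""
        for slot, e in enumerate(self.mem[t]):
            if e is not None and e[0] == tag:
                self.disk_reads += 1
                if self.disk[t][slot][0] == h:
                    return slot
        return None

    def get(self, url: str):
        h = hash64(url)
        t, tag = h % self.num_sets, h >> 56
        slot = self._find(t, tag, h)
        if slot is None:
            return None
        self._touch(t, slot)
        return self.disk[t][slot][1]

    def put(self, url: str, data) -> None:
        h = hash64(url)
        t, tag = h % self.num_sets, h >> 56
        slot = self._find(t, tag, h)
        if slot is None:
            ranks = self.mem[t]
            if None in ranks:
                slot = ranks.index(None)                       # free slot
            else:
                slot = max(range(self.ways), key=lambda i: ranks[i][1])  # LRU victim
            self.mem[t][slot] = (tag, self.ways)               # temporary oldest rank
        self.disk[t][slot] = (h, data)
        self._touch(t, slot)
```

A lookup whose short hash matches nothing in memory returns a miss without any disk access, which is the point of keeping even a small index in RAM.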
produce random reads & writes
enables read prefetch
11: hash, rank
43: hash, rank
Squid      560
Tiger      232
HC-Basic
HC-Log     43
HC-SetMem  11
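To get a feel for these numbers, the sketch below converts them into index memory for a cache of 2^28 objects, the object count used on an earlier slide. It assumes the values are index bits per cached object, which is an inference from the Squid and Tiger figures matching the totals in the earlier bits table; HC-Basic is omitted because no value is shown for it. The function name `index_ram_mib` is hypothetical.

```python
# Assumed to be index bits per cached object, as listed on the slide.
BITS_PER_OBJECT = {
    "Squid": 560,
    "Tiger": 232,
    "HC-Log": 43,
    "HC-SetMem": 11,
}


def index_ram_mib(bits_per_object: int, num_objects: int) -> float:
    """Index size in MiB for num_objects cached objects."""
    return bits_per_object * num_objects / 8 / 2 ** 20


NUM_OBJECTS = 2 ** 28  # object count taken from the earlier hash-bits slide

for name, bits in BITS_PER_OBJECT.items():
    print(f"{name:10s} {index_ram_mib(bits, NUM_OBJECTS):8.0f} MiB")
```

Under these assumptions the index shrinks from roughly 17.5 GiB at 560 bits per object to 352 MiB at 11 bits per object.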
Diagram: hash_value % S selects the t-th set. Labels: data, Circular Log, head, LRU, 43 Bits.
Engine with plug-in policies
using storage engine
box, sharing memory
the proxy and 1000 lines for the indexing policies
implementation with non-blocking I/O
Flash Web Server. Helpers for I/O and DNS lookups
multiple disks easily and makes scaling obvious
Experiment Name  Setting                       Configuration                 Comparison
Low End          Small School using Laptop     1.4 GHz, 256 MB, 60 GB SATA   HashCache vs Squid vs Tiger
High End         ISP with High-End Server      2 GHz, 3.5 GB, 5x18 GB SCSI   HashCache-Log vs Squid vs Tiger
Large Disk       Large School with Mini-Tower  1.4 GHz, 2 GB, 2x1 TB USB     HashCache-Log vs HashCache-SetMem
Figure: Hit Rate (axis ticks 15 to 60) for HC-Basic, Squid, HC-SetMem, HC-Log, and Tiger, with a "Max" reference.
Figure: Performance (req/sec, axis ticks 75 to 300) for HC-Basic, Squid, HC-SetMem, Tiger, and HC-Log.
RAM for Index: 0%, 75%, 5%, 50%, 20%
Open Source and Commercial could index only 18 GB; HashCache could index 60 GB
Figure: Performance (req/sec, axis ticks 750 to 3,000) for Squid, Tiger, and HashCache-Log.
RAM for Index: 40%, 4%, 18%
Figure: Squid, Tiger, HC-Log, and HC-SetMem (axis ticks 1,500 to 6,000); annotations: 40 reqs/sec, 300, 300, 65.
Figure: Performance (req/sec, axis ticks 75 to 300) for HC-Basic, HC-SetMem, and HC-Log.
Index: 150 MB, 600 MB, 0 MB