Kiran Srinivasan, Tim Bisson, Garth Goodson, Kaladhar Voruganti
Advanced Technology Group, NetApp
iDedup
Latency-aware inline deduplication for primary workloads
iDedup overview/context
[Diagram: Storage Clients -> (NFS/CIFS/iSCSI) -> Primary Storage -> (NDMP/Other) -> Secondary Storage. Callout: Dedupe exploited effectively here => 90+% savings.]
Workload/Method | Offline | Inline
Primary | NetApp ASIS, EMC Celerra, IBM StorageTank |
Secondary | (No motivation for systems in this category) | EMC DDFS, EMC Cluster, DeepStore, NEC HydraStor, Venti, SiLo, Sparse Indexing, ChunkStash, Foundation, Symantec, EMC Centera
[Diagram labels: Fragmentation with random seeks; Sequences, with amortized seeks]
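The contrast between fragmented blocks and sequences can be made concrete with a small sketch. It assumes, beyond what the slide states, that an incoming write is deduplicated only where a run of consecutive blocks matches a run of consecutive on-disk blocks, and only when that run reaches a minimum length (the threshold). The name `find_dedup_runs`, the string fingerprints, and the fingerprint-to-disk-block map `fpdb` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple


def find_dedup_runs(
    fingerprints: List[str],
    fpdb: Dict[str, int],
    threshold: int,
) -> List[Tuple[int, int, int]]:
    """Return (start_index, length, start_disk_block) for each run to dedupe.

    Assumption: fpdb maps a fingerprint to a single disk block number.
    Blocks outside the returned runs would be written out normally.
    """
    runs: List[Tuple[int, int, int]] = []
    n = len(fingerprints)
    i = 0
    while i < n:
        dbn = fpdb.get(fingerprints[i])
        if dbn is None:
            # Not a known duplicate: skip to the next block.
            i += 1
            continue
        # Extend the run while the next incoming block duplicates the
        # next consecutive block on disk.
        length = 1
        while i + length < n and fpdb.get(fingerprints[i + length]) == dbn + length:
            length += 1
        # Short runs are left alone so reads do not pay a seek per block.
        if length >= threshold:
            runs.append((i, length, dbn))
        i += length
    return runs
```

With a threshold of 1 every duplicate block is deduplicated, including isolated ones. Larger thresholds give up those isolated duplicates in exchange for longer sequential reads.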
[Diagram labels: Original; Spatial Locality; Spatial + Temporal Locality]
Dedupe metadata (FPDB)
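One way to picture the dedupe metadata is as a bounded in-memory map from block fingerprints to disk block numbers. The sketch below makes several assumptions the slide does not state: least-recently-used eviction, a capacity counted in entries rather than bytes, and SHA-256 as the fingerprint function. The class name `FingerprintCache` and its methods are hypothetical.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class FingerprintCache:
    """Bounded fingerprint -> disk block number map with LRU eviction."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries: "OrderedDict[bytes, int]" = OrderedDict()

    @staticmethod
    def fingerprint(block: bytes) -> bytes:
        # Assumption: SHA-256 stands in for whatever hash the system uses.
        return hashlib.sha256(block).digest()

    def lookup(self, fp: bytes) -> Optional[int]:
        dbn = self._entries.get(fp)
        if dbn is not None:
            # A hit marks the entry as most recently used.
            self._entries.move_to_end(fp)
        return dbn

    def insert(self, fp: bytes, dbn: int) -> None:
        self._entries[fp] = dbn
        self._entries.move_to_end(fp)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)

    def __len__(self) -> int:
        return len(self._entries)
```

Under these assumptions a smaller cache forgets older fingerprints sooner, so duplicates of data not seen recently are missed.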
[Figure: Dedupe ratio vs Thresholds, Cache sizes (Corp). X-axis: Threshold (1, 2, 4, 8). Y-axis: Deduplication ratio (%). Series: cache sizes .25 GB, .5 GB, 1 GB.]
[Figure: CDF of block request sizes (Engg, 1GB). X-axis: Request Sequence Size (Blocks). Y-axis: Percentage of Total Requests. Series: Baseline (Mean=15.8), Threshold-1 (Mean=12.5), Threshold-2 (Mean=14.8), Threshold-4 (Mean=14.9), Threshold-8 (Mean=15.4). Annotations: Least fragmentation; Max fragmentation.]
[Figure: CDF of CPU utilization samples (Corp, 1GB). X-axis: CPU Utilization (%). Y-axis: Percentage of CPU Samples. Series: Baseline (Mean=13.2%), Threshold-1 (Mean=15.0%), Threshold-4 (Mean=16.6%), Threshold-8 (Mean=17.1%).]
[Figure: CDF of client response time (Corp, 1GB). X-axis: Response Time (ms). Y-axis: Percentage of Requests. Series: Baseline, Threshold-1, Threshold-8. Annotation: Affects >2ms.]