D3N: A multi-layer cache for the rest of us
E. Ugur Kaynar, Mania Abdi, Mohammad Hossein Hajkazemi, Ata Turk, Raja Sambasivan, David Cohen, Larry Rudolph, Peter Desnoyers, Orran Krieger
Motivation
[Figures: a data center network connecting cluster networks, ToR switches, and compute clusters running analytic frameworks; storage shown first alongside compute, then as a separate data lake (object store); annotated "More Oversubscription".]
Two Sigma [2018], Facebook [VLDB 2012], and Yahoo [2010] analytic cluster traces show that:
clusters and between different analytic clusters
Alluxio (formerly known as Tachyon [SoCC’14]), HDFS-Cache, PACMan [NSDI’12], Adaptive Caching [SoCC’16], Scarlett [EuroSys’11], Netco [SoCC’19]
Cache services: multiple cache layers across the network hierarchy
[Figure: L1, L2, and L3 cache layers on rack-local cache servers in front of the data lake (object store); annotated "More Oversubscription".]
The algorithm partitions the cache space based on:
[Figure: Client; RGWs on cache servers; file request and block request paths; local L1 caches; distributed L2 cache; lookup service]
S3 & Swift
○ Write-through
○ Write-back (today no redundancy)
individual files on an SSD-backed file system.
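The read path named on this slide (file request, block requests, local L1 caches, distributed L2 cache, lookup service) can be pictured with a minimal sketch. Everything here is an assumption made for illustration, not D3N's or RGW's actual code: the `CacheServer` class, the tiny `BLOCK_SIZE`, the hash-based `l2_owner` standing in for the lookup service, and the choice to let the L2 owner fetch from the data lake on a miss. Eviction and the write path are left out.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # bytes; deliberately tiny, hypothetical value for illustration

BlockKey = Tuple[str, int]  # (object name, block index)


class CacheServer:
    """Hypothetical cache server holding a local L1 and one shard of the L2."""

    def __init__(self, name: str, data_lake: Dict[str, bytes]) -> None:
        self.name = name
        self.data_lake = data_lake            # stand-in for the object store
        self.l1: Dict[BlockKey, bytes] = {}   # local L1 cache
        self.l2: Dict[BlockKey, bytes] = {}   # this server's share of the L2
        self.peers: List["CacheServer"] = []  # all servers forming the L2
        self.backend_reads = 0                # blocks fetched from the data lake

    def l2_owner(self, key: BlockKey) -> "CacheServer":
        # Assumed lookup service: hash the block key onto the server list.
        digest = hashlib.sha256(f"{key[0]}:{key[1]}".encode()).digest()
        return self.peers[int.from_bytes(digest[:8], "big") % len(self.peers)]

    def l2_get(self, key: BlockKey) -> bytes:
        # L2 miss: the owning server reads the block from the data lake.
        if key not in self.l2:
            obj, idx = key
            self.backend_reads += 1
            self.l2[key] = self.data_lake[obj][idx * BLOCK_SIZE:(idx + 1) * BLOCK_SIZE]
        return self.l2[key]

    def read_block(self, key: BlockKey) -> bytes:
        # Try the local L1 first, then the L2 owner; fill L1 on the way back.
        if key not in self.l1:
            self.l1[key] = self.l2_owner(key).l2_get(key)
        return self.l1[key]

    def read_file(self, obj: str, size: int) -> bytes:
        # A file request is split into fixed-size block requests.
        n_blocks = (size + BLOCK_SIZE - 1) // BLOCK_SIZE
        return b"".join(self.read_block((obj, i)) for i in range(n_blocks))


lake = {"trace.dat": b"0123456789"}
a, b = CacheServer("a", lake), CacheServer("b", lake)
a.peers = b.peers = [a, b]
first = a.read_file("trace.dat", 10)   # cold read: every block comes from the lake
second = b.read_file("trace.dat", 10)  # served from the shared L2, no lake reads
```

In this sketch a second server reading the same object adds no back-end reads, because each block already sits in the L2 shard of its owner.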
Value of Multi-level Micro Benchmarks
A multi-layer cache provides higher throughput than a single-layer cache.
and 40 GbE NICs
by 5x.
a small overhead.
the throughput by 9x.
Adaptability to different access patterns
Adaptability to network load changes
in workload access pattern and congestion on network links.
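One way to picture adapting the cache split to access pattern and link congestion is a brute-force search for the L1/L2 split with the lowest expected read latency. This is an illustrative sketch only and is not presented as D3N's algorithm: the function `best_split`, the hit-ratio curves, and all latency numbers are hypothetical, and a real system would measure them rather than hard-code them.

```python
from typing import Callable, Tuple


def best_split(total_blocks: int,
               l1_hit: Callable[[int], float],
               l2_hit: Callable[[int], float],
               lat_l1: float, lat_l2: float, lat_backend: float,
               step: int = 1) -> Tuple[int, float]:
    """Return (l1_blocks, expected_latency) for the cheapest L1/L2 split.

    l1_hit(n): hit ratio of an L1 cache with n blocks (hypothetical curve).
    l2_hit(n): hit ratio, among L1 misses, of an L2 cache with n blocks.
    """
    best = (0, float("inf"))
    for l1_blocks in range(0, total_blocks + 1, step):
        l2_blocks = total_blocks - l1_blocks
        h1 = l1_hit(l1_blocks)
        h2 = l2_hit(l2_blocks)
        # Expected latency: L1 hit, else L2 hit, else back-end read.
        expected = h1 * lat_l1 + (1 - h1) * (h2 * lat_l2 + (1 - h2) * lat_backend)
        if expected < best[1]:
            best = (l1_blocks, expected)
    return best


# Hypothetical linear hit-ratio curves standing in for the access pattern.
l1_curve = lambda n: 0.5 * n / 100
l2_curve = lambda n: 0.8 * n / 100

# Uncongested path to L2 (latency 2) versus congested path (latency 10).
quiet = best_split(100, l1_curve, l2_curve, 1.0, 2.0, 10.0, step=10)
congested = best_split(100, l1_curve, l2_curve, 1.0, 10.0, 10.0, step=10)
```

With these made-up inputs the search gives all space to L2 while the path to L2 is fast, and shifts all of it to L1 once reaching L2 costs as much as reaching the back end.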
Workloads: Facebook Traces
Benchmark:
D3N: 2 cache servers each have
Data lake:
[Figure labels: Ceph, D3N, Hadoop Benchmark, Facebook Trace]
The trace completion time [figure annotations: 2.4x, 3x, 25%]
D3N improves performance significantly, with more than a 4x reduction in back-end traffic.
Cumulative data transferred from back-end storage: 23 Tb (Vanilla) vs. 5 Tb (D3N)
Proposed a transparent multi-layer caching
Results:
Red Hat is currently productizing D3N.
Project Websites