SLIDE 1

D3N: A multi-layer cache for the rest of us

  • E. Ugur Kaynar, Mania Abdi, Mohammad Hossein Hajkazemi, Ata Turk, Raja Sambasivan, David Cohen, Larry Rudolph, Peter Desnoyers, Orran Krieger

SLIDE 2

Motivation

[Figure: data center network topology. Compute clusters run analytic frameworks, with storage co-located with compute; each rack sits behind a ToR switch, and cluster networks connect to the shared data center network.]

SLIDE 3

Motivation

[Figure: the same topology, but storage is now disaggregated into a data lake (object store) in its own cluster behind ToR switches, reached by compute clusters across the data center network.]

SLIDE 4

Network Limitations in the Data Center

[Figure: with the data lake remote from the compute clusters, traffic crosses links that are increasingly oversubscribed toward the core.]

More oversubscription means poor performance!

SLIDE 5

Caching for Big Data Analytics

Two Sigma [2018], Facebook [VLDB 2012], and Yahoo [2010] analytic cluster traces show:

  • High input-data reuse
  • Uneven data popularity
  • File popularity changes over time
  • Datasets are accessed repeatedly by the same analytic cluster and across different analytic clusters

Prior caching systems: Alluxio (formerly Tachyon [SOCC’14]), HDFS-Cache, PACMan [NSDI’12], Adaptive Caching [SOCC’16], Scarlett [EuroSys’11], Netco [SOCC’19]

SLIDE 6

Fundamental Goals of D3N

  • Extension of the data lake
  • Reduce demand on the network
  • Automatically adjust to:
      ○ access pattern
      ○ network contention

SLIDE 7

Design Principles

  • Transparent to users
  • Naturally scalable with the clusters that access it
  • Cache policies based purely on local information
  • Hierarchical multi-level cache

SLIDE 8

D3N’s Architecture

[Figure: cache services form multiple layers (L1, L2, L3) across the network hierarchy. L1 caches are rack-local cache servers; oversubscription increases toward the core, with the data lake (object store) at the root.]

SLIDE 9

Dynamic Cache Size Management

[Figure: a cache server whose space is split between an L1 and an L2 partition.]

The algorithm partitions the cache space based on:

  • Access pattern
  • Network congestion
SLIDE 10
Dynamic Cache Size Management

[Figure: cache server with its L1/L2 split adjusted for this scenario.]

  • High rack locality with small working set size
  • Congestion on the storage network

SLIDE 11
Dynamic Cache Size Management

[Figure: cache server with its L1/L2 split adjusted for this scenario.]

  • High rack locality
  • Congestion within the cluster network

SLIDE 12

Dynamic Cache Size Management

[Figure: cache server with its L1/L2 split.]

The algorithm partitions the cache space based on:

  • Access pattern
  • Network congestion

SLIDE 13

Dynamic Cache Size Management

The algorithm measures:

  • the reuse distance histogram
  • the mean miss latency per layer

and uses them to find the optimal cache size split.
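The slides do not show the partitioning code, but the idea above can be sketched as follows (an illustrative sketch, not D3N's implementation; all names and numbers are hypothetical): under LRU, a request hits in a cache of S blocks iff its reuse (stack) distance is at most S, so each layer's reuse-distance histogram predicts its miss count at any candidate size, and weighting misses by that layer's mean miss latency lets the algorithm pick the split with the lowest expected miss cost.

```python
def misses(hist, size):
    """Misses for an LRU cache of `size` blocks: a request misses iff its
    reuse (stack) distance exceeds the cache size; cold misses use inf."""
    return sum(count for dist, count in hist.items() if dist > size)

def best_split(total, l1_hist, l2_hist, l1_miss_lat, l2_miss_lat):
    """Enumerate every (L1, L2) split of `total` blocks and return the split
    minimizing expected miss cost = misses * mean miss latency per layer."""
    def cost(l1):
        return (misses(l1_hist, l1) * l1_miss_lat +
                misses(l2_hist, total - l1) * l2_miss_lat)
    l1 = min(range(total + 1), key=cost)
    return l1, total - l1

# Toy histograms: {reuse distance: request count}; inf marks cold misses.
inf = float('inf')
l1_hist = {1: 10, 2: 5, inf: 2}   # rack-local traffic seen by L1
l2_hist = {1: 3, 2: 4, inf: 1}    # cross-rack traffic seen by L2

# Congested storage network (L2 misses are expensive): give L2 more space.
print(best_split(3, l1_hist, l2_hist, 1.0, 10.0))   # -> (1, 2)
# Idle storage network (L2 misses are cheap): give L1 more space.
print(best_split(3, l1_hist, l2_hist, 1.0, 0.5))    # -> (2, 1)
```

The point of the sketch is the feedback loop: as congestion changes a layer's measured miss latency, the optimal split shifts automatically, which is how the cache "adapts" without global coordination.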

SLIDE 14
Edge Conditions and Failures

  • VM migration
      ○ Anycast to the DNS lookup server.
      ○ TCP sessions are kept active until the request completes.
  • Failure of a cache server
      ○ A heartbeat service keeps track of active caches.
      ○ During a failure, the lookup service directs new requests to the second-nearest L1, and the consistent hashing algorithm removes the failed node from its map.
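A minimal consistent-hash ring sketches why removing a failed node is cheap: only the blocks the dead node owned move to a new owner, while every other block keeps its mapping. This is an illustrative toy (class names and parameters are hypothetical), not the Ceph/D3N code.

```python
import hashlib

class Ring:
    """Toy consistent-hash ring: blocks map to the first virtual node
    clockwise of their hash; removing a failed node (e.g. when its
    heartbeats stop) only remaps the blocks that node owned."""
    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []                      # sorted list of (hash, node)
        for n in nodes:
            self.add(n)

    def _h(self, key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], 'big')

    def add(self, node):
        for i in range(self.vnodes):        # spread each node around the ring
            self.ring.append((self._h(f"{node}#{i}"), node))
        self.ring.sort()

    def remove(self, node):                 # heartbeat service declared it dead
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def lookup(self, block):
        h = self._h(block)
        for hv, n in self.ring:             # first vnode clockwise of h
            if hv >= h:
                return n
        return self.ring[0][1]              # wrap around past the top
```

For example, after `ring.remove("cache-b")`, every block previously owned by `cache-a` or `cache-c` still resolves to the same node; only `cache-b`'s blocks are redirected.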

SLIDE 15

Implementation

[Figure: clients issue S3 & Swift file requests through a lookup service to RGWs on cache servers; block requests are served by the local L1 caches and a distributed L2 cache.]

  • Modification to Ceph’s RADOS Gateway: we add 2,500 lines of code.
  • Implements a two-level cache, L1 and L2.
  • Read cache
  • Write cache

○ Write-through ○ Write-back (today no redundancy)

  • Stores cached data in 4 MB blocks as individual files on an SSD-backed file system.
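The read path described above can be sketched in a few lines (a Python stand-in for the C++ RGW code; the class, its fields, and the plain-hash placement are hypothetical simplifications, and the write path is omitted): an object read is split into 4 MB block requests, and each block is served from the rack-local L1 if present, otherwise from the L2 node it hashes to, otherwise fetched from the data lake and cached on the way back.

```python
BLOCK = 4 * 2**20    # D3N stores cached data in 4 MB blocks

class CacheServer:
    def __init__(self, l1, l2_ring, data_lake):
        self.l1 = l1                 # dict: rack-local L1 (files on SSD in D3N)
        self.l2_ring = l2_ring       # list of dicts: distributed L2 cache nodes
        self.data_lake = data_lake   # callable: fetch one block from the object store

    def read(self, obj, size):
        """Split an object read into 4 MB block requests."""
        return b"".join(self.read_block(f"{obj}:{off}")
                        for off in range(0, size, BLOCK))

    def read_block(self, key):
        if key in self.l1:                                # L1 hit: serve locally
            return self.l1[key]
        # Pick the L2 owner; D3N uses consistent hashing, a plain hash here.
        l2 = self.l2_ring[hash(key) % len(self.l2_ring)]
        if key not in l2:                                 # L2 miss: go to the data lake
            l2[key] = self.data_lake(key)
        self.l1[key] = l2[key]                            # fill L1 on the way back
        return self.l1[key]
```

A repeated read of the same block then touches only the local SSD, which is the mechanism behind the backend-traffic reductions reported in the evaluation.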

SLIDE 16

Evaluation of D3N: Value of Multi-level (Micro-benchmarks)

SLIDE 17

Evaluation of D3N: Value of Multi-level (Micro-benchmarks)

A multi-layer cache provides higher throughput than a single-layer cache:

  • D3N saturates NVMe SSDs and 40 GbE NICs.
  • Read throughput is increased by 5x.
  • The write-through policy imposes a small overhead.
  • The write-back policy increases throughput by 9x.

SLIDE 18

Evaluation of Cache Management

[Figures: adaptability to different access patterns; adaptability to network load changes.]

SLIDE 19

Evaluation of Cache Management

[Figures: adaptability to different access patterns; adaptability to network load changes.]

  • D3N rapidly and automatically adjusts to changes in the workload’s access pattern and to congestion on network links.

SLIDE 20

Impact of D3N on a Realistic Workload

Workloads: Facebook traces

  • 75% reuse
  • 40 TB of data
  • Requests were randomly assigned

Benchmark:

  • Mimics concurrent Hadoop mappers
  • 144 concurrent read requests using curl

D3N: 2 cache servers, each with

  • 1.5 TB NVMe SSDs (RAID 0)
  • Fast NIC: 2 x 40 Gbit & slow NIC: 2 x 6 Gbit

Data lake:

  • Ceph (90 HDDs)

[Figure: benchmark setup, with the Facebook trace driving the Hadoop benchmark against D3N in front of Ceph.]

SLIDE 21

Impact of D3N on a Realistic Workload

[Figure: vanilla Ceph vs. D3N. Trace completion time improves (reported gains: 2.4x, 3x, 25%); cumulative data transferred from back-end storage drops from 23 TB to 5 TB.]

D3N improves performance significantly, with more than a 4x reduction in backend traffic.

SLIDE 22

Concluding Remarks

Proposed transparent multi-layer caching:

  • An extension of the data lake
  • Implemented a two-layer prototype

Results:

  • The cache partitioning algorithm dynamically adapts to changes
  • Reduces network demand datacenter-wide
  • Improves the performance of analytic workloads

Red Hat is currently productizing D3N.

  • https://github.com/ekaynar/ceph

Project websites:

  • https://www.bu.edu/rhcollab/projects/d3n/
  • https://massopen.cloud/d3n/

Thank you