Pfimbi: Accelerating Big Data Jobs Through Flow-Controlled Data Replication
Simbarashe Dzinamarira* Florin Dinu▵ T. S. Eugene Ng*
*Rice University, ▵EPFL
1
Management & Monitoring (Ambari)
Coordination (ZooKeeper)
Workflow & Scheduling (Oozie)
Scripting (Pig)
Machine Learning (Mahout)
Query (Hive)
Distributed Processing (MapReduce)
Distributed Storage (HDFS)
NoSQL Database (HBase)
Data Integration (Sqoop/REST/ODBC)
Image reproduced from https://www.mssqltips.com/sqlservertip/3262/big-data-basics--part-6--related-apache-projects-in-hadoop-ecosystem/
2
DATANODE
3.DATA
DATANODE
3
within 5 minutes of being written [TidyFS: USENIX ATC 2011]
DATANODE DATANODE DATANODE
4
STORAGE DEVICE STORAGE DEVICE STORAGE DEVICE
HDD HDD SSD SSD RAM HDD
SSD image from: http://www.storagereview.com/intel_ssd_525_msata_review
5
DATANODE
3.DATA
DATANODE
6
DATANODE DATANODE DATANODE
Figure: bandwidth share over time. Panels: WITHOUT FLOW CONTROL and WITH FLOW CONTROL.
7
8
DATANODE DATANODE DATANODE
DATANODE
SSD image from http://www.storagereview.com/intel_ssd_525_msata_review; Magnifier image from https://commons.wikimedia.org/wiki/File:Magnifying_glass_icon.svg
9
CLIENT Kernel Space BLOCK BUFFER
PFIMBI
Kernel Space BLOCK BUFFER
PFIMBI
Block notification Send a block
10
Replication traffic
Position 1 (weight 100): Job 1 (weight 1), Job 2 (weight 1), Job 3 (weight 1)
Position 2 (weight 1): Job 1 (weight 1), Job 2 (weight 1), Job 3 (weight 1)
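The weights on this slide can be read as a two-level weight tree: replication traffic is first divided between pipeline positions, then between jobs within each position. The sketch below works out the resulting bandwidth fractions under that reading. The pairing of numbers to nodes (100 and 1 for the positions, 1 for every job) is an assumption taken from the slide layout, `hierarchical_shares` is a hypothetical helper rather than part of Pfimbi, and the arithmetic assumes every flow always has data to send.

```python
from fractions import Fraction


def hierarchical_shares(tree):
    """Return each leaf's fraction of the total bandwidth.

    tree maps a group name to (group_weight, {leaf_name: leaf_weight}).
    Assumes every leaf is backlogged, so no share is redistributed.
    """
    total_group_weight = sum(weight for weight, _ in tree.values())
    shares = {}
    for group, (group_weight, leaves) in tree.items():
        group_share = Fraction(group_weight, total_group_weight)
        total_leaf_weight = sum(leaves.values())
        for leaf, leaf_weight in leaves.items():
            shares[(group, leaf)] = group_share * Fraction(leaf_weight, total_leaf_weight)
    return shares


# Weights as read from the slide (assumed pairing of numbers to nodes).
weights = {
    "Position 1": (100, {"Job 1": 1, "Job 2": 1, "Job 3": 1}),
    "Position 2": (1, {"Job 1": 1, "Job 2": 1, "Job 3": 1}),
}

shares = hierarchical_shares(weights)
for (position, job), share in shares.items():
    print(f"{position} / {job}: {share} ({float(share):.4f})")
```

Under these assumed weights each job's Position 1 flow receives 100/303 of the replication bandwidth and each Position 2 flow receives 1/303, so earlier pipeline positions are strongly favored while jobs at the same position are treated equally.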
11
Tap image from https://image.freepik.com/free-icon/bathroom-tap-silhouette_318-63404.png
Incoming Data Block Buffer Synchronous data Asynchronous data Monitoring Activity Buffer Cache
12
OS threshold for flushing buffered data: T
Threshold for asynchronous replication: T + δ
Buffer cache
Typical values
T: 10% of RAM (~13 GB)
δ: 500 MB
Buffer cache: 20% of RAM (~26 GB)
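One plausible reading of the two thresholds is an admission rule: asynchronous replication data is let into the buffer cache only while the amount of buffered dirty data is below T + δ. That would keep slightly more than T buffered so the OS keeps flushing, while leaving the rest of the cache free for synchronous data. The sketch below illustrates that reading with the typical values from the slide. The rule itself, the decimal byte units, and the names `BufferThresholds` and `admit_async_block` are assumptions for illustration, not Pfimbi's actual code.

```python
from dataclasses import dataclass

GB = 10 ** 9  # decimal units, assumed for simplicity
MB = 10 ** 6


@dataclass(frozen=True)
class BufferThresholds:
    flush_threshold: int  # T: the OS starts flushing dirty data above this
    delta: int            # δ: extra headroom for asynchronous replication

    @property
    def async_limit(self) -> int:
        """T + δ, the threshold for asynchronous replication."""
        return self.flush_threshold + self.delta


def admit_async_block(dirty_bytes: int, thresholds: BufferThresholds) -> bool:
    """Assumed rule: admit asynchronous data only below T + δ."""
    return dirty_bytes < thresholds.async_limit


# Typical values from the slide: T of about 13 GB, δ of 500 MB.
typical = BufferThresholds(flush_threshold=13 * GB, delta=500 * MB)

for dirty in (12 * GB, 13 * GB + 200 * MB, 13 * GB + 600 * MB):
    decision = "admit" if admit_async_block(dirty, typical) else "hold back"
    print(f"{dirty / GB:.1f} GB buffered: {decision} asynchronous data")
```

With these values asynchronous data is held back once roughly 13.5 GB is buffered, which is about half of the ~26 GB buffer cache.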
13
14
15
Figure: DFSIO on HDFS vs. DFSIO on PFIMBI. Configurations: HDD->HDD->HDD and SSD->HDD->HDD. Axis: completion time of replicas (s), 100 to 1000. Legend: Primary write, 1st replica, 2nd replica, Syncing dirty data.
16
Figure: Two DFSIO jobs. Configurations: Without Flow Control and With Flow Control. Axis: completion time (s), 100 to 1000. Legend: Job 1, Job 2, Remaining replication.
17
18
Figure: number of block completions (10 to 70) over the timeline of DFSIO writes (s, 200 to 1000). Panels: Equal weights and Weights in ratio 100:10:1. Legend: 1st replica, 2nd replica, 3rd replica.
19
20
21