Can Non-Volatile Memory Benefit MapReduce Applications on HPC Clusters?
Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, and Dhabaleswar K. (DK) Panda
Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
PDSW-DISCS 2016
http://www.coolinfographics.com/blog/tag/data?currentPage=3
http://www.climatecentral.org/news/white-house-brings-together-big-data-and-climate-change-17194
Hadoop Framework
– Multi-core Processors
– High Performance Interconnects: InfiniBand (<1 usec latency, 100 Gbps bandwidth)
– Accelerators / Coprocessors: high compute density, high performance/watt, >1 TFlop DP on a chip
– SSD, NVMe-SSD, NVRAM
– Tianhe-2, Titan, Stampede, Gordon
http://www.slideshare.net/Yole_Developpement/yole-emerging-nonvolatile-memory-2016-report-by-yole-developpement?next_slideshow=2
http://www.chipdesignmag.com/bursky/?paged=2
High Performance Design for HDFS with Byte-Addressability of NVM and RDMA, 24th International Conference on Supercomputing (ICS '16), Jun 2016.
[Figure: architecture diagram. Labels: Hadoop MapReduce; Co-Design (Cost-Effectiveness, Use-case); DFSClient; DataNode; RDMA Sender; RDMA Receiver; RDMA Replicator; Writer/Reader; NVFS-BlkIO; NVFS-MemIO; NVM; SSD; RDMA]
– RDMA-based shuffle engine
– Pre-fetching and caching of intermediate data
Hadoop MapReduce over InfiniBand, HPDIC, in conjunction with IPDPS, 2013
– Overlapping among map, shuffle, and merge phases as well as shuffle, merge, and reduce phases
– Advanced shuffle algorithms with dynamic adjustments in shuffle volume
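The overlap between shuffle and merge listed above can be illustrated with a toy sketch. This is not the RDMA-based design from the slides: a Python thread and a bounded in-memory queue stand in for the shuffle engine and its cache, and every name here (`fetch_segments`, `overlapped_shuffle_merge`, `cache_slots`) is hypothetical. It only shows the general idea of merging already-fetched sorted segments while later segments are still arriving.

```python
import heapq
import queue
import threading


def fetch_segments(map_outputs, out_queue):
    """Producer: stands in for fetching sorted map-output segments."""
    for segment in map_outputs:
        out_queue.put(segment)   # blocks while the bounded cache is full
    out_queue.put(None)          # sentinel: no more segments


def overlapped_shuffle_merge(map_outputs, cache_slots=2):
    """Merge sorted segments while later ones are still being fetched."""
    cache = queue.Queue(maxsize=cache_slots)  # bounded cache of fetched segments
    fetcher = threading.Thread(target=fetch_segments, args=(map_outputs, cache))
    fetcher.start()

    merged = []
    while True:
        segment = cache.get()
        if segment is None:
            break
        # Merge the newly arrived sorted segment into the running result.
        merged = list(heapq.merge(merged, segment))

    fetcher.join()
    return merged


if __name__ == "__main__":
    print(overlapped_shuffle_merge([[1, 4, 7], [2, 5], [3, 6, 8]]))
```

The bounded queue is the design choice worth noting: it lets the fetcher run ahead of the merger by a fixed number of segments, which loosely mirrors pre-fetching into a limited cache, without letting fetched data grow without bound.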
– Plugins for Apache, Hortonworks (HDP) and Cloudera (CDH) Hadoop distributions
– HDFS, Memcached, and HBase Micro-benchmarks
– High performance RDMA-enhanced design with native InfiniBand and RoCE support at the verbs-level for HDFS, MapReduce, and RPC components
– Enhanced HDFS with in-memory and heterogeneous storage
– High performance design of MapReduce
– Plugin-based architecture supporting RDMA-based designs for Apache Hadoop, HDP, and CDH
– Based on Apache Hadoop 2.7.3
– Compliant with Apache Hadoop 2.7.3, HDP 2.5.0.3, CDH 5.8.2 APIs and applications
– http://hibd.cse.ohio-state.edu
[Figure: Execution Time (s) by Intermediate Data Storage (HDD, SSD, RAMDisk)]
[Figure: MapReduce pipeline diagram. Stages: Read, Map, Spill, Merge; Shuffle, In-Mem Merge, Reduce. Label: RDMA]
[Figure: Time (s) for Sort and TeraSort, broken down into Read + Map + Collect and Spill + Merge]
[Figure: MapReduce pipeline diagram. Stages: Read, Map, Spill, Merge; Shuffle, In-Mem Merge, Reduce. Labels: RDMA, NVRAM]
– microsecond latency
VLDB Endow., 2013.
[Figure: two panels, Time (s) for the Sort and TeraSort benchmarks, comparing MR, RMR, and RMR-NVM]
[Figure: Spill Cost (ms) per map task for MR-IPoIB, RMR, and RMR-NVM; annotation: 2.39x]
[Figure annotations: 2.37x, 55%, 2.48x, 51%]