

SLIDE 1

Can Non-Volatile Memory Benefit MapReduce Applications on HPC Clusters?

Md. Wasi-ur-Rahman, Nusrat Sharmin Islam, Xiaoyi Lu, and Dhabaleswar K. (DK) Panda

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA

PDSW-DISCS 2016
SLIDE 2

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design
  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 3

Introduction

  • Big Data has become one of the most important elements in business analytics
  • The rate of information growth appears to be exceeding Moore's Law
  • Every day ~2.5 quintillion (2.5×10^18) bytes of data are created
  • Big Data and High Performance Computing (HPC) are converging to meet large-scale data processing challenges
  • According to IDC, 67% of HPC centers are running High Performance Data Analysis (HPDA) workloads
  • The revenues of these workloads are expected to grow exponentially

Image sources: http://www.coolinfographics.com/blog/tag/data?currentPage=3, http://www.climatecentral.org/news/white-house-brings-together-big-data-and-climate-change-17194

SLIDE 4

Big Data Processing with Hadoop

  • Hadoop is the open-source implementation of the MapReduce programming model for Big Data analytics (a canonical mapper example follows below)
  • Major components

– HDFS
– MapReduce

  • The underlying Hadoop Distributed File System (HDFS) can be used by both MapReduce and end applications

[Diagram: Hadoop Framework — User Applications and Hadoop Common (RPC) layered over the MapReduce and HDFS components]
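To ground the programming model, here is the canonical word-count mapper from Hadoop's Java MapReduce API — the standard introductory example, shown only as context (it is not code from this talk):

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Canonical word-count mapper: emits (word, 1) for every token in a line.
public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
        }
    }
}
```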

SLIDE 5

Drivers of Modern HPC Cluster Architectures

[Photos: Tianhe-2, Titan, Stampede, Gordon]

  • Multi-core/many-core technologies
  • Remote Direct Memory Access (RDMA)-enabled networking (InfiniBand and RoCE)
  • Solid State Drives (SSDs), Non-Volatile Random-Access Memory (NVRAM), Parallel File Systems
  • Accelerators (NVIDIA GPGPUs and Intel Xeon Phi)

[Diagram: components of modern HPC clusters — accelerators/coprocessors (high compute density, high performance/watt, >1 TFlop DP on a chip), high-performance interconnects (InfiniBand: <1 µsec latency, 100 Gbps bandwidth), multi-core processors, and SSD/NVMe-SSD/NVRAM storage]

SLIDE 6

Non-Volatile Memory Trends

http://www.slideshare.net/Yole_Developpement/yole-emerging-nonvolatile-memory-2016-report-by-yole-developpement?next_slideshow=2
http://www.chipdesignmag.com/bursky/?paged=2

  • NVM devices offer DRAM-like performance characteristics with persistence, making them suitable for data processing middleware
  • The number of NVM applications is growing rapidly because of their byte-addressability and persistence features

SLIDE 7

NVM-aware HDFS

  • Our previous work, NVFS, provides NVRAM-based designs for HDFS
  • Exploits byte-addressability of NVM for communication and I/O in HDFS
  • MapReduce, Spark, and HBase can obtain better performance by utilizing NVFS as input-output storage
  • N. S. Islam, M. W. Rahman, X. Lu, and D. K. Panda, High Performance Design for HDFS with Byte-Addressability of NVM and RDMA, International Conference on Supercomputing (ICS '16), June 2016.

[Diagram: NVM and RDMA-aware HDFS (NVFS) — applications and benchmarks (Hadoop MapReduce, Spark, HBase) co-designed with NVFS (cost-effectiveness, use-case); the DFSClient's RDMA sender communicates with DataNode-side RDMA receivers and replicators, which write through NVFS-MemIO to NVM or through NVFS-BlkIO to SSDs]

SLIDE 8

MapReduce on HPC Systems


Our previous works provide designs for MapReduce with these HPC resources

SLIDE 9

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design
  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 10

Problem Statement

  • What are the possible choices for using NVRAM in the MapReduce execution pipeline?
  • How can MapReduce execution frameworks take advantage of NVRAM in such use cases?
  • Can MapReduce benchmarks and applications benefit from the use of NVRAM in terms of performance and scalability?


SLIDE 11

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design
  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 12

Key Contributions

  • Proposed a novel NVRAM-assisted Map Output Spill approach
  • Applied our approach on top of RDMA-based Hadoop MapReduce to retain enhancements in both the map and reduce phases
  • The proposed approach significantly outperforms current approaches, as demonstrated with different sets of workloads


SLIDE 13

RDMA-enhanced MapReduce

  • RDMA-based MapReduce

– RDMA-based shuffle engine
– Pre-fetching and caching of intermediate data

  • M. W. Rahman, N. S. Islam, X. Lu, J. Jose, H. Subramoni, H. Wang, and D. K. Panda, High-Performance RDMA-based Design of Hadoop MapReduce over InfiniBand, HPDIC, in conjunction with IPDPS, 2013

  • Hybrid Overlapping among Phases (HOMR)

– Overlapping among map, shuffle, and merge phases as well as shuffle, merge, and reduce phases
– Advanced shuffle algorithms with dynamic adjustments in shuffle volume

  • M. W. Rahman, X. Lu, N. S. Islam, and D. K. Panda, HOMR: A Hybrid Approach to Exploit Maximum Overlapping in MapReduce over High Performance Interconnects, ICS, 2014

These designs are incorporated into the public release of the "RDMA for Apache Hadoop" package under the HiBD project

SLIDE 14

The High-Performance Big Data (HiBD) Project

  • RDMA for Apache Spark
  • RDMA for Apache Hadoop 2.x (RDMA-Hadoop-2.x)

– Plugins for Apache, Hortonworks (HDP), and Cloudera (CDH) Hadoop distributions

  • RDMA for Apache HBase
  • RDMA for Memcached (RDMA-Memcached)
  • RDMA for Apache Hadoop 1.x (RDMA-Hadoop)
  • OSU HiBD-Benchmarks (OHB)

– HDFS, Memcached, and HBase micro-benchmarks

  • http://hibd.cse.ohio-state.edu
  • User base: 195 organizations from 26 countries
  • More than 18,600 downloads from the project site
  • RDMA for Impala (upcoming)

Available for InfiniBand and RoCE

SLIDE 15

RDMA for Apache Hadoop 2.x

  • High-performance design of Hadoop over RDMA-enabled interconnects

– High-performance RDMA-enhanced design with native InfiniBand and RoCE support at the verbs level for HDFS, MapReduce, and RPC components
– Enhanced HDFS with in-memory and heterogeneous storage
– High-performance design of MapReduce over Lustre
– Plugin-based architecture supporting RDMA-based designs for Apache Hadoop, HDP, and CDH

  • Current release: 1.1.0

– Based on Apache Hadoop 2.7.3
– Compliant with Apache Hadoop 2.7.3, HDP 2.5.0.3, and CDH 5.8.2 APIs and applications
– http://hibd.cse.ohio-state.edu

SLIDE 16

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design

– Optimization Opportunities
– NVRAM-Assisted Map Spilling

  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 17

Optimization Opportunities

  • Utilizing NVMs as PCIe SSD devices would be straightforward, as the sketch after this list shows

– Configuring the Hadoop local directories with the NVMe SSD locations
– No design changes required
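As a minimal sketch of such a configuration change (assuming a hypothetical NVMe mount point, /mnt/nvme/hadoop/local; in practice the same property would normally be set in mapred-site.xml rather than in code):

```java
import org.apache.hadoop.conf.Configuration;

public class NvmeLocalDirs {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Point MapReduce intermediate data (spill/merge files) at the
        // NVMe SSD; "/mnt/nvme/hadoop/local" is a hypothetical mount point.
        conf.set("mapreduce.cluster.local.dir", "/mnt/nvme/hadoop/local");
        System.out.println(conf.get("mapreduce.cluster.local.dir"));
    }
}
```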

[Chart: Execution time (s) with HDD, SSD, and RAMDisk as intermediate data storage]

  • Performance improvement potential with such configuration changes is not high

– Only 16% improvement for RAMDisk over HDD as intermediate data storage

  • Utilizing NVMs as NVRAM can be crucial
SLIDE 18

HOMR Design and Execution Flow

[Diagram: HOMR execution flow — map tasks (Read, Map, Spill, Merge) consume input files and produce intermediate data; reduce tasks (Shuffle, In-Mem Merge, Reduce) fetch intermediate data over RDMA and write output files. All operations are in-memory; opportunities exist to improve the performance with NVRAM.]

SLIDE 19

Profiling Map Phase

  • Map execution performance can be estimated from five different stages, listed below; a timing sketch follows the list

1. Reading input data from the file system
2. Applying the map() function
3. Serialization and partitioning
4. Spilling key-value pairs to files
5. Merging the spill files and writing the data to intermediate storage

The last two stages involve disk operations on intermediate data storage.
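A minimal sketch of how such per-stage costs can be captured; the stage methods below are hypothetical stubs standing in for Hadoop internals, used only to show where the timestamps would be taken:

```java
// Illustrative per-stage timer for the five map stages; not Hadoop code.
public class MapStageProfiler {
    static void read()    { /* read input split from the file system */ }
    static void map()     { /* apply the user map() function */ }
    static void collect() { /* serialize and partition key-value pairs */ }
    static void spill()   { /* spill sorted key-value runs to files */ }
    static void merge()   { /* merge spill files into intermediate storage */ }

    public static void main(String[] args) {
        String[] names = {"read", "map", "collect", "spill", "merge"};
        Runnable[] stages = {MapStageProfiler::read, MapStageProfiler::map,
                             MapStageProfiler::collect, MapStageProfiler::spill,
                             MapStageProfiler::merge};
        for (int i = 0; i < stages.length; i++) {
            long t0 = System.nanoTime();
            stages[i].run();
            System.out.printf("%s: %.3f ms%n", names[i],
                              (System.nanoTime() - t0) / 1e6);
        }
    }
}
```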

SLIDE 20

Profiling Map Phase

  • Profiled 20 GB Sort and TeraSort experiments on 8 nodes with default Hadoop
  • Averaged over 3 executions
  • Spill + Merge takes 1.71x more time compared to Read + Map + Collect for Sort; for TeraSort, it takes 3.75x more time

[Chart: Time (s) spent in Read + Map + Collect vs. Spill + Merge for Sort and TeraSort]

SLIDE 21

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design

– Optimization Opportunities
– NVRAM-Assisted Map Spilling

  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 22

NVRAM-Assisted Map Spilling


[Diagram: NVRAM-assisted map spilling — same HOMR flow as before (map tasks: Read, Map, Spill, Merge; reduce tasks: Shuffle, In-Mem Merge, Reduce; shuffle over RDMA), but the Spill stage now targets NVRAM]

  • Minimizes the disk operations in the Spill phase (see the sketch below)
  • Final merged output is still written to intermediate data storage to maintain similar fault tolerance
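A minimal sketch of the spill path under this design, assuming the NVRAM is exposed as a memory-mappable (e.g., DAX-mounted) file; the path /mnt/pmem/spill.buf and the 64 MB region size are hypothetical:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class NvramSpillSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical NVRAM-backed file (e.g., on a DAX-mounted pmem device).
        try (RandomAccessFile f = new RandomAccessFile("/mnt/pmem/spill.buf", "rw");
             FileChannel ch = f.getChannel()) {
            MappedByteBuffer spill = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64 << 20);
            // Spill phase: byte-addressable stores instead of disk writes.
            byte[] kv = "key\tvalue\n".getBytes(StandardCharsets.UTF_8);
            spill.put(kv);
            spill.force(); // flush the mapped region for persistence
            // Merge phase (not shown): merged output is still written to the
            // regular intermediate-data directory for fault tolerance.
        }
    }
}
```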

SLIDE 23

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design
  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 24

Experimental Setup

  • We have used SDSC-Comet for our evaluation

– 9 nodes
– 12-core Intel Xeon E5-2680 v3 (Haswell) processors
– 128 GB DDR4 DRAM
– 320 GB local SATA SSD
– 56 Gbps FDR InfiniBand

  • Software and libraries

– Hadoop 2.6.0, JDK 1.7
– RDMA-based Apache Hadoop 0.9.7


SLIDE 25

Configurations and Notations

  • Hadoop configurations used throughout the experiments
  • Notations used in the graphs

Parameter                     Value
HDFS Block Size               256 MB
HDFS Data Directory           <SSD Location>
Intermediate Data Directory   <SSD Location>
YARN Concurrent Containers    12

Hadoop Repo                                              Notation Used
Apache Hadoop                                            MR
RDMA Hadoop                                              RMR
RDMA Hadoop with NVRAM-Assisted Map Spill (this paper)   RMR-NVM

SLIDE 26

Simulating NVRAM Performance

  • Because of hardware limitations, we perform simulation to predict NVRAM performance using DRAM
  • Assumption: NVRAM write is 10x slower compared to DRAM write; NVRAM read performs similarly to DRAM read

– Flashtec NVRAM. http://www.enterprisetech.com/2014/08/06/flashtec-nvram-15-million-iops-sub-microsecond-latency
– S. Pelley, T. F. Wenisch, B. T. Gold, and B. Bridge, Storage Management in the NVRAM Era, Proc. VLDB Endow., 2013.

  • We simulate NVRAM performance by adding a delay (δ) after DRAM write operations
  • We utilize System.nanoTime() for adding a sleep to simulate δ; a minimal sketch follows
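A minimal sketch of this delay injection; the 900 ns value of δ is a placeholder for illustration, not the value used in the paper:

```java
public class NvramDelaySim {
    // Busy-wait for delta nanoseconds after a DRAM write to mimic the
    // slower NVRAM write latency; Thread.sleep() is too coarse at this scale.
    static void nvramWriteDelay(long deltaNanos) {
        long end = System.nanoTime() + deltaNanos;
        while (System.nanoTime() < end) {
            // spin until delta has elapsed
        }
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[4096];
        long t0 = System.nanoTime();
        buffer[0] = 1;          // the actual DRAM write
        nvramWriteDelay(900);   // hypothetical delta for a 10x slower write
        System.out.println("elapsed ns: " + (System.nanoTime() - t0));
    }
}
```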


SLIDE 27

Benefits in Map Phase

  • Read + Map + Collect performs similarly across different MR designs
  • Spill + Merge in RMR-NVM performs significantly better compared to both MR and RMR
  • 20 GB Sort and TeraSort experiments on 8 nodes; the RMR-NVM Map phase performs at least 2x better compared to RMR

[Charts: Time (s) for Read + Map + Collect and for Spill + Merge, for Sort and TeraSort under MR, RMR, and RMR-NVM]

SLIDE 28

Benefits in Map Phase (Contd.)

  • Profiling Map Spill Cost for different MR frameworks
  • Sort experiment with 96 maps on 8 nodes
  • Sorted spill costs for all maps; averaged over 3 iterations to minimize variation
  • Average benefit of 2.39x is achieved across all maps

[Chart: Spill cost (ms) per map task (1 to 96) for MR-IPoIB, RMR, and RMR-NVM, showing the 2.39x average benefit of RMR-NVM]

SLIDE 29

Comparison with Sort and TeraSort

  • For Sort, RMR-NVM achieves 2.37x benefit for the Map phase compared to RMR and MR-IPoIB; overall benefit is 55% compared to MR-IPoIB and 28% compared to RMR
  • For TeraSort, RMR-NVM achieves 2.48x benefit for the Map phase compared to RMR and MR-IPoIB; overall benefit is 51% compared to MR-IPoIB and 31% compared to RMR

SLIDE 30

Evaluation of Intel HiBench Workloads

  • We evaluate different HiBench workloads with Huge data sets on 8 nodes
  • Performance benefits for shuffle-intensive workloads compared to MR-IPoIB:

– Sort: 42% (25 GB)
– TeraSort: 39% (32 GB)
– PageRank: 21% (5 million pages)

  • Other workloads:

– WordCount: 18% (25 GB)
– KMeans: 11% (100 million samples)

SLIDE 31

Evaluation of PUMA Workloads

  • We evaluate different PUMA workloads on 8 nodes with 30 GB data size
  • Performance benefits for shuffle-intensive workloads compared to MR-IPoIB:

– AdjList: 39%
– SelfJoin: 58%
– RankedInvIndex: 39%

  • Other workloads:

– SeqCount: 32%
– InvIndex: 18%

SLIDE 32

Outline

  • Introduction
  • Problem Statement
  • Key Contributions
  • Opportunities and Design
  • Performance Evaluation
  • Conclusion and Future Work


SLIDE 33

Conclusion and Future Work

  • We propose an enhanced design of MapReduce with NVRAM
  • NVRAM-assisted Map Spilling provides significant performance benefits (2.73x) in the Map phase compared to previous designs
  • Overall, it achieves 55% performance benefits for Sort and 58% for SelfJoin
  • This design will be made available in the public release of the "RDMA for Apache Hadoop" package under the HiBD project (http://hibd.cse.ohio-state.edu)
  • In the future, we plan to extend other MapReduce execution frameworks (e.g., Spark, Tez) by leveraging similar design choices with NVRAM

SLIDE 34

Thank You!

{rahmanmd, islamn, luxi, panda}@cse.ohio-state.edu

Network-Based Computing Laboratory: http://nowlab.cse.ohio-state.edu/
High Performance Big Data: http://hibd.cse.ohio-state.edu/