
Improving Hadoop MapReduce Performance on Supercomputers with JVM Reuse



  1. Improving Hadoop MapReduce Performance on Supercomputers with JVM Reuse
     Thanh-Chung Dao and Shigeru Chiba, The University of Tokyo

  2. Supercomputers
     • Expensive clusters
     • Multi-core processors
     • Large capacity of main memory
     • High-speed network
     • Focus mainly on compute-intensive applications
     • Data-intensive workloads are emerging as supercomputing problems
       • Graph processing
       • Pre-processing of simulation data

  3. MapReduce
     • Simple parallel paradigm to process large datasets
     • Hidden parallelization & communication
     • PageRank example (phases: Splitting, Mapping, Shuffling, Reducing, Result)
       Input:
         PageA → PageB, PageC
         PageB → PageC
         PageC → PageA, PageB
       Function Mapper (input: PageA → PageB, PageC):
         Begin
           N = number of outbound links
           For each outbound link
             output <Page, 1/N>
         End
       Mapping output (rank contributions):
         PageA: <PageB, 0.5>, <PageC, 0.5>
         PageB: <PageC, 1>
         PageC: <PageA, 0.5>, <PageB, 0.5>
       Shuffling is done automatically (users can ignore it) and groups the contributions by page:
         PageA: <PageA, 0.5>
         PageB: <PageB, 0.5>, <PageB, 0.5>
         PageC: <PageC, 0.5>, <PageC, 1>
       Function Reducer (input: <PageA, x_1>, …, <PageA, x_n>):
         Begin
           rank = 0
           For each item x_i
             rank += x_i
           output <PageA, rank>
         End
       Result:
         PageA 0.5
         PageB 1
         PageC 1.5
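The PageRank example on this slide can be sketched in plain Python. This is a minimal single-process illustration, not Hadoop code: the function names `mapper`, `shuffle`, `reducer`, and `run_round` are hypothetical, and the shuffle step, which Hadoop performs automatically, is written out by hand.

```python
from collections import defaultdict


def mapper(page, outlinks):
    """Emit a contribution of 1/N to every page that `page` links to."""
    n = len(outlinks)
    for target in outlinks:
        yield target, 1.0 / n


def shuffle(pairs):
    """Group values by key (done automatically by a MapReduce framework)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(page, contributions):
    """Sum all contributions received by one page."""
    rank = 0.0
    for x in contributions:
        rank += x
    return page, rank


def run_round(links):
    """Run one map, shuffle, reduce round over the whole link graph."""
    pairs = [kv for page, out in links.items() for kv in mapper(page, out)]
    return dict(reducer(page, xs) for page, xs in shuffle(pairs).items())


# The three-page graph from the slide.
links = {
    "PageA": ["PageB", "PageC"],
    "PageB": ["PageC"],
    "PageC": ["PageA", "PageB"],
}
ranks = run_round(links)
```

With the slide's input, `ranks` holds the same values as the Result column: 0.5 for PageA, 1 for PageB, and 1.5 for PageC.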

  4. Hadoop MapReduce
     • The standard MapReduce implementation
     • Provides easy-to-use MapReduce APIs
     • TCP/IP-based communication
     • Designed to run on commodity clusters
       • Lab clusters, or Amazon EC2
     • Scalability (32,000 nodes at Yahoo) & resilience
     • Written in Java

  5. Improving Hadoop MapReduce Performance on Supercomputers
     • Hadoop MapReduce is a good choice on supercomputers
       • Maturity
       • Productivity

                                          Supercomputer       Hadoop
       Resource allocation at runtime     Static              Dynamic
       (# of processes, memory, CPU)
       Communication                      MPI                 TCP/IP
       Workload                           Compute-intensive   Data-intensive

  6. Our Approach
     • JVM Reuse
       • Statically create JVM processes and dynamically allocate them to Hadoop tasks
     • Enable efficient MPI communication by Hadoop tasks
       • Statically created processes can exploit efficient MPI
       • Dynamic allocation makes it possible to use the original Hadoop implementation
     • Shorten start-up time of processes
     • Technique
       • A process pool is used to implement JVM Reuse
       • Minimize changes to the original Hadoop engine

  7. Why MPI is required for Hadoop
     • The de facto high-speed communication on supercomputers
       [Figure: throughput (Mbps) versus message size (bytes, 2^0 to 2^26) for MPI and TCP on the FX10 supercomputer; annotation: "10 times faster"]
     • Improve slow MapReduce shuffling
     • Enable Hadoop to co-host traditional MPI applications
     • Combine MPI and MapReduce models
       • Rich data analysis workflow
       • Efficient data sharing between MPI and MapReduce models
       • E.g. MPI can access data located in the Hadoop file system (HDFS)

  8. Slow MapReduce shuffling on Hadoop
     • TCP/IP-based communication
     • JVM-Bypass (Wang et al., IPDPS 2013)
     [Figure: shuffling on slave nodes. Mapping phase: MapTasks write map outputs 1 to n to local disk. Shuffling phase: an HTTP servlet server receives multiple requests at once. Reducing phase: ReduceTasks sort and merge the map outputs, then reduce.]

  9. Dynamic Process Creation on MPI
     • Discouraged on supercomputers
     • For performance reasons
       • Collective mechanism (MPI_Comm_spawn)
       • Gang scheduling (error-prone if there are not enough resources)
     • Gerbil (Xu et al., CCGrid 2015)
       • Co-hosting MPI applications on Hadoop
       • Creates processes dynamically
       • Its experiments showed significant overhead
     • Resources should be specified before running MPI applications
       • Number of processes is known (static)
       • Memory and CPU cores

  10. Dynamic Process Creation on Hadoop
      • Required
        • Resources are allocated on demand to run MapReduce applications
        • Number of processes is unknown (dynamic)

  11. Dynamic Process Creation on Hadoop (diagram, step 1): a cluster with one master node and slave nodes 1 to n.

  12. Dynamic Process Creation on Hadoop (step 2): the user submits a job, and the master sends requests to the slaves: 6 tasks to slave 1, 6 tasks to slave 2, and 8 tasks to slave n.

  13. Dynamic Process Creation on Hadoop (step 3): each slave creates processes for its tasks (6, 6, and 8 processes). Each task is run on a process.

  14. Dynamic Process Creation on Hadoop (step 4): the tasks are running on the created processes.

  15. Dynamic Process Creation on Hadoop (step 5): the processes are terminated.

  16. Dynamic Process Creation on Hadoop (step 6): job completion.

  17. Idea of Reusing
      • JVM Pool added
        • Idle JVM processes
        • Number of processes is statically fixed
      [Diagram: a master node and slave nodes 1 to n; each slave holds a JVM pool of idle processes.]

  18. Idea of Reusing (step 2): the user submits a job, and the master sends requests to the slaves (6, 6, and 8 tasks). All pooled JVM processes are still idle.

  19. Idea of Reusing (step 3): allocation. Idle JVM processes in each pool are allocated to tasks and marked busy.

  20. Idea of Reusing (step 4): the allocated JVM processes are running their tasks.

  21. Idea of Reusing (step 5): cleanup. JVM processes that ran a task are cleaned up rather than terminated.

  22. Idea of Reusing (step 6): the cleaned-up JVM processes are idle in the pool again.

  23. Idea of Reusing (step 7): job completion. Every JVM pool still holds its idle processes.
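The pool life cycle shown in these diagrams (idle, busy, running, cleanup, idle) can be modeled with a small state machine. The sketch below is only an illustration under stated assumptions: `State`, `JvmPool`, and `run_job` are hypothetical names, each slot stands in for one pre-created JVM process, and running tasks in waves when there are more tasks than slots is an assumption that the slides do not spell out.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    BUSY = "busy"        # allocated to a task, not started yet
    RUNNING = "running"
    CLEANUP = "cleanup"  # task finished, process is reset but kept alive


class JvmPool:
    """Hypothetical model of the JVM pool on one slave node."""

    def __init__(self, size):
        self.slots = [State.IDLE] * size
        self.processes_created = size  # fixed at start-up, never grows

    def allocate(self, n_tasks):
        """Mark up to n_tasks idle slots as busy and return their indices."""
        idle = [i for i, s in enumerate(self.slots) if s is State.IDLE]
        chosen = idle[:n_tasks]
        for i in chosen:
            self.slots[i] = State.BUSY
        return chosen

    def advance(self, indices, new_state):
        """Move the given slots to the next life-cycle state."""
        for i in indices:
            self.slots[i] = new_state


def run_job(pool, n_tasks):
    """Run n_tasks on the pool in waves; return the number of waves used."""
    waves = 0
    remaining = n_tasks
    while remaining > 0:
        chosen = pool.allocate(remaining)
        if not chosen:
            raise RuntimeError("no idle process available in the pool")
        for state in (State.RUNNING, State.CLEANUP, State.IDLE):
            pool.advance(chosen, state)
        remaining -= len(chosen)
        waves += 1
    return waves
```

For example, `run_job(JvmPool(3), 6)` needs two waves, and the pool never creates a process beyond the three it started with. That fixed process count is the property that distinguishes this scheme from creating one process per task.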

  24. JVM Reuse enables MPI communication
      • MPI communication is established at the beginning
      • JVM Reuse keeps processes running
      • MPI connection is always available

  25. JVM Reuse shortens start-up time
      • JVM start-up flow of Program A (Cmd: java A)
        1. OS-level process creation
        2. Class loader subsystem: class loading of A, then class linking (verification & initializing)
        3. Invoke main() method of A
        4. Execution engine (JIT compiler): execute A instructions
      • After A finishes, Program B wants to reuse the JVM of A
        • Process creation & class loader are skipped
        • Invoke main() method of B, then execute B instructions

  26. Iterative jobs benefit from JVM Reuse
      • Iterative jobs
        • Many short-running JVM processes
        • PageRank is an example
      • Iterative job flow: initial data → pre-processing job → Job A (Map, Reduce) → stop condition? If yes, then quit; if no, run Job A (Map, Reduce) again. Maps use the results of the previous iteration.
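The iterative job flow can be sketched as a driver loop that reruns the same map and reduce round until a stop condition holds. This is a single-process illustration, not Hadoop code. The names `map_phase`, `reduce_phase`, and `iterate` are hypothetical, as are the convergence tolerance and the initial rank of 1 per page; the pre-processing job is omitted, and each page is assumed to split its current rank evenly over its outbound links.

```python
def map_phase(links, ranks):
    """Each page sends rank/N to every page it links to."""
    for page, outlinks in links.items():
        share = ranks[page] / len(outlinks)
        for target in outlinks:
            yield target, share


def reduce_phase(pairs, pages):
    """Sum the contributions received by each page."""
    new_ranks = {page: 0.0 for page in pages}
    for page, contribution in pairs:
        new_ranks[page] += contribution
    return new_ranks


def iterate(links, tol=1e-9, max_iters=1000):
    """Rerun the job until ranks change by less than tol (the stop condition)."""
    ranks = {page: 1.0 for page in links}  # assumed initial data
    for iteration in range(1, max_iters + 1):
        # Maps use the results of the previous iteration.
        new_ranks = reduce_phase(map_phase(links, ranks), links)
        delta = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if delta < tol:
            return ranks, iteration
    return ranks, max_iters


links = {
    "PageA": ["PageB", "PageC"],
    "PageB": ["PageC"],
    "PageC": ["PageA", "PageB"],
}
final_ranks, iterations = iterate(links)
```

In Hadoop, every pass through this loop is a separate short job. Without reuse, each pass pays the process start-up cost again, which is why iterative jobs are the case the slide highlights.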
