Hadoop over NDN: Initial Experience and Results

Apr 15, 2023

Hadoop over NDN: Initial Experience and Results
Mathias Gibbens, Lei Ye, Chris Gniady, and Beichuan Zhang
The University of Arizona
Overview
• The research goal: apply NDN to the data center network environment to improve the storage, access, and processing of large amounts of data.
• The current work: modify Hadoop to run on top of NDN to establish a performance baseline and collect research problems. This is still a work in progress.
• The next step: design an NDN-native distributed filesystem and network mechanisms to improve system performance and resiliency.
What is Hadoop
A popular MapReduce framework for distributed storage and processing of large data sets.
Source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/
Hadoop Distributed File System
By default, data is stored in the Hadoop Distributed File System (HDFS) in units of Blocks.
HDFS replicates each Block to three different DataNodes, along with checksums to ensure data integrity.
The NameNode provides cluster-wide consistent state:
• Maintains the state of the entire HDFS.
• Handles all requests for data placement and retrieval.
• Receives heartbeats from DataNodes and initiates recovery when failures are detected.
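The replication and checksum idea can be sketched in a few lines of Python. This is an illustrative sketch only, not HDFS code: the function names `place_block` and `verify` are hypothetical, targets are chosen at random rather than by HDFS's actual placement policy, and a single CRC-32 per Block stands in for whatever checksum scheme a real deployment uses.

```python
import random
import zlib

REPLICATION_FACTOR = 3  # three replicas per Block, as described on the slide


def place_block(block, datanodes, replicas=REPLICATION_FACTOR, rng=None):
    """Pick `replicas` distinct DataNodes and pair the Block with a checksum.

    Hypothetical helper: random placement is an assumption made for brevity.
    """
    if len(datanodes) < replicas:
        raise ValueError("not enough DataNodes for the requested replication")
    rng = rng or random.Random()
    targets = rng.sample(datanodes, replicas)  # sample() never repeats a node
    checksum = zlib.crc32(block)
    return {node: (block, checksum) for node in targets}


def verify(block, checksum):
    """Return True if the stored Block still matches its checksum."""
    return zlib.crc32(block) == checksum
```

A reader of a replica would call `verify` on the bytes it received and, on a mismatch, fall back to one of the other two replicas.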
Why Hadoop over NDN
Hadoop is a complex piece of software that requires nontrivial configuration and tuning for good performance.
NDN can improve performance
• Caching, multicast, multipath, and multi-source data retrieval.
Increase resiliency and improve failure handling
• Get data from any working node that stores the data.
• Interest-Data feedback loop to quickly detect failures and adapt to them through the forwarding strategy.
Simplify implementation
• Many network-related functions are handled by NDN.
Signatures for data integrity and security
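The "get data from any working node" idea can be illustrated with a small Python sketch. It is written at the application level purely for illustration: in NDN this adaptation would be carried out by the forwarding strategy inside the network, and the names `fetch_from_any` and `fetch` are hypothetical, not part of Hadoop or any NDN library.

```python
def fetch_from_any(name, sources, fetch):
    """Try each node that stores `name` until one of them returns data.

    `fetch(node, name)` is a caller-supplied function (an assumption of this
    sketch) that returns the data or raises TimeoutError / ConnectionError.
    """
    errors = {}
    for node in sources:
        try:
            return node, fetch(node, name)
        except (TimeoutError, ConnectionError) as exc:
            # A missing reply is the feedback signal: move on to the next node.
            errors[node] = exc
    raise LookupError(f"no working source for {name}: tried {sorted(errors)}")
```

The point of the sketch is that the consumer asks for a name rather than a host, so any replica that answers is acceptable.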
Making Hadoop run on NDN
Modifying a complex piece of software is a challenging task.
• As a first step, simply convert all communication to “NDN Sockets” that use the address and port in the names.
• Future work is to make the application logic NDN-native.
Making Hadoop run on NDN
Remote Procedure Calls (RPC)
• Used between the NameNode and DataNodes.
• RPC requests and responses map naturally to NDN Interests and Data.
• A name contains the address, port, timestamp, and nonce to make it unique.
TCP data transfer
• Used between DataNodes for bulk data transfer.
• Writing a Block in HDFS requires two other replicas.
• The “push” model must be converted to “pull”, which becomes multicast to the replicas.
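As a rough illustration of the RPC naming described above, the following Python sketch builds a unique name from an address, port, timestamp, and nonce. The slides do not give the actual name layout, so the `/hadoop/rpc` prefix, the component order, and the function name `rpc_interest_name` are all assumptions made for this example.

```python
import secrets
import time


def rpc_interest_name(address, port, prefix="/hadoop/rpc"):
    """Build a name for one RPC request.

    Hypothetical layout: <prefix>/<address>/<port>/<timestamp-ms>/<nonce>.
    The timestamp and random nonce keep repeated requests to the same
    address and port from sharing a name.
    """
    timestamp_ms = int(time.time() * 1000)
    nonce = secrets.token_hex(4)  # 8 hex characters of randomness
    return f"{prefix}/{address}/{port}/{timestamp_ms}/{nonce}"
```

Because every request carries a fresh name, a cached response to an earlier call is not returned for a new one, which matters for RPCs whose results change over time.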
Experiments
Run a diverse set of benchmarks on two Hadoop clusters.
Writing 1 GB of data
Cache hits over 30-second bins
A missing piece: congestion control
Code changes
Conclusions
Opportunities for traffic reduction
• Caching and multicast.
Other potential benefits
• Multipath, multi-source data transfer.
• Resiliency: failure detection and recovery.
• Code simplification.
Challenges
• Routing, forwarding strategy, etc., needed to realize this potential.
Comments and Suggestions?
