

SLIDE 1

Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing

Data/compute intensive applications implemented as MapReduce “filters”

[Figure: Architecture of CGL-MapReduce]

Measured using 32 compute nodes, each with 8 cores and 16 GB of memory.

CAP3 – Gene Assembly Program

  • Compute intensive application
  • Embarrassingly parallel operation (a map-only sketch follows below)
  • All runtimes perform equally well

[Chart: CAP3 performance vs. number of reads processed]
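Because the operation is embarrassingly parallel, CAP3 maps naturally onto a map-only job: each map task simply shells out to the cap3 executable on one input file, and there is no reduce step. Below is a minimal single-node sketch in plain Java; the binary path, input directory, and .fsa file layout are illustrative assumptions, not details from the slide.

```java
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Map-only CAP3 run: one cap3 process per input file, no reduce step. */
public class Cap3MapOnly {
    // Hypothetical paths; a real deployment would stage files onto each node.
    static final String CAP3_BIN = "/opt/cap3/cap3";
    static final String INPUT_DIR = "/data/reads";

    public static void main(String[] args) throws Exception {
        File[] inputs = new File(INPUT_DIR).listFiles((dir, name) -> name.endsWith(".fsa"));
        ExecutorService pool = Executors.newFixedThreadPool(8); // 8 cores per node
        for (File f : inputs) {
            pool.submit(() -> {
                try {
                    // Each "map task" just shells out to cap3; the assembler writes
                    // its results next to the input file, so no reduce is needed.
                    new ProcessBuilder(CAP3_BIN, f.getAbsolutePath())
                            .inheritIO().start().waitFor();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS);
    }
}
```

Since each invocation is independent, any of the measured runtimes only has to schedule the invocations, which is why they all perform equally well here.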

High Energy Physics Data Analysis

  • Data intensive application
  • MapReduce style parallel operation (a histogram sketch follows below)
  • Both runtimes perform comparably well
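The HEP analysis is MapReduce-style in the classic sense: each map task histograms its own partition of event data, and the reduce step merges the partial histograms bin by bin. A minimal sketch, with the bin count and value range as illustrative assumptions:

```java
import java.util.Arrays;
import java.util.List;

/** Shape of the HEP analysis as MapReduce: map = partial histogram, reduce = merge. */
public class HepHistogram {
    static final int BINS = 100;            // illustrative bin count
    static final double MIN = 0, MAX = 10;  // illustrative value range

    /** Map: histogram one partition of event values. */
    static long[] map(double[] events) {
        long[] h = new long[BINS];
        for (double v : events) {
            int b = (int) ((v - MIN) / (MAX - MIN) * BINS);
            if (b >= 0 && b < BINS) h[b]++;
        }
        return h;
    }

    /** Reduce: merge partial histograms bin by bin. */
    static long[] reduce(List<long[]> partials) {
        long[] total = new long[BINS];
        for (long[] h : partials)
            for (int i = 0; i < BINS; i++) total[i] += h[i];
        return total;
    }

    public static void main(String[] args) {
        long[] merged = reduce(List.of(
                map(new double[] {1.5, 2.5, 2.6}),
                map(new double[] {2.4, 9.9})));
        System.out.println(Arrays.toString(merged));
    }
}
```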


SLIDE 2

Iterative MapReduce – Kmeans Clustering and Matrix Multiplication

[Figure: Iterative MapReduce algorithm for Matrix Multiplication]
[Figure: Kmeans Clustering implemented as an iterative MapReduce application]

Overhead of parallel runtimes – Matrix Multiplication

  • Compute intensive application: O(n^3)
  • Higher data transfer requirements: O(n^2)
  • CGL-MapReduce shows minimal overheads next to MPI (overhead defined below)
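The overhead plotted in the two charts on this slide is presumably the usual parallel-overhead metric: with P compute units, parallel runtime T_P, and sequential runtime T_1,

```latex
\mathrm{overhead}(P) = \frac{P \cdot T_P - T_1}{T_1}
```

so values near zero indicate near-ideal scaling.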

Overhead of parallel runtimes – Kmeans Clustering

  • O(n) calculations in each iteration
  • Small data transfer requirements: O(1)
  • With large data sets, CGL-MapReduce shows negligible overheads (a sketch of the iterative structure follows below)
  • Much higher overheads in Hadoop and Dryad
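The iterative structure that favors CGL-MapReduce looks roughly like the loop below: map tasks assign their (cached) points to the nearest centroid, the reduce step averages assignments into new centroids, and only the small centroid array is transferred per iteration. This is a single-process sketch of the algorithm's shape, not the CGL-MapReduce API:

```java
import java.util.Arrays;

/** Shape of Kmeans as iterative MapReduce: map = assign points, reduce = re-average centroids. */
public class KmeansIterative {
    static double[][] kmeans(double[][] points, double[][] centroids, int maxIter) {
        int k = centroids.length, dim = centroids[0].length;
        for (int iter = 0; iter < maxIter; iter++) {
            double[][] sums = new double[k][dim];
            long[] counts = new long[k];
            // "Map": assign each point to its nearest centroid. In CGL-MapReduce the
            // points stay cached in long-running map tasks across iterations, so only
            // the small centroid array crosses the network each round.
            for (double[] p : points) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = 0;
                    for (int j = 0; j < dim; j++)
                        d += (p[j] - centroids[c][j]) * (p[j] - centroids[c][j]);
                    if (d < bestDist) { bestDist = d; best = c; }
                }
                counts[best]++;
                for (int j = 0; j < dim; j++) sums[best][j] += p[j];
            }
            // "Reduce": each new centroid is the mean of its assigned points; the
            // small result is broadcast to start the next iteration.
            for (int c = 0; c < k; c++)
                if (counts[c] > 0)
                    for (int j = 0; j < dim; j++)
                        centroids[c][j] = sums[c][j] / counts[c];
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] pts = { {1, 1}, {1.2, 0.8}, {8, 8}, {7.9, 8.2} };
        double[][] init = { {0, 0}, {10, 10} };
        System.out.println(Arrays.deepToString(kmeans(pts, init, 10)));
    }
}
```

Hadoop and Dryad re-read the input and re-launch tasks every iteration, which is where their much higher overheads come from.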


SLIDE 3

High Performance Parallel Computing on Cloud

  • Performance of MPI on virtualized resources

    – Evaluated using a dedicated private cloud infrastructure
    – Exactly the same hardware and software configurations on bare-metal and virtual nodes
    – Applications with different communication-to-computation ratios
    – Different virtual machine (VM) allocation strategies, from 1 VM per node to 8 VMs per node

[Chart: Performance of Matrix Multiplication under different VM configurations]
[Chart: Overhead under different VM configurations for the Concurrent Wave Equation Solver]

Matrix Multiplication:

  • O(n^2) communication (n = dimension of the matrix)
  • More susceptible to bandwidth than latency
  • Minimal overheads under virtualized resources

Concurrent Wave Equation Solver:

  • O(1) communication (smaller messages)
  • More susceptible to latency (see the cost model below)
  • Higher overheads under virtualized resources
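One way to read these two results together is the standard latency-bandwidth cost model (the symbols are conventional, not from the slide): sending an m-byte message costs roughly

```latex
T(m) \approx \alpha + \frac{m}{\beta}
```

Virtualization mainly inflates the latency term α, so the solver's frequent small messages suffer, while matrix multiplication's large transfers are dominated by the bandwidth term m/β and are less affected.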

Jaliya Ekanayake {jekanaya@cs.indiana.edu}