A Convenient Framework for Efficient Parallel Multipass Algorithms



SLIDE 1

A Convenient Framework for Efficient Parallel Multipass Algorithms Markus Weimer

Joint Work with Sriram Rao and Martin Zinkevich

SLIDE 2

Intro / Point of view taken

  • ML is data compression: from large training data to a small model
  • We typically iterate over the training data
  • The state shared between iterations is relatively small: O(model)

  ⇒ Many algorithms can be expressed as data-parallel loops with synchronization
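A minimal sketch of such a data-parallel loop with synchronization; all names here are illustrative stand-ins, not the framework's actual API:

```python
# Illustrative sketch of a data-parallel multipass loop. `partitions` holds
# the per-worker data shards; `work` and `aggregate` are stand-in callables.

def multipass(partitions, work, aggregate, state, num_passes):
    """Run num_passes data-parallel passes with a synchronization barrier."""
    for _ in range(num_passes):
        # Each partition is processed independently (in parallel in practice).
        partials = [work(part, state) for part in partitions]
        # Synchronization point: combine the small per-partition states.
        state = aggregate(partials)
    return state
```

The point is that only the small shared state, O(model), crosses the synchronization barrier; the large training data never moves.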

12/11/10 2

SLIDE 3

In MapReduce


Overhead per Iteration:

  • Job setup
  • Data Loading
  • Disk I/O

[Diagram: result data is passed through disk between jobs on each pass]

SLIDE 4

Worker/Aggregator


Advantages:

  • Schedule once per Job
  • Data stays in memory
  • P2P communication

[Diagram: Worker/Aggregator data flow, from initial data to final result]

SLIDE 5

Worker

  1. Load data
  2. Iterate:
     1. Iterate over the data
     2. Communicate state
     3. Wait for the input state of the next pass


SLIDE 6

Worker

  1. Load data
  2. Iterate:
     1. Iterate over the data ← user-supplied function
     2. Communicate state
     3. Wait for the input state of the next pass
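The worker's life cycle can be sketched as below; `iterate` stands in for the user-supplied function, and the other callables are hypothetical helpers, not the framework's actual API:

```python
# Sketch of a worker's life cycle. Only `iterate` is user code; the
# communication helpers are hypothetical stand-ins for the framework.

def run_worker(load_data, iterate, send_state, receive_state, num_passes):
    data = load_data()                 # 1. load data once, keep it in memory
    state = receive_state()            # initial state from the aggregator
    for _ in range(num_passes):        # 2. iterate:
        state = iterate(data, state)   #    2.1 pass over local data (user code)
        send_state(state)              #    2.2 communicate state to the aggregator
        state = receive_state()        #    2.3 wait for the next pass's input state
    return state
```

Because the data is loaded exactly once, the per-pass cost is just the local iteration plus the state exchange.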


SLIDE 7

Aggregator

  • Receive state from the workers
  • Aggregate state
  • Send state to all workers


SLIDE 8

Aggregator

  • Receive state from the workers
  • Aggregate state ← user-supplied function
  • Send state to all workers
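The aggregator side is symmetric; in this sketch `aggregate` is the user-supplied function and the communication helpers are hypothetical stand-ins:

```python
# Sketch of the aggregator loop: receive one state per worker, combine
# them with the user-supplied `aggregate`, broadcast the result back.

def run_aggregator(receive_from_workers, aggregate, broadcast, num_passes):
    for _ in range(num_passes):
        worker_states = receive_from_workers()  # one state per worker
        state = aggregate(worker_states)        # user-supplied combination
        broadcast(state)                        # send result to all workers
    return state
```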


SLIDE 9

Failure Handling in the Framework

  • Worker

    › SGD: losing a worker is tolerable ("meh")
    › Otherwise: restart on a different machine

  • Aggregator

    › Restart on a different machine
    › Re-request data from the workers


SLIDE 10

Experiments: Parallel Stochastic Gradient Descent

  • Work(): one stochastic gradient descent pass
  • Aggregate(): average the models
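As a sketch of these two plug-ins for a toy one-dimensional least-squares model (the loss, data layout, and learning rate here are simplifying assumptions, not details from the slides):

```python
# Work(): one SGD pass over this worker's local examples.
# Aggregate(): average the per-worker models.
# Toy setting: scalar weight w, squared loss 0.5 * (w*x - y)^2.

def sgd_pass(examples, w, eta):
    """One stochastic gradient descent pass over (x, y) pairs."""
    for x, y in examples:
        grad = (w * x - y) * x  # gradient of 0.5 * (w*x - y)^2 w.r.t. w
        w -= eta * grad
    return w

def average_models(models):
    """Aggregate(): average the models returned by the workers."""
    return sum(models) / len(models)
```

Each pass, every worker runs `sgd_pass` on its shard, and the aggregator broadcasts the averaged model back as the next pass's starting point.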


SLIDE 11

Does it work? – Objective over #Passes


[Plot: objective (0.1 to 0.8) over passes 1 to 91 for Parallel (eta=0.8), Sequential (eta=0.1), Sequential (eta=0.8), and Parallel (eta=6.4)]

SLIDE 12

Is it fast? – Time per pass (8 machines)

Setting            Time per pass (normalized)
Sequential         1.00
MapReduce          0.45
W/A, 10 passes     0.06
W/A, 100 passes    0.03
W/A, limit         0.03


SLIDE 13

Markus Weimer
Yahoo! Labs
weimer@yahoo-inc.com
