A Convenient Framework for Efficient Parallel Multipass Algorithms



SLIDE 1

A Convenient Framework for Efficient Parallel Multipass Algorithms Markus Weimer

Joint Work with Sriram Rao and Martin Zinkevich

SLIDE 2

Intro / Point of view taken

  • ML is data compression: from large training data to a small model
  • We typically iterate over the training data
  • The state shared between iterations is relatively small: O(model)

  ⇒ Many algorithms can be expressed as data-parallel loops with synchronization
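A minimal sketch of such a data-parallel loop with synchronization; all names here are illustrative stand-ins, not the framework's actual API:

```python
# Illustrative sketch of a data-parallel multipass loop. `partitions` holds
# the per-worker data shards; `work` and `aggregate` are stand-in callables.

def multipass(partitions, work, aggregate, state, num_passes):
    """Run num_passes data-parallel passes with a synchronization barrier."""
    for _ in range(num_passes):
        # Each partition is processed independently (in parallel in practice).
        partials = [work(part, state) for part in partitions]
        # Synchronization point: combine the small per-partition states.
        state = aggregate(partials)
    return state
```

The point is that only the small shared state, O(model), crosses the synchronization barrier; the large training data never moves.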

12/11/10 2

SLIDE 3

In MapReduce


Overhead per Iteration:

  • Job setup
  • Data Loading
  • Disk I/O

[Diagram: result data is passed through disk between jobs on each pass]

SLIDE 4

Worker/Aggregator


Advantages:

  • Schedule once per Job
  • Data stays in memory
  • P2P communication

[Diagram: Worker/Aggregator data flow, from initial data to final result]

SLIDE 5

Worker

  1. Load data
  2. Iterate:
     1. Iterate over the data
     2. Communicate state
     3. Wait for the input state of the next pass


SLIDE 6

Worker

  1. Load data
  2. Iterate:
     1. Iterate over the data ← user-supplied function
     2. Communicate state
     3. Wait for the input state of the next pass
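The worker's life cycle can be sketched as below; `iterate` stands in for the user-supplied function, and the other callables are hypothetical helpers, not the framework's actual API:

```python
# Sketch of a worker's life cycle. Only `iterate` is user code; the
# communication helpers are hypothetical stand-ins for the framework.

def run_worker(load_data, iterate, send_state, receive_state, num_passes):
    data = load_data()                 # 1. load data once, keep it in memory
    state = receive_state()            # initial state from the aggregator
    for _ in range(num_passes):        # 2. iterate:
        state = iterate(data, state)   #    2.1 pass over local data (user code)
        send_state(state)              #    2.2 communicate state to the aggregator
        state = receive_state()        #    2.3 wait for the next pass's input state
    return state
```

Because the data is loaded exactly once, the per-pass cost is just the local iteration plus the state exchange.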


SLIDE 7

Aggregator

  • Receive state from the workers
  • Aggregate state
  • Send state to all workers


SLIDE 8

Aggregator

  • Receive state from the workers
  • Aggregate state ← user-supplied function
  • Send state to all workers
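The aggregator side is symmetric; in this sketch `aggregate` is the user-supplied function and the communication helpers are hypothetical stand-ins:

```python
# Sketch of the aggregator loop: receive one state per worker, combine
# them with the user-supplied `aggregate`, broadcast the result back.

def run_aggregator(receive_from_workers, aggregate, broadcast, num_passes):
    for _ in range(num_passes):
        worker_states = receive_from_workers()  # one state per worker
        state = aggregate(worker_states)        # user-supplied combination
        broadcast(state)                        # send result to all workers
    return state
```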


SLIDE 9

Failure Handling in the Framework

  • Worker

    › SGD: losing a worker is tolerable ("meh")
    › Otherwise: restart on a different machine

  • Aggregator

    › Restart on a different machine
    › Re-request data from the workers


SLIDE 10

Experiments: Parallel Stochastic Gradient Descent

  • Work(): one stochastic gradient descent pass
  • Aggregate(): average the models
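As a sketch of these two plug-ins for a toy one-dimensional least-squares model (the loss, data layout, and learning rate here are simplifying assumptions, not details from the slides):

```python
# Work(): one SGD pass over this worker's local examples.
# Aggregate(): average the per-worker models.
# Toy setting: scalar weight w, squared loss 0.5 * (w*x - y)^2.

def sgd_pass(examples, w, eta):
    """One stochastic gradient descent pass over (x, y) pairs."""
    for x, y in examples:
        grad = (w * x - y) * x  # gradient of 0.5 * (w*x - y)^2 w.r.t. w
        w -= eta * grad
    return w

def average_models(models):
    """Aggregate(): average the models returned by the workers."""
    return sum(models) / len(models)
```

Each pass, every worker runs `sgd_pass` on its shard, and the aggregator broadcasts the averaged model back as the next pass's starting point.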


SLIDE 11

Does it work? – Objective over #Passes


[Plot: objective (0.1 to 0.8) over passes 1 to 91 for Parallel (eta=0.8), Sequential (eta=0.1), Sequential (eta=0.8), and Parallel (eta=6.4)]

SLIDE 12

Is it fast? – Time per pass (8 machines)

Setting            Time per pass (normalized)
Sequential         1.00
MapReduce          0.45
W/A, 10 passes     0.06
W/A, 100 passes    0.03
W/A, limit         0.03


SLIDE 13

Markus Weimer
Yahoo! Labs
weimer@yahoo-inc.com
