Tradeoffs Between Synchronous and Asynchronous Execution in PowerGraph



slide-1
SLIDE 1

Tradeoffs Between Synchronous and Asynchronous Execution in PowerGraph

Joshua Send Trinity Hall 28 November, 2017

slide-2
SLIDE 2

PowerGraph [1]

  • Recall: GraphLab => PowerGraph
  • Motivation: large natural graphs

– Follow power law distribution $P(d) \propto d^{-\alpha}$

  • PowerGraph contributions

– Generalized vertex programs
– Vertex Cuts
– Parallel locking

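
To make the vertex-cut idea concrete, here is a minimal sketch of random edge placement on a star graph, the extreme case of a skewed degree distribution. It is not PowerGraph code: the function names, the star graph, and the machine count are assumptions chosen for illustration. Each edge is assigned to one machine, and a vertex is replicated on every machine that holds one of its edges.

```python
import random
from collections import defaultdict

def random_vertex_cut(edges, num_machines, seed=0):
    """Assign each edge to a random machine (hypothetical helper).

    Returns a map: vertex -> set of machines holding a replica of it.
    """
    rng = random.Random(seed)
    replicas = defaultdict(set)
    for u, v in edges:
        machine = rng.randrange(num_machines)
        # Both endpoints need a replica wherever the edge lives.
        replicas[u].add(machine)
        replicas[v].add(machine)
    return replicas

def replication_factor(replicas):
    """Average number of replicas per vertex."""
    return sum(len(ms) for ms in replicas.values()) / len(replicas)

# Star graph: hub 0 connected to 1000 leaves.
edges = [(0, leaf) for leaf in range(1, 1001)]
replicas = random_vertex_cut(edges, num_machines=8)

print("hub replicas:", len(replicas[0]))
print("replication factor:", replication_factor(replicas))
```

The hub's edges are spread over the machines instead of sitting on one of them, at the cost of replicating the hub; every leaf lives on exactly one machine.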

slide-3
SLIDE 3

PowerGraph

  • Recall: Huge array of system parameters

– Edge distribution

  • Random
  • Heuristic – oblivious (estimate from local state only)
  • Heuristic – coordinated (distributed table of vertex replication)

– Execution Strategies

  • Synchronous supersteps
  • Fully asynchronous
  • Asynchronous + serializable
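
The difference between synchronous supersteps and asynchronous execution can be sketched on a toy problem. The example below propagates the minimum vertex label along a path graph, as in connected components. It is a single-threaded simulation with made-up function names, not PowerGraph's engine: the synchronous version reads only values from the previous superstep, while the "asynchronous" version makes each update visible immediately.

```python
def sync_min_label(adj):
    """Synchronous supersteps: all vertices read the previous superstep's labels."""
    labels = {v: v for v in adj}
    supersteps = 0
    while True:
        new = {v: min([labels[v]] + [labels[u] for u in adj[v]]) for v in adj}
        supersteps += 1
        if new == labels:          # barrier: nothing changed, so we are done
            return labels, supersteps
        labels = new

def async_min_label(adj):
    """Simulated async: updates are applied in place and seen by later vertices."""
    labels = {v: v for v in adj}
    sweeps = 0
    while True:
        changed = False
        for v in adj:
            best = min([labels[v]] + [labels[u] for u in adj[v]])
            if best != labels[v]:
                labels[v] = best
                changed = True
        sweeps += 1
        if not changed:
            return labels, sweeps

# Path graph 0 - 1 - 2 - 3 - 4 - 5
n = 6
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

sync_labels, supersteps = sync_min_label(path)
async_labels, sweeps = async_min_label(path)
print("sync supersteps:", supersteps, "async sweeps:", sweeps)
```

Both versions reach the same labels, but the synchronous one moves information only one hop per superstep, so it needs more rounds on this graph. A real asynchronous engine would also pay for locking and unbatched communication, which this simulation does not model.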
slide-4
SLIDE 4

2015: PowerSwitch [2]

  • Extends PowerGraph with a new switching mode
  • Choose execution mode (sync/async) based on current problem
  • Async

– Favors CPU-heavy workload
– High communication costs (no barrier = no batching)
– Heavy contention for shared resources
– Favors problems with few active vertices at a time
– Some problems (graph coloring) only converge in Async

  • Sync

– Many active vertices and scales well with graph size
– Favors lightweight computation & heavy IO
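
The graph coloring claim can be illustrated with a small sketch. It assumes a simple greedy rule (each vertex takes the smallest color not used by its neighbors) and a 4-cycle as input; neither is taken from the PowerSwitch paper. Under synchronous updates every vertex sees the same stale state and they all pick the same color forever, while in-place updates break the symmetry.

```python
def smallest_free_color(v, adj, colors):
    """Smallest non-negative color not used by any neighbor of v."""
    used = {colors[u] for u in adj[v]}
    c = 0
    while c in used:
        c += 1
    return c

def sync_coloring(adj, max_supersteps=10):
    """All vertices recolor at once from the previous superstep's colors."""
    colors = {v: 0 for v in adj}
    for _ in range(max_supersteps):
        new = {v: smallest_free_color(v, adj, colors) for v in adj}
        if new == colors:
            return colors, True
        colors = new
    return colors, False           # gave up: still changing

def async_coloring(adj, max_sweeps=10):
    """Simulated async: each recoloring is visible to the next vertex."""
    colors = {v: 0 for v in adj}
    for _ in range(max_sweeps):
        changed = False
        for v in adj:
            c = smallest_free_color(v, adj, colors)
            if c != colors[v]:
                colors[v] = c
                changed = True
        if not changed:
            return colors, True
    return colors, False

def is_proper(adj, colors):
    return all(colors[v] != colors[u] for v in adj for u in adj[v])

cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
sync_colors, sync_converged = sync_coloring(cycle)
async_colors, async_converged = async_coloring(cycle)
print("sync converged:", sync_converged)
print("async converged:", async_converged, async_colors)
```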

slide-5
SLIDE 5

PowerSwitch

  • Instrument system to measure throughput
  • Also estimate/sample convergence rates
  • Use a neural network or online sampling to estimate throughput of the mode not currently in use
  • Switch according to some heuristics and the throughput & convergence rates
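
A switching rule of this kind might be sketched as follows. This is not PowerSwitch's actual heuristic: the `ModeEstimate` type, the product of throughput and convergence rate, and the `switch_margin` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ModeEstimate:
    throughput: float        # vertex updates per second, measured or predicted
    convergence_rate: float  # hypothetical proxy: useful progress per update

    @property
    def effective_rate(self) -> float:
        return self.throughput * self.convergence_rate

def choose_mode(current, estimates, switch_margin=1.2):
    """Switch only if the other mode looks clearly better.

    switch_margin > 1 is an assumed hysteresis factor that stands in for
    the cost of switching and avoids flapping between modes.
    """
    other = "async" if current == "sync" else "sync"
    if estimates[other].effective_rate > switch_margin * estimates[current].effective_rate:
        return other
    return current

estimates = {
    "sync": ModeEstimate(throughput=1000.0, convergence_rate=0.5),
    "async": ModeEstimate(throughput=900.0, convergence_rate=0.9),
}
print(choose_mode("sync", estimates))
```

Here the synchronous mode has higher raw throughput, but the asynchronous mode is chosen because its estimated progress per update is larger.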

slide-6
SLIDE 6

Project

  • Check results from the PowerSwitch paper (source was found online)
  • Modify heuristics/add a new parameter to manually bias execution toward one paradigm or the other
  • Their experiments were run on relatively large clusters (48 machines). Attempt running on smaller clusters and compare results

– Expect Synchronous to be used most of the time
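
One possible shape for the manual bias parameter is sketched below. The name `async_bias` and the multiplicative form are assumptions, not part of PowerSwitch: the asynchronous mode's estimated rate is scaled before the comparison, so a single knob pushes the decision either way.

```python
def pick_mode(sync_rate, async_rate, async_bias=1.0):
    """Pick an execution mode from two estimated progress rates.

    async_bias is a hypothetical knob: values above 1 favor async,
    values below 1 favor sync, and 1.0 leaves the comparison unchanged.
    """
    if async_bias <= 0:
        raise ValueError("async_bias must be positive")
    return "async" if async_rate * async_bias > sync_rate else "sync"

print(pick_mode(100.0, 90.0))                  # neutral
print(pick_mode(100.0, 90.0, async_bias=1.5))  # biased toward async
print(pick_mode(100.0, 120.0, async_bias=0.5)) # biased toward sync
```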

slide-7
SLIDE 7

Current Status

  • GraphLab/GraphChi => Turi => Apple
  • graphlab.org is no longer a valid domain; dependencies used to be hosted there
  • Have to manually modify CMakeLists to resolve these issues

slide-8
SLIDE 8

References

1) Gonzalez, Joseph E., et al. "PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs." OSDI. Vol. 12. No. 1. 2012.
2) Xie, Chenning, et al. "Sync or async: Time to fuse for distributed graph-parallel computation." ACM SIGPLAN Notices 50.8 (2015): 194-204.