Pregel: A System for Large-Scale Graph Processing

SLIDE 1

Pregel: A System for Large-Scale Graph Processing

Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski

Bogdan-Alexandru Matican

University of Cambridge

February 26, 2013

SLIDE 2

Table of contents

1 Research questions
2 Design
    Programming Model
    Usability
    Architecture
3 Experiments
4 Conclusion

SLIDE 3

Research questions

Main considerations

Typical Google systems paper.
Cross-research influences: MapReduce, Chubby, GFS, BigTable.
Scalability: process graphs of billions of vertices
Usability: paradigm, API, features
Architecture: master-slave, network aggregation, data locality
Transparency: fault tolerance, commodity machines
Performance: resources, speed, scale

SLIDE 4

Design: Programming Model

Vertex

local action: vertex and outgoing edges
message-passing communication
independent state change: synchronicity

SLIDE 5

Design: Programming Model

System

supersteps (BSP model)
message-based state alterations
aggregation
performance optimizations
fault tolerance (checkpointing)
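The superstep loop can be sketched in a few lines of single-process Python. This is only an illustration of the BSP idea, not Pregel's C++ API: the function name `run_max_value`, the dict-based graph format, and the `max_supersteps` cap are all assumptions made for the example. Every vertex runs in superstep 0, later supersteps run only vertices that received messages, and a vertex "votes to halt" simply by sending nothing.

```python
from collections import defaultdict


def run_max_value(values, edges, max_supersteps=100):
    """Propagate the largest vertex value through a directed graph.

    values: dict vertex_id -> initial value (hypothetical input format)
    edges:  dict vertex_id -> list of out-neighbour ids
    Returns the final values and the number of supersteps executed.
    """
    values = dict(values)
    inbox = {}  # messages delivered at the start of the current superstep
    superstep = 0
    while superstep < max_supersteps:
        # Superstep 0 runs every vertex; later ones only vertices with mail.
        runnable = list(values) if superstep == 0 else list(inbox)
        if not runnable:
            break  # all vertices halted and no messages in flight
        outbox = defaultdict(list)
        for v in runnable:
            best = max([values[v]] + inbox.get(v, []))
            if superstep == 0 or best > values[v]:
                values[v] = best
                for dst in edges.get(v, []):
                    outbox[dst].append(best)
            # Otherwise the vertex votes to halt by sending nothing.
        inbox = dict(outbox)  # barrier: messages become visible next superstep
        superstep += 1
    return values, superstep


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "d"], "c": ["b", "d"], "d": ["c"]}
    print(run_max_value({"a": 3, "b": 6, "c": 2, "d": 1}, graph))
```

Messages written during one superstep are only read in the next, which is what removes the need for locks inside the per-vertex logic.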

SLIDE 6

Design: Usability

API Design

simple interface for users to understand
usage-pattern driven: Combiner, Aggregator, Http
IO format variable for interoperability
fault tolerance transparent data partitioning
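As a rough illustration of the two usage-driven hooks named above: a combiner merges messages headed for the same vertex before they cross the network, and an aggregator reduces values contributed by all vertices into one global result. The names `combine_min` and `SumAggregator` and the `(destination, value)` message format are hypothetical, chosen for this sketch. Combining is only safe when the reduction is commutative and associative, as `min` and `+` are.

```python
def combine_min(outgoing):
    """Collapse messages bound for the same vertex, keeping the minimum.

    outgoing: list of (destination_id, value) pairs (assumed format).
    Returns one (destination_id, value) pair per destination, sorted by id.
    """
    combined = {}
    for dst, value in outgoing:
        combined[dst] = min(value, combined.get(dst, value))
    return sorted(combined.items())


class SumAggregator:
    """Global sum: vertices contribute during a superstep, and the
    reduced total is read once the superstep has finished."""

    def __init__(self):
        self.total = 0

    def aggregate(self, value):
        self.total += value


if __name__ == "__main__":
    print(combine_min([("b", 5), ("c", 2), ("b", 3)]))
    degrees = SumAggregator()
    for out_degree in (2, 0, 3):
        degrees.aggregate(out_degree)
    print(degrees.total)
```

With the combiner, three outgoing messages become two; on a high-degree vertex the saving is proportionally larger.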

SLIDE 7

Design: Architecture

Components and Mechanics

data sharding (graph partitioning)
Master (ids, sharding, sync, pings)
Workers (supersteps, state, buffering)
fault tolerance (checkpointing, confined recovery)
performance considerations
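Data sharding can be illustrated with the paper's default scheme, hash(ID) mod N, where N is the number of partitions. The sketch below is an assumption-laden stand-in: `partition_of` and `shard` are hypothetical names, and `zlib.crc32` is used in place of whatever hash Pregel uses, because Python's built-in `hash()` for strings varies between processes and every worker must compute the same owner for a given vertex.

```python
import zlib
from collections import defaultdict


def partition_of(vertex_id, num_partitions):
    """Map a vertex id to a partition index using a process-stable hash."""
    digest = zlib.crc32(str(vertex_id).encode("utf-8"))
    return digest % num_partitions


def shard(vertex_ids, num_partitions):
    """Group vertex ids by the partition that owns them."""
    shards = defaultdict(list)
    for v in vertex_ids:
        shards[partition_of(v, num_partitions)].append(v)
    return dict(shards)


if __name__ == "__main__":
    ids = [f"v{i}" for i in range(10)]
    for partition, members in sorted(shard(ids, 4).items()):
        print(partition, members)
```

Because ownership depends only on the id, any worker can route a message to the right partition without asking the master.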

SLIDE 8

Experiments

Scalability

Figure: Binary tree topology for 800 workers, 300 machines.

Linear scaling of runtime for binary fan-out, high vertex count.

SLIDE 9

Experiments

Scalability

Figure: Social graph topology for 800 workers, 300 machines.

Linear scaling of runtime for relatively sparse graphs with instances of high density.

SLIDE 10

Experiments

Notes

naive implementation of SSSP
no input pre-processing or special sharding
comparable results with state-of-the-art systems
scalable considerably past points shown in paper
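For reference, a naive vertex-centric SSSP looks roughly like the sketch below. It is a single-process illustration, not the benchmarked code: the function name `sssp`, the adjacency format, and the `max_supersteps` cap are assumptions, and the source is seeded with a message of 0 rather than checked inside the vertex logic. Each vertex takes the minimum incoming distance, and if that improves its current value it updates and relaxes its outgoing edges.

```python
import math
from collections import defaultdict


def sssp(edges, source, max_supersteps=1000):
    """Single-source shortest paths by message passing.

    edges: dict vertex_id -> list of (neighbour_id, weight) pairs,
           with non-negative weights (assumed input format).
    """
    vertices = set(edges) | {dst for outs in edges.values() for dst, _ in outs}
    vertices.add(source)
    dist = {v: math.inf for v in vertices}
    inbox = {source: [0]}  # seed: the source is at distance 0 from itself
    for _ in range(max_supersteps):
        if not inbox:
            break  # no messages in flight, so every vertex has halted
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            candidate = min(messages)
            if candidate < dist[v]:
                dist[v] = candidate
                for dst, weight in edges.get(v, []):
                    outbox[dst].append(candidate + weight)
        inbox = dict(outbox)  # barrier between supersteps
    return dist


if __name__ == "__main__":
    graph = {
        "s": [("a", 1), ("b", 4)],
        "a": [("b", 2), ("c", 6)],
        "b": [("c", 1)],
        "c": [],
    }
    print(sssp(graph, "s"))
```

It is naive in that a vertex may be relaxed several times as shorter paths arrive in later supersteps, unlike a priority-queue algorithm such as Dijkstra's.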

SLIDE 11

Conclusion

Contributions

programming model design simplicity
concurrency avoidance
fault tolerance
performance optimizations

SLIDE 12

Conclusion

Critique and questions

master failover mechanism?
evaluation: good enough for us
evaluation: how much faster?