
Graph Computation on Computer Cluster? Steep learning curve Cost - PowerPoint PPT Presentation


  1. MMap
 Fast Billion-Scale Graph Computation on a PC via Memory Mapping
 Led by Zhiyuan (Jerry) Lin (Georgia Tech CS undergrad; now a first-year PhD student at Stanford)
 • MMap: Fast Billion-Scale Graph Computation on a PC via Memory Mapping. Zhiyuan Lin, Minsuk Kahng, Kaeser Md. Sabrin, Duen Horng Chau, Ho Lee, and U Kang. Proceedings of the IEEE BigData 2014 conference, Oct 27-30, Washington DC, USA.
 • Towards Scalable Graph Computation on Mobile Devices. Yiqi Chen, Zhiyuan Lin, Robert Pienta, Minsuk Kahng, Duen Horng (Polo) Chau. IEEE BigData 2014 Workshop on Scalable Machine Learning: Theory and Applications.

  2. Graph Computation on Computer Cluster?
 • Steep learning curve
 • Cost
 • Overkill for smaller graphs
 Image source: http://www.drupaltky.org/en/article/20

  3. Best-of-breed Single-PC Approaches
 • GraphChi – OSDI 2012
 • TurboGraph – KDD 2013
 What do they have in common?
 • Sophisticated Data Structures
 • Explicit Memory Management

  4. Can We Do Less?
 To get the same or better performance?
 e.g., automatic memory management, faster computation, etc.

  5. Main Idea: Memory-map the Graph

  6. Main Idea: Memory-map the Graph
 That's all!
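
To make the idea concrete, here is a minimal sketch of memory-mapping a graph stored on disk. It is not the authors' implementation. The binary layout (each edge as two little-endian 32-bit integers) and the helper names `write_edges` and `out_degrees` are assumptions made for illustration only.

```python
import mmap
import struct

# Assumed record layout (hypothetical, not taken from the paper):
# one edge = (source, target) as two little-endian unsigned 32-bit integers.
EDGE = struct.Struct("<II")


def write_edges(path, edges):
    """Write (source, target) pairs as fixed-size binary records."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def out_degrees(path, num_nodes):
    """Count out-degrees by scanning the memory-mapped edge file."""
    degrees = [0] * num_nodes
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # The mapping behaves like one large byte array. The OS pages data
        # in from disk on demand, so there is no buffer management here.
        for offset in range(0, len(mm), EDGE.size):
            src, _dst = EDGE.unpack_from(mm, offset)
            degrees[src] += 1
    return degrees
```

The scan never calls `read()` and never decides what to keep in memory; those decisions are left to the operating system's page cache.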

  7. Reminder: How to compute PageRank for a huge matrix?
 Use the power iteration method: http://en.wikipedia.org/wiki/Power_iteration
 $$p = c\,B\,p + \frac{1-c}{n}\,\mathbf{1}$$
 Iteration step:
 $$p' = c\,B\,p + \frac{1-c}{n}\,\mathbf{1}$$
 The vector p can be initialized to any non-zero vector, e.g., all "1"s.
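
The power iteration can be sketched in a few lines of Python on a small in-memory graph. This sketch makes several assumptions that the slide does not state: B is the column-normalized adjacency matrix, c is a damping factor (0.85 here), n is the number of nodes, and every node has at least one outgoing edge. The function name `pagerank_power_iteration` is made up for this example.

```python
def pagerank_power_iteration(out_links, c=0.85, iterations=100):
    """Repeat p' = c*B*p + (1-c)/n * 1 for a fixed number of iterations.

    out_links[j] lists the nodes that node j links to. B is assumed to be
    the column-normalized adjacency matrix: B[i][j] = 1/outdeg(j) if j -> i.
    """
    n = len(out_links)
    p = [1.0] * n  # any non-zero start vector works, e.g., all ones
    for _ in range(iterations):
        nxt = [(1.0 - c) / n] * n  # the (1-c)/n * 1 term
        for j, targets in enumerate(out_links):
            if not targets:
                continue  # dangling node: this sketch simply drops its mass
            share = c * p[j] / len(targets)  # one column of c*B*p
            for i in targets:
                nxt[i] += share
        p = nxt
    return p


# Nodes 0 and 1 both link to node 2; node 2 links back to node 0.
ranks = pagerank_power_iteration([[2], [2], [0]])
```

In this small graph node 2 receives the most links, so it ends up with the largest score, followed by node 0 and then node 1.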

  8. Example: PageRank (implemented using MMap)
 http://www.cc.gatech.edu/~dchau/papers/14-bigdata-mmap.pdf
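
The code shown on this slide did not survive extraction. The following is a rough Python sketch of how PageRank might stream over a memory-mapped edge file. It is not the authors' code. The edge-file layout (pairs of little-endian 32-bit integers), the function name `pagerank_mmap`, and the choice of Python are all assumptions for illustration. The sketch also assumes every node that appears as a source has its out-degree counted from the same file.

```python
import mmap
import struct

# Assumed record layout (hypothetical): (source, target) as two uint32 values.
EDGE = struct.Struct("<II")


def pagerank_mmap(path, n, c=0.85, iterations=100):
    """Power iteration that re-reads edges from a memory-mapped file.

    Only the rank vectors and out-degree counts live in Python lists; the
    edges stay in the mapped file and are paged in by the OS as needed.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        size = len(mm)

        # Pass 1: out-degrees, needed to column-normalize the adjacency matrix.
        out_deg = [0] * n
        for off in range(0, size, EDGE.size):
            src, _dst = EDGE.unpack_from(mm, off)
            out_deg[src] += 1

        # Repeated passes: p' = c*B*p + (1-c)/n * 1, one edge at a time.
        p = [1.0] * n
        for _ in range(iterations):
            nxt = [(1.0 - c) / n] * n
            for off in range(0, size, EDGE.size):
                src, dst = EDGE.unpack_from(mm, off)
                nxt[dst] += c * p[src] / out_deg[src]
            p = nxt
    return p
```

Each iteration is a sequential pass over the mapped file, which is the access pattern that OS read-ahead tends to favor.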

  10. Why Does Memory Mapping Work?
 • High-degree nodes' info is automatically cached/kept in memory for future frequent access
 • Read-ahead paging preemptively loads edges from disk
 • Highly optimized by the OS
 • No need to explicitly manage memory (less book-keeping)
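
As a small illustration of the read-ahead point, a program can optionally tell the OS that it will scan a mapping sequentially. The sketch below uses Python's `mmap.madvise` with `MADV_SEQUENTIAL`, which exist only on some platforms and on Python 3.8 or later, so the call is guarded. The hint is optional and is not something the slides say MMap uses. The edge layout and the function name `max_node_id` are assumptions for illustration.

```python
import mmap
import struct

# Assumed record layout (hypothetical): (source, target) as two uint32 values.
EDGE = struct.Struct("<II")


def max_node_id(path):
    """Scan a memory-mapped edge file front to back; return the largest node id."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Optional hint that access will be sequential. Without it the OS
        # still applies its default read-ahead policy.
        if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
            try:
                mm.madvise(mmap.MADV_SEQUENTIAL)
            except OSError:
                pass  # the hint is advisory; ignore platforms that reject it
        largest = -1
        for off in range(0, len(mm), EDGE.size):
            src, dst = EDGE.unpack_from(mm, off)
            largest = max(largest, src, dst)
    return largest
```

The function returns the same result whether or not the hint is applied; only the paging behavior may differ.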

  11. Also works on tablets! (If you want.)
 Big Data on Small Devices (270M+ Edges)
