M-Flash: Fast Billion-Scale Graph Computation Using a Bimodal Block Processing Model (PowerPoint PPT Presentation)



SLIDE 1

M-Flash:

Fast Billion-Scale Graph Computation Using a Bimodal Block Processing Model

Hugo Gualdron, University of Sao Paulo
Robson Cordeiro, University of Sao Paulo
Jose Rodrigues-Jr, University of Sao Paulo
Duen Horng (Polo) Chau, Georgia Tech
Minsuk Kahng, Georgia Tech
U Kang, Seoul National University
Dezhi "Andy" Fang, Georgia Tech (Presenter)

SLIDE 2


Internet

4+ Billion Web Pages

Sources: www.worldwidewebsize.com, www.opte.org

SLIDE 3

Citation Network

250+ Million Articles

Sources: www.scirus.com/press/html/feb_2006.html#2; modified from well-formed.eigenfactor.org

SLIDE 4

Many More

• Twitter: who-follows-whom (310 million monthly active users)
• Who-buys-what (300+ million users)
• Cellphone network: who-calls-whom (130+ million users)
• Protein-protein interactions: 200 million possible interactions in the human genome

Sources: www.selectscience.net, www.phonedog.com, www.mediabistro.com, www.practicalecommerce.com

SLIDE 5


Large Graphs Are Common

Graph                        Nodes        Edges
YahooWeb                     1.4 Billion  6 Billion
Symantec Machine-File Graph  1 Billion    37 Billion
Twitter                      104 Million  3.7 Billion
Phone call network           30 Million   260 Million

(The edges take most of the space.)

SLIDE 6

Scalable Graph Computation on Single Machines

PageRank runtime (s) on the Twitter graph (1.5 billion edges; 10 iterations; lower is better):

Single machine (4 cores): MMap 131, TurboGraph 198, GraphChi 1248
128 cores:                GraphX 209.5, Giraph 298, Spark 428.5

Today’s single machines are very powerful.

Can we do even better?

McSherry, Frank, Michael Isard, and Derek G. Murray. "Scalability! But at what COST?" 15th Workshop on Hot Topics in Operating Systems (HotOS XV), 2015.
Lin, Zhiyuan, et al. "MMap: Fast billion-scale graph computation on a PC via memory mapping." IEEE International Conference on Big Data, 2014.

SLIDE 7

M-Flash:

Fast Billion-Scale Graph Computation Using a Bimodal Block Processing Model


SLIDE 8

Our Observation #1: I/O Is the Bottleneck

Graph edges need to be stored on disk (e.g., the Symantec graph: 37 billion edges, 200+ GB). Disk access is much slower than RAM.

Goal: Reduce I/O, especially random accesses

SLIDE 9

Our Observation #2: Real-world graphs are sparse.

The adjacency matrix contains dense and sparse blocks.

[Figure: adjacency matrix with dense blocks and sparse blocks]

Source: https://web.stanford.edu/class/bios221/labs/networks/lab_7_networks.html

SLIDE 10

M-Flash’s Solutions

• 1. Determine edge block types (dense and sparse)
• 2. Design efficient processing approaches for each block type

SLIDE 11

Determine Block Types In Pre-processing

BlockType = Sparse, if (I/O cost if treated as Sparse) / (I/O cost if treated as Dense) < 1; Dense, otherwise

[Figure: example block grid with one Dense block and three Sparse blocks]

SLIDE 12

Dense Block Processing

(Assuming all blocks are dense)

New vertex values = Adjacency matrix × Old vertex values
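The dense-block computation can be sketched as a blocked matrix-vector multiply: the γ×γ grid of edge blocks is swept row by row, so each destination interval accumulates contributions from every source interval. The sketch below is only an in-memory illustration of this access pattern (M-Flash itself is a C++ system that streams blocks from disk); the block layout and all names are assumptions.

```python
# Illustrative sketch (not M-Flash's actual implementation): one
# iteration of a vertex update, with the adjacency matrix split into a
# gamma x gamma grid of edge blocks. Processing block (i, j) reads the
# old values of source interval j and accumulates into the new values
# of destination interval i -- the access pattern of dense block
# processing.

def dense_iteration(blocks, old_vals, gamma, num_vertices):
    """blocks maps (dest_interval, src_interval) -> list of (src, dst) edges."""
    new_vals = [0.0] * num_vertices
    for i in range(gamma):                   # destination interval (row)
        for j in range(gamma):               # source interval (column)
            for src, dst in blocks.get((i, j), []):
                new_vals[dst] += old_vals[src]
    return new_vals
```

For example, with 4 vertices split into 2 intervals of 2 vertices each, block (0, 1) holds the edges whose source lies in interval 1 and whose destination lies in interval 0.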

SLIDE 13

I/O Cost for Dense Block Processing

O( ((γ + 1)·|V| + |E|) / B + γ² )

Each vertex is read γ times and then written once. |E| = # edges; B = size of each I/O operation; γ = # intervals (= # rows = # columns); |V| = # vertices.

SLIDE 14

Sparse Block Processing

(Assuming all blocks are sparse)

[Figure: source partitions 1 and 2 streamed toward the destination; the source partition is read sequentially]

SLIDE 15

Sparse Block Processing

(Assuming all blocks are sparse)

[Figure: destination partitions 1 and 2; the destination partition is written sequentially]
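The two sequential phases shown on these slides (read the source partition sequentially, then write each destination partition sequentially) can be sketched as below. This is an in-memory illustration of the access pattern only, not the on-disk implementation; the function name and the bucketing scheme are assumptions.

```python
# Illustrative sketch of sparse block processing: edges are streamed
# once while source values are read sequentially; each edge is
# "extended" with its source vertex value and bucketed by destination
# partition, and each bucket is then applied in a sequential pass over
# its destination partition.

def sparse_pass(edges, old_vals, new_vals, num_partitions, interval_size):
    buckets = [[] for _ in range(num_partitions)]
    # Phase 1: sequential read of the source partition; extend edges
    # with their source values.
    for src, dst in edges:
        buckets[dst // interval_size].append((dst, old_vals[src]))
    # Phase 2: sequential write, one destination partition at a time.
    for bucket in buckets:
        for dst, val in bucket:
            new_vals[dst] += val
    return new_vals
```

The bucketing is what turns scattered destination updates into one sequential write per partition.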

SLIDE 16

I/O Cost for Sparse Block Processing

O( (2·|V| + |E| + 2·|E_extended|) / B + γ² )

|E| = # edges; B = size of each I/O operation; γ = # intervals (= # rows = # columns); |V| = # vertices; |E_extended| = # edges with extended information (each edge carrying its source vertex value).

SLIDE 17

Bimodal Block Processing

BlockType = Sparse, if (I/O cost if treated as Sparse) / (I/O cost if treated as Dense) < 1; Dense, otherwise

[Figure: example block grid with one Dense block and three Sparse blocks]
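In code, the bimodal rule amounts to estimating a block's I/O cost under both modes and picking the cheaper one. The two cost functions below mirror the O() terms from the preceding slides; applying them per block with unit-free constants is an assumption made for this sketch, as are all the names.

```python
# Sketch of the bimodal decision rule. Cost formulas follow the
# slides' O() terms; units and per-block application are illustrative.

def dense_cost(num_vertices, num_edges, gamma, io_unit):
    # Each vertex is read gamma times and written once; edges are read
    # once; gamma^2 accounts for per-block accesses.
    return ((gamma + 1) * num_vertices + num_edges) / io_unit + gamma ** 2

def sparse_cost(num_vertices, num_edges, num_ext_edges, gamma, io_unit):
    # Vertices are read and written once; edges are read once; extended
    # edges (edge + source value) are written once and read once.
    return (2 * num_vertices + num_edges + 2 * num_ext_edges) / io_unit + gamma ** 2

def block_type(num_vertices, num_edges, num_ext_edges, gamma, io_unit):
    ratio = (sparse_cost(num_vertices, num_edges, num_ext_edges, gamma, io_unit)
             / dense_cost(num_vertices, num_edges, gamma, io_unit))
    return "sparse" if ratio < 1 else "dense"
```

Intuitively, a block with few edges relative to its vertex intervals comes out sparse (streaming extended edges is cheap), while a heavily populated block comes out dense (re-reading vertex intervals amortizes well).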

SLIDE 18

Large Graphs Used in Evaluation

Graph              Nodes        Edges
LiveJournal        5 Million    69 Million
Twitter            41 Million   1.5 Billion
YahooWeb           1.4 Billion  6.6 Billion
R-Mat (Synthetic)  4 Billion    12 Billion

SLIDE 19

Runtime of M-Flash

[Bar chart: PageRank runtime (s) on the 6-billion-edge YahooWeb graph (1 iteration, shorter is better) at 16 GB, 8 GB, and 4 GB of memory, comparing M-Flash, MMap, TurboGraph, X-Stream, and GraphChi]

SLIDE 20

• Fastest single-node graph computing framework
• Innovative bimodal design that addresses varying edge density in real-world graphs

  • M-Flash Code: https://github.com/M-Flash/m-flash-cpp
  • MMap Project: http://poloclub.gatech.edu/mmap/

M-Flash:

Fast Billion-Scale Graph Computation Using a Bimodal Block Processing Model

Funding: CNPq (grant 444985/2014-0), Fapesp (grants 2016/02557-0, 2014/21483-2), Capes, NSF (grants IIS-1563816, TWC-1526254, IIS-1217559), NSF GRFP (grant DGE-1148903), Korean (MSIP) agency IITP (grant R0190-15-2012)

Dezhi “Andy” Fang

Georgia Tech CS Undergrad http://andyfang.me