STREAMER: a Distributed Framework for Incremental Closeness Centrality Computation
A. Erdem Sarıyüce 1,2, Erik Saule 4, Kamer Kaya 1, Ümit V. Çatalyürek 1,3
1 Department of Biomedical Informatics
2 Department of Computer Science & Engineering
3 Department of Electrical & Computer Engineering
The Ohio State University
4 Department of Computer Science, University of North Carolina Charlotte
IEEE Cluster 2013, Indianapolis, IN

Massive Graphs are everywhere
• Facebook has a billion users and a trillion connections
• Twitter has more than 200 million users
[Figure: citation graphs; labels Topic 1 to Topic 6]
IEEE Cluster'13, STREAMER: a Distributed Framework for Incremental Closeness Centrality Computation

Large(r) Networks and Centrality
• Who is more important in a network? Who controls the flow between nodes?
• Centrality metrics answer these questions
• Closeness Centrality (CC) is an intriguing metric
• How to handle changes?
• Incremental algorithms are good but not enough in practice
• Parallelism is essential

Closeness Centrality
• Let G = (V, E) be a graph with vertex set V and edge set E
• Farness (far) of a vertex is the sum of shortest distances to each vertex
  \[ \mathrm{far}[u] = \sum_{v \in V,\; d_G(u,v) \neq \infty} d_G(u,v) \]
• Closeness centrality (cc) of a vertex:
  \[ \mathrm{cc}[u] = \frac{1}{\mathrm{far}[u]} \]
• Best algorithm: All-pairs shortest paths
  • O(|V|.|E|) complexity for unweighted networks
• For large and dynamic networks
  • From scratch computation is infeasible
  • Faster solutions are essential
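A small worked example may make the two definitions concrete. The three-vertex path a - b - c is an illustrative graph chosen here, not one from the slides:

```latex
% Path graph a - b - c (illustrative assumption), unit edge lengths.
\[
\mathrm{far}[a] = d_G(a,b) + d_G(a,c) = 1 + 2 = 3,
\qquad
\mathrm{cc}[a] = \tfrac{1}{3}
\]
\[
\mathrm{far}[b] = d_G(b,a) + d_G(b,c) = 1 + 1 = 2,
\qquad
\mathrm{cc}[b] = \tfrac{1}{2}
\]
```

The middle vertex b has the smaller farness and therefore the larger closeness, which matches the intuition that it is the most central vertex of the path.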

CC Algorithm
Algorithm 1: CC: Basic centrality computation
Data: G = (V, E)
Output: cc[.]
for each s ∈ V do                          ▷ Single Source Shortest Path (SSSP) is computed for each vertex
    ▷ SSSP(G, s) with centrality computation
    Q ← empty queue
    d[v] ← ∞, ∀v ∈ V \ {s}
    Q.push(s), d[s] ← 0
    far[s] ← 0
    while Q is not empty do                ▷ Breadth-First Search with farness computation
        v ← Q.pop()
        for all w ∈ Γ_G(v) do
            if d[w] = ∞ then
                Q.push(w)
                d[w] ← d[v] + 1
                far[s] ← far[s] + d[w]
    cc[s] = 1 / far[s]                     ▷ cc value is assigned
return cc[.]
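Algorithm 1 can be sketched in Python roughly as follows. The adjacency-dictionary representation, the function name `closeness_centrality`, and the choice to assign cc = 0 to a vertex with zero farness are assumptions made for this illustration, not part of the slides.

```python
from collections import deque

def closeness_centrality(adj):
    """Compute cc[s] = 1 / far[s] with one BFS per source vertex.

    adj maps each vertex to an iterable of its neighbours (unweighted graph).
    """
    cc = {}
    for s in adj:
        dist = {s: 0}            # vertices missing from dist are at distance infinity
        far = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:            # first visit: shortest distance found
                    dist[w] = dist[v] + 1
                    far += dist[w]
                    queue.append(w)
        # Assumption: an isolated vertex (far == 0) gets cc = 0 to avoid dividing by zero.
        cc[s] = 1.0 / far if far > 0 else 0.0
    return cc

# Illustrative input: the path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(closeness_centrality(path))
```

Each source costs one BFS, so the whole loop is O(|V|.|E|) on a connected unweighted graph, which is the cost that the incremental techniques try to avoid paying after every update.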

Incremental Closeness Centrality
• Computing cc values from scratch after each edge change is very costly
• Incremental algorithms are used to handle changes
• Main idea is to reduce the number of SSSPs to be executed
• Three filtering techniques are proposed
  • Filtering with level differences
  • Filtering with biconnected components
  • Filtering with identical vertices
• Details can be found in
  "A. E. Sarıyüce, K. Kaya, E. Saule, and Ü. V. Çatalyürek. Incremental algorithms for Closeness Centrality. IEEE BigData Conference, 2013"

Filtering with level differences
• Upon insertion of an edge uv, the breadth-first search tree of each vertex s may change. Three possibilities:
  [Figure: the three cases for the levels of u and v in the BFS tree of s]
  • Cases 1 and 2 do not change the cc of s!
    • No need to run SSSP from such sources
  • Only Case 3 requires an update
• BFSs are executed from u and v, and the level difference is checked for each s

Filtering with biconnected components
• What if the graph has articulation points?
  [Figure: components A and B joined at articulation point u; v is a vertex in B]
• A change in A can change the cc of any vertex in A and B
• Computing the change for u is enough for finding the changes for any vertex v in B (constant factor is added)
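The reasoning behind the last bullet can be written out as a short derivation. It assumes a connected graph in which u is an articulation point, A and B share only u, v lies in B, and the edge change happens inside A; the split of the sum is one reading of the slide, not a formula taken from it.

```latex
% Every shortest path from v in B to a vertex x in A passes through u,
% so d_G(v,x) = d_G(v,u) + d_G(u,x) for all x in A.
\[
\mathrm{far}[v]
  = \sum_{x \in B} d_G(v,x)
  + \sum_{x \in A \setminus \{u\}} \bigl( d_G(v,u) + d_G(u,x) \bigr)
  = \sum_{x \in B} d_G(v,x)
  + (|A| - 1)\, d_G(v,u)
  + \sum_{x \in A} d_G(u,x)
\]
% An edge change inside A only affects the last sum, which also appears in far[u]:
\[
\Delta\,\mathrm{far}[v] = \Delta \sum_{x \in A} d_G(u,x) = \Delta\,\mathrm{far}[u]
\]
```

Under these assumptions the farness of every vertex in B shifts by the same amount as the farness of u, so one SSSP from u inside A is enough to update all of B.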

Filtering with identical vertices
• Two types of identical vertices:
  • Type I: u and v are identical vertices if N(u) = N(v), i.e., their neighbor lists are the same
  • Type II: u and v are identical vertices if {u} ∪ N(u) = {v} ∪ N(v), i.e., they are also connected
• If u and v are identical vertices, their cc values are the same
  • Same breadth-first search trees!
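One simple way to find both types of identical vertices is to group vertices by their (open or closed) neighbor set. This is a minimal sketch; the function name and the hashing approach are assumptions of this example and are not taken from the slides.

```python
from collections import defaultdict

def identical_vertex_groups(adj):
    """Group vertices that are identical under the two definitions.

    Returns (type1, type2): lists of groups with at least two members each.
    """
    open_groups = defaultdict(list)     # Type I:  N(u) == N(v)
    closed_groups = defaultdict(list)   # Type II: {u} | N(u) == {v} | N(v)
    for u, nbrs in adj.items():
        nset = frozenset(nbrs)
        open_groups[nset].append(u)
        closed_groups[nset | {u}].append(u)
    type1 = [g for g in open_groups.values() if len(g) > 1]
    type2 = [g for g in closed_groups.values() if len(g) > 1]
    return type1, type2

# Illustrative graph: u and v hang off x (Type I); p and q are adjacent
# and both attached to x (Type II).
adj = {
    "x": ["u", "v", "p", "q"],
    "u": ["x"],
    "v": ["x"],
    "p": ["x", "q"],
    "q": ["x", "p"],
}
print(identical_vertex_groups(adj))
```

Only one representative per group then needs an SSSP; the other members can copy its cc value.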

Is it enough?

name            |V|     |E|      Time (in sec.)
web-NotreDame   325K    1,090K   53.0
amazon0601      403K    2,443K   298.1
web-Google      875K    4,322K   824.4

• Too slow for real-time processing
• The problem is mostly parallel and graphs are relatively small
• Source-level parallelism can be used to fill up a cluster
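Source-level parallelism means that the SSSPs from different sources are independent and can be handed to different workers. The sketch below illustrates the idea on one machine with a thread pool; the function names, the round-robin split, and the use of threads are assumptions of this example. A real deployment would place the workers on separate processes or cluster nodes, since Python threads do not speed up CPU-bound work.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def farness(adj, s):
    """Sum of BFS distances from s to every reachable vertex."""
    dist = {s: 0}
    queue = deque([s])
    total = 0
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                total += dist[y]
                queue.append(y)
    return total

def update_chunk(adj, sources):
    """Work done by one worker: an independent SSSP per assigned source."""
    return {s: farness(adj, s) for s in sources}

def parallel_farness(adj, sources, workers=4):
    """Split the sources round-robin across workers and merge the results."""
    sources = list(sources)
    chunks = [sources[i::workers] for i in range(workers)]
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(lambda chunk: update_chunk(adj, chunk), chunks):
            result.update(partial)
    return result

# Illustrative input: a ring of six vertices.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(parallel_farness(ring, ring))
```

Every worker needs read access to the whole graph, which is workable here because the graphs in the table are small enough to be replicated.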

DataCutter
• Component-based middleware tool
• Supports filter-stream programming
  • Implements the computations as a set of components (filters) that exchange data through logical streams (unidirectional data flows)
• Layout is a filter ontology
  • Describes the set of tasks, streams and the connections
• All replicable
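The filter-stream model can be illustrated with a toy pipeline in which each filter runs in its own thread and each stream is a one-way queue. This is not DataCutter's API; every name below (`run_filter`, `run_layout`, `END`) is hypothetical, and the sketch uses a single copy of each filter rather than replicated ones.

```python
import queue
import threading

END = object()  # sentinel marking the end of a stream

def run_filter(work, inbox, outbox):
    """A filter: apply work() to each item from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is END:
            outbox.put(END)      # propagate end-of-stream downstream
            return
        outbox.put(work(item))

def run_layout(items, stages):
    """A layout: chain the stages with one unidirectional stream between neighbours."""
    streams = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_filter, args=(stage, streams[i], streams[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        streams[0].put(item)
    streams[0].put(END)
    results = []
    while True:
        out = streams[-1].get()
        if out is END:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

# Two filters in sequence: add one, then multiply by ten.
print(run_layout([1, 2, 3], [lambda x: x + 1, lambda x: x * 10]))
```

Because filters only communicate through streams, a layout can place them on different machines or run several copies of the same filter without changing the filter code, which is the property the "All replicable" bullet refers to.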
