Raphtory: Streaming Analysis of Distributed Temporal Graphs

SLIDE 1

Raphtory: Streaming Analysis of Distributed Temporal Graphs

Benjamin Steer, Felix Cuadrado & Richard G. Clegg

SLIDE 2

Motivation

Traditional Graph Processing Systems

[Diagram: Graph Snapshot Processing. Snapshot 1 and Snapshot 2 are each passed to the Chosen Computation, producing an Output for each snapshot.]

SLIDE 3

Motivation

[Diagram labels: Event source, Stream-Based Graph Processing Platform (graph maintained in-memory), User]

  • Analysis on the most recent graph
  • Near real-time updates to metrics
  • Compare new updates to previous state
  • Temporal graph analysis

SLIDE 4

Raphtory features

  • Temporal Graph Model
  • Formalisation of model and update semantics
  • Distributed graph management
  • Stream ingestion and near real-time maintenance
  • Pregel-like temporal graph analysis
  • Live, view and temporal range analysis

SLIDE 5

Raphtory Design

Implemented in Scala using the Akka actor model

[Raphtory: Streaming analysis of distributed temporal graphs, Future Generation Computer Systems 2020, Vol 102, pp 453-464]

SLIDE 6

Partition Manager Ingestion

[Diagram: Partition 1 holds Vertex 1 and Edge 1 → 2; Partition 2 holds Vertex 2 and Edge 1 → 2. History labels shown: Created: t8, Created: t14, Deleted: t15.]

SLIDE 7

Correct update order

{ "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }

[Diagram: Partition Manager 1 holds Vertex 1 and Edge 1 → 2; Partition Manager 2 holds Vertex 2 and Edge 1 → 2. History labels shown: Created: t8, Created: t9, Created: t14.]
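The "Edge Add" message on this slide can be turned into per-partition updates roughly as follows. This is a minimal Python sketch for illustration only, not Raphtory's Scala/Akka code: the function names, the tuple format and the placement rule (odd IDs on partition 1, even IDs on partition 2) are assumptions chosen to match the two-partition picture.

```python
import json

NUM_PARTITIONS = 2  # assumption: two partition managers, as drawn on the slide


def partition_for(vertex_id):
    """Hypothetical placement rule: vertex 1 -> partition 1, vertex 2 -> partition 2."""
    return (vertex_id - 1) % NUM_PARTITIONS + 1


def route_edge_add(raw):
    """Turn one "Edge Add" message into the updates each partition would store."""
    body = json.loads(raw)["Edge Add"]
    time, src, dst = body["Message Time"], body["Source ID"], body["Destination ID"]
    updates = {}
    for vertex in (src, dst):
        partition = updates.setdefault(partition_for(vertex), [])
        # The owning partition records the vertex at the message time ...
        partition.append(("vertex", vertex, time))
        # ... and the edge itself, once per partition.
        if ("edge", (src, dst), time) not in partition:
            partition.append(("edge", (src, dst), time))
    return updates


message = '{ "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }'
routed = route_edge_add(message)
```

Here both partitions end up recording the edge at t14, which matches the "Created: t14" labels under each partition manager.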

SLIDE 8

Edge Added Before Vertex

{ "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }
{ "Vertex Add": { "Message Time": 8, "Source ID": 1 } }

[Diagram: Partition Manager 1 holds Vertex 1 and Edge 1 → 2; Partition Manager 2 holds Vertex 2 and Edge 1 → 2. History labels shown: Created: t14, Created: t8.]

SLIDE 9

Vertex Deletion Before Edge Addition

{ "Edge Add": { "Message Time": 14, "Source ID": 1, "Destination ID": 2 } }
{ "Vertex Removal": { "Message Time": 15, "Source ID": 2 } }

[Diagram: Partition Manager 1 holds Vertex 1 and Edge 1 → 2; Partition Manager 2 holds Vertex 2 and Edge 1 → 2. History labels shown: Created: t8, Created: t14, Deleted: t15.]
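The three ordering cases (correct order, edge before vertex, deletion before edge) can be checked with a toy model in which every entity keeps a timestamped history rather than a single current value. The Python sketch below is an illustration under assumptions, not the formal model from the paper: the class and method names are hypothetical, partitions are ignored, and no two updates to one entity share a timestamp.

```python
from itertools import permutations


class TemporalGraph:
    """Toy store: each entity maps update time -> True (created) or False (deleted)."""

    def __init__(self):
        self.vertices = {}  # vertex id -> {time: alive}
        self.edges = {}     # (src, dst) -> {time: alive}

    def add_vertex(self, time, vertex):
        self.vertices.setdefault(vertex, {})[time] = True

    def add_edge(self, time, src, dst):
        self.add_vertex(time, src)  # an edge implies both endpoints exist
        self.add_vertex(time, dst)
        history = self.edges.setdefault((src, dst), {})
        history[time] = True
        # Inherit deletions already recorded on either endpoint.
        for endpoint in (src, dst):
            for t, alive in self.vertices[endpoint].items():
                if not alive:
                    history[t] = False

    def remove_vertex(self, time, vertex):
        self.vertices.setdefault(vertex, {})[time] = False
        # Deleting a vertex also deletes every edge attached to it.
        for (src, dst), history in self.edges.items():
            if vertex in (src, dst):
                history[time] = False


def replay(order):
    graph = TemporalGraph()
    for name, *args in order:
        getattr(graph, name)(*args)
    return graph.vertices, graph.edges


# The updates from the slides: Vertex Add (t8), Edge Add (t14), Vertex Removal (t15).
updates = [("add_vertex", 8, 1), ("add_edge", 14, 1, 2), ("remove_vertex", 15, 2)]
results = [replay(order) for order in permutations(updates)]
final_vertices, final_edges = results[0]
```

All six arrival orders give the same histories in this sketch, which is the property the slides illustrate: because updates are inserted by message time, the order in which they arrive does not matter.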

SLIDE 10

Analysis

[Diagram: four Partition Managers, three Routers and an Analysis Manager, with the labels Analysis Request and Individual Responses.]
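The request/response pattern in this diagram can be sketched as a scatter-gather step: the analysis manager sends the same request to every partition and merges the individual responses. The Python below is a single-process illustration with made-up data; the partition names, the out-degree analysis and the function names are assumptions, and the real system exchanges actor messages instead of calling functions.

```python
# Hypothetical partitions: each one holds the out-edges of the vertices it owns.
partitions = {
    "partition-1": {1: [2, 3]},
    "partition-2": {2: [3]},
    "partition-3": {3: []},
}


def partition_response(adjacency):
    """Work done locally by one partition manager: out-degree per owned vertex."""
    return {vertex: len(neighbours) for vertex, neighbours in adjacency.items()}


def run_analysis(all_partitions):
    """Analysis manager: send the request to every partition, then merge the replies."""
    responses = [partition_response(adj) for adj in all_partitions.values()]
    merged = {}
    for response in responses:
        merged.update(response)
    return merged


degrees = run_analysis(partitions)
```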

SLIDE 11

Live Graph, Views & Snapshots

SLIDE 12

Views & Windowing

[Timeline: Full History of the Graph from t0 to tn, with t5 and t10 marked; labels: View (Right Hand Filter), Window (Left Hand Filter), Window Size = 5.]
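One way to read this slide: a view at t10 is a right-hand filter that hides every update after t10, and a window of size 5 is a left-hand filter at t5. The Python sketch below assumes each entity keeps a list of (time, alive) updates and that a window keeps only entities whose latest visible update falls inside it; the function name and the sample histories are hypothetical.

```python
def alive_in_view(history, view_time, window=None):
    """history: list of (time, alive) updates for one entity, in any order."""
    # Right-hand filter: ignore everything after the view time.
    visible = [(t, alive) for t, alive in history if t <= view_time]
    if not visible:
        return False
    latest_time, alive = max(visible)
    if not alive:
        return False
    # Left-hand filter: the latest update must fall inside the window.
    if window is not None and latest_time < view_time - window:
        return False
    return True


old_vertex = [(2, True)]                   # created at t2, never touched again
recent_vertex = [(7, True)]                # created at t7
deleted_vertex = [(3, True), (8, False)]   # created at t3, deleted at t8

in_view = [alive_in_view(h, 10) for h in (old_vertex, recent_vertex, deleted_vertex)]
in_window = [alive_in_view(h, 10, window=5) for h in (old_vertex, recent_vertex, deleted_vertex)]
```

With these sample histories the view at t10 keeps the first two entities, while the window of size 5 keeps only the one updated at t7.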

SLIDE 13

Windowing Batches

[Timeline: Full History of the Graph from t0 to tn, with t5, t7, t9 and t10 marked; labels: Batch of Windows (Decreasing in size), Window Sizes = [5,3,1].]
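For a view at t10, window sizes [5, 3, 1] give left-hand bounds t5, t7 and t9. The Python sketch below computes those bounds and shows one plausible benefit of running the windows in decreasing size: each smaller window only has to re-check entities that survived the larger one. The function names and the sample data are hypothetical.

```python
def window_batch(view_time, window_sizes):
    """Return the (start, end) bounds of each window, largest window first."""
    return [(view_time - size, view_time) for size in sorted(window_sizes, reverse=True)]


def batch_filter(last_update, view_time, window_sizes):
    """last_update: entity -> time of its latest update (made-up data)."""
    candidates = {e: t for e, t in last_update.items() if t <= view_time}
    results = {}
    for size in sorted(window_sizes, reverse=True):
        # Only survivors of the previous, larger window are checked again.
        candidates = {e: t for e, t in candidates.items() if t >= view_time - size}
        results[size] = sorted(candidates)
    return results


windows = window_batch(10, [5, 3, 1])
last_update = {"a": 4, "b": 6, "c": 8, "d": 10, "e": 12}
per_window = batch_filter(last_update, 10, [5, 3, 1])
```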

SLIDE 14

Temporal Range Analysis

[Timeline: Full History of the Graph from t0 to tn, with t4, t6, t8 and t10 marked; labels: Range of Interest = t4 -> t10, Interval = 2.]
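A range of interest from t4 to t10 with an interval of 2 yields views at t4, t6, t8 and t10, and the chosen analysis is run once per view. The Python sketch below uses a simple vertex count as the analysis; the function names and creation times are made-up illustration values.

```python
def range_views(start, end, interval):
    """View times from start to end (inclusive), one every `interval` steps."""
    return list(range(start, end + 1, interval))


def vertices_alive_at(created_at, view_time):
    """Count vertices created at or before the view time (no deletions in this toy data)."""
    return sum(1 for t in created_at.values() if t <= view_time)


views = range_views(4, 10, 2)
created_at = {1: 3, 2: 5, 3: 6, 4: 9, 5: 12}  # vertex id -> creation time
series = {t: vertices_alive_at(created_at, t) for t in views}
```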

SLIDE 15

Gab.ai Connected Components Every Hour Across Lifetime

SLIDE 16

Gab.ai Connected Components Every Hour Across Lifetime

Largest Connected Component

SLIDE 17

Using Raphtory

  • Available on GitHub: https://github.com/miratepuffin/raphtory
  • Includes starting documentation and tutorials
  • Readme goes through a single machine dockerised version that runs connected components over the Gab graph
  • Multiple spouts (parsing data from Gab, Twitter, Bitcoin, Ethereum)
  • Multiple analysis functions implemented (on views, ranges, windows)
  • Connected Components
  • Information Diffusion
  • Top Degree vertex rankings
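As an illustration of what a Pregel-like connected components analysis does, the Python sketch below runs vertex-centric label propagation on a small made-up edge list: in every superstep each vertex adopts the smallest label among its neighbours, until nothing changes. It is a single-machine toy, not the implementation shipped in the repository, and the function name and data are hypothetical.

```python
from collections import Counter


def connected_components(edges):
    """Label propagation: every vertex ends up with the smallest vertex ID in its component."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    labels = {vertex: vertex for vertex in neighbours}
    changed = True
    while changed:  # one loop iteration per superstep
        changed = False
        # Messages are computed from the labels of the previous superstep.
        incoming = {v: min(labels[n] for n in ns) for v, ns in neighbours.items()}
        for vertex, best in incoming.items():
            if best < labels[vertex]:
                labels[vertex] = best
                changed = True
    return labels


edges = [(1, 2), (2, 3), (4, 5)]
labels = connected_components(edges)
largest_component_size = max(Counter(labels.values()).values())
```

Running the same function on the edges visible in each view, window or range step gives a time series such as the hourly largest-component size.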

SLIDE 18

Future Roadmap and Getting Involved

  • Drop me a line at b.a.steer@qmul.ac.uk
  • Raise PRs/queries on GitHub: https://github.com/miratepuffin/raphtory