Kineograph: Raymond Cheng (University of Washington, Microsoft Research) et al.



SLIDE 1

Kineograph

Raymond Cheng (University of Washington, Microsoft Research) et al.

SLIDE 2

The challenge

  • Social networks (Facebook, Twitter) generate a lot of information
  • Let's analyze it!
  • Simple data-mining won't do:

      ○ too much data
      ○ constant influx of new data
      ○ long computation time

SLIDE 3

A solution

  • Process a live stream of data (e.g. tweets)
  • Aggregate it into a dynamic graph
  • Snapshot the graph regularly
  • Run distributed graph-mining on snapshots

      ○ with support for incremental computation

SLIDE 4

Kineograph architecture

SLIDE 5

Data influx (ingest node)

[Diagram] The tweet "@Alice: @Bob, check out these #kittens!" becomes one transaction T. The ingest node sends the edges @Alice -> #kittens and @Alice -> @Bob to node(Alice), the edge #kittens -> @Alice to node(kittens), and the edge @Bob -> @Alice to node(Bob). After receiving ACKs from the graph nodes, the ingest node reports T to the progress table.

SLIDE 6

Data influx (ingest nodes)

  • Parse data and convert it to graph updates (i.e. sets of edges)
  • Send transactions to the affected graph nodes

      ○ at this point, each update is just stored in a queue

  • Report submitted transactions to the global vector clock
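The ingest path above can be sketched in miniature. This is Python for illustration rather than the system's C#; `IngestNode`, the hash-based partitioning, and the edge rules are my reconstruction from the slide's @Alice/#kittens example, not Kineograph's actual code:

```python
import re
from collections import defaultdict

def tweet_to_edges(author, text):
    # Edge rules reconstructed from the slide's example: the author points
    # at every mention and hashtag, and each of them points back.
    edges = set()
    for entity in re.findall(r"[@#]\w+", text):
        edges.add((author, entity))
        edges.add((entity, author))
    return edges

class IngestNode:
    """Parses tweets into edge transactions, queues each edge at the
    partition owning its source vertex, and reports the transaction's
    sequence number to a shared progress table (the global vector clock)."""

    def __init__(self, ingest_id, progress_table, num_partitions=4):
        self.ingest_id = ingest_id
        self.progress_table = progress_table
        self.num_partitions = num_partitions
        self.seq = 0
        self.queues = defaultdict(list)  # partition id -> pending updates

    def ingest(self, author, text):
        self.seq += 1
        for src, dst in tweet_to_edges(author, text):
            part = hash(src) % self.num_partitions
            self.queues[part].append((self.seq, src, dst))  # queued, not applied
        # In the real system this report happens only after the graph nodes ACK.
        self.progress_table[self.ingest_id] = self.seq
        return self.seq
```

A real ingest node would send the queued updates over the network and wait for ACKs before reporting; here the queues and the progress table are plain in-process dictionaries.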

SLIDE 7

Snapshot creation

SLIDE 8

Snapshot creation

  • Snapshooter initiates the process

      ○ in practice, every 10 seconds

  • Snapshooter copies the current progress table and sends it to the graph nodes
  • Graph nodes commit transactions up to the times specified in the progress table

      ○ new updates keep arriving in parallel
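The epoch-commit step might look like this in a single-process sketch (names are invented; real graph nodes apply their queues independently and in parallel):

```python
def take_snapshot(progress_table, graph_nodes):
    """Epoch commit: freeze a copy of the progress table, then have every
    graph node apply exactly the queued transactions whose sequence numbers
    fall at or below the frozen cut. Later updates wait for the next epoch.

    graph_nodes: name -> {"queue": [(ingest_id, seq, update), ...]}
    Returns: name -> list of committed updates for this snapshot."""
    cut = dict(progress_table)           # the snapshooter's frozen copy
    snapshot = {}
    for name, node in graph_nodes.items():
        committed, pending = [], []
        for ingest_id, seq, update in node["queue"]:
            if seq <= cut.get(ingest_id, 0):
                committed.append(update)
            else:
                pending.append((ingest_id, seq, update))
        node["queue"] = pending          # new updates stay queued for later
        snapshot[name] = committed
    return snapshot
```

Because the cut is taken from the progress table rather than from the live queues, updates that arrive while the snapshot is being built simply land in the next epoch.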

SLIDE 9

Computation overview

  • Runs on snapshots
  • Algorithm-specific data is stored in vertices
  • Alternating phases of computation and propagation

SLIDE 10

Example: TunkRank

  • similar to PageRank
  • vertex value: a single real number
  • add ranks received from neighbours
  • when rank increases by ε, push the update to neighbours
  • repeat until stable

Bonus: it's incremental between snapshots!
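The push-until-stable pattern the bullets describe can be sketched generically. This shows only the delta-propagation loop, not the exact TunkRank formula; `damping`, `eps`, and all names are illustrative:

```python
def push_ranks(graph, seeds, damping=0.5, eps=1e-6):
    """Push-based rank propagation: each vertex accumulates rank, and any
    accumulated increment larger than eps is pushed (damped) to its
    out-neighbours. Converges because damping < 1.

    graph: vertex -> list of out-neighbours
    seeds: vertex -> initial rank mass to inject."""
    rank = {v: 0.0 for v in graph}
    residual = dict(seeds)               # rank received but not yet propagated
    worklist = [v for v, r in residual.items() if r > eps]
    while worklist:
        v = worklist.pop()
        delta, residual[v] = residual.get(v, 0.0), 0.0
        if delta <= eps:
            continue                     # too small to matter: drop it
        rank[v] = rank.get(v, 0.0) + delta
        out = graph.get(v, [])
        if out:
            share = damping * delta / len(out)
            for u in out:
                before = residual.get(u, 0.0)
                residual[u] = before + share
                if before <= eps < residual[u]:
                    worklist.append(u)   # u just crossed the push threshold
    return rank
```

The same loop is what makes the computation incremental between snapshots: new edge updates just inject fresh residual into the affected vertices instead of restarting from scratch.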

SLIDE 11

Example: Shortest Paths

  • Bellman-Ford with landmarks

      ○ landmarks: top vertices from TunkRank
      ○ calculate only paths passing through landmarks

  • vertex data: distances to landmarks
  • shorten distances by relaxing edges
  • push new distances to neighbours
  • repeat until stable
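A minimal version of landmark distances via repeated relaxation might look like this (assuming an undirected graph, so distance vectors combine symmetrically; names are mine):

```python
def landmark_distances(graph, landmarks):
    """Every vertex keeps a vector of distances to each landmark, improved
    by repeated edge relaxation (Bellman-Ford style) until nothing changes.
    graph: vertex -> list of (neighbour, weight); assumed undirected, i.e.
    every edge appears in both adjacency lists."""
    INF = float("inf")
    dist = {v: {lm: INF for lm in landmarks} for v in graph}
    for lm in landmarks:
        dist[lm][lm] = 0
    changed = True
    while changed:                        # "repeat until stable"
        changed = False
        for u in graph:
            for v, w in graph[u]:
                for lm in landmarks:
                    if dist[u][lm] + w < dist[v][lm]:
                        dist[v][lm] = dist[u][lm] + w
                        changed = True
    return dist

def approx_dist(dist, u, v):
    # Upper bound on d(u, v): the best path forced through some landmark.
    return min(dist[u][lm] + dist[v][lm] for lm in dist[u])
```

Restricting paths to pass through a few high-rank landmarks trades exactness for per-vertex state that is small and cheap to propagate incrementally.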
SLIDE 12

Evaluation

  • 17,000 lines of C# code
  • 50 Windows servers

      ○ Intel Xeon (quad-core, 2.8 GHz) with 8 GB RAM

  • 100k tweets per second (10 times the peak Twitter rate)

SLIDE 13

Degree distribution

SLIDE 14

Graph growth

Decaying can help

SLIDE 15

Throughput & timeliness

SLIDE 16

Throughput

SLIDE 17

Timeliness

SLIDE 18

Incrementality helps!

TunkRank:

SLIDE 19

Incrementality helps!

SLIDE 20

Scalability (TunkRank)

SLIDE 21

Fault tolerance

  • Centralized services (progress table & snapshooter):

      ○ simple replication
      ○ Paxos-based consensus

  • Ingest nodes:

      ○ input data is cached until it is committed to a snapshot
      ○ if an ingest node fails, all its transactions are discarded
      ○ another machine then processes the data from the cache

SLIDE 22

Replication of graph nodes

  • Quorum-based: 3 replicas of each node
  • An update must be acknowledged by 2 replicas
  • If a replica misses an update, it retrieves it from the other replicas
  • If a replica fails and is replaced, the replacement waits for the next snapshot and starts working normally from there
  • For computation failures: rollback and redo
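The quorum write and catch-up rules could be modeled as a toy in-process sketch (real replicas live on different machines and exchange log suffixes over the network; the class and method names are my own):

```python
class ReplicatedVertex:
    """3 replicas per graph node; an update counts as durable once 2 of
    them acknowledge it. A replica that missed updates can later copy the
    log suffix from a peer."""

    def __init__(self):
        self.replicas = [[], [], []]      # per-replica update logs

    def apply(self, update, alive=(True, True, True)):
        """Append the update at every live replica; True iff a quorum
        (2 of 3) acknowledged it."""
        acks = 0
        for log, up in zip(self.replicas, alive):
            if up:
                log.append(update)
                acks += 1
        return acks >= 2

    def catch_up(self, lagging):
        # Pull the missing updates from the most complete peer log.
        source = max(self.replicas, key=len)
        self.replicas[lagging] = list(source)
```

With a 2-of-3 quorum, any two replicas overlap in at least one copy of every committed update, which is what lets a lagging or replaced replica reconstruct its state.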
SLIDE 23

Incremental expansion

  • Ingest nodes: trivial, just add a node
  • Storage nodes:

      ○ maintain more logical partitions than nodes
      ○ to add a node, migrate some logical partitions to it
      ○ splitting logical partitions is possible too
      ○ a new node starts working from the next snapshot, just as in failure recovery
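The logical-partition trick can be illustrated as follows (the function name and the take-from-the-most-loaded heuristic are my own; the point is only that the partition-to-machine mapping changes while vertex-to-partition hashing does not):

```python
from collections import Counter

def add_machine(assignment, new_machine):
    """Grow the cluster: the graph is hashed into many logical partitions
    (more than machines), so adding a machine only means migrating some
    partitions to it; no vertex is ever re-hashed.

    assignment: partition id -> machine name, updated in place.
    Returns the list of migrated partition ids."""
    n_machines = len(set(assignment.values())) + 1
    target = len(assignment) // n_machines   # fair share for the newcomer
    load = Counter(assignment.values())
    moved = []
    for p in sorted(assignment):
        if len(moved) == target:
            break
        owner = assignment[p]
        if load[owner] > target:             # only take from overloaded machines
            assignment[p] = new_machine
            load[owner] -= 1
            moved.append(p)
    return moved
```

As on the slide, the new machine would start serving its migrated partitions from the next snapshot, exactly as a recovering replica does.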

SLIDE 24

Failure recovery

SLIDE 25

Thank you!