Kineograph: Raymond Cheng (University of Washington, Microsoft Research) et al.



SLIDE 1

Kineograph

Raymond Cheng (University of Washington, Microsoft Research) et al.

SLIDE 2

The challenge

  • Social networks (Facebook, Twitter) generate a lot of information
  • Let's analyze it!
  • Simple data-mining won't do:

      ○ too much data
      ○ constant influx of new data
      ○ long computation time

SLIDE 3

A solution

  • Process a live stream of data (e.g. tweets)
  • Aggregate it into a dynamic graph
  • Snapshot the graph regularly
  • Run distributed graph-mining on snapshots

      ○ with support for incremental computation

SLIDE 4

Kineograph architecture

SLIDE 5

Data influx (ingest node)

[Diagram] The tweet "@Alice: @Bob, check out these #kittens!" becomes one transaction T. The ingest node sends the edges @Alice -> #kittens and @Alice -> @Bob to node(Alice), the edge #kittens -> @Alice to node(kittens), and the edge @Bob -> @Alice to node(Bob). After receiving ACKs from the graph nodes, the ingest node reports T to the progress table.

SLIDE 6

Data influx (ingest nodes)

  • Parse data and convert it to graph updates (i.e. sets of edges)
  • Send transactions to the affected graph nodes

      ○ at this point, each update is just stored in a queue

  • Report submitted transactions to the global vector clock
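The ingest path above can be sketched in miniature. This is Python for illustration rather than the system's C#; `IngestNode`, the hash-based partitioning, and the edge rules are my reconstruction from the slide's @Alice/#kittens example, not Kineograph's actual code:

```python
import re
from collections import defaultdict

def tweet_to_edges(author, text):
    # Edge rules reconstructed from the slide's example: the author points
    # at every mention and hashtag, and each of them points back.
    edges = set()
    for entity in re.findall(r"[@#]\w+", text):
        edges.add((author, entity))
        edges.add((entity, author))
    return edges

class IngestNode:
    """Parses tweets into edge transactions, queues each edge at the
    partition owning its source vertex, and reports the transaction's
    sequence number to a shared progress table (the global vector clock)."""

    def __init__(self, ingest_id, progress_table, num_partitions=4):
        self.ingest_id = ingest_id
        self.progress_table = progress_table
        self.num_partitions = num_partitions
        self.seq = 0
        self.queues = defaultdict(list)  # partition id -> pending updates

    def ingest(self, author, text):
        self.seq += 1
        for src, dst in tweet_to_edges(author, text):
            part = hash(src) % self.num_partitions
            self.queues[part].append((self.seq, src, dst))  # queued, not applied
        # In the real system this report happens only after the graph nodes ACK.
        self.progress_table[self.ingest_id] = self.seq
        return self.seq
```

A real ingest node would send the queued updates over the network and wait for ACKs before reporting; here the queues and the progress table are plain in-process dictionaries.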

SLIDE 7

Snapshot creation

SLIDE 8

Snapshot creation

  • Snapshooter initiates the process

      ○ in practice, every 10 seconds

  • Snapshooter copies the current progress table and sends it to the graph nodes
  • Graph nodes commit transactions up to the times specified in the progress table

      ○ new updates keep arriving in parallel
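The epoch-commit step might look like this in a single-process sketch (names are invented; real graph nodes apply their queues independently and in parallel):

```python
def take_snapshot(progress_table, graph_nodes):
    """Epoch commit: freeze a copy of the progress table, then have every
    graph node apply exactly the queued transactions whose sequence numbers
    fall at or below the frozen cut. Later updates wait for the next epoch.

    graph_nodes: name -> {"queue": [(ingest_id, seq, update), ...]}
    Returns: name -> list of committed updates for this snapshot."""
    cut = dict(progress_table)           # the snapshooter's frozen copy
    snapshot = {}
    for name, node in graph_nodes.items():
        committed, pending = [], []
        for ingest_id, seq, update in node["queue"]:
            if seq <= cut.get(ingest_id, 0):
                committed.append(update)
            else:
                pending.append((ingest_id, seq, update))
        node["queue"] = pending          # new updates stay queued for later
        snapshot[name] = committed
    return snapshot
```

Because the cut is taken from the progress table rather than from the live queues, updates that arrive while the snapshot is being built simply land in the next epoch.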

SLIDE 9

Computation overview

  • Runs on snapshots
  • Algorithm-specific data is stored in vertices
  • Alternating phases of computation and propagation

SLIDE 10

Example: TunkRank

  • similar to PageRank
  • vertex value: a single real number
  • add ranks received from neighbours
  • when rank increases by ε, push the update to neighbours
  • repeat until stable

Bonus: it's incremental between snapshots!
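The push-until-stable pattern the bullets describe can be sketched generically. This shows only the delta-propagation loop, not the exact TunkRank formula; `damping`, `eps`, and all names are illustrative:

```python
def push_ranks(graph, seeds, damping=0.5, eps=1e-6):
    """Push-based rank propagation: each vertex accumulates rank, and any
    accumulated increment larger than eps is pushed (damped) to its
    out-neighbours. Converges because damping < 1.

    graph: vertex -> list of out-neighbours
    seeds: vertex -> initial rank mass to inject."""
    rank = {v: 0.0 for v in graph}
    residual = dict(seeds)               # rank received but not yet propagated
    worklist = [v for v, r in residual.items() if r > eps]
    while worklist:
        v = worklist.pop()
        delta, residual[v] = residual.get(v, 0.0), 0.0
        if delta <= eps:
            continue                     # too small to matter: drop it
        rank[v] = rank.get(v, 0.0) + delta
        out = graph.get(v, [])
        if out:
            share = damping * delta / len(out)
            for u in out:
                before = residual.get(u, 0.0)
                residual[u] = before + share
                if before <= eps < residual[u]:
                    worklist.append(u)   # u just crossed the push threshold
    return rank
```

The same loop is what makes the computation incremental between snapshots: new edge updates just inject fresh residual into the affected vertices instead of restarting from scratch.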

SLIDE 11

Example: Shortest Paths

  • Bellman-Ford with landmarks

      ○ landmarks: top vertices from TunkRank
      ○ calculate only paths passing through landmarks

  • vertex data: distances to landmarks
  • shorten distances by relaxing edges
  • push new distances to neighbours
  • repeat until stable
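A minimal version of landmark distances via repeated relaxation might look like this (assuming an undirected graph, so distance vectors combine symmetrically; names are mine):

```python
def landmark_distances(graph, landmarks):
    """Every vertex keeps a vector of distances to each landmark, improved
    by repeated edge relaxation (Bellman-Ford style) until nothing changes.
    graph: vertex -> list of (neighbour, weight); assumed undirected, i.e.
    every edge appears in both adjacency lists."""
    INF = float("inf")
    dist = {v: {lm: INF for lm in landmarks} for v in graph}
    for lm in landmarks:
        dist[lm][lm] = 0
    changed = True
    while changed:                        # "repeat until stable"
        changed = False
        for u in graph:
            for v, w in graph[u]:
                for lm in landmarks:
                    if dist[u][lm] + w < dist[v][lm]:
                        dist[v][lm] = dist[u][lm] + w
                        changed = True
    return dist

def approx_dist(dist, u, v):
    # Upper bound on d(u, v): the best path forced through some landmark.
    return min(dist[u][lm] + dist[v][lm] for lm in dist[u])
```

Restricting paths to pass through a few high-rank landmarks trades exactness for per-vertex state that is small and cheap to propagate incrementally.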
SLIDE 12

Evaluation

  • 17,000 lines of C# code
  • 50 Windows servers

      ○ Intel Xeon (quad-core, 2.8 GHz) with 8 GB RAM

  • 100k tweets per second (10 times the peak Twitter rate)

SLIDE 13

Degree distribution

SLIDE 14

Graph growth

Decaying can help

SLIDE 15

Throughput & timeliness

SLIDE 16

Throughput

SLIDE 17

Timeliness

SLIDE 18

Incrementality helps!

TunkRank:

SLIDE 19

Incrementality helps!

SLIDE 20

Scalability (TunkRank)

SLIDE 21

Fault tolerance

  • Centralized services (progress table & snapshooter):

      ○ simple replication
      ○ Paxos-based consensus

  • Ingest nodes:

      ○ input data is cached until it is committed to a snapshot
      ○ if an ingest node fails, all its transactions are discarded
      ○ another machine then processes the data from the cache

SLIDE 22

Replication of graph nodes

  • Quorum-based: 3 replicas of each node
  • An update must be acknowledged by 2 replicas
  • If a replica misses an update, it retrieves it from the other replicas
  • If a replica fails and is replaced, the replacement waits for the next snapshot and starts working normally from there
  • For computation failures: rollback and redo
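The quorum write and catch-up rules could be modeled as a toy in-process sketch (real replicas live on different machines and exchange log suffixes over the network; the class and method names are my own):

```python
class ReplicatedVertex:
    """3 replicas per graph node; an update counts as durable once 2 of
    them acknowledge it. A replica that missed updates can later copy the
    log suffix from a peer."""

    def __init__(self):
        self.replicas = [[], [], []]      # per-replica update logs

    def apply(self, update, alive=(True, True, True)):
        """Append the update at every live replica; True iff a quorum
        (2 of 3) acknowledged it."""
        acks = 0
        for log, up in zip(self.replicas, alive):
            if up:
                log.append(update)
                acks += 1
        return acks >= 2

    def catch_up(self, lagging):
        # Pull the missing updates from the most complete peer log.
        source = max(self.replicas, key=len)
        self.replicas[lagging] = list(source)
```

With a 2-of-3 quorum, any two replicas overlap in at least one copy of every committed update, which is what lets a lagging or replaced replica reconstruct its state.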
SLIDE 23

Incremental expansion

  • Ingest nodes: trivial, just add a node
  • Storage nodes:

      ○ maintain more logical partitions than nodes
      ○ to add a node, migrate some logical partitions to it
      ○ splitting logical partitions is possible too
      ○ a new node starts working from the next snapshot, just as in failure recovery
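The logical-partition trick can be illustrated as follows (the function name and the take-from-the-most-loaded heuristic are my own; the point is only that the partition-to-machine mapping changes while vertex-to-partition hashing does not):

```python
from collections import Counter

def add_machine(assignment, new_machine):
    """Grow the cluster: the graph is hashed into many logical partitions
    (more than machines), so adding a machine only means migrating some
    partitions to it; no vertex is ever re-hashed.

    assignment: partition id -> machine name, updated in place.
    Returns the list of migrated partition ids."""
    n_machines = len(set(assignment.values())) + 1
    target = len(assignment) // n_machines   # fair share for the newcomer
    load = Counter(assignment.values())
    moved = []
    for p in sorted(assignment):
        if len(moved) == target:
            break
        owner = assignment[p]
        if load[owner] > target:             # only take from overloaded machines
            assignment[p] = new_machine
            load[owner] -= 1
            moved.append(p)
    return moved
```

As on the slide, the new machine would start serving its migrated partitions from the next snapshot, exactly as a recovering replica does.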

SLIDE 24

Failure recovery

SLIDE 25

Thank you!