CrashSim: An Efficient Algorithm for Computing SimRank over Static and Temporal Graphs



SLIDE 1

April 22, 2020

CrashSim: An Efficient Algorithm for Computing SimRank over Static and Temporal Graphs

Mo Li (1,2), Farhana M. Choudhury (2), Renata Borovica-Gajic (2), Zhiqiong Wang (1), Junchang Xin (1,*), Jianxin Li (3)

(1) Northeastern University, CN; (2) University of Melbourne, AU; (3) Deakin University, AU

SLIDE 2

Outline

  • Background
    • SimRank Overview
    • Motivation
  • Problem Definition
    • Preliminaries
    • Problem Definition
  • Our Approach
    • CrashSim Algorithm --- static graphs
    • CrashSim-T Algorithm --- temporal graphs
  • Experiments and Conclusion
    • Experimental Evaluation
    • Conclusion

SLIDE 3

Background

  • SimRank Overview
  • Motivation
SLIDE 4

Background

  • Similarity assessment plays a vital role in our lives.

[Figure: example applications: recommender system, citation graph, collaboration network]

SLIDE 5

Background

  • SimRank
    • Node-to-node similarity measure based on the topology of graphs (KDD’02)
  • Basic assumption
    • Two nodes are similar if they are both highly relevant to similar nodes
  • Two forms
    • Original definition (KDD’02)
    • √c-walk (SIGMOD’16)
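
For reference, the original definition sets s(u, u) = 1 and, for u ≠ v, s(u, v) = c / (|I(u)| |I(v)|) × Σ over a ∈ I(u), b ∈ I(v) of s(a, b), where I(x) is the set of in-neighbors of x and c is a decay factor; s(u, v) = 0 when either node has no in-neighbors. A minimal Python sketch of the naive fixed-point iteration follows. The toy graph, the function name `simrank_iterative`, the value c = 0.6, and the iteration count are illustrative assumptions, not taken from the talk.

```python
from itertools import product

def simrank_iterative(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by repeated application of the definition.

    in_neighbors maps every node to the list of its in-neighbors.
    c and iterations are placeholder values chosen for illustration.
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for u, v in product(nodes, nodes):
            if u == v:
                new_sim[(u, v)] = 1.0
                continue
            in_u, in_v = in_neighbors[u], in_neighbors[v]
            if not in_u or not in_v:
                # No in-neighbors on either side means no evidence of similarity.
                new_sim[(u, v)] = 0.0
                continue
            total = sum(sim[(a, b)] for a in in_u for b in in_v)
            new_sim[(u, v)] = c * total / (len(in_u) * len(in_v))
        sim = new_sim
    return sim

# Hypothetical toy graph: node -> list of in-neighbors.
toy_graph = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
scores = simrank_iterative(toy_graph, c=0.6)
```

In this toy graph B and C share their only in-neighbor A, so their score equals c. The quadratic number of node pairs is what makes this direct approach impractical on large graphs.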
SLIDE 6

Background

  • Temporal Graph
  • Temporal SimRank queries: threshold and trend

[Figure: three snapshots of a temporal graph over nodes A-H; label: Recommender System]

SLIDE 7

Problem Definition

  • Preliminaries
  • Problem Definition
SLIDE 8

Preliminaries

  • SLING algorithm (SIGMOD’16)
  • ProbeSim algorithm (VLDB’17)

SLIDE 9

Problem Definition

Problem (Temporal SimRank Queries)
Given: G, u, [t_1, t_2]
Return: node set Ω such that the SimRank of u and each node v ∈ Ω continuously meets a certain query requirement during the entire query interval [t_1, t_2]

Problem (Temporal SimRank Trend Query)
Given: G, u, [t_1, t_2]
Return: node set Ω such that the SimRank of u and each node v ∈ Ω is continuously increasing (or decreasing) during the entire query interval [t_1, t_2]

Problem (Temporal SimRank Threshold Query)
Given: G, u, [t_1, t_2], θ
Return: node set Ω such that the SimRank of u and each node v ∈ Ω is greater than θ during the entire query interval [t_1, t_2]
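
The two query types can be illustrated with a short Python sketch that filters nodes by their score history. It assumes the SimRank of the source and every candidate has already been computed at each time instant of the query interval; the function names, the made-up scores, and the choice of strict monotonicity for the trend query are assumptions for illustration only.

```python
def threshold_query(score_series, theta):
    """Nodes whose score stays above theta at every time instant."""
    return {v for v, series in score_series.items()
            if all(s > theta for s in series)}

def trend_query(score_series, increasing=True):
    """Nodes whose score strictly increases (or decreases) between
    consecutive time instants. Strictness is an assumption here."""
    result = set()
    for v, series in score_series.items():
        pairs = list(zip(series, series[1:]))
        if increasing and all(a < b for a, b in pairs):
            result.add(v)
        elif not increasing and all(a > b for a, b in pairs):
            result.add(v)
    return result

# Made-up scores of three nodes against a source u over three time instants.
score_series = {
    "B": [0.30, 0.35, 0.40],
    "C": [0.20, 0.10, 0.05],
    "D": [0.25, 0.25, 0.30],
}
above = threshold_query(score_series, theta=0.22)
rising = trend_query(score_series, increasing=True)
falling = trend_query(score_series, increasing=False)
```

Recomputing every score at every instant, as this sketch presupposes, is the baseline cost that a temporal algorithm would try to avoid.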

SLIDE 10

Our Approach

  • CrashSim Algorithm --- static graphs
  • CrashSim-T Algorithm --- temporal graphs
SLIDE 11

CrashSim Algorithm

  • Motivation
    • ProbeSim (VLDB’17) is the state-of-the-art algorithm for SimRank computation over static graphs
  • Drawbacks
    • Redundant computations
    • The length of the √c-walk determines the computation cost

[Figure: example graph over nodes A-H and the node trees expanded from it, illustrating the redundant computations]

SLIDE 12

CrashSim Algorithm

  • Key idea
    • Constrain the length of the √c-walk to l_max
    • Build a reverse reachable tree of the source u, limited to the √c-walk length l_max
    • Still obtain SimRank estimators with the same guaranteed error bound as ProbeSim

Problem (Approximation Guarantee)
Given: G, u, ε, δ
Return: s(u, v) such that |s(u, v) − sim(u, v)| ≤ ε with probability at least 1 − δ
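
A √c-walk, as defined in the SIGMOD’16 work cited earlier, stops at each step with probability 1 − √c and otherwise moves to a uniformly chosen in-neighbor. The Python sketch below samples such a walk and cuts it off after l_max steps. The function name `sqrt_c_walk`, the toy graph, and the values √c = 0.5 and l_max = 3 are placeholders; how l_max is derived from ε and δ is not shown here.

```python
import random

def sqrt_c_walk(in_neighbors, start, sqrt_c, l_max, rng=random):
    """Sample one sqrt(c)-walk from `start`, truncated to at most l_max steps.

    in_neighbors maps a node to the list of its in-neighbors.
    rng may be the random module or a random.Random instance.
    """
    walk = [start]
    current = start
    for _ in range(l_max):
        preds = in_neighbors.get(current, [])
        # Stop with probability 1 - sqrt(c), or when there is nowhere to go.
        if not preds or rng.random() >= sqrt_c:
            break
        current = rng.choice(preds)
        walk.append(current)
    return walk

# Hypothetical toy graph: node -> list of in-neighbors.
toy_in_neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A", "B"]}
sample_walk = sqrt_c_walk(toy_in_neighbors, "A", sqrt_c=0.5, l_max=3,
                          rng=random.Random(42))
```

Because the walk continues with probability √c at each step, long walks are rare, which is why truncating at l_max loses little probability mass.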

SLIDE 13

CrashSim Algorithm

[Figure: example graph over nodes A-H and the reverse reachable tree of A]

Level 0: A
Level 1: B, C
Level 2: E, B, D
Level 3: H, A, E, B

p(0, A) = 1
p(1, B) = p(0, A) × √c / |I(B)| = 1 × 0.5 / 2 = 0.25
p(1, C) = p(0, A) × √c / |I(C)| = 1 × 0.5 / 3 = 0.167
p(2, E) = 0.0625, p(2, B) = 0.0417, p(2, D) = 0.0417
p(3, H) = 0.0156, p(3, A) = 0.0104, p(3, E) = 0.0104, p(3, B) = 0.0104

In the k-th trial, W(C) = (C, D, B, A)
s_k(A, C) = p(0, C) + p(1, D) + p(2, B) + p(3, A) = 0 + 0 + 0.0417 + 0.0104 = 0.0521
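
The arithmetic on this slide can be replayed in a few lines of Python. This is a sketch under stated assumptions: each tree node's value is its parent's value times √c divided by the node's in-degree, with √c = 0.5 as the numbers imply. The parent assignments at level 3 are guesses (the extracted text gives only the values), and the function names are hypothetical.

```python
SQRT_C = 0.5  # implied by the slide's arithmetic (1 x 0.5 / 2 = 0.25)

# One dict per level: node -> (parent at the previous level, in-degree).
# Level-3 parents are assumed; any parent valued 0.0417 gives the same result.
tree = [
    {"A": (None, None)},
    {"B": ("A", 2), "C": ("A", 3)},
    {"E": ("B", 2), "B": ("C", 2), "D": ("C", 2)},
    {"H": ("E", 2), "A": ("B", 2), "E": ("D", 2), "B": ("D", 2)},
]

def level_values(tree, sqrt_c):
    """Propagate values down the tree: child = parent * sqrt_c / in_degree."""
    values = [{node: 1.0 for node in tree[0]}]
    for level in range(1, len(tree)):
        current = {}
        for node, (parent, in_degree) in tree[level].items():
            current[node] = values[level - 1][parent] * sqrt_c / in_degree
        values.append(current)
    return values

def trial_score(values, walk):
    """Sum the tree value met by the i-th node of the walk at level i."""
    return sum(values[i].get(node, 0.0)
               for i, node in enumerate(walk) if i < len(values))

values = level_values(tree, SQRT_C)
score = trial_score(values, ["C", "D", "B", "A"])
```

The walk from C misses the tree at levels 0 and 1 and meets it at levels 2 and 3, which reproduces the two non-zero terms in the sum above.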

SLIDE 14

CrashSim Algorithm

Time Complexity:

SLIDE 15

CrashSim-T Algorithm

  • Two opportunities
    • It is unnecessary to compute the SimRank between u and the candidate node set Ω at each time instant
    • The size of the node set Ω can only shrink over time
  • CrashSim naturally supports computing the SimRank between the source u and a partial set of nodes

SLIDE 16

CrashSim-T Algorithm --- Delta Pruning

  • Affected area of a changed edge x → y
    • The altered nodes in the reverse reachable tree of u
    • The (l_max − 1)-length reachable nodes of y
  • Delta pruning: ignore the nodes of an unaffected area

[Figure: the example graph over nodes A-H before and after deleting one edge, with the reverse reachable tree of A (levels: A; B, C; E, B, D)]

The reverse reachable tree of A remains unchanged.
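
One way to picture the "tree remains unchanged" test is the Python sketch below. It is a simplification, not the talk's exact pruning rule: it assumes the tree is expanded along in-neighbors, so a changed edge x → y can alter the tree only if y is expanded at some level below l_max. The function names and the toy graph are hypothetical.

```python
def expanded_nodes(in_neighbors, source, l_max):
    """Nodes at levels 0 .. l_max - 1 of the tree rooted at `source`.

    These are the nodes whose in-neighbor lists are read while
    building a tree of depth l_max.
    """
    frontier = {source}
    seen = {source}
    for _ in range(l_max - 1):
        frontier = {p for n in frontier for p in in_neighbors.get(n, ())}
        seen |= frontier
    return seen

def edge_affects_tree(in_neighbors, source, l_max, edge):
    """True if adding or deleting edge (x, y) can change the tree of `source`.

    Call this with the graph as it was before the change.
    """
    _, y = edge
    return y in expanded_nodes(in_neighbors, source, l_max)

# Hypothetical snapshot: node -> list of in-neighbors.
snapshot = {
    "A": ["B", "C"], "B": ["E"], "C": ["B", "D"],
    "D": [], "E": ["H"], "H": ["G"], "G": [],
}
far_edge = edge_affects_tree(snapshot, "A", l_max=2, edge=("G", "H"))
near_edge = edge_affects_tree(snapshot, "A", l_max=2, edge=("E", "B"))
```

An edge into H is too far from A to matter at depth 2, while an edge into B changes the second level of the tree.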

SLIDE 17

CrashSim-T Algorithm --- Difference Pruning

  • Related area: the l_max-length reverse reachable trees of u and v
  • Difference pruning: filter out those nodes whose related area is unchanged

[Figure: the example graph over nodes A-H before and after adding one edge, with the reverse reachable trees of A and E]

The reverse reachable trees of A and E remain unchanged.

SLIDE 18

CrashSim-T Algorithm

  • Main idea
    • Check whether the conditions of delta pruning and difference pruning are satisfied
    • If so, exclude those nodes from the candidate nodes that need recomputation
    • Invoke the CrashSim algorithm to compute the SimRank between u and the residual nodes
    • Filter out the nodes that do not satisfy the query requirement
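
The steps above can be sketched as a loop over snapshots. This is an outline under assumptions, not the authors' implementation: `compute_scores`, `can_prune`, and `satisfies` are hypothetical callables standing in for CrashSim, the two pruning tests, and the query requirement. Pruned nodes simply carry their previous score forward. The demo stubs fake a graph as a dictionary of scores so that the control flow can be run end to end.

```python
def crashsim_t_sketch(snapshots, u, candidates, compute_scores, can_prune, satisfies):
    """Shrink the candidate set snapshot by snapshot.

    compute_scores(graph, u, nodes) -> {node: score} for the given nodes only.
    can_prune(prev_graph, graph, u, v) -> True if v needs no recomputation.
    satisfies(prev_score, score) -> True if v still meets the query.
    """
    omega = set(candidates)
    scores = compute_scores(snapshots[0], u, omega)
    omega = {v for v in omega if satisfies(None, scores[v])}
    for prev_graph, graph in zip(snapshots, snapshots[1:]):
        # Only nodes that fail the pruning tests are recomputed.
        residual = {v for v in omega if not can_prune(prev_graph, graph, u, v)}
        updated = dict(scores)
        updated.update(compute_scores(graph, u, residual))
        omega = {v for v in omega if satisfies(scores[v], updated[v])}
        scores = updated
    return omega

# Demo stubs: each fake "graph" directly stores the score of every node.
snapshots = [
    {"B": 0.40, "C": 0.30, "D": 0.10},
    {"B": 0.40, "C": 0.15, "D": 0.50},
    {"B": 0.45, "C": 0.50, "D": 0.50},
]
calls = []

def fake_compute(graph, u, nodes):
    calls.append(sorted(nodes))
    return {v: graph[v] for v in nodes}

def fake_prune(prev_graph, graph, u, v):
    return prev_graph[v] == graph[v]

theta = 0.2
result = crashsim_t_sketch(snapshots, "A", {"B", "C", "D"},
                           fake_compute, fake_prune,
                           lambda prev, cur: cur > theta)
```

In the demo only one node is recomputed per later snapshot, and a node that drops out (C, D) is never reconsidered, matching the observation that the candidate set can only shrink.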
SLIDE 19

Experiments and Conclusion

  • Experimental Evaluation
  • Conclusion
SLIDE 20

Experimental Evaluation

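  • Datasets
  • Comparison baselines
    • SLING (SIGMOD’16), ProbeSim (VLDB’17), READS (VLDB’17)
  • Settings and metrics
    • ε takes the values 0.0125, 0.025, 0.05, and 0.1
    • ME = max over v ∈ V of |s(u, v) − sim(u, v)|
    • Precision: the size of the intersection of two sets divided by a normalizing term (the symbols are not recoverable from the extracted slide text)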
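
The maximum error metric is simple to compute once estimated and reference scores are available. The Python sketch below uses made-up numbers; the function name `max_error` and both score dictionaries are assumptions for illustration.

```python
def max_error(estimated, exact):
    """Largest absolute gap between estimated and reference scores."""
    return max(abs(estimated[v] - exact[v]) for v in exact)

# Made-up single-source scores for a source u against three nodes.
estimated = {"B": 0.31, "C": 0.18, "D": 0.02}
exact = {"B": 0.30, "C": 0.20, "D": 0.00}
me = max_error(estimated, exact)
```

An algorithm meets an error bound ε on a query when this value does not exceed ε.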

SLIDE 21

Experimental Evaluation

[Figure: experimental results on five datasets: (a) AS-733, (b) AS-Caida, (c) Wiki-Vote, (d) HepTh, (e) HepPh]

SLIDE 22

Experimental Evaluation

The impact of the query interval on the response time of the algorithms

SLIDE 23

Conclusion

  • Propose the CrashSim algorithm, an index-free algorithm for single-source and partial SimRank computation in static graphs
  • Introduce CrashSim-T, an extension of CrashSim that answers SimRank queries over temporal graphs
  • Experiments show that both CrashSim and CrashSim-T outperform the state-of-the-art algorithms

SLIDE 24

Thanks.

Mo Li limo_neucse@hotmail.com