Discovering Graph Temporal Association Rules



slide-1
SLIDE 1

Discovering Graph Temporal Association Rules

Qi Song, Mohammad Hossein Namaki, Yinghui Wu, Peng Lin, Tingjian Ge*

Washington State University, *UMass Lowell

slide-2
SLIDE 2

Temporal association rules in networks

Ø Time-aware POI recommendation
Ø in P2 can be recommended as a point of interest for
Ø is a potential customer for

[Figure: rule R1 with left-hand-side event (LHS) P1 and right-hand-side event (RHS) P2 over user nodes (u, ū) and POI nodes (w, z), with edge labels check-in and retweet; the two events are at most 2 hours apart.]

Requirement: ARs with topological, semantic and temporal constraints

slide-3
SLIDE 3

Outline

Ø Graph temporal association rules (GTARs) definition
Ø GTARs discovery problem formalization
Ø A feasible GTAR discovery algorithm
Ø Experimental study: verify the effectiveness of GTARs and the efficiency of the GTAR discovery algorithm.

slide-4
SLIDE 4

Temporal Graph

Ø Temporal graph GT(V,E,L,T).
Ø Snapshot Gt: induced by the set of all edges associated with time stamp t.

[Figure: example snapshot G3]
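The snapshot definition can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the tuple layout (src, label, dst, timestamp), the function name snapshot, and the toy data are all assumptions made for the example.

```python
def snapshot(temporal_edges, t):
    """Return (nodes, edges) of the snapshot G_t.

    temporal_edges: iterable of (src, label, dst, timestamp) tuples,
    a hypothetical encoding of the time-stamped edge set.
    """
    # Keep only the edges stamped with t ...
    edges = {(s, lbl, d) for s, lbl, d, ts in temporal_edges if ts == t}
    # ... and the nodes those edges touch (edge-induced subgraph).
    nodes = {n for s, _, d in edges for n in (s, d)}
    return nodes, edges


# Toy temporal graph with three time stamps (made-up data).
toy_edges = [
    ("u1", "retweet", "u2", 1),
    ("u2", "check-in", "poi1", 2),
    ("u1", "retweet", "u3", 3),
    ("u1", "check-in", "poi1", 3),
]
nodes_3, edges_3 = snapshot(toy_edges, 3)
```

Storing one flat list of time-stamped edges keeps the temporal graph compact; each snapshot is derived on demand rather than stored separately.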

slide-5
SLIDE 5

Graph temporal association rules (GTAR)

Ø GTAR φ = (P1⇒P2, ū, Δt)

  Ø ū: common shared focus.
  Ø Δt: a constant that specifies a time interval.

Example: φ = (P1⇒P2, ū, Δt = 2 hours)

If there exists an occurrence of event P1 at an entity specified by ū at some time t, then it is likely that an event P2 occurs at the same entity within the time window [t, t + Δt].

[Figure: rule R2 with LHS event P1 and RHS event P2 over user nodes (u, ū) and POI nodes (w, z), with edge labels retweet and check-in; the two events are at most 2 hours apart.]

slide-6
SLIDE 6

Events and Matching

Ø Events

  Ø A connected subgraph pattern that carries a designated focus node.

Ø Event matching

  Ø An event P occurs in GT at time t if there is a matching relation Rt between P and snapshot Gt
  Ø Focus occurrence o(P, ū, t): the nodes in V that match ū under Rt
  Ø Example:

    Ø The matches of ū induced by R3 in G3 contain {(x1,3), (x2,3), (x3,3)}
    Ø o(P1, ū, 3) is {x1, x2, x3}

[Figure: event P1 (user u, user ū, POI w; edge label retweet) and one subgraph matching of P1]
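A focus occurrence can be computed by enumerating matches of the event in one snapshot and collecting the images of the focus node. The sketch below is only illustrative. The slides do not spell out the matching semantics, so it assumes plain subgraph isomorphism checked by brute force; the pattern (loosely modeled on P1), its edge directions, and the toy snapshot are hypothetical.

```python
from itertools import permutations


def focus_occurrence(pattern_nodes, pattern_edges, focus,
                     snapshot_nodes, snapshot_edges):
    """Snapshot nodes that the focus maps to under at least one match.

    pattern_nodes / snapshot_nodes: dict node -> label
    pattern_edges / snapshot_edges: set of (src, edge_label, dst)
    Matching is assumed to be injective and label-preserving.
    """
    p_vars = list(pattern_nodes)
    matched = set()
    # Brute force: try every injective assignment of pattern nodes.
    for image in permutations(snapshot_nodes, len(p_vars)):
        h = dict(zip(p_vars, image))
        if any(pattern_nodes[p] != snapshot_nodes[h[p]] for p in p_vars):
            continue  # node labels disagree
        if all((h[s], lbl, h[d]) in snapshot_edges
               for s, lbl, d in pattern_edges):
            matched.add(h[focus])
    return matched


# Hypothetical event: user u retweets focus user ū and checks in at POI w.
p_nodes = {"u": "user", "ubar": "user", "w": "POI"}
p_edges = {("u", "retweet", "ubar"), ("u", "check-in", "w")}

# Hypothetical snapshot G3.
g_nodes = {"a": "user", "c": "user", "d": "user", "p": "POI",
           "x1": "user", "x2": "user", "x3": "user", "x4": "user"}
g_edges = {("a", "retweet", "x1"), ("a", "retweet", "x2"),
           ("a", "check-in", "p"), ("c", "retweet", "x3"),
           ("c", "check-in", "p"), ("d", "retweet", "x4")}

occ = focus_occurrence(p_nodes, p_edges, "ubar", g_nodes, g_edges)
```

In this toy snapshot x4 is excluded because the user retweeting x4 has no check-in edge, so the result is {x1, x2, x3}. Brute-force enumeration is exponential in the pattern size and is only reasonable here because the patterns are tiny.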

slide-7
SLIDE 7

GTAR occurrence

Ø Given a time window [t1,t2], φ occurs if at least one node matches the focus of P1 at t1 and the focus of P2 at t2.
Ø A time window may contain multiple occurrences of a GTAR.
Ø Minimal occurrence

  Ø O(v) = [t1,t2] is an occurrence of φ in GT supported by node v
  Ø There exists no O’(v) ⊂ O(v) such that O’(v) is also an occurrence
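The minimal-occurrence condition can be made concrete for a single node v. The sketch below assumes that the time stamps at which v matches the focus of P1 and of P2 are already known, and that a window [t1, t2] counts as an occurrence when t1 ≤ t2 ≤ t1 + Δt (taken from the GTAR definition). The function names and sample times are made up for illustration.

```python
def occurrences(p1_times, p2_times, delta_t):
    """All windows (t1, t2) with P1 at t1 and P2 at t2 within delta_t."""
    return {(t1, t2)
            for t1 in p1_times
            for t2 in p2_times
            if t1 <= t2 <= t1 + delta_t}


def minimal_occurrences(p1_times, p2_times, delta_t):
    """Occurrences that do not strictly contain another occurrence."""
    occ = occurrences(p1_times, p2_times, delta_t)

    def strictly_inside(inner, outer):
        return (outer[0] <= inner[0] and inner[1] <= outer[1]
                and inner != outer)

    return sorted(w for w in occ
                  if not any(strictly_inside(o, w) for o in occ))


# Node v matches P1 at times 1, 3, 8 and P2 at times 2, 4, 5.
windows = minimal_occurrences([1, 3, 8], [2, 4, 5], delta_t=2)
```

Here [3, 5] is an occurrence but not a minimal one, because it contains the shorter occurrence [3, 4]. The pairwise comparison is quadratic in the number of occurrences, which is acceptable for a per-node illustration.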

slide-8
SLIDE 8

Support and Confidence

Ø Based on minimal occurrences O(φ, GT)
Ø Support:

  Supp(φ, GT) = |O(φ, GT)| / (|C(ū)| · |T|)

  Numerator: number of occurrences of this rule. Denominator: normalizer.

Ø Confidence: measures how likely P2 occurs within Δt time at the focus occurrence of P1

  Conf(φ, GT) = Supp(φ, GT) / Supp(P1, GT)

  Numerator: support of this rule. Denominator: support of the LHS.
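A small worked example shows how the two measures interact. It reads the support as an occurrence count divided by the normalizer |C(ū)| · |T|, and assumes the LHS support Supp(P1, GT) uses the same normalizer over the focus occurrences of P1. That assumption, the function names, and the numbers are illustrative and not taken from the slides.

```python
import math


def support(num_occurrences, num_candidates, num_timestamps):
    """Occurrence count divided by the normalizer |C(u)| * |T|."""
    return num_occurrences / (num_candidates * num_timestamps)


def confidence(rule_occurrences, lhs_occurrences,
               num_candidates, num_timestamps):
    """Supp(rule) / Supp(LHS); 0.0 when the LHS never occurs."""
    lhs = support(lhs_occurrences, num_candidates, num_timestamps)
    if lhs == 0:
        return 0.0
    rule = support(rule_occurrences, num_candidates, num_timestamps)
    return rule / lhs


# Made-up counts: 10 candidate focus nodes, 4 time stamps,
# 12 occurrences of the LHS event, 6 minimal occurrences of the rule.
supp_rule = support(6, 10, 4)
conf_rule = confidence(6, 12, 10, 4)
print(supp_rule, conf_rule)
```

Under this shared-normalizer assumption the normalizer cancels in the ratio, so the confidence reduces to the fraction of LHS occurrences that are followed by the RHS event within Δt.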

slide-9
SLIDE 9

GTAR Discovery

Informative GTARs

Ø Interested in GTARs with high support and confidence
Ø Maximal GTARs under a size bound are more informative
Ø In a b-maximal GTAR, both LHS and RHS have at most b edges.

The Discovery Problem

Ø Input: temporal graph GT, focus ū, time interval Δt, size bound b, support threshold σ, and confidence threshold θ
Ø Output: the set Σ of b-maximal GTARs pertaining to ū and Δt such that for each GTAR φ ∈ Σ, Supp(φ, GT) ≥ σ and Conf(φ, GT) ≥ θ.

slide-10
SLIDE 10

GTAR Discovery

Ø Integrate event mining and rule discovery as a single process
Ø Intuition:

  Ø LHS generation by best-first strategy.

    Ø Generate and verify the best new LHS events

  Ø RHS generation given a fixed LHS

    Ø Generate and validate new GTAR candidates by appending the best RHS events to verified LHS events.
    Ø RHS events with high support are preferred.

Conf(φ, GT) = Supp(φ, GT) / Supp(P1, GT)

Numerator: rule with high support. Denominator: LHS with low support.
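The best-first generation of events can be illustrated with a deliberately simplified stand-in. The sketch below is not the DisGTAR algorithm. It treats an event as a set of edge labels, starts from an empty seed (the focus node alone), uses a hypothetical anti-monotone support function over made-up observations, and expands the highest-support event first while skipping anything below σ or beyond b edges. All names and data are assumptions.

```python
import heapq
from itertools import count


def best_first_events(seed, extend, support, sigma, b):
    """Return (event, support) pairs with support >= sigma, best first."""
    tie = count()  # tie-breaker so the heap never compares events
    heap = [(-support(seed), next(tie), seed)]
    seen = {seed}
    verified = []
    while heap:
        neg_supp, _, event = heapq.heappop(heap)
        supp = -neg_supp
        if supp < sigma:
            continue  # below threshold: never report or extend it
        verified.append((event, supp))
        if len(event) >= b:
            continue  # size bound reached
        for child in extend(event):
            if child not in seen:
                seen.add(child)
                heapq.heappush(heap, (-support(child), next(tie), child))
    return verified


# Toy data: edge labels observed around four focus occurrences.
OBSERVATIONS = [
    {"retweet", "check-in"},
    {"retweet"},
    {"retweet", "check-in", "follow"},  # "follow" is a made-up label
    {"follow"},
]
LABELS = {"retweet", "check-in", "follow"}


def toy_support(event):
    """Fraction of observations that contain every label of the event."""
    return sum(event <= obs for obs in OBSERVATIONS) / len(OBSERVATIONS)


def toy_extend(event):
    """Grow the event by one label it does not contain yet."""
    return [event | {lbl} for lbl in sorted(LABELS - event)]


verified = best_first_events(frozenset(), toy_extend, toy_support,
                             sigma=0.5, b=2)
```

Because the toy support can only shrink when an event grows, an event that falls below σ cannot have a qualifying extension, which is the idea behind anti-monotone pruning. Real events are graph patterns verified by matching, so the extension and support steps are considerably more involved than in this toy.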

slide-11
SLIDE 11

GTAR Discovery

Ø GTAR discovery:

  • 1. event spawning
  • 2. event verification
  • 3. rule spawning
  • 4. rule validation

[Figure: discovery workflow with queue L and queue R, cycling through the four steps with backtracking; example events P1(ū), P’1(ū), P2, …, P7 over user and POI nodes with retweet and check-in edges.]

slide-12
SLIDE 12

Performance analysis and optimization

Ø Complexity:

  Ø Time: O(|T|N(b)(b+|V|)(b+|E|) + N(b)²|T|)
  Ø Space: O(N(b)|C(ū)||T|)
  Ø Size bound b is small in practice
  Ø Number of events N(b) is significantly reduced by pruning rules

Ø Optimization

  Ø Pruning rules: extend (conditional) anti-monotonicity to GTARs
  Ø Anytime performance: returning GTARs as the events are discovered
  Ø Batch matching: merge snapshots into one graph and perform a single matching

slide-13
SLIDE 13

Experimental Study

Ø Datasets

             #Nodes   #Edges   #Labels   #Snapshots
  Citation   4.3M     21.7M    273       80
  Panama     839k     3.6M     433       12k
  MovieLens  81.5k    10M      21        1439

Ø Algorithms

  Ø DisGTAR: our integrated algorithm, including all pruning rules
  Ø DisGTARn: DisGTAR without the pruning strategies (tests pruning)
  Ø IsoGTAR: isolates the snapshots and computes event matching over each snapshot one by one (tests batch matching)
  Ø SeqGTAR: separates event mining and rule discovery into two independent processes (tests integrated mining)

slide-14
SLIDE 14

Performance of GTAR discovery

             DisGTAR             DisGTARn            SeqGTAR             IsoGTAR
             Time(s)  # verif.   Time(s)  # verif.   Time(s)  # verif.   Time(s)  # verif.
  Panama     9        1,194      276      8,393      560      8,393      N/A      N/A
  Citation   22       157        994      12,507     1,621    12,507     12,721   11,461
  MovieLens  558      191        2,432    1,423      2,445    1,423      N/A      N/A

DisGTAR outperforms DisGTARn, SeqGTAR, and IsoGTAR by 6.28, 7.85 and 64.79 times on average.

slide-15
SLIDE 15

Anytime performance

[Figure: Time vs. Accuracy (Citation) and Time vs. Accuracy (Panama); annotated times: 18 seconds, 8 seconds]

anytime quality(t) = Σ_{φ∈Σt} Conf(φ, GT) / Σ_{φ∈Σ*} Conf(φ, GT)

DisGTAR converges with high quality GTARs much faster than SeqGTAR
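The anytime quality measure can be evaluated directly once each rule's discovery time is logged. The sketch below reads the measure as a ratio of summed confidences. It assumes Σt is the set of rules returned by time t, Σ* is the final rule set, and Σt is a subset of Σ*; the slide does not state these readings explicitly, and the function name and sample log are hypothetical.

```python
import math


def anytime_quality(rule_log, t):
    """Summed confidence of rules found by time t over the final total.

    rule_log: list of (discovery_time, confidence), one entry per rule
    in the final set (assumed to contain every rule returned earlier).
    """
    total = sum(conf for _, conf in rule_log)
    if total == 0:
        return 0.0
    found = sum(conf for ts, conf in rule_log if ts <= t)
    return found / total


# Made-up log: three rules discovered after 1, 3 and 7 seconds.
log = [(1, 0.8), (3, 0.6), (7, 0.6)]
quality_at_3 = anytime_quality(log, 3)
```

With this log the quality after 3 seconds is 1.4 / 2.0 = 0.7, and it reaches 1.0 once the last rule has been returned.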

slide-16
SLIDE 16

Scalability of DisGTAR

[Figure: two plots on synthetic data: Varying |G|, # of edges (Synthetic); Varying |T| (Synthetic)]

DisGTAR is less sensitive to |G|
DisGTAR is much less sensitive than IsoGTAR
The “packing” of consecutive timestamps to time intervals
Pruning rules

slide-17
SLIDE 17

Case Study

Matches: Prof. Christopher Manning (Stanford Univ.)
Matches: F.Geneve Project Management

slide-18
SLIDE 18

Conclusion and future work

Conclusion

Ø We have proposed a class of temporal association rules over graphs
Ø We have studied the discovery problem of GTARs
Ø Despite the enhanced expressive power of GTARs, it is feasible to find and apply GTARs in practice.

Future work

Ø Extending GTARs to multi-focus and exploring other quality metrics
Ø Fast online discovery of GTARs over graph streams.

slide-19
SLIDE 19

Thank you!

Related Work

Ø Event Pattern Discovery by Keywords in Graph Streams (BigData’17)
Ø BEAMS: Bounded Event Detection in Graph Streams (ICDE’16) (http://eecs.wsu.edu/~ksasani/BEAMS/Display.php)