Regularities and dynamics in bisimulation reductions of big graphs



SLIDE 1

Where innovation starts

Regularities and dynamics in bisimulation reductions of big graphs

Yongming Luo, George Fletcher, Jan Hidders, Paul De Bra and Yuqing Wu GRADES 2013 • SIGMOD/PODS 2013

The research of YL, GF, JH and PD is supported by the Netherlands Organisation for Scientific Research. The research of YW is supported by Research Foundation Flanders during her sabbatical visit to Hasselt University, Belgium.

SLIDE 2

2/11 department of mathematics and computer science

Outline

◮ Motivation
◮ Experimental setup
◮ Results
◮ Insights

Figure: An example of bisimulation reduction (nodes 1 to 8 grouped into partition blocks P1, P2, P3, P4)

SLIDE 4

Bisimulation reduction

◮ Bisimulation partitioning is an important concept in many fields (computer science, modal logic, etc.), including database research (structural indexes, graph reduction)
◮ It can be seen as a way of clustering nodes

Figure: Bisimulation partition example (nodes 1 to 8, partition blocks P1 to P4); partition block graph (reduction graph) {P2 ↔ P1 → P3 → P4}

◮ Reduce graph size while preserving structural properties (e.g., reachability)
◮ Result can be seen as a graph
◮ Many algorithms, no work on analyzing the results
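
A minimal Python sketch of how a reduction graph can be derived once a partition is known: every node is replaced by its partition block and duplicate edges collapse. The 8-node edge list and the node-to-block assignment below are made up for illustration (they are not the edges of the figure), and `reduction_graph` is a hypothetical helper name.

```python
def reduction_graph(edges, block_of):
    """Map every edge (u, v) to (block of u, block of v) and deduplicate."""
    return {(block_of[u], block_of[v]) for u, v in edges}


# Hypothetical input: 8 nodes, 4 partition blocks (assumed, not taken from the figure).
block_of = {1: "P1", 2: "P1", 3: "P2", 4: "P2",
            5: "P3", 6: "P3", 7: "P4", 8: "P4"}
edges = [(1, 3), (2, 4), (3, 1), (4, 2), (1, 5), (2, 6), (5, 7), (6, 8)]

block_edges = reduction_graph(edges, block_of)
for src, dst in sorted(block_edges):
    print(src, "->", dst)
```

With this assumed input, 8 nodes and 8 edges collapse to 4 blocks and 4 block edges, with the same shape as the reduction graph {P2 ↔ P1 → P3 → P4}.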

SLIDE 7

Questions

Regularities, such as power-law distributions, exist in real graphs.

◮ Do graphs under bisimulation reduction also have such properties?
◮ How would that knowledge help us?

SLIDE 8

Experimental setup for investigation

◮ Big graphs, from 1 million to 1.4 billion edges (Twitter, DBPedia, etc.)
◮ One dynamic social graph, from 17 million to 33 million edges (Flickr-grow)
◮ State-of-the-art I/O-efficient algorithm for computing bisimulation reductions (k-bisim, k = 10)
◮ We use the cumulative distribution function (CDF) to present distributions
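
The result plots label their y-axis "cumulative % ... with ≥ x", i.e. the fraction of items whose value is at least x. A small Python sketch of that computation follows; the function name `ccdf` and the sample block sizes are assumptions for illustration, not data from the experiments.

```python
from collections import Counter


def ccdf(values):
    """Return sorted (x, fraction of values >= x) pairs."""
    counts = Counter(values)
    total = len(values)
    remaining = total  # number of values >= current x
    points = []
    for x in sorted(counts):
        points.append((x, remaining / total))
        remaining -= counts[x]
    return points


# Made-up partition block sizes (number of nodes per block).
block_sizes = [1, 1, 1, 1, 2, 2, 3, 8]
points = ccdf(block_sizes)
for x, frac in points:
    print(x, frac)
```

On log-log axes, a power-law distribution shows up as an approximately straight line in such a plot.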

SLIDE 9

Regularities - bisimulation result

Power-law distributions also appear in many attributes of the bisimulation partition results for real graphs, but not for synthetic graphs.

SLIDE 10

Regularities - bisimulation result

Partition block size distribution

Figure: Log-log plots in two panels (real graphs, synthetic graphs). x-axis: x (# of nodes per PB); y-axis: cumulative % of PB with ≥ x. Datasets: Jamendo, LinkedMDB, DBLP, DBPedia, WikiLinks, Twitter, Flickr-Grow, BSBM, SP2B, Power, Random.

SLIDE 11

Regularities - bisimulation result

Bisimulation graph in/out-degree distribution

Figure: Log-log plots in two panels (real graphs, synthetic graphs). x-axis: x (in-degree); y-axis: cumulative % of Nk with ≥ x. Datasets: Jamendo, LinkedMDB, DBLP, DBPedia, WikiLinks, Twitter, Flickr-Grow, BSBM, SP2B, Power, Random.

SLIDE 15

Dynamics - a real growing social graph

◮ Does the bisimulation result grow when the original graph grows?
  • Yes.
◮ How fast does it grow?
  • Linearly with respect to the original graph.

Figure: Growth plot with series |N|, |Nk|, |E|, |Ek|
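
One way to check a "grows linearly" claim on a series of snapshots is an ordinary least-squares fit of the reduced size against the original size. The sketch below is illustrative only: `linear_fit` is a hypothetical helper, and the snapshot numbers are synthetic placeholders, not measurements from Flickr-grow.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic placeholder snapshots: |E| of the original graph and |Ek| of the reduction.
original_edges = [10, 20, 30, 40]
reduced_edges = [5, 10, 15, 20]

slope, intercept, r2 = linear_fit(original_edges, reduced_edges)
print(slope, intercept, r2)
```

An R² close to 1 on real snapshots would be consistent with linear growth; the slope then gives the growth of the reduction per added edge of the original graph.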

SLIDE 18

Insights

◮ Power-law distributions in bisimulation results ⇒ skew expected in applications (indexes, data partitioned among machines, ...)
◮ Behaviors of graph generators ⇒ some more work needs to be done for graph generators
◮ Bisimulation result/graph grows ⇒ lower k or other adaptations (e.g., choose different k for different parts of the graph, different node/edge labeling)

SLIDE 19

Thank you! Q&A

For more information, just google seeqr project or visit: bit.ly/seeqr
SLIDE 21

Definition of k-bisimilar

Definition

Let $k$ be a non-negative integer and $G = \langle N, E, \lambda_N, \lambda_E \rangle$ be a graph. Nodes $u, v \in N$ are called k-bisimilar (denoted as $u \approx^{k} v$), iff the following holds:

  1. $\lambda_N(u) = \lambda_N(v)$,
  2. if $k > 0$, then for any edge $(u, u') \in E$, there exists an edge $(v, v') \in E$, such that $u' \approx^{k-1} v'$ and $\lambda_E(u, u') = \lambda_E(v, v')$, and
  3. if $k > 0$, then for any edge $(v, v') \in E$, there exists an edge $(u, u') \in E$, such that $v' \approx^{k-1} u'$ and $\lambda_E(v, v') = \lambda_E(u, u')$.

Figure: Example graph with nodes 1 and 2 labeled M, nodes 3, 4, 5 and 6 labeled P, and edges labeled w or l.

In this example graph, nodes 1 and 2 are 0- and 1-bisimilar but not 2-bisimilar.
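
The definition can be turned into a simple in-memory refinement loop: at each level, a node's signature is its own label plus the set of (edge label, previous-level block of the target) pairs, and nodes with equal signatures share a block. The Python sketch below is a naive illustration of that idea, not the I/O-efficient algorithm used in the experiments. The function names are hypothetical, and the edge set is made up (it reuses the node labels of the example but not its edges) so that nodes 1 and 2 again come out 0- and 1-bisimilar but not 2-bisimilar.

```python
def canonical_ids(signature_of):
    """Assign a small integer block id to each distinct signature."""
    ids = {}
    return {u: ids.setdefault(sig, len(ids)) for u, sig in signature_of.items()}


def k_bisimulation(nodes, edges, node_label, k):
    """Return {node: block id} for the k-bisimulation partition.

    edges is a list of (source, edge_label, target) triples.
    """
    nodes = list(nodes)
    out = {u: [] for u in nodes}
    for u, lbl, v in edges:
        out[u].append((lbl, v))

    # Level 0: only node labels matter (condition 1 of the definition).
    block = canonical_ids({u: node_label[u] for u in nodes})
    for _ in range(k):
        # Conditions 2 and 3: equal sets of (edge label, block of target at level k-1).
        signature = {
            u: (node_label[u], frozenset((lbl, block[v]) for lbl, v in out[u]))
            for u in nodes
        }
        block = canonical_ids(signature)
    return block


# Assumed example graph (labels as in the slide, edges invented for illustration).
labels = {1: "M", 2: "M", 3: "P", 4: "P", 5: "P", 6: "P"}
graph_edges = [(1, "w", 3), (1, "l", 4), (2, "w", 5), (2, "l", 6), (3, "l", 4)]

for k in range(3):
    b = k_bisimulation(labels, graph_edges, labels, k)
    print(k, b[1] == b[2])
```

In this assumed graph, node 3 has an outgoing edge while node 5 has none, so they separate at level 1, which in turn separates nodes 1 and 2 at level 2.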

SLIDE 22

Regularities - original graphs

Power-law distributions appear in the in/out-degree distributions of most of the examined graphs.

Figure: Log-log plots in two panels (real graphs, synthetic graphs). x-axis: x (in-degree); y-axis: cumulative % of nodes with ≥ x. Datasets: Jamendo, LinkedMDB, DBLP, DBPedia, WikiLinks, Twitter, Flickr-Grow, BSBM, SP2B, Power, Random.

SLIDE 23

Signature length

Figure: Log-log plots in two panels (real graphs, synthetic graphs). x-axis: x (signature length); y-axis: cumulative % of nodes with ≥ x. Datasets: Jamendo, LinkedMDB, DBLP, DBPedia, WikiLinks, Twitter, Flickr-Grow, BSBM, SP2B, Power, Random.

SLIDE 24

Out-degree

Figure: Log-log plots in two panels (real graphs, synthetic graphs). x-axis: x (out-degree); y-axis: cumulative % of Nk with ≥ x. Datasets: Jamendo, LinkedMDB, DBLP, DBPedia, WikiLinks, Twitter, Flickr-Grow, BSBM, SP2B, Power, Random.