

SLIDE 1

CS224W: Analysis of Networks Jure Leskovec, Stanford University

http://cs224w.stanford.edu

SLIDE 2

SLIDE 3

9/29/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, cs224w.stanford.edu 3

Degree distribution: P(k) Path length: h Clustering coefficient: C Connected components: s

Definitions will be presented for undirected graphs; sometimes we will explicitly mention extensions to directed graphs, and sometimes the extensions will be obvious.

SLIDE 4

• Degree distribution P(k): probability that a randomly chosen node has degree k
  - N_k = # nodes with degree k
• Normalized histogram: P(k) = N_k / N → plot

[Figure: normalized degree histogram, P(k) vs. k]

For directed graphs we have separate in- and out-degree distributions.
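As a concrete sketch (not from the slides; the helper name is made up), P(k) can be computed directly from an edge list:

```python
from collections import Counter

def degree_distribution(edges, n):
    """Normalized degree histogram: P(k) = N_k / N."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Count how many of the n nodes have each degree (missing nodes have degree 0).
    hist = Counter(deg[i] for i in range(n))
    return {k: cnt / n for k, cnt in sorted(hist.items())}

# Toy graph: a triangle (0-1-2) with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(degree_distribution(edges, 4))  # {1: 0.25, 2: 0.5, 3: 0.25}
```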

SLIDE 5

• A path is a sequence of nodes in which each node is linked to the next one
• A path can intersect itself and pass through the same edge multiple times
  - E.g.: ACBDCDEG

  P_n = {i_0, i_1, i_2, ..., i_n}
  P_n = {(i_0, i_1), (i_1, i_2), (i_2, i_3), ..., (i_{n-1}, i_n)}

[Figure: example graph on nodes A-H]

SLIDE 6

• Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes
  - If the two nodes are not connected, the distance is usually defined as infinite (or zero)
• In directed graphs, paths need to follow the direction of the arrows
  - Consequence: distance is not symmetric: h_B,C ≠ h_C,B

[Figure: undirected example h_B,D = 2, h_A,X = ∞; directed example h_B,C = 1, h_C,B = 2]
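A minimal BFS sketch (toy adjacency lists, hypothetical names) showing that in a directed graph the distance need not be symmetric:

```python
from collections import deque

def distance(adj, src, dst):
    """Number of edges on a shortest path from src to dst; inf if unreachable."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return float("inf")

# Tiny directed cycle A -> B -> C -> A: going "with" vs. "against" the arrows.
adj = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(distance(adj, "B", "C"))  # 1
print(distance(adj, "C", "B"))  # 2
```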

SLIDE 7

• Diameter: the maximum (shortest path) distance between any pair of nodes in a graph
• Average path length for a connected graph or a strongly connected directed graph
  - Many times we compute the average only over the connected pairs of nodes (that is, we ignore "infinite" length paths)
  - Note that this measure also applies to (strongly) connected components of a graph

  h̄ = (1 / (2·E_max)) · Σ_{i≠j} h_ij

  • h_ij is the distance from node i to node j
  • E_max is the max number of edges (total number of node pairs) = n·(n−1)/2
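Both quantities can be computed by running BFS from every node; a small sketch following the convention above (disconnected pairs are ignored):

```python
from collections import deque
from itertools import combinations

def bfs_dists(adj, src):
    """Shortest-path distances from src to every reachable node."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def diameter_and_avg_path(adj):
    """Max and mean distance over connected unordered pairs of nodes."""
    dists = []
    for s, t in combinations(adj, 2):
        d = bfs_dists(adj, s).get(t)
        if d is not None:  # skip "infinite" (disconnected) pairs
            dists.append(d)
    return max(dists), sum(dists) / len(dists)

# Path graph 0-1-2-3: pair distances are 1,2,3,1,2,1.
diam, h_bar = diameter_and_avg_path({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})
print(diam, round(h_bar, 3))  # 3 1.667
```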

SLIDE 8

• Clustering coefficient (for undirected graphs):
  - How connected are i's neighbors to each other?
  - Node i with degree k_i
  - C_i ∈ [0, 1]

  C_i = 2·e_i / (k_i·(k_i − 1)), where e_i is the number of edges between the neighbors of node i

• Average clustering coefficient:

  C = (1/N) · Σ_i C_i

The clustering coefficient is undefined (or defined to be 0) for nodes with degree 0 or 1. Note that k_i·(k_i − 1) is twice the max number of edges between the k_i neighbors.

SLIDE 9

• Clustering coefficient (for undirected graphs):
  - How connected are i's neighbors to each other?
  - Node i with degree k_i

  C_i = 2·e_i / (k_i·(k_i − 1)), where e_i is the number of edges between the neighbors of node i

[Figure: example graph on nodes A-H]
  k_B = 2, e_B = 1, C_B = 2/2 = 1
  k_D = 4, e_D = 2, C_D = 4/12 = 1/3

  • Avg. clustering: C=0.33
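A sketch of the formula on a hypothetical graph wired to reproduce the slide's numbers (C_B = 1, C_D = 1/3); names and wiring are made up for illustration:

```python
from itertools import combinations

def clustering(adj, i):
    """C_i = 2*e_i / (k_i*(k_i - 1)); defined here as 0 for degree < 2."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # e_i: count edges among i's neighbors.
    e = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

adj = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F", "G"], "E": ["D", "F"], "F": ["D", "E", "G"],
    "G": ["D", "F"],
}
print(clustering(adj, "B"), round(clustering(adj, "D"), 3))  # 1.0 0.333
```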
SLIDE 10

• Size of the largest connected component
  - Largest set where any two vertices can be joined by a path
• Largest component = giant component

How to find connected components:
  • Start from a random node and perform Breadth-First Search (BFS)
  • Label the nodes that BFS visits
  • If all nodes are visited, the network is connected
  • Otherwise find an unvisited node and repeat BFS
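The repeated-BFS procedure above, sketched in Python (hypothetical names, toy graph):

```python
from collections import deque

def connected_components(adj):
    """Label nodes by repeated BFS until every node has been visited."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    q.append(v)
        seen |= comp
        comps.append(comp)
    return comps

# Two components: a triangle {0,1,2} and an isolated edge {3,4}.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}
print(sorted(len(c) for c in connected_components(adj)))  # [2, 3]
```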


SLIDE 11

Degree distribution: P(k) Path length: h Clustering coefficient: C Connected components: s

SLIDE 12

SLIDE 13

MSN Messenger:
• 1 month of activity
  - 245 million users logged in
  - 180 million users engaged in conversations
  - More than 30 billion conversations
  - More than 255 billion exchanged messages

SLIDE 14

SLIDE 15

Network: 180M people, 1.3B edges

SLIDE 16

Messaging as an undirected graph:
  • Edge (u, v) if users u and v exchanged at least 1 message
  • N = 180 million people
  • E = 1.3 billion edges

[Figure: contact list vs. conversation graph]
SLIDE 17

SLIDE 18

Note: We plotted the same data as on the previous slide, just the axes are now logarithmic.

SLIDE 19

C_k: the average C_i of nodes i of degree k:

  C_k = (1/N_k) · Σ_{i : k_i = k} C_i

• Avg. clustering of the MSN: C = 0.1140

SLIDE 20

SLIDE 21

Number of links between pairs of nodes in the largest connected component

  • Avg. path length 6.6

90% of the nodes can be reached in < 8 hops

Steps | #Nodes
0 | 1
1 | 10
2 | 78
3 | 3,96
4 | 8,648
5 | 3,299,252
6 | 28,395,849
7 | 79,059,497
8 | 52,995,778
9 | 10,321,008
10 | 1,955,007
11 | 518,410
12 | 149,945
13 | 44,616
14 | 13,740
15 | 4,476
16 | 1,542
17 | 536
18 | 167
19 | 71
20 | 29
21 | 16
22 | 10
23 | 3
24 | 2
25 | 3

# nodes as we do BFS out of a random node

SLIDE 22

MSN network summary:
  • Degree distribution: heavily skewed; avg. degree = 14.4
  • Path length: 6.6
  • Clustering coefficient: 0.11
  • Connectivity: giant component

Are these values "expected"? Are they "surprising"? To answer this we need a model!

SLIDE 23

• (a) Undirected network: N = 2,018 proteins as nodes, E = 2,930 binding interactions as links
• (b) Degree distribution: skewed; average degree ⟨k⟩ = 2.90
• (c) Diameter: avg. path length = 5.8
• (d) Clustering: avg. clustering = 0.12
• Connectivity: 185 components; the largest component has 1,647 nodes (81% of nodes)

SLIDE 24

SLIDE 25

• Erdős–Rényi random graphs [Erdős–Rényi, '60]
• Two variants:
  - G_np: undirected graph on n nodes where each edge (u, v) appears i.i.d. with probability p
  - G_nm: undirected graph with n nodes and m edges picked uniformly at random

What kind of networks do such models produce?
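A minimal G_np sampler (the function name is made up), flipping one i.i.d. coin per potential edge:

```python
import random

def gnp(n, p, seed=None):
    """One realization of G_np: each of the n*(n-1)/2 edges appears i.i.d. w.p. p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

# The slide's parameters n=10, p=1/6: different seeds give different realizations.
print(len(gnp(10, 1/6, seed=0)), len(gnp(10, 1/6, seed=1)))
```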

SLIDE 26

• n and p do not uniquely determine the graph!
  - The graph is the result of a random process
• We can have many different realizations given the same n and p

[Figure: several realizations of G_np with n = 10, p = 1/6]

SLIDE 27

Degree distribution P(k), path length h, clustering coefficient C: what are the values of these properties for G_np?

SLIDE 28

• Fact: the degree distribution of G_np is binomial.
• Let P(k) denote the fraction of nodes with degree k:

  P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)

  - C(n−1, k): select k nodes out of n−1
  - p^k: probability of having k edges
  - (1−p)^(n−1−k): probability of missing the rest of the n−1−k edges

Mean and variance of a binomial distribution:

  k̄ = p·(n−1)
  σ² = p·(1−p)·(n−1)
  σ/k̄ = [ (1−p)/p · 1/(n−1) ]^(1/2) ≈ 1/(n−1)^(1/2)

By the law of large numbers, as the network size increases the distribution becomes increasingly narrow: we are increasingly confident that the degree of a node is in the vicinity of k̄.
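An empirical sanity check (a sketch; the parameters are chosen arbitrarily) that the mean degree of a G_np sample matches the binomial mean p(n−1):

```python
import random

def gnp_degrees(n, p, seed=0):
    """Degree sequence of one G_np realization."""
    rng = random.Random(seed)
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                deg[u] += 1
                deg[v] += 1
    return deg

n, p = 2000, 0.01
mean = sum(gnp_degrees(n, p)) / n
print(mean, p * (n - 1))  # empirical mean degree vs. binomial mean 19.99
```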

SLIDE 29

• Remember: C_i = 2·e_i / (k_i·(k_i − 1)), where e_i is the number of edges between i's neighbors
• Edges in G_np appear i.i.d. with prob. p
• So, the expected e_i is:

  E[e_i] = p · k_i·(k_i − 1)/2

  (k_i·(k_i − 1)/2 is the number of distinct pairs of neighbors of node i of degree k_i; each pair is connected with prob. p)

• Then E[C_i]:

  E[C_i] = p · k_i·(k_i − 1) / (k_i·(k_i − 1)) = p = k̄/(n−1) ≈ k̄/n

The clustering coefficient of a random graph is small. If we generate bigger and bigger graphs with fixed avg. degree k̄ (that is, we set p = k̄ · 1/n), then C decreases with the graph size n.
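A quick empirical check (a sketch with arbitrary parameters) that the average clustering coefficient of G_np comes out near p:

```python
import random
from itertools import combinations

def avg_clustering_gnp(n, p, seed=0):
    """Average C_i over nodes with degree >= 2 in one G_np realization."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    cs = []
    for i in range(n):
        k = len(adj[i])
        if k < 2:
            continue
        e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
        cs.append(2 * e / (k * (k - 1)))
    return sum(cs) / len(cs)

print(avg_clustering_gnp(500, 0.05))  # close to p = 0.05
```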

SLIDE 30

• Degree distribution: P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)
• Clustering coefficient: C = p = k̄/n
• Path length: next!
• Connectivity:

SLIDE 31

• Graph G(V, E) has expansion α if, for all S ⊆ V:
  # of edges leaving S ≥ α · min(|S|, |V \ S|)
• Or equivalently:

  α = min over S ⊆ V of (# edges leaving S) / min(|S|, |V \ S|)
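For intuition only, α can be brute-forced on tiny graphs straight from this definition (exponential in n, so purely illustrative):

```python
from itertools import combinations

def expansion(adj):
    """alpha = min over nonempty proper S of (#edges leaving S) / min(|S|, |V\\S|)."""
    nodes = list(adj)
    best = float("inf")
    for r in range(1, len(nodes)):
        for subset in combinations(nodes, r):
            S = set(subset)
            cut = sum(1 for u in S for v in adj[u] if v not in S)
            best = min(best, cut / min(len(S), len(nodes) - len(S)))
    return best

# 4-cycle 0-1-2-3: the worst cut splits it into two adjacent pairs (2 edges cut).
print(expansion({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}))  # 1.0
```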

SLIDE 32

• Fact: in a graph on n nodes with expansion α, for all pairs of nodes there is a path of length O((log n)/α).
• Random graph G_np: for log n > np > c, diam(G_np) = O(log n / log(np))
  - Random graphs have good expansion, so it takes a logarithmic number of steps for BFS to visit all nodes

SLIDE 33

An Erdős–Rényi random graph can grow very large, but nodes will be just a few hops apart.

[Figure: average shortest path length vs. number of nodes (200,000 to 1,000,000); here n·p = constant, that is, the avg. degree k̄ is constant]

SLIDE 34

• Degree distribution: P(k) = C(n−1, k) · p^k · (1−p)^(n−1−k)
• Path length: O(log n)
• Clustering coefficient: C = p = k̄/n
• Connected components: next!

SLIDE 35

• Graph structure of G_np as p changes
• Emergence of a giant component:
  - avg. degree k̄ = 2E/n, or p = k̄/(n−1)
  - k̄ = 1 − ε: all components are of size Ω(log n)
  - k̄ = 1 + ε: 1 component of size Ω(n), the others have size Ω(log n); each node has at least one edge in expectation

Regimes as p grows from 0 (empty graph) to 1 (complete graph):
  • p = 1/(n−1): avg. deg = 1; the giant component appears
  • p = c/(n−1): avg. deg. constant; lots of isolated nodes
  • p = log(n)/(n−1): fewer isolated nodes
  • p = 2·log(n)/(n−1): no isolated nodes
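The phase transition can be observed empirically; a sketch (arbitrary n and seed) measuring the fraction of nodes in the largest component as k̄ crosses 1:

```python
import random
from collections import deque

def giant_fraction(n, k_avg, seed=0):
    """Fraction of nodes in the largest component of G_np with p = k_avg/(n-1)."""
    rng = random.Random(seed)
    p = k_avg / (n - 1)
    adj = {u: [] for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    # Repeated BFS to find the largest component.
    seen, best = set(), 0
    for s in range(n):
        if s in seen:
            continue
        comp, q = {s}, deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    q.append(v)
        seen |= comp
        best = max(best, len(comp))
    return best / n

for k in (0.5, 1.5, 3.0):  # subcritical, supercritical, well above threshold
    print(k, giant_fraction(1000, k))
```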

SLIDE 36

• G_np, n = 100,000, k̄ = p(n−1) = 0.5 … 3

[Figure: fraction of nodes in the largest component as a function of k̄; the giant component emerges at p·(n−1) = 1]

SLIDE 37

MSN vs. G_np (n = 180M):

  Property              | MSN            | G_np
  Degree distribution   | heavily skewed | binomial
  Avg. path length      | 6.6            | O(log n): h ≈ 8.2
  Avg. clustering coef. | 0.11           | k̄/n: C ≈ 8·10⁻⁸
  Largest conn. comp.   | 99%            | GCC exists when k̄ > 1; here k̄ ≈ 14

SLIDE 38

• Are real networks like random graphs?
  - Giant connected component: ☺
  - Average path length: ☺
  - Clustering coefficient: ☹
  - Degree distribution: ☹
• Problems with the random graph model:
  - Degree distribution differs from that of real networks
  - Giant component in most real networks does NOT emerge through a phase transition
  - No local structure: clustering coefficient is too low
• Most important: are real networks random?
  - The answer is simply: NO!

SLIDE 39

• If G_np is wrong, why did we spend time on it?
  - It is the reference model for the rest of the class
  - It will help us calculate many quantities that can then be compared to the real data
  - It will help us understand to what degree a particular property is the result of some random process

So, while G_np is WRONG, it will turn out to be extremely USEFUL!

SLIDE 40

Can we have high clustering while also having short paths?

[Figure: high clustering coefficient, high diameter vs. low clustering coefficient, low diameter]

SLIDE 41

• MSN network has 7 orders of magnitude larger clustering than the corresponding G_np!
• Other examples (h = average shortest path length; C = average clustering coefficient; "actual" = real network, "random" = random graph with the same avg. degree):
  - Actor collaborations (IMDB): N = 225,226 nodes, avg. degree k̄ = 61
  - Electrical power grid: N = 4,941 nodes, k̄ = 2.67
  - Network of neurons (C. elegans): N = 282 nodes, k̄ = 14

  Network     | h_actual | h_random | C_random
  Film actors | 3.65     | 2.99     | 0.00027
  Power grid  | 18.70    | 12.40    | 0.005
  C. elegans  | 2.65     | 2.25     | 0.05

SLIDE 42

• Consequence of expansion:
  - Short paths: O(log n)
  - This is the smallest diameter we can get if we keep the degree constant
  - But clustering is low!
• But networks have "local" structure:
  - Triadic closure: a friend of a friend is my friend
  - High clustering, but diameter is also high
• How can we have both?

[Figure: low diameter / low clustering coefficient vs. high clustering coefficient / high diameter]

SLIDE 43

• Could a network with high clustering also be small-world (have log n diameter)?
  - How can we at the same time have high clustering and small diameter?
  - Clustering implies edge "locality"
  - Randomness enables "shortcuts"

SLIDE 44

Small-world model [Watts–Strogatz '98]. Two components to the model:
• (1) Start with a low-dimensional regular lattice
  - (In our case we are using a ring as the lattice)
  - Has a high clustering coefficient
• (2) Rewire: introduce randomness ("shortcuts")
  - Add/remove edges to create shortcuts that join remote parts of the lattice
  - For each edge, with prob. p, move the other endpoint to a random node
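A sketch of the rewiring procedure (simplified: a rewire that would create a self-loop or duplicate just keeps the original edge, so the edge count can drop slightly below n·k/2):

```python
import random

def watts_strogatz(n, k, p, seed=0):
    """Ring lattice (each node linked to its k nearest neighbors, k even),
    then each edge's far endpoint is moved to a random node with prob. p."""
    rng = random.Random(seed)
    edges = set()
    for u in range(n):
        for j in range(1, k // 2 + 1):
            edges.add((u, (u + j) % n))
    rewired = set()
    for u, v in list(edges):
        if rng.random() < p:
            w = rng.randrange(n)
            if w != u and (u, w) not in rewired and (w, u) not in rewired:
                rewired.add((u, w))  # shortcut to a random node
                continue
        rewired.add((u, v))          # keep the lattice edge
    return rewired

g = watts_strogatz(20, 4, 0.1)
print(len(g))  # at most n*k/2 = 40 edges
```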

SLIDE 45

Rewiring allows us to "interpolate" between a regular lattice and a random graph [Watts–Strogatz '98]:

• (1) Regular ring lattice: high clustering, high diameter (h = n/(2k̄), C = 3/4)
• (2) Random graph: low clustering, low diameter (h = log n / log k̄, C = k̄/n)

SLIDE 46

[Figure: clustering coefficient and (scaled) average path length vs. probability of rewiring p; there is a parameter region with high clustering and low path length]

Intuition: it takes a lot of randomness to ruin the clustering, but a very small amount to create shortcuts.

SLIDE 47

• Could a network with high clustering be at the same time a small world?
  - Yes! You don't need more than a few random links
• The Watts–Strogatz model:
  - Provides insight on the interplay between clustering and the small-world property
  - Captures the structure of many realistic networks
  - Accounts for the high clustering of real networks
  - Does not lead to the correct degree distribution

SLIDE 48

Generating large realistic graphs

SLIDE 49

• How can we think of network structure recursively? Intuition: self-similarity
  - An object is similar to a part of itself: the whole has the same shape as one or more of the parts
• Mimic recursive graph/community growth
• The Kronecker product is a way of generating self-similar matrices

[Figure: initial graph and its recursive expansion]

SLIDE 50

• Kronecker graphs: a recursive model of network structure [PKDD '05]

[Figure: adjacency matrices of the Kronecker sequence: K_1 (3 × 3), 9 × 9, and 81 × 81]

SLIDE 51

• The Kronecker product of matrices A and B is given by

  A ⊗ B = | a_11·B … a_1M·B |
          |  ⋮    ⋱    ⋮   |
          | a_N1·B … a_NM·B |

  - If A is N × M and B is K × L, then A ⊗ B is (N·K) × (M·L)
• Define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices
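The definition in code, as a pure-Python sketch on nested lists (the function name is made up; the 2 × 2 initiator here matches the one used later in the lecture):

```python
def kron(A, B):
    """Kronecker product: (A ⊗ B)[i*K + k][j*L + l] = A[i][j] * B[k][l]."""
    K, L = len(B), len(B[0])
    return [
        [A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(L)]
        for i in range(len(A)) for k in range(K)
    ]

T1 = [[0.5, 0.2], [0.1, 0.3]]
T2 = kron(T1, T1)
print([round(x, 2) for x in T2[0]])  # [0.25, 0.1, 0.1, 0.04]
```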

SLIDE 52

• A Kronecker graph is obtained by growing a sequence of graphs, iterating the Kronecker product over the initiator matrix K_1 [PKDD '05]:

  K_m = K_{m−1} ⊗ K_1 = K_1 ⊗ K_1 ⊗ … ⊗ K_1 (m times) = K_1^[m]

• Note: one can easily use multiple initiator matrices (K_1′, K_1″, K_1‴), even of different sizes

SLIDE 53

[PKDD '05]

SLIDE 54

Stochastic Kronecker graphs [PKDD '05]:
• Create an N_1 × N_1 probability matrix Θ_1
• Compute the kth Kronecker power Θ_k
• For each entry p_uv of Θ_k, include an edge (u, v) in K_k with probability p_uv

  Θ_1 =
  | 0.5 0.2 |
  | 0.1 0.3 |

  Θ_2 = Θ_1 ⊗ Θ_1 =
  | 0.25 0.10 0.10 0.04 |
  | 0.05 0.15 0.02 0.06 |
  | 0.05 0.02 0.15 0.06 |
  | 0.01 0.03 0.03 0.09 |

Flip biased coins: each entry p_uv of Θ_2 is the probability of edge (u, v) in the instance matrix K_2.

SLIDE 55

• How do we generate an instance of a (directed) stochastic Kronecker graph?
  - Flipping one biased coin per entry p_uv of the probability matrix needs n² coins: way too slow!
• Is there a faster way? YES!
• Idea: exploit the recursive structure of Kronecker graphs
  - "Drop" edges onto the graph one by one
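The slow baseline, for reference: build the Kronecker power explicitly and flip one coin per entry (a sketch with made-up names; quadratically many coins):

```python
import random

def kron(A, B):
    """Kronecker product of matrices stored as nested lists."""
    K, L = len(B), len(B[0])
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(L)]
            for i in range(len(A)) for k in range(K)]

def naive_kronecker_graph(theta1, m, seed=0):
    """Directed instance: one biased coin per entry of the m-th Kronecker power."""
    rng = random.Random(seed)
    theta = theta1
    for _ in range(m - 1):
        theta = kron(theta, theta1)
    n = len(theta)
    return [(u, v) for u in range(n) for v in range(n) if rng.random() < theta[u][v]]

edges = naive_kronecker_graph([[0.5, 0.2], [0.1, 0.3]], 3)
print(len(edges))  # directed graph on 2^3 = 8 nodes
```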

SLIDE 56

• A faster way to generate Kronecker graphs
• How to "drop" an edge into a graph G on n = 2^m nodes:

[Figure: adjacency matrix of G as Θ ⊗ Θ ⊗ …, viewed as recursively nested quadrants]
SLIDE 57

• A faster way to generate Kronecker graphs
• How to "drop" an edge into a graph G on n = 2^m nodes:

[Figure: first level of the descent; the four quadrants of the adjacency matrix are labeled a, b, c, d]

SLIDE 58

• A faster way to generate Kronecker graphs
• How to "drop" an edge into a graph G on n = 2^m nodes:

[Figure: two levels of the descent into nested quadrants a, b, c, d]

SLIDE 59

• A faster way to generate Kronecker graphs
• How to "drop" an edge into a graph G on n = 2^m nodes
• We may get a few edges colliding: we simply reinsert them

[Figure: three levels of the descent into nested quadrants a, b, c, d]

SLIDE 60

Fast Kronecker generator algorithm (for generating directed graphs).
To insert 1 edge into graph G on n = 2^m nodes:
  - Create the normalized matrix L_uv = Θ_uv / (Σ_op Θ_op)
  - Start with x = 0, y = 0
  - For i = 1 … m:
    - Pick a row/column (u, v) with prob. L_uv
    - Descend into quadrant (u, v) at level i of G; this means:
      x += u · 2^(m−i),  y += v · 2^(m−i)
  - Add edge (x, y) to G
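A sketch of the descent for a 2 × 2 initiator (names are made up; colliding edges are simply re-dropped, as on the slide):

```python
import random

def fast_kronecker_edges(theta1, m, num_edges, seed=0):
    """Drop edges one by one by recursive descent through the 2^m x 2^m matrix."""
    rng = random.Random(seed)
    total = sum(sum(row) for row in theta1)
    cells = [(u, v, theta1[u][v] / total)          # normalized matrix L_uv
             for u in range(len(theta1)) for v in range(len(theta1))]
    edges = set()
    while len(edges) < num_edges:
        x = y = 0
        for level in range(1, m + 1):
            r, acc = rng.random(), 0.0
            for u, v, pr in cells:                 # pick quadrant (u, v) w.p. L_uv
                acc += pr
                if r < acc:
                    break
            x += u * 2 ** (m - level)              # descend into quadrant (u, v)
            y += v * 2 ** (m - level)
        edges.add((x, y))                          # collisions are re-dropped
    return edges

e = fast_kronecker_edges([[0.5, 0.2], [0.1, 0.3]], m=4, num_edges=30)
print(len(e))  # 30 distinct edges on a 16-node directed graph
```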

SLIDE 61

• Real and Kronecker graphs are very close [ICML '07]:

  Θ_1 =
  | 0.99 0.54 |
  | 0.49 0.13 |