

SLIDE 1

CS224W: Machine Learning with Graphs Jure Leskovec and Baharan Mirzasoleiman, Stanford

http://cs224w.stanford.edu

SLIDE 2

¡ Evolving Networks are networks that change as a function of time
¡ Almost all real-world networks evolve, either by adding or removing nodes or links over time
¡ Examples:
§ Social networks: people make and lose friends and join or leave the network
§ Internet, web graphs, e-mail, phone calls, P2P networks, etc.

11/14/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, http://cs224w.stanford.edu 2

Collaborations in the journal Physical Review Letters (PRL)

[Perra et al. 2012]

SLIDE 3

¡ Visualization of the student collaboration network

Nodes represent the students. An edge exists between two nodes if either of the two ever reported a collaboration with the other in any of the assignments used to construct the network

[Burstein et al. 2018]

SLIDE 4

¡ Evolution of the pooled R&D network for the nodes belonging to the ten largest sectors

[Tomasello et al. 2017]

SLIDE 5

¡ Evolution of five selected sectoral R&D networks

Blue nodes represent the firms strictly belonging to the examined sector, while orange nodes represent their alliance partners belonging to different sectors

[Tomasello et al. 2017]

SLIDE 6

¡ Evolving network structure of academic institutions

Community structure for the networks from the three years 2011 to 2013; different communities are indicated by different colors.

[Wang et al. 2017]

SLIDE 7

¡ The largest components in Apple’s inventor network over a 6-year period

Each node reflects an inventor, each tie reflects a patent collaboration. Node colors reflect technology classes, while node sizes show the overall connectedness of an inventor by measuring their total number of ties/collaborations (the node’s so-called degree centrality).

[kenedict.com]

SLIDE 8

¡ How do networks evolve?

§ How do networks evolve at the macro level?

§ Evolving network models, densification

§ How do networks evolve at the meso level?

§ Network motifs, communities

§ How do networks evolve at the micro level?

§ Node, link properties (degree, network centrality)

Macroscopic: statistics; Mesoscopic: motifs, communities; Microscopic: degree, centralities

SLIDE 9

SLIDE 10

¡ How do networks evolve at the macro level?

§ What are global phenomena of network growth?

¡ Questions:

§ What is the relation between the number of nodes n(t) and the number of edges e(t) over time t?
§ How does the diameter change as the network grows?
§ How does the degree distribution evolve as the network grows?

SLIDE 11

¡ N(t) … nodes at time t
¡ E(t) … edges at time t
¡ Suppose that

N(t + 1) = 2 ⋅ N(t)

¡ Q: what is

E(t + 1) = ? Is it 2 ⋅ E(t)?

¡ A: More than doubled!
§ But obeying the Densification Power Law

SLIDE 12

¡ Networks become denser over time
¡ Densification Power Law: E(t) ∝ N(t)^a
a … densification exponent (1 ≤ a ≤ 2)
¡ What is the relation between the number of nodes and the number of edges over time?
¡ First guess: constant average degree over time

[Plots: E(t) vs. N(t) on log-log scales for Internet (a = 1.2) and Citations (a = 1.6)]

SLIDE 13

¡ Densification Power Law: E(t) ∝ N(t)^a, or equivalently a = log E(t) / log N(t)
§ The number of edges grows faster than the number of nodes – average degree is increasing

a … densification exponent: 1 ≤ a ≤ 2:
§ a = 1: linear growth – constant out-degree (traditionally assumed)
§ a = 2: quadratic growth – fully connected graph
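The exponent a can be estimated as the least-squares slope of log E(t) against log N(t). A minimal sketch on synthetic data (the helper name and the numbers are illustrative, not the lecture's measurements):

```python
import math

def densification_exponent(n, e):
    """Fit a in E(t) ~ N(t)^a: least-squares slope of log e vs. log n."""
    xs = [math.log(v) for v in n]
    ys = [math.log(v) for v in e]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic growth trace that densifies with a = 1.5
n = [10, 100, 1000, 10000]
e = [int(v ** 1.5) for v in n]
print(round(densification_exponent(n, e), 2))  # 1.5
```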

SLIDE 14

¡ Prior models and intuition say that the network diameter slowly grows (like log N)
¡ Diameter shrinks over time
§ As the network grows, the distances between the nodes slowly decrease

[Plots: diameter vs. time and diameter vs. graph size for Internet and Citations]

How do we compute diameter in practice?

- Long paths: take the 90th percentile or the average path length (not the maximum)
- Disconnected components: take only the largest component, or average only over connected pairs of nodes
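Both fixes can be sketched with plain BFS (the function and the toy graph are illustrative): distances are collected only over connected pairs, and the 90th percentile replaces the maximum:

```python
from collections import deque

def effective_diameter(adj, q=0.9):
    """90th-percentile distance over connected node pairs, via BFS from every node."""
    dists = []
    for s in adj:
        seen = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1   # unweighted shortest-path distance
                    queue.append(v)
        dists.extend(d for node, d in seen.items() if node != s)
    dists.sort()
    return dists[int(q * (len(dists) - 1))]

# Path graph 0-1-2-3: the full diameter is 3, but the 90th percentile is 2
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(effective_diameter(adj))  # 2
```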
SLIDE 15

¡ Is shrinking diameter just a consequence of densification? (answer by simulation)
§ Erdős–Rényi random graph, densification exponent a = 1.3
§ Densifying random graph has increasing diameter ⇒ there is more to shrinking diameter than just densification!

[Plot: diameter vs. graph size]

SLIDE 16

Does the changing degree sequence explain the shrinking diameter?
Compare the diameter of:
§ Real network (red)
§ Random network with the same degree distribution (blue)

[Plot: diameter vs. year for Citations]

Densification + degree sequence gives shrinking diameter

SLIDE 17

¡ How does the degree distribution evolve to allow for densification?
¡ Option 1) Degree exponent γ_t is constant:
§ Fact 1: If γ_t = γ ∈ [1, 2], then a = 2/γ

[Plot: degree distribution of an Email network]

■ Power laws with exponents < 2 have infinite expectation.
■ So, by maintaining a constant degree exponent γ, the average degree grows.

SLIDE 18

¡ How does the degree distribution evolve to allow for densification?
¡ Option 2) γ_t evolves with graph size n_t:
§ Fact 2: If γ_t = (4 n_t^(x−1) − 1) / (2 n_t^(x−1) − 1), then a = x

[Plot: degree distribution of a Citation network]

Remember, the expected degree in a power law is E[X] = ((γ_t − 1) / (γ_t − 2)) ⋅ x_m. So γ_t has to decay as a function of graph size n_t for the avg. degree to go up.

Notice: γ_t → 2 as n_t → ∞
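A quick numeric check of Fact 2 (a sketch with x_m = 1 and arbitrary sizes): as n_t grows, γ_t decays toward 2, so the power-law mean — i.e. the average degree — keeps growing:

```python
def powerlaw_mean(gamma, x_min=1.0):
    """Mean of a power law p(x) ~ x^(-gamma) on [x_min, inf); finite only for gamma > 2."""
    assert gamma > 2
    return (gamma - 1) / (gamma - 2) * x_min

def gamma_t(n_t, x):
    """Degree exponent from Fact 2; tends to 2 from above as n_t grows."""
    s = n_t ** (x - 1)
    return (4 * s - 1) / (2 * s - 1)

for n in [10, 1000, 100000]:
    g = gamma_t(n, x=1.5)
    # with x_min = 1 the mean works out to exactly 2 * n**(x-1)
    print(n, round(g, 3), round(powerlaw_mean(g), 1))
```

With these values the mean grows like 2·n^0.5, consistent with a densification exponent a = x = 1.5.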

SLIDE 19

¡ Want to model graphs that densify and have shrinking diameters
¡ Intuition:
§ How do we meet friends at a party?
§ How do we identify references when writing papers?

SLIDE 20

¡ The Forest Fire model has 2 parameters:
§ p … forward burning probability
§ r … backward burning probability
¡ The model: Directed Graph
§ Each turn a new node v arrives
§ Uniformly at random choose an “ambassador” node w
§ Flip 2 coins sampled from a geometric distribution (based on p and r) to determine the number of in- and out-links of w to follow, i.e., to “spread the fire” along
§ “Fire” spreads recursively until it dies
§ New node v links to all burned nodes

SLIDE 21

¡ The Forest Fire model:
§ (1) A new node v chooses an ambassador node w uniformly at random, and forms a link to w
§ (2) v generates two random numbers x and y from geometric distributions with means p/(1 − p) and rp/(1 − rp)
§ (3) v selects x out-links and y in-links of w incident to nodes that were not yet visited, and forms out-links to them
§ (4) v applies step (2) to the nodes found in step (3)

Example:
(1) Connect to a random node w
(2) Sample x = 2, y = 1
(3) Connect to 2 out-links and 1 in-link of w, namely a, b, c
(4) Repeat the process for a, b, c
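The recipe above can be sketched in a few lines. This is a simplified toy, not the exact published generator: the neighbors to follow are truncated rather than randomly sampled, and all names are illustrative:

```python
import random

def forest_fire(n, p=0.37, r=0.32, seed=0):
    """Toy Forest Fire generator: each new node v picks an ambassador and
    recursively 'burns' neighbors; v then links to every burned node."""
    rng = random.Random(seed)
    out_links = {0: set()}                     # node -> nodes it points to
    in_links = {0: set()}

    def geometric(q):                          # successes before failure, mean q/(1-q)
        k = 0
        while rng.random() < q:
            k += 1
        return k

    for v in range(1, n):
        burned, frontier = set(), [rng.randrange(v)]   # ambassador chosen u.a.r.
        while frontier:
            w = frontier.pop()
            if w in burned:
                continue
            burned.add(w)
            x = geometric(p)                   # forward links to follow
            y = geometric(r * p)               # backward links to follow
            frontier += list(out_links[w] - burned)[:x]
            frontier += list(in_links[w] - burned)[:y]
        out_links[v] = set(burned)             # new node links to all burned nodes
        in_links[v] = set()
        for w in burned:
            in_links[w].add(v)
    return out_links

g = forest_fire(200)
print(len(g), sum(len(s) for s in g.values()))  # node count, total edge count
```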

SLIDE 22

¡ Forest Fire generates graphs that densify and have a shrinking diameter

[Plots: densification (E(t) vs. N(t), exponent 1.32) and diameter vs. N(t)]

SLIDE 23

¡ Forest Fire also generates graphs with a power-law degree distribution

[Plots: log count vs. log in-degree and log count vs. log out-degree]

SLIDE 24

¡ Fix backward probability r and vary forward burning prob. p
¡ Notice a sharp transition between sparse and clique-like graphs
¡ The “sweet spot” is very narrow

[Phase plot: sparse vs. clique-like graphs; regions of increasing, constant, and decreasing diameter]

SLIDE 25

SLIDE 26

¡ Temporal network: A sequence of static directed graphs over the same (static) set of nodes V
¡ Each temporal edge is a timestamped ordered pair of nodes (e_i = (u, v), t_i), where u, v ∈ V and t_i is the timestamp at which the edge exists

[Figure: a temporal graph on nodes A–F with timestamps on the edges]
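A convenient in-memory representation stores each temporal edge as an (edge, timestamp) pair and derives the static snapshot at any time t. The toy edge list below is illustrative, not the slide's figure:

```python
from collections import defaultdict

# Each temporal edge is ((u, v), t)
edges = [(("A", "B"), 1), (("B", "C"), 1), (("A", "B"), 2), (("C", "A"), 3)]

def snapshot(edges, t):
    """Static directed graph formed by the edges active at time t."""
    g = defaultdict(set)
    for (u, v), ts in edges:
        if ts == t:
            g[u].add(v)
    return dict(g)

print(snapshot(edges, 1))  # {'A': {'B'}, 'B': {'C'}}
```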

SLIDE 27

¡ A temporal network is a sequence of static directed graphs over the same (static) set of nodes V
¡ Edges of a temporal network are active only at certain points in time

[Figure: the temporal graph and its static snapshots at t = 1, 2, 3]

SLIDE 28

¡ Communication: Email, phone calls, face-to-face
¡ Proximity networks: same hospital room, meeting at a conference, animals hanging out
¡ Transportation: trains, flights, …
¡ Cell biology: protein–protein interactions, gene regulation

[Tang et al. 2010]

Email communication. A typical week of activity during Nov 2001 using 24-hour windows (left) and aggregated static graph (right). Nodes represent employees; a link between two employees exists if an email was sent by one of them to the other in that 24-hour window

SLIDE 29

¡ Communication: Email, phone calls, face-to-face
¡ Proximity networks: same hospital room, meeting at a conference, animals hanging out
¡ Transportation: trains, flights, …
¡ Cell biology: protein–protein interactions, gene regulation

[Isella et al. 2011]

Aggregated networks for two different days of the Science Gallery museum deployment. Nodes are colored according to the corresponding visitor's entry time slot. The network diameter is highlighted in each case.

SLIDE 30

SLIDE 31

¡ How do networks evolve at the micro level?

§ What are local phenomena of network growth?

¡ Questions:

§ How do we define paths and walks in temporal networks?
§ How can we extend network centrality measures to temporal networks?

SLIDE 32

¡ A temporal path is a sequence of edges (u_1, u_2, t_1), (u_2, u_3, t_2), …, (u_j, u_{j+1}, t_j), for which t_1 ≤ t_2 ≤ ⋯ ≤ t_j and each node is visited at most once
¡ Example:
§ The sequence of edges [(5,2), (2,1)] together with the sequence of times t_1, t_3 is a temporal path

[Figure: a temporal graph drawn over a time axis]
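The definition translates directly into a checker over (u, v, t) triples; a sketch reusing the slide's 2-edge example with illustrative timestamps:

```python
def is_temporal_path(edges):
    """True iff timestamps are non-decreasing, edges chain head-to-tail,
    and no node repeats (a path visits each node at most once)."""
    if any(t2 < t1 for (_, _, t1), (_, _, t2) in zip(edges, edges[1:])):
        return False
    if any(e1[1] != e2[0] for e1, e2 in zip(edges, edges[1:])):
        return False
    nodes = [edges[0][0]] + [v for _, v, _ in edges]
    return len(nodes) == len(set(nodes))

# The slide's example edges (5,2), (2,1) at increasing times
print(is_temporal_path([(5, 2, 1), (2, 1, 3)]))   # True
print(is_temporal_path([(5, 2, 3), (2, 1, 1)]))   # False: times decrease
```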

SLIDE 33

TPSP-Dijkstra algorithm: An adaptation of Dijkstra using a priority queue

Notation:
§ n_s: source node
§ n_t: target node
§ t_q: time of the query (we calculate the distance from n_s to n_t between time t_s and time t_q)
§ t_s: time that a node/edge joins the network
§ t_e: time that a node/edge leaves the network
§ w: edge weights
§ d[v]: distance of n_s to v
§ PQ: priority queue

⊳ Set distance to ∞ for all nodes
⊳ Set distance to 0 for n_s
⊳ Insert (node, distance) pairs into PQ
⊳ Extract the closest node from PQ
⊳ Verify that edge e is valid at the query time
⊳ If so, update v’s distance from n_s
⊳ Insert (v, d[v]) into PQ, or update d[v] in PQ
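A simplified sketch of these steps, under assumptions not in the original: edges are point events (u, v, t, w) filtered by the query window [t_start, t_query], and the sketch does not enforce non-decreasing timestamps along a path, which the full TPSP variant would:

```python
import heapq

def tpsp_dijkstra(edges, src, dst, t_start, t_query):
    """Dijkstra over timestamped weighted edges (u, v, t, w); an edge may be
    used only if its timestamp lies in [t_start, t_query]. Returns the
    minimum total weight from src to dst, or inf if unreachable."""
    adj = {}
    for u, v, t, w in edges:
        adj.setdefault(u, []).append((v, t, w))
    dist = {src: 0.0}
    pq = [(0.0, src)]                      # (distance, node) priority queue
    while pq:
        d, u = heapq.heappop(pq)           # extract the closest node
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue                       # stale queue entry
        for v, t, w in adj.get(u, []):
            if not (t_start <= t <= t_query):
                continue                   # edge not valid in the query window
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w            # update v's distance from the source
                heapq.heappush(pq, (d + w, v))
    return float("inf")

edges = [("a", "b", 1, 2.0), ("b", "f", 2, 1.0), ("a", "f", 5, 1.0)]
print(tpsp_dijkstra(edges, "a", "f", 0, 3))   # 3.0: the t=5 shortcut is outside the window
```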

SLIDE 34

TPSP-Dijkstra algorithm: An adaptation of Dijkstra using a priority queue

[Figure: example of a temporally evolving weighted graph; shortest paths from a to f are marked with thick lines]

SLIDE 35

¡ Temporal Closeness: a measure of how close a node is to any other node in the network over the time interval [0, t]
§ Sum of shortest (fastest) temporal path lengths to all other nodes

c_clos(x, t) = 1 / ∑_y d(y, x | t)

where d(y, x | t) is the length of the temporal shortest path from y to x from time 0 to time t

¡ Example (for the temporal graph on nodes A–F):

c_clos(B, 2) = 1 / (1 + 1 + 2 + 2 + 2) = 0.125
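Given the temporal shortest-path lengths d(y, x | t) from every other node to x (computed, e.g., with a TPSP routine), the closeness value itself is a one-liner; the slide's numbers for node B at t = 2 reproduce like so:

```python
def temporal_closeness(dists):
    """Temporal closeness of a node: the reciprocal of the sum of the
    temporal shortest-path lengths from every other node to it."""
    return 1 / sum(dists)

# Slide's example for node B at t = 2: the five other nodes sit at
# temporal distances 1, 1, 2, 2, 2
print(temporal_closeness([1, 1, 2, 2, 2]))  # 0.125
```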

SLIDE 36

¡ Intuition:
§ Node a initially receives many in-links and should be considered important
§ After time t = 8, it does not receive any more in-links, and thus its importance should diminish

SLIDE 37

¡ A temporal or time-respecting walk is a sequence of edges (u_1, u_2, t_1), (u_2, u_3, t_2), …, (u_j, u_{j+1}, t_j), for which t_1 ≤ t_2 ≤ ⋯ ≤ t_j
¡ Example:
§ The sequence of edges [(5,2), (2,1), (1,5)] together with the sequence of times t_1, t_3, t_5 is a temporal walk

[Figure: a temporal graph drawn over a time axis]
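The walk condition drops only the node-uniqueness requirement of a temporal path. A small checker, with illustrative timestamps for the slide's example:

```python
def is_temporal_walk(edges):
    """True iff edges chain head-to-tail with non-decreasing timestamps
    (nodes may repeat, unlike in a temporal path)."""
    ok_chain = all(e1[1] == e2[0] for e1, e2 in zip(edges, edges[1:]))
    ok_time = all(e1[2] <= e2[2] for e1, e2 in zip(edges, edges[1:]))
    return ok_chain and ok_time

# The slide's example walk (5,2), (2,1), (1,5) at times 1, 3, 5:
# node 5 repeats, so this is a walk but not a path
print(is_temporal_walk([(5, 2, 1), (2, 1, 3), (1, 5, 5)]))  # True
```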

SLIDE 38

¡ Idea: Make a random walk only on temporal or time-respecting paths
§ Time-stamps increase along the path

§ c → b → a → c: time-respecting
§ a → c → b → a: not time-respecting

SLIDE 39

How can we calculate the probability of a temporal path?

§ The probability P[(u, x, t_2) | (v, u, t_1)] of taking edge (u, x, t_2), given that we arrived via (v, u, t_1), decreases as the time difference (t_2 − t_1) increases
§ This can be modeled by an exponential distribution: P[(u, x, t_2) | (v, u, t_1)] = β^|Γ_u|, with transition probability β ∈ (0, 1]
§ Γ_u is the set of all temporal edges from u during the time t′ ∈ [t_1, t_2]: Γ_u = {(u, y, t′) | t′ ∈ [t_1, t_2], y ∈ V}
§ Smaller values of β increase the probability of walks stopping at high-degree nodes, but we get slower convergence

[Figure: edges (v, u, t_1) and (u, x, t_2), with further temporal edges leaving u at times t′, t′′ ∈ [t_1, t_2]]
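The β^|Γ_u| factor can be computed directly from the edge stream (unnormalized, as on the slide; the node names and timestamps below are illustrative):

```python
def transition_prob(edges, u, t1, t2, beta=0.5):
    """P[(u, x, t2) | (v, u, t1)] = beta^|Gamma_u|, where Gamma_u is the set of
    temporal edges leaving u with timestamp in [t1, t2]: the more competing
    edges in between, the less likely the walk continues along this one."""
    gamma_u = [(a, b, t) for a, b, t in edges if a == u and t1 <= t <= t2]
    return beta ** len(gamma_u)

edges = [("u", "x", 5), ("u", "y", 3), ("u", "z", 4), ("w", "u", 1)]
print(transition_prob(edges, "u", 1, 5))  # 0.5 ** 3 = 0.125
```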

SLIDE 40

¡ As t → ∞, the temporal PageRank converges to the static PageRank: why?
¡ Temporal PageRank is running regular PageRank on a time-augmented graph:
§ Connect graphs at different time steps via time hops, and run PageRank on this time-extended graph
§ Node u at t_1 becomes a node (u, t_1) in this new graph
§ Transition probabilities given by P[((u, t_1), (x, t_2)) | ((v, t_0), (u, t_1))] = β^|Γ_u|
¡ As t → ∞, β^|Γ_u| becomes the uniform distribution
§ The graph looks as if we superimposed the original graphs from each time step → back to regular PageRank

SLIDE 41

[Figure: the temporal graph’s snapshots at t = 1, 2, 3 over a time axis]

Starting at A at t = 1, a walk can go to B at t = 1, C at t = 2, or B at t = 3. Starting at E at t = 1, it can go to C at t = 1, or C at t = 3.

Constructing the time-augmented graph: Repeat for all nodes across all time steps!

SLIDE 42

¡ Temporal PageRank:

r(u, t) = ∑_{v∈V} ∑_{k=0}^{t} (1 − α) α^k ∑_{z∈Z(v,u|t), |z|=k} P[z | t]

§ Z(v, u | t) is the set of all possible temporal walks from v to u until time t
§ α is the probability of starting a new walk

¡ As t → ∞, the temporal PageRank converges to the static PageRank
¡ Temporal Personalized PageRank:

r(u, t) = ∑_{v∈V} ∑_{k=0}^{t} (1 − α) α^k (h*(v) / h′(v)) ∑_{z∈Z(v,u|t), |z|=k} P[z | t]

§ h*: personalization vector
§ h′: walk starting probability vector, h′(u) = |{(u, v, t) ∈ E : v ∈ V}| / |E|

SLIDE 43

¡ The weighted number of temporal walks c(z | t) is defined as

c(z | t) = (1 − β) ∏_{(v_i, v_{i+1}, t_{i+1}), (v_{i−1}, v_i, t_i) ∈ z} β^|Γ_{v_i}|

§ (1 − β) is the probability of stopping the walk
§ Z(v, u | t) is the set of all possible temporal walks from v to u until time t
§ P[z | t] is the probability of a particular walk z ∈ Z(v, u | t):

P[z ∈ Z(v, u | t)] = c(z | t) / ∑_{z′ ∈ Z(v, x | t), x ∈ V, |z′| = |z|} c(z′ | t)

where the denominator sums over all temporal walks that start at node v and have the same length as z

[Figure: a temporal walk z = {(v_1, v_2), (v_2, v_3), (v_3, v_4), (v_4, v_5)} from v = v_1 to u = v_5]

SLIDE 44

¡ r(u): temporal PageRank estimate of u
¡ s(u): count of active walks visiting u

Temporal PageRank

⊳ Initiate a new walk with probability 1 − α
⊳ With probability α, continue the active walks that wait in u
⊳ Increment the active-walk (active mass) count in node v, with appropriate normalization 1 − β
⊳ Decrement the active mass count in node u
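The streaming updates can be sketched as follows. This is a toy that follows the spirit of the pseudocode (r accumulates rank mass, s tracks waiting walk mass); the exact bookkeeping of the published algorithm may differ, and the edge stream is illustrative:

```python
def temporal_pagerank(edges, alpha=0.85, beta=0.5):
    """Streaming temporal PageRank sketch over time-ordered edges (u, v, t)."""
    r, s = {}, {}
    for u, v, t in edges:
        r[u] = r.get(u, 0.0) + (1 - alpha)      # a new walk starts at u
        s[u] = s.get(u, 0.0) + (1 - alpha)
        r[v] = r.get(v, 0.0) + s[u] * alpha     # waiting walks continue along (u, v)
        s[v] = s.get(v, 0.0) + s[u] * alpha * (1 - beta)
        s[u] *= beta                            # some mass keeps waiting at u
    total = sum(r.values())
    return {node: mass / total for node, mass in r.items()}

edges = [("a", "b", 1), ("b", "c", 2), ("a", "b", 3), ("c", "a", 4)]
pr = temporal_pagerank(edges)
print(sorted(pr, key=pr.get, reverse=True))     # ['a', 'b', 'c']
```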

SLIDE 45

¡ Datasets:
§ Facebook: a 3-month subset of Facebook activity in a New Orleans regional community. The dataset contains an anonymized list of wall posts (interactions)
§ Twitter: users’ activity in Helsinki during 08.2010–10.2010. As interactions we consider tweets that contain mentions of other users
§ Students: an activity log of a student online community at the University of California, Irvine. Nodes represent students and edges represent messages

SLIDE 46

¡ Experimental setup:
§ For each network, a static subgraph of n = 100 nodes is obtained by BFS from a random node
§ Edge weights are equal to the frequency of the corresponding interactions and are normalized to sum to 1
§ Then a sequence of 100K temporal edges is sampled, such that each edge is sampled with probability proportional to its weight
§ In this setting, temporal PageRank is expected to converge to the static PageRank of the corresponding graph
§ The probability of starting a new walk is set to α = 0.85, and the transition probability β for temporal PageRank is set to 0 unless specified otherwise

SLIDE 47

¡ Comparison of the temporal PageRank ranking with the static PageRank ranking
¡ Rank quality (Pearson correlation coefficient between static and temporal PageRank) vs. transition probability β

Smaller β corresponds to a slower convergence rate but better-correlated rankings. Rank correlation between static and temporal PageRank is high for top-ranked nodes and decreases towards the tail of the ranking.

SLIDE 48

¡ Adaptation to concept drift (β = 0.5)
§ We start with a temporal network sampled from some static network
§ After sampling 10K temporal edges E_1, we change the weights of the static graph and sample another 10K temporal edges E_2
§ Similarly, a final set of edges E_3 is sampled after changing the weights again
§ The algorithm runs on the concatenated sequence E = ⟨E_1, E_2, E_3⟩

Temporal PageRank is able to adapt to the changing distribution quite fast. Error is the Euclidean distance between the PageRank vectors.

SLIDE 49

SLIDE 50

¡ How do networks evolve at the meso level?
§ What are the mesoscopic impacts of network growth?
¡ Questions:
§ How do patterns of interaction change over time?
§ What can we infer about the network from the changes in temporal patterns?

SLIDE 51

¡ A k-node, l-edge, δ-temporal motif is a sequence of l edges (u_1, v_1, t_1), (u_2, v_2, t_2), …, (u_l, v_l, t_l) such that
§ t_1 < t_2 < … < t_l and t_l − t_1 ≤ δ
§ The induced static graph from the edges is connected and has k nodes

§ Temporal motifs offer valuable information about a network’s evolution
§ For example, to discover trends and anomalies in temporal networks

SLIDE 52

¡ Temporal Motif Instance: a collection of edges in a temporal graph is an instance of a δ-temporal motif M if
§ it matches the same edge pattern, and
§ all of the edges occur in the order specified by the motif, within a δ time window

[Figure: a temporal graph and the instances of a δ-temporal motif in it]
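A brute-force sketch of instance counting under these two conditions. The helper name, the pattern encoding, and the toy message stream are all illustrative, and real motif counting uses much faster algorithms (which the next slide explicitly skips):

```python
from itertools import combinations

def motif_instances(edges, pattern, delta):
    """Count instances of an l-edge delta-temporal motif in (u, v, t) edges.
    `pattern` lists the motif's edges over abstract node labels,
    e.g. [(0, 1), (1, 0), (0, 1)] = send, reply, send again."""
    edges = sorted(edges, key=lambda e: e[2])   # combinations keep time order
    count = 0
    for combo in combinations(edges, len(pattern)):
        if combo[-1][2] - combo[0][2] > delta:
            continue                            # outside the delta window
        mapping, ok = {}, True
        for (u, v, _), (a, b) in zip(combo, pattern):
            for node, label in ((u, a), (v, b)):
                if mapping.setdefault(label, node) != node:
                    ok = False                  # label already bound elsewhere
        if ok and len(set(mapping.values())) == len(mapping):
            count += 1                          # injective label-to-node match
    return count

# Toy stream: A messages B, gets a reply, messages B again within the window
edges = [("A", "B", 1), ("B", "A", 2), ("A", "B", 3), ("A", "C", 40)]
print(motif_instances(edges, [(0, 1), (1, 0), (0, 1)], delta=10))  # 1
```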

SLIDE 53

¡ We study all 2- and 3-node motifs with 3 edges
§ We do not discuss how to count temporal motifs here

The green background highlights the four 2-node motifs (bottom left) and the grey background highlights the eight triangles.

SLIDE 54

¡ Real-world temporal datasets

[Paranjape et al. 2017]

SLIDE 55

Fraction of all 2 and 3-node, 3-edge δ-temporal motif counts that correspond to two groups of motifs (δ = 1 hour). Motifs on the left capture “blocking” behavior, common in SMS messaging and Facebook wall posting, and motifs on the right exhibit “non-blocking” behavior, common in email.

¡ Blocking communication

§ If an individual typically waits for a reply from one individual before proceeding to communicate with another individual

[Paranjape et al. 2017]

SLIDE 56

Distribution of switching behavior amongst the nonblocking motifs (δ = 1 hour). Switching is least common on Stack Overflow and most common in email.

¡ Cost of Switching

§ On Stack Overflow and Wikipedia talk pages, there is a high cost to switching targets because of peer engagement and depth of discussion
§ In the COLLEGEMSG dataset there is a lower cost to switching because it lacks depth of discussion within the time frame of δ = 1 hour
§ In EMAIL-EU, there is almost no peer engagement, and the cost of switching is negligible

[Paranjape et al. 2017]

SLIDE 57

Counts over various time scales for the motifs representing a node sending 3 outgoing messages to 1 or 2 neighbors in the COLLEGEMSG dataset

¡ Motif counts at varying time scales

§ At small time scales, the motif consisting of three edges to a single neighbor occurs frequently
§ After 5 minutes, counts for the three motifs with one switch in the target grow at a faster rate than the counts for the motif with two switches

[Paranjape et al. 2017]

SLIDE 58

¡ To spot trends and anomalies, we have to find statistically significant temporal motifs
§ To do so, we must compute the expected number of occurrences of each motif
§ We study all 2- and 3-node motifs with 3 edges

SLIDE 59

¡ A European country’s transaction log for all transactions larger than 50K Euros over 10 years, from 2008 to 2018, with 118,739 nodes and 2,982,049 temporal edges (δ = 90 days)

Anomalies: we can localize the time the financial crisis hits the country, around September 2011, from the difference between the actual and expected motif frequencies

[Plot: actual vs. expected motif frequencies over time, annotated where the financial crisis starts]