CS224W: Machine Learning with Graphs
Jure Leskovec and Baharan Mirzasoleiman, Stanford
http://cs224w.stanford.edu
¡ Evolving Networks are networks that change as a function of time
¡ Almost all real-world networks evolve by adding or removing nodes or links over time
¡ Examples:
§ Social networks: people make and lose friends and join or leave the network
§ Internet, web graphs, E-mail, phone calls, P2P networks, etc.
11/14/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, http://cs224w.stanford.edu 2
Collaborations in the journal Physical Review Letters (PRL)
[Perra et al. 2012]
¡ Visualization of the student collaboration network
Nodes represent the students. An edge exists between two nodes if either student ever reported collaborating with the other on any of the assignments used to construct the network.
[Burstein et al. 2018]
¡ Evolution of the pooled R&D network for the nodes belonging to the ten largest sectors
[Tomasello et al. 2017]
¡ Evolution of five selected sectoral R&D networks
Blue nodes represent the firms strictly belonging to the examined sector, while orange nodes represent their alliance partners belonging to different sectors
[Tomasello et al. 2017]
¡ Evolving network structure of academic institutions
Community structure for the networks in the three years 2011 to 2013; different communities are indicated by different colors.
[Wang et al. 2017]
¡ The largest components in Apple's inventor network over a 6-year period
Each node reflects an inventor, each tie reflects a patent collaboration. Node colors reflect technology classes, while node sizes show the overall connectedness of an inventor by measuring their total number of ties/collaborations (the node’s so-called degree centrality).
[kenedict.com]
¡ How do networks evolve?
§ How do networks evolve at the macro level?
§ Evolving network models, densification
§ How do networks evolve at the meso level?
§ Network motifs, communities
§ How do networks evolve at the micro level?
§ Node, link properties (degree, network centrality)
Three levels of analysis — Macroscopic: global statistics; Mesoscopic: motifs, communities; Microscopic: degree, centralities
¡ How do networks evolve at the macro level?
§ What are global phenomena of network growth?
¡ Questions:
§ What is the relation between the number of nodes n(t) and number of edges e(t) over time t?
§ How does diameter change as the network grows?
§ How does degree distribution evolve as the network grows?
¡ N(t) … nodes at time t
¡ E(t) … edges at time t
¡ Suppose that N(t+1) = 2 ⋅ N(t)
¡ Q: What is E(t+1)? Is it 2 ⋅ E(t)?
¡ A: More than doubled!
§ But obeying the Densification Power Law
¡ Networks become denser over time
¡ Densification Power Law: E(t) ∝ N(t)^a
a … densification exponent (1 ≤ a ≤ 2)
¡ What is the relation between the number of nodes and the edges over time?
¡ First guess: constant average degree over time
(Log-log plots of E(t) vs. N(t): Internet, a = 1.2; Citations, a = 1.6)
¡ Densification Power Law: E(t) ∝ N(t)^a, or equivalently log E(t) ≈ a ⋅ log N(t) + c
§ The number of edges grows faster than the number of nodes – average degree is increasing
a … densification exponent (1 ≤ a ≤ 2):
§ a = 1: linear growth – constant out-degree (traditionally assumed)
§ a = 2: quadratic growth – fully connected graph
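Since log E(t) is linear in log N(t), the exponent a can be estimated by a least-squares fit over the (N(t), E(t)) snapshots. A minimal sketch, assuming made-up synthetic snapshots (the function name and data are illustrative, not from the lecture):

```python
# Estimate the densification exponent `a` in E(t) ~ N(t)^a by a
# least-squares fit of log E(t) vs. log N(t). Snapshots are synthetic.
import math

def densification_exponent(snapshots):
    """snapshots: list of (num_nodes, num_edges) pairs over time.
    Returns the least-squares slope of log E(t) vs. log N(t)."""
    xs = [math.log(n) for n, _ in snapshots]
    ys = [math.log(e) for _, e in snapshots]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Synthetic snapshots generated with a = 1.5 (E = N^1.5):
snaps = [(n, int(n ** 1.5)) for n in (100, 200, 400, 800, 1600)]
print(round(densification_exponent(snaps), 2))  # close to 1.5
```

On real data one would use the observed (N(t), E(t)) pairs of the growing network in place of the synthetic ones.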
¡ Prior models and intuition say that the network diameter slowly grows (like log N)
¡ Diameter shrinks over time
§ As the network grows the distances between the nodes slowly decrease
(Plots: diameter vs. size of the graph, for Internet and Citations)
How do we compute diameter in practice?
§ Long paths: Take the 90th percentile or average path length (not the maximum)
§ Disconnected components: Take only the largest component, or average only over connected pairs of nodes
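These two fixes can be sketched together as an "effective diameter": the 90th percentile of shortest-path lengths over connected pairs only. A hedged sketch (adjacency-dict representation and function names are assumptions):

```python
# Effective diameter: 90th percentile of shortest-path lengths over
# connected pairs of nodes, robust to long chains and disconnection.
from collections import deque

def bfs_dists(graph, src):
    """Unweighted shortest-path distances from src via BFS."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def effective_diameter(graph, pct=0.9):
    lengths = []
    for u in graph:
        for v, d in bfs_dists(graph, u).items():
            if v != u:
                lengths.append(d)  # only connected pairs contribute
    lengths.sort()
    return lengths[int(pct * (len(lengths) - 1))]

# Path graph 0-1-2-3: the maximum distance is 3, but the effective
# (90th-percentile) diameter is 2.
g = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(effective_diameter(g))  # 2
```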
Is shrinking diameter just a consequence of densification? (answer by simulation)
§ Erdős–Rényi random graph, densification exponent a = 1.3
§ A densifying random graph has increasing diameter ⇒ there is more to shrinking diameter than just densification!
(Plot: diameter vs. size of the graph)
Does the changing degree sequence explain the shrinking diameter?
§ Compare the diameter of a real network (red) with a random network with the same degree distribution (blue)
§ Densification + degree sequence gives shrinking diameter
(Plot: diameter vs. year, Citations)
¡ How does the degree distribution evolve to allow for densification?
¡ Option 1) Degree exponent γ_t is constant:
§ Fact 1: If γ_t = γ ∈ [1, 2], then a = 2/γ
§ Power laws with exponents < 2 have infinite expectations
§ So, by maintaining a constant degree exponent γ ∈ [1, 2], the average degree grows
(Plot: Email network)
¡ How does the degree distribution evolve to allow for densification?
¡ Option 2) γ_t evolves with graph size n_t:
§ Fact 2: If γ_t = (4 n_t^{x−1} − 1) / (2 n_t^{x−1} − 1), then a = x
§ Notice: γ_t → 2 as n_t → ∞
Remember, the expected degree in a power law is E[X] = ((γ_t − 1) / (γ_t − 2)) x_m, so γ_t has to decay as a function of graph size n_t for the avg. degree to go up.
(Plot: Citation network)
¡ Want to model graphs that densify and have shrinking diameters
¡ Intuition:
§ How do we meet friends at a party?
§ How do we identify references when writing papers?
¡ The Forest Fire model has 2 parameters:
§ p … forward burning probability
§ r … backward burning probability
¡ The model: Directed Graph
§ Each turn a new node v arrives
§ Uniformly at random choose an "ambassador" node w
§ Flip 2 coins sampled from a geometric distribution (based on p and r) to determine the number of in- and out-links of w to follow, i.e., to "spread the fire" along
§ "Fire" spreads recursively until it dies
§ New node v links to all burned nodes
¡ The Forest Fire model
§ (1) The new node v chooses an ambassador node w uniformly at random, and forms a link to w
§ (2) Generate two random numbers x and y from geometric distributions with means p/(1 − p) and rp/(1 − rp)
§ (3) v selects x out-links and y in-links of w incident to nodes that were not yet visited, and forms out-links to them
§ (4) v applies step (2) to the nodes found in step (3)
Example: (1) connect to a random ambassador w; (2) sample x = 2, y = 1; (3) connect to 2 out-links and 1 in-link of w, namely b, c, d; (4) repeat the process for b, c, d
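The recursion above can be sketched in code. This is a simplified, assumption-laden sketch for UNDIRECTED graphs (the lecture's model is directed, with separate forward/backward burning probabilities; here a single burning probability p is used, and the per-neighbor coin flips approximate the geometric sampling):

```python
# Simplified undirected Forest Fire sketch: each new node picks an
# ambassador, "burns" recursively through its neighborhood, and links
# to every burned node.
import random

def forest_fire(n, p, seed=0):
    rng = random.Random(seed)
    adj = {0: set()}
    for v in range(1, n):
        w = rng.randrange(v)             # pick a random ambassador
        burned, frontier = {w}, [w]
        while frontier:
            u = frontier.pop()
            # burn a geometric-ish number of unvisited neighbors of u
            nbrs = [x for x in adj[u] if x not in burned]
            rng.shuffle(nbrs)
            k = 0
            while k < len(nbrs) and rng.random() < p:
                burned.add(nbrs[k]); frontier.append(nbrs[k]); k += 1
        adj[v] = set()
        for u in burned:                 # link v to every burned node
            adj[v].add(u); adj[u].add(v)
    return adj

g = forest_fire(200, 0.35)
print(len(g), sum(len(s) for s in g.values()) // 2)  # nodes, edges
```

Because every new node burns at least its ambassador, the graph stays connected; as p grows toward the "sweet spot", fires spread further and the graph densifies.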
¡ Forest Fire generates graphs that densify
and have shrinking diameter
(Plots: densification with exponent a = 1.32 – E(t) vs. N(t); shrinking diameter – diameter vs. N(t))
¡ Forest Fire also generates graphs with
power-law degree distribution
(Log-log plots: count vs. in-degree and count vs. out-degree)
¡ Fix backward probability r and vary forward burning probability p
¡ Notice a sharp transition between sparse and clique-like graphs
¡ The "sweet spot" is very narrow
(Phase diagram: as p increases, the graph moves from sparse with increasing diameter, through constant diameter, to clique-like with decreasing diameter)
¡ Temporal network: A sequence of static directed graphs over the same (static) set of nodes V
¡ Each temporal edge is a timestamped ordered pair of nodes (e_i = (u, v), t_i), where u, v ∈ V and t_i is the timestamp at which the edge exists
(Figure: a temporal graph on nodes A–F with timestamps on the edges, and its three static snapshots at t = 1, 2, 3)
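One convenient encoding of this definition (an assumption, not prescribed by the slides) is a plain list of (src, dst, timestamp) triples, from which each snapshot is the set of edges active at a given time. The toy edges below are hypothetical, not the exact graph in the figure:

```python
# A temporal network as (src, dst, timestamp) triples; a snapshot at
# time t is the static directed graph of edges active at t.
temporal_edges = [
    ("A", "B", 1), ("A", "B", 3),      # edge A->B active at t=1 and t=3
    ("A", "C", 2),
    ("B", "D", 1), ("B", "D", 2), ("B", "D", 3),
]

def snapshot(edges, t):
    """Static directed graph of edges active at time t."""
    return [(u, v) for u, v, ts in edges if ts == t]

print(snapshot(temporal_edges, 2))  # [('A', 'C'), ('B', 'D')]
```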
¡ A temporal network is a sequence of static directed graphs over the same (static) set of nodes V
¡ Edges of a temporal network are active only at certain points in time
¡ Communication: Email, phone call, face-to-face
¡ Proximity networks: Same hospital room, meet at a conference, animals hanging out
¡ Transportation: train, flights…
¡ Cell biology: protein–protein, gene regulation
[Tang et al, 2010]
Email communication. A typical week of activity during Nov 2001 using 24-hour windows (left) and aggregated static graph (right). Nodes represent employees; a link between two employees exists if an email was sent by one of them to the other in that 24-hour window
[Isella et al, 2011]
Aggregated networks for two different days of the Science Gallery museum deployment. Nodes are colored according to the corresponding visitor's entry time slot. The network diameter is highlighted in each case.
¡ How do networks evolve at the micro level?
§ What are local phenomena of network growth?
¡ Questions:
§ How do we define paths and walks in temporal networks?
§ How can we extend network centrality measures to temporal networks?
¡ A temporal path is a sequence of edges (v1, v2, t1), (v2, v3, t2), …, (v_k, v_{k+1}, t_k), for which t1 ≤ t2 ≤ … ≤ t_k and each node is visited at most once
¡ Example:
§ The sequence of edges [(5,2), (2,1)] together with an increasing sequence of times is a temporal path
TPSP-Dijkstra algorithm: An adaptation of Dijkstra using a priority queue
Notation:
§ n_s: source node
§ n_t: target node
§ t_q: time of the query (we calculate the distance from n_s to n_t between time t_s and time t_q)
§ t_s: time that a node/edge joins the network
§ t_e: time that a node/edge leaves the network
§ w: edge weights
§ d[v]: distance of n_s to v
§ PQ: priority queue
Algorithm annotations:
⊳ Set distance to ∞ for all nodes
⊳ Set distance to 0 for n_s
⊳ Insert (node, distance) pairs into PQ
⊳ Extract the closest node from PQ
⊳ Verify that edge e is valid at t_q
⊳ If so, update v's distance from n_s
⊳ Insert (v, d[v]) into PQ, or update d[v] in PQ
(Figure: a temporally evolving graph with edge weights and a newly appearing edge; the shortest path from a to f is marked in thick lines)
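The annotated steps above can be sketched in Python. This is a hedged simplification under stated assumptions: each edge carries an explicit validity interval, and ordinary Dijkstra runs while relaxing only edges valid at the query time; the edge format and names are illustrative:

```python
# TPSP-Dijkstra sketch: standard Dijkstra, but an edge is relaxed only
# if it is valid (alive) at query time tq.
# Edge format (a modeling assumption): (dst, weight, t_start, t_end).
import heapq

def tpsp_dijkstra(graph, src, tq):
    dist = {v: float("inf") for v in graph}
    dist[src] = 0                         # distance 0 for the source
    pq = [(0, src)]                       # priority queue of (dist, node)
    while pq:
        d, u = heapq.heappop(pq)          # extract the closest node
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w, ts, te in graph[u]:
            if not (ts <= tq <= te):      # edge must be valid at tq
                continue
            if d + w < dist[v]:
                dist[v] = d + w           # relax and (re)insert
                heapq.heappush(pq, (dist[v], v))
    return dist

g = {
    "a": [("b", 1, 0, 5), ("c", 4, 0, 5)],
    "b": [("c", 1, 0, 2)],                # b->c disappears after t=2
    "c": [],
}
print(tpsp_dijkstra(g, "a", 1)["c"])  # 2 (via b)
print(tpsp_dijkstra(g, "a", 4)["c"])  # 4 (b->c no longer valid)
```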
¡ Temporal Closeness: Measure of how close a node is to any other node in the network in the time interval [0, t]
§ Sum of shortest (fastest) temporal path lengths to all other nodes:
c_clos(x, t) = 1 / Σ_y d(y, x | t)
where d(y, x | t) is the length of the temporal shortest path from y to x from time 0 to time t
¡ Example (for the temporal graph on nodes A–F shown earlier):
c_clos(B, 2) = 1 / (d(A, B | 2) + d(C, B | 2) + d(D, B | 2) + d(E, B | 2) + d(F, B | 2)) = 0.1
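A sketch of this measure under stated assumptions: temporal distance d(z, y | t) is taken as the minimum number of hops over time-respecting paths using edges with timestamps ≤ t, and the tiny edge list is hypothetical, not the slide's graph:

```python
# Temporal closeness sketch: min-hop time-respecting distances, then
# closeness(y, t) = 1 / sum over other nodes z of d(z, y | t).
from itertools import groupby

def temporal_dists(edges, src, t):
    """edges: (u, v, ts) triples. Min-hop time-respecting distances from src."""
    h = {src: 0}
    active = sorted((e for e in edges if e[2] <= t), key=lambda e: e[2])
    for _, group in groupby(active, key=lambda e: e[2]):
        group = list(group)
        for _ in group:                   # fixpoint within one timestamp
            for u, v, _ts in group:
                if u in h and h[u] + 1 < h.get(v, float("inf")):
                    h[v] = h[u] + 1
    return h

def temporal_closeness(edges, y, nodes, t):
    total = sum(temporal_dists(edges, z, t).get(y, float("inf"))
                for z in nodes if z != y)
    return 1.0 / total

edges = [("A", "B", 1), ("C", "B", 1), ("C", "A", 2)]
print(temporal_closeness(edges, "B", {"A", "B", "C"}, 2))  # 1/(1+1) = 0.5
```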
¡ Intuition:
§ Node b initially receives many in-links, so it should be considered important
§ After time t = 8 it does not receive any more in-links, and thus its importance should diminish
¡ A temporal or time-respecting walk is a sequence of edges (v1, v2, t1), (v2, v3, t2), …, (v_k, v_{k+1}, t_k), for which t1 ≤ t2 ≤ … ≤ t_k
¡ Example:
§ The sequence of edges [(5,2), (2,1), (1,5)] together with an increasing sequence of times is a temporal walk
¡ Idea: Make a random walk only on temporal or time-respecting paths
§ Time-stamps increase along the path
§ c → b → a → c: time-respecting
§ a → c → b → a: not time-respecting
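A time-respecting walk can be sketched directly; this toy (an assumption: uniform choice among admissible continuations) reproduces the c → b → a → c example above:

```python
# Time-respecting random walk: after traversing an edge at time t, the
# walk may only continue along edges with timestamp >= t.
import random

def time_respecting_walk(edges, start, steps, seed=0):
    rng = random.Random(seed)
    node, t, path = start, float("-inf"), [start]
    for _ in range(steps):
        nxt = [(v, ts) for u, v, ts in edges if u == node and ts >= t]
        if not nxt:
            break                      # no admissible continuation
        node, t = rng.choice(nxt)
        path.append(node)
    return path

edges = [("c", "b", 1), ("b", "a", 2), ("a", "c", 3)]
print(time_respecting_walk(edges, "c", 3))  # ['c', 'b', 'a', 'c']
```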
How can we calculate the probability of a temporal path?
§ The probability P[(v, x, t2) | (u, v, t1)] of taking edge (v, x, t2), given that we arrived via (u, v, t1), decreases as the time difference (t2 − t1) increases
§ This can be modeled by an exponentially decaying transition probability:
P[(v, x, t2) | (u, v, t1)] = γ^{|Γ_v|}, with transition probability γ ∈ (0, 1]
§ Γ_v is the set of all temporal edges from v during the time t′ ∈ [t1, t2]:
Γ_v = {(v, z, t′) | t′ ∈ [t1, t2], z ∈ V}
§ Smaller values of γ increase the probability of walks stopping at high-degree nodes, but we get slower convergence
¡ As t → ∞, the temporal PageRank converges to the static PageRank. Why?
¡ Temporal PageRank is running regular PageRank on a time-augmented graph:
§ Connect graphs at different time steps via time hops, and run PageRank on this time-extended graph
§ Node u at time t1 becomes a node (u, t1) in this new graph
§ Transition probabilities are given by P[((v, t1), (x, t2)) | ((u, t0), (v, t1))] = γ^{|Γ_v|}
¡ As t → ∞, γ^{|Γ_v|} becomes the uniform distribution
§ The graph looks as if we superimposed the original graphs from each time step → back to regular PageRank
Constructing the time-augmented graph: the snapshots at t = 1, 2, 3 over nodes A–F, connected by time hops. For example, starting at A at t = 1, the walk can go to B at t = 1, C at t = 2, or B at t = 3; starting at E at t = 1, it can go to C at t = 1 or C at t = 3. Repeat for all nodes across all time steps!
¡ Temporal PageRank:
r(u, t) = Σ_{v∈V} Σ_{k=0}^{t} (1 − β) β^k Σ_{z ∈ Z(v,u|t), |z|=k} P[z | t]
§ Z(v, u | t) is the set of all possible temporal walks from v to u until time t
§ β is the probability of starting a new walk
¡ As t → ∞, the temporal PageRank converges to the static PageRank
¡ Temporal Personalized PageRank:
r(u, t) = Σ_{v∈V} Σ_{k=0}^{t} (1 − β) β^k (h*(v) / h′(v)) Σ_{z ∈ Z(v,u|t), |z|=k} P[z | t]
§ h*: personalization vector
§ h′: walk starting probability vector, h′(v) = |{(v, y, t) ∈ E}| / |E|
11/14/19 Jure Leskovec, Stanford CS224W: Machine Learning with Graphs, http://cs224w.stanford.edu 43
¡ The weighted count of a temporal walk z is defined as
c(z | t) = (1 − γ) Π_{(v_i, v_{i+1}, t_{i+1}) ∈ z} γ^{|Γ_{v_i}|}
§ Z(v, u | t) is the set of all possible temporal walks from v to u until time t
§ P[z | t] is the probability of a particular walk z ∈ Z(v, u | t):
P[z ∈ Z(v, u | t)] = c(z | t) / Σ_{y∈V, z′∈Z(v,y|t), |z′|=|z|} c(z′ | t)
§ (1 − γ) is the probability of stopping the walk; the denominator sums over all temporal walks that start at node v and have the same length as z
(Figure: a temporal walk from v2 to v6 through v3, v4, v5 with increasing timestamps t1 ≤ … ≤ t5 = t)
¡ r(v): Temporal PageRank estimate of v
¡ s(v): Count of active walks visiting v
Temporal PageRank — algorithm annotations:
⊳ Initiate a new walk with probability 1 − α
⊳ With probability α, continue active walks that wait in v
⊳ Increment the active-walk (active mass) count in node w, with appropriate normalization 1 − β
⊳ Decrement the active mass count in node v
β … probability of initiating a new walk
¡ Datasets:
§ Facebook: A 3-month subset of Facebook activity in a New Orleans regional community. The dataset contains an anonymized list of wall posts (interactions)
§ Twitter: Users' activity in Helsinki during 08.2010–10.2010. As interactions we consider tweets that contain mentions of other users
§ Students: An activity log of a student online community at the University of California, Irvine. Nodes represent students and edges represent messages
¡ Experimental setup:
§ For each network, a static subgraph of n = 100 nodes is obtained by BFS from a random node
§ Edge weights are equal to the frequency of the corresponding interactions and are normalized to sum to 1
§ Then a sequence of 100K temporal edges is sampled, such that each edge is sampled with probability proportional to its weight
§ In this setting, temporal PageRank is expected to converge to the static PageRank of the corresponding graph
§ The probability of starting a new walk is set to β = 0.85, and the transition probability γ for temporal PageRank is set to 0 unless specified otherwise
¡ Comparison of temporal PageRank ranking with static PageRank ranking
§ Rank correlation between static and temporal PageRank is high for top-ranked nodes and decreases towards the tail of the ranking
¡ Rank quality (Pearson correlation coefficient between static and temporal PageRank) as a function of the transition probability γ
§ Smaller γ corresponds to a slower convergence rate, but better-correlated rankings
¡ Adaptation to concept drift (γ = 0.5)
§ We start with a temporal network sampled from some static network
§ After sampling 10K temporal edges E1, we change the weights of the static graph and sample another 10K temporal edges E2
§ Similarly, a final set of edges E3 is sampled after changing the weights again
§ The algorithm is run on the concatenated sequence E = ⟨E1, E2, E3⟩
Temporal PageRank is able to adapt to the changing distribution quite fast. Error is the Euclidean distance between the PageRank vectors.
¡ How do networks evolve at the meso level?
§ What is the mesoscopic impact of network growth?
¡ Questions:
§ How do patterns of interaction change over time?
§ What can we infer about the network from the changes in temporal patterns?
¡ l-node, m-edge, δ-temporal motif: a sequence of m edges (u1, v1, t1), (u2, v2, t2), …, (u_m, v_m, t_m) such that
§ t1 < t2 < … < t_m and t_m − t1 ≤ δ
§ The induced static graph from the edges is connected and has l nodes
§ Temporal motifs offer valuable information about a network's evolution
§ For example, to discover trends and anomalies in temporal networks
¡ Temporal Motif Instance: A collection of edges in a temporal graph is an instance of a δ-temporal motif M if
§ It matches the same edge pattern, and
§ All of the edges occur in the right order specified by the motif, within a δ time window
(Figure: a temporal graph and the δ-temporal motif instances found in it)
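A brute-force check of this definition can be sketched as follows (illustrative only; the efficient counting algorithms of Paranjape et al. are not discussed here, and the example edges are hypothetical):

```python
# Brute-force delta-temporal motif instance counting: edge patterns must
# match under a consistent node mapping, timestamps must be strictly
# increasing, and the whole instance must fit in a window of length delta.
from itertools import combinations

def count_instances(edges, motif, delta):
    """edges: (u, v, t) triples; motif: edge pattern over abstract labels,
    e.g. [("x", "y"), ("y", "x"), ("x", "y")]."""
    edges = sorted(edges, key=lambda e: e[2])
    count = 0
    for combo in combinations(edges, len(motif)):
        times = [t for _, _, t in combo]
        if not all(a < b for a, b in zip(times, times[1:])):
            continue                     # timestamps must strictly increase
        if times[-1] - times[0] > delta:
            continue                     # must fit in the delta window
        mapping, ok = {}, True
        for (mu, mv), (u, v, _t) in zip(motif, combo):
            for lab, node in ((mu, u), (mv, v)):
                if mapping.setdefault(lab, node) != node:
                    ok = False           # inconsistent node mapping
            if not ok:
                break
        # distinct labels must map to distinct nodes
        if ok and len(set(mapping.values())) == len(mapping):
            count += 1
    return count

# Motif: x messages y, y replies, x messages y again ("blocking" pattern).
edges = [("a", "b", 1), ("b", "a", 2), ("a", "b", 3), ("a", "b", 9)]
print(count_instances(edges, [("x", "y"), ("y", "x"), ("x", "y")], 5))  # 1
```

Only the edges at t = 1, 2, 3 form an instance; any combination using the edge at t = 9 exceeds the δ = 5 window.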
¡ We study all 2- and 3-node motifs with 3 edges
§ We do not discuss how to count temporal motifs here
The green background highlights the four 2-node motifs (bottom left) and the grey background highlights the eight triangles.
¡ Real-world temporal datasets
[Paranjape et al. 2017]
Fraction of all 2 and 3-node, 3-edge δ-temporal motif counts that correspond to two groups of motifs (δ = 1 hour). Motifs on the left capture “blocking” behavior, common in SMS messaging and Facebook wall posting, and motifs on the right exhibit “non-blocking” behavior, common in email.
¡ Blocking communication
§ An individual typically waits for a reply from one individual before proceeding to communicate with another individual
[Paranjape et al. 2017]
Distribution of switching behavior amongst the nonblocking motifs (δ = 1 hour). Switching is least common on Stack Overflow and most common in email.
¡ Cost of Switching
§ On Stack Overflow and Wikipedia talk pages, there is a high cost to switch targets because of peer engagement and depth of discussion § In the COLLEGEMSG dataset there is a lesser cost to switch because it lacks depth of discussion within the time frame of δ = 1 hour § In EMAIL-EU, there is almost no peer engagement and cost of switching is negligible
[Paranjape et al. 2017]
Counts over various time scales for the motifs representing a node sending 3 outgoing messages to 1 or 2 neighbors in the COLLEGEMSG dataset
¡ Motif counts at varying time scales
§ At small time scales, the motif consisting of three edges to a single neighbor occurs frequently § After 5 minutes, counts for the three motifs with one switch in the target grow at a faster rate than the counts for the motif with two switches
[Paranjape et al. 2017]
¡ To spot trends and anomalies, we have to identify statistically significant temporal motifs
§ To do so, we must compute the expected number of occurrences of each motif
§ We study all 2- and 3-node motifs with 3 edges
¡ A European country's transaction log for all transactions larger than 50K Euros over 10 years from 2008 to 2018, with 118,739 nodes and 2,982,049 temporal edges (δ = 90 days)
Anomalies: We can localize the time the financial crisis hit the country, around September 2011, from the difference in the actual vs. expected motif frequencies (marked "Financial crisis starts" in the plot).