Generative models for social network data
Kevin S. Xu (University of Toledo) James R. Foulds (University of California-San Diego) SBP-BRiMS 2016 Tutorial
About Us
Kevin S. Xu
Assistant professor at University of Toledo
experience in industry
analysis
James R. Foulds
Postdoctoral scholar at UCSD
and generative models
models
studying social networks for decades!
study of social networks: Jacob Moreno in 1930s
sociograms
network analysis (SNA) from physics, EECS, statistics, and many other disciplines
sociomatrix $Y$
[Example sociomatrix $Y$: binary matrix whose entries equal to 1 mark edges]
exchangeable by symmetry
do not change graph
all other nodes are identical
similar (not necessarily common neighbors)
models, power-law degree distributions (Barabasi et al.)
new networks
[Figure: probability maps the data generating process to the observed data; inference maps the observed data back to the data generating process]
Figure based on one by Larry Wasserman, "All of Statistics"
Mathematics/physics: Erdős-Rényi, preferential attachment, …
Statistics/machine learning: ERGMs, latent variable models, …
graphs with a fixed number of edges.
equally likely
with probability p
have about the same degree (# edges)
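The "each edge with probability p" variant can be sketched in a few lines. This is a minimal illustration, not code from the tutorial; the function names `sample_gnp` and `degrees` and the values n = 200, p = 0.05 are assumptions chosen for the example.

```python
import random

def sample_gnp(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a set of edges (i, j) with i < j."""
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # every node pair is an independent coin flip
                edges.add((i, j))
    return edges

def degrees(n, edges):
    """Degree (# edges) of every node."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

edges = sample_gnp(n=200, p=0.05, seed=0)
deg = degrees(200, edges)
mean_degree = sum(deg) / len(deg)  # expected value is (n - 1) * p = 9.95
```

Inspecting `deg` shows the point made above: degrees cluster around the mean rather than spreading over several orders of magnitude.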
following a power law (possibly controversial?)
model to address this (Barabasi and Albert, 1999)
proportional to its degree ( + smoothing counts),
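A rough sketch of preferential attachment under stated assumptions: each new node attaches to `m` distinct existing nodes, chosen with probability proportional to degree plus a smoothing count. The starting clique, the parameter names `m` and `smoothing`, and their defaults are hypothetical choices for illustration.

```python
import random

def preferential_attachment(n, m=2, smoothing=1.0, seed=None):
    """Grow a graph node by node; returns (edge list, degree list)."""
    rng = random.Random(seed)
    # Assumed starting point: a small clique on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    degree = [m] * (m + 1)
    for new in range(m + 1, n):
        weights = [d + smoothing for d in degree]  # degree + smoothing counts
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            edges.append((t, new))
            degree[t] += 1
        degree.append(m)
    return edges, degree

edges, degree = preferential_attachment(n=500, m=2, seed=0)
```

Sorting `degree` in descending order gives a quick look at the heavy tail: a few early nodes collect many more edges than the typical node.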
connected to K neighbors in a ring
each edge with probability
phenomenon, “6-degrees of separation”)
Figure due to Arpad Horvath, https://commons.wikimedia.org/wiki/File:Watts_strogatz.svg
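The ring-plus-rewiring construction can be sketched as follows. This is an illustrative version only: the rewiring probability is called `beta` here as an assumption (the symbol is not recoverable from the slide text), `k` is assumed even, and `n` is assumed larger than `k`.

```python
import random

def watts_strogatz(n, k, beta, seed=None):
    """Ring lattice with k neighbours per node, each edge rewired with probability beta."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Step 1: connect every node to its k nearest neighbours on the ring.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Step 2: visit each lattice edge once and rewire its far end with probability beta.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < beta:
                candidates = [c for c in range(n) if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].remove(j)
                    adj[j].remove(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

adj = watts_strogatz(n=20, k=4, beta=0.2, seed=0)
```

Rewiring moves edges without creating or deleting any, so the number of edges stays at n * k / 2 for every value of `beta`.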
Arbitrary sufficient statistics
Covariates (gender, age, …)
E.g. “how many males are friends with females”
science
e.g. ergm package for statnet
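To make "sufficient statistics" concrete, here is a small sketch that counts three of them on a toy graph and combines them into the unnormalized ERGM log-weight, the inner product of parameters and statistics. The toy graph, the node "Dan", the gender labels, and the `theta` values are all hypothetical; this does not use or mimic the ergm package.

```python
from itertools import combinations

def sufficient_statistics(nodes, edges, gender):
    """Count edges, triangles, and edges joining nodes of different gender."""
    adj = {v: set() for v in nodes}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    n_triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    n_mixed = sum(1 for i, j in edges if gender[i] != gender[j])
    return {"edges": len(edges), "triangles": n_triangles, "male_female_edges": n_mixed}

# Hypothetical toy data.
nodes = ["Alice", "Bob", "Claire", "Dan"]
edges = [("Alice", "Bob"), ("Bob", "Claire"), ("Alice", "Claire"), ("Claire", "Dan")]
gender = {"Alice": "F", "Bob": "M", "Claire": "F", "Dan": "M"}
stats = sufficient_statistics(nodes, edges, gender)

# Unnormalized log-probability of this graph: theta . s(graph). Values are made up.
theta = {"edges": -1.0, "triangles": 0.5, "male_female_edges": 0.2}
log_weight = sum(theta[name] * stats[name] for name in stats)
```

The normalizing constant sums this weight over all possible graphs, which is what makes ERGM inference expensive.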
specification for an observed dataset as to render the observed data virtually impossible”
If two people have a friend in common, then there is an increased likelihood that they will become friends themselves at some point in the future.
Depending on parameters, we could get:
mass on desired density and triad closure
MLE may not exist!
Handcock, M. S., Hunter, D. R., Butts, C. T., Goodreau, S. M., & Morris, M. (2008). statnet: Software tools for the representation, visualization, analysis and simulation of network data. Journal of Statistical Software, 24(1), 1548.
Completes two triangles! If an edge completes more triangles, it becomes overwhelmingly likely to exist. This propagates to create more triangles …
returns for completing more triangles
dimensional, e.g. encoding geometrically decreasing weights (curved exponential family)
require care and expertise to perform well
a set of unobserved or latent variables
independent given latent variables
permutations
variable model of form for iid latent variables and some function
nodes in social networks
independent of all other node pairs given values of latent variables: $p(Y \mid Z, \theta) = \prod_{i \neq j} p(y_{ij} \mid z_i, z_j, \theta)$
representation
characteristics of the nodes become more similar
characteristics or “social space”
high probability of edge between nodes
suggests that $i$ and $k$ are not too far apart in latent space ⇒ more likely to also have an edge
by Hoff et al. (2002)
between latent positions
2
latent space
between all pairs of nodes
pairs of nodes
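A minimal generative sketch of a latent distance model in the spirit of Hoff et al. (2002): the log-odds of an edge are an intercept minus the Euclidean distance between latent positions. The intercept name `alpha`, the standard normal prior on positions, and the two-dimensional space are assumptions for illustration; covariates are omitted.

```python
import math
import random

def edge_probability(z_i, z_j, alpha):
    """Logistic link applied to (alpha - Euclidean distance between latent positions)."""
    return 1.0 / (1.0 + math.exp(-(alpha - math.dist(z_i, z_j))))

def sample_latent_distance_graph(n, alpha=1.0, dim=2, seed=None):
    """Draw latent positions, then draw each edge independently given the positions."""
    rng = random.Random(seed)
    positions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < edge_probability(positions[i], positions[j], alpha)
    ]
    return positions, edges

positions, edges = sample_latent_distance_graph(n=30, alpha=1.0, seed=0)
```

Because distances obey the triangle inequality, two nodes that are both close to a third are close to each other, which is how this model produces transitivity.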
Figure due to P. D. Hoff, Modeling homophily and stochastic equivalence in symmetric relational data, NIPS 2008
network
transitivity
enough degrees of freedom
to associate with people with different characteristics)
(1983)
Rényi model
variable $z_i \in \{1, \ldots, K\}$ denoting its class or group
depend on class memberships of nodes ($K \times K$ matrix W)
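A short sketch of sampling from a stochastic block model given class labels and the block matrix W. The function name and the two example matrices are assumptions; the undirected, no-self-loop convention is also a simplifying choice.

```python
import random

def sample_sbm(classes, W, seed=None):
    """classes[i] is the class of node i; W[k][l] is the edge probability between classes k and l."""
    rng = random.Random(seed)
    n = len(classes)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < W[classes[i]][classes[j]]:
                edges.append((i, j))
    return edges

classes = [0] * 10 + [1] * 10
W_assortative = [[0.8, 0.05], [0.05, 0.8]]      # dense within groups
W_disassortative = [[0.05, 0.8], [0.8, 0.05]]   # dense between groups
edges_a = sample_sbm(classes, W_assortative, seed=1)
edges_d = sample_sbm(classes, W_disassortative, seed=1)
```

Setting every entry of W to the same value p recovers the "each edge with probability p" random graph, and the disassortative matrix shows that blocks need not be densely connected communities.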
functional roles in social networks
generalization of structural equivalence
identical probabilities of forming edges to members
disassortative mixing
Figure due to P. D. Hoff, Modeling homophily and stochastic equivalence in symmetric relational data, NIPS 2008
Original graph Blockmodel
Figure due to Goldenberg et al. (2009) - Survey of Statistical Network Models, Foundations and Trends
Stochastically equivalent, but are not densely connected
UCSD UCI UCLA Alice 1 Bob 1 Claire 1
Alice Bob Claire
Kemp, Charles, et al. "Learning systems of concepts with an infinite relational model." AAAI. Vol. 3. 2006.
Latent groups Z; interaction matrix W (probability of an edge from block k to block k′)
Running Dancing Fishing Alice 1 Bob 1 Claire 1
Alice Bob Claire Nodes assigned to only
Not always an appropriate assumption
Running Dancing Fishing Alice 0.4 0.4 0.2 Bob 0.5 0.5 Claire 0.1 0.9
Alice Bob Claire Nodes represented by distributions
Airoldi et al., (2008)
Mixed membership implies a kind of “conservation of (probability) mass” constraint: if you like cycling more, you must like running less, to sum to one.
Miller, Griffiths, Jordan (2009)
[Figure: latent features Cycling, Fishing, Running, Waltz, Tango, Salsa for nodes Alice, Bob, Claire]
Z =
Alice Bob Claire Nodes represented by binary vector of latent features
(Miller, Griffiths, Jordan, 2009) likelihood model:
in the p2 model
1
+ ο΅
and generative models
models
Viswanath et al. (2009)
Facebook wall
not handle directed graphs in a straightforward manner
as a starting point
Kemp, Charles, et al. "Learning systems of concepts with an infinite relational model." AAAI. Vol. 3. 2006.
Latent groups Z; interaction matrix W (probability of an edge from block k to block k′)
each node is given by some other variable
class $k$ connects to node in class $k'$ for all $k, k'$
(Number of actual edges in block $(k, k')$) / (Number of possible edges in block $(k, k')$)
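The ratio above is easy to compute when the classes are known. A minimal sketch for a directed graph without self-loops; the function name and the toy data are assumptions for illustration.

```python
from collections import Counter

def estimate_block_matrix(classes, edges, K):
    """W[k][l] = actual edges from class k to class l / possible edges from k to l."""
    sizes = Counter(classes)
    actual = [[0] * K for _ in range(K)]
    for i, j in edges:
        actual[classes[i]][classes[j]] += 1
    W = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(K):
            # Ordered node pairs in the block; self-pairs are excluded on the diagonal.
            possible = sizes[k] * sizes[l] - (sizes[k] if k == l else 0)
            W[k][l] = actual[k][l] / possible if possible else 0.0
    return W

classes = [0, 0, 0, 1, 1]
edges = [(0, 1), (1, 0), (1, 2), (0, 3), (4, 3)]  # directed toy edges
W_hat = estimate_block_matrix(classes, edges, K=2)
```

In this toy example block (0, 0) has 3 of 6 possible edges and block (0, 1) has 1 of 6.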
accurately estimate latent classes
matrix to estimate classes
$Y = U \Sigma V^T$
rows and columns of $\Sigma$
$Y = (U \Sigma^{1/2})(V \Sigma^{1/2})^T$
to return estimate $\hat{Z}$
Scales to networks with thousands of nodes!
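A compact sketch of the spectral approach, assuming NumPy is available: embed nodes with the scaled singular vectors for the K largest singular values, then cluster the rows. The clustering step here is a plain k-means with farthest-first initialization, which is an assumption of this sketch rather than something stated on the slide; all function names are hypothetical.

```python
import numpy as np

def spectral_embedding(Y, K):
    """Rows of [U S^(1/2), V S^(1/2)] restricted to the K largest singular values."""
    U, s, Vt = np.linalg.svd(Y)  # singular values are returned in descending order
    root = np.sqrt(s[:K])
    return np.hstack([U[:, :K] * root, Vt[:K, :].T * root])

def kmeans(X, K, n_iter=20):
    """Lloyd's algorithm with deterministic farthest-first initialization."""
    centers = [X[0]]
    for _ in range(1, K):
        dist_to_nearest = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[dist_to_nearest.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def spectral_sbm_classes(Y, K):
    """Estimate class labels from a sociomatrix Y."""
    return kmeans(spectral_embedding(np.asarray(Y, dtype=float), K), K)

# Noise-free toy sociomatrix: groups of sizes 4 and 6, ones only within groups
# (the diagonal is left at 1 purely to keep the example simple).
Y = np.zeros((10, 10))
Y[:4, :4] = 1.0
Y[4:, 4:] = 1.0
labels = spectral_sbm_classes(Y, K=2)
```

The cost is dominated by the SVD, so a truncated or sparse solver would be the natural replacement for larger networks.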
interactions between 62 bottlenose dolphins
and Newman (2004)
interactions between dolphins
transitivity may be expected
by Hoff et al. (2002)
between latent positions
2
$i \neq j$
matrix $D$ but not in latent positions $Z$
scaling (MDS) to get initialization for π
in graph then use MDS
visualization using scatter plot Scales to ~1000 nodes
prior beliefs, write down your likelihood, and apply Bayes' rule,
Posterior = Likelihood × Prior / Marginal likelihood (a.k.a. model evidence)
The marginal likelihood is a normalization constant that does not depend on the value of θ. It is the probability of the data under the model, marginalizing over all possible θ's.
The mode (MAP estimate) is unrepresentative of the distribution
the posterior, with a set of samples
distribution and draw samples
Bayesian posterior, can be hard.
Graph structure gives us Markov blanket
their conditional distributions
the variables, in any order.
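As a toy illustration of Gibbs sampling (not a network model), the sketch below samples a standard bivariate normal with correlation `rho` by alternately drawing each variable from its conditional distribution given the other. The target distribution, function name, and burn-in length are assumptions chosen because the conditionals are known in closed form.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in=100, seed=None):
    """Alternate x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2)."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho ** 2)
    x = y = 0.0
    samples = []
    for t in range(burn_in + n_samples):
        x = rng.gauss(rho * y, sd)  # update x given the current y
        y = rng.gauss(rho * x, sd)  # update y given the new x
        if t >= burn_in:
            samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, n_samples=5000, seed=0)
```

For a network model the same loop would cycle over latent variables such as class assignments, each drawn given the graph and all other variables.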
distribution q(z)
p(z) as possible, e.g. in KL-divergence
Reverse KL, KL(q‖p): blows up if p is small and q isn't. Under-estimates the support.
Forwards KL, KL(p‖q): blows up if q is small and p isn't. Over-estimates the support.
Figures due to Kevin Murphy (2012). Machine Learning: A Probabilistic Perspective
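The asymmetry of the two divergences can be checked numerically on a small discrete example. The distributions `p` (two modes) and `q` (mass on one mode) are made-up values for illustration.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.49, 0.02, 0.49]  # hypothetical bimodal target
q = [0.98, 0.01, 0.01]  # hypothetical approximation covering only the first mode

forward = kl(p, q)  # KL(p || q): penalizes q for being small where p is not
reverse = kl(q, p)  # KL(q || p): penalizes q for putting mass where p is small
```

Here `forward` is larger than `reverse`: an approximation that drops a mode is punished heavily by the forwards KL but only mildly by the reverse KL that variational inference minimizes.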
Fit the data well
Be flat
to make these expectations tractable.
The entropy term decomposes nicely:
the algorithm must converge
update one variable given the rest
variable, while Gibbs sampling draws from one.
samples
unlike Gibbs sampling
algorithms
parameters $W_{kk'}$
mixed membership vectors, cluster assignments
zp->q and zq->p assignments
membership distribution
friendship relationships between novice monks
Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2009). Mixed membership stochastic blockmodels. In Advances in Neural Information Processing Systems (pp. 33-40).
Estimated blockmodel
Estimated blockmodel: least coherent
Estimated mixed membership vectors (posterior mean)
Estimated mixed membership vectors (posterior mean): expelled
Estimated mixed membership vectors (posterior mean): wavering not captured
Original network (whom do you like?); summary of network (use π's)
Original network (whom do you like?); denoised network (use z's)
algorithms
al., 2012)
(Korattikara et al., 2014)
probably evaluate based on it!
evaluation purposes
allows us to “look into the mind of the model” (G. Hinton)
“This use of the word mind is not intended to be metaphorical. We believe that a mental state is the state of a hypothetical, external world in which a high-level internal representation would constitute veridical perception. That hypothetical world is what the figure shows.” Geoff Hinton et al. (2006), A Fast Learning Algorithm for Deep Belief Nets.
and generative models
models
account for dynamics
Dynamic social network (Nordlie, 1958; Newcomb, 1961)
A dynamic relational infinite feature model for longitudinal social networks. AISTATS 2011
changing latent features
[Figure: latent features (Cycling, Fishing, Running, Waltz, Tango, Salsa) of Alice, Bob, and Claire changing over time]
each actor's feature chains
construction of the IBP to adaptively truncate the number of features but still perform exact inference
assumes hidden Markov structure
– Latent variables and/or parameters follow Markov dynamics
– Graph snapshot at each time generated using static network model, e.g. stochastic block model or latent feature model as in DRIFT
– Has been used to extend SBMs to dynamic models (Yang et al., 2011; Xu and Hero, 2014)
realistic assumption in social interaction networks
– Interaction between two people does not influence future interactions
current parameters and previous graph
– Scales to ~1000 nodes
$(k, k')$ with two probabilities
– Probability of forming new edge: $\pi_{kk'}^{t|0} = \Pr(y_{ij}^{t} = 1 \mid y_{ij}^{t-1} = 0)$
– Probability of existing edge re-occurring: $\pi_{kk'}^{t|1} = \Pr(y_{ij}^{t} = 1 \mid y_{ij}^{t-1} = 1)$
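One time step of such a transition model can be sketched as below: each directed node pair uses the "existing edge" probability if the edge was present in the previous snapshot and the "new edge" probability otherwise, looked up by the classes of its endpoints. The names `pi_new`, `pi_keep`, `next_snapshot` and the toy values are assumptions, and classes are held fixed over the step for simplicity.

```python
import random

def next_snapshot(prev_edges, classes, pi_new, pi_keep, seed=None):
    """Sample the edge set at time t given the edge set at time t - 1."""
    rng = random.Random(seed)
    n = len(classes)
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            k, l = classes[i], classes[j]
            # Existing edges re-occur with pi_keep; absent edges form with pi_new.
            p = pi_keep[k][l] if (i, j) in prev_edges else pi_new[k][l]
            if rng.random() < p:
                edges.add((i, j))
    return edges

classes = [0, 0, 1, 1]
prev_edges = {(0, 1), (2, 3)}
pi_new = [[0.1, 0.02], [0.02, 0.1]]   # sparse blocks: new edges are rare
pi_keep = [[0.9, 0.8], [0.8, 0.9]]    # but existing edges tend to persist
edges_t = next_snapshot(prev_edges, classes, pi_new, pi_keep, seed=0)
```

Setting `pi_keep` equal to `pi_new` removes the dependence on the previous graph, which is the conditional independence assumption of the hidden Markov approach.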
– ~700 nodes, 9 time steps, 5 classes
replicate edge durations in observed network?
– Simulate networks from both models using estimated parameters
– Hidden Markov SBM cannot replicate long-lasting edges in sparse blocks
[Figure: cascade of events at times t=0, t=1, t=1.5, t=2, t=3.5]
from text-based cascades. ICML 2015.
Mutually exciting nature: a posting event can trigger future events
Content cascades: the content of a document should be similar to the document that triggers its publication
Rate = base intensity + influence from previous events:
$\lambda_v(t) = \mu_v + \sum_{n:\, t_n < t} A_{v_n, v}\, \Delta(t - t_n)$
$A_{u,v}$: influence strength from $u$ to $v$
$\Delta(\cdot)$: probability density function of the delay distribution
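The rate can be evaluated directly from a list of past events. In this sketch the delay distribution is taken to be exponential, which is an assumption made for illustration; the names `mu`, `A`, `events`, and the numeric values are hypothetical.

```python
import math

def exponential_delay_density(dt, rate=1.0):
    """Assumed delay density: exponential with the given rate, zero for negative delays."""
    return rate * math.exp(-rate * dt) if dt >= 0 else 0.0

def intensity(v, t, events, mu, A, rate=1.0):
    """Base intensity of node v plus the influence of every event strictly before time t."""
    total = mu[v]
    for t_n, v_n in events:
        if t_n < t:
            total += A[v_n][v] * exponential_delay_density(t - t_n, rate)
    return total

mu = [0.1, 0.2]                    # base intensities
A = [[0.0, 0.5], [0.3, 0.0]]       # A[u][v]: influence strength from u to v
events = [(0.0, 0), (1.0, 1)]      # (time, node) pairs
rate_node1 = intensity(1, 1.5, events, mu, A)
```

Each past event adds a bump to the rates of the nodes it influences, and the bump fades as the delay grows; this is what makes the process mutually exciting.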
modeling social networks
models motivated by sociological principles
social networks
a “giant” connected component may emerge