SLIDE 1

Monte Carlo algorithms for Bayesian social network models

Alberto Caimo alberto.caimo@usi.ch

Social Network Analysis Research Centre
InterDisciplinary Institute of Data Science
University of Lugano, Switzerland

AMS Sectional Meeting, Loyola University Chicago, October 2015

Alberto Caimo Bayesian computation for ERGMs

SLIDE 2

Outline

Exponential random graph models (ERGMs)
Bayesian exponential random graph models (BERGMs)
Computational approaches for BERGMs
Examples

SLIDE 5

Exponential random graph models (ERGMs)

The relational structure of an observed network $y$ can be explained by the relative prevalence of a set of overlapping sub-graph configurations $s(y)$ called network statistics.

The likelihood of an ERGM represents the probability distribution of $y$ given a parameter $\theta$:

$$p(y \mid \theta) = \exp\{\theta^{t} s(y) - \gamma(\theta)\}$$

From a computational point of view we have an intractable likelihood problem.
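The intractability comes from the normalising term: for the likelihood to sum to one, $\gamma(\theta)$ must equal the log of a sum over every graph on $n$ nodes, and there are $2^{n(n-1)/2}$ undirected graphs. The sketch below is only an illustration under assumptions not taken from the slides: it uses a two-statistic model (edges and triangles) and hypothetical function names, and it enumerates the sum by brute force, which is feasible only for very small $n$.

```python
import itertools
import math


def count_statistics(n, edge_set):
    """Return (number of edges, number of triangles) of an undirected graph."""
    triangles = sum(
        1
        for i, j, k in itertools.combinations(range(n), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    )
    return len(edge_set), triangles


def log_normalising_constant(n, theta):
    """Brute-force gamma(theta) = log sum_y exp(theta^t s(y)) over all graphs."""
    dyads = list(itertools.combinations(range(n), 2))
    terms = []
    for mask in itertools.product((0, 1), repeat=len(dyads)):
        edge_set = {d for d, present in zip(dyads, mask) if present}
        edges, triangles = count_statistics(n, edge_set)
        terms.append(theta[0] * edges + theta[1] * triangles)
    largest = max(terms)  # log-sum-exp for numerical stability
    return largest + math.log(sum(math.exp(t - largest) for t in terms))


def number_of_graphs(n):
    """Number of undirected graphs on n labelled nodes."""
    return 2 ** (n * (n - 1) // 2)


if __name__ == "__main__":
    print(log_normalising_constant(4, (-1.0, 0.5)))
    print(number_of_graphs(34))  # number of terms for a 34-node network
```

With 4 nodes the sum has 64 terms; with 34 nodes it has $2^{561}$, which is why the methods in the following slides avoid evaluating $\gamma(\theta)$ directly.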

SLIDE 6

Bayesian exponential random graph models (BERGMs)

(1) Observed network $y$
(2) Model specification: $p(y \mid \theta, m) \propto \exp\{\theta^{t} s(y)\}$
(3) Doubly-intractable posterior: $p(\theta, m \mid y) \propto p(y \mid \theta, m)\, p(\theta \mid m)\, p(m)$
(4) Parameter inference: $p(\theta \mid y, m)$
(5) Model choice: across-model $p(m \mid \theta, y)$; within-model $p(y \mid m)$
(6) Goodness of fit

SLIDE 7

Bayesian exponential random graph models (BERGMs)

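likelihood: $p(y \mid \theta) = \exp\{\theta^{t} s(y) - \gamma(\theta)\}$, with parameter $\theta$, network statistics $s(y)$ and normalising constant $\gamma(\theta)$

posterior: $p(\theta \mid y) = \dfrac{p(y \mid \theta)\, p(\theta)}{p(y)}$, with likelihood $p(y \mid \theta)$, prior $p(\theta)$ and model evidence $p(y)$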

SLIDE 8

Computational approaches

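Let’s define:

$q_{\theta}(y) = \exp\{\theta^{t} s(y)\}$ (unnormalised likelihood)
$z(\theta) = \exp\{\gamma(\theta)\}$ (normalising constant)

Metropolis-Hastings

1 Gibbs update of $\theta'$
(i) Draw $\theta' \sim h(\cdot \mid \theta)$

2 Accept move from $\theta$ to $\theta'$ with probability

$$1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h(\theta \mid \theta')}{h(\theta' \mid \theta)} \times \frac{z(\theta)}{z(\theta')}$$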

SLIDE 10

Computational approaches

Approximate exchange algorithm (AEA)

(Murray et al., 2006; Caimo and Friel, 2011)

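1 Gibbs update of $(\theta', y')$
(i) Draw $\theta' \sim h(\cdot \mid \theta)$
(ii) Draw $y' \sim p(\cdot \mid \theta')$ via MCMC

2 Exchange move from $\theta$ to $\theta'$ with probability

$$1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h(\theta \mid \theta')}{h(\theta' \mid \theta)} \, \frac{q_{\theta}(y')}{q_{\theta'}(y')} \times \underbrace{\frac{z(\theta)\, z(\theta')}{z(\theta)\, z(\theta')}}_{=\,1}$$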
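The exchange move can be sketched on a toy model where everything is checkable. The code below is an illustration, not the implementation behind these slides. It assumes an edges-only ERGM, for which $s(y)$ is the edge count and the dyads are independent, so the auxiliary network can be drawn exactly from a binomial distribution instead of by MCMC. The Gaussian prior, the random-walk proposal and all names and default values are assumptions made for the example.

```python
import numpy as np


def exchange_sampler(s_obs, n_dyads, n_iter=5000, step=0.15, prior_sd=10.0, seed=0):
    """Exchange algorithm for an edges-only ERGM with q_theta(y) = exp(theta * s(y)).

    s_obs is the observed number of edges and n_dyads the number of node pairs.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        # 1(i): symmetric random-walk proposal, so the h-ratio equals 1
        theta_new = theta + step * rng.normal()
        # 1(ii): auxiliary draw y' ~ p(. | theta'); exact here because dyads
        # are independent Bernoulli(expit(theta')) in the edges-only model
        p_edge = 1.0 / (1.0 + np.exp(-theta_new))
        s_aux = rng.binomial(n_dyads, p_edge)
        # 2: log exchange ratio; z(theta) and z(theta') never appear
        log_ratio = (
            (theta_new - theta) * s_obs                          # q_theta'(y) / q_theta(y)
            + (theta**2 - theta_new**2) / (2.0 * prior_sd**2)    # p(theta') / p(theta)
            + (theta - theta_new) * s_aux                        # q_theta(y') / q_theta'(y')
        )
        if np.log(rng.uniform()) < log_ratio:
            theta = theta_new
        chain[t] = theta
    return chain


if __name__ == "__main__":
    chain = exchange_sampler(s_obs=78, n_dyads=561)
    print(chain[1000:].mean(), chain[1000:].std())
```

For this toy model the posterior should concentrate near the log-odds of the observed density, which gives a simple sanity check. For models with dependence terms the exact draw is not available, which is where the MCMC approximation in step 1(ii) comes in.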

SLIDE 11

Computational approaches

Computational challenges

The posterior distribution p(θ|y) is difficult to sample from efficiently, as the posterior density of ERGM parameters is typically very thin and highly correlated.

SLIDE 12

Computational approaches

Improving chain mixing and convergence

(Caimo and Friel, 2011)

1(i) Parallel adaptive direction sampling (ADS) for the Gibbs update of θ′

1(ii) Tie/no tie (TNT) sampler (as in the ergm package for R)

SLIDE 13

Computational approaches

[Figure: ADS sampler “snooker move”, with points labelled θh, θh1 and θh2]
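A possible reading of the snooker move, in code: the proposal for chain h is shifted along the difference between two other chains. The slide shows only the figure, so the exact form used here, θh′ = θh + γ(θh1 − θh2) + ε, together with the function name and the parameters `gamma` and `eps_sd`, is an assumption based on the usual description of parallel ADS.

```python
import numpy as np


def ads_proposal(thetas, h, gamma, eps_sd, rng):
    """Propose a new state for chain h from the difference of two other chains.

    thetas has shape (H, d): current states of H parallel chains.
    """
    n_chains, dim = thetas.shape
    others = [k for k in range(n_chains) if k != h]
    # pick two distinct chains h1 and h2, both different from h
    h1, h2 = rng.choice(others, size=2, replace=False)
    eps = eps_sd * rng.normal(size=dim)  # small symmetric perturbation
    return thetas[h] + gamma * (thetas[h1] - thetas[h2]) + eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    thetas = rng.normal(size=(6, 3))  # 6 chains, 3 parameters
    print(ads_proposal(thetas, h=0, gamma=0.5, eps_sd=0.01, rng=rng))
```

Because the direction comes from the other chains, proposals tend to align with the long axis of a thin, correlated posterior without any manual tuning of a covariance matrix.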

SLIDE 14

Bayesian exponential random graph models (BERGMs)

[Figure: plot with axis ticks running from −15 to 10 and from −10 to 10]

SLIDE 16

Computational approaches

Adaptive approximate exchange algorithm with delayed rejection

(Caimo and Mira, 2015)

1(i) Adaptive strategies
2 Approximate exchange algorithm with delayed rejection

SLIDE 17

Computational approaches

Adaptive strategies for Gibbs update of θ ′

(Roberts and Rosenthal, 2007; Haario et al., 2001)

vertical adaptation: all past particles along the same chain (AAEA-1+DR)
horizontal adaptation: all particles at the current time for all chains (AAEA-2+DR)
rectangular adaptation: particles from all chains and all past simulations (AAEA-3+DR)
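The three strategies differ only in which particles feed the adaptation. A minimal sketch, assuming the chain history is stored as an array of shape (T, H, d) and that the adapted quantity is an empirical covariance matrix; the slides do not say how that covariance is then scaled or used in the proposal, so the sketch stops there. The function name and strategy labels are hypothetical.

```python
import numpy as np


def adaptation_covariance(history, chain, strategy):
    """Empirical covariance of the particles selected by an adaptation strategy.

    history has shape (T, H, d): T iterations so far, H chains, d parameters.
    """
    if strategy == "vertical":
        # all past particles along the same chain
        particles = history[:, chain, :]
    elif strategy == "horizontal":
        # all particles at the current time, across chains
        particles = history[-1, :, :]
    elif strategy == "rectangular":
        # particles from all chains and all past iterations
        particles = history.reshape(-1, history.shape[-1])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.cov(particles, rowvar=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    history = rng.normal(size=(50, 6, 3))
    for name in ("vertical", "horizontal", "rectangular"):
        print(name, adaptation_covariance(history, chain=2, strategy=name).shape)
```

Horizontal adaptation needs more chains than parameters for the covariance estimate to be full rank, while vertical adaptation needs a long enough history on a single chain.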

SLIDE 19

Computational approaches

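Adaptive exchange algorithm with delayed rejection

First stage move:

$$\alpha_1(\theta, \theta') = 1 \wedge \frac{q_{\theta'}(y)}{q_{\theta}(y)} \, \frac{p(\theta')}{p(\theta)} \, \frac{h_1(\theta \mid \theta')}{h_1(\theta' \mid \theta)} \, \frac{q_{\theta}(y')}{q_{\theta'}(y')}$$

If $\theta'$ is rejected, try a second stage move to $\theta''$ (with auxiliary network $y''$ drawn from $p(\cdot \mid \theta'')$) with probability

$$\alpha_2(\theta, \theta', \theta'') = 1 \wedge \frac{q_{\theta''}(y)\, p(\theta'')\, h_1(\theta' \mid \theta'')\, [1 - \alpha_1(\theta'', \theta')]\, h_2(\theta \mid \theta'', \theta')\, q_{\theta}(y'')}{q_{\theta}(y)\, p(\theta)\, h_1(\theta' \mid \theta)\, [1 - \alpha_1(\theta, \theta')]\, h_2(\theta'' \mid \theta, \theta')\, q_{\theta''}(y'')}$$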

SLIDE 20

Computational approaches

A hierarchy of proposal distributions can be exploited.

Moves (tennis-service strategy): a “first bold” proposal h1(·) versus a “second timid” proposal h2(·)
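The bold-then-timid idea can be tried on a target whose density is known. The sketch below is a plain two-stage delayed rejection Metropolis-Hastings sampler for a standard normal target, not the exchange version: it has no auxiliary networks and no intractable constant. Both proposals are Gaussian random walks centred at the current state, so the h2 terms cancel in the second-stage ratio. The standard deviations and all names are arbitrary choices for illustration.

```python
import numpy as np


def log_target(x):
    """Log density of the standard normal target, up to a constant."""
    return -0.5 * x * x


def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def alpha1(current, proposed):
    """First-stage acceptance probability (symmetric proposal)."""
    return min(1.0, np.exp(log_target(proposed) - log_target(current)))


def delayed_rejection_chain(n_iter=20000, bold_sd=6.0, timid_sd=0.8, seed=1):
    rng = np.random.default_rng(seed)
    theta = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        first = theta + bold_sd * rng.normal()  # "first bold" proposal h1
        a1 = alpha1(theta, first)
        if rng.uniform() < a1:
            theta = first
        else:
            second = theta + timid_sd * rng.normal()  # "second timid" proposal h2
            # second-stage ratio; the symmetric h2 terms cancel
            num = (np.exp(log_target(second)) * normal_pdf(first, second, bold_sd)
                   * (1.0 - alpha1(second, first)))
            den = (np.exp(log_target(theta)) * normal_pdf(first, theta, bold_sd)
                   * (1.0 - a1))
            if rng.uniform() < min(1.0, num / den):
                theta = second
        chain[t] = theta
    return chain


if __name__ == "__main__":
    chain = delayed_rejection_chain()
    print(chain.mean(), chain.var())
```

The second stage costs an extra target evaluation only when the bold move fails, which is the trade-off the ESS per CPU time comparisons later in the deck are measuring.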

SLIDE 21

Examples

[Figure: network graph with nodes labelled 1 to 34]

Zachary Karate Club Network: Friendship relations between 34 members of a karate club at a US university in the 1970s

SLIDE 23

Examples

ERGM: $p(y \mid \theta) \propto \exp\{\theta_1 s_1(y) + \theta_2 s_2(y, \phi_v) + \theta_3 s_3(y, \phi_u)\}$

where:

$s_1(y) = \sum_{i<j} y_{ij}$ = number of edges

$s_2(y, \phi_v) = e^{\phi_v} \sum_{i=1}^{n-2} \left\{1 - \left(1 - e^{-\phi_v}\right)^{i}\right\} EP_i(y)$ = geometrically weighted edgewise shared partners (gwesp)

$s_3(y, \phi_u) = e^{\phi_u} \sum_{i=1}^{n-1} \left\{1 - \left(1 - e^{-\phi_u}\right)^{i}\right\} D_i(y)$ = geometrically weighted degrees (gwdegree)

$EP_i(y)$ = number of unordered pairs of connected nodes having exactly $i$ common neighbours
$D_i(y)$ = number of nodes with degree $i$
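The three statistics can be computed directly from an adjacency matrix. The sketch below follows the formulas above term by term, assuming a symmetric 0/1 integer adjacency matrix with a zero diagonal; the function names are hypothetical and this is not the code of the ergm package.

```python
import numpy as np


def edges(adj):
    """s1(y): number of edges, counting each pair i < j once."""
    return int(np.triu(adj, k=1).sum())


def gw_weights(max_i, phi):
    """Weights e^phi * {1 - (1 - e^-phi)^i} for i = 1, ..., max_i."""
    i = np.arange(1, max_i + 1)
    return np.exp(phi) * (1.0 - (1.0 - np.exp(-phi)) ** i)


def gwesp(adj, phi):
    """s2(y, phi_v): geometrically weighted edgewise shared partners."""
    n = adj.shape[0]
    shared = adj @ adj  # shared[i, j] = number of common neighbours of i and j
    iu = np.triu_indices(n, k=1)
    partners = shared[iu][adj[iu] == 1]          # one entry per connected pair
    ep = np.bincount(partners, minlength=n - 1)  # ep[i] = EP_i(y)
    return float(gw_weights(n - 2, phi) @ ep[1:n - 1])


def gwdegree(adj, phi):
    """s3(y, phi_u): geometrically weighted degrees."""
    n = adj.shape[0]
    d = np.bincount(adj.sum(axis=1), minlength=n)  # d[i] = D_i(y)
    return float(gw_weights(n - 1, phi) @ d[1:n])


if __name__ == "__main__":
    triangle = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)
    print(edges(triangle), gwesp(triangle, np.log(2)), gwdegree(triangle, np.log(2)))
```

For a triangle every edge has exactly one shared partner and the weight for $i = 1$ is always 1, so gwesp equals the number of edges; this makes a convenient hand check.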

SLIDE 24

Examples

for model 14                              θ1 (edges)   θ2 (gwesp)   θ3 (gwdegree)
ADS-AEA
  Post. mean                                −3.51         0.74          1.18
  Post. sd                                   0.62         0.21          1.12
AAEA-2+DR (horizontal adaptation + DR)
  Post. mean                                −3.44         0.72          1.01
  Post. sd                                   0.59         0.21          1.07

Estimated posterior means and standard deviations

SLIDE 25

Examples

[Figure: for each algorithm, three columns of panels titled θ1 (edges), θ2 (gwesp.fixed.0.693147180559945) and θ3 (gwdegree), each showing a posterior density estimate and an autocorrelation plot against lag (up to 50)]

Posterior density estimates for AEA (left) and AAEA+DR (right)

SLIDE 26

Examples

Effective sample size (ESS): AAEA-2+DR +70%
Performance (= ESS/CPU time): AAEA-2+DR +40%
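One common way to estimate ESS from a single chain is to divide the chain length by the integrated autocorrelation time. The sketch below uses a simple estimator that sums sample autocorrelations until the first non-positive one; the slides do not say which ESS estimator was used, so this choice, the function name and the `max_lag` parameter are assumptions.

```python
import numpy as np


def effective_sample_size(chain, max_lag=200):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    var = x @ x / n
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = (x[:-lag] @ x[lag:]) / (n * var)  # sample autocorrelation at this lag
        if rho <= 0.0:
            break  # truncate at the first non-positive autocorrelation
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=20000)
    ar = np.empty_like(noise)
    ar[0] = noise[0]
    for t in range(1, noise.size):
        ar[t] = 0.9 * ar[t - 1] + noise[t]  # strongly autocorrelated chain
    print(effective_sample_size(noise), effective_sample_size(ar))
    # performance as defined on the slide would be ESS divided by CPU seconds
```

A slowly decaying autocorrelation plot therefore translates directly into a smaller ESS for the same number of iterations.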

SLIDE 27

Examples

[Figure: three goodness-of-fit panels: degree; minimum geodesic distance (proportion of dyads); edge-wise shared partners (proportion of edges)]

Bayesian goodness-of-fit diagnostics: the observed network y is compared with a set of networks simulated from independent realisations of p(θ|y) in terms of high-level network statistics

SLIDE 28

Examples

Faux Mesa High School friendship 203-node network graph.

SLIDE 29

Examples

ERGM specification:

$s_1(y) = \sum_{i<j} y_{ij}$: number of edges
$s_2(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{grade}_i = 8) + 1(\mathrm{grade}_j = 8)\}$: node factor for “grade” = 8
$s_3(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{grade}_i = 9) + 1(\mathrm{grade}_j = 9)\}$: node factor for “grade” = 9
$s_4(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{grade}_i = 10) + 1(\mathrm{grade}_j = 10)\}$: node factor for “grade” = 10
$s_5(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{grade}_i = 11) + 1(\mathrm{grade}_j = 11)\}$: node factor for “grade” = 11
$s_6(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{grade}_i = 12) + 1(\mathrm{grade}_j = 12)\}$: node factor for “grade” = 12
$s_7(y, x) = \sum_{i<j} y_{ij} \{1(\mathrm{sex}_i = M) + 1(\mathrm{sex}_j = M)\}$: node factor for “sex = male”
$s_8(y) = v(y, \phi_v)$: GWESP
$s_9(y) = u(y, \phi_u)$: GWD
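The node factor statistics all share one form, so a single helper covers them. A minimal sketch, assuming a symmetric 0/1 adjacency matrix and a vector of nodal attributes; the function name and the toy data are hypothetical.

```python
import numpy as np


def node_factor(adj, attribute, level):
    """s(y, x) = sum_{i<j} y_ij * (1(x_i = level) + 1(x_j = level))."""
    indicator = (np.asarray(attribute) == level).astype(int)
    iu, ju = np.triu_indices(adj.shape[0], k=1)  # all pairs with i < j
    return int((adj[iu, ju] * (indicator[iu] + indicator[ju])).sum())


if __name__ == "__main__":
    # toy path graph 0 - 1 - 2 - 3 with made-up grades
    adj = np.zeros((4, 4), dtype=int)
    for i, j in [(0, 1), (1, 2), (2, 3)]:
        adj[i, j] = adj[j, i] = 1
    grade = [8, 8, 9, 10]
    print(node_factor(adj, grade, 8), node_factor(adj, grade, 9))
```

Each edge contributes once per endpoint at the given level, so the statistic equals the sum of the degrees of the nodes with that attribute value.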

SLIDE 30

Examples

                         ADS-AEA      AAEA-1      AAEA-2      AAEA-3
ESS                        667         1041        1008        1094
Performance (per sec)      1.8         2.3         2.1         2.2

                         ADS-AEA+DR   AAEA-1+DR   AAEA-2+DR   AAEA-3+DR
ESS                        873         1376        1320        1440
Performance (per sec)      1.4         2.6         2.6         2.6

The variance reduction of the AAEA-based algorithms relative to ADS-AEA varies between 55% and 98%. This translates into a performance improvement of between 25% and 40%.
