Exponential-family Random Network Models (ERNM) - Ian Fellows, UCLA



SLIDE 1

Exponential-family Random Network Models (ERNM)

Ian Fellows

UCLA

January 9, 2012

Ian Fellows Exponential-family Random Network Models (ERNM)

SLIDE 2

The landscape

◮ Random graphs == Random connections, Fixed nodal attributes

◮ Gibbs/Markov random fields == Fixed connections, Random nodal attributes

◮ ERNM == Random connections, Random nodal attributes

SLIDE 3

ERNM Formulation

Let Y be an n by n matrix whose entries $Y_{i,j}$ indicate whether subjects i and j are connected, where n is the size of the population. Further, let X be an n × q matrix of nodal variates. We define the network to be the random variable (Y, X). Then a joint exponential family model for the network may be written as:

$$P(X = x, Y = y \mid \eta) = \frac{1}{c(\eta)} e^{\eta \cdot h(x, y)}, \quad (x, y) \in \mathcal{N} \tag{1}$$
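To make equation (1) concrete, the following is a minimal Python sketch that enumerates every network (x, y) on three nodes and computes c(η) by brute force. The undirected graph, the binary nodal attribute, the two statistics in `h` (edge count and count of edges joining equal attributes), and all function names are illustrative assumptions, not taken from the slides.

```python
import itertools
import math

N_NODES = 3
# Undirected dyads (i, j) with i < j; treating the graph as undirected is an assumption.
PAIRS = list(itertools.combinations(range(N_NODES), 2))


def h(x, y):
    """Hypothetical statistics: (edge count, edges joining nodes with equal attribute)."""
    edges = sum(y)
    matched = sum(yij for (i, j), yij in zip(PAIRS, y) if x[i] == x[j])
    return (edges, matched)


def sample_space():
    """Yield every (x, y): one binary attribute per node, one indicator per dyad."""
    for x in itertools.product((0, 1), repeat=N_NODES):
        for y in itertools.product((0, 1), repeat=len(PAIRS)):
            yield x, y


def log_weight(eta, x, y):
    """The exponent eta . h(x, y) of equation (1)."""
    return sum(e * s for e, s in zip(eta, h(x, y)))


def normalizing_constant(eta):
    """c(eta): the sum of exp(eta . h(x, y)) over the whole sample space."""
    return sum(math.exp(log_weight(eta, x, y)) for x, y in sample_space())


def probability(eta, x, y):
    """P(X = x, Y = y | eta) as in equation (1)."""
    return math.exp(log_weight(eta, x, y)) / normalizing_constant(eta)
```

Brute-force enumeration is only feasible for toy sizes, since the sample space here already has 2^3 × 2^3 = 64 elements and grows exponentially with n; realistic networks would need simulation-based methods instead.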

SLIDE 4

Uninteresting Example: Separable Models

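Suppose that h is composed such that the model can be expressed as

$$P(X = x, Y = y \mid \eta_1, \eta_2) = \frac{1}{c(\eta_1, \eta_2)} e^{\eta_1 \cdot h_1(x) + \eta_2 \cdot h_2(y)}, \quad (x, y) \in \mathcal{N}. \tag{2}$$

Then

$$P(X = x \mid \eta_1) = \frac{1}{c_1(\eta_1)} e^{\eta_1 \cdot h_1(x)}, \qquad P(Y = y \mid \eta_2) = \frac{1}{c_2(\eta_2)} e^{\eta_2 \cdot h_2(y)}.$$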

SLIDE 5

Pathological Example: Ising as a Joint Model

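Joint model:

$$P(X = x, Y = y \mid \eta_1, \eta_2) \propto e^{\eta_1 \sum_i \sum_j y_{i,j} + \eta_2 \sum_i \sum_j x_i y_{i,j} x_j}.$$

With conditional distributions being:

$$P(Y_{i,j} = y_{i,j} \mid X = x, \eta_1, \eta_2) \propto e^{\eta_1 y_{i,j} + \eta_2 x_i y_{i,j} x_j}$$

$$P(X = x \mid Y = y, \eta_2) \propto e^{\eta_2 \sum_i \sum_j x_i y_{i,j} x_j}$$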
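The conditional distributions above suggest a simple Gibbs sampler, sketched here in Python. Several things are assumptions made for illustration rather than details from the slides: the graph is undirected with each dyad counted once, the nodal variable takes values in {-1, +1}, and the function name, sweep schedule, and seed handling are hypothetical. Under these assumptions an edge is present with probability sigmoid(η1 + η2 x_i x_j), and the single-node update is P(x_i = +1 | rest) = sigmoid(2 η2 Σ_j y_{i,j} x_j).

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def gibbs_ising_joint(n_nodes, eta1, eta2, n_sweeps, seed=0):
    """Return one (nodes with x = +1, edge count) pair per sweep."""
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n_nodes)]
    y = [[0] * n_nodes for _ in range(n_nodes)]  # symmetric adjacency, zero diagonal
    draws = []
    for _ in range(n_sweeps):
        # Update every dyad given the current nodal values.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                p_edge = sigmoid(eta1 + eta2 * x[i] * x[j])
                y[i][j] = y[j][i] = 1 if rng.random() < p_edge else 0
        # Update every node given the graph and the other nodal values.
        for i in range(n_nodes):
            s = sum(y[i][j] * x[j] for j in range(n_nodes) if j != i)
            x[i] = 1 if rng.random() < sigmoid(2.0 * eta2 * s) else -1
        n_edges = sum(y[i][j] for i in range(n_nodes) for j in range(i + 1, n_nodes))
        draws.append((sum(1 for v in x if v == 1), n_edges))
    return draws
```

A call such as `gibbs_ising_joint(20, 0.0, 0.13, 1000)` yields a sequence of summary statistics that could be histogrammed; the network size of 20 is a guess, and this sketch makes no claim of reproducing the presenter's simulation.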

SLIDE 6

Pathological Example: Ising as a Joint Model

Degeneracy: Oh My!!!

[Four histogram panels with axes labeled Count and Frequency: "# of edges within x = 1", "# of edges within x = -1", "# of edges between x = 1 and x = -1", "# of nodes with x = 1".]

Figure: 100,000 draws from an Ising Joint Model with η1 = 0 and η2 = 0.13. Mean values are marked in red.

SLIDE 7

Despair

Oh well, better give up.

SLIDE 8

Hope

But wait, is there a better measure of homophily which doesn’t display degeneracy?

SLIDE 9

...

... 6 months pass ....

SLIDE 10

Regularized Homophily

$$\text{reg homophily}(k, l) = \sum_{i : x_i = k} \left( \sqrt{d_{i,l}} - E_{\text{binom}}\left(\sqrt{d_{i,l}}\right) \right),$$

where $d_{i,l}$ is the number of edges connecting node i to nodes in group l, and $E_{\text{binom}}(\sqrt{d_{i,l}})$ is the expectation of the square root of a binomial variable, with probability equal to the proportion of nodes in group l and size equal to the out-degree of node i.
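The definition above can be sketched in Python as follows. The adjacency-matrix representation, the function names, and the choice to compute the group proportion over all n nodes (rather than, say, excluding node i) are assumptions made for illustration.

```python
import math


def expected_sqrt_binomial(size, prob):
    """E[sqrt(B)] for B ~ Binomial(size, prob), computed by direct summation."""
    return sum(
        math.sqrt(m) * math.comb(size, m) * prob ** m * (1.0 - prob) ** (size - m)
        for m in range(size + 1)
    )


def reg_homophily(adj, groups, k, l):
    """Regularized homophily between group k (senders) and group l (receivers).

    adj[i][j] == 1 means an edge from node i to node j; groups[i] is node i's group.
    """
    n = len(adj)
    prop_l = sum(1 for g in groups if g == l) / n  # proportion of nodes in group l
    total = 0.0
    for i in range(n):
        if groups[i] != k:
            continue
        out_degree = sum(adj[i])
        d_il = sum(adj[i][j] for j in range(n) if groups[j] == l)
        total += math.sqrt(d_il) - expected_sqrt_binomial(out_degree, prop_l)
    return total
```

For example, with two groups of two nodes each and a single reciprocated tie inside group 0, each node in group 0 contributes sqrt(1) minus the binomial expectation 0.5, so the within-group statistic is positive and the between-group statistic is negative.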

SLIDE 11

Logistic Regression in Network Data

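$$P(Z = z, X = x, Y = y \mid \eta, \beta, \lambda) = \frac{1}{c(\beta, \eta, \lambda)} e^{z \cdot x\beta + \eta \cdot h(x, y) + \lambda \cdot g(z, y)}. \tag{3}$$

$$P(z_i = 1 \mid z_{-i}, x_i, Y = y, \beta, \lambda) = \frac{e^{x_i \beta}}{e^{\lambda [g(z^-, y) - g(z^+, y)]} + e^{x_i \beta}}, \tag{4}$$

where $z_{-i}$ represents the set of z not including $z_i$, $z^+$ represents z where $z_i = 1$, $z^-$ is z where $z_i = 0$, and $x_i$ represents the ith row of X.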
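Equation (4) can be evaluated directly once a statistic g is chosen. The Python sketch below assumes a scalar λ and a scalar-valued g; the example statistic `g_shared_edges` (number of edges joining two nodes with z = 1) and both function names are hypothetical, chosen only to make the formula executable.

```python
import math


def g_shared_edges(z, y):
    """Hypothetical g(z, y): number of undirected edges whose endpoints both have z = 1."""
    n = len(z)
    return sum(y[i][j] * z[i] * z[j] for i in range(n) for j in range(i + 1, n))


def conditional_prob_z(i, z, x, beta, lam, g, y):
    """P(z_i = 1 | z_{-i}, x_i, Y = y, beta, lambda) following equation (4)."""
    z_plus = list(z)
    z_plus[i] = 1   # z with z_i set to 1
    z_minus = list(z)
    z_minus[i] = 0  # z with z_i set to 0
    linear = sum(x_ij * b for x_ij, b in zip(x[i], beta))  # x_i beta
    delta = g(z_minus, y) - g(z_plus, y)
    return math.exp(linear) / (math.exp(lam * delta) + math.exp(linear))
```

When λ = 0 the network term drops out and the expression reduces to the ordinary logistic regression probability e^{x_i β} / (1 + e^{x_i β}).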

SLIDE 12

A Super-population Model for an Add Health High School

Term                           η  Std. Error        Z   p-value
Mean Degree              -167.90        8.51   -19.73    <0.001
Log Variance of Degree     22.18       10.01     2.22     0.027
Degree = 0                  3.91        0.47     8.28    <0.001
Degree = 1                  2.20        0.38     5.86    <0.001
Degree = 2                  0.73        0.35     2.05     0.041
Grade = 9                   0.88        0.78     1.13     0.258
Grade = 10                  1.74        0.92     1.89     0.058
Grade = 11                  2.53        0.79     3.20     0.001
Within Grade Homophily      3.97        0.47     8.44    <0.001
+1 Grade Homophily          0.50        0.33     1.54     0.125
+2 Grade Homophily         -1.07        0.27    -4.03    <0.001
+3 Grade Homophily         -0.59        0.40    -1.47     0.143

Table: ERNM Model with Standard Errors Based on the Fisher Information
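The Z and p-value columns appear consistent with a two-sided Wald test, Z = η / SE with a standard normal reference distribution. The slides do not state how the p-values were computed, so that is an assumption; the Python sketch below, with a hypothetical function name, shows the arithmetic.

```python
import math


def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) under a standard normal reference distribution."""
    z = estimate / std_error
    # For a standard normal variable, P(|N| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

Applying it to the "Log Variance of Degree" row, `wald_test(22.18, 10.01)`, gives values that round to the tabulated Z and p-value. Other rows may differ in the last digit because the table reports rounded estimates and standard errors.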

SLIDE 13

A Super-population Model for an Add Health High School

Figure: Model-Based Simulated High School

SLIDE 14

A Super-population Model for an Add Health High School

[Four panels: "In-Degree", "Out-Degree", "# of Edges Between Grades" (grade pairs among grades 9 to 12), "Grade Counts" (grades 9 to 12).]

Figure: Model Diagnostics

SLIDE 15

Logistic Regression on Substance Use: Naive model

Term              β  Std. Error        Z   p-value
Intercept     -1.70        0.44    -3.84    <0.001
Male           1.18        0.57     2.09     0.037

Table: Simple Logistic Regression Model Ignoring Network Structure

SLIDE 16

Logistic Regression on Substance Use: ERNM model

Term                           η  Bootstrap Std. Error  Asymptotic Std. Error        Z   p-value
Mean Degree              -164.18                  7.86                   8.07   -20.36    <0.001
Log Variance of Degree     20.35                  8.85                   9.07     2.24     0.025
Degree 0                    4.01                  0.45                   0.44     9.12    <0.001
Degree 1                    2.25                  0.37                   0.35     6.44    <0.001
Degree 2                    0.74                  0.36                   0.35     2.08     0.038
Grade Homophily             3.85                  0.46                   0.46     8.41    <0.001
+1 Grade Homophily          0.45                  0.33                   0.33     1.39     0.166
+2 Grade Homophily         -1.14                  0.28                   0.25    -4.50    <0.001
+3 Grade Homophily         -0.58                  0.39                   0.38    -1.52     0.129
Sex Homophily               0.98                  0.28                   0.27     3.56    <0.001
Substance Homophily         0.88                  0.25                   0.26     3.44    <0.001
Intercept                  -1.79                  0.49                   0.43    -4.11    <0.001
Male                        0.94                  0.56                   0.52     1.81     0.070

Table: ERNM Model Inference

SLIDE 17

Logistic Regression on Substance Use: homophily diagnostics

[Four histogram panels with axes labeled Count and Frequency: "# of edges within non-substance users", "# of edges within substance users", "# of edges between users and non-users", "# of non-substance users".]

Figure: Substance Use Homophily Diagnostics. The values of the observed statistics are marked in red.

SLIDE 18

Conclusion

ERNM is a framework for inference about networks, including both the graph and the nodal characteristics.
