Measuring Segregation in Social Networks: Michał Bojanowski, Rense Corten



SLIDE 1

Introduction Problem Approach Properties Measures Summary

Measuring Segregation in Social Networks

Michał Bojanowski, Rense Corten

ICS/Sociology, Utrecht University

July 2, 2010 Sunbelt XXX, Riva del Garda

SLIDE 2

Outline

1. Introduction: Homophily and segregation
2. Problem
3. Approach: Approach, Notation
4. Properties: Ties, Nodes, Network
5. Measures
6. Summary

SLIDE 3

Homophily and segregation

Homophily: Contact between similar people occurs at a higher rate than among dissimilar people (McPherson, Smith-Lovin, & Cook, 2001).

Segregation: Nonrandom allocation of people who belong to different groups into social positions and the associated social and physical distances between groups (Bruch & Mare, 2009).

SLIDE 5

Homophily: Friendship selection in school classes

Moody (2001)

SLIDE 6

Residential segregation in Seattle

[Figure panels: Blacks, Asians, Whites]

Source: Seattle Civil Rights and Labor History Project

SLIDE 7

Segregation in network terms

Neighborhood structure can be conceptualized as a network in which links correspond to neighborhood proximities.

[Figure: example network with nodes numbered 1 to 29]

SLIDE 8

Assumption: In static terms, homophily and segregation correspond to the same network phenomenon. We will stick with the segregation label.

SLIDE 9

Measurement problem

To be able to compare the levels of segregation of different networks (different school classes, different cities etc.) we need a measure.

SLIDE 10

Problems with measures

There is an abundance of measures in the literature, but they:

  • Stem from different research streams
  • Follow different logics
  • Hardly ever refer to each other
  • Lead to different conclusions given the same problems (data)

So, the problems are:

  • Which one to select in a given setting?
  • On what grounds should such a selection be made?

SLIDE 14

Possible approaches

Empirical: Assemble a large set of empirical datasets. Calculate the measures for all of them. Look at how they correlate, perhaps through PCA or the like.

Theo-pirical: Take a set of probabilistic models of networks (Erdős-Rényi random graph, preferential attachment, small-world, etc.). Generate a collection of networks. Proceed as in the item above.

Theoretical: Come up with a set of properties that the measures might (or might not) possess. Evaluate the differences between the measures in terms of satisfying (or not) certain properties.
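The "theo-pirical" step of generating a collection of networks can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the network size (20), the tie probability (0.2), the number of groups (2), and the seed are all assumptions chosen for the example, and only the Erdős-Rényi model is covered.

```python
import random

def erdos_renyi(n, p, rng):
    """Undirected G(n, p) graph as a symmetric 0/1 adjacency matrix."""
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                x[i][j] = x[j][i] = 1
    return x

def random_types(n, k, rng):
    """Assign each actor to one of k groups at random (groups coded 0..k-1)."""
    return [rng.randrange(k) for _ in range(n)]

# Fixed seed (arbitrary) so that the collection is reproducible.
rng = random.Random(2010)
collection = [(erdos_renyi(20, 0.2, rng), random_types(20, 2, rng))
              for _ in range(50)]
```

Each element of `collection` is a (network, type vector) pair to which any segregation index could then be applied before comparing the resulting values across measures.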

SLIDE 16

Actors

Actors: $\mathcal{N} = \{1, 2, \dots, i, \dots, N\}$

Groups of actors: Actors are assigned to K exhaustive and mutually exclusive groups, $G = \{G_1, \dots, G_k, \dots, G_K\}$.

Group membership is denoted with a "type vector" $t = [t_1, \dots, t_i, \dots, t_N]$, where $t_i \in \{1, \dots, K\}$ is the group of actor i.

Let $\mathcal{T}$ be the set of all possible type vectors for $\mathcal{N}$.

SLIDE 17

Network

Network: Actors form an undirected network, which is a square binary matrix $X = [x_{ij}]_{N \times N}$. Let $\mathcal{X}$ be the set of all possible networks over the actors in $\mathcal{N}$.

Mixing matrix: A three-dimensional array $M = [m_{ghy}]_{K \times K \times 2}$ defined as

$$m_{gh1} = \sum_{i \in G_g} \sum_{j \in G_h} x_{ij} \qquad m_{gh0} = \sum_{i \in G_g} \sum_{j \in G_h} (1 - x_{ij})$$
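The two layers of the mixing matrix can be computed directly from the definition. The sketch below is illustrative: the function name is hypothetical, groups are coded 0..K-1 rather than 1..K, and dyads with i = j are skipped, which is an assumption because the sums on the slide do not say how the diagonal is treated.

```python
def mixing_matrix(x, t, k):
    """Return (m1, m0): the K x K contact and non-contact layers.

    x is a 0/1 adjacency matrix, t a list of group codes in 0..k-1.
    Self-dyads (i == j) are skipped; this is an assumption.
    """
    m1 = [[0] * k for _ in range(k)]
    m0 = [[0] * k for _ in range(k)]
    n = len(x)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            g, h = t[i], t[j]
            m1[g][h] += x[i][j]       # dyads of type (g, h) with a tie
            m0[g][h] += 1 - x[i][j]   # dyads of type (g, h) without a tie
    return m1, m0

# Four actors in two groups, with ties 0-1, 1-2 and 2-3.
t = [0, 0, 1, 1]
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
m1, m0 = mixing_matrix(x, t, 2)
```

Because the sums run over ordered pairs, every undirected within-group tie is counted twice on the diagonal of the contact layer.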

SLIDE 18

Segregation index

Segregation measure: A generic segregation index $S(\cdot)$:

$$S : \mathcal{X} \times \mathcal{T} \to \mathbb{R}$$

It assigns a real number to a given network and type vector.

SLIDE 19

Adding between-group ties

Property (Monotonicity in between-group ties: MBG). Let there be two networks X and Y defined on the same set of nodes, a type vector t, and two nodes i and j such that $t_i \neq t_j$, $x_{ij} = 0$, and $y_{ij} = 1$. For all other pairs of nodes $(p, q) \neq (i, j)$, $x_{pq} = y_{pq}$, i.e. the networks X and Y are otherwise identical. A network segregation index S is monotonic in between-group ties iff

$$S(X, t) \geq S(Y, t)$$

In words: adding a between-group tie cannot increase segregation.

SLIDE 20

Adding within-group ties

Property (Monotonicity in within-group ties: MWG). Let there be two networks X and Y defined on the same set of nodes, a type vector t, and two nodes i and j such that $t_i = t_j$, $x_{ij} = 0$, and $y_{ij} = 1$. For all other pairs of nodes $(p, q) \neq (i, j)$, $x_{pq} = y_{pq}$, i.e. the networks X and Y are otherwise identical. A network segregation index S is monotonic in within-group ties iff

$$S(X, t) \leq S(Y, t)$$

In words: adding a within-group tie to the network cannot decrease segregation.

SLIDE 21

Rewiring between-group tie to within-group

Property (Monotonicity in rewiring: MR). Let there be two networks X and Y, a type vector t, and three nodes i, j and k such that

1. $x_{ij} = 1$ and $t_i \neq t_j$
2. $y_{ij} = 0$, $y_{ik} = 1$, and $t_i = t_k$

That is, a between-group tie ij in X is rewired to a within-group tie ik in Y. A network segregation index S is monotonic in rewiring iff

$$S(X, t) \leq S(Y, t)$$
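The three tie properties can be checked numerically on a small example. The sketch below uses a toy index, the share of ties that are within-group, which is a hypothetical stand-in and not one of the measures discussed in this talk; the helper names are likewise assumptions. Passing on one example does not prove that an index satisfies a property in general.

```python
import copy

def within_share(x, t):
    """Toy index (hypothetical): share of ties that connect same-group actors."""
    n = len(x)
    ties = [(i, j) for i in range(n) for j in range(i + 1, n) if x[i][j]]
    if not ties:
        return 0.0
    return sum(t[i] == t[j] for i, j in ties) / len(ties)

def with_tie(x, i, j, value):
    """Copy of the undirected network x with the tie i-j set to value (0 or 1)."""
    y = copy.deepcopy(x)
    y[i][j] = y[j][i] = value
    return y

# Four actors in two groups, with ties 0-1 (within) and 1-2 (between).
t = [0, 0, 1, 1]
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 0]]

s = within_share
mbg_ok = s(x, t) >= s(with_tie(x, 0, 3, 1), t)     # add between-group tie 0-3
mwg_ok = s(x, t) <= s(with_tie(x, 2, 3, 1), t)     # add within-group tie 2-3
rewired = with_tie(with_tie(x, 2, 1, 0), 2, 3, 1)  # tie 2-1 is rewired to 2-3
mr_ok = s(x, t) <= s(rewired, t)
```

Any function with the signature `S(X, t)` can be substituted for `within_share` to run the same checks.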

SLIDE 22

Adding isolates

Property (Effect of adding isolates: ISO). Define two networks $X = [x_{ij}]_{N \times N}$ and $Y = [y_{pq}]_{(N+1) \times (N+1)}$ with associated type vectors u and w, which are identical for the first N actors and differ by an (N + 1)-th node that is an isolate:

1. $\forall p, q \in 1..N: \; y_{pq} = x_{pq}$
2. $\sum_{p=1}^{N+1} y_{p,N+1} = \sum_{q=1}^{N+1} y_{N+1,q} = 0$
3. $\forall k \in 1..N: \; w_k = u_k$

$$S(X, u) \;\; ? \;\; S(Y, w)$$

In words: how does the segregation level change if isolates are added to the network?

SLIDE 23

Duplicating the network

Property (Symmetry: S). Define two identical networks X and Y and some type vector t. A network segregation index S satisfies symmetry iff

$$S(X, t) = S(Y, t) = S(Z, z)$$

where the network Z is constructed by considering X and Y together as a single network, namely $Z = [z_{pq}]_{2N \times 2N}$ such that

$$z_{pq} = \begin{cases} x_{pq} & \forall p, q \in \{1, \dots, N\} \\ y_{p-N,\,q-N} & \forall p, q \in \{N+1, \dots, 2N\} \\ 0 & \text{otherwise} \end{cases}$$

SLIDE 24

Measures

  • Freeman's segregation index (Freeman, 1978)
  • Spectral Segregation Index (Echenique & Fryer, 2007)
  • Assortativity coefficient (Newman, 2003)
  • Gupta-Anderson-May's Q (Gupta et al., 1989)
  • Coleman's Homophily Index (Coleman, 1958)
  • Segregation Matrix Index (Fershtman, 1997)
  • Exponential Random Graph Models (Snijders et al., 2006)
  • Conditional Log-linear models for mixing matrix (Koehly, Goodreau & Morris, 2004)

SLIDE 25

Measure | Level | Network type | Scale
Freeman | network | U | $[0; 1]$
SSI | node | U | $[0; \infty]$
Assortativity | network | D/U | $\left[-\frac{\sum_g p_{g+}p_{+g}}{1 - \sum_g p_{g+}p_{+g}}; 1\right]$
Gupta-Anderson-May | network | D/U | $\left[-\frac{1}{K-1}; 1\right]$
Coleman | group | D | $[-1; 1]$
Segregation Matrix Index | group | D/U | $[-1; 1]$
Uniform homophily (CLL) | network | D/U | $[-\infty; \infty]$
Differential homophily (CLL) | group | D/U | $[-\infty; \infty]$
Uniform homophily (ERGM) | network | D/U | $[-\infty; \infty]$
Differential homophily (ERGM) | group | D/U | $[-\infty; \infty]$

Network type: U = undirected, D = directed.

SLIDE 26

Freeman (1978)

Given two groups,

$$S_{\text{Freeman}} = 1 - \frac{p}{\pi}$$

where p is the observed proportion of between-group ties and π is the expected proportion given that ties are created randomly. It varies between 0 (random network) and 1 (full segregation of groups).
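A minimal sketch of the formula above for an undirected network. It assumes that "ties are created randomly" means every dyad is equally likely to carry a tie, so that π is the share of between-group dyads among all dyads; that reading, the function name, and the example data are assumptions. The raw value of the formula is returned, which drops below 0 when between-group ties are over-represented, and the function is undefined for networks with no ties or only one group.

```python
def freeman_index(x, t):
    """1 - p / pi for an undirected 0/1 adjacency matrix x and type vector t."""
    n = len(x)
    dyads = [(i, j) for i in range(n) for j in range(i + 1, n)]
    ties = [(i, j) for i, j in dyads if x[i][j]]
    # p: observed share of ties that run between groups
    p = sum(t[i] != t[j] for i, j in ties) / len(ties)
    # pi: share of dyads that are between-group (assumed random baseline)
    pi = sum(t[i] != t[j] for i, j in dyads) / len(dyads)
    return 1 - p / pi

# Four actors in two groups, with ties 0-1, 1-2 and 2-3.
t = [0, 0, 1, 1]
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
s_freeman = freeman_index(x, t)  # p = 1/3, pi = 4/6, so the index is 0.5
```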

SLIDE 27

Assortativity Coefficient, Newman (2003)

Based on the contact layer of the mixing matrix, $p_{gh} = m_{gh1} / m_{++1}$.

$$S_{\text{Newman}} = \frac{\sum_{g=1}^{K} p_{gg} - \sum_{g=1}^{K} p_{g+}p_{+g}}{1 - \sum_{g=1}^{K} p_{g+}p_{+g}}$$

Maximum of 1 for perfect segregation; 0 for a random network. Negative values for "disassortative" networks. The minimum depends on the density.
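The coefficient can be computed from the contact layer alone. In this sketch the function name and the example matrices are illustrative assumptions; the input is the K x K matrix of tie counts $m_{gh1}$. The coefficient is undefined when the denominator is zero, for example when all ties fall within a single group.

```python
def assortativity(m1):
    """Assortativity coefficient from the K x K contact layer m1 of tie counts."""
    k = len(m1)
    total = sum(sum(row) for row in m1)
    p = [[m1[g][h] / total for h in range(k)] for g in range(k)]
    observed = sum(p[g][g] for g in range(k))                     # sum of p_gg
    row = [sum(p[g]) for g in range(k)]                           # p_g+
    col = [sum(p[g][h] for g in range(k)) for h in range(k)]      # p_+g
    expected = sum(row[g] * col[g] for g in range(k))
    return (observed - expected) / (1 - expected)

r_mixed = assortativity([[2, 1], [1, 2]])       # some between-group contact
r_segregated = assortativity([[2, 0], [0, 2]])  # no between-group contact
```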

SLIDE 28

Gupta, Anderson & May 1989

Also based on the contact layer of the mixing matrix.

$$S_{\text{GAM}} = \frac{\sum_{g=1}^{K} \lambda_g - 1}{K - 1}$$

where $\lambda_g$ are the eigenvalues of $p_{gh}$. It varies between $-1/(K - 1)$ and 1.
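Since the eigenvalues of a square matrix sum to its trace, the index can be sketched without an eigenvalue solver. The sketch assumes that each row of the contact layer is rescaled to sum to 1 before the eigenvalues are taken; this normalization is an assumption, adopted here because it reproduces the range stated on the slide. The function name is hypothetical, and every group needs at least one tie.

```python
def gam_index(m1):
    """(sum of eigenvalues - 1) / (K - 1) for the row-normalized contact layer."""
    k = len(m1)
    # The sum of eigenvalues equals the trace, so only diagonal entries of
    # the row-normalized matrix are needed (row normalization is assumed).
    trace = sum(m1[g][g] / sum(m1[g]) for g in range(k))
    return (trace - 1) / (k - 1)

q_mixed = gam_index([[2, 1], [1, 2]])
q_segregated = gam_index([[2, 0], [0, 2]])
q_disassortative = gam_index([[0, 3], [3, 0]])
```

With K = 2, the three example matrices give an intermediate value, the maximum of 1, and the minimum of -1/(K - 1) = -1.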

SLIDE 29

Coleman, 1958

Expected number of ties within group g:

$$m^*_{gg} = \sum_{i \in G_g} \eta_i \, \frac{n_g - 1}{N - 1}$$

$$S^g_{\text{Coleman}} = \frac{m_{gg} - m^*_{gg}}{\sum_{i \in G_g} \eta_i - m^*_{gg}} \quad \text{where } m_{gg} \geq m^*_{gg} \qquad (1)$$

$$S^g_{\text{Coleman}} = \frac{m_{gg} - m^*_{gg}}{m^*_{gg}} \quad \text{where } m_{gg} < m^*_{gg} \qquad (2)$$
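Equations (1) and (2) can be sketched for a directed network as follows. The slide does not define $\eta_i$; the sketch assumes it is the out-degree of actor i, and takes $m_{gg}$ as the number of ties sent from members of group g to other members of group g. Both readings, the function name, and the example data are assumptions. The index is undefined for a group whose members send no ties.

```python
def coleman_index(x, t, g):
    """Coleman's index for group g in a directed 0/1 adjacency matrix x."""
    n = len(x)
    members = [i for i in range(n) if t[i] == g]
    n_g = len(members)
    # eta_i is assumed to be the out-degree of actor i.
    total_out = sum(x[i][j] for i in members for j in range(n) if j != i)
    m_gg = sum(x[i][j] for i in members for j in members if i != j)
    expected = total_out * (n_g - 1) / (n - 1)            # m*_gg
    if m_gg >= expected:
        return (m_gg - expected) / (total_out - expected)  # equation (1)
    return (m_gg - expected) / expected                    # equation (2)

t = [0, 0, 1, 1]
# Directed ties 0->1, 1->0 and 1->2.
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
s_coleman = coleman_index(x, t, 0)  # m_gg = 2, m*_gg = 1, so (2 - 1) / (3 - 1)
```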

SLIDE 30

Segregation matrix index, Fershtman 1997

$$S_{\text{SMI}} = \frac{d_{11} - d_{12}}{d_{11} + d_{12}} \qquad (3)$$

where $d_{11}$ is the density of within-group ties and $d_{12}$ is the density of between-group ties.
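Equation (3) can be sketched for one group of an undirected network. The slide does not spell out the densities; here $d_{11}$ is taken as within-group ties over possible within-group dyads, and $d_{12}$ as ties between the group and all other actors over possible such dyads. These definitions, the function name, and the example data are assumptions. The group needs at least two members and at least one tie.

```python
def smi(x, t, g):
    """Segregation matrix index for group g in an undirected 0/1 matrix x."""
    n = len(x)
    inside = [i for i in range(n) if t[i] == g]
    outside = [i for i in range(n) if t[i] != g]
    within_ties = sum(x[i][j] for i in inside for j in inside if i < j)
    between_ties = sum(x[i][j] for i in inside for j in outside)
    d11 = within_ties / (len(inside) * (len(inside) - 1) / 2)  # within density
    d12 = between_ties / (len(inside) * len(outside))          # between density
    return (d11 - d12) / (d11 + d12)

# Four actors in two groups, with ties 0-1, 1-2 and 2-3.
t = [0, 0, 1, 1]
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
s_smi = smi(x, t, 0)  # d11 = 1, d12 = 1/4, so (1 - 0.25) / (1 + 0.25)
```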

SLIDE 31

Conditional Log-Linear Models (Koehly et al., 2004)

Uniform homophily:

$$\log m_{gh1} = \mu + \lambda^A_g + \lambda^B_h + \lambda^{UHOM}_{gh}, \qquad \lambda^{UHOM}_{gh} = \begin{cases} \lambda^{UHOM} & g = h \\ 0 & g \neq h \end{cases}$$

Differential homophily:

$$\log m_{gh1} = \mu + \lambda^A_g + \lambda^B_h + \lambda^{DHOM}_{gh}, \qquad \lambda^{DHOM}_{gh} = \begin{cases} \lambda^{DHOM}_g & g = h \\ 0 & g \neq h \end{cases}$$

Parameters $\lambda^{UHOM}$ and $\lambda^{DHOM}_g$ serve as measures of homophily/segregation.

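SLIDE 32

ERGM

Exponential Random Graph Models

Uniform homophily:

$$\log\left(\frac{m_{gh1}}{m_{gh0}}\right) = \alpha + \beta^A_g + \beta^B_h + \beta^{UHOM}_{gh}, \qquad \beta^{UHOM}_{gh} = \begin{cases} \beta^{UHOM} & g = h \\ 0 & g \neq h \end{cases}$$

Differential homophily:

$$\log\left(\frac{m_{gh1}}{m_{gh0}}\right) = \alpha + \beta^A_g + \beta^B_h + \beta^{DHOM}_{gh}, \qquad \beta^{DHOM}_{gh} = \begin{cases} \beta^{DHOM}_g & g = h \\ 0 & g \neq h \end{cases}$$

Parameters $\beta^{UHOM}$ and $\beta^{DHOM}_g$ serve as measures of homophily/segregation.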

SLIDE 33

Spectral Segregation Index, Echenique & Fryer (2007)

Segregation level of individual i in group g in component B:

$$s^g_i(B) = \frac{1}{S^g_{C_i}} \sum_j r_{ij} \, s^g_j(B) \qquad (4)$$

where $r_{ij}$ are entries in a row-normalized adjacency matrix.

Segregation of individual i:

$$S^i_{\text{SSI}} = \frac{l_i}{\bar{l}} \, \lambda \qquad (5)$$

where λ is the largest eigenvalue of B, l is the corresponding eigenvector, and $\bar{l}$ is the mean of its entries.

SLIDE 34

SSI (2)

[Figure: Node segregation in White's kinship data. Node labels: Mother, Sister, Brother's Wife, Sister's Daughter, Brother's Daughter, Father, Brother, Sister's Husband, Brother's Son, Sister's Son. Legend: Men, Women.]

SLIDE 35

Summary

Measure: MBG (↘), MWG (↗), MR (↗), ISO, S (→)

Freeman: • ↗ • ↘
SSI: ↘ ↗ ↗ ↘ →
Assortativity: ↘ ↗ ↗ → →
Gupta-Anderson-May: ↘ ↗ ↗ → →
Coleman: ↘ ↗ ↗ • ↘
Segregation Matrix Index: ↘ ↗ ↗
Uniform homophily (CLL): ↘ ↗ ↗ → →
Differential homophily (CLL): ↘ ↗ ↗ → →
Uniform homophily (ERGM): ↘ ↗ ↗
Differential homophily (ERGM): ↘ ↗ ↗

SLIDE 36

Summary

  • Measures operate on different levels: individuals, groups, global network.
  • Different zero points: random graph, proportionate mixing, full integration.
  • MBG and MWG are not very informative: all measures satisfy them.
  • Symmetry: all but two measures satisfy it; Coleman and Freeman decrease.

SLIDE 37

Summary: adding isolates

  • Measures based on the contact layer of the mixing matrix are insensitive to isolates.
  • SSI is the only one that always decreases.
  • The effect on the others depends on relative group sizes.

SLIDE 38

Summary

  • Measures based on the contact layer of the mixing matrix summarize the probability of a node attribute combination given that the tie exists (CLL, assortativity, GAM): explaining attributes given the network.
  • Measures that also take disconnected dyads into account (ERGM, Freeman, SSI): explaining tie formation given the attributes.

SLIDE 39

Further questions

  • Stricter formal analysis (axiomatizations). SSI is the only measure derived axiomatically.
  • Link to behavioral models: how the segregation comes about. For example:
      • Network formation game further justifying Bonacich centrality (Ballester et al., 2006)
      • Coleman's index in Currarini et al. (2010)

SLIDE 40

Thanks

Thanks! http://www.bojanorama.pl