Human Social Dynamics vs. The Data We Can Get
Aaron Clauset

Human Social Dynamics vs. The Data We Can Get Aaron Clauset Assistant Professor, Computer Science and BioFrontiers Institute, University of Colorado Boulder External Faculty, Santa Fe Institute 4 June 2013 NetSci 2013 Social Dynamics


SLIDE 1

Human Social Dynamics vs. The Data We Can Get

Aaron Clauset
Assistant Professor, Computer Science and BioFrontiers Institute, University of Colorado Boulder
External Faculty, Santa Fe Institute

4 June 2013, NetSci 2013 Social Dynamics Workshop

SLIDE 2
  • Twitter, Facebook, Google+, Pinterest, etc.
  • Academic coauthorships & citations
  • World Wide Web
  • etc.

data sources for social network dynamics

SLIDE 3

but...

  • links often have low or no cost = unrealistic
  • system structure drives social dynamics
  • few sources capture “real” social networks (face-to-face time)

  • Twitter, Facebook, Google+, Pinterest, etc.
  • Academic coauthorships & citations
  • World Wide Web
  • etc.

data sources for social network dynamics

SLIDE 4

the data we want

[detailed, individual traces, over time, about specific social processes]

are not the data we can get

[detailed, massive electronic traces that may be only vaguely relevant to any real human social processes]

the more general problem

SLIDE 5

an illustration

SLIDE 6

Halo: Reach (Bungie, 2010)

  • played online via Xbox Live platform
  • team combat simulation (FPS)
  • 20TB of game data, spanning 18 months of time
  • 17+ million players
  • 1 billion competitions
  • 70% are team competitions
  • complex spatial environments
  • complex social interactions

an illustration

SLIDE 7
  • join “party” (of 0-3 friends)
  • choose game type and subtype (“competitive / team 4v4”)
  • Xbox Live places parties into matches (matchmaking)
  • play! (for roughly 10 minutes)
  • repeat (1 billion times)

how it works

SLIDE 8
  • observed interactions = F(game matchmaking, friendships)
  • mean interaction degree 330
  • how to distinguish latent friendships from observed interactions?
  • plus, no demographic information

the problem

SLIDE 9
  • anonymous web survey

a small solution

Mason and Clauset, CSCW 2013

SLIDE 10
  • anonymous web survey
  • 847 participants
  • demographic questions: age, sex, location, education
  • psychometric questions: attitudes, play style, etc.
  • friendship survey
  • 14,405 labeled friends
  • 7,159,989 labeled non-friends

Mason and Clauset, CSCW 2013

a small solution

SLIDE 11

recovering friendships from interactions

we can observe a sequence of pairwise interactions

$\sigma_{ij} = (i, j, t_1), (i, j, t_2), \ldots$

can we robustly distinguish friendships from non-friendships?

SLIDE 12

recovering friendships from interactions

we can observe a sequence of pairwise interactions

$\sigma_{ij} = (i, j, t_1), (i, j, t_2), \ldots$

can we robustly distinguish friendships from non-friendships?

problems:

  • volume of data varies widely by individual = heavy-tailed distribution in $|\sigma_{ij}|$
  • friendships are sparse in large networks
  • survey data provide “subjective” truth only
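A minimal sketch of how the per-pair sequences σij and their sizes |σij| could be built from a raw event log. The function name `pairwise_sequences`, the tuple layout `(i, j, t)`, and the choice to treat pairs as unordered are assumptions made for illustration; the slides do not specify the data format.

```python
from collections import defaultdict


def pairwise_sequences(events):
    """Group (i, j, t) events into time-sorted lists keyed by unordered pair.

    Assumption: an interaction between i and j is symmetric, so (i, j, t)
    and (j, i, t) land in the same sequence.
    """
    sigma = defaultdict(list)
    for i, j, t in events:
        if i == j:
            continue  # ignore self-interactions
        key = (i, j) if i <= j else (j, i)
        sigma[key].append(t)
    for times in sigma.values():
        times.sort()
    return dict(sigma)


# Toy event log (hypothetical player ids and timestamps).
events = [("a", "b", 3), ("b", "a", 1), ("a", "c", 2), ("a", "b", 7)]
sigma = pairwise_sequences(events)

# |sigma_ij|: number of observed interactions per pair.
sizes = {pair: len(times) for pair, times in sigma.items()}
```

Tabulating `sizes` across all pairs is one way to inspect the heavy-tailed distribution mentioned above.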

SLIDE 13

recovering latent friendship ties

what is a friendship?

social interactions:

  • friendship = periodic + prosocial interactions
  • diurnal cycle modulates all interactions

supervised learning

  • define 9 statistical features
  • which do well?

Merritt, Jacobs, Mason and Clauset, ICWSM 2013
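The nine features are not listed on the slide. As a purely illustrative sketch of what a "periodic" feature for a pair might look like, the function below maps each interaction time onto a weekly cycle and measures how concentrated the phases are (the mean resultant length from circular statistics). The name `weekly_periodicity`, the weekly period, and the use of Unix-style timestamps in seconds are all assumptions, not the authors' definitions.

```python
import math

WEEK_SECONDS = 7 * 24 * 3600  # assumed period; a daily cycle would use 24 * 3600


def weekly_periodicity(timestamps, period=WEEK_SECONDS):
    """Return a score in [0, 1] for how strongly interactions recur at the
    same phase of the cycle: 1.0 means every interaction falls at the same
    point in the week, values near 0 mean they are spread evenly."""
    if not timestamps:
        return 0.0
    angles = [2 * math.pi * (t % period) / period for t in timestamps]
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(mean_cos, mean_sin)


# Same time every week versus four evenly spaced times within one week.
same_slot = weekly_periodicity([0, WEEK_SECONDS, 2 * WEEK_SECONDS])
spread_out = weekly_periodicity([k * WEEK_SECONDS // 4 for k in range(4)])
```

A pair with a single interaction scores 1.0 trivially, so a feature like this would need to be combined with the interaction count or restricted to pairs with enough data.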

SLIDE 14
accuracy vs. amount of interaction data

  • AUC vs. how much data on a person we have
  • periodic + prosocial interactions highly robust and efficient
  • total interaction count also good, eventually

[figure: 9 features of pairwise interactions, grouped as “periodic or prosocial interactions” and “volume of interactions”]

Merritt, Jacobs, Mason and Clauset, ICWSM 2013
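For readers unfamiliar with AUC, the sketch below computes it directly from its definition: the probability that a randomly chosen friend pair receives a higher feature score than a randomly chosen non-friend pair, with ties counted as one half. The function name `auc` and the toy scores are illustrative assumptions; this is not the evaluation code behind the slide.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: one feature value per pair; labels: 1 = friend, 0 = non-friend.
    Runs in O(P * N) time, which is fine for a small example; large
    datasets would call for a rank-based formulation instead.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])  # friends always score higher
chance = auc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])   # no better than a coin flip
```

Because AUC depends only on the ranking of scores, it is unaffected by how rare friendships are among all pairs, which matters when labeled non-friends vastly outnumber labeled friends.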

SLIDE 15
  • friendships easy to recover from interactions
  • mean degree (interactions) ≈ 330
  • mean degree (friendships) ≈ 4
  • friendship graph very different from interaction graph
  • results likely to generalize [see Jones et al. PLOS ONE (2013)]
  • clarifies “friendship” = periodic + prosocial interactions
  • privacy concerns

recovering friendships from interactions

Merritt, Jacobs, Mason and Clauset, ICWSM 2013
Jones et al., PLOS ONE 8(1):e52168 (2013)
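The two mean-degree figures compare the same players under two different edge sets. A minimal sketch of the calculation, assuming an undirected simple graph given as an edge list (the function name `mean_degree` and the toy edges are hypothetical):

```python
def mean_degree(edges):
    """Mean degree 2E / N of an undirected simple graph.

    Duplicate edges and self-loops are ignored. Assumption: N counts only
    nodes that appear in at least one edge, so isolated players are
    excluded from the average.
    """
    nodes = set()
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue
        unique_edges.add((u, v) if u <= v else (v, u))
        nodes.update((u, v))
    if not nodes:
        return 0.0
    return 2 * len(unique_edges) / len(nodes)


# Toy example: everyone has played with everyone, but only one pair are friends.
interaction_edges = [(1, 2), (2, 3), (3, 1), (1, 2)]
friendship_edges = [(1, 2)]
k_interaction = mean_degree(interaction_edges)
k_friendship = mean_degree(friendship_edges)
```

Whether isolated nodes are counted changes the friendship average noticeably when many players have no labeled friends, so that convention should be stated alongside any reported value.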

SLIDE 16

general outlook

  • electronic data
      • new window into human social dynamics!
      • big and detailed!
      • but, dumb. (not the data we want)
  • computational social science
      • know limits of dumb data
      • keep eye on underlying social processes
      • be willing to commit a “social science”
  • a general prescription
      1. obtain electronic data (big and dumb)
      2. seek out user labeled data (small and smart)
      3. model latent variables from observed data (supervised)
      4. extract underlying social dynamics
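Steps 2 and 3 of the prescription can be sketched in miniature: learn a decision rule on the small labeled set, then apply it to every pair in the large unlabeled set. The example below uses the simplest possible supervised model, a cutoff on a single feature, chosen to maximize accuracy on the labeled pairs. The names `fit_threshold` and `label_all`, the single-feature setup, and the toy values are assumptions for illustration, not the classifier used in the cited work.

```python
def fit_threshold(scores, labels):
    """Pick the cutoff on one feature that maximizes accuracy on labeled
    pairs (predict friend when score >= cutoff)."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(scores)):
        preds = [1 if s >= cut else 0 for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc


def label_all(feature_by_pair, cut):
    """Apply the learned cutoff to every pair in the big, unlabeled data."""
    return {pair: int(score >= cut) for pair, score in feature_by_pair.items()}


# Small and smart: survey-labeled pairs (hypothetical feature values).
cut, acc = fit_threshold([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])

# Big and dumb: feature values for pairs with no survey label.
inferred = label_all({("a", "b"): 0.85, ("a", "c"): 0.3}, cut)
```

Accuracy is a weak criterion when friendships are sparse, since predicting "non-friend" everywhere already scores well; a ranking measure such as AUC, or class-weighted training, is the more defensible choice in practice.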

SLIDE 17

thanks

Abigail Z Jacobs (Colorado)
Sears Merritt (Colorado)
Winter Mason (Stevens Inst. Tech.)

funded in part by

references

  • Mason and Clauset, “Friends FTW! Friendship, Collaboration and Competition in Halo: Reach.” CSCW (2013).
  • Merritt, Jacobs, Mason and Clauset, “Detecting friendship within dynamic online interaction networks.” ICWSM (2013).
  • Merritt and Clauset, “Environmental structure and competitive scoring advantages in team competitions.” arXiv:1304.1039 (2013).

SLIDE 18

fin
