Analysis of Social Networks in IETF Mailing Lists



SLIDE 1

Chair of Network Architectures and Services, Department of Informatics, Technical University of Munich

Analysis of Social Networks in IETF Mailing Lists

Boonyakorn Jantaranuson

Final Talk of Master’s Thesis

June 25, 2018

SLIDE 2

Outline

  • Introduction
  • Approach
  • Analysis
  • Conclusion
  • Bibliography

B. Jantaranuson – SNA in IETF Mailing Lists 2

SLIDE 3

Motivation

  • The IETF uses mailing lists for communication within its 100+ working groups (WGs)
  • Social interactions arise from communication across a large number of emails
  • Different WGs have different characteristics
  • Social network analysis (SNA) on IETF mailing lists is still new

SLIDE 4

Dataset

  • Archived emails are available online
  • A PostgreSQL database of the emails is available from the Bachelor’s thesis and IDP of N. Schwellnus [9]
  • The database relation for discussion threads comes from the Bachelor’s thesis of S. Klimek [7]

SLIDE 5

Relationships between People

  • Replying to emails
  • Appearing in the same discussion threads

SLIDE 6

Example Network (bmwg)

SLIDE 7

Analysis Workflow

SLIDE 8

Example 1: Centrality

  • Centrality is widely used in SNA studies to identify important individuals
  • Several papers [10, 5, 8] use betweenness centrality as a measure to identify important people at Enron
  • Importance is evaluated against people’s roles: CEO, managers, heads of departments

SLIDE 9

Centrality: IETF

  • Some people have specific roles in the WGs: chairs, area directors, secretaries, reviewers, tech advisors
  • Question: Can we use centrality as a measure to identify important people in the IETF context?

SLIDE 10

Centrality: Roles Analysis

SLIDE 11

Centrality: What makes them important?

  • Send more emails or start more discussion threads

SLIDE 12

Centrality: Changes in Centrality

  • Use the dynamic social network for the analysis
  • Data covers 2007 to 2017
  • A moving window of 1 year with a step size of 4 months is used to generate a network at each point in time
  • How stable is the importance of a person?
  • Example: v6ops (large WG) and rtgwg (small WG)
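
The moving-window scheme above can be sketched as follows. This is only an illustration, assuming windows are aligned to the first day of a month and treated as half-open intervals; the helper names `add_months` and `moving_windows` and the exact start and end dates are hypothetical and not taken from the thesis.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date forward by a number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def moving_windows(start: date, end: date, size_months: int = 12, step_months: int = 4):
    """Yield half-open (window_start, window_end) pairs that fit inside [start, end)."""
    window_start = start
    while add_months(window_start, size_months) <= end:
        yield window_start, add_months(window_start, size_months)
        window_start = add_months(window_start, step_months)


# Assumed boundaries: the slides only state that the data is from 2007 to 2017.
windows = list(moving_windows(date(2007, 1, 1), date(2018, 1, 1)))
print(len(windows), windows[0], windows[-1])
```

Under these assumptions each window overlaps its successor by 8 months, and one network would be generated from the emails that fall inside each window.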

SLIDE 13

Centrality: Changes in Centrality (v6ops)

SLIDE 14

Centrality: Changes in Centrality (rtgwg)

SLIDE 15

Example 2: Comparing IETF to Real-World Networks

  • Bird et al. [1, 2] applied SNA to open-source software mailing lists, and Chen et al. [3] to a conference mailing list
  • They found that the degree distributions follow a power law, as in other real-world networks
  • Are networks generated from IETF mailing lists the same as real-world networks?
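
One rough way to probe the question above is to look at the degree distribution on a log-log scale. The sketch below is an assumption-laden illustration, not the method used in the thesis: it fits a least-squares line to log(count) against log(degree), which is only a crude indicator of power-law behaviour, and the edge list is made-up toy data.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the number of nodes that have degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(degree)."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(c) for c in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy network: one hub ("a") and a few low-degree nodes.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "c")]
distribution = degree_distribution(toy_edges)
slope = loglog_slope(distribution)
print(dict(distribution), slope)
```

A clearly negative, roughly constant slope over a wide range of degrees is consistent with a power law, but a proper test on real data would need far more nodes and a more careful fitting procedure.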

SLIDE 16

Comparing IETF to Real-World Networks

SLIDE 17

Example 3: Potential Events Analysis

  • Diesner et al. [4] and Juszczyszyn et al. [6] generated dynamic social networks and tracked changes in, e.g., the number of active users, network density, and average centrality
  • In the Enron case [4], they spotted a significant rise in the period before the company’s crisis (bankruptcy)
  • In the university email case [6], they found that a significant drop was related to the semester break
  • Several events have occurred in the IETF community
  • Can we detect potential events from changes in the social networks?

SLIDE 18

Example: httpbis

  • We analyze changes in the number of active users and the number of newcomers

  • In January 2012, the initiative for HTTP/2 was announced
  • In July 2014, the WG obsoleted the RFC of HTTP/1.1
  • In May 2015, the WG released the RFC of HTTP/2
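
The two indicators used for httpbis can be computed per time window roughly as sketched below. The definitions are assumptions made for illustration: an active user is anyone who sent at least one email inside the window, and a newcomer is an active user whose first email in the data set falls inside that window. The function name and the sample emails are hypothetical.

```python
from datetime import date


def active_and_newcomers(emails, windows):
    """Count distinct senders and first-time senders per half-open window.

    emails:  list of (sender, sent_date) tuples
    windows: list of (start, end) date pairs
    """
    # Earliest date on which each sender appears in the data.
    first_seen = {}
    for sender, sent in emails:
        if sender not in first_seen or sent < first_seen[sender]:
            first_seen[sender] = sent

    rows = []
    for start, end in windows:
        active = {s for s, sent in emails if start <= sent < end}
        newcomers = {s for s in active if start <= first_seen[s] < end}
        rows.append((start, len(active), len(newcomers)))
    return rows


# Made-up sample data, not taken from the httpbis archive.
sample_emails = [
    ("alice", date(2012, 1, 10)),
    ("bob", date(2012, 2, 1)),
    ("alice", date(2012, 6, 1)),
    ("carol", date(2012, 6, 15)),
]
sample_windows = [
    (date(2012, 1, 1), date(2012, 5, 1)),
    (date(2012, 5, 1), date(2012, 9, 1)),
]
rows = active_and_newcomers(sample_emails, sample_windows)
print(rows)
```

Plotting these counts over time and marking known dates such as the ones listed above is one way to check whether spikes line up with events.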

SLIDE 19

Example: httpbis (# of Active Users)

SLIDE 20

Example: httpbis (# of Newcomers)

SLIDE 21

Limitations

  • Very large number of working groups (101 WGs)
  • Lack of background knowledge in each studied WG
  • Hard to evaluate the results of the possible analyses
  • The analyses are quite broad and exploratory

SLIDE 22

Conclusion

  • Analysis of IETF mailing lists using SNA, statistics, and visualization techniques
  • The analyses are done in Jupyter Notebook, so they are easy to follow and reproducible
  • We can use centrality to identify important people in the social network
  • We found that sending more emails and starting more threads make people in the WG more important
  • Centrality in larger WGs is more stable
  • We can detect important events, e.g. the proposal of new standards, from changes in the social networks

SLIDE 23

Thank you

SLIDE 24

[1] C. Bird, A. Gourley, P. Devanbu, M. Gertz, and A. Swaminathan. Mining email social networks. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 137–143, New York, NY, USA, 2006. ACM.

[2] C. Bird, A. Gourley, P. Devanbu, M. Gertz, and A. Swaminathan. Mining email social networks. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR ’06, pages 137–143, New York, NY, USA, 2006. ACM.

[3] H. Chen, H. Shen, J. Xiong, S. Tan, and X. Cheng. Social network structure behind the mailing lists: ICT-IIIS at TREC 2006 expert finding track. The Fifteenth Text REtrieval Conference (TREC 2006) Proceedings, 2006.

SLIDE 25

[4] J. Diesner, T. L. Frantz, and K. M. Carley. Communication networks from the Enron email corpus “It’s always about the people. Enron is no different”. Computational & Mathematical Organization Theory, 11(3):201–228, Oct 2005.

[5] J. Hardin, G. Sarkis, and P. C. Urc. Network Analysis with the Enron Email Corpus. ArXiv e-prints, Oct. 2014.

[6] K. Juszczyszyn, K. Musiał, P. Kazienko, and B. Gabrys. Temporal changes in local topology of an email-based social network. Computing and Informatics, 28(6), 2012.

[7] S. Klimek. mail_threads.sql · GitHub. https://gist.github.com/simkli/8dffa3ee4ce456d9197404f20d40d4a8. Accessed: 2018-02-24.

SLIDE 26

[8] R. Rowe, G. Creamer, S. Hershkop, and S. J. Stolfo. Automated social hierarchy detection through email network analysis. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, WebKDD/SNA-KDD ’07, pages 109–117, New York, NY, USA, 2007. ACM.

[9] F. N. Schwellnus. A heat map for IETF standardization activities. Bachelor’s thesis, Technical University Munich, 2016.

[10] G. Wilson and W. Banzhaf. Discovery of email communication networks from the Enron corpus with a genetic algorithm using social network analysis. In 2009 IEEE Congress on Evolutionary Computation, pages 3256–3263, May 2009.

SLIDE 27

Backup: Types of Networks

  • Static social networks: analyze the network over a specific period
  • Dynamic social networks: include temporal features and study changes in the network

SLIDE 28

Backup: Correlation

SLIDE 29

Backup: Box Plot

SLIDE 30

Backup: Relationships of replying emails

  • Relationships based on merely sending emails are not meaningful in the mailing-list context [1]
  • A directed edge (v1, v2) is generated if v1 replies to an email from v2
  • We call this relationship "replies_to"
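
The edge rule above can be sketched in a few lines. This is an illustrative reading, not the thesis code: it assumes each email record carries its sender and the ID of the message it replies to, it skips self-replies, and it adds an in-degree centrality normalised by n - 1 (a common convention). The function names and sample messages are hypothetical.

```python
from collections import defaultdict


def build_replies_to(emails):
    """Build weighted directed edges {(v1, v2): count} where v1 replied to v2.

    emails: dict mapping message_id -> (sender, in_reply_to or None)
    """
    edges = defaultdict(int)
    for sender, parent_id in emails.values():
        parent = emails.get(parent_id)
        if parent is None:
            continue  # thread starter, or parent missing from the archive
        original_sender = parent[0]
        if original_sender != sender:  # assumption: self-replies are ignored
            edges[(sender, original_sender)] += 1
    return dict(edges)


def in_degree_centrality(edges):
    """Fraction of other nodes that have replied to each node."""
    nodes = {n for edge in edges for n in edge}
    if len(nodes) < 2:
        return {n: 0.0 for n in nodes}
    incoming = defaultdict(set)
    for v1, v2 in edges:
        incoming[v2].add(v1)
    return {n: len(incoming[n]) / (len(nodes) - 1) for n in nodes}


# Made-up messages: bob and carol reply to alice, alice replies to bob.
sample = {
    "m1": ("alice", None),
    "m2": ("bob", "m1"),
    "m3": ("carol", "m1"),
    "m4": ("alice", "m2"),
}
replies_to = build_replies_to(sample)
centrality = in_degree_centrality(replies_to)
print(replies_to, centrality)
```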

SLIDE 31

Backup: Relationships of being in same discussion threads

  • Complement to the "replies_to" relationship
  • An undirected edge (v1, v2) is generated if both v1 and v2 appear in the same discussion thread
  • v1 does not need to reply to v2 for an edge to be created
  • We call this relationship "in_same_threads"
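
A minimal sketch of this rule, assuming the participants of each thread are already known (for example from the thread relation in the database). The function name and the sample threads are hypothetical; each undirected edge is stored as a sorted pair so that (v1, v2) and (v2, v1) count as the same edge.

```python
from itertools import combinations


def build_in_same_threads(thread_participants):
    """Return undirected edges between all people who share a thread.

    thread_participants: dict mapping thread_id -> iterable of senders
    """
    edges = set()
    for participants in thread_participants.values():
        # sorted(set(...)) removes repeat posters and fixes the pair order.
        for v1, v2 in combinations(sorted(set(participants)), 2):
            edges.add((v1, v2))
    return edges


# Made-up threads: alice posts twice in t1, bob appears in both threads.
sample_threads = {
    "t1": ["alice", "bob", "carol", "alice"],
    "t2": ["bob", "dave"],
}
in_same_threads = build_in_same_threads(sample_threads)
print(sorted(in_same_threads))
```

Note that carol and alice are connected here even if neither ever replied directly to the other, which is exactly what distinguishes this relationship from "replies_to".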

SLIDE 32

Backup: Creating threads vs Centrality

Type of centrality        Avg. corr. coeff.   S.D. corr. coeff.
Eigenvector centrality    0.7617              0.1457
In-degree centrality      0.8132              0.1396
Betweenness centrality    0.8294              0.1604
Closeness centrality      0.4517              0.1109
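
The slide does not say which correlation coefficient was used. As a sketch under the assumption that it is Pearson's r, the coefficient for a single WG could be computed from the number of threads each person started and that person's centrality score, and the per-WG values would then be averaged. The numbers below are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for four people in one WG.
threads_started = [5, 1, 3, 0]
in_degree = [0.9, 0.2, 0.5, 0.1]
r = pearson(threads_started, in_degree)
print(round(r, 4))
```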

SLIDE 33

Backup: Data Transformation

SLIDE 34

Response Time Analysis

  • How fast does a person respond to an email, given the specific person who sent it?
  • Dyad (pairwise) analysis
  • Assumption: the closer two people are, the faster they might respond to each other
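
The pairwise response time can be sketched as the mean delay between an email and the direct reply to it, grouped by (responder, original sender). The record layout, the function name and the timestamps are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_response_hours(emails):
    """Return {(responder, original_sender): mean reply delay in hours}.

    emails: dict mapping message_id -> (sender, sent_at, in_reply_to or None)
    """
    delays = defaultdict(list)
    for sender, sent_at, parent_id in emails.values():
        parent = emails.get(parent_id)
        if parent is None:
            continue  # not a reply, or the parent is not in the archive
        parent_sender, parent_sent_at, _ = parent
        hours = (sent_at - parent_sent_at).total_seconds() / 3600
        # Skip self-replies and negative delays caused by clock errors.
        if parent_sender != sender and hours >= 0:
            delays[(sender, parent_sender)].append(hours)
    return {pair: mean(values) for pair, values in delays.items()}


# Made-up conversation between two people on one day.
conversation = {
    "m1": ("alice", datetime(2017, 1, 1, 9, 0), None),
    "m2": ("bob", datetime(2017, 1, 1, 11, 0), "m1"),
    "m3": ("alice", datetime(2017, 1, 1, 12, 0), "m2"),
    "m4": ("bob", datetime(2017, 1, 1, 16, 0), "m3"),
}
response_hours = mean_response_hours(conversation)
print(response_hours)
```

Keeping the direction of each pair makes it possible to see asymmetries, for example one person answering quickly while the other does not.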

SLIDE 35

Response Time Analysis

  • Average correlation of 0.3103 with a standard deviation of 0.0834

SLIDE 36

Events: tls (# Active Users)

SLIDE 37

Events: tls (# Newcomers)

SLIDE 38

Events: tls

  • In August 2008, TLS v1.2 was proposed as a new standard
  • In August 2013, work on TLS v1.3 started

SLIDE 39

Cross-posts
