Chair of Network Architectures and Services Department of Informatics Technical University of Munich
Analysis of Social Networks in IETF Mailing Lists
Boonyakorn Jantaranuson
Final Talk of Master's Thesis, June 25, 2018
Outline
- Introduction
- Approach
- Analysis
- Conclusion
- Bibliography
- B. Jantaranuson
– SNA in IETF Mailing Lists 2
Motivation
- The IETF uses mailing lists to communicate within its 100+ working groups (WGs)
- Social interactions emerge from communication across a large number of emails
- Different WGs have different characteristics
- Social network analysis (SNA) of IETF mailing lists is still largely unexplored
Dataset
- Archived emails are available online
- A PostgreSQL database is available from the bachelor's thesis and IDP of N. Schwellnus [9]
- The database relation for discussion threads comes from the bachelor's thesis of S. Klimek [7]
Relationships between People
- Replying to emails
- Appearing in the same discussion threads
Example Network (bmwg)
Analysis Workflow
Example 1: Centrality
- Centrality is widely used in SNA studies to identify important individuals
- Several papers [5, 8, 10] use betweenness centrality to identify important people at Enron
- These studies evaluate importance against people's roles: CEO, manager, head of department
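As an illustration of the betweenness measure mentioned above, here is a minimal sketch of Brandes' algorithm for an undirected, unweighted graph in plain Python. The adjacency-dictionary input format and the toy names are assumptions made for this example; the talk does not say which implementation was used.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalised betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (assumed input format).
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        paths = {v: 0 for v in adj}     # number of shortest paths from source
        dist = {v: -1 for v in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                    # breadth-first search from source
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:         # w is discovered for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Toy example: "bob" sits between two otherwise unconnected people.
toy = {"alice": {"bob"}, "bob": {"alice", "carol"}, "carol": {"bob"}}
scores = betweenness_centrality(toy)
```

In this toy graph only "bob" lies on a shortest path between two other people, so he is the only node with a non-zero score.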
Centrality: IETF
- Some people have specific roles in the WGs: chairs, area directors, secretaries, reviewers, tech advisors
- Question: Can we use centrality as a measure to identify important people in the IETF context?
Centrality: Roles Analysis
Centrality: What makes them important?
- Send more emails or start more discussion threads
Centrality: Changes in Centrality
- Use the dynamic social network for the analysis
- Data covers 2007-2017
- A moving window of 1 year, advanced in 4-month steps, is used to generate a network at each point in time
- How stable is the importance of a person?
- Example: v6ops (large WG) and rtgwg (small WG)
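The moving-window scheme above can be sketched as follows. The exact start and end dates are assumptions (the slides only give the years 2007-2017), and the helper names are hypothetical.

```python
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date by a number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def moving_windows(start, end, size_months=12, step_months=4):
    """Yield (window_start, window_end) pairs; window_end is exclusive."""
    window_start = start
    while add_months(window_start, size_months) <= end:
        yield window_start, add_months(window_start, size_months)
        window_start = add_months(window_start, step_months)


# Assumed bounds: January 2007 up to (but not including) January 2018.
windows = list(moving_windows(date(2007, 1, 1), date(2018, 1, 1)))
```

Each window would then be used to select the emails from which one snapshot of the network is built; consecutive windows overlap by 8 months.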
Centrality: Changes in Centrality (v6ops)
Centrality: Changes in Centrality (rtgwg)
Example 2: Comparing IETF to Real-World Networks
- Bird et al. [1, 2] applied SNA to open-source software mailing lists, and Chen et al. [3] to a conference mailing list
- They found that the degree distributions follow a power law, as in other real-world networks
- Are networks generated from IETF mailing lists the same as real-world networks?
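One rough way to look at the power-law question above is to fit a straight line to the degree distribution on a log-log scale. The sketch below does this with plain least squares on synthetic degrees. It is only a crude first check under that assumption, not a rigorous power-law test, and it is not necessarily the method used in the thesis.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Map each degree k >= 1 to the fraction of nodes with that degree."""
    counts = Counter(k for k in degrees if k > 0)
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}


def loglog_slope(distribution):
    """Least-squares slope of log P(k) against log k."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(p) for p in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic degrees: the number of nodes with degree k is proportional to k^-2.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope = loglog_slope(degree_distribution(degrees))
```

For a distribution of the form P(k) ~ k^(-a), the fitted slope approximates -a; a poor linear fit on the log-log scale would suggest that the power-law description does not hold.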
Comparing IETF to Real-World Networks
Example 3: Potential Events Analysis
- Diesner et al. [4] and Juszczyszyn et al. [6] generated dynamic social networks and tracked changes in, e.g., the number of active users, network density, and average centrality
- In the Enron case [4], they spotted a significant rise in the period before the company's crisis (bankruptcy)
- In the university email case [6], they found a significant drop related to the semester break
- Several events have occurred in the IETF community
- Can we detect potential events from changes in the social networks?
Example: httpbis
- We analyze changes in the number of active users and the number of newcomers
- In January 2012, the initiative for HTTP/2 was announced
- In July 2014, the WG obsoleted the RFC of HTTP/1.1
- In May 2015, the WG released the RFC of HTTP/2
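The two quantities tracked here could be computed per time window roughly as in the sketch below. The definitions are assumptions for illustration: an active user is anyone who sent at least one email in the window, and a newcomer is an active user whose first email on the list falls inside that window. The message data is made up.

```python
from datetime import date


def active_users_and_newcomers(messages, windows):
    """Return one (active_count, newcomer_count) pair per window.

    messages: list of (sender, sent_date) tuples.
    windows: list of (start, end) date pairs with an exclusive end.
    """
    first_seen = {}
    for sender, sent in messages:
        if sender not in first_seen or sent < first_seen[sender]:
            first_seen[sender] = sent
    results = []
    for start, end in windows:
        active = {s for s, sent in messages if start <= sent < end}
        newcomers = {s for s in active if start <= first_seen[s] < end}
        results.append((len(active), len(newcomers)))
    return results


# Made-up messages from three hypothetical senders.
messages = [
    ("a", date(2012, 1, 10)),
    ("b", date(2012, 1, 20)),
    ("a", date(2012, 2, 5)),
    ("c", date(2012, 2, 7)),
]
windows = [
    (date(2012, 1, 1), date(2012, 2, 1)),
    (date(2012, 2, 1), date(2012, 3, 1)),
]
counts = active_users_and_newcomers(messages, windows)
```

Plotting these counts over time and marking the event dates listed above is one way to check visually whether peaks line up with the events.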
Example: httpbis (# of Active Users)
Example: httpbis (# of Newcomers)
Limitations
- Very large number of working groups (101 WGs)
- Lack of background knowledge about each studied WG
- Hard to evaluate the results of the analyses
- The analyses are quite broad and exploratory
Conclusion
- Analysis of IETF mailing lists using SNA, statistics, and visualization techniques
- The analyses are done in Jupyter Notebook; they are easy to follow and reproducible
- We can use centrality to identify important people in the social network
- We found that people who send more emails and start more threads are more important in the WG
- Centrality is more stable in larger WGs
- We can detect important events, e.g. the proposal of new standards, from changes in the social networks
Thank you
Bibliography
[1] C. Bird, A. Gourley, P. Devanbu, M. Gertz, and A. Swaminathan. Mining email social networks. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR '06, pages 137–143, New York, NY, USA, 2006. ACM.
[2] C. Bird, A. Gourley, P. Devanbu, M. Gertz, and A. Swaminathan. Mining email social networks. In Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR '06, pages 137–143, New York, NY, USA, 2006. ACM.
[3] H. Chen, H. Shen, J. Xiong, S. Tan, and X. Cheng. Social network structure behind the mailing lists: ICT-IIIS at TREC 2006 expert finding track. The Fifteenth Text REtrieval Conference (TREC 2006) Proceedings, 2006.
[4] J. Diesner, T. L. Frantz, and K. M. Carley. Communication networks from the Enron email corpus "It's always about the people. Enron is no different". Computational & Mathematical Organization Theory, 11(3):201–228, Oct 2005.
[5] J. Hardin, G. Sarkis, and P. C. Urc. Network Analysis with the Enron Email Corpus. ArXiv e-prints, Oct. 2014.
[6] K. Juszczyszyn, K. Musiał, P. Kazienko, and B. Gabrys. Temporal changes in local topology of an email-based social network. Computing and Informatics, 28(6), 2012.
[7] S. Klimek. mail_threads.sql · GitHub. https://gist.github.com/simkli/8dffa3ee4ce456d9197404f20d40d4a8. Accessed: 2018-02-24.
[8] R. Rowe, G. Creamer, S. Hershkop, and S. J. Stolfo. Automated social hierarchy detection through email network analysis. In Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, WebKDD/SNA-KDD '07, pages 109–117, New York, NY, USA, 2007. ACM.
[9] F. N. Schwellnus. A heat map for IETF standardization activities. Bachelor's thesis, Technical University Munich, 2016.
[10] G. Wilson and W. Banzhaf. Discovery of email communication networks from the Enron corpus with a genetic algorithm using social network analysis. In 2009 IEEE Congress on Evolutionary Computation, pages 3256–3263, May 2009.
Backup: Types of Networks
- Static social networks: analyzed over one specific period
- Dynamic social networks: include temporal features and study changes of the network over time
Backup: Correlation
Backup: Box Plot
Backup: Relationships of replying emails
- A "sends email to" relationship is not meaningful in the mailing-list context, since every message goes to the whole list [1]
- A directed edge (v1, v2) is created if v1 replies to an email from v2
- We call this relationship "replies_to"
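A minimal sketch of how "replies_to" edges could be derived from message records. The input format (a dictionary from message ID to sender and parent message ID) is hypothetical and is not the schema of the database used in the thesis; dropping self-replies is also an assumption.

```python
from collections import Counter


def replies_to_edges(messages):
    """Count directed (replier, original_sender) edges.

    messages: dict mapping message_id -> (sender, parent_id or None).
    """
    edges = Counter()
    for sender, parent_id in messages.values():
        if parent_id is None or parent_id not in messages:
            continue                      # thread starter or unknown parent
        recipient = messages[parent_id][0]
        if recipient != sender:           # assumption: ignore self-replies
            edges[(sender, recipient)] += 1
    return edges


# Made-up thread: bob replies to alice twice, alice replies to bob once.
messages = {
    "m1": ("alice", None),
    "m2": ("bob", "m1"),
    "m3": ("alice", "m2"),
    "m4": ("bob", "m1"),
}
edges = replies_to_edges(messages)
```

The counts can serve as edge weights, or be ignored if only the existence of an edge matters.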
Backup: Relationships of being in same discussion threads
- Complements the "replies_to" relationship
- An undirected edge (v1, v2) is created if both v1 and v2 appear in the same discussion thread
- v1 does not need to reply to v2 directly for an edge to be created
- We call this relationship "in_same_threads"
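The "in_same_threads" relationship can be sketched by pairing up all distinct participants of each thread. The input format (thread ID mapped to a list of senders) is an assumption made for this example.

```python
from itertools import combinations


def in_same_threads_edges(threads):
    """Return the set of undirected edges as sorted (v1, v2) tuples.

    threads: dict mapping thread_id -> list of senders who posted in it.
    """
    edges = set()
    for participants in threads.values():
        # Sorting makes (v1, v2) and (v2, v1) the same tuple.
        for v1, v2 in combinations(sorted(set(participants)), 2):
            edges.add((v1, v2))
    return edges


# Made-up threads with hypothetical participants.
threads = {
    "t1": ["alice", "bob", "carol", "bob"],
    "t2": ["carol", "dave"],
}
thread_edges = in_same_threads_edges(threads)
```

Note that a thread with n distinct participants contributes n(n-1)/2 edges, so long threads make this network much denser than the "replies_to" network.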
Backup: Creating threads vs Centrality
Type of centrality        Avg. corr. coeff.   S.D. corr. coeff.
Eigenvector centrality    0.7617              0.1457
In-degree centrality      0.8132              0.1396
Betweenness centrality    0.8294              0.1604
Closeness centrality      0.4517              0.1109
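Figures like those in the table could be obtained by computing one correlation coefficient per WG and then summarising across WGs. The sketch below assumes a Pearson coefficient and a sample standard deviation; the slides do not state which variants were used, and the numbers here are made up.

```python
import math
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data for two hypothetical WGs:
# threads started per person, and that person's centrality.
per_wg = {
    "wg_a": ([1, 2, 3, 4], [0.1, 0.2, 0.3, 0.4]),
    "wg_b": ([1, 2, 3, 4], [0.4, 0.3, 0.2, 0.1]),
}
coefficients = [pearson(threads, centrality) for threads, centrality in per_wg.values()]
average, spread = mean(coefficients), stdev(coefficients)
```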
Backup: Data Transformation
Response Time Analysis
- How fast does a person respond to an email, depending on who sent it?
- Dyad (pairwise) analysis
- Assumption: the closer two people are, the faster they might respond to each other
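The dyad-level quantity described above could be computed as a mean response delay per (replier, original sender) pair. The record layout below is a hypothetical one chosen for this sketch, and the timestamps are made up.

```python
from collections import defaultdict
from datetime import datetime


def mean_response_hours(replies):
    """Average reply delay in hours for each (replier, original_sender) pair.

    replies: list of (replier, original_sender, sent_at, replied_at) tuples.
    """
    delays = defaultdict(list)
    for replier, sender, sent_at, replied_at in replies:
        hours = (replied_at - sent_at).total_seconds() / 3600
        delays[(replier, sender)].append(hours)
    return {pair: sum(values) / len(values) for pair, values in delays.items()}


# Made-up replies: bob answers alice after 2 h and 4 h, carol after 24 h.
replies = [
    ("bob", "alice", datetime(2017, 3, 1, 9, 0), datetime(2017, 3, 1, 11, 0)),
    ("bob", "alice", datetime(2017, 3, 2, 9, 0), datetime(2017, 3, 2, 13, 0)),
    ("carol", "alice", datetime(2017, 3, 1, 9, 0), datetime(2017, 3, 2, 9, 0)),
]
response_times = mean_response_hours(replies)
```

These per-pair delays can then be compared with a measure of how close the pair is in the network, for example the number of replies exchanged.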
Response Time Analysis
- Avg correlation of 0.3103 with S.D. of 0.0834
Events: tls (# Active Users)
Events: tls (# Newcomers)
Events: tls
- In August 2008, TLS v1.2 was proposed as a new standard
- In August 2013, work on TLS v1.2 started
Cross-posts