
slide-1
SLIDE 1

Analyzing the Facebook friendship graph

S. Catanese (1), P. De Meo (1,2), E. Ferrara (3), G. Fiumara (1) and A. Provetti (1,4)

  • (1) Dept. of Physics, Informatics Section, University of Messina
  • (2) Dept. of Computer Sciences, Vrije Universiteit Amsterdam
  • (3) Dept. of Mathematics, University of Messina
  • (4) Oxford-Man Institute, University of Oxford

Int’l Conf. on Web Intelligence, Mining and Semantics

May 26th 2011, Sogndal

Catanese, De Meo, Ferrara, Fiumara & Provetti () Analyzing the Facebook friendship graph WIMS11 1 / 43

slide-2
SLIDE 2

Outline

1. Motivation
  • Main objective
  • The Basic Problem
  • Classic Work

2. Our Results/Contribution
  • Data Extraction and Cleaning
  • Data Analysis
  • Main Results

3. Future Issues

slide-4
SLIDE 4

Main objective

Extract a (partial) graph of friendship relations from Facebook

◮ starting from the friendlist of a real user
◮ accessing only publicly accessible data of Facebook users

using:

◮ a wrapper (for extraction, cleaning and normalization of data)
◮ a tool for graph visualization and analysis

developed by some of us


slide-8
SLIDE 8

Social Networks

A Taxonomy

Social Networks (SN)

Described with graphs representing users and relationships among them

  • Organizational Networks
  • Collaboration Networks
  • Communication Networks
  • Friendship Networks
  • Online Social Networks (OSNs) [1]:
    ◮ Social Communities: Facebook, MySpace, etc.
    ◮ Social Bookmarking: Digg, Delicious, etc.
    ◮ Content Sharing: YouTube, Flickr, etc.


slide-10
SLIDE 10

Social Networks

Examples

Figure: Organizational Network

Figure: Friendship Network

slide-11
SLIDE 11

Mining Online Social Networks

Motivation

  • Is the distribution of friendship computable?
  • Calculating graph properties of OSNs
  • Exploiting new algorithms in the following tasks:
    ◮ Walking through a large graph (e.g. BFS, MHRW, etc.)
    ◮ Data compression (matrix decomposition, quadtrees, etc.)
    ◮ Efficient visualization of large graphs
    ◮ Clustering data (Fruchterman-Reingold, Harel-Koren, etc.)
    ◮ Optimizing efficiency in metrics evaluation (e.g. All-Pairs Shortest-Paths related: BC, CC, diameter, etc.)
  • Studying the scalability of the problem
  • Investigating similarities between OSNs and real-life SNs


slide-16
SLIDE 16

Mining Online Social Networks

Pros and Cons

Pros:

◮ Large-scale studies of phenomena and behaviors that were impossible before
◮ Relations among users are clearly defined
◮ Data can be automatically acquired
◮ Huge amount of information can be mined
◮ Several levels of granularity can be established

Cons:

◮ Large-scale mining issues
◮ Computational and algorithmic challenges
◮ Online friendship ≠ Real-life friendship
◮ Bias of data depends on visiting algorithm [2]


slide-19
SLIDE 19

Classic Work on (online or offline) SNs

  • Milgram, Travers [3]: the Small World problem (1969-70)
  • Zachary [4]: 'mining' and modeling real-life SNs (1980)
  • Kleinberg [5]: the small world problem from an algorithmic perspective (2000)
  • Golbeck et al. [6]: social networks vs OSNs (2005)
  • Barabasi [7], Leskovec [8], Shneiderman [9], etc.: all focusing on OSNs and their analysis (nowadays)
    ◮ Online Social Network Analysis and Tools
    ◮ Large-scale data mining from OSNs
    ◮ Visualization of large graphs
    ◮ Bias of data acquired from OSNs
    ◮ Dynamics and evolution of OSNs


slide-25
SLIDE 25

Mining the Facebook graph

Visiting Algorithm

BFS approach: starting from a single seed (a FB profile), visit the friend-lists of nodes in order of discovery.

Pros:

◮ Optimal solution for unweighted, undirected graphs
◮ Implementation is easy and intuitive

Cons:

◮ Introduces bias in incomplete visits

Challenges:

◮ FB anti-data mining policies

Figure: Breadth-first search (3rd sub-level). Node 1: seed; nodes 2-4: friends; nodes 5-8: friends of friends; nodes 9-12: friends of friends of friends.
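The BFS visit described on this slide can be sketched in a few lines of Python. This is only an illustration, not the agent used in the talk: `bfs_crawl`, `get_friends` and the `TOY` friend-lists are hypothetical names, and the toy data simply mirrors the numbering in the figure (1 = seed, 2-4 = friends, and so on).

```python
from collections import deque

def bfs_crawl(seed, get_friends, max_depth=3):
    """Visit profiles in order of discovery, down to max_depth sub-levels.

    get_friends is a hypothetical callback returning the friend-list of a
    profile; in a real crawl it would wrap the page extraction step.
    """
    depth = {seed: 0}        # discovered profiles and their sub-level
    edges = set()            # undirected friendships, stored as frozensets
    queue = deque([seed])    # FIFO queue of profiles still to expand
    while queue:
        user = queue.popleft()
        if depth[user] >= max_depth:
            continue         # do not expand profiles on the last sub-level
        for friend in get_friends(user):
            edges.add(frozenset((user, friend)))
            if friend not in depth:
                depth[friend] = depth[user] + 1
                queue.append(friend)
    return depth, edges

# Hypothetical toy friend-lists matching the figure's numbering.
TOY = {1: [2, 3, 4], 2: [1, 5], 3: [1, 6, 7], 4: [1, 8],
       5: [2, 9], 6: [3, 10], 7: [3, 11], 8: [4, 12],
       9: [5], 10: [6], 11: [7], 12: [8]}

depth, edges = bfs_crawl(1, lambda u: TOY.get(u, []), max_depth=3)
```

Stopping the visit at a fixed depth, as here, is exactly the kind of incomplete visit in which BFS introduces the bias listed under Cons.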

slide-26
SLIDE 26

Mining the Facebook graph

Design of the Mining Agent

Figure: State Diagram of the Data Mining Process


slide-27
SLIDE 27

Mining the Facebook graph

Architecture

  • Java application
  • Firefox browser embedded
  • XPCOM/XULRunner interface
  • Web pages spider
  • Wrapper

slide-28
SLIDE 28

Mining the Facebook graph

How the Agent Works

Agent Initialization:

  • FB authentication → Seed friend-list page
  • Selection of an example friend → XPath extraction
  • Wrapper generation and adaptation
  • Wrapper execution → Generation of the queue

Agent Execution:

  • Load FIFO queue
  • For all the user profiles in the queue:
    ◮ Visit friend-list page of the current user
      ⋆ Extract friends (nodes) and save friendships (edges)
      ⋆ Insert unvisited profiles in the queue
    ◮ Visit 'next pages' of the friend-list
    ◮ Cycle the process
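To make the "XPath extraction" step concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath. Everything about the page is an assumption made for illustration: the markup in `SAMPLE_PAGE`, the `friend` class name, the `id=` URL parameter and the `extract_friends` helper are hypothetical, and real Facebook pages are neither this simple nor guaranteed to be well-formed XML (the agent in the talk works through an embedded browser instead).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a friend-list page.
SAMPLE_PAGE = """
<div id="friends">
  <div class="friend"><a href="/profile.php?id=101">Alice</a></div>
  <div class="friend"><a href="/profile.php?id=102">Bob</a></div>
  <div class="other"><a href="/help">Help</a></div>
</div>
"""

# Assumed wrapper expression: every link inside a "friend" block.
FRIEND_XPATH = ".//div[@class='friend']/a"

def extract_friends(page, xpath=FRIEND_XPATH):
    """Return (profile_id, name) pairs matched by the wrapper expression."""
    root = ET.fromstring(page.strip())
    friends = []
    for link in root.findall(xpath):
        href = link.get("href", "")
        profile_id = href.rsplit("id=", 1)[-1]  # text after the last "id="
        friends.append((profile_id, link.text))
    return friends

friends = extract_friends(SAMPLE_PAGE)
```

In the workflow above, the extracted profile ids would become nodes, the pairs (current user, friend) would become edges, and unvisited ids would be appended to the FIFO queue.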

slide-29
SLIDE 29

Mining the Facebook graph

Handling Data

Possible representations of visited nodes and edges:

  • Adjacency list
  • Adjacency matrix

Possible representation of BFS visit for unvisited nodes:

  • FIFO queue
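As a small illustration of the two representations, the sketch below builds both from the same edge list. The edge list and function names are hypothetical examples, not data from the talk.

```python
def adjacency_list(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def adjacency_matrix(edges):
    """Return (ordered node list, symmetric 0/1 matrix)."""
    nodes = sorted({n for edge in edges for n in edge})
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return nodes, matrix

# Hypothetical friendships.
example_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = adjacency_list(example_edges)
nodes, matrix = adjacency_matrix(example_edges)
```

The adjacency list needs memory proportional to nodes plus edges, while the matrix needs memory quadratic in the number of nodes, so for a sparse friendship graph the list is usually the more practical choice.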

slide-30
SLIDE 30

Mining the Facebook graph

Cleaning Data

Data cleaning:

  • Removing duplicate nodes, exploiting hash tables
  • Relinking edges
  • Deleting parallel edges

Data cleaning: O(n) time (optimal)

Structured Format: Clean data is saved under the XML structure GraphML
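A minimal sketch of the cleaning and export steps, under stated assumptions: duplicates are detected by a hypothetical `canonical` key function (here just whitespace stripping), self-loops are dropped (an assumption, the slide does not mention them), and the GraphML output contains only bare nodes and edges. Python sets are hash tables, so the single pass runs in expected linear time in the number of raw edges.

```python
import xml.etree.ElementTree as ET

def clean(raw_edges, canonical=str.strip):
    """Single pass: merge duplicate nodes, relink edges, drop parallel edges."""
    nodes, edges = set(), set()
    for u, v in raw_edges:
        u, v = canonical(u), canonical(v)  # duplicates collapse to one key
        if u == v:
            continue                       # assumption: ignore self-loops
        nodes.update((u, v))
        edges.add(frozenset((u, v)))       # (u, v) and (v, u) are one edge
    return nodes, edges

def to_graphml(nodes, edges):
    """Serialize an undirected graph as a GraphML string."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="G", edgedefault="undirected")
    for node in sorted(nodes):
        ET.SubElement(graph, "node", id=node)
    for source, target in sorted(tuple(sorted(e)) for e in edges):
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

# Hypothetical raw crawl output with a duplicated node and parallel edges.
raw = [("u1", "u2"), ("u2", "u1"), (" u1", "u3"), ("u1", "u2")]
nodes, edges = clean(raw)
graphml = to_graphml(nodes, edges)
```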

slide-31
SLIDE 31

Mining the Facebook graph

Agent Running


slide-33
SLIDE 33

Network Analysis Metrics

Types of Networks

Several classifications of network types exist. The network type affects which metrics and maps are generated and how they are interpreted.

  • Network member's point of view
    ◮ Egocentric
    ◮ Partial
    ◮ Full
  • Network entity's point of view
    ◮ Unimodal
    ◮ Multimodal
    ◮ Bimodal
    ◮ Affiliation
  • Multiplex networks


slide-36
SLIDE 36

Network Analysis Metrics

Facebook Friendship Network

Facebook characteristics:

  • Egocentric networks: the term ego denotes a person connected to everyone (alter) in the network
  • Unweighted, undirected network:
    ◮ 1.0 degree
    ◮ 1.5 degree
    ◮ 2.0 degree
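The slide does not define the three degree levels, so the sketch below assumes the definitions commonly used for egocentric networks: 1.0 keeps the ego, its alters and the ego-alter edges; 1.5 adds the edges among alters; 2.0 adds the alters' own friends. The `ego_network` function and the toy graph are hypothetical.

```python
def ego_network(adj, ego, level):
    """Return (nodes, edges) of the ego network at level '1.0', '1.5' or '2.0'.

    adj maps each node to the set of its neighbours (undirected graph).
    The level definitions are an assumption, see the note above.
    """
    alters = set(adj[ego])
    core = {ego} | alters
    if level == "1.0":
        return core, {frozenset((ego, a)) for a in alters}
    if level == "1.5":
        edges = {frozenset((u, v)) for u in core for v in adj[u] if v in core}
        return core, edges
    if level == "2.0":
        nodes = set(core)
        for a in alters:
            nodes |= set(adj[a])          # friends of friends
        edges = {frozenset((u, v)) for u in core for v in adj[u]}
        return nodes, edges
    raise ValueError("level must be '1.0', '1.5' or '2.0'")

# Hypothetical toy graph: e is the ego, a and b are its alters.
toy = {"e": {"a", "b"}, "a": {"e", "b", "c"}, "b": {"e", "a"},
       "c": {"a", "d"}, "d": {"c"}}

n10, e10 = ego_network(toy, "e", "1.0")
n15, e15 = ego_network(toy, "e", "1.5")
n20, e20 = ego_network(toy, "e", "2.0")
```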


slide-39
SLIDE 39

Network Analysis Metrics

Measures

Network metrics

Allow analysts to systematically dissect the social world, creating a basis on which to compare networks, track changes in a network over time and determine the relative position of individuals and clusters within the network.

Research focuses on:

◮ Structure of the whole graph;
◮ Large sub-graphs;
◮ Identifying individual nodes of particular interest;
◮ Analyzing the whole graph aggregated over its entire lifetime;
◮ Slicing the network into units of time to explore the progression of the development of the network.

A starting point: list from Perer and Shneiderman


slide-41
SLIDE 41

Network Analysis Metrics

Measures - Perer and Shneiderman List

  • Overall network metrics: number of nodes, number of edges, density, diameter, etc.;
  • Node rankings: degree, betweenness and closeness centrality;
  • Edge rankings: weight, betweenness centrality;
  • Node rankings in pairs: degree vs. betweenness, plotted on a scattergram;
  • Edge rankings in pairs;
  • Cohesive subgroups: finding communities;
  • Multiplexity: analyzing comparisons among different edge types.
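The overall metrics at the top of the list can be computed directly from an adjacency list. The sketch below is a plain-Python illustration with hypothetical names, not the tool used in the talk. It runs one BFS per node, which is fine for small graphs but does not scale to graphs of the size discussed here.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def overall_metrics(adj):
    """Nodes, edges, density, diameter and average geodesic distance."""
    n = len(adj)
    m = sum(len(neigh) for neigh in adj.values()) // 2  # each edge seen twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    diameter, total, pairs = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:                   # only reachable, distinct pairs
                diameter = max(diameter, d)
                total += d
                pairs += 1
    return {"nodes": n, "edges": m, "density": density, "diameter": diameter,
            "avg_distance": total / pairs if pairs else 0.0}

# Hypothetical example: a path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
metrics = overall_metrics(path)
degree = {node: len(neigh) for node, neigh in path.items()}
```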

slide-42
SLIDE 42

Analyzing Social Networks

Visual Analysis: Motivation

Visualizing Social Networks

Constructing visual images of social networks provides insights about the structure of a network and gives visual support for explaining network phenomena [10].

Graph drawing issues:

◮ As network complexity increases, legibility decreases;
◮ Interactive operations on nodes, such as filtering or manual placement, are needed

slide-43
SLIDE 43

Analyzing Social Networks

Better-quality Network Visualization

Readability Metrics (RMs)

  • RMs measure how understandable a graph drawing is (for example, through the number of edge crossings or occluded nodes in the drawing) [11].
  • Each algorithm attempts to find an optimal layout of the graph, often according to a set of readability metrics;
  • A simple interim set of guidelines might aspire to the four principles of NetViz Nirvana [12]:
    ◮ Every vertex is visible;
    ◮ Every vertex's degree is countable;
    ◮ Every edge can be followed from source to destination;
    ◮ Clusters and outliers are identifiable.

Approach: layout and filtering techniques.
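One of the readability metrics named above, the number of edge crossings, is easy to sketch for a straight-line drawing. The code below is an illustrative brute-force count over all edge pairs (quadratic in the number of edges), with hypothetical names and a hypothetical layout. It counts only proper crossings and ignores edges that merely share an endpoint.

```python
from itertools import combinations

def orient(p, q, r):
    """Sign of the turn p -> q -> r (positive = counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    """True if segment ab and segment cd cross at an interior point."""
    return (orient(a, b, c) * orient(a, b, d) < 0 and
            orient(c, d, a) * orient(c, d, b) < 0)

def count_edge_crossings(pos, edges):
    """Count crossing pairs of edges in a straight-line layout.

    pos maps each node to its (x, y) position in the drawing.
    """
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if len({u1, v1, u2, v2}) < 4:
            continue                     # edges sharing an endpoint
        if segments_cross(pos[u1], pos[v1], pos[u2], pos[v2]):
            crossings += 1
    return crossings

# Hypothetical layout: a unit square with both diagonals drawn.
layout = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
square = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
with_diagonals = square + [("a", "c"), ("b", "d")]

crossings_square = count_edge_crossings(layout, square)
crossings_diagonals = count_edge_crossings(layout, with_diagonals)
```

A layout algorithm guided by readability metrics would try to move nodes so that counts like this one go down.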


slide-46
SLIDE 46

SNA Tools

Some Powerful Tools and Libraries Adopted

  • GUESS focuses on improving the interactive exploration of graphs.
  • NodeXL, developed as an add-in to the Microsoft Excel 2007 spreadsheet software, provides tools for network overview, discovery and exploration.
  • LogAnalysis helps forensic analysts in visual statistical analysis of mobile phone traffic networks.
  • Jung and Prefuse provide Java APIs implementing algorithms and methods for building applications for graphical visualization and SNA for graphs.

A list of other SNA tools for extracting, analyzing and displaying social media networks can be found on the International Network for Social Network Analysis (INSNA) site: http://www.insna.org/software/index.html


slide-48
SLIDE 48

Facebook Network Analysis

NodeXL - Overall Metrics

Metric                                       Value
Graph Type                                   Undirected
Vertices                                     547,302
Unique Edges                                 836,468
Edges With Duplicates
Total Edges                                  836,468
Self-Loops
Connected Components                         2
Single-Vertex Connected Components
Maximum Vertices in a Connected Component    546,733
Maximum Edges in a Connected Component       835.9
Maximum Geodesic Distance (Diameter)         10
Average Geodesic Distance                    5.00

Table: Overall Network Metrics

slide-49
SLIDE 49

Facebook Network Analysis

NodeXL - Miscellaneous Metrics

Metric                    Minimum    Maximum      Average    Median
Degree                    1          4,958        3.057      1.000
PageRank                  0.269      2,120.268    1.000      0.491
Clustering Coefficient    0.000      1.000        0.053      0.000
Eigenvector Centrality    0.000      0.003        0.000      0.000

Table: Miscellaneous Metrics
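As a quick sanity check, the average degree in this table agrees with the vertex and edge counts reported in the overall metrics: in an undirected graph every edge contributes to two degrees, so the average degree is 2E/V. The density line is an extra derived value, not a figure from the slides.

```python
vertices = 547_302   # from the overall network metrics
edges = 836_468      # unique edges, same table

# Each undirected edge adds one to the degree of both endpoints.
average_degree = 2 * edges / vertices

# Derived here for illustration: fraction of possible edges that exist.
density = 2 * edges / (vertices * (vertices - 1))

print(f"average degree = {average_degree:.3f}")
print(f"density        = {density:.2e}")
```

The result rounds to 3.057, matching the reported average, and the median degree of 1.000 next to a maximum of 4,958 shows how skewed the degree distribution of the sample is.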

slide-50
SLIDE 50

Facebook Network Graph

LogAnalysis Force Directed Filtered View (25K Nodes Sub-graph)


slide-51
SLIDE 51

Facebook Network Graph

LogAnalysis Force Directed Filtered View (2.0 degree)


slide-52
SLIDE 52

Facebook Network Graph

LogAnalysis Force Directed Aggregate Filtered View (2.0 Degree)


slide-53
SLIDE 53

Facebook Network Graph

NodeXL Visualization (25K Nodes Sub-graph)


slide-54
SLIDE 54

Facebook Network Graph

NodeXL Filtered Visualization (25K Nodes Sub-graph)


slide-55
SLIDE 55

Facebook Network Graph

NodeXL Filtered Visualization (25K Nodes Sub-graph)


slide-56
SLIDE 56

Metrics Importance: Betweenness Centrality

Top 25 Nodes Ordered by BC (25K Nodes Sub-graph)

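For readers who want to reproduce a "top nodes by BC" ranking on a small graph, here is a sketch of betweenness centrality for unweighted, undirected graphs using Brandes' algorithm. The slides do not say which algorithm the tools use internally, so this is a swapped-in standard technique, and the function names and toy graph are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                         # nodes in order of discovery
        preds = {v: [] for v in adj}       # predecessors on shortest paths
        sigma = {v: 0 for v in adj}        # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}      # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

def top_k(scores, k):
    """The k nodes with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy graph: two triangles joined by the bridge 3 - 4.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
bc = betweenness(toy)
top = top_k(bc, 2)
```

In the toy graph the two bridge endpoints get the highest score, which illustrates why high-BC nodes are read as brokers between otherwise separate groups.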

slide-57
SLIDE 57

Future Issues

  • Enrich the sample (currently 5 million nodes and 15 million edges)
  • Refine features and metrics thanks to a larger sample
  • Study communities emerging from the overall graph
  • Implement parallel techniques to speed up metrics calculations
  • Determine scaling (-up and -down) coefficients
  • Study how visiting algorithms affect extracted data
  • Study the dynamic (i.e., temporal) evolution of the graph

slide-58
SLIDE 58

Thank you


slide-59
SLIDE 59

For Further Reading I

  • [1] R. Kumar. Online social networks: modeling and mining. Proc. of the 2nd ACM International Conference on Web Search and Data Mining, 2009.
  • [2] M. Kurant, A. Markopoulou, P. Thiran. On the bias of BFS. arXiv preprint arXiv:1004.1729, 2010.
  • [3] S. Milgram, J. Travers. An experimental study of the small world problem. Sociometry, 32(4), 1969.
  • [4] W. Zachary. A language for modeling and simulating social process. PhD Thesis, 1980.

slide-60
SLIDE 60

For Further Reading II

  • [5] J. Kleinberg. The small-world phenomenon: an algorithm perspective. Proc. of the 32nd ACM Symposium on Theory of Computing, 2000.
  • [6] J. Golbeck et al. Social networks applied. IEEE Intelligent Systems, 20(1), 2005.
  • [7] A.L. Barabasi et al. Linked: the new science of networks. American Journal of Physics, 71(4), 2003.
  • [8] J. Leskovec. Dynamics of large networks. PhD Thesis, 2008.

slide-61
SLIDE 61

For Further Reading III

  • [9] B. Shneiderman. Analyzing (social media) networks with NodeXL. Proc. of the 4th International Conference on Communities and Technologies, 2009.
  • [10] L.C. Freeman. Visualizing Social Networks. Journal of Social Structure, 2000.
  • [11] B. Shneiderman, C. Dunne. Improving Graph Drawing Readability by Incorporating Readability Metrics: A Software Tool for Network Analysts. University of Maryland, HCIL Tech Report HCIL-2009-13, May 2009.
  • [12] B. Shneiderman, A. Aris. Network Visualization with Semantic Substrates. IEEE Symposium on Information Visualization and IEEE Trans. Visualization and Computer Graphics, 12(5), 2006, 733-740.
