Applying Social Network Analysis (SNA) to P2P File Sharing



slide-1
SLIDE 1

Applying Social Network Analysis (SNA) to P2P File Sharing

Andreas Schaufelbühl Robin Stohler Benjamin Bürgisser

slide-2
SLIDE 2

P2P

slide-3
SLIDE 3

Centralized - Napster

slide-4
SLIDE 4

Decentralized - Gnutella 0.4 /Freenet

slide-5
SLIDE 5

Hybrid - Gnutella 0.6

slide-6
SLIDE 6

BitTorrent

slide-7
SLIDE 7

BitTorrent

slide-8
SLIDE 8

BitTorrent

slide-9
SLIDE 9

BitTorrent

slide-10
SLIDE 10

BitTorrent

slide-11
SLIDE 11

SNA

slide-12
SLIDE 12

SNA

Graph with Nodes and Edges
Nodes: Individuals or groups
Edges: Relationships, interactions

slide-13
SLIDE 13

Measurements

slide-14
SLIDE 14

Network-centric measurements

slide-15
SLIDE 15

Network size

Counting the number of edges/nodes

+ Simple
- Not significant, just describes the dimension

Network size (nodes): 7
Network size (edges): 8
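As a minimal sketch of the counting step, the snippet below uses a hypothetical edge list (`EDGES` is an assumption, not the graph pictured on the slide) that happens to have the same dimensions of 7 nodes and 8 edges.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]

# Every node that appears at either end of an edge belongs to the network.
nodes = {node for edge in EDGES for node in edge}

num_nodes = len(nodes)
num_edges = len(EDGES)

print(f"Network size (nodes): {num_nodes}")  # 7
print(f"Network size (edges): {num_edges}")  # 8
```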

slide-16
SLIDE 16

Network compactness

Edges / possibly existing edges

+ Describes comparative compactness
+ The ratio makes networks comparable
- Only a general view, no statement about a specific node/area
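The ratio can be sketched for an undirected graph as follows. The edge list and the function name `compactness` are assumptions for illustration; an undirected network with n nodes can hold at most n(n-1)/2 edges.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def compactness(edges):
    """Existing edges divided by possibly existing edges (undirected graph)."""
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    possible = n * (n - 1) // 2
    return len(edges) / possible if possible else 0.0


ratio = compactness(EDGES)
print(round(ratio, 3))  # 8 / 21, about 0.381
```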

slide-17
SLIDE 17

Average degree

Average degree = sum of all degrees / number of nodes

+ Describes comparative compactness/cohesion
+ A node's interconnectivity can be compared to the average (node-centric!)
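A short sketch of the calculation, again on a hypothetical edge list (an assumption, not the slide's graph). It also lists the nodes whose degree lies above the average, which is the node-centric comparison mentioned above.

```python
from collections import Counter

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]

# Each edge contributes one to the degree of both of its end nodes.
degree = Counter()
for u, v in EDGES:
    degree[u] += 1
    degree[v] += 1

average_degree = sum(degree.values()) / len(degree)
above_average = sorted(n for n, d in degree.items() if d > average_degree)

print(round(average_degree, 3))  # 16 / 7, about 2.286
print(above_average)             # [3, 5]
```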

slide-18
SLIDE 18

Diameter

Longest shortest path in the network

$\text{diameter} = \max_{i,j} S(i,j)$

S(i,j): shortest path between any two nodes i, j

+ Measurement of distance in the network
- Scales with the number of nodes, so no comparison is possible between networks of different sizes
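One way to compute the diameter of an unweighted graph is a breadth-first search from every node. The sketch below assumes a connected, undirected graph; the edge list and helper names are hypothetical.

```python
from collections import deque

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest path length S(source, j) to every reachable node j."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


adj = build_adjacency(EDGES)
print(diameter(adj))  # 4, for example the path 1-3-5-6-7
```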

slide-19
SLIDE 19

Measures of Connectivity

How many edges or nodes must be removed until the network falls into multiple parts?
Searching for the weakest link
Describes cohesion/reliability
A high number means high reliability
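For small graphs the weakest links can be found by brute force: remove one node or one edge at a time and test whether the rest is still connected. This is only a sketch on a hypothetical edge list; the helper names are assumptions, and larger networks would need a dedicated algorithm.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def is_connected(nodes, edges):
    """True if all given nodes are reachable from each other via the edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


nodes = {n for edge in EDGES for n in edge}

# Nodes whose removal alone splits the network.
cut_nodes = sorted(n for n in nodes if not is_connected(nodes - {n}, EDGES))
# Edges whose removal alone splits the network.
bridges = [e for e in EDGES
           if not is_connected(nodes, [f for f in EDGES if f != e])]

print(cut_nodes)  # [3, 5, 6]
print(bridges)    # [(5, 6), (6, 7)]
```

Because a single node or edge is enough to split this example, both its node and edge connectivity are 1, which indicates low reliability.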

slide-20
SLIDE 20

Global Clustering coefficient

Describes the ratio between triangles and triplets
Range: [0,1]
High global clustering means good connectivity

Example: 2/5 = 0.4
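The sketch below assumes the common definition "closed triplets divided by all connected triplets", where every triangle contributes three closed triplets. The edge list is hypothetical, so the result differs from the 0.4 of the slide's own figure.

```python
from itertools import combinations

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def global_clustering(adj):
    """Closed triplets / all connected triplets (assumed definition)."""
    closed = 0
    triplets = 0
    for center, neighbours in adj.items():
        # Every pair of neighbours forms a triplet centred on this node.
        for a, b in combinations(sorted(neighbours), 2):
            triplets += 1
            if b in adj[a]:  # the triplet is closed by an edge a-b
                closed += 1
    return closed / triplets if triplets else 0.0


adj = build_adjacency(EDGES)
coefficient = global_clustering(adj)
print(round(coefficient, 3))  # 6 / 13, about 0.462 (2 triangles, 13 triplets)
```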

slide-21
SLIDE 21

Node-centric measurements

slide-22
SLIDE 22

Degree

Number of edges connected to a node
Example: Node 3 has a degree of 4

+ Simple
+ The number of connections is comparable
- No information about the importance of the connections
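A small sketch of the degree measurement. The edge list is a hypothetical graph chosen so that node 3 also ends up with a degree of 4; it is not the graph from the slide.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]

# Collect the neighbours of every node; the degree is the neighbour count.
neighbours = {}
for u, v in EDGES:
    neighbours.setdefault(u, set()).add(v)
    neighbours.setdefault(v, set()).add(u)

degree = {node: len(adjacent) for node, adjacent in neighbours.items()}
ranking = sorted(degree, key=lambda node: (-degree[node], node))

print(degree[3])  # 4
print(ranking)    # [3, 5, 1, 2, 4, 6, 7]
```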
slide-23
SLIDE 23

Betweenness centrality

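For every pair of nodes: the number of shortest paths between the two nodes that pass through the measured node, divided by the number of all shortest paths between the same two nodes (including those that do not pass through the measured node). The betweenness is the sum of these fractions over all pairs:

$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$

$\sigma_{st}$: number of shortest paths between s and t
$\sigma_{st}(v)$: number of those paths that pass through v

[Worked examples for Node 2 and Node 3: sums of such fractions, values not recoverable from the slide]

+ Importance of one node
A high number means an important node
- Scales with the number of nodes, so no comparison between different networks is possible
  Remedy: divide by the number of nodes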
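The definition above can be sketched by counting shortest paths with a breadth-first search from every node. The code assumes a connected, undirected, unweighted graph; both edge lists and all function names are hypothetical. The result is the raw (unnormalized) sum.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected example graphs (not the ones pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]
SQUARE = [(1, 2), (2, 3), (3, 4), (4, 1)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_counts(adj, source):
    """Distance and number of shortest paths from source to every node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        for v in adj:
            if v in (s, t):
                continue
            # v lies on a shortest s-t path if the two distances add up.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


scores = betweenness(build_adjacency(EDGES))
print(scores[3], scores[5], scores[6])  # 8.0 8.0 5.0

# In a square, opposite corners are joined by two shortest paths,
# so each intermediate node receives a fraction of 1/2.
print(betweenness(build_adjacency(SQUARE)))  # every node scores 0.5
```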

slide-24
SLIDE 24

Closeness centrality

Inverse of the farness from one node to all others
Shows the centrality of a specific node
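A minimal sketch, taking farness to be the sum of the shortest path lengths from a node to all other nodes (an assumption about the exact definition used on the slide). The graph is hypothetical and must be connected for this to work.

```python
from collections import deque

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, node):
    """Inverse of farness; assumes a connected graph."""
    farness = sum(bfs_distances(adj, node).values())
    return 1 / farness


adj = build_adjacency(EDGES)
print(closeness(adj, 3))  # 1/9: nodes 3 and 5 are the most central
print(closeness(adj, 7))  # 1/17: node 7 sits at the end of the tail
```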

slide-25
SLIDE 25

Eccentricity

Length of the longest shortest path from a node

$e(u) = \max\{ d(u,v) : v \in V \}$

e(u): eccentricity of node u
d(u,v): shortest path between u and v

- Scales with the number of nodes

How far is the node from the furthest other node?
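The eccentricity of every node can be sketched with one breadth-first search per node. The edge list and helper names are hypothetical, and the graph is assumed to be connected.

```python
from collections import deque

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eccentricity(adj, u):
    """e(u): distance from u to the node furthest away from it."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return max(dist.values())


adj = build_adjacency(EDGES)
ecc = {u: eccentricity(adj, u) for u in sorted(adj)}
print(ecc)  # {1: 4, 2: 4, 3: 3, 4: 3, 5: 2, 6: 3, 7: 4}
```

The largest eccentricity equals the diameter of the network (4 in this example).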

slide-26
SLIDE 26

Eigenvector centrality

  • Assigns a relative score to a node
  • High-scoring neighbours raise the score of the node

  • Measurement of influence

Examples: Google PageRank, Katz centrality
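One common way to obtain such scores is power iteration: every node repeatedly takes the sum of its neighbours' scores, followed by a rescaling. The sketch below is an illustration under assumptions (hypothetical graph, fixed iteration count, scaling so that the top node scores 1.0) and is not PageRank or Katz centrality itself.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eigenvector_centrality(adj, iterations=100):
    """Power iteration; assumes a connected graph that is not bipartite."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # A node's new score is the sum of its neighbours' current scores.
        new = {v: sum(score[u] for u in adj[v]) for v in adj}
        largest = max(new.values())
        score = {v: value / largest for v, value in new.items()}
    return score


scores = eigenvector_centrality(build_adjacency(EDGES))
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], ranking[1], ranking[-1])  # 3 5 7
```

Node 4 has the same degree as node 6 but scores higher, because its neighbours (3 and 5) are themselves high-scoring.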

slide-27
SLIDE 27

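Local Clustering coefficient

Edges in the neighborhood / possibly existing edges in the neighborhood

$C_i = \frac{2\,|\{e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E\}|}{k_i (k_i - 1)}$

$N_i$: neighborhood of node i, $k_i$: number of nodes in $N_i$
Maximum number of edges in $N_i$: $\frac{k_i (k_i - 1)}{2}$

+ Comparative number describing the clustering of a node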
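A sketch of the local clustering coefficient for an undirected graph, counting the edges that exist among a node's neighbours. The edge list is hypothetical, and returning 0 for nodes with fewer than two neighbours is an assumed convention.

```python
from itertools import combinations

# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Edges among the neighbours divided by the maximum k(k-1)/2."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours exists
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


adj = build_adjacency(EDGES)
print(local_clustering(adj, 1))  # 1.0: its two neighbours are connected
print(local_clustering(adj, 3))  # 2 of 6 possible edges, about 0.333
print(local_clustering(adj, 6))  # 0.0: neighbours 5 and 7 are not connected
```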

slide-28
SLIDE 28

Coreness, k-core

Largest subgraph of connected nodes in which each node has a degree of at least k

[Figure: 1-core, 2-core, 3-core and 4-core subgraphs]

Rank of node: combination of degree and centrality
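A k-core can be found by repeatedly peeling off nodes whose remaining degree is below k. The sketch below shows this on a hypothetical edge list; the function name `k_core` is an assumption, and the result may consist of several components in general.

```python
# Hypothetical undirected example graph (not the one pictured on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6), (6, 7)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adj, k):
    """Nodes that survive after peeling off all nodes of degree below k."""
    remaining = {v: set(neighbours) for v, neighbours in adj.items()}
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if len(remaining[v]) < k:
                # Removing v lowers the degree of its remaining neighbours.
                for u in remaining[v]:
                    remaining[u].discard(v)
                del remaining[v]
                changed = True
    return set(remaining)


adj = build_adjacency(EDGES)
print(sorted(k_core(adj, 1)))  # [1, 2, 3, 4, 5, 6, 7]
print(sorted(k_core(adj, 2)))  # [1, 2, 3, 4, 5]
print(sorted(k_core(adj, 3)))  # []
```

The coreness of a node is the largest k for which it is still part of the k-core: 2 for nodes 1 to 5 and 1 for nodes 6 and 7 in this example.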

slide-29
SLIDE 29

Comparison of Centrality Measurements

High Degree
  Low Closeness: Key player tied to important/active alters
  Low Betweenness: Ego's connections are redundant - communication bypasses him/her

High Closeness
  Low Degree: Key player tied to important/active alters
  Low Betweenness: Probably multiple paths in the network, ego is near many people, but so are many others

High Betweenness
  Low Degree: Ego's few ties are crucial for network flow
  Low Closeness: Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.
slide-30
SLIDE 30

Applications of SNA

slide-31
SLIDE 31

SNA

slide-32
SLIDE 32

Facebook

slide-33
SLIDE 33
slide-34
SLIDE 34

Social Network Analysis of Terrorist Networks

Two initial suspects linked to al-Qaeda

slide-35
SLIDE 35

Social Network Analysis of Terrorist Networks

Direct links to original suspects

slide-36
SLIDE 36

Social Network Analysis of Terrorist Networks

Indirect links to original suspects

slide-37
SLIDE 37

Social Network Analysis of Terrorist Networks

Mohammed Atta discovered to be local leader

slide-38
SLIDE 38

Page Rank

slide-39
SLIDE 39

SNA in the Enterprise

slide-40
SLIDE 40

Different possibilities to model a graph

slide-41
SLIDE 41
  • Time
slide-42
SLIDE 42
  • Weight

http://irishbrentgoose.blogspot.ch/2012/07/social-networks-revisited.html

slide-43
SLIDE 43
  • Directed
slide-44
SLIDE 44

One Mode Two Mode

slide-45
SLIDE 45

Our Model of the BitTorrent Network

slide-46
SLIDE 46

Our model of the BitTorrent Network

slide-47
SLIDE 47

Our model

slide-48
SLIDE 48

Our model

slide-49
SLIDE 49

Our model

slide-50
SLIDE 50

Random graph according to our model

slide-51
SLIDE 51

Random graph

slide-52
SLIDE 52

Our model

  • Directed
  • One mode
  • Edge = unspecified number of chunks of a known file

  • Nodes = Peers

Possible enhancements

  • weighted
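The model can be sketched as a directed, one-mode graph built from observed transfers. Everything below is an assumption for illustration: the peer names, the `TRANSFERS` list and the idea of using the number of observed transfers as the weight for the weighted enhancement.

```python
from collections import defaultdict

# Hypothetical observations: (uploader, downloader) for chunks of one file.
TRANSFERS = [
    ("peer_a", "peer_b"),
    ("peer_a", "peer_c"),
    ("peer_b", "peer_c"),
    ("peer_c", "peer_a"),
    ("peer_a", "peer_b"),
    ("peer_b", "peer_d"),
]

out_edges = defaultdict(set)  # uploader -> peers it sent chunks to
in_edges = defaultdict(set)   # downloader -> peers it received chunks from
weight = defaultdict(int)     # possible enhancement: weighted edges

for uploader, downloader in TRANSFERS:
    # Directed edge: an unspecified number of chunks went this way.
    out_edges[uploader].add(downloader)
    in_edges[downloader].add(uploader)
    weight[(uploader, downloader)] += 1

peers = sorted(set(out_edges) | set(in_edges))
out_degree = {p: len(out_edges.get(p, ())) for p in peers}
in_degree = {p: len(in_edges.get(p, ())) for p in peers}

print(out_degree)  # {'peer_a': 2, 'peer_b': 2, 'peer_c': 1, 'peer_d': 0}
print(in_degree)   # {'peer_a': 1, 'peer_b': 1, 'peer_c': 2, 'peer_d': 1}
print(weight[("peer_a", "peer_b")])  # 2
```

In such a directed graph, a peer with incoming but no outgoing edges (peer_d here) has only downloaded so far.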
slide-53
SLIDE 53

Some interpretations of the measurements

slide-54
SLIDE 54

Interpretation of the measurements

  • Degree Centrality
  • Closeness Centrality
  • Betweenness Centrality
  • Eigenvector Centrality
  • Clustering Coefficient
slide-55
SLIDE 55

Interpretation of the measurements

  • Degree Centrality
  • Closeness Centrality
  • Betweenness Centrality
  • Eigenvector Centrality
  • Clustering Coefficient
slide-56
SLIDE 56

Interpretation of the measurements

  • Degree Centrality
  • Closeness Centrality
  • Betweenness Centrality
  • Eigenvector Centrality
  • Clustering Coefficient
slide-57
SLIDE 57

Interpretation of the measurements

  • Degree Centrality
  • Closeness Centrality
  • Betweenness Centrality
  • Eigenvector Centrality
  • Clustering Coefficient
slide-58
SLIDE 58

Interpretation of the measurements

  • Degree Centrality
  • Closeness Centrality
  • Betweenness Centrality
  • Eigenvector Centrality
  • Clustering Coefficient
slide-59
SLIDE 59

Optimization

slide-60
SLIDE 60

Optimization

  • Performance
  • Tracker Localized Algorithm
  • Piecepicker Localized Algorithm
  • Friend list approach
slide-61
SLIDE 61

Optimization

  • System Integrity
slide-62
SLIDE 62

Optimization

  • Free riding
slide-63
SLIDE 63

Conclusion

  • BitTorrent is not the best P2P system to apply SNA to because of the role of the tracker
  • Random nodes are returned by the tracker
  • BitTorrent is already a better system than the early P2P file sharing systems like Gnutella
  • SNA is a very powerful instrument for gaining insight into structures that are hard to see
  • Many measurements depend on the graph model

slide-64
SLIDE 64

Questions?

slide-65
SLIDE 65

Discussion

slide-66
SLIDE 66

Which (if any) P2P systems do you use and why? Did you experience problems such as free-riding?

slide-67
SLIDE 67

What do you think about free riding in BitTorrent? Is it ok to only consume and not contribute?

slide-68
SLIDE 68

Do you see weaknesses in how we modeled the graph of the file distribution system in BitTorrent? What would you change?

slide-69
SLIDE 69

As we heard from Benjamin, SNA might be used to enhance the social network in enterprises, e.g. by adding new edges. Do you see problems with that?

slide-70
SLIDE 70

Do you think that the application of SNA adds to or diminishes the value of private usage of Facebook?

slide-71
SLIDE 71

What is your opinion about SNA/information gathering in Facebook, Google+, etc.? How far is it allowed to go?

slide-72
SLIDE 72

Friend count is basically a degree measure in Facebook. Do you also see a use for closeness or betweenness centrality? Why or why not?

slide-73
SLIDE 73

Since SNA deals with networks of relations: can you think of other applications for SNA, outside the field of social life?