8. Network Analysis, December 8, 2019

CAI: Cerca i Anàlisi d’Informació Grau en Ciència i Enginyeria de Dades, UPC 8. Network Analysis December 8, 2019 Slides by Marta Arias, José Luis Balcázar, Ramon Ferrer-i-Cancho, Ricard Gavaldà, Department of Computer Science, UPC 1 / 75

Contents 8. Network Analysis Examples of complex networks Small-world networks and mathematical models Centrality measures Communities in networks Spreading in networks 2 / 75

Examples of complex networks ◮ Social networks ◮ Information networks ◮ Technological networks ◮ Biological networks ◮ The Web 3 / 75

Social networks Links denote social “interactions” ◮ friendship, collaborations, e-mail, etc. 4 / 75

Information networks Nodes store information, links associate information ◮ citation networks, the web, p2p networks, etc. 5 / 75

Technological networks Man-built for the distribution of a commodity ◮ telephone networks, power grids, transportation networks, etc. 6 / 75

Biological networks Represent biological systems ◮ protein-protein interaction networks, gene regulation networks, metabolic pathways, etc. 7 / 75

Representing networks ◮ Network ≡ Graph ◮ Networks are just collections of “points” joined by “lines”
points     lines              field
vertices   edges, arcs        math
nodes      links              computer science
sites      bonds              physics
actors     ties, relations    sociology
8 / 75

Types of networks From [Newman 2003] (a) unweighted, undirected (b) discrete vertex and edge types, undirected (c) varying vertex and edge weights, undirected (d) directed 9 / 75

Three common properties 1. A friend of a friend is also frequently a friend 2. There are very short paths among most pairs of nodes: “Only 6 hops separate any two people in the world” 3. Degree distribution follows a power law. Properties 1 and 2 together are often called the small-world property. 10 / 75

Measuring the small-world phenomenon, I ◮ $d_{ij}$ = length of the shortest path from i to j ◮ To discuss “every two people are 6 hops away” we use: ◮ The diameter (the longest shortest-path distance): $d = \max_{i,j} d_{ij}$ ◮ The average shortest-path length: $l = \frac{2}{n(n+1)} \sum_{i>j} d_{ij}$ ◮ The effective diameter: the smallest d such that 95% of the $d_{ij}$ are ≤ d 11 / 75
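
The three measures above can be sketched with a breadth-first search from every node. This is a minimal illustration, assuming an unweighted, undirected, connected graph stored as an adjacency dict; the function names and the `quantile` parameter are hypothetical, not part of the slides.

```python
import math
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_stats(adj, quantile=0.95):
    """Return (diameter, average distance, effective diameter).

    Assumes the graph is connected and has at least two nodes.
    """
    nodes = sorted(adj)
    n = len(nodes)
    dists = []
    for i in nodes:
        d = bfs_distances(adj, i)
        dists.extend(d[j] for j in nodes if j < i)  # each pair i > j once
    dists.sort()
    diameter = dists[-1]
    average = 2 * sum(dists) / (n * (n + 1))  # normalisation used on the slide
    # smallest d such that at least `quantile` of the pairwise distances are <= d
    effective = dists[math.ceil(quantile * len(dists)) - 1]
    return diameter, average, effective

# Path graph 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(distance_stats(path))
```

Running one BFS per node costs O(n(n + m)) time, which is fine for small graphs; large networks usually need sampling of source nodes.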

From [Newman 2003] z=avg degree; l=avg distance; α =exponent of degree powerlaw; C 1 , C 2 : clustering coefficients 12 / 75

Is this surprising? Should we expect this in a random network? It depends on what you mean by random network 13 / 75

The (basic) random graph model a.k.a. ER model Basic $G_{n,p}$ Erdős-Rényi random graph model: ◮ parameter n is the number of vertices ◮ parameter p is a probability, 0 ≤ p ≤ 1 ◮ Generate each edge (i, j) independently at random with probability p 14 / 75
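
The generation rule above translates almost directly into code. A minimal sketch, assuming an undirected graph without self-loops; the function name and the `seed` parameter are illustrative choices, not from the slides.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): every unordered pair becomes an edge with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

g = erdos_renyi(10, 0.3, seed=42)
edges = sum(len(nbrs) for nbrs in g.values()) // 2
print(edges, "edges out of", 10 * 9 // 2, "possible")
```

The loop visits all n(n − 1)/2 pairs, so this direct version is quadratic in n.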

Measuring the diameter in ER networks Want to show that the diameter in ER networks is small ◮ Let the average degree be z ◮ At distance l, we can reach about $z^l$ nodes ◮ Setting $z^l = n$ gives $l = \log n / \log z$, the distance at which all n nodes are reached ◮ So, the diameter is (roughly) O(log n) 15 / 75

ER networks have small diameter As shown by the following simulation 16 / 75
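
The simulation figure is not reproduced in this transcript. A minimal sketch of such an experiment, under assumptions of our own: graphs of growing size n with a fixed mean degree z, distances measured by BFS from a few sampled sources, and the estimate log n / log z reported for comparison. All names and default values are hypothetical.

```python
import math
import random
from collections import deque

def er_graph(n, z, rng):
    """G(n, p) with p = z / (n - 1), so the mean degree is about z."""
    p = z / (n - 1)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def longest_and_mean(adj, source):
    """BFS from source: (largest, mean) distance over the nodes it reaches."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for node, d in dist.items() if node != source]
    if not others:  # isolated source
        return 0, 0.0
    return max(others), sum(others) / len(others)

def simulate(sizes=(100, 200, 400, 800), z=8, samples=20, seed=0):
    """One row per size: (n, longest distance seen, mean distance, log n / log z)."""
    rng = random.Random(seed)
    rows = []
    for n in sizes:
        adj = er_graph(n, z, rng)
        results = [longest_and_mean(adj, s) for s in rng.sample(range(n), samples)]
        longest = max(r[0] for r in results)
        mean = sum(r[1] for r in results) / samples
        rows.append((n, longest, mean, math.log(n) / math.log(z)))
    return rows

for row in simulate():
    print(row)
```

Sampling sources gives only a lower bound on the true diameter; it is used here to keep the experiment fast.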

Measuring the small-world phenomenon, II ◮ To check whether “the friend of a friend is also frequently a friend”, we use: ◮ The transitivity or clustering coefficient, which basically measures the probability that two of my friends are also friends 17 / 75

Global clustering coefficient $C = \frac{3 \times \text{number of triangles}}{\text{number of connected triples}}$ Example: $C = \frac{3 \times 1}{8} = 0.375$ 18 / 75
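
A minimal sketch of this computation. The slide's figure is not reproduced in this transcript, so the 5-node graph below is an assumption, chosen because it has exactly 1 triangle and 8 connected triples as in the example; the function name is illustrative.

```python
from itertools import combinations

def global_clustering(adj):
    """3 * triangles / connected triples, for an undirected simple graph."""
    triples = 0  # paths u - center - v
    closed = 0   # triples whose end points u and v are also linked
    for center, nbrs in adj.items():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    # every triangle is counted once per corner, so closed == 3 * triangles
    return closed / triples if triples else 0.0

# Assumed example graph: edges 1-2, 1-3, 2-3, 3-4, 3-5
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3}, 5: {3}}
print(global_clustering(graph))  # 0.375
```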

Local clustering coefficient ◮ For each vertex i, let $n_i$ be the number of neighbors of i ◮ Let $C_i$ be the fraction of pairs of neighbors of i that are connected to each other: $C_i = \frac{\text{nr. of connections between } i\text{'s neighbors}}{\frac{1}{2} n_i (n_i - 1)}$ ◮ Finally, average $C_i$ over all nodes i in the network: $C = \frac{1}{n} \sum_i C_i$ 19 / 75

Local clustering coefficient example ◮ $C_1 = C_2 = 1/1$ ◮ $C_3 = 1/6$ ◮ $C_4 = C_5 = 0$ ◮ $C = \frac{1}{5}(1 + 1 + 1/6) = 13/30 = 0.433$ 20 / 75
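
The example values can be checked with a short sketch. The figure is missing from this transcript, so the graph below is an assumption: a 5-node graph whose local coefficients match the ones listed. Treating nodes with fewer than two neighbors as having coefficient 0 is also an assumption, consistent with $C_4 = C_5 = 0$.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of i's neighbors that are themselves connected."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here for degree 0 or 1
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local coefficients over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

# Assumed example graph: edges 1-2, 1-3, 2-3, 3-4, 3-5
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3}, 5: {3}}
for i in graph:
    print(i, local_clustering(graph, i))
print(round(average_clustering(graph), 3))  # 0.433
```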

From [Newman 2003] z=avg degree; l=avg distance; α =exponent of degree powerlaw; C 1 , C 2 : clustering coefficients 21 / 75

ER networks do not show transitivity ◮ In ER networks, C = p, since each edge is added independently ◮ In many real networks, C ≫ p ◮ where p is estimated as $|E| / (n(n-1)/2)$ 22 / 75

ER networks do not show transitivity 23 / 75

So ER networks do not have high clustering, but... ◮ Other “random network” models generate graphs with low diameter and high clustering coefficient ◮ The Watts-Strogatz model is an example 24 / 75

The Watts-Strogatz model ◮ Start with all n vertices arranged on a ring ◮ Each vertex initially has 4 connections, to its closest nodes on the ring ◮ With probability p, rewire each local connection to a random vertex 25 / 75

The Watts-Strogatz model For an appropriate value of p ≈ 0.01 (1%), the model achieves high clustering and small diameter 26 / 75
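
A minimal sketch of the construction. The slides fix 4 initial connections per vertex; the parameter `k` generalizing that number, the function name, and the rewiring details (keep one end of the edge, avoid self-loops and duplicate edges) are assumptions made for illustration.

```python
import random

def watts_strogatz(n, p, k=4, seed=None):
    """Ring of n nodes, each linked to its k nearest neighbors (k even, n > k),
    followed by random rewiring of each ring edge with probability p."""
    rng = random.Random(seed)
    half = k // 2
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, half + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    for step in range(1, half + 1):
        for i in range(n):
            if rng.random() >= p:
                continue  # keep this ring edge
            j = (i + step) % n
            # new end point: not i itself and not already a neighbor of i
            candidates = [v for v in range(n) if v != i and v not in adj[i]]
            if candidates:
                new = rng.choice(candidates)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj

g = watts_strogatz(100, 0.01, seed=1)
print(sum(len(nbrs) for nbrs in g.values()) // 2, "edges")
```

Rewiring moves edges but never adds or removes any, so the graph always keeps n·k/2 edges.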

Degree distribution Histogram of nr of nodes having a particular degree f k = fraction of nodes of degree k 27 / 75
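
Computing $f_k$ from a graph is a matter of counting degrees. A minimal sketch, assuming an adjacency dict; the function name is illustrative.

```python
from collections import Counter

def degree_distribution(adj):
    """Map each degree k to f_k, the fraction of nodes with degree k."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}

# Star with one center and four leaves
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(degree_distribution(star))  # {1: 0.8, 4: 0.2}
```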

Degree distribution The degree distribution of most real-world networks follows a power-law distribution $f_k = c k^{-\alpha}$ ◮ “heavy-tail” distribution, implies existence of hubs ◮ hubs are nodes with very high degree 28 / 75

Scale-free or scale-invariant Networks with power-law degree distribution are often called scale-free or scale-invariant. ◮ D is scale-invariant if $D(\lambda x) = f(\lambda) D(x)$ ◮ True for power-law degree distributions (x = #links) ◮ For non-power-laws, the $f(\lambda)$ instead depends on x ◮ This means no characteristic scale or “units of measure” For “growing” networks, it implies that the statistics remain similar as the network grows: fractality etc. 29 / 75
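
The scale-invariance condition can be verified directly for a power law. The exponential counterexample is our own choice for illustration and does not appear in the slides.

```latex
% Power law: the rescaling factor depends only on \lambda
D(x) = c\,x^{-\alpha}
\;\Rightarrow\;
D(\lambda x) = c\,(\lambda x)^{-\alpha} = \lambda^{-\alpha}\,D(x),
\qquad f(\lambda) = \lambda^{-\alpha}.

% Counterexample (assumed for illustration): exponential decay
D(x) = e^{-x}
\;\Rightarrow\;
\frac{D(\lambda x)}{D(x)} = e^{-(\lambda - 1)x},
\quad \text{which depends on } x.
```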

ER random networks are not scale-free! For ER random networks, the degree distribution follows the binomial distribution (or Poisson if n is large): $f_k = \binom{n}{k} p^k (1-p)^{n-k} \approx \frac{z^k e^{-z}}{k!}$ ◮ where $z = p(n-1)$ is the mean degree ◮ The probability of nodes with very large degree becomes exponentially small ◮ Maximum degree is $pn + O(\sqrt{pn})$ with high probability ◮ so no hubs 30 / 75
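
The quality of the Poisson approximation, and how quickly the tail vanishes, can be checked numerically. A minimal sketch; the values n = 1000 and p = 0.005 are arbitrary choices for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Binomial form of f_k as written on the slide."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, z):
    """Poisson approximation with mean degree z."""
    return z**k * math.exp(-z) / math.factorial(k)

n, p = 1000, 0.005      # illustrative values
z = p * (n - 1)         # mean degree
for k in (0, 2, 5, 10, 20, 30):
    print(k, binomial_pmf(k, n, p), poisson_pmf(k, z))
```

Both columns fall off faster than any power of k, which is why very high-degree nodes are practically absent in this model.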

So ER networks are not scale-free, but... ◮ One can build “random graph” models that are ◮ Barabási-Albert “preferential attachment” 31 / 75

Preferential attachment ◮ “Rich get richer” dynamics ◮ The more someone has, the more she is likely to have ◮ Examples ◮ the more friends you have, the easier it is to make new ones ◮ the more business a firm has, the easier it is to win more ◮ the more people there are at a restaurant, the more who want to go 32 / 75

Barabási-Albert model From [Barabasi 1999] ◮ “Growth” model ◮ The model controls how a network grows over time ◮ Uses preferential attachment as a guide to grow the network ◮ new nodes prefer to attach to well-connected nodes ◮ (Simplified) process: ◮ the process starts with some initial subgraph ◮ each new node comes in with m edges ◮ probability of connecting to existing node i is proportional to i ’s degree ◮ results in a power-law degree distribution with exponent α = 3 33 / 75
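
The simplified process can be sketched as follows. The initial subgraph is not specified in the slides; a star with m leaves is assumed here. Degree-proportional sampling is done by drawing uniformly from a list in which each node appears once per incident edge. Redrawing until m distinct targets are found is a simplification that slightly distorts exact proportionality when m > 1.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches m edges to existing
    nodes, chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    # assumed initial subgraph: star on nodes 0..m with center 0
    edges = [(0, j) for j in range(1, m + 1)]
    # every node appears in `stubs` once per incident edge
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct existing nodes
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = barabasi_albert(1000, 1, seed=0)
print(len(edges))  # 999
```

With m = 1 this gives 1000 nodes and 999 edges, the same size as the ER vs. BA experiment on the next slide.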

ER vs. BA Experiment with 1000 nodes, 999 edges ($m_0 = 1$ in the BA model). [Figure: two panels, “random” and “preferential attachment”] 34 / 75

The Web . . . is different. “Bowtie” structure [The web is a bow tie. Nature 405, 113 (2000) doi:10.1038/35012155] https://en.wikipedia.org/wiki/Topology_of_the_World_Wide_Web http://cs.wellesley.edu/~pmetaxas/Why_Is_the_Shape_of_the_Web_a_Bowtie.pdf 35 / 75

Centrality in Networks Centrality is a node’s measure w.r.t. others ◮ A central node is important and/or powerful ◮ A central node has an influential position in the network ◮ A central node has an advantageous position in the network 36 / 75

Degree centrality Power through connections First approximation: Centrality ≃ number of connections Normalize by maximum possible number of connections to put it in [0,1] But look at these examples, does degree centrality look OK to you? 37 / 75
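
A minimal sketch of normalized degree centrality, assuming an undirected simple graph with at least two nodes, so that the maximum possible number of connections is n − 1. The function name is illustrative.

```python
def degree_centrality(adj):
    """Degree of each node divided by n - 1, giving values in [0, 1]."""
    n = len(adj)
    return {i: len(nbrs) / (n - 1) for i, nbrs in adj.items()}

# Star with one center and four leaves
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(degree_centrality(star))  # {0: 1.0, 1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
```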

Closeness centrality Power through proximity to others $\text{closeness\_centrality}(i) \overset{\text{def}}{=} \left( \frac{\sum_{j \neq i} d(i,j)}{n-1} \right)^{-1} = \frac{n-1}{\sum_{j \neq i} d(i,j)}$ Here, what matters is to be close to everybody else, i.e., to be easily reachable or have the power to quickly reach others. 38 / 75
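
A minimal sketch of this formula, assuming an unweighted, undirected, connected graph with at least two nodes (in a disconnected graph some d(i, j) are undefined, a case not handled here). The function name is illustrative.

```python
from collections import deque

def closeness_centrality(adj, i):
    """(n - 1) divided by the sum of hop distances from i to all other nodes."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())

# Path graph 0 - 1 - 2: the middle node is closest to everybody
path = {0: [1], 1: [0, 2], 2: [1]}
print(closeness_centrality(path, 1))  # 1.0
print(closeness_centrality(path, 0))
```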

Betweenness centrality Power through brokerage A node is important if it lies in many shortest-paths ◮ so it is essential in passing information through the network 39 / 75

Betweenness centrality Power through brokerage $\text{betweenness\_centrality}(i) \overset{\text{def}}{=} \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}$ where ◮ $g_{jk}$ is the number of shortest paths between j and k, and ◮ $g_{jk}(i)$ is the number of those shortest paths that pass through i Oftentimes it is normalized: $\text{norm\_betweenness\_centrality}(i) \overset{\text{def}}{=} \frac{\text{betweenness\_centrality}(i)}{\binom{n-1}{2}}$ 40 / 75
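
A direct, unoptimized sketch of the definition. It relies on the fact that a shortest path from j to k passes through i exactly when d(j, i) + d(i, k) = d(j, k), in which case $g_{jk}(i)$ is the number of shortest j-i paths times the number of shortest i-k paths. Restricting the sum to pairs with j, k ≠ i is an assumption about the slide's intent; the function names are illustrative, and normalization requires n ≥ 3.

```python
from collections import deque
from itertools import combinations
from math import comb

def shortest_path_counts(adj, s):
    """BFS from s: hop distances and number of shortest paths to each node."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness_centrality(adj, normalized=False):
    """Sum over pairs j < k (both different from i) of g_jk(i) / g_jk."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    n = len(adj)
    result = {}
    for i in adj:
        dist_i, sigma_i = info[i]
        total = 0.0
        for j, k in combinations([v for v in adj if v != i], 2):
            dist_j, sigma_j = info[j]
            if i not in dist_j or k not in dist_j:
                continue  # unreachable pair contributes nothing
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
        result[i] = total / comb(n - 1, 2) if normalized else total
    return result

# Path graph 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(betweenness_centrality(path))  # {0: 0.0, 1: 2.0, 2: 2.0, 3: 0.0}
```

This version costs O(n³) time after the BFS passes; Brandes' algorithm computes the same quantity more efficiently on large graphs.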

Betweenness centrality Examples (non-normalized) 41 / 75

Communities 42 / 75
