Degrees, Power Laws and Popularity Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ February 13, 2020 Network Science Analytics Degrees, Power Laws and Popularity 1

Degree distributions
◮ Degree distributions
◮ Power-law degree distributions
◮ Visualizing and fitting power laws
◮ Popularity and preferential attachment

Descriptive analysis of network characteristics
◮ Given a network graph representation of a complex system
  ⇒ Structural properties of $G$ key to system-level understanding
Example
◮ Q1: Underpinning of various types of basic social dynamics?
  A: Study vertex triplets (triads) and patterns of ties among them
◮ Q2: How can we formalize the notion of ‘importance’ in a network?
  A: Define measures of individual vertex (or group) centrality
◮ Q3: Can we identify communities and cohesive subgroups?
  A: Formulate as a graph partitioning (clustering) problem
◮ Characterization of individual vertices/edges and network cohesion
◮ Social network analysis, math, computer science, statistical physics

Degree
◮ Def: The degree $d_v$ of vertex $v$ is its number of incident edges
  ⇒ Degree sequence arranges degrees in non-decreasing order
[Figure: example graph on six vertices, with vertex degrees shown in red]
◮ In figure ⇒ Vertex degrees shown in red, e.g., $d_1 = 2$ and $d_5 = 3$
  ⇒ Graph’s degree sequence is 2,2,2,3,3,4
◮ In general, the degree sequence does not uniquely specify the graph
◮ High-degree vertices are likely to be influential, central, prominent
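The definition of degree and degree sequence can be sketched in a few lines of Python. The edge list below is hypothetical: it is not the graph in the slide's figure, only one graph chosen to match the quoted values $d_1 = 2$, $d_5 = 3$ and the degree sequence 2,2,2,3,3,4.

```python
from collections import defaultdict

# Hypothetical undirected edge list (an assumption, not the slide's figure),
# chosen so that d_1 = 2, d_5 = 3 and the degree sequence is 2,2,2,3,3,4.
edges = [(1, 2), (1, 3), (2, 3), (2, 5), (2, 6), (3, 4), (4, 5), (5, 6)]

def degrees(edge_list):
    """Return {vertex: degree} for an undirected simple graph."""
    deg = defaultdict(int)
    for u, v in edge_list:
        deg[u] += 1  # each edge contributes one to both endpoints
        deg[v] += 1
    return dict(deg)

deg = degrees(edges)
degree_sequence = sorted(deg.values())  # non-decreasing order

print(deg[1], deg[5])
print(degree_sequence)
```

Since each edge is counted at both endpoints, the degrees always sum to twice the number of edges, which gives a quick sanity check on any edge list.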

Degree distribution
◮ Let $N(d)$ denote the number of vertices with degree $d$
  ⇒ Fraction of vertices with degree $d$ is $P(d) := \frac{N(d)}{N_v}$
◮ Def: The collection $\{P(d)\}_{d \geq 0}$ is the degree distribution of $G$
◮ Histogram formed from the degree sequence (bins of size one)
[Figure: histogram of $P(d)$ versus $d$]
◮ $P(d)$ = probability that randomly chosen node has degree $d$
  ⇒ Summarizes the local connectivity in the network graph
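As a minimal sketch of the definition $P(d) = N(d)/N_v$, the snippet below builds the degree distribution from the degree sequence 2,2,2,3,3,4 quoted on the Degree slide. The function name `degree_distribution` is illustrative only.

```python
from collections import Counter

# Degree sequence quoted on the Degree slide.
degree_sequence = [2, 2, 2, 3, 3, 4]

def degree_distribution(seq):
    """Return {d: P(d)} with P(d) = N(d) / N_v (histogram with unit bins)."""
    n_v = len(seq)         # number of vertices N_v
    counts = Counter(seq)  # N(d): number of vertices with degree d
    return {d: counts[d] / n_v for d in sorted(counts)}

P = degree_distribution(degree_sequence)
for d, prob in P.items():
    print(d, round(prob, 3))
```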

Joint degree distribution
◮ Q: What about patterns of association among nodes of given degrees?
◮ A: Define the two-dimensional analogue of a degree distribution
[Figure: joint degree distributions for the router-level Internet (left) and a protein interaction network (right); both axes are $\log_2(\text{Degree})$]
◮ Prob. of random edge having incident vertices with degrees $(d_1, d_2)$
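One way to estimate this two-dimensional distribution from data is sketched below. It assumes an undirected graph and counts every edge in both orientations so that the result is symmetric in $(d_1, d_2)$; that symmetrization is a convention chosen here, not something stated on the slide. The edge list is hypothetical.

```python
from collections import Counter

# Hypothetical undirected edge list, used only for illustration.
edges = [(1, 2), (1, 3), (2, 3), (2, 5), (2, 6), (3, 4), (4, 5), (5, 6)]

def joint_degree_distribution(edge_list):
    """Return {(d1, d2): fraction of edge ends} for an undirected graph."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    pairs = Counter()
    for u, v in edge_list:
        # Count both orientations so the distribution is symmetric.
        pairs[(deg[u], deg[v])] += 1
        pairs[(deg[v], deg[u])] += 1
    total = 2 * len(edge_list)
    return {key: count / total for key, count in pairs.items()}

J = joint_degree_distribution(edges)
for (d1, d2), prob in sorted(J.items()):
    print(d1, d2, round(prob, 4))
```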

A simple random graph model
◮ Def: The Erdős-Rényi random graph model $G_{n,p}$
  ◮ Undirected graph with $n$ vertices, i.e., of order $N_v = n$
  ◮ Edge $(u, v)$ present with probability $p$, independent of other edges
◮ Simulation is easy: draw $\binom{n}{2}$ i.i.d. Bernoulli($p$) RVs
Example
◮ Three realizations of $G_{10,1/6}$. The size $N_e$ is a random variable
[Figure: three realizations of $G_{10,1/6}$]
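The simulation recipe (one Bernoulli($p$) draw per vertex pair) can be sketched as follows. The function name `sample_gnp` and the seed are arbitrary choices made for this illustration.

```python
import itertools
import random

def sample_gnp(n, p, rng):
    """Draw one realization of G_{n,p} and return its edge list.

    One independent Bernoulli(p) draw is made for each of the
    n-choose-2 unordered vertex pairs.
    """
    return [(u, v)
            for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]

rng = random.Random(0)  # fixed seed (arbitrary) for reproducibility
for _ in range(3):
    edges = sample_gnp(10, 1 / 6, rng)
    print(len(edges))   # the size N_e varies across realizations
```

For $n = 10$ there are $\binom{10}{2} = 45$ candidate edges, so with $p = 1/6$ the expected size is $45/6 = 7.5$.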

Degree distribution of $G_{n,p}$
◮ Q: Degree distribution $P(d)$ of the Erdős-Rényi graph $G_{n,p}$?
◮ Define $\mathbb{I}\{(v,u)\} = 1$ if $(v,u) \in E$, and $\mathbb{I}\{(v,u)\} = 0$ otherwise.
  ⇒ Fix $v$. For all $u \neq v$, the indicator RVs are i.i.d. Bernoulli($p$)
◮ Let $D_v$ be the (random) degree of vertex $v$. Hence,
  $$D_v = \sum_{u \neq v} \mathbb{I}\{(v,u)\}$$
  ⇒ $D_v$ is binomial with parameters $(n-1, p)$ and
  $$P(d) = P(D_v = d) = \binom{n-1}{d} p^d (1-p)^{(n-1)-d}$$
◮ In words, the probability of having exactly $d$ edges incident to $v$
  ⇒ Same for all $v \in V$, since edges are i.i.d. in the $G_{n,p}$ model
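A quick numerical check of the binomial PMF above: the probabilities should sum to one and the mean should equal $(n-1)p$. The values $n = 10$, $p = 1/6$ match the earlier example; the function name is illustrative.

```python
from math import comb

def gnp_degree_pmf(n, p):
    """Return [P(0), ..., P(n-1)] for the degree of a vertex in G_{n,p}."""
    m = n - 1  # number of potential neighbors of a fixed vertex
    return [comb(m, d) * p**d * (1 - p)**(m - d) for d in range(m + 1)]

n, p = 10, 1 / 6
pmf = gnp_degree_pmf(n, p)
mean_degree = sum(d * prob for d, prob in enumerate(pmf))

print(round(sum(pmf), 6))      # total probability
print(round(mean_degree, 6))   # should match (n - 1) * p
```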

Behavior for large $n$
◮ Q: What does the degree distribution look like for a large network?
◮ Recall $D_v$ is a sum of $n-1$ i.i.d. Bernoulli($p$) RVs
  ⇒ Central Limit Theorem: $D_v \sim \mathcal{N}(np, np(1-p))$ for large $n$
[Figure: left panel, binomial PMFs for $p = 0.5$ and $n = 20, 40, 60$; right panel, Binomial(20,1/2), Binomial(60,1/6) and Poisson(10) PMFs; axes are $P(d)$ versus $d$]
◮ Makes most sense to increase $n$ with fixed $E[D_v] = (n-1)p = \mu$
  ⇒ Law of rare events: $D_v \sim$ Poisson($\mu$) for large $n$

Law of rare events
◮ Substituting $p = \mu/n$ in the binomial PMF yields
  $$P_n(d) = \frac{n!}{(n-d)!\,d!} \left(\frac{\mu}{n}\right)^d \left(1 - \frac{\mu}{n}\right)^{n-d} = \frac{n(n-1)\cdots(n-d+1)}{n^d} \, \frac{\mu^d}{d!} \, \frac{(1-\mu/n)^n}{(1-\mu/n)^d}$$
◮ In the limit, the red term $(1-\mu/n)^n$ is $\lim_{n\to\infty} (1-\mu/n)^n = e^{-\mu}$
◮ The black and blue terms, $\frac{n(n-1)\cdots(n-d+1)}{n^d}$ and $(1-\mu/n)^d$, converge to 1. Limit is the Poisson PMF
  $$\lim_{n\to\infty} P_n(d) = 1 \cdot \frac{\mu^d}{d!} \cdot \frac{e^{-\mu}}{1} = e^{-\mu} \frac{\mu^d}{d!}$$
◮ Approximation usually called “law of rare events”
  ◮ Individual edges happen with small probability $p = \mu/n$
  ◮ The aggregate (degree, number of edges), though, need not be rare
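The convergence can also be checked numerically. The sketch below uses the slide's substitution $p = \mu/n$ and reports the largest pointwise gap between the Binomial($n$, $\mu/n$) and Poisson($\mu$) PMFs over $d = 0, \dots, 20$; the value $\mu = 10$ and the range of $d$ are arbitrary choices for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, d):
    """P(D = d) for D ~ Binomial(n, p)."""
    return comb(n, d) * p**d * (1 - p)**(n - d)

def poisson_pmf(mu, d):
    """P(D = d) for D ~ Poisson(mu)."""
    return exp(-mu) * mu**d / factorial(d)

def max_gap(n, mu, d_max=20):
    """Largest |Binomial(n, mu/n) - Poisson(mu)| PMF gap over d = 0..d_max."""
    p = mu / n
    return max(abs(binomial_pmf(n, p, d) - poisson_pmf(mu, d))
               for d in range(d_max + 1))

mu = 10.0
for n in (20, 100, 10_000):
    print(n, max_gap(n, mu))  # the gap shrinks as n grows with mu fixed
```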

The $G_{n,p}$ model and real-world networks
◮ For large graphs, $G_{n,p}$ suggests $P(d)$ with an exponential tail
  ⇒ Unlikely to see degrees spanning several orders of magnitude
[Figure: $P(d)$ versus $d$ for $G_{n,p}$ on a linear scale (left) and on a logarithmic scale (right)]
◮ Concentrated distribution around the mean $E[D_v] = (n-1)p$
◮ Q: Is this in agreement with real-world networks?

World Wide Web
◮ Degree distributions of the WWW analyzed in [Broder et al ’00]
  ⇒ The Web is a digraph, so study both in- and out-degree distributions
[Figure: in- and out-degree distributions of the WWW]
◮ Majority of vertices naturally have small degrees
  ⇒ A nontrivial number have degrees that are orders of magnitude higher

Internet autonomous systems
◮ The topology of the AS-level Internet studied in [Faloutsos$^3$ ’99]
◮ Right-skewed degree distributions also found for router-level Internet

Seems to be a structural pattern
◮ More heavy-tailed degree distributions found in [Barabási-Albert ’99]
[Figure: $P(d)$ versus $d$ for three networks: author collaboration, Web graph, power grid]
◮ These heterogeneous, diffuse degree distributions are not exponential

Power laws
◮ Degree distributions
◮ Power-law degree distributions
◮ Visualizing and fitting power laws
◮ Popularity and preferential attachment

Power-law degree distributions
[Figure: two log-log degree distribution plots; axes are $\log_2(\text{Frequency})$ versus $\log_2(\text{Degree})$]
◮ Log-log plots show roughly a linear decay, suggesting the power law
  $$P(d) \propto d^{-\alpha} \;\Rightarrow\; \log P(d) = C - \alpha \log d$$
◮ Power-law exponent (negative slope) is typically $\alpha \in [2, 3]$
◮ Normalization constant $C$ is mostly uninteresting
◮ Power laws often best followed in the tail, i.e., for $d \geq d_{\min}$
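The linear relation $\log P(d) = C - \alpha \log d$ suggests reading $\alpha$ off the slope of a log-log plot. The sketch below does this with an ordinary least-squares line on noise-free synthetic values $P(d) = d^{-\alpha}$, where the slope is recovered exactly. It only illustrates the relation; real, noisy degree data generally needs more care than this naive fit. The value $\alpha = 2.5$ is an arbitrary choice.

```python
from math import log

def loglog_slope(ds, ps):
    """Least-squares slope and intercept of log(p) against log(d)."""
    xs = [log(d) for d in ds]
    ys = [log(p) for p in ps]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

alpha = 2.5                      # assumed exponent for the synthetic data
ds = list(range(1, 1001))
ps = [d ** -alpha for d in ds]   # unnormalized, noise-free power law

slope, intercept = loglog_slope(ds, ps)
print(-slope)                    # estimated exponent
```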

Power law and exponential degree distributions
[Figure: power law $P(d) \sim d^{-2.1}$ versus a Poisson distribution, plotted as $P(d)$ against $d$ on a linear scale (left) and on a log-log scale (right)]
◮ Erdős-Rényi’s Poisson degree distribution exhibits a sharp cutoff
  ⇒ Power laws upper bound exponential tails for large enough $d$

Scale-free networks
◮ Scale-free network: degree distribution with power-law tail
◮ Name motivated by the scale-invariance property of power laws
◮ Def: A scale-free function $f(x)$ satisfies $f(ax) = b f(x)$, for $a, b \in \mathbb{R}$
Example
◮ Power-law functions $f(x) = x^{-\alpha}$ are scale-free since
  $$f(ax) = (ax)^{-\alpha} = a^{-\alpha} f(x) = b f(x), \quad \text{where } b := a^{-\alpha}$$
◮ Exponential functions $f(x) = c^x$ are not scale-free because
  $$f(ax) = c^{ax} = (c^x)^a = f^a(x) \neq b f(x), \quad \text{except when } a = b = 1$$
◮ No ‘characteristic scale’ for the degrees. More soon
  ⇒ Functional form of the distribution is invariant to scale
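The scale-free property can be checked numerically: for a power law the ratio $f(ax)/f(x)$ is the same constant $b = a^{-\alpha}$ at every $x$, while for an exponential it depends on $x$. The parameter values below ($\alpha = 2.5$, $c = 0.5$, $a = 3$) are arbitrary choices for illustration.

```python
def power_law(x, alpha=2.5):
    """f(x) = x^(-alpha)."""
    return x ** -alpha

def exponential(x, c=0.5):
    """f(x) = c^x."""
    return c ** x

a = 3.0                       # rescaling factor
xs = [1.0, 2.0, 5.0, 10.0]    # evaluation points

power_ratios = [power_law(a * x) / power_law(x) for x in xs]
exp_ratios = [exponential(a * x) / exponential(x) for x in xs]

print(power_ratios)  # constant, equal to a ** -2.5 at every x
print(exp_ratios)    # changes with x, so no single b works
```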

Power-law distributions are ubiquitous
◮ Power-law distributions widespread beyond networks [Clauset et al ’07]

Normalization
◮ The power-law degree distribution $P(d) = C d^{-\alpha}$ is a PMF, hence
  $$1 = \sum_{d=0}^{\infty} P(d) = \sum_{d=0}^{\infty} C d^{-\alpha} \;\Rightarrow\; C = \frac{1}{\sum_{d=0}^{\infty} d^{-\alpha}}$$
◮ Often a power law is only valid for the tail $d \geq d_{\min}$, hence
  $$C = \frac{1}{\sum_{d=d_{\min}}^{\infty} d^{-\alpha}} \approx \frac{1}{\int_{d_{\min}}^{\infty} x^{-\alpha}\,dx} = (\alpha - 1)\, d_{\min}^{\alpha-1}$$
  ⇒ Sound approximation since $P(d)$ varies slowly for large $d$
◮ The normalized power-law degree distribution is
  $$P(d) = \frac{\alpha - 1}{d_{\min}} \left(\frac{d}{d_{\min}}\right)^{-\alpha}, \quad d \geq d_{\min}$$
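To get a feel for how good the integral approximation $C \approx (\alpha - 1)\, d_{\min}^{\alpha-1}$ is, the sketch below compares it with a numerical evaluation of $1 / \sum_{d \geq d_{\min}} d^{-\alpha}$. The sum is truncated at `d_max` and the remainder is estimated by an integral, so the "numeric" constant is itself an approximation; $\alpha = 2.5$, the cutoff and the $d_{\min}$ values are arbitrary choices.

```python
def numeric_constant(alpha, d_min, d_max=200_000):
    """Approximate 1 / sum_{d >= d_min} d^(-alpha) by a truncated sum.

    Terms beyond d_max are estimated by the integral of x^(-alpha)
    from d_max + 0.5 to infinity (an assumption of this sketch).
    """
    partial = sum(d ** -alpha for d in range(d_min, d_max + 1))
    tail = (d_max + 0.5) ** (1 - alpha) / (alpha - 1)
    return 1.0 / (partial + tail)

def approx_constant(alpha, d_min):
    """Integral approximation C = (alpha - 1) * d_min^(alpha - 1)."""
    return (alpha - 1) * d_min ** (alpha - 1)

def relative_error(alpha, d_min):
    c_num = numeric_constant(alpha, d_min)
    return abs(approx_constant(alpha, d_min) - c_num) / c_num

alpha = 2.5
for d_min in (1, 10, 100):
    print(d_min, relative_error(alpha, d_min))
```

The relative error drops as $d_{\min}$ grows, consistent with the remark that $P(d)$ varies slowly for large $d$.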
