Probabilistic Foundations of Statistical Network Analysis
Chapter 4: Generative models
Harry Crane
Based on Chapter 4 of Probabilistic Foundations of Statistical Network Analysis (PFSNA)
Book website: http://www.harrycrane.com/networks.html

Table of Contents

Chapter
1 Orientation
2 Binary relational data
3 Network sampling
4 Generative models
5 Statistical modeling paradigm
6 Vertex exchangeability
7 Getting beyond graphons
8 Relative exchangeability
9 Edge exchangeability
10 Relational exchangeability
11 Dynamic network models

Specification of generative models

Sampling models (Chapter 3) are specified by
- candidate distributions describing network variation;
- a sampling scheme that links the population $Y_N$ to the sample $Y_n = \Sigma_{n,N} Y_N$.

Generative models (Chapter 4) are specified by
- candidate distributions;
- a generative scheme to describe network growth.

The generative scheme is described by an evolution map.

Evolution maps (Chapter 4 of PFSNA)

Definition. For $n \le N$, call $P : \{0,1\}^{n \times n} \to \{0,1\}^{N \times N}$ an evolution map if $P(y)|_{[n]} = y$ for all $y \in \{0,1\}^{n \times n}$.

An evolution map is an operation by which $y \in \{0,1\}^{n \times n}$ 'evolves' into $P(y) \in \{0,1\}^{N \times N}$ by holding fixed the part of the network that already exists, namely $y$.

Let $\mathcal{P}_{n,N}$ be the set of all evolution maps $\{0,1\}^{n \times n} \to \{0,1\}^{N \times N}$. A generating scheme is a random map $\Pi_{n,N}$ in $\mathcal{P}_{n,N}$, whose distribution can depend on $Y_n$.

More precisely, $\Pi_{n,N} Y_n$ is the network with $N$ vertices obtained by first generating $Y_n$ and, given $Y_n = y$, putting $\Pi_{n,N} Y_n = P(y)$, for $P \in \mathcal{P}_{n,N}$ chosen according to the conditional distribution of $\Pi_{n,N}$ given $Y_n = y$.

The distribution of $\Pi_{n,N} Y_n$ is computed by

$$\Pr(\Pi_{n,N} Y_n = y) = \sum_{P \in \mathcal{P}_{n,N}} \Pr(\Pi_{n,N} = P \mid Y_n = y|_{[n]}) \, \Pr(Y_n = y|_{[n]}) \, \mathbf{1}(P(y|_{[n]}) = y), \tag{1}$$

where $\mathbf{1}(\cdot)$ is the indicator function.
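To make the defining property $P(y)|_{[n]} = y$ concrete, here is a minimal Python sketch. The helper names `restrict` and `make_evolution_map`, and the choice to fill the new entries with pre-drawn fair coin flips, are illustrative assumptions rather than anything prescribed by the slides. Drawing the fill pattern at random is one simple way to realize a random map in $\mathcal{P}_{n,N}$.

```python
import random

def restrict(y, n):
    """Return the restriction y|[n]: the upper-left n x n block of y."""
    return [row[:n] for row in y[:n]]

def make_evolution_map(N, rng):
    """Build one (hypothetical) evolution map P into N x N binary arrays.

    The fill pattern is drawn once, so the returned P is a fixed map;
    drawing it at random corresponds to picking a random element of P_{n,N}.
    """
    fill = [[rng.randint(0, 1) for _ in range(N)] for _ in range(N)]

    def P(y):
        n = len(y)
        out = [[0] * N for _ in range(N)]
        for i in range(N):
            for j in range(N):
                if i < n and j < n:
                    out[i][j] = y[i][j]      # hold the existing network fixed
                elif i != j:
                    out[i][j] = fill[i][j]   # new off-diagonal entries
        return out

    return P

rng = random.Random(0)
y = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
P = make_evolution_map(5, rng)
big = P(y)
print(restrict(big, 3) == y)  # the defining property of an evolution map
```

Any map built this way satisfies the definition regardless of the fill pattern, because the upper-left block is copied verbatim.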

Generative consistency

Definition (Generative consistency (Definition 4.1 of PFSNA)). Let $Y_n$ and $Y_N$ be random $\{0,1\}$-valued arrays and let $\Pi_{n,N}$ be a generating scheme. Then $Y_n$ and $Y_N$ are consistent with respect to $\Pi_{n,N}$ if

$$\Pi_{n,N} Y_n =_{\mathcal{D}} Y_N,$$

for $\Pi_{n,N} Y_n$ defined by the distribution in (1).

Duality between generative consistency and consistency under selection: For any $Y_n$ and generating mechanism $\Pi_{n,N}$, define $Y_N$ by $Y_N = \Pi_{n,N} Y_n$. Then by the defining property of an evolution map, $Y_n$ and $Y_N$ enjoy the relationship

$$S_{n,N} Y_N = S_{n,N} \Pi_{n,N} Y_n = Y_n \quad \text{with probability 1};$$

that is, $Y_n$ and $\Pi_{n,N} Y_n$ are consistent under selection by default.

Preferential attachment model (Barabási–Albert)

Dynamics are based on Simon's preferential attachment scheme for heavy-tailed distributions. Vertices arrive one at a time and attach preferentially to previous vertices based on their degree.

Formal definition: Take $m \ge 1$ (integer) and $\delta > -m$ (real number) so that each new vertex attaches randomly to $m$ existing vertices with probability increasing with degree.

Initiate at a graph $y_0$ with $n_0 \ge 1$ vertices, which then evolves successively into $y_1, y_2, \ldots$ by connecting a new vertex to the existing graph at each step.

For any $y = (y_{ij})_{1 \le i,j \le n}$ and every $i = 1, \ldots, n$, the degree of $i$ in $y$ is the number of edges incident to $i$,

$$\deg_y(i) = \sum_{j \ne i} y_{ij}.$$

At step $n \ge 1$, a new vertex $v_n$ attaches to $m \ge 1$ vertices in $y_{n-1}$, with each of the $m$ vertices $v'$ chosen independently without replacement with probability proportional to $\deg_{y_{n-1}}(v') + \delta/m$.
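The attachment step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's implementation: the function names `ba_step` and `simulate_ba` are hypothetical, the initial graph $y_0$ is taken to be the complete graph on $m + 1$ vertices (so that every weight $\deg + \delta/m$ is positive), and "without replacement" is implemented by removing each chosen vertex before the next draw.

```python
import random

def ba_step(adj, m, delta, rng):
    """Attach one new vertex to m distinct existing vertices.

    adj is a list of neighbor sets. Each target is drawn with probability
    proportional to deg(v) + delta/m among vertices not yet chosen.
    """
    n = len(adj)
    weights = {v: len(adj[v]) + delta / m for v in range(n)}
    targets = []
    for _ in range(m):
        u = rng.random() * sum(weights.values())
        acc = 0.0
        for v, w in weights.items():
            acc += w
            if u < acc:
                break
        targets.append(v)
        del weights[v]          # sampling without replacement
    adj.append(set(targets))    # the new vertex has label n
    for v in targets:
        adj[v].add(n)
    return adj

def simulate_ba(steps, m, delta, seed=0):
    """Grow a graph from the complete graph on m + 1 vertices (an assumption)."""
    rng = random.Random(seed)
    adj = [set(range(m + 1)) - {i} for i in range(m + 1)]
    for _ in range(steps):
        ba_step(adj, m, delta, rng)
    return adj

adj = simulate_ba(steps=200, m=2, delta=0.5)
degrees = sorted((len(nb) for nb in adj), reverse=True)
print(degrees[:5])  # a few high-degree vertices typically dominate
```

Starting from a graph in which every vertex has degree at least 1 keeps all weights positive whenever $\delta > -m$; other initial conditions would need their own check.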

Barabási–Albert model (Generative scheme)

In keeping with the notation of Section 4.1, let $\Pi^{\delta,m}_{k,n}$, $k \le n$, denote the generating mechanism for the process parameterized by $m \ge 1$ and $\delta > -m$.

By letting the parameters $m \ge 1$ and $\delta > -m$ vary over all permissible values and treating the initial conditions $y_0$ and $n_0$ as fixed, the above generating mechanism determines a family of distributions for each finite sample size $n \ge 1$, where $n$ is the number of vertices that have been added to $y_0$.

For each $n \ge 1$, this process gives a collection of distributions $\mathcal{M}_n$ indexed by $(m, \delta)$, and each distribution in $\mathcal{M}_k$ indexed by $(m, \delta)$ is related to a distribution in $\mathcal{M}_n$, $n \ge k$, with the same parameters through the preferential attachment scheme $\Pi^{\delta,m}_{k,n}$ associated to the model.

For any choice of parameter $(\delta, m)$, we express the relationship between $Y_k$ and $Y_n$, $n \ge k$, by

$$Y_n =_{\mathcal{D}} \Pi^{\delta,m}_{k,n} Y_k.$$

Barabási–Albert model (Empirical properties)

Sparsity: Let $y = (y^{(n)})_{n \ge 1}$ be a sequence of graphs ($y^{(n)}$ has $n$ vertices). Call $y$ sparse if

$$\lim_{n \to \infty} \frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} y^{(n)}_{ij} = 0.$$

Under the BA model, $(Y_n)_{n \ge 1}$ grows by adding one vertex at a time with $m$ new edges, so that

$$\frac{1}{n(n-1)} \sum_{1 \le i \ne j \le n} Y_{ij} = \frac{1}{n(n-1)} (mn + n_0) \to 0 \quad \text{as } n \to \infty.$$

Networks under the BA model are sparse with probability 1.

Power law degree distribution: For $k \ge 1$, let

$$p_y(k) = n^{-1} \sum_{i=1}^{n} \mathbf{1}(\deg_y(i) = k).$$

A sequence $y = (y^{(n)})_{n \ge 1}$ exhibits power law degree distribution with exponent $\gamma > 1$ if

$$p_{y^{(n)}}(k) \sim k^{-\gamma} \quad \text{for all large } k \text{ as } n \to \infty,$$

where $a(k) \sim b(k)$ indicates that $a(k)/b(k) \to 1$ as $k \to \infty$.

The BA model with parameter $(\delta, m)$ has power law degree distribution with exponent $3 + \delta/m$ with probability 1.
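The two summary statistics on this slide are straightforward to compute from an adjacency array. The sketch below is only an illustration: the function names are hypothetical, and the star graph is used as a small deterministic test case (it is not a BA graph). A star on $n$ vertices has $n - 1$ edges, so its density in the sense above is $2/n$, which tends to 0.

```python
from collections import Counter

def edge_density(y):
    """(1 / (n(n-1))) * sum over i != j of y[i][j]."""
    n = len(y)
    total = sum(y[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))

def degree_distribution(y):
    """Empirical degree distribution p_y(k) = fraction of vertices with degree k."""
    n = len(y)
    degs = [sum(y[i][j] for j in range(n) if j != i) for i in range(n)]
    return {k: c / n for k, c in Counter(degs).items()}

def star(n):
    """Adjacency array of a star: vertex 0 joined to every other vertex."""
    y = [[0] * n for _ in range(n)]
    for j in range(1, n):
        y[0][j] = y[j][0] = 1
    return y

for n in (10, 100, 1000):
    print(n, edge_density(star(n)))   # equals 2/n, shrinking toward 0

print(degree_distribution(star(5)))   # one hub of degree 4, four leaves of degree 1
```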

Power law and 'scale-free' networks

Many real-world networks are believed to exhibit power law, or nearly power law, degree distribution (Barabási–Albert, ...).

Heuristic check: power law degree distribution implies

$$\log p_y(k) \sim -\gamma \log(k), \quad \text{large } k \ge 1. \tag{2}$$

Yule–Simon distribution (dotted) vs. line $-3 \log(k)$ (solid).

[Figure: "Power law distribution with exponent 3". Horizontal axis: log(degree); vertical axis: −gamma*log(degree).]

Figure: Dotted line shows log-log plot of the Yule–Simon distribution for $\gamma = 3$. Solid line shows the linear approximation in (2) by approximating $\Gamma(\gamma)/\Gamma(k + \gamma) \sim k^{-\gamma}$, which holds asymptotically for large values of $k$.
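The heuristic check in (2) can be reproduced numerically. The sketch below assumes the standard parameterization of the Yule–Simon distribution, $f(k; \rho) = \rho \, B(k, \rho + 1)$, whose tail exponent is $\rho + 1$; identifying $\gamma = \rho + 1$ is an assumption made here, since the slide does not state the parameterization. The function names are hypothetical.

```python
import math

def log_yule_simon(k, rho):
    """Log of the Yule-Simon pmf, f(k) = rho * B(k, rho + 1), via log-gamma."""
    return (math.log(rho) + math.lgamma(k) + math.lgamma(rho + 1)
            - math.lgamma(k + rho + 1))

def loglog_slope(log_p, k1, k2):
    """Slope of log p(k) against log k between k1 and k2."""
    return (log_p(k2) - log_p(k1)) / (math.log(k2) - math.log(k1))

gamma = 3.0
rho = gamma - 1.0  # assumed mapping between the slide's gamma and rho

for k1, k2 in ((2, 4), (10, 20), (1000, 2000)):
    slope = loglog_slope(lambda k: log_yule_simon(k, rho), k1, k2)
    print(k1, k2, slope)  # the slope approaches -gamma as k grows
```

For small $k$ the slope is visibly shallower than $-\gamma$, which matches the gap between the dotted and solid curves described in the figure caption.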

Random walk (RW) models

Add a new edge at each step (instead of a new vertex as in the BA model).

Start with an initial graph $y_0$ and evolve $y_1, y_2, \ldots$ as follows.
- At step $n \ge 1$, choose a vertex $v_n$ in $y_{n-1}$ randomly with distribution $F_n$ (which can depend on $y_{n-1}$).
- Then draw a random nonnegative integer $L_n$ from a distribution also depending on $y_{n-1}$.
- Given $v_n$ and $L_n$, perform a simple random walk on $y_{n-1}$ for $L_n$ steps starting at $v_n$.
- If after $L_n$ steps the random walk is at $v^* \ne v_n$, then add an edge between $v^*$ and $v_n$; otherwise, add a new vertex $v^{**}$ and put an edge between $v^{**}$ and $v_n$.

Choosing $v_n$ by a degree-biased distribution on $y_{n-1}$ and taking $L_n$ to be large simulates the BA model.

For more details on these models see Bloem-Reddy and Orbanz (https://arxiv.org/abs/1612.06404), Bollobás et al. (2003), and related work.
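The steps above can be sketched as follows. Several choices here are assumptions for illustration only: $F_n$ is taken to be degree-biased, $L_n$ is drawn uniformly from $\{0, \ldots, \texttt{max\_len}\}$, repeated edges are allowed (neighbor lists may contain duplicates), and the names `rw_step` and `max_len` are hypothetical.

```python
import random

def rw_step(adj, rng, max_len=5):
    """One step of a random walk model on a multigraph.

    adj maps each vertex label to a list of neighbors (with repeats).
    """
    vertices = list(adj)
    # Degree-biased choice of v_n (one possible F_n).
    v_n = rng.choices(vertices, weights=[len(adj[v]) for v in vertices])[0]
    # Walk length L_n; uniform on {0, ..., max_len} is an assumption.
    L_n = rng.randint(0, max_len)
    current = v_n
    for _ in range(L_n):
        current = rng.choice(adj[current])  # simple random walk step
    if current != v_n:
        # Walk ended at v* != v_n: add an edge between v* and v_n.
        adj[current].append(v_n)
        adj[v_n].append(current)
    else:
        # Walk ended at v_n: add a new vertex v** joined to v_n.
        new = len(adj)
        adj[new] = [v_n]
        adj[v_n].append(new)
    return adj

rng = random.Random(1)
adj = {0: [1], 1: [0]}          # initial graph y_0: a single edge
for _ in range(100):
    rw_step(adj, rng)
num_edges = sum(len(nb) for nb in adj.values()) // 2
print(len(adj), num_edges)
```

Each call adds exactly one edge, matching the description that the model grows by edges rather than by vertices.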

Erdős–Rényi–Gilbert model

The classical Erdős–Rényi–Gilbert model includes each edge in a random graph independently with fixed probability $\theta$.

Generative description: For any $\theta \in [0, 1]$, define $\Pi^{\theta}_{n,N}$ as the generating scheme which acts on $\{0,1\}^{n \times n}$ by

$$y \mapsto \Pi^{\theta}_{n,N}(y) = \left(\begin{array}{ccc|ccc}
 & & & B_{1,n+1} & \cdots & B_{1,N} \\
 & y & & \vdots & \ddots & \vdots \\
 & & & B_{n,n+1} & \cdots & B_{n,N} \\ \hline
B_{n+1,1} & \cdots & B_{n+1,n} & 0 & \cdots & B_{n+1,N} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
B_{N,1} & \cdots & B_{N,n} & B_{N,n+1} & \cdots & 0
\end{array}\right),$$

which fixes the upper $n \times n$ submatrix to be $y$ and fills in the rest of the off-diagonal entries with i.i.d. Bernoulli random variables $(B_{ij})_{1 \le i \ne j \le N}$ with success probability $\theta$.
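A direct transcription of this generating scheme into Python might look like the sketch below. The function name `erg_extend` is hypothetical, and, following the matrix display, entries $(i, j)$ and $(j, i)$ are filled independently (no symmetry is imposed).

```python
import random

def erg_extend(y, N, theta, rng):
    """Extend an n x n binary array y to N x N.

    The upper-left block is y, the diagonal of the new block is 0, and all
    other new off-diagonal entries are i.i.d. Bernoulli(theta).
    """
    n = len(y)
    out = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i < n and j < n:
                out[i][j] = y[i][j]
            elif i != j:
                out[i][j] = 1 if rng.random() < theta else 0
    return out

rng = random.Random(42)
y = [[0, 1],
     [1, 0]]
Y = erg_extend(y, 5, 0.3, rng)
for row in Y:
    print(row)
```

The same function with `theta=0.0` adds isolated vertices only, and with `theta=1.0` fills every new off-diagonal entry with 1, which gives a quick sanity check of the block structure.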

General sequential construction

The above examples start with a base case $Y_0$, from which a family of networks $Y_1, Y_2, \ldots$ is constructed inductively according to a random scheme.

A generic way to specify a generative network model is to specify a conditional distribution for $Y_n$ given $Y_{n-1}$ such that $Y_n|_{[n-1]} = Y_{n-1}$ with probability 1.

The conditional distribution $\Pr(Y_n = \cdot \mid Y_{n-1})$ determines the distribution of a random generating mechanism $\Pi_{n-1,n}$ in $\mathcal{P}_{n-1,n}$
$\Longrightarrow$ $Y_n$ can be expressed as $Y_n = \Pi_{n-1,n} Y_{n-1}$ for every $n \ge 1$.

Composing these actions for successive values of $n$ determines the generating mechanism $\Pi_{n,N}$, $n \le N$, by the law of iterated conditioning:
$\Longrightarrow$ Given $Y_n$, construct $Y_N = \Pi_{n,N} Y_n$ by

$$Y_N = \Pi_{N-1,N}(\Pi_{N-2,N-1}(\cdots(\Pi_{n,n+1} Y_n))).$$

The conditional distribution of $Y_N$ given $Y_n$ is computed by

$$\begin{aligned}
\Pr(Y_N = y^* \mid Y_n = y^*|_{[n]}) &= \Pr(Y_N = y^* \mid Y_{N-1} = y^*|_{[N-1]}) \times \Pr(Y_{N-1} = y^*|_{[N-1]} \mid Y_n = y^*|_{[n]}) \\
&= \prod_{i=1}^{N-n} \Pr\bigl(\Pi_{N-i,N-i+1}(y^*|_{[N-i]}) = y^*|_{[N-i+1]} \mid Y_{N-i} = y^*|_{[N-i]}\bigr).
\end{aligned}$$
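The composition of one-step mechanisms can be illustrated with a short Python sketch. The one-step rule used here (each new entry is Bernoulli with a fixed probability) is only a stand-in chosen for simplicity; any rule that leaves the existing array untouched would do. The names `one_step` and `grow` are hypothetical.

```python
import random

def one_step(y, theta, rng):
    """A stand-in for Pi_{n,n+1}: add one vertex with Bernoulli(theta) entries.

    The existing n x n array is copied unchanged, so the result restricted
    to the first n vertices equals y.
    """
    n = len(y)
    out = [row[:] + [1 if rng.random() < theta else 0] for row in y]
    out.append([1 if rng.random() < theta else 0 for _ in range(n)] + [0])
    return out

def grow(y, N, step):
    """Compose one-step maps until N vertices are reached.

    Returns the whole path [Y_n, Y_{n+1}, ..., Y_N].
    """
    path = [y]
    while len(path[-1]) < N:
        path.append(step(path[-1]))
    return path

rng = random.Random(7)
path = grow([[0]], 6, lambda y: one_step(y, 0.5, rng))
print([len(y) for y in path])  # sizes increase by one vertex per step
```

Returning the full path makes it easy to verify that each intermediate array is the restriction of the next one, which is the property the sequential construction relies on.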
