Statistical inference of network structure, Part 2
Tiago P. Peixoto, University of Bath
Berlin, August 2017

Weighted graphs
C. Aicher et al., Journal of Complex Networks 3(2), 221–248 (2015); T. P. Peixoto, arXiv:1708.01432

Adjacency: A_{ij} ∈ {0, 1} or ℕ. Weights: x_{ij} ∈ ℕ or ℝ.

SBMs with edge covariates:
P(A, x \mid \theta, \gamma, b) = P(x \mid A, \gamma, b)\, P(A \mid \theta, b)

Adjacency:
P(A \mid \theta = \{\lambda, \kappa\}, b) = \prod_{i<j} \frac{e^{-\lambda_{b_i b_j} \kappa_i \kappa_j} (\lambda_{b_i b_j} \kappa_i \kappa_j)^{A_{ij}}}{A_{ij}!}

Edge covariates:
P(x \mid A, \gamma, b) = \prod_{r \le s} P(x_{rs} \mid \gamma_{rs})

P(x \mid \gamma) → Exponential, Normal, Geometric, Binomial, Poisson, ...
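As a rough illustration of the joint likelihood above, here is a minimal pure-Python sketch that evaluates log P(A, x|θ, γ, b) for a Poisson adjacency model with geometric edge weights. The tiny 4-node network, group assignments, and all parameter values are hypothetical toy choices, not part of the original slides.

```python
import math

# Hypothetical toy instance: 4 nodes in B = 2 groups, Poisson SBM
# adjacency with geometric edge weights on existing edges.
b = [0, 0, 1, 1]                      # node-to-group assignments
kappa = [1.0, 1.0, 1.0, 1.0]          # node propensities
lam = [[2.0, 0.5], [0.5, 2.0]]        # group-to-group rates lambda_rs
p_geo = [[0.3, 0.7], [0.7, 0.3]]      # geometric weight parameters gamma_rs

A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
x = [[0, 3, 0, 1],
     [3, 0, 2, 0],
     [0, 2, 0, 4],
     [1, 0, 4, 0]]

def log_joint(A, x, b, kappa, lam, p_geo):
    """log P(A,x|theta,gamma,b) = log P(A|theta,b) + log P(x|A,gamma,b)."""
    logL = 0.0
    n = len(b)
    for i in range(n):
        for j in range(i + 1, n):
            mu = lam[b[i]][b[j]] * kappa[i] * kappa[j]
            # Poisson adjacency term
            logL += -mu + A[i][j] * math.log(mu) - math.lgamma(A[i][j] + 1)
            # geometric weight term, only on existing edges
            if A[i][j] > 0:
                p = p_geo[b[i]][b[j]]
                logL += math.log(p) + x[i][j] * math.log(1 - p)
    return logL
```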
Weighted graphs
T. P. Peixoto, arXiv:1708.01432

Nonparametric Bayesian approach:
P(b \mid A, x) = \frac{P(A, x \mid b)\, P(b)}{P(A, x)}

Marginal likelihood:
P(A, x \mid b) = \int P(A, x \mid \theta, \gamma, b)\, P(\theta) P(\gamma)\, d\theta\, d\gamma = P(A \mid b)\, P(x \mid A, b)

Adjacency part (unweighted):
P(A \mid b) = \int P(A \mid \theta, b)\, P(\theta)\, d\theta

Weights part:
P(x \mid A, b) = \int P(x \mid A, \gamma, b)\, P(\gamma)\, d\gamma = \prod_{r \le s} \int P(x_{rs} \mid \gamma_{rs})\, P(\gamma_{rs})\, d\gamma_{rs}
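The per-group-pair weight marginal above is available in closed form for conjugate pairs. A minimal sketch, assuming geometric weights with a flat prior on the group-pair parameter: the integral reduces to a Beta function, and the brute-force quadrature is included only as a sanity check.

```python
import math

def log_marginal_geometric(weights):
    """log of integral_0^1 prod_e p(1-p)^{x_e} dp for the edge weights of
    one group pair, under a flat prior on p: equals log B(m+1, X+1)."""
    m = len(weights)          # number of edges in the group pair
    X = sum(weights)          # total weight
    return (math.lgamma(m + 1) + math.lgamma(X + 1)
            - math.lgamma(m + X + 2))

def log_marginal_numeric(weights, steps=20000):
    """Direct rectangle-rule quadrature of the same integral (check only)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(1, steps):
        p = k * h
        total += math.exp(sum(math.log(p) + x * math.log(1 - p)
                              for x in weights)) * h
    return math.log(total)
```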
UN Migrations
[Figure: probability distribution of migration flows (10^0 to 10^6, probabilities 10^-9 to 10^-1); SBM fit with geometric weights vs. a single geometric distribution fit]
Votes in congress
[Figure: deputy-deputy vote-correlation matrix (0.0 to 0.8), with Government and Opposition blocks; probability density of vote correlations (-0.2 to 1.0), SBM fit on original data vs. SBM fit on shuffled data]
Human connectome
[Figure: right and left hemispheres; probability densities of electrical connectivity (mm^-1) and of fractional anisotropy (dimensionless), each with an SBM fit]
Overlapping groups
(Palla et al 2005)
◮ Number of nonoverlapping partitions: B^N
◮ Number of overlapping partitions: 2^{BN}
Group overlap

P(A \mid \kappa, \lambda) = \prod_{i<j} \frac{e^{-\lambda_{ij}} \lambda_{ij}^{A_{ij}}}{A_{ij}!} \times \prod_i \frac{e^{-\lambda_{ii}/2} (\lambda_{ii}/2)^{A_{ii}/2}}{(A_{ii}/2)!}, \qquad \lambda_{ij} = \sum_{rs} \kappa_{ir} \lambda_{rs} \kappa_{js}

Labelled half-edges: A_{ij} = \sum_{rs} G_{ij}^{rs}, \qquad P(A \mid \kappa, \lambda) = \sum_G P(G \mid \kappa, \lambda)
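The mixed-membership rate λ_ij = Σ_rs κ_ir λ_rs κ_js can be computed directly; a minimal sketch, where all node propensities and group-to-group rates are hypothetical toy values.

```python
# Overlapping ("mixed membership") rate construction:
# lambda_ij = sum_rs kappa_i^r lambda_rs kappa_j^s.
B = 2
kappa = [[1.0, 0.0],   # node 0 belongs purely to group 0
         [0.5, 0.5],   # node 1 is shared between both groups
         [0.0, 1.0]]   # node 2 belongs purely to group 1
lam = [[4.0, 1.0],     # assortative group-to-group rates
       [1.0, 4.0]]

def edge_rate(i, j):
    """Expected number of edges between nodes i and j."""
    return sum(kappa[i][r] * lam[r][s] * kappa[j][s]
               for r in range(B) for s in range(B))
```

Note how the overlapping node 1 ends up with an intermediate rate toward both pure nodes.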
Group overlap

Marginal likelihood:
P(G) = \int P(G \mid \kappa, \lambda)\, P(\kappa)\, P(\lambda \mid \bar\lambda)\, d\kappa\, d\lambda
     = \frac{\bar\lambda^E}{(\bar\lambda + 1)^{E + B(B+1)/2}} \times \frac{\prod_{r<s} e_{rs}!\, \prod_r e_{rr}!!}{\prod_{rs} \prod_{i<j} G_{ij}^{rs}!\, \prod_i G_{ii}^{rs}!!} \times \prod_r \frac{(N-1)!}{(e_r + N - 1)!} \times \prod_{ir} k_i^r!
Microcanonical equivalence:
P(G) = P(G \mid k, e)\, P(k \mid e)\, P(e),

P(G \mid k, e) = \frac{\prod_{r<s} e_{rs}!\, \prod_r e_{rr}!!\, \prod_{ir} k_i^r!}{\prod_{rs} \prod_{i<j} G_{ij}^{rs}!\, \prod_i G_{ii}^{rs}!!\, \prod_r e_r!},

P(k \mid e) = \prod_r \left(\!\!\binom{N}{e_r}\!\!\right)^{-1}
Overlap vs. non-overlap

Social “ego” network (from Facebook)
[Figure: per-group degree distributions n_k for the overlapping fit (B = 4, Λ ≃ 0.053) and the nonoverlapping fit (B = 5, Λ = 1)]
Overlap vs. non-overlap
[Figure: description length per edge Σ/E as a function of the mixedness µ, for B = 4 (overlapping) and B = 15 (nonoverlapping)]
SBM with layers
T. P. Peixoto, Phys. Rev. E 92, 042807 (2015)
[Figure: collapsed network and layers l = 1, l = 2, l = 3]
◮ Fairly straightforward; easily combined with degree correction, overlaps, etc.
◮ Edge probabilities are in general different in each layer.
◮ Node memberships can move or stay the same across layers.
◮ Works as a general model for discrete as well as discretized edge covariates.
◮ Works as a model for temporal networks.
SBM with layers

Edge covariates:
P(\{A_l\} \mid \{\theta\}) = P(A_c \mid \{\theta\}) \prod_{r \le s} \frac{\prod_l m_{rs}^l!}{m_{rs}!}

Independent layers:
P(\{A_l\} \mid \{\{\theta\}_l\}, \{\phi\}, \{z_{il}\}) = \prod_l P(A_l \mid \{\theta\}_l, \{\phi\})

Embedded models can be of any type: traditional, degree-corrected, overlapping.
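The combinatorial factor ∏_{r≤s} ∏_l m_rs^l!/m_rs! relating the layered and collapsed likelihoods is easy to evaluate in log-space. A sketch, where the per-layer edge counts are hypothetical.

```python
import math

def log_layer_correction(layer_counts):
    """log of prod_{r<=s} prod_l m_rs^l! / m_rs!, i.e. the (inverse) number
    of ways the aggregated counts could be split across layers.

    layer_counts: dict mapping a group pair (r, s) to the list of its
    per-layer edge counts [m_rs^1, m_rs^2, ...]."""
    logC = 0.0
    for (r, s), ms in layer_counts.items():
        m_total = sum(ms)
        logC += sum(math.lgamma(m + 1) for m in ms) - math.lgamma(m_total + 1)
    return logC
```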
Layer information can reveal hidden structure... but it can also hide structure!
[Figure: NMI between inferred and planted partitions as a function of c, for the collapsed network and for E/C = 500, 100, 40, 20, 15, 12, 10, 5]
Model selection

Null model: collapsed (aggregated) SBM + fully random layers:
P(\{G_l\} \mid \{\theta\}, \{E_l\}) = P(G_c \mid \{\theta\}) \times \frac{\prod_l E_l!}{E!}

(We can also aggregate layers into larger layers.)
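Comparing the layered model against this null reduces to a log-likelihood ratio. A minimal sketch, with the random-layer multinomial factor computed explicitly; the input log-likelihoods are placeholders standing in for actual marginal likelihoods.

```python
import math

def log_random_layers_factor(layer_edge_counts):
    """log of prod_l E_l! / E!, the fully-random-layers correction."""
    E = sum(layer_edge_counts)
    return (sum(math.lgamma(El + 1) for El in layer_edge_counts)
            - math.lgamma(E + 1))

def log_posterior_odds(logP_layered, logP_collapsed, layer_edge_counts):
    """log Lambda = log P(layered) - log P(null), where the null is the
    collapsed SBM with layers assigned at random."""
    logP_null = logP_collapsed + log_random_layers_factor(layer_edge_counts)
    return logP_layered - logP_null
```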
Model selection

Example: Social network of physicians
T. P. Peixoto, Phys. Rev. E 92, 042807 (2015)

N = 241 physicians. Survey questions:
◮ “When you need information or advice about questions of therapy where do you usually turn?”
◮ “And who are the three or four physicians with whom you most often find yourself discussing cases or therapy in the course of an ordinary week – last week for instance?”
◮ “Would you tell me the first names of your three friends whom you see most often socially?”

[Figure: two candidate layer divisions, with Λ = 1 and log10 Λ ≈ −50]
Example: Brazilian chamber of deputies

Voting network between members of congress (1999–2006)
[Figure: inferred group structure, with groups labelled by party composition (PT; PMDB, PP, PTB; PDT, PSB, PCdoB; DEM, PFL; PSDB; PR; PPS) and arranged along Right/Center/Left and Government/Opposition axes, shown separately for the 1999–2002 and 2003–2006 legislatures]

log10 Λ ≈ −111; Λ = 1
Real-valued edges?

Idea: layers {ℓ} → bins of edge values!
P(\{G_x\} \mid \{\theta\}_{\{\ell\}}, \{\ell\}) = P(\{G_l\} \mid \{\theta\}_{\{\ell\}}, \{\ell\}) \times \prod_l \rho(x_l)

Bayesian posterior → number (and shape) of bins
Movement between groups...
[Figure: movement of deputies between inferred groups across legislatures, labelled by party (PT, PMDB, PDT, PSB, PP, PR, PTB, PCdoB, PSDB, PFL, DEM, PPS)]
Networks with metadata
Many network datasets contain metadata: annotations that go beyond the mere adjacency between nodes. Metadata are often assumed to be indicators of topological structure, and are used to validate community detection methods, a.k.a. the “ground truth”.
Example: American college football

Metadata (conferences) vs. SBM fit → discrepancy.

Why the discrepancy? Some hypotheses:
◮ The model is not sufficiently descriptive.
◮ The metadata is not sufficiently descriptive or is inaccurate.
◮ Both.
◮ Neither.
Model variations: Annotated networks
M. E. J. Newman and A. Clauset, arXiv:1507.04001

Main idea: treat metadata as data, not “ground truth”. Annotations are partitions, {x_i}, and can be used as priors:

P(G, \{x_i\} \mid \theta, \gamma) = \sum_{\{b_i\}} P(G \mid \{b_i\}, \theta)\, P(\{b_i\} \mid \{x_i\}, \gamma), \qquad P(\{b_i\} \mid \{x_i\}, \gamma) = \prod_i \gamma_{b_i x_i}

Drawbacks: parametric (i.e. can overfit); annotations are not always partitions.
Metadata is often very heterogeneous

Example: IMDB Film-Actor network
Data: 96,982 films, 275,805 actors, 1,812,657 film-actor edges.
Film metadata: title, year, genre, production company, country, user-contributed keywords, etc.
Actor metadata: name, age, gender, nationality, etc.

User-contributed keywords (93,448)
[Figure: keyword occurrence distribution and number of keywords per film]
Metadata is often very heterogeneous

Example: IMDB Film-Actor network

Most frequent keywords:

    Keyword                        Occurrences
    'independent-film'             15513
    'based-on-novel'               12303
    'character-name-in-title'      11801
    'murder'                       11184
    'sex'                          9759
    'female-nudity'                9239
    'nudity'                       5846
    'death'                        5791
    'husband-wife-relationship'    5568
    'love'                         5560
    'violence'                     5480
    'police'                       5463
    'father-son-relationship'      5063

Keywords with a single occurrence each:
'discriminaton-against-anteaters', 'partisan-violence', 'deliberately-leaving-something-behind', 'princess-from-outer-space', 'reference-to-aleksei-vorobyov', 'dead-body-on-the-beach', 'liver-failure', 'hit-with-a-skateboard', 'helping-blind-man-cross-street', 'abandoned-pet', 'retired-clown', 'resentment-toward-stepson', 'mutilating-a-plant'
Better approach: Metadata as data

Main idea: treat metadata as data, not “ground truth”.
Generalized annotations: A_ij → data layer, T_ij → annotation layer.
[Figure: data A and metadata T as two coupled layers]
◮ Joint model for data and metadata (the layered SBM [1]).
◮ Arbitrary types of annotation.
◮ Both data and metadata are clustered into groups.
◮ Fully nonparametric.
Example: American college football
Prediction of missing edges
G' = \underbrace{G}_{\text{observed}} \cup \underbrace{\delta G}_{\text{missing}}

Posterior probability of missing edges:
P(\delta G \mid G, \{b_i\}) = \frac{\int_\theta P(G \cup \delta G \mid \{b_i\}, \theta)\, P(\theta)}{\int_\theta P(G \mid \{b_i\}, \theta)\, P(\theta)}

A. Clauset, C. Moore, M. E. J. Newman, Nature, 2008
R. Guimerà, M. Sales-Pardo, PNAS 2009
Drug-drug interactions
R. Guimerà, M. Sales-Pardo, PLoS Comput Biol, 2013
Metadata and prediction of missing nodes

Node probability, with known group membership:
P(a_i \mid A, b_i, b) = \frac{\int_\theta P(A, a_i \mid b_i, b, \theta)\, P(\theta)}{\int_\theta P(A \mid b, \theta)\, P(\theta)}

Node probability, with unknown group membership:
P(a_i \mid A, b) = \sum_{b_i} P(a_i \mid A, b_i, b)\, P(b_i \mid b)

Node probability, with unknown group membership but known metadata:
P(a_i \mid A, T, b, c) = \sum_{b_i} P(a_i \mid A, b_i, b)\, P(b_i \mid T, b, c)

Group membership probability, given metadata:
P(b_i \mid T, b, c) = \frac{P(b_i, b \mid T, c)}{P(b \mid T, c)} = \frac{\int_\gamma P(T \mid b_i, b, c, \gamma)\, P(b_i, b)\, P(\gamma)}{\sum_{b_i'} \int_\gamma P(T \mid b_i', b, c, \gamma)\, P(b_i', b)\, P(\gamma)}

Predictive likelihood ratio:
\lambda_i = \frac{P(a_i \mid A, T, b, c)}{P(a_i \mid A, T, b, c) + P(a_i \mid A, b)}

λ_i > 1/2 → the metadata improves the prediction task
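The predictive likelihood ratio is a simple transform of the two posterior predictive probabilities; a direct sketch (the input probabilities are placeholders for the marginals defined above).

```python
def predictive_likelihood_ratio(p_with_meta, p_without_meta):
    """lambda_i = P(a_i|A,T,b,c) / (P(a_i|A,T,b,c) + P(a_i|A,b)).

    Values above 1/2 mean the metadata improves the prediction of
    node i's edges; exactly 1/2 means it makes no difference."""
    return p_with_meta / (p_with_meta + p_without_meta)
```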
Metadata and prediction of missing nodes
[Figure: average predictive likelihood ratio λ (0.3 to 1.0) across datasets: FB Penn, PPI (krogan), FB Tennessee, FB Berkeley, FB Caltech, FB Princeton, PPI (isobase hs), PPI (yu), FB Stanford, PGP, PPI (gastric), FB Harvard, FB Vassar, Pol. Blogs, PPI (pancreas), Flickr, PPI (lung), PPI (predicted), Anobii SN, IMDB, Amazon, Debian pkgs., DBLP, Internet AS, APS citations, LFR bench.]
Metadata and prediction of missing nodes
[Figure: average predictive likelihood ratio λ vs. number of planted groups B (2 to 10), for aligned, misaligned (including N/B = 10^3), and random metadata]
Metadata predictiveness

Neighbour probability:
P_e(i \mid j) = \frac{k_i\, e_{b_i, b_j}}{e_{b_i}\, e_{b_j}}

Neighbour probability, given metadata tag t:
P_t(i) = \sum_j P(i \mid j)\, P_m(j \mid t)

Null neighbour probability (no metadata tag):
Q(i) = \sum_j P(i \mid j)\, \Pi(j)

Kullback–Leibler divergence:
D_{KL}(P_t \| Q) = \sum_i P_t(i) \ln \frac{P_t(i)}{Q(i)}

Relative divergence:
\mu_r \equiv \frac{D_{KL}(P_t \| Q)}{H(Q)} → metadata group predictiveness

[Figure: neighbour probability Q(i) without metadata vs. P_t(i) with metadata]
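The predictiveness score µ_r is a normalized KL divergence between the two neighbour distributions above; a minimal sketch over hypothetical discrete distributions.

```python
import math

def kl(P, Q):
    """D_KL(P || Q) in nats, over discrete distributions as lists."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

def entropy(Q):
    """H(Q) in nats."""
    return -sum(q * math.log(q) for q in Q if q > 0)

def predictiveness(P_t, Q):
    """mu_r = D_KL(P_t || Q) / H(Q): 0 when the tag tells us nothing
    about the neighbours, larger when it concentrates them."""
    return kl(P_t, Q) / entropy(Q)
```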
Metadata predictiveness
[Figures: metadata group predictiveness µ_r vs. metadata group size n_r for:
◮ IMDB film-actor network (keywords, producers, directors, ratings, country, genre, production, year)
◮ APS citation network (journal, PACS, date)
◮ Amazon co-purchases (categories)
◮ Internet AS (country)
◮ Facebook Penn State (dorm, gender, high school, major, year)]
n-order Markov chains with communities
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740

Transitions conditioned on the last n tokens:
p(x_t \mid \vec{x}_{t-1}) → probability of a transition from memory \vec{x}_{t-1} = \{x_{t-n}, \dots, x_{t-1}\} to token x_t.

Instead of such a direct parametrization, we divide the tokens and memories into groups:
p(x \mid \vec{x}) = \theta_x \lambda_{b_x b_{\vec{x}}}

θ_x → overall frequency of token x
λ_rs → transition probability from memory group s to token group r
b_x, b_{\vec{x}} → group memberships of tokens and memories

[Figure (a)–(c): token and memory groups for the sequence {x_t} = "It␣was␣the␣best␣of␣times"]
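The memory→token transitions underlying the model can be collected by sliding a window over the sequence; a minimal sketch for the example above, with n = 2 and characters as tokens.

```python
from collections import Counter

# Build the memory -> token transition counts for an n-order chain,
# using the "It was the best of times" example (n = 2, character tokens).
seq = "It was the best of times"
n = 2
transitions = Counter()
for t in range(n, len(seq)):
    memory = seq[t - n:t]      # the last n tokens
    token = seq[t]             # the emitted token
    transitions[(memory, token)] += 1
```

These counts define the bipartite memory-token multigraph to which the SBM-like likelihood is applied.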
n-order Markov chains with communities

{x_t} = "It␣was␣the␣best␣of␣times"

P(\{x_t\} \mid b) = \int d\lambda\, d\theta\, P(\{x_t\} \mid b, \lambda, \theta)\, P(\theta)\, P(\lambda)

The Markov chain likelihood is (almost) identical to the SBM likelihood that generates the bipartite transition graph.
Nonparametric → we can select the number of groups and the Markov order based on statistical evidence!
T. P. Peixoto and Martin Rosvall, Nature Communications (in press)
Bayesian formulation

P(\{x_t\} \mid b) = \int d\theta\, d\lambda\, P(\{x_t\} \mid b, \lambda, \theta) \prod_r D_r(\{\theta_x\}) \prod_s D_s(\{\lambda_{rs}\})

Noninformative priors → microcanonical model:
P(\{x_t\} \mid b) = P(\{x_t\} \mid b, \{e_{rs}\}, \{k_x\}) \times P(\{k_x\} \mid \{e_{rs}\}, b) \times P(\{e_{rs}\}),
where
P(\{x_t\} \mid b, \{e_{rs}\}, \{k_x\}) → sequence likelihood,
P(\{k_x\} \mid \{e_{rs}\}, b) → token frequency likelihood,
P(\{e_{rs}\}) → transition count likelihood,
-\ln P(\{x_t\}, b) → description length of the sequence.

Inference ↔ Compression
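The inference ↔ compression link can be made concrete: the negative log-likelihood of a fitted chain is a description length in bits. A minimal sketch with a plain first-order chain and maximum-likelihood transition probabilities; the prior terms of the full microcanonical model are omitted here.

```python
import math
from collections import Counter

def description_length_bits(seq):
    """Bits needed to encode seq[1:] with a first-order chain whose
    transition probabilities are the empirical (MLE) frequencies."""
    pair_counts = Counter(zip(seq, seq[1:]))   # counts of (a -> b)
    out_counts = Counter(seq[:-1])             # outgoing counts per token
    bits = 0.0
    for (a, b), c in pair_counts.items():
        p = c / out_counts[a]
        bits += -c * math.log2(p)
    return bits
```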
n-order Markov chains with communities

US Air Flights:
    n     B_N   B_M    Σ              Σ′
    1     384   365    364,385,780    365,211,460
    2     386   7605   319,851,871    326,511,545
    3     183   2455   318,380,106    339,898,057
    4     292   1558   318,842,968    337,988,629
    5     297   1573   335,874,766    338,442,011
    gzip                573,452,240
    LZMA                402,125,144

War and peace:
    n     B_N   B_M    Σ              Σ′
    1     65    71     11,422,564     11,438,753
    2     62    435    9,175,833      9,370,379
    3     70    1366   7,609,366      8,493,211
    4     72    1150   7,574,332      9,282,611
    5     71    882    10,181,047     10,992,795
    gzip                9,594,000
    LZMA                7,420,464

Taxi movements:
    n     B_N   B_M    Σ              Σ′
    1     387   385    2,635,789      2,975,299
    2     397   1127   2,554,662      3,258,586
    3     393   1036   2,590,811      3,258,586
    4     397   1071   2,628,813      3,258,586
    5     395   1095   2,664,990      3,258,586
    gzip                4,289,888
    LZMA                2,902,904

“Rock you” password list:
    n     B_N   B_M    Σ                Σ′
    1     140   147    1,060,272,230    1,060,385,582
    2     109   1597   984,697,401      987,185,890
    3     114   4703   910,330,062      930,926,370
    4     114   5856   889,006,060      940,991,463
    5     99    6430   1,000,410,410    1,005,057,233
    gzip                1,315,388,208
    LZMA                1,097,012,288

(The SBM can compress your files!)
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740
n-order Markov chains with communities

Example: flight itineraries
\vec{x}_t = \{x_{t-3}, \text{Atlanta}|\text{Las Vegas}, x_{t-1}\}
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740
Dynamic networks
Each token is an edge: x_t → (i, j)_t.
Dynamic network → sequence of edges: {x_t} = {(i, j)_t}.
Problem: too many possible tokens! O(N²)
Solution: group the nodes into B groups; a pair of node groups (r, s) → an edge group. Number of tokens: O(B²) ≪ O(N²).
Two-step generative process:
{x_t} = {(r, s)_t} (n-order Markov chain over pairs of group labels)
P((i, j)_t | (r, s)_t) (static SBM generating edges from group labels)
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740
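The two-step process can be sketched generatively. The group sizes, the persistence probability of the label chain, and the uniform edge placement within groups are all hypothetical simplifications; a real fit would use inferred transition and SBM rates.

```python
import random

random.seed(0)

# Step 1: a Markov chain over group-label pairs (r, s).
# Step 2: a static SBM emits a concrete edge (i, j) from that pair.
groups = {0: [0, 1, 2], 1: [3, 4, 5]}   # node membership, B = 2 (toy)

def next_pair(prev_pair):
    # Toy persistent chain: keep the previous pair w.p. 0.9, else resample.
    if random.random() < 0.9:
        return prev_pair
    return (random.randint(0, 1), random.randint(0, 1))

def emit_edge(pair):
    # Uniform edge placement within the chosen group pair (toy SBM).
    r, s = pair
    return (random.choice(groups[r]), random.choice(groups[s]))

pair = (0, 1)
edges = []
for _ in range(100):
    pair = next_pair(pair)
    edges.append(emit_edge(pair))
```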
Dynamic networks

Example: student proximity
[Figure: static part, with inferred groups matching the school classes (PC, PC*, MP, MP*1, MP*2, 2BIO1, 2BIO2, 2BIO3, PSI*); temporal part]
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740
Dynamic networks in continuous time

x_τ → token at continuous time τ

P(\{x_\tau\}) = \underbrace{P(\{x_t\})}_{\text{discrete chain}} \times \underbrace{P(\{\Delta t\} \mid \{x_t\})}_{\text{waiting times}}

Exponential waiting time distribution:
P(\{\Delta t\} \mid \{x_t\}, \lambda) = \prod_x \lambda_{b_x}^{k_x} e^{-\lambda_{b_x} \Delta_x}

Bayesian integrated likelihood:
P(\{\Delta t\} \mid \{x_t\}) = \prod_r \int_0^\infty d\lambda\, \lambda^{e_r} e^{-\lambda \Delta_r} P(\lambda \mid \alpha, \beta) = \prod_r \frac{\Gamma(e_r + \alpha)\, \beta^\alpha}{\Gamma(\alpha)\, (\Delta_r + \beta)^{e_r + \alpha}}.

Hyperparameters: α, β. The noninformative limit α → 0, β → 0 leads to the Jeffreys prior: P(λ) ∝ 1/λ.
T. P. Peixoto and Martin Rosvall, arXiv:1509.04740
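The integrated waiting-time likelihood above has the closed form shown; a sketch that evaluates it in log-space with `math.lgamma`, with hypothetical event counts and total waiting times per group.

```python
import math

def log_waiting_likelihood(e, Delta, alpha=1.0, beta=1.0):
    """log of prod_r Gamma(e_r + a) b^a / (Gamma(a) (Delta_r + b)^{e_r + a}).

    e:     per-group event counts e_r
    Delta: per-group total waiting times Delta_r
    alpha, beta: hyperparameters of the Gamma prior on the rates."""
    return sum(math.lgamma(er + alpha) + alpha * math.log(beta)
               - math.lgamma(alpha) - (er + alpha) * math.log(Dr + beta)
               for er, Dr in zip(e, Delta))
```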
Dynamic networks

Continuous time: {x_τ} → sequence of notes in Beethoven’s fifth symphony
[Figure: average waiting time (seconds) per memory group, without waiting times (n = 1) and with waiting times (n = 2)]
Dynamic networks: nonstationarity

{x_t} → concatenation of “War and Peace”, by Leo Tolstoy, and “À la recherche du temps perdu”, by Marcel Proust.

Unmodified chain: -\log_2 P(\{x_t\}, b) = 7{,}450{,}322
Annotated chain, x'_t = (x_t, \text{novel}): -\log_2 P(\{x_t\}, b) = 7{,}146{,}465
Latent space models
P. D. Hoff, A. E. Raftery, and M. S. Handcock, J. Amer. Stat. Assoc. 97, 1090–1098 (2002)

P(G \mid \{\vec{x}_i\}) = \prod_{i>j} p_{ij}^{A_{ij}} (1 - p_{ij})^{1 - A_{ij}}, \qquad p_{ij} = \exp\left(-(\vec{x}_i - \vec{x}_j)^2\right)

[Figure: latent-space embedding of the human connectome]

Many other, more elaborate embeddings exist (e.g. hyperbolic spaces). Properties:
◮ Softer approach: nodes are not placed into discrete categories.
◮ Exclusively assortative structures.
◮ Formulation for directed graphs is less trivial.
Discrete vs. continuous
Can we formulate a unified parametrization?
The Graphon

P(G \mid \{x_i\}) = \prod_{i>j} p_{ij}^{A_{ij}} (1 - p_{ij})^{1 - A_{ij}}, \qquad p_{ij} = \omega(x_i, x_j), \quad x_i \in [0, 1]

Properties:
◮ Mostly a theoretical tool.
◮ Cannot be directly inferred (without massively overfitting).
◮ Needs to be parametrized to be practical.
The SBM → a piecewise-constant graphon
[Figure: piecewise-constant ω(x, y)]
A “soft” graphon parametrization

p_{uv} = \frac{d_u d_v}{2m}\, \omega(x_u, x_v), \qquad \omega(x, y) = \sum_{j,k=0}^{N} c_{jk}\, B_j(x)\, B_k(y)

Bernstein polynomials:
B_k(x) = \binom{N}{k} x^k (1 - x)^{N-k}, \qquad k = 0, \dots, N

[Figure: Bernstein basis B_k(x) for k = 0, ..., 5; an inferred ω(x, y)]
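The Bernstein expansion is straightforward to evaluate; a minimal sketch, where the coefficient matrix c_jk is a hypothetical toy choice. Since the basis sums to one at every x, constant coefficients give a flat ω.

```python
import math

def bernstein(N, k, x):
    """Bernstein basis polynomial B_k(x) = C(N,k) x^k (1-x)^(N-k)."""
    return math.comb(N, k) * x**k * (1 - x)**(N - k)

def omega(x, y, c):
    """Graphon omega(x,y) = sum_{j,k} c_jk B_j(x) B_k(y), with the degree
    N inferred from the (N+1) x (N+1) coefficient matrix c."""
    N = len(c) - 1
    return sum(c[j][k] * bernstein(N, j, x) * bernstein(N, k, y)
               for j in range(N + 1) for k in range(N + 1))
```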
Inferring the model

Semi-parametric Bayesian approach, via an Expectation–Maximization algorithm:

1. Expectation step:
q(\vec{x}) = \frac{P(A, \vec{x} \mid c)}{\int P(A, \vec{x} \mid c)\, d^n x}

2. Maximization step:
P(A \mid c) = \int P(A, \vec{x} \mid c)\, d^n x, \qquad \hat{c}_{jk} = \operatorname*{argmax}_{c_{jk}} P(A \mid c)

Belief propagation:
\eta_{u \to v}(x) = \frac{1}{Z_{u \to v}} \exp\left(-\sum_w d_u d_w \int_0^1 q_w(y)\, \omega(x, y)\, dy\right) \times \prod_{\substack{w (\ne v) \\ a_{uw} = 1}} \int_0^1 \eta_{w \to u}(y)\, \omega(x, y)\, dy,

q_{uv}(x, y) = \frac{\eta_{u \to v}(x)\, \eta_{v \to u}(y)\, \omega(x, y)}{\int_0^1 \int_0^1 \eta_{u \to v}(x)\, \eta_{v \to u}(y)\, \omega(x, y)\, dx\, dy}