statistical inference of network structure

Statistical inference of network structure Part 2 Tiago P. Peixoto - PowerPoint PPT Presentation

Statistical inference of network structure, Part 2. Tiago P. Peixoto, University of Bath. Berlin, August 2017.


  1. Statistical inference of network structure, Part 2. Tiago P. Peixoto, University of Bath. Berlin, August 2017.

  2. Weighted graphs. C. Aicher et al., Journal of Complex Networks 3(2), 221-248 (2015); T.P.P., arXiv:1708.01432
     Adjacency: $A_{ij} \in \{0, 1\}$ or $\mathbb{N}$. Weights: $x_{ij} \in \mathbb{N}$ or $\mathbb{R}$.
     SBMs with edge covariates:
     $P(A, x \mid \theta, \gamma, b) = P(x \mid A, \gamma, b)\, P(A \mid \theta, b)$
     Adjacency:
     $P(A \mid \theta = \{\lambda, \kappa\}, b) = \prod_{i<j} \frac{e^{-\lambda_{b_i,b_j}\kappa_i\kappa_j}\,(\lambda_{b_i,b_j}\kappa_i\kappa_j)^{A_{ij}}}{A_{ij}!}$
     Edge covariates:
     $P(x \mid A, \gamma, b) = \prod_{r \le s} P(x_{rs} \mid \gamma_{rs})$
     $P(x \mid \gamma)$ → Exponential, Normal, Geometric, Binomial, Poisson, ...
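The factorization on this slide can be evaluated directly for a small graph. The following is a minimal sketch, not the author's implementation: it assumes a simple graph ($A_{ij} \in \{0, 1\}$), strictly positive rates, and geometric weights $P(x) = p(1-p)^x$ with one parameter per group pair. The function names and the toy data are hypothetical.

```python
import math

def log_poisson_dcsbm(A, b, lam, kappa):
    """log P(A | lambda, kappa, b): Poisson degree-corrected SBM, pairs i < j."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            rate = lam[b[i]][b[j]] * kappa[i] * kappa[j]  # assumed > 0
            total += -rate + A[i][j] * math.log(rate) - math.lgamma(A[i][j] + 1)
    return total

def log_geometric_weights(A, x, b, p):
    """log P(x | A, gamma, b) with geometric weights; p[r][s] plays the role of gamma_rs."""
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if A[i][j] > 0:  # weights exist only on edges
                q = p[b[i]][b[j]]
                total += math.log(q) + x[i][j] * math.log(1.0 - q)
    return total

# Toy example (hypothetical data): 3 nodes, 2 groups, one edge with weight 3.
A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
x = [[0, 3, 0], [3, 0, 0], [0, 0, 0]]
b = [0, 0, 1]
lam = [[1.0, 0.5], [0.5, 1.0]]
kappa = [1.0, 1.0, 1.0]
p = [[0.5, 0.5], [0.5, 0.5]]

log_adj = log_poisson_dcsbm(A, b, lam, kappa)
log_w = log_geometric_weights(A, x, b, p)
log_joint = log_adj + log_w  # log P(A, x | theta, gamma, b)
```

The joint log-likelihood is simply the sum of the two parts, which mirrors the product $P(x \mid A, \gamma, b)\,P(A \mid \theta, b)$.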

  3. Weighted graphs. T.P.P., arXiv:1708.01432
     Nonparametric Bayesian approach:
     $P(b \mid A, x) = \frac{P(A, x \mid b)\, P(b)}{P(A, x)}$
     Marginal likelihood:
     $P(A, x \mid b) = \int P(A, x \mid \theta, \gamma, b)\, P(\theta)\, P(\gamma)\, d\theta\, d\gamma = P(A \mid b)\, P(x \mid A, b)$
     Adjacency part (unweighted):
     $P(A \mid b) = \int P(A \mid \theta, b)\, P(\theta)\, d\theta$
     Weights part:
     $P(x \mid A, b) = \int P(x \mid A, \gamma, b)\, P(\gamma)\, d\gamma = \prod_{r \le s} \int P(x_{rs} \mid \gamma_{rs})\, P(\gamma_{rs})\, d\gamma_{rs}$
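For some weight distributions the integral over $\gamma_{rs}$ has a closed form. As an illustration only, the sketch below assumes geometric weights $P(x \mid p) = p(1-p)^x$ and a Beta($\alpha$, $\beta$) prior on $p$; the prior choice and all names are assumptions made here, not taken from the slides. Under these assumptions, for $n$ edges with total weight $S$ between a pair of groups, the integral equals $\mathrm{B}(\alpha + n, \beta + S) / \mathrm{B}(\alpha, \beta)$.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal_geometric(weights, alpha=1.0, beta=1.0):
    """log of the integral of prod_e p (1 - p)^{x_e} against Beta(p; alpha, beta)."""
    n = len(weights)
    s = sum(weights)
    return log_beta(alpha + n, beta + s) - log_beta(alpha, beta)

def log_marginal_weights(weights_by_pair, alpha=1.0, beta=1.0):
    """Sum of the per-pair terms: the log of the product over r <= s."""
    return sum(log_marginal_geometric(w, alpha, beta)
               for w in weights_by_pair.values())

# Hypothetical weights grouped by group pair (r, s) with r <= s.
weights_by_pair = {(0, 0): [0], (0, 1): [1]}
log_px = log_marginal_weights(weights_by_pair)
```

With a uniform prior ($\alpha = \beta = 1$), a single edge of weight 0 contributes $\int_0^1 p\,dp = 1/2$ and a single edge of weight 1 contributes $\int_0^1 p(1-p)\,dp = 1/6$.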

  4. UN Migrations

  5. UN Migrations. [Figure: probability versus number of migrations, log-log axes; curves: SBM fit with geometric weights, geometric distribution fit.]

  6. Votes in congress. [Figure: deputy-by-deputy vote correlation matrix, with government and opposition groups marked; probability density versus vote correlation; curves: SBM fit on original data, SBM fit on shuffled data.]

  7. Human connectome. [Figure: right and left hemispheres; probability density versus electrical connectivity (mm$^{-1}$), log-log axes, with SBM fit; probability density versus fractional anisotropy (dimensionless), with SBM fit.]

  8. Overlapping groups (Palla et al., 2005)


  10. Overlapping groups (Palla et al., 2005)
      ◮ Number of nonoverlapping partitions: $B^N$
      ◮ Number of overlapping partitions: $2^{BN}$

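To get a feeling for how much larger the overlapping hypothesis space is, the two counts can be compared for small values. This is a short sketch with hypothetical function names; the counts are exactly the $B^N$ and $2^{BN}$ quoted on the slide.

```python
def count_nonoverlapping(N, B):
    """Each of N nodes picks exactly one of B groups."""
    return B ** N

def count_overlapping(N, B):
    """Each of the B * N node-group memberships is switched on or off."""
    return 2 ** (B * N)

n_non = count_nonoverlapping(10, 3)
n_over = count_overlapping(10, 3)
ratio = n_over / n_non
```

Even for $N = 10$ and $B = 3$ the overlapping count is more than four orders of magnitude larger.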

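  12. Group overlap
      $P(A \mid \kappa, \lambda) = \prod_{i<j} \frac{e^{-\lambda_{ij}}\,\lambda_{ij}^{A_{ij}}}{A_{ij}!} \times \prod_i \frac{e^{-\lambda_{ii}/2}\,(\lambda_{ii}/2)^{A_{ii}/2}}{(A_{ii}/2)!}, \quad \lambda_{ij} = \sum_{rs} \kappa_{ir}\,\lambda_{rs}\,\kappa_{js}$
      Labelled half-edges:
      $A_{ij} = \sum_{rs} G_{ij}^{rs}, \quad P(A \mid \kappa, \lambda) = \sum_G P(G \mid \kappa, \lambda)$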

  13. Group overlap (continued)
      $P(G) = \int P(G \mid \kappa, \lambda)\, P(\kappa)\, P(\lambda \mid \bar\lambda)\, d\kappa\, d\lambda$
      $\phantom{P(G)} = \frac{\bar\lambda^{E}}{(\bar\lambda + 1)^{E + B(B+1)/2}} \, \frac{\prod_{r<s} e_{rs}! \prod_r e_{rr}!!}{\prod_{rs} \prod_{i<j} G_{ij}^{rs}! \prod_i G_{ii}^{rs}!!} \times \prod_r \frac{(N-1)!}{(e_r + N - 1)!} \times \prod_{ir} k_i^r!$

  14. Group overlap (continued)
      Microcanonical equivalence:
      $P(G) = P(G \mid k, e)\, P(k \mid e)\, P(e)$
      $P(G \mid k, e) = \frac{\prod_{r<s} e_{rs}! \prod_r e_{rr}!! \prod_{ir} k_i^r!}{\prod_r e_r! \prod_{rs} \prod_{i<j} G_{ij}^{rs}! \prod_i G_{ii}^{rs}!!}$
      $P(k \mid e) = \prod_r \left(\!\!\binom{N}{e_r}\!\!\right)^{-1}$
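The degree prior $P(k \mid e)$ is a product of inverse multiset coefficients, where the multiset coefficient of $N$ and $e_r$ counts the ways to distribute $e_r$ indistinguishable half-edges among $N$ nodes and equals $\binom{N + e_r - 1}{e_r} = \frac{(e_r + N - 1)!}{e_r!\,(N-1)!}$. A small sketch (function names are hypothetical) that evaluates it and checks the factorial form:

```python
import math

def multiset_coefficient(n, k):
    """Number of multisets of size k drawn from n items."""
    return math.comb(n + k - 1, k)

def log_prob_degrees(N, e):
    """log P(k | e) = -sum_r log multiset(N, e_r), for half-edge counts e_r."""
    return -sum(math.log(multiset_coefficient(N, e_r)) for e_r in e)

# Check against the factorial form (e_r + N - 1)! / (e_r! (N - 1)!).
N, e_r = 3, 2
factorial_form = (math.factorial(e_r + N - 1)
                  // (math.factorial(e_r) * math.factorial(N - 1)))
coeff = multiset_coefficient(N, e_r)
log_pk = log_prob_degrees(3, [2, 2])
```

The factorial form is what makes the $\prod_r e_r!$ in the denominator of $P(G \mid k, e)$ combine with $P(k \mid e)$ into the factor $\prod_r (N-1)!/(e_r + N - 1)!$.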

  15. Overlap vs. non-overlap. Social “ego” network (from Facebook). [Figure: network with four panels of $n_k$ versus $k$.] $B = 4$, $\Lambda \simeq 0.053$

  16. Overlap vs. non-overlap. Social “ego” network (from Facebook). [Figure: two fits side by side, each with four panels of $n_k$ versus $k$.] $B = 4$, $\Lambda \simeq 0.053$; $B = 5$, $\Lambda = 1$

  17. Overlap vs. non-overlap. [Figure: $\Sigma/E$ versus $\mu$; curves: $B = 4$ (overlapping), $B = 15$ (nonoverlapping).]

  18. SBM with layers. T.P.P., Phys. Rev. E 92, 042807 (2015)
      ◮ Fairly straightforward. Easily combined with degree-correction, overlaps, etc.
      ◮ Edge probabilities are in general different in each layer.
      ◮ Node memberships can move or stay the same across layers.
      ◮ Works as a general model for discrete as well as discretized edge covariates.
      ◮ Works as a model for temporal networks.
      [Figure: layers $l = 1$, $l = 2$, $l = 3$ and the collapsed graph.]

  19. SBM with layers
      Edge covariates:
      $P(\{A_l\} \mid \{\theta\}) = P(A_c \mid \{\theta\}) \prod_{r \le s} \frac{\prod_l m_{rs}^{l}!}{m_{rs}!}$
      Independent layers:
      $P(\{A_l\} \mid \{\{\theta\}_l\}, \{\phi\}, \{z_{il}\}) = \prod_l P(A_l \mid \{\theta\}_l, \{\phi\})$
      Embedded models can be of any type: traditional, degree-corrected, overlapping.
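In the edge-covariate version, the layered likelihood is the collapsed-graph likelihood times a combinatorial factor $\prod_{r \le s} \prod_l m_{rs}^{l}! / m_{rs}!$. The sketch below computes the logarithm of that factor only. It assumes that $m_{rs} = \sum_l m_{rs}^{l}$ is the edge count between groups $r$ and $s$ in the collapsed graph; the data layout (one dictionary per layer) and the function name are choices made here for illustration.

```python
import math

def log_layer_factor(m_layers):
    """log of prod_{r<=s} (prod_l m^l_rs!) / m_rs!.

    m_layers[l][(r, s)] is the number of edges between groups r <= s in layer l.
    """
    totals = {}
    log_numerator = 0.0
    for counts in m_layers:
        for rs, m in counts.items():
            totals[rs] = totals.get(rs, 0) + m       # collapsed count m_rs
            log_numerator += math.lgamma(m + 1)       # log m^l_rs!
    log_denominator = sum(math.lgamma(m + 1) for m in totals.values())
    return log_numerator - log_denominator

# Hypothetical counts for two layers and two group pairs.
m_layers = [{(0, 0): 1, (0, 1): 2}, {(0, 0): 1, (0, 1): 0}]
log_factor = log_layer_factor(m_layers)
```

For these counts the factor is $(1!\,1!/2!) \times (2!\,0!/2!) = 1/2$.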

  20. Layer information can reveal hidden structure


  22. ... but it can also hide structure! [Figure: schematic of one network mapped to $C$ layers; NMI versus $c$, with curves for the collapsed network and for $E/C$ = 500, 100, 40, 20, 15, 12, 10, 5.]

  23. Model selection
      Null model: collapsed (aggregated) SBM + fully random layers
      $P(\{G_l\} \mid \{\theta\}, \{E_l\}) = P(G_c \mid \{\theta\}) \times \frac{\prod_l E_l!}{E!}$
      (we can also aggregate layers into larger layers)
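The extra factor in the null model, $\prod_l E_l! / E!$, is the inverse of a multinomial coefficient: the probability of one particular assignment of the $E$ edges to layers of sizes $\{E_l\}$ when all such assignments are equally likely. A minimal sketch of its logarithm, assuming $E = \sum_l E_l$ (the function name is hypothetical):

```python
import math

def log_random_layer_factor(E_layers):
    """log of (prod_l E_l!) / E!, with E the total number of edges."""
    E = sum(E_layers)
    return sum(math.lgamma(e + 1) for e in E_layers) - math.lgamma(E + 1)

log_null = log_random_layer_factor([2, 1])
```

For layer sizes 2 and 1 this gives $2!\,1!/3! = 1/3$, i.e. one of the three ways to choose which edge goes alone into the second layer.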

  24. Model selection. Example: Social network of physicians. $N = 241$ physicians. T.P.P., Phys. Rev. E 92, 042807 (2015)
      Survey questions:
      ◮ “When you need information or advice about questions of therapy where do you usually turn?”
      ◮ “And who are the three or four physicians with whom you most often find yourself discussing cases or therapy in the course of an ordinary week – last week for instance?”
      ◮ “Would you tell me the first names of your three friends whom you see most often socially?”

  25. Model selection Example: Social network of physicians


  27. Model selection. Example: Social network of physicians. [Figure annotations: $\Lambda = 1$; $\log_{10} \Lambda \approx -50$.]

  28. Example: Brazilian chamber of deputies. Voting network between members of congress (1999-2006). [Figure: network of deputies with groups labelled by party.]

  29. Example: Brazilian chamber of deputies. Voting network between members of congress (1999-2006). [Figure: two layers, 1999-2002 and 2003-2006, with groups labelled by party and marked as government or opposition.]

  30. Example: Brazilian chamber of deputies. Voting network between members of congress (1999-2006). [Figure: two layers, 1999-2002 and 2003-2006, with groups labelled by party and marked as government or opposition. Annotations: $\log_{10} \Lambda \approx -111$; $\Lambda = 1$.]

  31. Real-valued edges?
      Idea: Layers $\{\ell\}$ → bins of edge values!
      $P(\{G_x\} \mid \{\theta\}_{\{\ell\}}, \{\ell\}) = P(\{G_l\} \mid \{\theta\}_{\{\ell\}}, \{\ell\}) \times \prod_l \rho(x_l)$
      Bayesian posterior → Number (and shape) of bins
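The binning idea can be sketched as a preprocessing step: each real-valued edge is assigned to the layer whose bin contains its value. The example below uses fixed, hand-picked bin boundaries purely for illustration; on the slide the number and shape of the bins are inferred from the posterior, which this sketch does not attempt. The function name and the edge-list format are assumptions.

```python
import bisect

def bin_edges_into_layers(weighted_edges, boundaries):
    """Assign edges (u, v, x) to layers according to sorted bin boundaries.

    Values below boundaries[0] go to layer 0, values in
    [boundaries[k-1], boundaries[k]) go to layer k, and so on.
    """
    layers = [[] for _ in range(len(boundaries) + 1)]
    for u, v, x in weighted_edges:
        layer = bisect.bisect_right(boundaries, x)
        layers[layer].append((u, v))
    return layers

# Hypothetical edge values, e.g. correlations in [0, 1].
weighted_edges = [(0, 1, 0.1), (1, 2, 0.5), (0, 2, 0.9)]
layers = bin_edges_into_layers(weighted_edges, [0.3, 0.7])
```

The resulting list of per-layer edge lists could then be treated like any other layered network.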

  32. Movement between groups... [Figure: groups of deputies labelled by party.]

  33. Networks with metadata. Many network datasets contain metadata: annotations that go beyond the mere adjacency between nodes. Often assumed as indicators of topological structure, and used to validate community detection methods. A.k.a. “ground-truth”.
