
Network Curvature, friendship paradox and dispersion. Rik Sarkar



  1. Network Curvature, friendship paradox and dispersion Rik Sarkar

  2. Recap: Hyperbolic distances • Points in a disk • Shortest paths along circular curves bent toward the center • Similar to internet paths being bent toward the core • Distances look cramped close to the boundaries
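The "cramped near the boundary" effect can be checked numerically. The sketch below assumes the disk in the recap is the Poincaré disk model (the slides do not name the model) and uses its standard distance formula; the function name `poincare_distance` and the sample points are illustrative only.

```python
import math

def poincare_distance(p, q):
    """Hyperbolic distance between two points strictly inside the unit disk.

    Assumes the Poincare disk model:
    d(p, q) = arcosh(1 + 2|p - q|^2 / ((1 - |p|^2)(1 - |q|^2))).
    """
    (x1, y1), (x2, y2) = p, q
    sq_diff = (x1 - x2) ** 2 + (y1 - y2) ** 2
    denom = (1 - (x1 * x1 + y1 * y1)) * (1 - (x2 * x2 + y2 * y2))
    return math.acosh(1 + 2 * sq_diff / denom)

# Two pairs of points with the same Euclidean separation (0.1):
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
print(near_center, near_boundary)
```

The pair near the boundary is several times farther apart in hyperbolic distance than the pair near the center, even though both pairs look equally close in the picture.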

  3. Internet emulates hyperbolic metrics • Shavitt, Tankel. ACM ToN 2008.

  4. Hyperbolic model for networks • People connect to popular “central” nodes • Preferential attachment. Hubs. Cause small diameters. • People connect to other “similar” nodes • Similar in location, or interests, or communities • Similar means small distance in some measure • Preferential attachment does not model this well • Cannot model the clustering properties

  5. Popularity/similarity model • Put all nodes on the plane at polar coordinates (r, θ) • Popularity: distance from the center • Like preferential attachment, earlier nodes are popular • If a node appears at time t, its distance from the center is r = ln t • Interests/features for similarity: represented by the angle θ • Two nodes a, b are similar if |θ_a - θ_b| is small.

  6. Edge attachments • A new node appears at time t • Sets r = ln t • Sets θ = random • It connects to the k nearest nodes in hyperbolic distance • Central nodes are older and higher degree
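The growth process on the last two slides can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes curvature -1 and the exact hyperbolic law of cosines for distances in polar coordinates, and the names `hyperbolic_distance` and `grow_network` are hypothetical.

```python
import math
import random
from collections import Counter

def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between polar points (r, theta) in the hyperbolic plane.

    Assumption: curvature -1, hyperbolic law of cosines.
    """
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(theta1 - theta2))
    return math.acosh(max(1.0, arg))  # clamp guards against rounding below 1

def grow_network(n, k, seed=0):
    """Add nodes at times t = 1..n; each links to its k nearest older nodes."""
    rng = random.Random(seed)
    coords = []  # coords[i] = (r, theta) of the node born at time i + 1
    edges = []   # (older node, newer node)
    for t in range(1, n + 1):
        r, theta = math.log(t), rng.uniform(0.0, 2.0 * math.pi)
        nearest = sorted(
            range(len(coords)),
            key=lambda j: hyperbolic_distance(r, theta, *coords[j]),
        )[:k]
        edges.extend((j, t - 1) for j in nearest)
        coords.append((r, theta))
    return coords, edges

coords, edges = grow_network(n=50, k=2)
degree = Counter(v for edge in edges for v in edge)
print(degree.most_common(5))  # (node index, degree) pairs, highest degree first
```

Printing the highest-degree nodes is a quick way to compare the output with the claim that central (older) nodes end up with higher degree.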

  7. Properties • Creates power law degree distribution • Creates strong clustering • Different from pref. attachment • More realistic for real networks

  8. Modeling the internet • A suitable hyperbolic embedding gives very good model of connection probabilities • Similar results in other power law networks

  9. Actor networks • Does not work equally well

  10. • Popularity vs Similarity in Growing Networks • Papadopoulos et al. Nature 2012.

  11. Hyperbolic geometry • Useful in modeling metrics with exponential growth (number of nodes within distance x) • E.g. balanced binary tree • Many parameters may have such properties • Position in a hierarchy • Topological types of paths in a domain • Subsets of items
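To make "exponential growth" concrete, the sketch below counts the nodes within distance x of the root of a balanced binary tree. As a contrast that is not in the slides, it also counts integer grid points within Manhattan distance x of the origin, which grows only quadratically. Both function names are hypothetical.

```python
def tree_nodes_within(x, branching=2):
    """Nodes at distance <= x from the root of a complete tree."""
    return sum(branching ** level for level in range(x + 1))

def grid_points_within(x):
    """Integer points (a, b) with |a| + |b| <= x (added here for contrast)."""
    return 2 * x * (x + 1) + 1

for x in (1, 5, 10, 20):
    print(x, tree_nodes_within(x), grid_points_within(x))
```

The tree count roughly doubles with every extra step, which is the kind of growth a hyperbolic metric can accommodate and a flat (Euclidean) one cannot.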

  12. Few other things

  13. Friendship paradox • Your friends have more friends than you do! • Are you less social than others?

  14. Friendship paradox • The paradox: • If you ask everyone to report their degrees, you get the average degree • If you ask everyone to report the average degree of their friends and take the average over everyone, you get more than the overall average degree! • Most of us have some popular friends (hence they are popular) • If you pick a random friend of a random person (a random edge), this friend is relatively likely to be popular, since popular nodes have more edges

  15. Friendship paradox • Average degree of nodes: • A node with degree d(v) contributes d(v) once • Average degree of a friend: • Each person picks a friend and counts that friend's degree • A node with degree d(v) can be picked by d(v) people, so it contributes d(v) up to d(v) times, for a total contribution of d(v)^2 • A few nodes with relatively high d(v) can skew the count • https://en.wikipedia.org/wiki/Friendship_paradox • S. L. Feld, Why your friends have more friends than you do, American Journal of Sociology, 1991
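The counting argument can be checked on a small example. The sketch below uses a star graph chosen purely for illustration: one hub and four leaves. It computes the plain average degree, the average of each person's friends' mean degree, and the degree at the end of a random edge, which is sum of d(v)^2 divided by sum of d(v). The function name `friendship_paradox_stats` is hypothetical.

```python
from statistics import mean

def friendship_paradox_stats(adj):
    """Return (mean degree, mean of friends' mean degree, mean edge-end degree)."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    mean_degree = mean(degree.values())
    # Each person reports the average degree of their own friends.
    mean_friend_degree = mean(
        mean(degree[w] for w in nbrs) for nbrs in adj.values() if nbrs
    )
    # A node of degree d sits at the end of d edges, so it is counted d times.
    edge_end_degree = sum(d * d for d in degree.values()) / sum(degree.values())
    return mean_degree, mean_friend_degree, edge_end_degree

star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
print(friendship_paradox_stats(star))  # approximately (1.6, 3.4, 2.5)
```

The hub is the only popular node, yet every leaf reports it as a friend, so both friend-based averages exceed the plain average degree of 1.6.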

  16. Identify spouses or romantic partners • Suppose you have the Facebook graph • Only the graph and nothing else • Can you identify which edges correspond to spouses or romantic partners?

  17. Identify spouses or romantic partners

  18. Identify spouses or romantic partners • Tie strengths are important • Romantic ties tend to be of high strength, more likely to transmit information • Do you expect romantic links to have high embeddedness (number/fraction of common friends)?

  19. • People have clusters of friend circles • Work, school, college, hobbies • Edges in these have high embeddedness, even if they are not strong friends

  20. • Spouses usually know some friends in each other's different circles • The edge does not have high embeddedness • Compared to links in groups such as school/college • But, it has a dispersed structure: • There are several mutual friends, but the mutual friends are not well connected among themselves

  21. Dispersion • Dispersion between u, v • Notation: • C(u,v): common friends of u and v • $G_u$: subgraph induced by u and all neighbors of u • $d_{uv}$: distance measured in $G_u - \{u, v\}$, i.e. without using u or v • $\mathrm{disp}(u,v) = \sum_{s,t \in C(u,v)} d_{uv}(s,t)$

  22. Dispersion • $\mathrm{disp}(u,v) = \sum_{s,t \in C(u,v)} d_{uv}(s,t)$ • Increases with more mutual friends • Increases when these friends are far apart in the graph • It is possible to use other distance measures • Good results with d(s,t) = 1 if there is no direct edge between s and t, 0 otherwise
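A minimal sketch of the dispersion sum, using the simple 0/1 distance from the slide (1 if two mutual friends are not directly connected, 0 otherwise). It assumes an undirected graph stored as a dict mapping each node to its set of neighbors, and sums over unordered pairs of mutual friends; the function name and the toy graph are illustrative.

```python
from itertools import combinations

def dispersion(adj, u, v):
    """Count pairs of mutual friends of u and v with no direct edge between them."""
    common = (adj[u] & adj[v]) - {u, v}
    return sum(
        1 for s, t in combinations(sorted(common), 2) if t not in adj[s]
    )

# Toy graph: u and v share friends a, b, c; only a and b know each other.
adj = {
    "u": {"v", "a", "b", "c"},
    "v": {"u", "a", "b", "c"},
    "a": {"u", "v", "b"},
    "b": {"u", "v", "a"},
    "c": {"u", "v"},
}
print(dispersion(adj, "u", "v"))  # 2: the pairs (a, c) and (b, c)
```

With this 0/1 distance, restricting to $G_u - \{u, v\}$ changes nothing, because a direct edge between two mutual friends never passes through u or v. Other distance measures would need the restricted subgraph built explicitly.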

  23. Normalized dispersion • Use norm(u,v) = disp(u,v)/embed(u,v) • 48% accuracy • Apply recursively, to give more weight to nodes that themselves have high dispersion • Gives 50.5% accuracy • 60% accuracy for married couples • High accuracy considering that each person has hundreds of friends • Works better than the usual machine learning on features such as posts, visits and photos • Best results with a combination of features

  24. • Backstrom and Kleinberg. Romantic partnerships and dispersion of social ties, ACM CSCW 2014
