Distance Matters: Geo-social Metrics for Online Social Networks


  1. Distance Matters: Geo-social Metrics for Online Social Networks. Salvatore Scellato, Computer Laboratory, University of Cambridge. Joint work with: Cecilia Mascolo, Mirco Musolesi, Vito Latora. 3rd Workshop on Online Social Networks, Boston, 22 June 2010.

  2. Location, location, location. And social networks. Plethora of new services: increasingly important, excitingly new.

  3. Information, social structure and space. Geography may shape social structures and affect information flows.

  4. Put people on a map and social ties across space. We need new tools to model these networks.

  5. Distance matters. Probability of friendship decreases with distance.

  6. Interesting questions...
     • Can we discriminate between users according to their attitude towards long-range ties?
     • How geographically close are clusters of friends?
     • How is information spreading across space over social links?
     • Can we improve real systems by exploiting geographic information in social networks?
     Image credit: Flickr, Oberazzi.

  7. Geographic Social Network. Given a graph G=(N,K) and the geographic location of the nodes:
     • Place all nodes in a 2D metric space adopting great-circle distance on the Earth.
     • Assign a weight to each edge equal to the geographic distance between the two nodes.
     [Figure: example network with edge lengths of 1,120 km, 1,070 km and 210 km.]
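The two steps on this slide can be sketched in a few lines of Python. This is only an illustration: the haversine formula is one common way to compute great-circle distance, the Earth radius is an assumed constant, and the user names and coordinates are hypothetical values that do not come from the talk.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant, not from the slides


def great_circle_km(a, b):
    """Haversine great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def weight_edges(locations, edges):
    """Return {(u, v): length_km} for every edge, using the node coordinates."""
    return {(u, v): great_circle_km(locations[u], locations[v]) for u, v in edges}


# Hypothetical users and approximate city coordinates, for illustration only.
locations = {
    "alice": (52.2053, 0.1218),    # Cambridge, UK
    "bob": (51.5074, -0.1278),     # London
    "carol": (42.3601, -71.0589),  # Boston
}
edges = [("alice", "bob"), ("alice", "carol")]
weights = weight_edges(locations, edges)
```

The resulting `weights` dictionary plays the role of the edge weights described on the slide: one geographic length per social link.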

  8. Geo-social metrics.
     • Node locality: how close are the neighbors of a given node to the node itself?
     • Geographic clustering coefficient: how spatially inter-connected are the neighbors of a given node?
     [Figure: example with User A, User B, User C and User D.]

  9. Node locality. How close are the neighbors of a given node to the node itself? Our aim is to:
     • Highlight only extremely short-range social connections.
     • Normalize this measure for nodes with various degrees.
     • Allow networks at different geographic scales to be compared.
     [Formula not preserved in the transcript; its annotated terms are: link length, network scaling factor, node degree, node neighborhood.]
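The formula itself did not survive in this transcript, so the following is a sketch under an assumption rather than the authors' definition. It assumes node locality is the average, over a node's neighborhood, of an exponentially decaying function of link length, exp(-l/β). That form is consistent with the annotated terms (link length, scaling factor β, node degree, node neighborhood) and with the three stated aims, but the exact expression should be checked against the paper. The function name and the numbers are hypothetical.

```python
import math


def node_locality(link_lengths_km, beta_km):
    """Mean of exp(-l / beta) over a node's links; 0.0 for an isolated node.

    Assumed form: each neighbor contributes a value in (0, 1] that decays
    with link length, and dividing by the degree normalizes across nodes.
    """
    degree = len(link_lengths_km)
    if degree == 0:
        return 0.0
    return sum(math.exp(-l / beta_km) for l in link_lengths_km) / degree


# Hypothetical link lengths (km) and scaling factor, for illustration only.
beta = 1000.0
local_user = node_locality([5.0, 12.0, 30.0], beta)
global_user = node_locality([800.0, 3000.0, 9000.0], beta)
```

Under this assumed form a user whose friends all live nearby scores close to 1, while a user with mostly long-range ties scores close to 0, regardless of how many friends each has.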

  10. Geographic clustering coefficient. How spatially inter-connected are the node’s neighbours? Our aim is to:
      • Generalise the standard clustering coefficient.
      • Highlight only extremely short-range social triangles.
      • Allow networks at different geographic scales to be compared.
      [Formula not preserved in the transcript; its annotated terms are: triangle size, triangle link lengths, network scaling factor, possible triangles, node neighborhood. The figure labels the triangle vertices i, j and k.]
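As the formula is missing from this transcript, here is a hedged sketch of one way to build such a coefficient. It assumes an undirected graph, takes the "size" of a triangle to be its longest link, weights each triangle by exp(-size/β), and divides by the number of possible triangles around the node. These choices match the annotated terms but are assumptions; the function name and data layout are hypothetical.

```python
import math
from itertools import combinations


def geographic_clustering(node, neighbors, length_km, beta_km):
    """Distance-weighted clustering coefficient of `node` (assumed form).

    neighbors: dict mapping each node to the set of its neighbors.
    length_km: dict mapping frozenset({u, v}) to the link length in km.
    """
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    total = 0.0
    for j, h in combinations(sorted(nbrs), 2):
        if h in neighbors[j]:  # j and h are linked, so (node, j, h) is a triangle
            size = max(
                length_km[frozenset((node, j))],
                length_km[frozenset((node, h))],
                length_km[frozenset((j, h))],
            )
            total += math.exp(-size / beta_km)
    possible = k * (k - 1) / 2  # number of neighbor pairs
    return total / possible


# Hypothetical toy graph: a-b-c form a small local triangle, d is a distant friend of a.
neighbors = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
length_km = {
    frozenset(("a", "b")): 10.0,
    frozenset(("a", "c")): 20.0,
    frozenset(("b", "c")): 15.0,
    frozenset(("a", "d")): 2000.0,
}
gc_a = geographic_clustering("a", neighbors, length_km, beta_km=1000.0)
```

Because every triangle weight is at most 1, this quantity never exceeds the standard clustering coefficient, and it approaches it as β grows very large, which is one sense in which it generalises the standard measure.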

  11. Scaling factor. The scaling factor β allows us to compare geo-social metrics across networks with different scales. For example, if β is chosen so that rescaling all lengths rescales β by the same factor, the geo-social metrics are not affected. [Figure: Graph 1 with link lengths k, and Graph 2 with the same structure and link lengths 2k.]
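The invariance claim can be checked numerically. The sketch below assumes a locality measure of the form mean of exp(-l/β) and takes β to be the mean link length; both are assumptions for illustration, since the slide only requires that β rescales along with the lengths and does not say which quantity the authors used. The link lengths are hypothetical.

```python
import math


def locality(lengths, beta):
    """Assumed locality form: mean of exp(-l / beta) over the given link lengths."""
    return sum(math.exp(-l / beta) for l in lengths) / len(lengths)


def mean(values):
    return sum(values) / len(values)


k = 100.0                         # arbitrary unit length, for illustration
graph1 = [k, k, 2 * k, 3 * k]     # hypothetical link lengths
graph2 = [2 * l for l in graph1]  # same graph with every length doubled

beta1 = mean(graph1)              # beta tied to the network's own scale
beta2 = mean(graph2)              # doubles along with the lengths

# With a rescaled beta the metric is unchanged; with a fixed beta it is not.
same = math.isclose(locality(graph1, beta1), locality(graph2, beta2))
changed = not math.isclose(locality(graph1, beta1), locality(graph2, beta1))
```

The metric depends on lengths only through the ratio l/β, so any β that scales linearly with the lengths would give the same invariance.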

  12. Dataset collection. Columns: online social network, collection method, sampling, location information. The network names are not preserved in the transcript; rows are listed in slide order:
      • Public API; complete; GPS.
      • Public API; snowball crawling; GPS.
      • Public API + HTML scraping; snowball crawling; text-based.
      • Public API; snowball crawling; GPS or text-based.

  13. Yahoo Geocoding API.

  14. Problems with geocoding: "Hilton Paris" versus "Paris Hilton". Keep only results that are accurate at city level.

  15. Dataset properties.
      • BrightKite: 54,190 nodes, 213,668 edges.
      • FourSquare: 58,424 nodes, 351,216 edges.
      • LiveJournal: 992,886 nodes, 29,645,952 edges.
      • Twitter: 409,093 nodes, 182,986,352 edges.

  16. Social Metrics.
      • BrightKite: degree 7.88, clustering 0.181.
      • FourSquare: degree 12.02, clustering 0.253.
      • LiveJournal: degree 29.85, clustering 0.185.
      • Twitter: degree 447.45, clustering 0.207.

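  17. Geographic Properties.
      • BrightKite: average link length 2,041 km, average user distance 5,683 km.
      • FourSquare: average link length 1,296 km, average user distance 4,312 km.
      • LiveJournal: average link length 2,727 km, average user distance 6,142 km.
      • Twitter: average link length 5,117 km, average user distance 6,087 km.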

  18. Social Link Geographic Distance. Share of social links shorter than 100 km: BrightKite 36%, FourSquare 58%, LiveJournal 32%, Twitter 4%.

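  19. Geo-social Metrics.
      • BrightKite: geographic clustering 0.165, node locality 0.82, clustering 0.181.
      • FourSquare: geographic clustering 0.237, node locality 0.85, clustering 0.256.
      • LiveJournal: geographic clustering 0.146, node locality 0.71, clustering 0.185.
      • Twitter: geographic clustering 0.108, node locality 0.49, clustering 0.207.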

  20. Node Locality Distributions. [Figure: one panel each for BrightKite, FourSquare, LiveJournal and Twitter.]

  21. Geographic Clustering Distributions. [Figure: one panel each for BrightKite, FourSquare, LiveJournal and Twitter.]

  22. Findings. Location-based services (LBSs) foster user interaction over shorter distances. LBSs have many users with a predominance of local ties and local triangles. Twitter does not exhibit this ‘hyperlocal’ behaviour. In general, users with higher degrees appear more global (with the exception of Twitter).

  23. Conclusions and future work. We have shown how social networks with geographic information can be studied and represented. We have defined two new geo-social metrics which take into account both social connections and geographic distance: node locality and geographic clustering coefficient. We have collected 4 large-scale online datasets and applied our metrics to their structure, highlighting differences between purely location-based social network services and other online social communities. Future work: information propagation over space on Twitter, combining user mobility with geo-social metrics, and a general geographic generative model for OSNs.

  24. Thanks! Questions?
      Salvatore Scellato
      Email: salvatore.scellato@cl.cam.ac.uk
      Web: http://www.cl.cam.ac.uk/~ss824/
      Twitter: www.twitter.com/thetarro
      Image credit: Flickr, sean dreilinger
