Distance Matters: Geo-social Metrics for Online Social Networks


Distance Matters: Geo-social Metrics for Online Social Networks Salvatore Scellato Computer Laboratory, University of Cambridge Joint work with: Cecilia Mascolo , Mirco Musolesi, Vito Latora 3rd Workshop on Online Social Networks Boston, 22


SLIDE 1

Distance Matters: Geo-social Metrics for Online Social Networks

Salvatore Scellato

Computer Laboratory, University of Cambridge

Joint work with: Cecilia Mascolo, Mirco Musolesi, Vito Latora

3rd Workshop on Online Social Networks

Boston, 22 June 2010

SLIDE 2

Location, location, location. And social networks.

Plethora of new services: increasingly important, excitingly new.

SLIDE 3

Information, social structure and space.

Geography may shape social structures and affect information flows.

SLIDE 4

Put people on a map and draw their social ties across space.

We need new tools to model these networks.

SLIDE 5

Distance matters.

Probability of friendship decreases with distance.

SLIDE 6

Interesting questions...

  • Can we discriminate between users according to their attitude towards long-range ties?
  • How geographically close are clusters of friends?
  • How is information spreading across space over social links?
  • Can we improve real systems by exploiting geographic information in social networks?

Flickr: Oberazzi

SLIDE 7

Geographic Social Network

Given a graph G=(N,K) and the geographic location of the nodes:

  • Place all nodes in a 2D metric space, adopting great-circle distance on the Earth.
  • Assign a weight to each edge equal to the geographic distance between the two nodes.

[Figure: example graph with edges labeled 1,070 km, 1,120 km, and 210 km.]
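The edge-weighting step can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the haversine formula as the great-circle distance, a mean Earth radius of 6,371 km, and hypothetical node names and coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def weight_edges(locations, edges):
    """Map each edge (u, v) to the geographic distance between its endpoints."""
    return {(u, v): great_circle_km(locations[u], locations[v]) for u, v in edges}


# Hypothetical users placed in London, Paris, and Rome.
locations = {
    "A": (51.5074, -0.1278),
    "B": (48.8566, 2.3522),
    "C": (41.9028, 12.4964),
}
edges = [("A", "B"), ("B", "C")]
weights = weight_edges(locations, edges)
```

Each entry of `weights` is the link length in km for one social tie, which is the only geographic input the metrics on the following slides need.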

SLIDE 8

Geo-social metrics

  • Node locality: how close are the neighbors of a given node to the node itself?
  • Geographic clustering coefficient: how spatially inter-connected are the neighbors of a given node?

[Figure: example network with User A, User B, User C, and User D.]

SLIDE 9

How close are the neighbors of a given node to the node itself?

Our aim is to:

  • Highlight only extremely short-range social connections.
  • Normalize this measure for nodes with various degrees.
  • Allow networks at different geographic scales to be compared.

Node locality

[Equation annotations: link length, network scaling factor, node neighborhood, node degree.]
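The four labels (link length, network scaling factor, node neighborhood, node degree) are consistent with averaging an exponentially decayed link length over a node's neighbors: NL_i = (1/k_i) Σ_{j∈Γ_i} exp(−l_ij/β). The sketch below assumes that form. The function name, the sample lengths, and the choice to return 0 for isolated nodes are illustrative assumptions.

```python
import math


def node_locality(link_lengths_km, beta_km):
    """Assumed form: mean of exp(-l / beta) over the links of one node.

    link_lengths_km: lengths of the links from the node to each neighbor.
    beta_km:         network scaling factor, in the same unit.
    """
    if not link_lengths_km:
        return 0.0  # assumption: an isolated node has locality 0
    decayed = [math.exp(-length / beta_km) for length in link_lengths_km]
    return sum(decayed) / len(decayed)


# Hypothetical users: one with nearby friends, one with distant friends.
local_user = node_locality([5.0, 12.0, 30.0], beta_km=100.0)
global_user = node_locality([900.0, 2500.0, 6000.0], beta_km=100.0)
```

Under this form the value lies between 0 and 1, short links contribute close to 1, long links contribute close to 0, and dividing by the degree keeps nodes with different degrees comparable.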

SLIDE 10

How spatially inter-connected are the neighbors of a given node?

Our aim is to:

  • Generalize the standard clustering coefficient.
  • Highlight only extremely short-range social triangles.
  • Allow networks at different geographic scales to be compared.

Geographic clustering coefficient

[Equation annotations: triangle link lengths, network scaling factor, node neighborhood, possible triangles, triangle size.]

[Figure: triangle formed by nodes i, j, and k.]
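The labels suggest a clustering coefficient in which every existing triangle is weighted by a decaying function of its geographic size. The sketch below assumes one such form: the triangle size is the longest of its three link lengths, each triangle contributes exp(−size/β), and the sum is divided by the number of possible triangles among the node's neighbors. The data layout and all names are hypothetical.

```python
import math
from itertools import combinations


def geographic_clustering(node, adjacency, link_km, beta_km):
    """Distance-weighted clustering coefficient of `node` (assumed form).

    adjacency: dict mapping each node to the set of its neighbors.
    link_km:   dict mapping frozenset({u, v}) to the link length in km.
    beta_km:   network scaling factor, in km.
    """
    neighbors = sorted(adjacency[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # no possible triangles
    total = 0.0
    for j, m in combinations(neighbors, 2):
        if m in adjacency[j]:  # the triangle node-j-m exists
            size = max(link_km[frozenset((node, j))],
                       link_km[frozenset((node, m))],
                       link_km[frozenset((j, m))])
            total += math.exp(-size / beta_km)
    possible = k * (k - 1) / 2  # unordered pairs of neighbors
    return total / possible


# Hypothetical triangle i-j-k with link lengths of 10, 20, and 50 km.
adjacency = {"i": {"j", "k"}, "j": {"i", "k"}, "k": {"i", "j"}}
link_km = {
    frozenset(("i", "j")): 10.0,
    frozenset(("i", "k")): 20.0,
    frozenset(("j", "k")): 50.0,
}
gc_i = geographic_clustering("i", adjacency, link_km, beta_km=100.0)
```

When every link length is zero, each weight equals 1 and the value reduces to the standard clustering coefficient, which matches the stated aim of generalizing it.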

SLIDE 11

Scaling factor

The scaling factor β allows us to compare geo-social metrics across networks with different geographic scales. For example, if β is chosen so that rescaling all link lengths rescales β by the same factor, the geo-social metrics are unaffected.

[Figure: Graph 1, in which every link has length k, and Graph 2, in which every link has length 2k.]
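The two-graph example can be checked numerically. The sketch below assumes an exponential-decay node locality and, purely for illustration, sets β to the mean link length of each graph. That choice of β is a hypothetical one that rescales with the lengths, not necessarily the one used in the talk.

```python
import math


def node_locality(link_lengths, beta):
    """Assumed form: mean of exp(-l / beta) over the links of one node."""
    return sum(math.exp(-length / beta) for length in link_lengths) / len(link_lengths)


def mean_length_beta(all_link_lengths):
    """Hypothetical choice of beta: the mean link length of the network."""
    return sum(all_link_lengths) / len(all_link_lengths)


k = 150.0  # arbitrary link length in km
graph1 = [k, k, k, k]                  # four links of length k
graph2 = [2 * k, 2 * k, 2 * k, 2 * k]  # the same graph with every length doubled

nl1 = node_locality(graph1, mean_length_beta(graph1))
nl2 = node_locality(graph2, mean_length_beta(graph2))
```

Because every ratio l/β is identical in the two graphs, `nl1` and `nl2` coincide, while a fixed β would give the doubled graph a lower locality.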

SLIDE 12

Dataset collection

Online Social Network   Collection method            Sampling            Location information
[image]                 Public API                   Complete            GPS
[image]                 Public API                   Snowball crawling   GPS
[image]                 Public API + HTML scraping   Snowball crawling   Text-based
[image]                 Public API                   Snowball crawling   GPS or text-based

SLIDE 13

Yahoo Geocoding API

SLIDE 14

Problems with geocoding

"Hilton Paris" vs. "Paris Hilton"

Keep only results with city-level accuracy.

SLIDE 15

Dataset properties

Network       Nodes     Edges
BrightKite    54,190    213,668
FourSquare    58,424    351,216
LiveJournal   992,886   29,645,952
Twitter       409,093   182,986,352

SLIDE 16

Social Metrics

Network       Degree   Clustering
BrightKite    7.88     0.181
FourSquare    12.02    0.253
LiveJournal   29.85    0.185
Twitter       447.45   0.207

SLIDE 17

Geographic Properties

Network       Average user distance   Average link length
BrightKite    5,683 km                2,041 km
FourSquare    4,312 km                1,296 km
LiveJournal   6,142 km                2,727 km
Twitter       6,087 km                5,117 km

SLIDE 18

Social Link Geographic Distance

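[Figure: distribution of social link geographic distance, with one panel each for BrightKite, Twitter, LiveJournal, and FourSquare. Panel annotations give the share of links below 100 km: 36%, 58%, 32%, and 4%.]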
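The "below 100 km" figures report the share of social links shorter than a distance threshold. A minimal sketch of that statistic, assuming link lengths are already available in km; the function name and the sample lengths are hypothetical.

```python
def fraction_below(link_lengths_km, threshold_km=100.0):
    """Share of links strictly shorter than the threshold (0.0 if there are none)."""
    if not link_lengths_km:
        return 0.0
    short = sum(1 for length in link_lengths_km if length < threshold_km)
    return short / len(link_lengths_km)


# Hypothetical link lengths in km for a small network.
sample_lengths = [10.0, 50.0, 150.0, 2000.0]
share = fraction_below(sample_lengths)
```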

SLIDE 19

Geo-social Metrics

Network       Node locality   Clustering   Geographic clustering
BrightKite    0.82            0.181        0.165
FourSquare    0.85            0.256        0.237
LiveJournal   0.71            0.185        0.146
Twitter       0.49            0.207        0.108

SLIDE 20

Node Locality Distributions

[Figure: node locality distributions, with one panel each for BrightKite, Twitter, LiveJournal, and FourSquare.]

SLIDE 21

Geographic Clustering Distributions

[Figure: geographic clustering distributions, with one panel each for BrightKite, Twitter, LiveJournal, and FourSquare.]

SLIDE 22

Findings

  • Location-based services (LBSs) foster user interaction over shorter distances.
  • LBSs have many users with a predominance of local ties and local triangles.
  • Twitter does not exhibit this ‘hyperlocal’ behaviour.
  • In general, users with higher degrees appear more global (with the exception of Twitter).

SLIDE 23

Conclusions and future work

  • We have shown how social networks with geographic information can be represented and studied.
  • We have defined two new geo-social metrics that take into account both social connections and geographic distance: node locality and geographic clustering coefficient.
  • We have collected four large-scale online datasets and applied our metrics to their structure, highlighting differences between purely location-based social network services and other online social communities.
  • Future work: information propagation over space on Twitter, combining user mobility with geo-social metrics, and a general geographic generative model for OSNs.

SLIDE 24

Thanks! Questions?

Salvatore Scellato

Email: salvatore.scellato@cl.cam.ac.uk

Web: http://www.cl.cam.ac.uk/~ss824/

Twitter: www.twitter.com/thetarro

Flickr: sean dreilinger