LECTURE 34: NETWORKS 1 - TEACHER: GIANNI A. DI CARO - PowerPoint PPT Presentation



slide-1
SLIDE 1

LECTURE 34: NETWORKS 1

TEACHER: GIANNI A. DI CARO

15-382 COLLECTIVE INTELLIGENCE – S19

slide-2
SLIDE 2


NETWORK SCIENCE

§ Barabasi, “Network Science”
§ Easley & Kleinberg, “Networks, Crowds, and Markets: Reasoning about a Highly Connected World”
§ Newman, “Networks”

slide-3
SLIDE 3

COMPLEX SYSTEMS AS NETWORKS

Many complex systems can be represented as networks.

Any complex system has an associated network of communication / interaction among its components:

§ Nodes = components of the complex system
§ Links = interactions between them

This and the following slides are adapted from Kristina Lerman’s slides.

slide-4
SLIDE 4

DIRECTED VS. UNDIRECTED NETWORKS

Directed

§ Directed links

  • Interaction flows one way

§ Examples

  • WWW: web pages and hyperlinks
  • Twitter follower graph
  • Animal relations, prey-predator

Undirected

§ Undirected links

  • Interactions flow both ways

§ Examples

  • Social networks: people and friendship
  • Atoms in a crystal
  • Countries in geographic maps
slide-5
SLIDE 5

HOW DO WE CHARACTERIZE NETWORKS?

§ Size

  • Number of nodes
  • Number of links

§ Degree

  • Average degree
  • Degree distribution

§ Diameter
§ Clustering coefficient
§ …

slide-6
SLIDE 6

NODE DEGREE

Undirected networks

§ Node degree: number of links to other nodes

$[k_1 = 2,\ k_2 = 3,\ k_3 = 2,\ k_4 = 1]$

§ Number of links: $L = \frac{1}{2} \sum_{i=1}^{N} k_i$
§ Average degree: $\langle k \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i = \frac{2L}{N}$

[Figure: the 4-node example graph (nodes 1 to 4), drawn once undirected and once directed]

Directed networks

§ Indegree: $[k_1^{in} = 1,\ k_2^{in} = 2,\ k_3^{in} = 0,\ k_4^{in} = 1]$
§ Outdegree: $[k_1^{out} = 1,\ k_2^{out} = 1,\ k_3^{out} = 2,\ k_4^{out} = 0]$
§ Total degree = in + out
§ Number of links: $L = \sum_{i=1}^{N} k_i^{in} = \sum_{i=1}^{N} k_i^{out}$
§ Average degree: $L/N$
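The degree bookkeeping above can be checked with a few lines of Python. This is a minimal sketch, assuming the link list [(1,2), (2,4), (3,1), (3,2)] from the adjacency-list slide describes this same 4-node example; the function names are illustrative, not from the lecture.

```python
from collections import Counter

# Links of the 4-node example graph, read as u -> v in the directed case.
links = [(1, 2), (2, 4), (3, 1), (3, 2)]
nodes = [1, 2, 3, 4]

def undirected_degrees(nodes, links):
    """Degree k_i: number of links touching node i (direction ignored)."""
    deg = Counter()
    for u, v in links:
        deg[u] += 1
        deg[v] += 1
    return [deg[n] for n in nodes]

def directed_degrees(nodes, links):
    """Return (in-degrees, out-degrees) for links read as u -> v."""
    k_in, k_out = Counter(), Counter()
    for u, v in links:
        k_out[u] += 1
        k_in[v] += 1
    return [k_in[n] for n in nodes], [k_out[n] for n in nodes]

k = undirected_degrees(nodes, links)
k_in, k_out = directed_degrees(nodes, links)
N, L = len(nodes), len(links)

print(k)                             # [2, 3, 2, 1]
print(k_in, k_out)                   # [1, 2, 0, 1] [1, 1, 2, 0]
print(sum(k) / N == 2 * L / N)       # undirected: <k> = 2L/N
print(sum(k_in) == sum(k_out) == L)  # directed: every link has one head and one tail
```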

slide-7
SLIDE 7

DEGREE DISTRIBUTION

§ Degree distribution $p_k$ is the probability that a randomly selected node has degree $k$: $p_k = N_k / N$

  • Where $N_k$ is the number of nodes of degree $k$

[Figure: degree distributions of regular lattices, a clique (fully connected graph), and the karate club friendship network]
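As a small illustration, the degree distribution can be computed directly from a list of node degrees. The sketch below uses hypothetical inputs (a 10-node ring and the 4-node example with degrees [2, 3, 2, 1]); it is not the data behind the slide's figures.

```python
from collections import Counter

def degree_distribution(degrees):
    """Return {k: p_k}, where p_k = N_k / N."""
    N = len(degrees)
    N_k = Counter(degrees)  # N_k[k] = number of nodes with degree k
    return {k: N_k[k] / N for k in sorted(N_k)}

# A ring lattice: every node has degree 2, so all probability sits on k = 2.
print(degree_distribution([2] * 10))      # {2: 1.0}
# The 4-node example graph with degrees [2, 3, 2, 1].
print(degree_distribution([2, 3, 2, 1]))  # {1: 0.25, 2: 0.5, 3: 0.25}
```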

slide-8
SLIDE 8

DEGREE DISTRIBUTION IN REAL NETWORKS

Degree distribution of real-world networks is highly heterogeneous, i.e., node degrees can vary significantly: most nodes have few links, while a few hubs have very many.

slide-9
SLIDE 9

REAL NETWORKS ARE SPARSE

§ Complete graph: $L = N(N-1)/2$
§ Real network: $L \ll N(N-1)/2$
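To get a feel for how sparse "sparse" is, one can compare the actual number of links with the maximum N(N-1)/2. The sketch below plugs in the Facebook figures quoted later in the lecture (721 million users, 68 billion links); the helper names are illustrative.

```python
def max_links(N):
    """Number of links in a complete undirected graph on N nodes."""
    return N * (N - 1) // 2

def density(N, L):
    """Fraction of possible links that are actually present."""
    return L / max_links(N)

N_fb = 721_000_000      # active users (May 2011, from the Facebook slide)
L_fb = 68_000_000_000   # friendship links (same slide)

print(max_links(N_fb))       # possible links, on the order of 10^17
print(density(N_fb, L_fb))   # far below one in a million
```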

slide-10
SLIDE 10

MATHEMATICAL REPRESENTATION OF DIRECTED GRAPHS

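§ Adjacency list

  • List of links

[(1,2), (2,4), (3,1), (3,2)]

§ Adjacency matrix: $N \times N$ matrix $A$ such that

  • $A_{ij} = 1$ if link $(i, j)$ exists
  • $A_{ij} = 0$ if there is no link

For the 4-node example (row $i$, column $j$):

$A_{ij} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$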
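Going from the list of links to the adjacency matrix is mechanical. A minimal sketch, assuming nodes are labelled 1..N as on the slide and that a link (i, j) sets row i, column j; the `directed` flag is an illustrative addition that also fills the mirrored entry.

```python
def adjacency_matrix(n_nodes, links, directed=True):
    """Build an n_nodes x n_nodes 0/1 matrix from a list of links (i, j)."""
    A = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in links:
        A[i - 1][j - 1] = 1          # nodes are labelled 1..N
        if not directed:
            A[j - 1][i - 1] = 1      # undirected: link goes both ways
    return A

links = [(1, 2), (2, 4), (3, 1), (3, 2)]
A_dir = adjacency_matrix(4, links)
A_und = adjacency_matrix(4, links, directed=False)

for row in A_dir:
    print(row)

# An undirected graph always yields a symmetric matrix.
is_symmetric = all(A_und[i][j] == A_und[j][i] for i in range(4) for j in range(4))
print(is_symmetric)  # True
```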

slide-11
SLIDE 11

UNDIRECTED VS. DIRECTED GRAPHS

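Undirected (symmetric, $A_{ij} = A_{ji}$):

$A_{ij} = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$

Directed:

$A_{ij} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$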

slide-12
SLIDE 12

PATHS AND DISTANCES IN NETWORKS

§ Path: sequence of links (or nodes) from one node to another
§ Walk: a path of length $n$ from one node to another that can include repeated nodes / links (e.g., [1-2-1])
§ Shortest path: path with the shortest distance between two nodes
§ Diameter: length of the shortest path between the two most distant nodes
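Shortest paths in an unweighted network are usually found with breadth-first search. The sketch below is one common way to do it, not a method prescribed by the lecture; it assumes a connected, undirected graph stored as an adjacency dictionary (here the 4-node example with links 1-2, 1-3, 2-3, 2-4).

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:        # first visit = shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest shortest-path distance over all node pairs (connected graph assumed)."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
print(bfs_distances(adj, 4))  # {4: 0, 2: 1, 1: 2, 3: 2}
print(diameter(adj))          # 2
```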

slide-13
SLIDE 13

COMPUTING PATHS/DISTANCES

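The number of walks $N_{ij}$ between nodes $i$ and $j$ can be calculated using the adjacency matrix

§ $A_{ij}$ gives the number of paths of length $d = 1$
§ $(A^2)_{ij}$ gives the number of walks of length $d = 2$
§ $(A^l)_{ij}$ gives the number of walks of length $d = l$

For the undirected 4-node example graph:

$A^2 = \begin{pmatrix} 2 & 1 & 1 & 1 \\ 1 & 3 & 1 & 0 \\ 1 & 1 & 2 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}$

$A^3 = \begin{pmatrix} 2 & 4 & 3 & 1 \\ 4 & 2 & 4 & 3 \\ 3 & 4 & 2 & 1 \\ 1 & 3 & 1 & 0 \end{pmatrix}$

§ The minimum $l$ such that $(A^l)_{ij} > 0$ gives the distance (in hops) between $i$ and $j$

Worked example on a 6-node graph:

§ $A_{ij} = a_{ij}$
§ $(A^2)_{34} = a_{31}a_{14} + a_{32}a_{24} + a_{33}a_{34} + a_{34}a_{44} + a_{35}a_{54} + a_{36}a_{64}$
§ $a_{31}a_{14}$ is the # of walks from 3 to 1 multiplied by the # of walks from 1 to 4 → # of walks from 3 to 4 through 1
§ $a_{3k}a_{k4}$ is the # of walks from 3 to $k$ multiplied by the # of walks from $k$ to 4 → # of walks from 3 to 4 through $k$
§ The sum counts all two-step walks between 3 and 4

[Figure: adjacency matrix $A_{ij}$ of the 6-node example graph]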
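The matrix-power recipe can be tried out without any libraries. A minimal sketch for the undirected 4-node example (links 1-2, 1-3, 2-3, 2-4), using zero-based indices; `matmul`, `walk_counts` and `hop_distance` are illustrative names.

```python
def matmul(X, Y):
    """Plain-Python product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(A, length):
    """Return A^length; entry [i][j] counts walks of that length from i to j."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity = A^0
    for _ in range(length):
        P = matmul(P, A)
    return P

def hop_distance(A, i, j):
    """Smallest l >= 1 with (A^l)[i][j] > 0, or None if j is unreachable from i."""
    P = A
    for l in range(1, len(A)):       # a shortest path has at most N-1 links
        if P[i][j] > 0:
            return l
        P = matmul(P, A)
    return None

A = [[0, 1, 1, 0],
     [1, 0, 1, 1],
     [1, 1, 0, 0],
     [0, 1, 0, 0]]

print(walk_counts(A, 2))      # two-step walks
print(walk_counts(A, 3))      # three-step walks
print(hop_distance(A, 0, 3))  # node 1 to node 4: 2 hops
```

Repeated matrix multiplication is convenient for small examples; for large sparse networks a breadth-first search is usually far cheaper.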

slide-14
SLIDE 14

AVERAGE DISTANCE IN NETWORKS

§ Regular lattice (ring): $d = O(N)$
§ Clique: $d = 1$
§ Karate club friendship network: $d = 2.44$
§ Regular lattice (square): $d = O(\sqrt{N})$

slide-15
SLIDE 15

CLUSTERING

§ Clustering coefficient captures the probability of neighbors of a given node $i$ to be linked
§ Local clustering coefficient of a vertex $i$ in a graph quantifies how close its neighbors are to being a clique
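One standard way to make this concrete is to divide the number of links among a node's neighbors by the number of neighbor pairs. The sketch below assumes that definition and, as a convention chosen here, returns 0 for nodes with fewer than two neighbors.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = (links among neighbours of i) / (number of neighbour pairs)."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: undefined cases count as 0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

# Undirected 4-node example: links 1-2, 1-3, 2-3, 2-4.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}

print(local_clustering(adj, 1))  # 1.0: neighbours 2 and 3 are linked (a clique)
print(local_clustering(adj, 2))  # one of three neighbour pairs is linked
print(local_clustering(adj, 4))  # 0.0: only one neighbour
```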

slide-16
SLIDE 16

PROPERTIES OF REAL WORLD NETWORKS

§ Real networks are fundamentally different from what we’d expect

  • Degree distribution: real networks are scale-free
  • Average distance between nodes: real networks are small world
  • Clustering: real networks are locally dense

§ What do we expect?

  • Create a model of a network. Useful for calculating network properties and thinking about networks.

slide-17
SLIDE 17

RANDOM NETWORK MODEL

§ Networks do not have a regular structure
§ Given N nodes, how can we link them in a way that reproduces the observed complexity of real networks?
§ Let's connect nodes at random!
§ Erdős-Rényi model of a random network

  • Given N isolated nodes
  • Select a pair of nodes. Pick a random number between 0 and 1. If the number is less than $p$, create a link (so each pair is linked with probability $p$)
  • Repeat the previous step for each remaining node pair
  • Average degree: $\langle k \rangle = p(N-1)$

§ Easy to compute properties of random networks
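The construction can be sketched in a few lines. In this sketch a link is created when the random number falls below p, so that each pair is linked with probability p and the average degree comes out close to p(N-1). The values N = 1000 and p = 0.01 are arbitrary illustrative choices.

```python
import random
from itertools import combinations

def erdos_renyi(N, p, seed=None):
    """Return the link list of a random network on nodes 0..N-1."""
    rng = random.Random(seed)
    links = []
    for i, j in combinations(range(N), 2):  # visit every node pair once
        if rng.random() < p:                # link with probability p
            links.append((i, j))
    return links

N, p = 1000, 0.01
links = erdos_renyi(N, p, seed=1)
avg_degree = 2 * len(links) / N

print(avg_degree)   # measured <k>
print(p * (N - 1))  # expected <k> = p(N-1)
```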

slide-18
SLIDE 18

RANDOM NETWORKS ARE TRULY RANDOM

[Figure: random networks generated with N=12, p=1/6 and with N=100, p=1/6]

Average degree: $\langle k \rangle = p(N-1)$

slide-19
SLIDE 19

DEGREE DISTRIBUTION IN RANDOM NETWORK

§ Follows a binomial distribution
§ For sparse networks, <k> << N, it is well approximated by a Poisson distribution

  • Depends only on <k>, not network size N
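A quick numerical check of this approximation, written with the standard binomial and Poisson formulas (each node can link to the other N-1 nodes, each with probability p). The value <k> = 10 and the network sizes are illustrative assumptions.

```python
import math

def binomial_pk(N, p, k):
    """Exact degree distribution of a random network with N nodes."""
    return math.comb(N - 1, k) * p**k * (1 - p) ** (N - 1 - k)

def poisson_pk(mean_k, k):
    """Poisson approximation, which depends on <k> only."""
    return math.exp(-mean_k) * mean_k**k / math.factorial(k)

mean_k = 10
for N in (100, 1_000, 10_000):
    p = mean_k / (N - 1)  # keep <k> = p(N-1) fixed while N grows
    print(N, binomial_pk(N, p, 10), poisson_pk(mean_k, 10))
```

As N grows with <k> held fixed, the binomial values should move toward the single Poisson value, which is the sense in which the distribution "depends only on <k>".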
slide-20
SLIDE 20

REAL NETWORKS DO NOT HAVE POISSON DEGREE DISTRIBUTION

[Figure: degree (followers) distribution and activity (number of posts) distribution]

slide-21
SLIDE 21

SCALE FREE PROPERTY

[Figure: WWW hyperlinks distribution]

Power-law distribution: $p_k \sim k^{-\gamma}$

§ Networks whose degree distribution follows a power-law distribution are called scale-free networks
§ Real networks have hubs
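Two consequences of a power law are easy to see numerically: on log-log axes the distribution is a straight line with slope -γ, and very large degrees (hubs) remain far more likely than under a Poisson distribution. The sketch below uses illustrative values (γ = 2.5, degrees 1 to 10,000, and a Poisson with <k> = 10), none of which come from the lecture.

```python
import math

def power_law_pk(gamma, k_max):
    """Normalised p_k proportional to k^(-gamma) for k = 1..k_max."""
    weights = [k ** -gamma for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # index 0 holds p_1

def poisson_pk(mean_k, k):
    return math.exp(-mean_k) * mean_k ** k / math.factorial(k)

gamma, k_max = 2.5, 10_000
pk = power_law_pk(gamma, k_max)

# Slope between k = 10 and k = 100 on log-log axes equals -gamma.
slope = (math.log(pk[99]) - math.log(pk[9])) / (math.log(100) - math.log(10))
print(slope)

# Probability of a degree-100 node: power law vs. Poisson with <k> = 10.
print(pk[99], poisson_pk(10, 100))
```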

slide-22
SLIDE 22

RANDOM VS SCALE-FREE NETWORKS

[Figure: log-log plot (axes in powers of 10) comparing $f(x) = cx^{-1}$, $f(x) = c^{-x}$, and $f(x) = cx^{-0.5}$]

Random networks and scale-free networks are very different. The differences are apparent when the degree distribution is plotted on a log-log scale.

slide-23
SLIDE 23

THE MILGRAM EXPERIMENT

§ In the 1960s, Stanley Milgram asked 160 randomly selected people in Kansas and Nebraska to deliver a letter to a stock broker in Boston.

  • Rule: can only forward the letter to a friend who is more likely to know the target person

§ How many steps would it take?

slide-24
SLIDE 24

THE MILGRAM EXPERIMENT

§ Within a few days the first letter arrived, passing through only two links.
§ Eventually 42 of the 160 letters made it to the target, some requiring close to a dozen intermediates.
§ The median number of steps in completed chains was 5.5 → “six degrees of separation”
slide-25
SLIDE 25

FACEBOOK IS A VERY SMALL WORLD

§ Ugander et al. directly measured distances between nodes in the Facebook social graph (May 2011)

  • 721 million active users
  • 68 billion symmetric friendship links
  • the average distance between the users was 4.74
slide-26
SLIDE 26

SMALL WORLD PROPERTY

§ Distance between any two nodes in a network is surprisingly short

  • “six degrees of separation”: you can reach any other individual in the world through a short sequence of intermediaries

§ What is small?

  • Consider a random network with average degree $\langle k \rangle$
  • Expected number of nodes at distance $d$ is $N(d) \sim \langle k \rangle^d$
  • Diameter $d_{max} \sim \log N / \log \langle k \rangle$
  • Random networks are small
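The estimate follows from setting the number of nodes reachable in d steps, roughly <k>^d, equal to N and solving for d. As a rough sanity check, the sketch below plugs in the Facebook figures from the earlier slide. Facebook is not a random network, so this is only a back-of-the-envelope comparison with the measured average distance of 4.74.

```python
import math

def estimated_diameter(N, mean_k):
    """Random-network estimate d_max ~ log N / log <k>."""
    return math.log(N) / math.log(mean_k)

N_fb = 721_000_000          # users
L_fb = 68_000_000_000       # friendship links
mean_k_fb = 2 * L_fb / N_fb # <k> = 2L/N for an undirected network

print(mean_k_fb)                            # about 189 friends on average
print(estimated_diameter(N_fb, mean_k_fb))  # about 3.9
print(estimated_diameter(10**6, 10))        # one million nodes, <k> = 10: about 6
```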
slide-27
SLIDE 27

WHY IS IT SURPRISING?

§ Regular lattices (e.g., physical geography) do not have the small world property

  • Distances grow polynomially with system size
  • In networks, distances grow logarithmically with network size

slide-28
SLIDE 28

SMALL WORLD EFFECT IN RANDOM NETWORKS

Watts-Strogatz model

§ Start with a regular lattice, e.g., a ring where each node is connected to its immediate and next neighbors.

  • Local clustering is $C = 1/2$ for this ring (the value approaches 3/4 as each node is linked to more neighbors)

§ With probability $p$, rewire each link to a randomly chosen node

  • For small $p$, clustering remains high, but diameter shrinks
  • For large $p$, becomes random network
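The rewiring experiment can be sketched as follows. This is a simplified variant written for illustration, not a reference implementation: each link keeps one endpoint and, with probability p, moves the other to a random node that is not yet a neighbor. Counting directly, the four neighbors of a node on the starting ring share 3 of 6 possible links, so C = 1/2 there (3/4 is the limit when each node links to many neighbors per side). N = 200 and seed = 0 are arbitrary.

```python
import random
from collections import deque
from itertools import combinations

def ring_lattice(N, K=4):
    """Ring where each node links to its K/2 nearest neighbours on each side."""
    adj = {i: set() for i in range(N)}
    for i in range(N):
        for step in range(1, K // 2 + 1):
            j = (i + step) % N
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, p, seed=None):
    """With probability p, replace link (i, j) by (i, t) for a random non-neighbour t."""
    rng = random.Random(seed)
    N = len(adj)
    new = {i: set(nb) for i, nb in adj.items()}
    for i in range(N):
        for j in sorted(adj[i]):
            if i < j and rng.random() < p:   # i < j: handle each link once
                free = [t for t in range(N) if t != i and t not in new[i]]
                if free:
                    t = rng.choice(free)
                    new[i].discard(j)
                    new[j].discard(i)
                    new[i].add(t)
                    new[t].add(i)
    return new

def average_clustering(adj):
    total = 0.0
    for i, nb in adj.items():
        k = len(nb)
        if k >= 2:
            links = sum(1 for u, v in combinations(nb, 2) if v in adj[u])
            total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_distance(adj):
    """Mean hop distance over reachable ordered pairs (BFS from every node)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

lattice = ring_lattice(200)
for p in (0.0, 0.05, 1.0):
    g = rewire(lattice, p, seed=0)
    print(p, round(average_clustering(g), 3), round(average_distance(g), 2))
```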
slide-29
SLIDE 29

SMALL WORLD NETWORKS

§ Small world networks constructed using the Watts-Strogatz model have small average distance and high clustering, just like real networks
§ Long-distance links join distant local clusters

[Figure: clustering and average distance as functions of p, from regular lattice to random network]

slide-30
SLIDE 30

SOCIAL NETWORKS ARE SEARCHABLE

§ Milgram experiments showed that

  • Short chains exist!
  • People can find them!
  • Using only local knowledge (who their friends are, their location and profession)

  • How are short chains discovered with this limited information?
  • Hint: geographic information?

[Milgram]

slide-31
SLIDE 31

KLEINBERG MODEL OF GEOGRAPHIC LINKS

§ Incorporate geographic distance in the distribution of links

Link to all nodes within distance r, then add q long-range links, each chosen with probability proportional to $d^{-a}$, where d is the distance between nodes.

slide-32
SLIDE 32

HOW DOES THIS AFFECT SHORT CHAINS?

§ Simulate Milgram experiment

  • At each time step, a node selects a friend who is closer to the target (in lattice space) and forwards the letter to it
  • Each node uses only local information about its own social network and not the entire structure of the network
  • Delivery time T is the time for the letter to reach the target

[Figure: delivery time as a function of a]
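A small simulation of this setup might look like the sketch below. It assumes an n x n grid with Manhattan distance, local links to the four lattice neighbors (r = 1), and q = 1 long-range link per node drawn with probability proportional to d^(-a); these choices, the grid size and the seeds are illustrative. On such a small grid the differences between values of a are modest, since the theoretical separation only appears as the network grows.

```python
import random

def dist(u, v):
    """Lattice (Manhattan) distance between grid points u and v."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def kleinberg_grid(n, a, q=1, seed=None):
    """Local links within distance 1, plus q long-range links with P ~ d^(-a)."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    links = {}
    for u in nodes:
        local = [v for v in nodes if dist(u, v) == 1]
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** -a for v in others]
        links[u] = local + rng.choices(others, weights=weights, k=q)
    return links

def greedy_delivery_time(links, source, target):
    """Forward to the friend closest to the target until the letter arrives."""
    steps, current = 0, source
    while current != target:
        # A lattice neighbour is always one step closer, so this terminates.
        current = min(links[current], key=lambda v: dist(v, target))
        steps += 1
    return steps

n = 15
for a in (0, 2, 4):
    links = kleinberg_grid(n, a, seed=0)
    rng = random.Random(1)
    nodes = list(links)
    times = [greedy_delivery_time(links, rng.choice(nodes), rng.choice(nodes))
             for _ in range(200)]
    print(a, sum(times) / len(times))  # average delivery time T for this a
```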

slide-33
SLIDE 33

KLEINBERG’S ANALYSIS

§ Network is only searchable when a=2

  • i.e., probability to form a link drops as square of distance
  • Average delivery time is at most proportional to $(\log N)^2$

§ For other values of a, the average chain length produced by the search algorithm is at least $N^b$.

slide-34
SLIDE 34

DOES THIS HOLD FOR REAL NETWORKS?

§ Liben-Nowell et al. tested Kleinberg’s prediction for the LiveJournal network of 1M+ bloggers

  • Blogger’s geographic information in profile
  • How does friendship probability in LiveJournal network depend on distance between people?

§ People are not uniformly distributed spatially

  • Coasts, cities are denser

Use rank instead of distance d(u,v) (figure example: $rank_u(v) = 6$).

Since $rank_u(v) \sim d(u,v)^2$ and link probability $\Pr(u \to v) \sim d(u,v)^{-2}$, we expect that $\Pr(u \to v) \sim 1/rank_u(v)$.
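To make the rank idea concrete, the sketch below assumes rank_u(v) counts everyone strictly closer to u than v is, with u itself included so that the nearest person has rank 1. The positions are made-up points at distinct distances from u; nothing here is LiveJournal data.

```python
import math

def rank(u, v, pos):
    """rank_u(v): number of people strictly closer to u than v is (u included)."""
    d_uv = math.dist(pos[u], pos[v])
    return sum(1 for w in pos if math.dist(pos[u], pos[w]) < d_uv)

def rank_based_link_probabilities(u, pos):
    """Normalised link probabilities with Pr(u -> v) proportional to 1 / rank_u(v)."""
    weights = {v: 1 / rank(u, v, pos) for v in pos if v != u}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

# Hypothetical positions: a, b, c lie at distances 1, 2, 3 from u.
pos = {"u": (0, 0), "a": (1, 0), "b": (0, 2), "c": (3, 0)}

print([rank("u", v, pos) for v in ("a", "b", "c")])  # [1, 2, 3]
print(rank_based_link_probabilities("u", pos))       # proportional to 1, 1/2, 1/3
```

Because rank depends only on how many people are closer, not on raw distance, the same rule applies in dense cities and sparse regions alike.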

slide-35
SLIDE 35

LIVEJOURNAL IS A SEARCHABLE NETWORK

§ Probability that a link exists between two people as a function of the rank between them

  • LiveJournal is a rank-based network → it is searchable