Learning Opinions in Social Networks: Vincent Conitzer, Debmalya Panigrahi, Hanrui Zhang (PowerPoint presentation)



SLIDE 1

Learning Opinions in Social Networks

Vincent Conitzer Debmalya Panigrahi Hanrui Zhang Duke University

SLIDE 2

Learning “opinions” in social networks

  • a social media company (say Facebook) runs a poll
  • ask users: “have you heard about the new product?”
  • awareness of product propagates in social network
  • observe: responses from some random users
  • goal: infer opinions of users who did not respond
SLIDE 3

Learning “opinions” in social networks

more generally, “opinions” can be:

  • awareness about a new product / political candidate / news item
  • spread of a biological / computer virus

SLIDE 4

this talk:

  • review propagation of opinions in social networks
  • how to measure the complexity of a network for learning opinions?
  • how to learn opinions with random propagation, when the randomness is unknown?

SLIDE 5

Related research topics

  • learning propagation models: given the outcome of propagation, infer the propagation model
    (Liben-Nowell & Kleinberg, 2007; Du et al., 2012; 2014; Narasimhan et al., 2015; etc.)
  • social network analysis & influence maximization: given a fixed budget, try to maximize the influence of some opinion
    (Kempe et al., 2003; Faloutsos et al., 2004; Mossel & Roch, 2007; Chen et al., 2009; 2010; Tang et al., 2014; etc.)

SLIDE 6

Information propagation in social networks

a simplistic model:

  • network is a directed graph G = (V, E)
  • a seed set S0 of nodes which are initially informed (i.e., active)
  • active nodes deterministically propagate the information through outgoing edges
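This deterministic model amounts to a breadth-first search from the seed set. A minimal sketch (the toy graph is illustrative, not from the slides):

```python
from collections import deque

def propagate(edges, seeds):
    """BFS from the seed set: active nodes push information along
    outgoing edges until no new node becomes active."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    active = set(seeds)
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in active:
                active.add(v)
                queue.append(v)
    return active  # the final active set S_infinity

# toy network: 1 -> 2 -> 3 and 4 -> 3
edges = [(1, 2), (2, 3), (4, 3)]
print(sorted(propagate(edges, {1})))  # [1, 2, 3]
```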

SLIDE 7

Information propagation in social networks

S0: seed set that is initially active

SLIDE 8

Information propagation in social networks

S1: active nodes after 1 step of propagation

SLIDE 9

Information propagation in social networks

S2: active nodes after 2 steps of propagation

SLIDE 10

Information propagation in social networks

S3: active nodes after 3 steps of propagation

SLIDE 11

Information propagation in social networks

propagation stops after step 2: final active set S2 = S3 = … = S∞

SLIDE 12

PAC learning opinions

  • fix G, unknown seed set S0 and distribution 𝒟 over V
  • observe m iid labeled samples {(ui, oi)}i, where for each i, ui ~ 𝒟, and oi = 1 iff ui in S∞
  • based on the sample set, predict whether u in S∞ for u ~ 𝒟
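The sampling model can be sketched directly (a uniform distribution over V stands in for the unknown sample distribution; all names are illustrative):

```python
import random

def draw_samples(nodes, s_inf, m, rng=random):
    """Draw m iid labeled samples (u_i, o_i): here u_i is drawn uniformly
    over V (a stand-in for the unknown distribution), and o_i = 1 iff
    u_i belongs to the final active set S_inf."""
    return [(u, int(u in s_inf)) for u in (rng.choice(nodes) for _ in range(m))]

for u, o in draw_samples([1, 2, 3, 4, 5], s_inf={1, 2, 3}, m=4):
    print(u, o)  # label o is 1 exactly when u is one of 1, 2, 3
```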
SLIDE 13

PAC learning opinions

[figure: network with all node labels unknown]

SLIDE 14

PAC learning opinions

[figure: one sampled node observed to be in S∞]

SLIDE 15

PAC learning opinions

[figure: another sampled node observed to be in S∞]

SLIDE 16

PAC learning opinions

[figure: a sampled node observed to be not in S∞]

SLIDE 17

PAC learning opinions

[figure: an unobserved node; is this node in S∞?]

SLIDE 18
PAC learning opinions

  • key challenge: how to generalize from observations to make predictions for future nodes
  • common sense: generalization is impossible without some prior knowledge
  • so what prior knowledge do we have?
  • answer: the structure of the network

SLIDE 19

Implicit hypothesis class

[figure: example network with nodes 1, 2, 3, 4 and S∞ shaded]

for any pair of nodes u, v where u can reach v:

  • if u is in S∞, then v must be in S∞ (e.g., u = 1, v = 2)
  • equivalently, if v is not in S∞, then u must not be in S∞ (e.g., u = 3, v = 4)

SLIDE 20

PAC learning opinions

[figure: an unobserved node; is this node in S∞?]

SLIDE 21

PAC learning opinions

[figure: the node is inferred to be in S∞]

SLIDE 22

Implicit hypothesis class

for any pair of nodes u, v where u can reach v:

  • if u is in S∞, then v must be in S∞ (e.g., u = 1, v = 2)
  • equivalently, if v is not in S∞, then u must not be in S∞ (e.g., u = 3, v = 4)
  • implicit hypothesis class associated with G = (V, E): family of all sets H of nodes consistent with the above (i.e., if u can reach v, then u in H implies v in H)
  • implicit hypothesis class can be much smaller than 2^V

SLIDE 23

Implicit hypothesis class

[figure: hypotheses H1, …, H6] implicit hypothesis class ℋ = {H0, H1, H2, H3, H4, H5, H6}, where H0 = ∅ is the empty set; |V| = 6, |2^V| = 64, |ℋ| = 7
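The counts on this slide can be reproduced by brute-force enumeration. A minimal sketch, assuming the figure's network is a directed path on 6 nodes (the figure itself is not recoverable from the transcript; a 6-node path is one graph whose closed sets are exactly the 7 suffixes):

```python
from itertools import chain, combinations

def implicit_class(nodes, edges):
    """All hypotheses H consistent with propagation: if u is in H and
    there is an edge u -> v, then v is in H (closure under direct
    edges implies closure under reachability)."""
    subsets = chain.from_iterable(combinations(nodes, k) for k in range(len(nodes) + 1))
    return [set(s) for s in subsets
            if all(v in s for u, v in edges if u in s)]

# directed path 1 -> 2 -> ... -> 6: closed sets are exactly the suffixes
nodes = [1, 2, 3, 4, 5, 6]
edges = [(i, i + 1) for i in range(1, 6)]
H = implicit_class(nodes, edges)
print(len(H), 2 ** len(nodes))  # 7 64
```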

SLIDE 24
VC theory for deterministic networks

  • VC(G): VC dimension of the implicit hypothesis class associated with network G
  • VC(G) = size of the largest “independent” set (aka width), within which no node u can reach another node v

SLIDE 25

VC theory for deterministic networks

blue nodes: independent

SLIDE 26

VC theory for deterministic networks

green nodes: independent

SLIDE 27
VC theory for deterministic networks

  • orange nodes: not independent

SLIDE 28
VC theory for deterministic networks

  • orange nodes: not independent

SLIDE 29
VC theory for deterministic networks

  • VC(G): VC dimension of the implicit hypothesis class associated with network G
  • VC(G) = size of the largest “independent” set (aka width), within which no node u can reach another node v
  • VC(G) can be computed in polynomial time
  • sample complexity of learning opinions: Õ(VC(G) / ε)
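The width definition can be checked by brute force on a small graph (exponential-time enumeration for illustration only; the slides state a polynomial-time algorithm exists but do not show it, and the diamond example below is mine, not from the slides):

```python
from itertools import combinations

def width(nodes, edges):
    """VC(G): size of the largest set of nodes in which no node
    can reach another, computed by exhaustive enumeration."""
    reach = {u: set() for u in nodes}
    for u, v in edges:
        reach[u].add(v)
    changed = True  # transitive closure, iterated to a fixpoint
    while changed:
        changed = False
        for u in nodes:
            new = set().union(*(reach[v] for v in reach[u])) if reach[u] else set()
            if not new <= reach[u]:
                reach[u] |= new
                changed = True
    best = 0
    for k in range(1, len(nodes) + 1):
        for s in combinations(nodes, k):
            if all(v not in reach[u] for u in s for v in s if u != v):
                best = k  # candidate sets checked in increasing size
    return best

# diamond 1 -> {2, 3} -> 4: the largest independent set is {2, 3}
print(width([1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]))  # 2
```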

SLIDE 30

Why width?

LB (lower bound): 𝒟 is uniform over a maximum independent set

SLIDE 31

Why width?

LB: 𝒟 is uniform over a maximum independent set

SLIDE 32

Why width?

UB (upper bound): number of chains needed to cover G = VC(G); need to learn one threshold for each chain

SLIDE 33

Why width?

UB: number of chains needed to cover G = VC(G); need to learn one threshold for each chain [figure: each chain splits at a threshold into a “not in S∞” part and an “in S∞” part]

SLIDE 34
  • so far: VC theory for deterministic networks
  • next: the case of random networks
SLIDE 35
Random social networks

  • propagation of opinions is inherently random
  • randomness in propagation = randomness in network
  • random network 𝒣: distribution over deterministic graphs
  • propagation: draw G ~ 𝒣, propagate from seed set S0 in G
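One concrete way to realize such a distribution is to include each edge independently with some probability (an illustrative assumption; the model only requires some distribution over deterministic graphs):

```python
import random
from collections import deque

def sample_graph(edge_probs, rng):
    """Draw a deterministic graph G ~ the random network: each candidate
    edge (u, v) is included independently with probability edge_probs[(u, v)]."""
    return [e for e, p in edge_probs.items() if rng.random() < p]

def propagate(edges, seeds):
    """Deterministic BFS propagation on the realized graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    active, queue = set(seeds), deque(seeds)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in active:
                active.add(v)
                queue.append(v)
    return active

rng = random.Random(0)
probs = {(1, 2): 0.9, (2, 3): 0.5, (1, 4): 0.1}
s_inf = propagate(sample_graph(probs, rng), seeds={1})
print(sorted(s_inf))  # always contains 1; the rest depends on the draw
```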

SLIDE 36
Random social networks

  • random network 𝒣: distribution over deterministic networks
  • propagation: draw G ~ 𝒣, propagate from seed set S0 in G
  • PAC learning opinions: fix 𝒣, unknown S0 and 𝒟
  • a graph G ~ 𝒣 is realized (unknown to the algorithm); propagation happens from S0 in G and results in S∞
  • the algorithm observes m labeled samples and tries to predict S∞
  • “random” hypothesis class: VC theory no longer applies

SLIDE 37

Random social networks

  • S0: information to recover, G: noise
  • learning is impossible when noise overwhelms information
  • hard instance: nodes form a chain in a uniformly random order, S0 = {node 1}
  • learning the label of any other node requires Ω(n) samples
SLIDE 38

Random social networks

  • S0: information to recover, G: noise
  • learning is impossible when noise overwhelms information
  • when noise is reasonably small: Õ(𝔼[VC(G)] / ε) samples are enough to learn opinions, up to the intrinsic resolution of the network

SLIDE 39

Random social networks

when noise is reasonably small: Õ(𝔼[VC(G)] / ε) samples are enough to learn opinions

sketch of algorithm:

  • draw iid sample realizations Gj ~ 𝒣 of the network
  • for each Gj, find the ERM Hj on Gj with the observed sample set {(ui, oi)}, by computing an s-t min-cut
  • output H = node-wise majority vote over {Hj}, i.e., each node u is in H iff u is in at least half of {Hj}
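The final voting step can be sketched directly (the three per-realization ERM outputs below are hypothetical placeholders):

```python
def majority_vote(hypotheses, nodes):
    """Node-wise majority vote over per-realization ERM hypotheses:
    u goes into H iff u lies in at least half of the H_j."""
    return {u for u in nodes
            if 2 * sum(u in h for h in hypotheses) >= len(hypotheses)}

# three hypothetical ERM outputs for sampled realizations G_1, G_2, G_3
hs = [{1, 2}, {1, 2, 3}, {2}]
print(sorted(majority_vote(hs, [1, 2, 3, 4])))  # [1, 2]
```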

SLIDE 40

Algorithm for ERM

[figure: observed samples labeled “in S∞” / “not in S∞”]

SLIDE 41

Algorithm for ERM

[figure] source S and sink T added; solid (structural) edges: capacity = ∞; dashed (sample) edges: capacity = 1
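A sketch of the whole ERM-by-min-cut construction: structural edges are reversed and given effectively infinite capacity (so no cut can violate closure), each positive sample gets a unit-capacity edge to the sink T, each negative sample a unit-capacity edge from the source S, and the sink side of a minimum cut is an empirical risk minimizer. The orientation details are my reconstruction from the captions, with a textbook Edmonds-Karp computing the cut:

```python
from collections import deque

def min_cut_erm(nodes, edges, samples):
    """ERM over reachability-closed hypotheses via s-t min-cut.
    H = sink side of the cut; each cut unit-capacity sample edge
    is one misclassified sample. Returns (H, #misclassified)."""
    BIG = 10 ** 9  # stands in for the 'infinite' solid-edge capacity
    S, T = 's', 't'
    cap = {}
    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)
    for u, v in edges:
        add(v, u, BIG)      # forbids u in H while v is outside H
    for u, o in samples:
        if o == 1:
            add(u, T, 1)    # cut iff a positive sample is left out of H
        else:
            add(S, u, 1)    # cut iff a negative sample is put into H
    def reachable():        # BFS over positive-capacity residual edges
        seen, q = {S}, deque([S])
        while q:
            x = q.popleft()
            for (a, b), c in cap.items():
                if a == x and c > 0 and b not in seen:
                    seen.add(b)
                    q.append(b)
        return seen
    flow = 0                # Edmonds-Karp: augment along shortest paths
    while True:
        parent, q = {S: None}, deque([S])
        while q and T not in parent:
            x = q.popleft()
            for (a, b), c in cap.items():
                if a == x and c > 0 and b not in parent:
                    parent[b] = x
                    q.append(b)
        if T not in parent:
            break
        path, x = [], T
        while x != S:
            path.append((parent[x], x))
            x = parent[x]
        aug = min(cap[e] for e in path)
        for a, b in path:
            cap[(a, b)] -= aug
            cap[(b, a)] += aug
        flow += aug
    return set(nodes) - reachable(), flow

# path 1 -> 2 -> 3, negative sample at node 1, positive sample at node 3
H, errors = min_cut_erm([1, 2, 3], [(1, 2), (2, 3)], [(1, 0), (3, 1)])
print(H, errors)  # a closed set containing 3 but not 1, with 0 errors
```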

SLIDE 42

Algorithm for ERM

[figure] edges being cut: X; nodes on the S side: M; total capacity of the S-T min-cut = 1

SLIDE 43

Algorithm for ERM

[figure: one sample node misclassified by the ERM hypothesis]

SLIDE 44

Random social networks

  • each ERM Hj has expected error ε
  • ... but the probability of high error is still large
  • use majority voting to boost the probability of success
SLIDE 45

Future directions

  • other propagation models
  • non-binary / multiple opinions
SLIDE 46

Thanks for your attention!

Questions?