A Distance Measure for the Analysis of Polar Opinion Dynamics in Social Networks
Victor Amelkin
University of California, Santa Barbara Department of Computer Science victor@cs.ucsb.edu
Contributors (see footnotes 1, 2):
Victor Amelkin, UC Santa Barbara, victor@cs.ucsb.edu
Petko Bogdanov, University at Albany, SUNY, pbogdanov@albany.edu
Ambuj K. Singh, UC Santa Barbara, ambuj@cs.ucsb.edu
1 Victor Amelkin, Petko Bogdanov, and Ambuj K. Singh. “A Distance Measure for the Analysis of Polar Opinion Dynamics in Social Networks”. In: Proc. IEEE ICDE. 2017, pp. 159–162.
2 Victor Amelkin, Petko Bogdanov, and Ambuj K. Singh. “A Distance Measure for the Analysis of Polar Opinion Dynamics in Social Networks (Extended Paper)”. In: arXiv:1510.05058 [cs.SI] (2015).
3 Wayne Zachary. “An information flow model for conflict and fission in small groups”. In: Journal of Anthropological Research (1977), pp. 452–473.
4 Stephen Ranshous et al. “Anomaly detection in dynamic networks: a survey”. In: Wiley Interdisciplinary Reviews: Computational Statistics 7.3 (2015), pp. 223–247.
extrapolate
reconstruct
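$\left(\sum_i |x_i - y_i|^p\right)^{1/p}$
$\sum_i \delta_{x_i, y_i}$
$\sum_i \dfrac{|x_i - y_i|}{|x_i| + |y_i|}$
$\dfrac{\cdots}{|x \cup y|}$
$\dfrac{\langle x, y \rangle}{\|x\| \, \|y\|}$
$\sum_i \ln\left[x_i / y_i\right] x_i$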
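The vector distance measures listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names are hypothetical, the Hamming variant is assumed to count mismatching positions, and zero terms are skipped in the Canberra and logarithmic sums to avoid division by zero.

```python
import math

def minkowski(x, y, p=2):
    # (sum_i |x_i - y_i|^p)^(1/p)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def hamming(x, y):
    # Number of positions where the two vectors disagree (assumed convention).
    return sum(1 for a, b in zip(x, y) if a != b)

def canberra(x, y):
    # sum_i |x_i - y_i| / (|x_i| + |y_i|); terms with a zero denominator are skipped.
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def kl_divergence(x, y):
    # sum_i x_i * ln(x_i / y_i); terms with x_i = 0 contribute nothing.
    # Requires y_i > 0 wherever x_i > 0.
    return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
```

All four functions compare two vectors coordinate by coordinate, so none of them uses the network structure that connects the coordinates.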
5 Horst Bunke and Kim Shearer. “A graph distance metric based on the maximal common subgraph”. In: Pattern Recognition Letters 19.3 (1998), pp. 255–259.
6 Xinbo Gao et al. “A survey of Graph Edit Distance”. In: Pattern Analysis and Applications 13.1 (2010), pp. 113–129.
7 Sergey Melnik, Hector Garcia-Molina, and Erhard Rahm. “Similarity flooding: A versatile graph matching algorithm and its application to schema matching”. In: IEEE Data Engineering. 2002, pp. 117–128.
8 S. Vichy N. Vishwanathan et al. “Graph kernels”. In: The Journal of Machine Learning Research 11 (2010), pp. 1201–1242.
9 Owen Macindoe and Whitman Richards. “Graph comparison using fine structure analysis”. In: IEEE SocialCom. IEEE. 2010, pp. 193–200.
⊲ compare networks structurally
⊲ disregard node states

⊲ edit distance over node/edge insertion, deletion, and substitution operations
⊲ mostly structure-driven; expensive to compute

⊲ nodes are similar if their neighborhoods are similar
⊲ hard to account for node state differences in a socially meaningful way; expensive to compute

⊲ compare substructures (walks, paths, cycles, trees) of non-aligned (small) networks
⊲ unaware of opinion dynamics; expensive to compute

⊲ compare degree, clustering coefficient, betweenness, diameter, frequent substructures, spectra
⊲ only look at summaries; do not capture opinion dynamics
5 Amelkin, Bogdanov, and Singh, “A Distance Measure for the Analysis of Polar Opinion Dynamics in Social Networks (Extended Paper)”.
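$$\mathrm{EMD}(P, Q, D) = \frac{\sum_{i,j=1}^{n} D_{ij} \hat{f}_{ij}}{\sum_{i,j=1}^{n} \hat{f}_{ij}},$$
where $P$ and $Q$ are histograms (network states), $D$ is the ground distance, and $\hat{f}$ is an optimal solution of the transportation problem
$$\sum_{i,j=1}^{n} f_{ij} D_{ij} \to \min, \qquad \sum_{i,j=1}^{n} f_{ij} = \min\left\{\sum_{i=1}^{n} P_i, \; \sum_{i=1}^{n} Q_i\right\},$$
$$f_{ij} \ge 0, \qquad \sum_{j=1}^{n} f_{ij} \le P_i, \qquad \sum_{i=1}^{n} f_{ij} \le Q_j \qquad (1 \le i, j \le n).$$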
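The transportation problem behind EMD can be solved as a minimum-cost flow on a bipartite graph. The sketch below is only an illustration under stated assumptions: it uses a plain successive-shortest-paths scheme (not the faster algorithms discussed later in the talk), assumes integer bin masses, and normalizes the optimal cost by the total flow, which is the usual EMD convention. The function name `emd` and the node layout are choices made for this example.

```python
from collections import deque

def emd(P, Q, D):
    """EMD between histograms P and Q (integer masses) under ground distance D."""
    n = len(P)
    src, snk = 0, 2 * n + 1          # nodes: source, n bins of P, n bins of Q, sink
    graph = [[] for _ in range(2 * n + 2)]
    to, cap, cost = [], [], []

    def add_edge(u, v, c, w):
        # Forward edge at an even index, its residual twin at the next odd index.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    for i in range(n):
        add_edge(src, 1 + i, P[i], 0)            # supply of bin i of P
        add_edge(n + 1 + i, snk, Q[i], 0)        # demand of bin i of Q
        for j in range(n):
            add_edge(1 + i, n + 1 + j, min(P[i], Q[j]), D[i][j])

    need = min(sum(P), sum(Q))                   # total flow required by the constraints
    flow, total = 0, 0
    while flow < need:
        # Queue-based Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [float("inf")] * len(graph)
        prev = [-1] * len(graph)
        in_queue = [False] * len(graph)
        dist[src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v] - 1e-12:
                    dist[v], prev[v] = dist[u] + cost[e], e
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[snk] == -1:
            break
        push, v = need - flow, snk
        while v != src:                          # bottleneck capacity along the path
            push = min(push, cap[prev[v]])
            v = to[prev[v] ^ 1]
        v = snk
        while v != src:                          # augment along the path
            cap[prev[v]] -= push
            cap[prev[v] ^ 1] += push
            v = to[prev[v] ^ 1]
        flow += push
        total += push * dist[snk]
    return total / flow if flow else 0.0
```

For example, `emd([2, 0, 0], [0, 1, 1], [[0, 1, 3], [1, 0, 1], [3, 1, 0]])` moves one unit at cost 1 and one unit at cost 3, so the normalized distance is 2.0.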
Ground distance computed in: Opinion type “transported”:
[Figure: example histograms with bin masses 1/2 and 1, extended with “bank bins”]
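$$\mathrm{EMD}^\star(P, Q, D) = \mathrm{EMD}(\tilde{P}, \tilde{Q}, \tilde{D}) \cdot \max\left\{\sum_{i=1}^{n} P_i, \; \sum_{j=1}^{n} Q_j\right\},$$
where $\tilde{P}$ and $\tilde{Q}$ extend $P$ and $Q$ with $n$ bank bins each, and
$$\tilde{D} = \begin{bmatrix} D & D + 1_n \otimes \gamma^{\top} \\ D + 1_n^{\top} \otimes \gamma & D + 1_n \otimes \gamma^{\top} + 1_n^{\top} \otimes \gamma - 2\,\mathrm{diag}(\gamma) \end{bmatrix},$$
$$\tilde{P}^{(i)} = \begin{cases} \sum_{j=1}^{n} Q_j - \sum_{k=1}^{n} P_k, & \text{if } \sum_{j=1}^{n} Q_j > \sum_{k=1}^{n} P_k, \\ 0, & \text{otherwise.} \end{cases}$$
$\tilde{P}^{(i)}$: capacity of the $i$-th bank bin; $\gamma = [\gamma_1, \dots, \gamma_n]^{\top}$: ground distances to/from bank bins.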
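The block structure of the extended ground distance is easier to read entry by entry. The sketch below assumes the standard Kronecker-product reading, in which the entry $(i, j)$ of $1_n \otimes \gamma^{\top}$ is $\gamma_j$ and that of $1_n^{\top} \otimes \gamma$ is $\gamma_i$; the function name `extend_ground_distance` is hypothetical.

```python
def extend_ground_distance(D, gamma):
    """Build the 2n x 2n ground distance over n regular bins and n bank bins."""
    n = len(D)
    ext = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            # Regular bin i -> regular bin j: the original ground distance.
            ext[i][j] = D[i][j]
            # Regular bin i -> bank bin j: travel to j, then pay gamma_j.
            ext[i][n + j] = D[i][j] + gamma[j]
            # Bank bin i -> regular bin j: pay gamma_i, then travel to j.
            ext[n + i][j] = D[i][j] + gamma[i]
            # Bank bin i -> bank bin j: pay both, except on the diagonal,
            # where the -2 diag(gamma) term cancels the cost to zero.
            bank = D[i][j] + gamma[i] + gamma[j]
            if i == j:
                bank -= 2 * gamma[i]
            ext[n + i][n + j] = bank
    return ext
```

With `D = [[0, 1], [1, 0]]` and `gamma = [5, 7]`, the bank-to-bank block has a zero diagonal, so mass that stays inside a bank bin costs nothing to "move".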
$$\mathrm{SND}(P, Q) = \mathrm{EMD}^\star(P^+, Q^+, D(P, +)) + \mathrm{EMD}^\star(P^-, Q^-, D(P, -)) + \mathrm{EMD}^\star(Q^+, P^+, D(Q, +)) + \mathrm{EMD}^\star(Q^-, P^-, D(Q, -)).$$
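The four-term structure of SND can be expressed as a small composition. This is a sketch under assumptions not spelled out on the slide: a network state is taken to be a vector with entries in {+1, 0, −1}, $P^+$ and $P^-$ are taken to be indicator histograms of the positive and negative opinions, and `emd_star` and `ground_distance` are hypothetical callables supplied by the caller (the latter standing in for $D(\cdot, \pm)$).

```python
def split_polar(state):
    """Split a polar state into indicator histograms of + and - opinions."""
    plus = [1 if s > 0 else 0 for s in state]
    minus = [1 if s < 0 else 0 for s in state]
    return plus, minus

def snd(P, Q, emd_star, ground_distance):
    """Sum the four EMD* terms: both opinion types, ground distance from each state."""
    P_plus, P_minus = split_polar(P)
    Q_plus, Q_minus = split_polar(Q)
    return (
        emd_star(P_plus, Q_plus, ground_distance(P, +1))
        + emd_star(P_minus, Q_minus, ground_distance(P, -1))
        + emd_star(Q_plus, P_plus, ground_distance(Q, +1))
        + emd_star(Q_minus, P_minus, ground_distance(Q, -1))
    )
```

Because each EMD⋆ term appears once with the ground distance computed in P and once with it computed in Q, swapping the arguments of `snd` only reorders the summands, so the result is symmetric in P and Q.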
[Figure: # bins, reduced step by step from P1 P2 P3 P4, Q1 Q2 Q3 Q4 to P2 P3, Q1 Q4 (unbalanced bipartite network)]
$O(n^2 \log n)$    $O(n_\Delta n \log \sqrt{U})$
$O(n^3 \log n)$    $O(n_\Delta m + n_\Delta^3 \log(n_\Delta n U))$
5 Ravindra K. Ahuja et al. “Faster algorithms for the shortest path problem”. In: Journal of the ACM 37.2 (1990), pp. 213–223.
6 Ravindra K. Ahuja et al. “Improved algorithms for bipartite network flow”. In: SIAM Journal
[Plot: Distance between adjacent network states. x-axis: network state pair index; y-axis: distance (scaled); series: SND, hamming, walk-dist, quad-form; marker: simulated anomaly]
A series of 40 network states is generated using $P_{nbr} = 0.12$ and $P_{ext} = 0.01$ for normal network states and $P_{nbr} = 0.08$ and $P_{ext} = 0.05$ for anomalous ones. The three simulated anomalies are displayed as solid vertical lines.
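The quantity plotted here, the scaled distance between adjacent network states, can be computed for any distance function. The sketch below uses the Hamming baseline for concreteness and scales by the largest observed distance; the scaling rule and the function names are assumptions of this example, since the slide only labels the axis "Distance (scaled)".

```python
def hamming(x, y):
    # Number of users whose opinion differs between the two states.
    return sum(1 for a, b in zip(x, y) if a != b)

def adjacent_distances(states, dist=hamming):
    """Distances between consecutive states, scaled so the largest equals 1."""
    raw = [dist(states[t], states[t + 1]) for t in range(len(states) - 1)]
    peak = max(raw) if raw and max(raw) > 0 else 1
    return [d / peak for d in raw]
```

A spike in the resulting series marks a transition that differs from its neighbors, which is where a simulated anomaly would be expected to show up.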
[Plot: True Positive Rate (TPR) vs. False Positive Rate (FPR) for SND, hamming, walk-dist, and quad-form, over a series of 300 network states on a synthetic network with $|V| = 30$k and scale-free exponent $\gamma = -2.3$]
The network states are generated using $P_{nbr} = 0.08$ and $P_{ext} = 0.001$ for normal instances and $P_{nbr} = 0.07$ and $P_{ext} = 0.011$ for anomalous instances.
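A TPR/FPR curve of this kind is obtained by thresholding an anomaly score. As a hedged sketch, assume each transition between network states has a score (for instance, its scaled distance) and a 0/1 label saying whether it was simulated as anomalous; the function name `roc_points` and this scoring setup are assumptions, not the evaluation code behind the plot.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by thresholding scores at each distinct value.

    labels are 1 for anomalous and 0 for normal instances; both classes must occur.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]                      # threshold above every score
    for thr in sorted(set(scores), reverse=True):
        # Everything scoring at least thr is flagged as anomalous.
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((fp / neg, tp / pos))
    return points
```

A distance measure that separates anomalous from normal transitions well reaches a TPR of 1 while the FPR is still close to 0.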
[Plot: Distance between adjacent network states (topic "Obama"). x-axis: pairs of adjacent time windows from 05'08-11'08 to 02'11-08'11; y-axis: distance (scaled); series: SND, hamming, walk-dist, quad-form, anomaly, interest; annotated events: election, inauguration, Economic Stimulus Bill, Nobel Prize, "Obama Care", Tax plan, bin Laden]
The distances are accompanied by the curve showing Google Trends’ scaled interest in topic “Obama”. Network states detected to be anomalous by at least one distance measure are displayed as solid vertical lines.
[Plot: Time computing SND (log-log scale). x-axis: # of users in the network, 1k to 200k; y-axis: time, sec; series: our method, CPLEX]
Time for computing SND when the number of users having different opinions is fixed at $n_\Delta = 1000$ and the total number of users $n$ in the network grows up to 200k.
[Plot: Time computing SND. x-axis: $n_\Delta$, 2k to 10k; y-axis: time, sec]
Time for computing SND using our method when the network size is fixed at $n = 20$k and the number $n_\Delta$ of users having changed their opinions grows up to 10k.