Sampling Vertices Uniformly from a Graph
Flavio Chierichetti (Sapienza University)
With subsets of: Anirban Dasgupta (IIT Gandhinagar), Shahrzad Haddadan (Sapienza University), Silvio Lattanzi (Google Zurich), Ravi Kumar (Google MTV), Tamás Sarlós (Google MTV)
Asking all the users is too costly!
Select some people uniformly-at-random and ask them their opinion
The empirical average will be close to the real average
What is the fraction of ?
The empirical fraction of will be close to the real fraction
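The two claims above (empirical average and empirical fraction) can be illustrated with a minimal sketch. It assumes, unlike the rest of the talk, that uniform access to the users is available; the population of opinion scores and the helper names `estimate_average` and `estimate_fraction` are hypothetical, introduced only for illustration.

```python
import random


def estimate_average(values, sample_size, rng):
    """Estimate the mean of `values` from a uniform sample drawn with replacement."""
    sample = [rng.choice(values) for _ in range(sample_size)]
    return sum(sample) / sample_size


def estimate_fraction(values, predicate, sample_size, rng):
    """Estimate the fraction of `values` satisfying `predicate` from a uniform sample."""
    sample = [rng.choice(values) for _ in range(sample_size)]
    return sum(1 for v in sample if predicate(v)) / sample_size


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population: one opinion score in 1..5 per user.
    opinions = [rng.randint(1, 5) for _ in range(100_000)]
    print("real average:     ", sum(opinions) / len(opinions))
    print("empirical average:", estimate_average(opinions, 2_000, rng))
    print("empirical fraction of scores >= 4:",
          estimate_fraction(opinions, lambda v: v >= 4, 2_000, rng))
```

Only 2,000 of the 100,000 hypothetical users are queried, yet both estimates typically land close to the real values.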
Then, what can we do?
[Figure: example pages of the network: http://s-n.com/001.html, http://s-n.com/005.html, http://s-n.com/011.html, http://s-n.com/012.html]
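[Figure: a random walk on the network. From a node with 4 neighbors it moves to each neighbor with probability 1/4; from a node with 3 neighbors, with probability 1/3.]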
If the process goes on for sufficiently many steps, the node it ends up on will be “random”, chosen with probability proportional to its degree.
The number of steps this takes is the Mixing Time T(G).
The Mixing Times of many “Social Networks” are small [Leskovec et al., ’08].
In the example, a node of degree 1 is reached with probability 1/18 and a node of degree 4 with probability 4/18.
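The degree-proportional limit can be checked numerically. The sketch below is only an illustration: the 7-node graph is a hypothetical example chosen so that the degrees sum to 18 (it is not the graph drawn on the slide), and the helper names are made up. Instead of simulating walks, it tracks the exact probability of being at each node after every step.

```python
# Hypothetical 7-node graph with 9 edges, so the degrees sum to 18.
GRAPH = {
    "a": ["b", "c", "d", "e"],  # degree 4
    "b": ["a", "c"],
    "c": ["a", "b", "f"],
    "d": ["a", "e"],
    "e": ["a", "d", "f"],
    "f": ["c", "e", "g"],
    "g": ["f"],                 # degree 1
}


def step(graph, dist):
    """One walk step: every node splits its probability evenly among its neighbors."""
    new = {v: 0.0 for v in graph}
    for v, p in dist.items():
        for u in graph[v]:
            new[u] += p / len(graph[v])
    return new


def walk_distribution(graph, start, steps):
    """Exact distribution of the walk's position after `steps` steps from `start`."""
    dist = {v: 0.0 for v in graph}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(graph, dist)
    return dist


def degree_distribution(graph):
    """Distribution in which each node has probability degree / (sum of degrees)."""
    total = sum(len(nbrs) for nbrs in graph.values())
    return {v: len(nbrs) / total for v, nbrs in graph.items()}


if __name__ == "__main__":
    after = walk_distribution(GRAPH, "g", 1000)
    target = degree_distribution(GRAPH)
    for v in GRAPH:
        print(v, round(after[v], 6), round(target[v], 6))
```

The example graph is connected and contains a triangle, which is what lets the walk's position converge to the degree-proportional distribution regardless of the starting node.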
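Keeping the node reached by the walk with probability 1/degree evens out the bias:
a node of degree 4 is returned with probability ~ 4/18 · 1/4 = ~ 1/18,
a node of degree 1 is returned with probability ~ 1/18 · 1/1 = ~ 1/18.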
This algorithm returns a node chosen (arbitrarily close to) uniformly at random
One can easily show that this algorithm downloads, with high probability, at most O(T(G) · AvgDeg(G)) nodes from the network
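A minimal sketch of such a sampler, under stated assumptions: the walk endpoint is kept with probability 1/degree and the walk is continued otherwise (this acceptance rule is one reading of the arithmetic on the slides, not a transcription of the talk's algorithm); `walk_length` stands in for the mixing time T(G); the 7-node graph and all function names are hypothetical.

```python
import random

# Hypothetical 7-node graph with 9 edges (average degree 18/7).
GRAPH = {
    "a": ["b", "c", "d", "e"],
    "b": ["a", "c"],
    "c": ["a", "b", "f"],
    "d": ["a", "e"],
    "e": ["a", "d", "f"],
    "f": ["c", "e", "g"],
    "g": ["f"],
}


def random_walk(graph, start, steps, rng):
    """Follow `steps` uniformly random edges starting from `start`."""
    node = start
    for _ in range(steps):
        node = rng.choice(graph[node])
    return node


def sample_uniform(graph, start, walk_length, rng):
    """Return (node, number_of_walks); `node` is close to uniform if walk_length >= T(G)."""
    node, walks = start, 0
    while True:
        node = random_walk(graph, node, walk_length, rng)
        walks += 1
        # Accept with probability 1/degree to cancel the degree bias of the walk.
        if rng.random() < 1.0 / len(graph[node]):
            return node, walks


if __name__ == "__main__":
    rng = random.Random(0)
    counts = {v: 0 for v in GRAPH}
    for _ in range(5000):
        node, _walks = sample_uniform(GRAPH, "a", 40, rng)
        counts[node] += 1
    print(counts)
```

Under these assumptions a walk endpoint is accepted with probability about n / (sum of degrees) = 1 / AvgDeg(G), so roughly AvgDeg(G) walks of length T(G) are needed per sample, which is in line with the O(T(G) · AvgDeg(G)) bound stated above.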
# of Downloaded Vertices ≤ AvgDeg(G) · T(G)
Running Time: D · T(G)
various algorithms for selecting a UAR node.
If an algorithm downloads o(T(G) · AvgDeg(G)) nodes from the network, then it cannot return anything close to a uniform-at-random node.
A distribution over graphs G
[C., Haddadan,’18]
[Figure: construction with vertex labels v, v’, v’1, v’2, v’3 and the label cT.]
average degree d > ω(1).
satisfies α T < S < α’ T, for constants α = α(c) and α’ = α’(c).
increases by a factor of 1 + Θ(c).
decreases by a factor of 1 + Θ(c).
[Figure labels: cT, 1/T]
Let G be some (random) graph, and let H be a (random) decoration of G.
We flip a fair coin, and run the (generic) algorithm on one of the two graphs.
By showing that the algorithm cannot detect whether it is running on G or H, we prove that the algorithm cannot solve a number of problems.
G(n, d/n) graph, with d ~ log n,
~ log n.
them with a random matching of < n / 2 edges,
time T of the resulting G.
The edges towards stars will make up a 1 / (T d) fraction of the visited edges.
[C., Haddadan,’18]
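The first ingredient of the construction, a G(n, d/n) graph with d ~ log n, can be sketched as follows. This covers only that ingredient: the stars and the random matching are not reproduced here, because the slide text describing them is incomplete. The function names `gnp` and `average_degree` are hypothetical.

```python
import math
import random


def gnp(n, p, rng):
    """G(n, p): each of the n(n-1)/2 possible edges is present independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def average_degree(adj):
    """Sum of degrees divided by the number of vertices."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


if __name__ == "__main__":
    n = 1000
    d = math.log(n)  # d ~ log n, as on the slide
    graph = gnp(n, d / n, random.Random(0))
    print("target d:", d, "observed average degree:", average_degree(graph))
```

The expected average degree is d · (n − 1) / n, so for large n the observed value should be close to d.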
average degree Θ(d) and mixing time Θ(T) such that, no algorithm accessing o(T d) nodes of G can
[C., Haddadan,’18]
Problem | Upper Bound | Lower Bound
Average of a Bounded Function | O(t_mix · d_avg · log(δ^-1) · ε^-2) (Theorem 2.2, with an Algorithm of [2]) | Ω(t_mix · d_avg · log(δ^-1) · ε^-2) (Theorem 2.3)
Uniform Sample | O(t_mix · d_avg · log(ε^-1)) ([2]) | Ω(t_mix · d_avg) (Theorem 2.1)
Number of Vertices | O(t_mix · max{d_avg, ‖Π_1‖_2^(-1/2)} · log(δ^-1) · log(ε^-1) · ε^-2) ([11]) | Ω(t_mix · d_avg) (Theorem 2.4)
Average Degree | O(D^2 · t_mix · d_avg · log(δ^-1) · ε^-2) (Application of Theorem 2.2) | Ω(t_mix · d_avg) (Theorem 2.4)
Algorithms: Max-Degree; Max-Degree/Rejection-Sampling; [Katzir et al.]
approximate the number of nodes of G?
be improved?
studied the number of node accesses to return a node with probability proportional to some power of its degree.
problem?