Latest on Linear Sketches for Large Graphs: Lots of Problems, Little Space, and Loads of Handwaving
Andrew McGregor
University of Massachusetts
Vertex Connectivity and Sparsification
Guha, McGregor, Tench [PODS 15]
Densest Subgraphs
McGregor, Tench, Vorotnikova, Vu [MFCS 15]
Matching, Vertex Cover, Hitting Set
Chitnis, Cormode, Esfandiari, Hajiaghayi, McGregor, Monemizadeh, Vorotnikova [SODA 16]
massive graph defined by a long sequence of edge insertions and deletions. Don’t want to have to store the entire graph.
Technique: Linear sketches. Maintain random linear projections of vectors and matrices representing the graph.
spectral sparsification, matching, vertex cover, hitting set, correlation clustering, triangles, spanners, densest subgraph…
Graph Streaming Survey
McGregor [SIGMOD Record 14]
M ∈ ℝ^{polylog(N) × N} such that for any x ∈ ℝ^N, a random non-zero element of x can be reconstructed from Mx whp.
Tardos [PODS 11]
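A minimal Python sketch of the idea, not the construction from the cited paper: coordinates are subsampled at geometric rates and each level keeps three linear counters that allow recovery when exactly one non-zero coordinate survives. The class name `ToyL0Sketch`, the hash-based level assignment, the fingerprint modulus, and the restriction to integer vectors are all assumptions made for illustration. The returned coordinate is some non-zero coordinate, not a provably uniform one.

```python
import hashlib

P = (1 << 61) - 1  # prime modulus for the fingerprint test (illustrative choice)


class ToyL0Sketch:
    """Toy linear sketch of an integer vector x of length n (illustrative only)."""

    def __init__(self, n, seed=0):
        self.n, self.seed = n, seed
        self.levels = n.bit_length() + 1
        # Fingerprint base derived from the seed (hypothetical choice).
        self.r = 2 + self._hash("base") % (P - 2)
        self.s0 = [0] * self.levels  # per level: sum of x_i
        self.s1 = [0] * self.levels  # per level: sum of i * x_i
        self.fp = [0] * self.levels  # per level: sum of x_i * r^i mod P

    def _hash(self, key):
        digest = hashlib.blake2b(f"{self.seed}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _depth(self, i):
        # Coordinate i survives to level d with probability about 2^-d.
        h, d = self._hash(i), 0
        while d < self.levels - 1 and (h >> d) & 1:
            d += 1
        return d

    def update(self, i, delta):
        """Apply x_i += delta; every counter is a linear function of x."""
        for j in range(self._depth(i) + 1):
            self.s0[j] += delta
            self.s1[j] += i * delta
            self.fp[j] = (self.fp[j] + delta * pow(self.r, i, P)) % P

    def sample(self):
        """Return (i, x_i) for some non-zero coordinate, or None on failure."""
        for j in reversed(range(self.levels)):  # try the sparsest level first
            s0, s1 = self.s0[j], self.s1[j]
            if s0 != 0 and s1 % s0 == 0:
                i = s1 // s0
                if 0 <= i < self.n and self.fp[j] == (s0 * pow(self.r, i, P)) % P:
                    return i, s0
        return None


sk = ToyL0Sketch(n=16)
sk.update(3, +1)    # insertion
sk.update(7, +1)    # insertion
sk.update(3, -1)    # deletion cancels the first insertion
print(sk.sample())  # (7, 1)
```

Because every counter is a sum over coordinates, two sketches built with the same seed can be added entry by entry to obtain the sketch of the summed vector.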
stream model using O(polylog n) bits of space.
based on sampled edges. Return max_S Ď_S.
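A small in-memory illustration of estimating densities from sampled edges, under stated assumptions: each edge is kept independently with probability `p`, the density of S is taken to be (edges inside S) / |S|, and the estimate rescales the sampled count by 1/p. The function names and the brute-force search over all subsets are hypothetical and only practical for tiny graphs; a streaming version would obtain the sample from sketches.

```python
import random
from itertools import combinations


def density(S, edges):
    """Number of edges with both endpoints in S, divided by |S|."""
    S = set(S)
    return sum(1 for u, v in edges if u in S and v in S) / len(S)


def estimated_densest(nodes, edges, p, seed=0):
    """Estimate max_S density(S) using only edges sampled with probability p."""
    rng = random.Random(seed)
    sampled = [e for e in edges if rng.random() < p]
    subsets = (S for r in range(1, len(nodes) + 1) for S in combinations(nodes, r))
    best = max(subsets, key=lambda S: density(S, sampled))
    return density(best, sampled) / p, set(best)


# K4 on nodes 0..3 plus a pendant edge (3, 4); with p = 1 the estimate is exact.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
print(estimated_densest(nodes, edges, p=1.0))  # (1.5, {0, 1, 2, 3})
```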
What other types of sampling are there that a) are useful for solving graph problems and b) can be supported on dynamic graph streams?
via SNAPE Sampling
via DEALS Sampling
max matching in dynamic stream model using Õ(k^2) space.
cover, hitting set… but gets a lot more complicated.
size Ω(k/t) in the dynamic stream model using Õ(k^2/t^3) space.
using Õ(n^2/t^3) space. This is also optimal; ask Grigory!
remain, return “null”
SNAPE samples will include a max matching from G.
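An in-memory sketch of one SNAPE sample, assuming it means: keep each node independently with probability p, then return one of the edges whose endpoints both survive, or null if none remain. The function name `snape_sample`, the uniform choice among remaining edges, and the example path graph are assumptions. In the streaming setting the same sample would be drawn from a linear sketch, not from a stored edge list.

```python
import random


def snape_sample(nodes, edges, p, rng):
    """Keep each node with probability p; return a surviving edge or None."""
    kept = {v for v in nodes if rng.random() < p}
    remaining = [(u, v) for u, v in edges if u in kept and v in kept]
    return rng.choice(remaining) if remaining else None


rng = random.Random(1)
nodes = range(6)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # hypothetical path graph
samples = [snape_sample(nodes, edges, p=0.5, rng=rng) for _ in range(20)]
print(samples)
```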
is ≥10k and edge is shallow if both endpoints aren’t heavy.
[Figure: a shallow edge and a heavy node]
G’ includes all shallow edges in G. Every heavy node in G has degree at least 5k in G’.
you still have plenty of other edges on that node.
[Figure: shallow edge uv]
Γ(u) or Γ(v) are sampled then uv is only edge remaining!
Pr[uv is only remaining edge] ≥ p^2 (1 − p)^{|Γ(u)|+|Γ(v)|+|W|} = Ω(k^{−2})
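A quick numerical check of why the bound is Ω(k^{−2}), under two assumptions that are not spelled out in the surviving text: the sampling probability is p = 1/k, and the exponent |Γ(u)|+|Γ(v)|+|W| is at most c·k for some constant c. The function name and the value c = 3 are hypothetical. Since (1 − 1/k)^k ≥ 1/4 for k ≥ 2, the bound times k^2 stays above 4^{−c}.

```python
def only_edge_lower_bound(k, c):
    """Evaluate p^2 (1 - p)^(c*k) with p = 1/k."""
    p = 1.0 / k
    return p * p * (1.0 - p) ** (c * k)


c = 3  # hypothetical constant bounding the exponent by c * k
for k in (2, 10, 100, 1000):
    # The rescaled value never drops below 4^-c, so the bound is Omega(k^-2).
    print(k, only_edge_lower_bound(k, c) * k * k, 4.0 ** -c)
```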
HEAVY NODE
Pr[edge incident to u is sampled] ≈ Ω(kp^2) · (1 − p)^{|W|} = Ω(k^{−1})
with p = Θ(t/k) has a matching of size Ω(k/t) with high probability.
constructing greedy matching M.
Pr[e_i added to M] ≈ Pr[e_i isn’t a NULL] · Pr[all endpoints in M are deleted] = Ω(kp^2) · (1 − p)^{o(k/t)} = Ω(t^2/k)
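The greedy step can be sketched as follows, assuming the samples e_1, e_2, ... arrive as a list in which a failed sample is represented by `None`. The function name and the input list are hypothetical. An edge is added only when neither endpoint is already matched.

```python
def greedy_matching(samples):
    """Scan sampled edges in order; keep an edge if both endpoints are free."""
    matched = set()
    matching = []
    for e in samples:
        if e is None:  # a NULL sample contributes nothing
            continue
        u, v = e
        if u not in matched and v not in matched:
            matching.append(e)
            matched.update((u, v))
    return matching


samples = [(1, 2), None, (2, 3), (3, 4), (1, 2)]
print(greedy_matching(samples))  # [(1, 2), (3, 4)]
```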
using Õ(ε^{−1} k n) space.
at end of the stream. May use Õ(n) space.
sketch and a_i is the vector encoding the neighborhood of node i.
[Figure: example graph on nodes 1 to 5]
{1,2} {1,3} {1,4} {1,5} {2,3} {2,4} {2,5} {3,4} {3,5} {4,5}
[Figure: example vectors a_1, a_2, and a_1 + a_2, indexed by the ten node pairs above]
hence ∑_{i∈S} M a_i = M(∑_{i∈S} a_i) yields a random edge across (S, V\S).
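A small Python illustration of why summing the node vectors isolates the cut, under an assumed sign convention: for an edge {j,k} with j < k, the coordinate {j,k} is +1 in a_j and −1 in a_k, so edges inside S cancel in the sum. The 5-node edge list and the integer matrix `M` below are hypothetical stand-ins, not the graph from the slide or a real sampling sketch.

```python
from itertools import combinations


def node_vector(i, n, edges):
    """a_i over all pairs {j,k}, j<k: +1 if i=j, -1 if i=k (edge present), else 0."""
    edge_set = {frozenset(e) for e in edges}
    vec = []
    for j, k in combinations(range(1, n + 1), 2):
        present = frozenset((j, k)) in edge_set
        vec.append(1 if present and i == j else -1 if present and i == k else 0)
    return vec


def sum_vectors(S, n, edges):
    """Coordinate-wise sum of a_i over i in S."""
    return [sum(col) for col in zip(*(node_vector(i, n, edges) for i in S))]


def apply(M, x):
    """Matrix-vector product Mx for a matrix given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


n = 5
edges = [(1, 2), (1, 3), (2, 5), (3, 4), (4, 5)]  # hypothetical example graph
pairs = list(combinations(range(1, n + 1), 2))

total = sum_vectors([1, 2], n, edges)
crossing = [pairs[t] for t, val in enumerate(total) if val != 0]
print(crossing)  # [(1, 3), (2, 5)]

# Linearity: sketching each a_i and adding equals sketching the sum.
M = [[3, -1, 4, 1, -5, 9, 2, -6, 5, 3], [2, 7, -1, 8, 2, -8, 1, 8, -2, 8]]
a1, a2 = node_vector(1, n, edges), node_vector(2, n, edges)
lhs = [p + q for p, q in zip(apply(M, a1), apply(M, a2))]
print(lhs == apply(M, total))  # True
```

The non-zero coordinates of the sum are exactly the edges leaving S = {1, 2}; the internal edge {1,2} cancels.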
connected after removal of set of k nodes S” using Õ(kn) space.
path between u and v amongst sampled edges.
spanning forest on these nodes. Repeat Õ(k^2) times.
between x_i and x_{i+1} with prob. p^2 (1−p)^k ≈ k^{−2}. After Õ(k^2) repeats we have an S-avoiding path in E’ with high probability.
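An in-memory sketch of the repeat-and-union idea, assuming each round keeps every node independently with probability 1/k and builds a spanning forest on the kept nodes, with E’ the union of all forests. Function names, the union-find forest construction, and the 6-cycle example are hypothetical. A streaming version would build each forest from sketches instead of a stored edge list.

```python
import random


def spanning_forest(nodes, edges):
    """Return (forest, root) for the subgraph induced on `nodes`."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        if u in parent and v in parent and find(u) != find(v):
            parent[find(u)] = find(v)
            forest.append((u, v))
    return forest, {v: find(v) for v in parent}


def union_of_sampled_forests(nodes, edges, k, repeats, seed=0):
    """E': union of spanning forests on node samples taken with probability 1/k."""
    rng = random.Random(seed)
    kept = set()
    for _ in range(repeats):
        sample = [v for v in nodes if rng.random() < 1.0 / k]
        forest, _root = spanning_forest(sample, edges)
        kept.update(forest)
    return kept


def connected_after_removal(nodes, edges, removed, u, v):
    """Are u and v connected once the nodes in `removed` are deleted?"""
    alive = [x for x in nodes if x not in removed]
    _forest, root = spanning_forest(alive, edges)
    return root[u] == root[v]


nodes = list(range(6))
edges = [(i, (i + 1) % 6) for i in range(6)]  # hypothetical 6-cycle
E_prime = union_of_sampled_forests(nodes, edges, k=2, repeats=200)
print(connected_after_removal(nodes, E_prime, {2}, 0, 4))
```

Since E’ is a subset of the original edges, any path found in E’ is a genuine S-avoiding path in the graph; the repeats are what make a missed path unlikely.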
cut sizes where λ_e is the edge connectivity. Fung et al. [STOC 11]
sample each remaining edge with probability 1/2.