SLIDE 1 Finding Graph Matchings in Data Streams
Andrew McGregor, UPenn
SLIDE 2
The Streaming Model
SLIDE 6 The Streaming Model
- Classic Problem: Median Finding [Munro & Paterson]
- Parameters of the Model:
- How much memory?
- How many passes?
- How much computation time between data elements?
- Statistics, Norms and Histograms…
- What about graph problems?
SLIDE 7 Graph Streaming
- Instance of graph problem G = (V, E)
- Edges arrive in arbitrary order: e1, e2, e3, …, em
- Memory limit O(n polylog n) where n = |V|
- Spanner Construction, Bipartite Matching, Lower Bounds
[Feigenbaum, Kannan, M., Suri, Zhang ’04 & ’05]
- “Annotation” Stream Model [Aggarwal, Datar, Rajagopalan, Ruhl ’04; Demetrescu, Finocchi, Ribichini ’05]
SLIDE 8 Matching
- A matching: a set of edges, no two of which share an endpoint.
- Problems:
- Find the matching of maximum cardinality (MCM)
- Find the matching of maximum weight (MWM)
- (Non-streamable) Algorithms:
- Exact polytime algorithm for both [Gabow ’90]
- Linear-time 1+ε approx for MCM [Kalantari & Shokoufandeh ’95]
- Linear-time 3/2+ε approx for MWM [Drake & Hougardy ’03]
SLIDE 9 Results
- Unweighted (MCM): 1+ε approximation in a constant number of passes.
- Weighted (MWM): 3+2√2 approximation in a single pass.
- Weighted (MWM): 2+ε approximation in a constant number of passes.
SLIDE 10
Unweighted Matchings.
SLIDE 11 An Easy 2 Approximation
- Store an edge if it is not adjacent to any stored edge
- This constructs a maximal matching, which is a 2-approximation
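The greedy rule above can be sketched in a few lines. This is a minimal illustration, assuming the stream is an iterable of (u, v) pairs; the function name `greedy_maximal_matching` is made up for this example.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_maximal_matching(stream: Iterable[Edge]) -> List[Edge]:
    """One pass: keep an edge only if neither endpoint is already matched."""
    matched: Set[Hashable] = set()   # endpoints of stored edges
    matching: List[Edge] = []
    for u, v in stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# On the path 1-2-3-4 the result depends on the arrival order.
print(greedy_maximal_matching([(1, 2), (2, 3), (3, 4)]))  # [(1, 2), (3, 4)]
print(greedy_maximal_matching([(2, 3), (1, 2), (3, 4)]))  # [(2, 3)]
```

The second ordering stores one edge while the optimum has two, so the factor of 2 is reached on this example. The sketch keeps only the stored edges and their endpoints, so it uses at most n/2 edges of memory.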
SLIDE 14
Augmenting Paths
Matching M
SLIDE 15 Augmenting Paths
- Augmenting Path: a simple path that starts and ends at unmatched nodes and whose edges alternate between M and E\M.
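As a small illustration of the definition, the sketch below checks whether a node sequence is an augmenting path for a matching M and augments M by taking the symmetric difference with the path's edges. The helper names are hypothetical, edges are modelled as frozensets of two nodes, and the path's edges are assumed to exist in E.

```python
from typing import FrozenSet, Hashable, List, Set

Edge = FrozenSet[Hashable]


def path_edges(path: List[Hashable]) -> List[Edge]:
    """Consecutive node pairs of the path, as undirected edges."""
    return [frozenset((a, b)) for a, b in zip(path, path[1:])]


def is_augmenting_path(matching: Set[Edge], path: List[Hashable]) -> bool:
    """Simple path, unmatched endpoints, edges alternating E\\M, M, E\\M, ..."""
    if len(path) < 2 or len(set(path)) != len(path):
        return False
    matched_nodes = {v for e in matching for v in e}
    if path[0] in matched_nodes or path[-1] in matched_nodes:
        return False
    # Edges at odd positions must be in M, edges at even positions must not.
    return all((e in matching) == (i % 2 == 1)
               for i, e in enumerate(path_edges(path)))


def augment(matching: Set[Edge], path: List[Hashable]) -> Set[Edge]:
    """Symmetric difference of M with the path: one more edge than before."""
    return matching ^ set(path_edges(path))


m = {frozenset((2, 3))}
print(is_augmenting_path(m, [1, 2, 3, 4]))  # True
print(len(augment(m, [1, 2, 3, 4])))        # 2
```

Augmenting along the path 1-2-3-4 replaces the single matched edge {2, 3} with {1, 2} and {3, 4}.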
SLIDE 18 Augmenting Paths
- Consider the augmenting paths defined by taking the symmetric difference between the current (maximal) matching M and an optimum matching.
- Let Pi be the number of length i augmenting paths. Then:
|M| + Σ_{1 ≤ i ≤ k} Pi ≥ OPT(1 − 1/k)
SLIDE 19 Algorithm Outline
1. Find a maximal matching
2. For 1 ≤ i ≤ k: find a set, Si, of length i augmenting paths
3. Augment the current matching with Sj, where j = argmax_i |Si|
4. Repeat from step 2 unless Sj is small
SLIDE 20
Projecting to Layered Graphs
[Figure: the graph G and its projection to the layered graph L(G)]
SLIDE 35
- Lemma: If there are Pi length i augmenting paths in G, then we expect Pi / (2(2i)^i) node-disjoint paths in L(G).
- Lemma: A maximal set of node-disjoint paths in L(G) is an (i+2)-approximation to the maximum set of node-disjoint paths in L(G).
- To find a constant fraction of the Pi length i augmenting paths, create the layered graph and greedily find node-disjoint paths.
SLIDE 68 Limiting Backtracking
- Solution: If the number of paths being grown falls below the threshold δn, then delete and backtrack.
- Good: Only backtrack a constant number of times
- Bad: Don’t find a maximal set of node-disjoint paths
- In a constant number of passes, we find a constant fraction of the length i node-disjoint paths/augmenting paths.
SLIDE 69
Weighted Matching.
SLIDE 73 Single Pass 3+2√2 Approximation
- At all times we store some matching M
- For each edge e:
- Compute the total weight W of the edges e1, e2 in M incident to e
- If w(e) > (1+γ) W then M ← M ∪ {e} \ {e1, e2}
- We say e is “born” and that e1 and e2 are “killed”
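The update rule above can be sketched as follows. This is an illustrative version, assuming the stream is an iterable of (u, v, weight) triples with non-negative weights; the function name and the `mate` dictionary are choices made for this example, not part of the talk.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

WeightedEdge = Tuple[Hashable, Hashable, float]


def one_pass_weighted_matching(stream: Iterable[WeightedEdge],
                               gamma: float) -> Set[WeightedEdge]:
    """Keep a matching M; a new edge replaces its conflicting edges in M
    only if it outweighs them by a factor of more than (1 + gamma)."""
    mate: Dict[Hashable, WeightedEdge] = {}   # endpoint -> stored edge
    for u, v, w in stream:
        if u == v:
            continue
        # Edges of M incident to e (at most two: one per endpoint).
        conflicts = {mate[x] for x in (u, v) if x in mate}
        total = sum(c[2] for c in conflicts)   # this is W
        if w > (1 + gamma) * total:
            for a, b, _ in conflicts:          # e "kills" e1 and e2
                del mate[a]
                del mate[b]
            mate[u] = mate[v] = (u, v, w)      # e is "born"
    return set(mate.values())


stream = [(1, 2, 1.0), (2, 3, 3.0), (3, 4, 4.0)]
print(one_pass_weighted_matching(stream, gamma=2 ** -0.5))  # {(2, 3, 3.0)}
print(one_pass_weighted_matching(stream, gamma=0.1))        # {(3, 4, 4.0)}
```

With γ = 1/√2 the last edge (weight 4) does not exceed (1+γ)·3, so the edge of weight 3 survives; with γ = 0.1 it does and the edge of weight 4 replaces it. Memory is one stored edge per matched node.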
SLIDE 80 Proof (Sketch)
- We say an edge e is a survivor if it is born and is never killed.
- Let S = all survivors.
- For a survivor e we define the trail of the dead T(e) to be the transitive closure of the edges killed by e.
- Claim 1: w(T(e)) ≤ w(e)/γ
- Claim 2: Can charge the weights of edges in OPT such that:
- At most (1+γ) w(T(e)) is charged to T(e)
- At most 2(1+γ) w(e) is charged to e
- Hence w(OPT) ≤ (1+γ) w(T(S)) + 2(1+γ) w(S) < (3+2√2) w(S)
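The constant 3+2√2 can be traced through a short calculation. The following is a sketch of the arithmetic behind the two claims, assuming non-negative weights and that the final bound is evaluated at γ = 1/√2 (a value the slides do not state explicitly, but the one that minimizes the expression).

```latex
% Claim 1: a birth requires w(e) > (1+\gamma) W, so the edges killed directly
% by e weigh at most w(e)/(1+\gamma), the edges they killed weigh at most
% w(e)/(1+\gamma)^2, and so on. Summing the geometric series:
\[
  w(T(e)) \;\le\; \sum_{j \ge 1} \frac{w(e)}{(1+\gamma)^{j}} \;=\; \frac{w(e)}{\gamma}.
\]
% Combining Claim 2 with Claim 1, summed over all survivors:
\[
  w(\mathrm{OPT}) \;\le\; (1+\gamma)\, w(T(S)) + 2(1+\gamma)\, w(S)
  \;\le\; \Bigl(\frac{1+\gamma}{\gamma} + 2(1+\gamma)\Bigr) w(S)
  \;=\; \Bigl(3 + \frac{1}{\gamma} + 2\gamma\Bigr) w(S).
\]
% The factor is minimized where -1/\gamma^{2} + 2 = 0, i.e. \gamma = 1/\sqrt{2}:
\[
  3 + \sqrt{2} + \sqrt{2} \;=\; 3 + 2\sqrt{2}.
\]
```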
SLIDE 86 Multi-pass 2+ε Approximation
- First pass: find a constant-factor approximate matching M1
- Subsequent passes: create Mi from Mi-1 by running the previous algorithm with γ(ε)
- Repeat while |Mi| / |Mi-1| > 1+κ(ε)
- Claim 1: A constant number of passes suffices
- Claim 2: When |Mi| / |Mi-1| ≤ 1+κ(ε), we have a 2+ε approximation
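The pass structure above might look like the sketch below. Several things here are assumptions made for illustration: the slides do not give γ(ε) or κ(ε), so `gamma` and `kappa` are plain parameters; the same `gamma` is used in every pass, including the first; the ratio |Mi| / |Mi-1| is read as a ratio of total weights; and `max_passes` is a safety cap that is not part of the talk.

```python
from typing import Dict, Hashable, Sequence, Set, Tuple

WeightedEdge = Tuple[Hashable, Hashable, float]


def improve_pass(edges: Sequence[WeightedEdge], matching: Set[WeightedEdge],
                 gamma: float) -> Set[WeightedEdge]:
    """One pass of the single-pass rule, started from an existing matching."""
    mate: Dict[Hashable, WeightedEdge] = {x: e for e in matching for x in e[:2]}
    for u, v, w in edges:
        if u == v:
            continue
        conflicts = {mate[x] for x in (u, v) if x in mate}
        if w > (1 + gamma) * sum(c[2] for c in conflicts):
            for a, b, _ in conflicts:
                del mate[a]
                del mate[b]
            mate[u] = mate[v] = (u, v, w)
    return set(mate.values())


def total_weight(matching: Set[WeightedEdge]) -> float:
    return sum(w for _, _, w in matching)


def multi_pass_matching(edges: Sequence[WeightedEdge], gamma: float,
                        kappa: float, max_passes: int = 100):
    """Repeat passes while a pass improves the weight by more than 1 + kappa."""
    matching = improve_pass(edges, set(), gamma)   # first pass: M1
    passes = 1
    while passes < max_passes:
        improved = improve_pass(edges, matching, gamma)
        passes += 1
        small_gain = total_weight(improved) <= (1 + kappa) * total_weight(matching)
        matching = improved
        if small_gain:
            break
    return matching, passes


edges = [(1, 2, 1.0), (2, 3, 3.0), (3, 4, 4.0)]
print(multi_pass_matching(edges, gamma=0.1, kappa=0.01))
```

On this three-edge path the first pass ends with only the weight-4 edge, the second pass adds the weight-1 edge that was freed, and the third pass changes nothing, so the loop stops after three passes with total weight 5.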
SLIDE 87 Conclusions
- Unweighted (MCM): 1+ε approximation in a constant number of passes.
- Weighted (MWM): 3+2√2 approximation in a single pass.
- Weighted (MWM): 2+ε approximation in a constant number of passes.
SLIDE 88
Thanks.