Randomized Composable Coreset for Matching and Vertex Cover

Randomized Composable Coreset for Matching and Vertex Cover. Sepehr Assadi, University of Pennsylvania. Joint work with Sanjeev Khanna (Penn). Sepehr Assadi (Penn), SPAA 2017. Massive Graphs: Massive graphs abound in a variety of applications: web


  3. Previous Work: Matching and Vertex Cover. It turned out that matching and vertex cover do not admit efficient summaries! [Assadi et al., 2016]: Any simultaneous protocol that can compute an $n^{o(1)}$-approximation for these problems requires summaries of size $n^{2-o(1)}$. As is traditional in this setting, this impossibility result is doubly worst case: both the underlying graph and the partitioning of the input are chosen adversarially! Can we distribute the original input in a better way?

  7. Our Results in a Nutshell. A natural data-oblivious partitioning scheme completely alters this landscape. Our work: both matching and vertex cover admit efficient simultaneous protocols provided that the edges of the graph are partitioned randomly across the machines. The idea that random partitioning can help was nicely illustrated by [Mirrokni and Zadimoghaddam, 2015] and [da Ponte Barbosa et al., 2015] on maximizing submodular functions. Our work is the first illustration in the domain of graph problems.

  12. Randomized Composable Coresets. Define $G^{(1)}, \ldots, G^{(k)}$ as a random partitioning of a graph $G$: each edge $e \in G$ is sent to one of the graphs uniformly at random. Consider an algorithm ALG that, given any graph $G$, computes a subgraph $ALG(G) \subseteq G$ with at most $s$ edges. ALG outputs an $\alpha$-approximation randomized composable coreset of size $s$ for a problem $P$ iff $P\big(ALG(G^{(1)}) \cup \ldots \cup ALG(G^{(k)})\big)$ is an $\alpha$-approximation for $P(G)$ with high probability (over the randomness of the partitioning). Defined originally by [Mirrokni and Zadimoghaddam, 2015] in the context of distributed submodular maximization.
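
The definition above can be made concrete with a minimal Python sketch. It shows the random edge partitioning and the union of the per-machine outputs that the coordinator sees. The names `random_partition`, `compose`, and `keep_first_s` are hypothetical, and `keep_first_s` is only a placeholder for ALG that respects the size bound; it is not one of the coresets from the talk.

```python
import random
from typing import Callable, Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def random_partition(edges: List[Edge], k: int, rng: random.Random) -> List[List[Edge]]:
    """Send each edge independently to one of k parts, uniformly at random."""
    parts: List[List[Edge]] = [[] for _ in range(k)]
    for e in edges:
        parts[rng.randrange(k)].append(e)
    return parts


def compose(parts: List[List[Edge]], alg: Callable[[List[Edge]], List[Edge]]) -> Set[Edge]:
    """Union of ALG(G^(1)), ..., ALG(G^(k)): the subgraph the coordinator solves P on."""
    union: Set[Edge] = set()
    for part in parts:
        union.update(alg(part))
    return union


def keep_first_s(s: int) -> Callable[[List[Edge]], List[Edge]]:
    """Placeholder ALG (an assumption for illustration): keep at most s edges."""
    return lambda part: part[:s]
```

For example, `compose(random_partition(edges, k, random.Random(0)), keep_first_s(s))` returns a subgraph of the input with at most `k * s` edges.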

  16. Upper Bound Results: Maximum Matching. Greedy and local search are typical choices for composable coresets. However, one can show that the greedy algorithm for matching, i.e., picking a maximal matching, performs poorly in general. Our approach: pick a maximum matching! Theorem: Any maximum matching is an $O(1)$-randomized composable coreset of size $n/2$ for the matching problem.

  20. Upper Bound Results: Vertex Cover. Can a minimum vertex cover also be used as a randomized composable coreset for this problem? Not really; consider a star with $k$ petals, for example. Unlike most problems that admit a composable coreset, the vertex cover problem has a hard-to-verify feasibility constraint. This motivates a slightly more general notion of composable coresets.

  24. Composable Coresets for Vertex Cover. A (randomized) composable coreset for the vertex cover problem contains both: (1) a subset of edges of the input graph to guide the coordinator on the choice of the vertex cover, and (2) an explicitly specified subset of vertices to be always included in the final vertex cover. Size of a coreset: number of edges + number of specified vertices.
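
The two-part coreset described above can be sketched as a small data structure plus a coordinator step. The slide does not say how the coordinator chooses the cover from the received edges; the rule below (take both endpoints of a greedily built maximal matching over the still-uncovered coreset edges) is an assumption chosen for simplicity, and the names `VCCoreset` and `coordinator_cover` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Edge = Tuple[int, int]


@dataclass
class VCCoreset:
    """One machine's message: guiding edges plus vertices forced into the cover."""
    edges: List[Edge] = field(default_factory=list)
    fixed_vertices: Set[int] = field(default_factory=set)

    def size(self) -> int:
        # Size as defined on the slide: number of edges + number of specified vertices.
        return len(self.edges) + len(self.fixed_vertices)


def coordinator_cover(coresets: List[VCCoreset]) -> Set[int]:
    """Combine coresets into a vertex set covering every coreset edge."""
    cover: Set[int] = set()
    # Part (2): explicitly specified vertices are always included.
    for c in coresets:
        cover |= c.fixed_vertices
    # Part (1): cover the remaining coreset edges (assumed rule: maximal matching endpoints).
    for c in coresets:
        for u, v in c.edges:
            if u not in cover and v not in cover:
                cover.update((u, v))
    return cover
```

The result always covers the union of the coreset edges; whether it also covers the whole input graph depends on how each machine builds its coreset.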

  26. Upper Bound Results: Vertex Cover. The vertex cover problem admits an efficient randomized composable coreset. Theorem: There exists an $O(\log n)$-approximation randomized composable coreset of size $O(n \cdot \log n)$ for the vertex cover problem.

  32. Lower Bound Results: Randomized Coresets. Why coresets of size $\tilde{O}(n)$? $\tilde{O}(n)$ space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer. However, such considerations only imply that the total size of all coresets together needs to be $\Omega(n)$. Can we achieve coresets of size, say, $\Theta(n/k)$? No! Theorem: Any $\alpha$-approximation randomized composable coreset requires $\Omega(n/\alpha^{2})$ space for the matching problem and $\Omega(n/\alpha)$ space for the vertex cover problem. Remark: These bounds are tight for all values of $\alpha$.
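
To see why the theorem rules out coresets of size $\Theta(n/k)$ with a constant approximation ratio, one can substitute $s = O(n/k)$ into the two lower bounds and solve for $\alpha$. This is a short worked consequence of the stated bounds, not an additional result from the talk:

```latex
\[
  s = O\!\left(\frac{n}{k}\right)
  \ \text{and}\
  s = \Omega\!\left(\frac{n}{\alpha^{2}}\right)
  \;\Longrightarrow\;
  \alpha^{2} = \Omega(k)
  \;\Longrightarrow\;
  \alpha = \Omega\!\left(\sqrt{k}\right)
  \quad \text{(matching)},
\]
\[
  s = O\!\left(\frac{n}{k}\right)
  \ \text{and}\
  s = \Omega\!\left(\frac{n}{\alpha}\right)
  \;\Longrightarrow\;
  \alpha = \Omega(k)
  \quad \text{(vertex cover)}.
\]
```

So with coresets of size $\Theta(n/k)$ the approximation ratio must grow with the number of machines $k$.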

  34. Upper Bound Results: Distributed Computing. Our randomized composable coresets immediately imply simultaneous distributed protocols. Theorem: There exist simultaneous protocols with approximation guarantee (1) $O(1)$ for the matching problem and (2) $O(\log n)$ for the vertex cover problem, that require only $\tilde{O}(k \cdot n)$ total communication when the input is randomly partitioned between $k$ machines.

  37. Upper Bound Results: Distributed Computing. Remark: These results also imply MapReduce algorithms for matching and vertex cover with the same approximation guarantee in at most 2 rounds of computation and $O(n\sqrt{n})$ space per machine. Our MapReduce algorithms outperform the previous algorithms for these problems [Lattanzi et al., 2011, Ahn and Guha, 2015] in terms of number of rounds, albeit with a larger approximation guarantee. The number of rounds of a MapReduce algorithm usually determines the dominant cost of the computation.

  41. Lower Bound Results: Distributed Computing. Our lower bound on the size of randomized composable coresets implies that our distributed protocols are optimal among all coreset-based protocols. What about general protocols? Theorem: Any $\alpha$-approximation simultaneous protocol (not necessarily a coreset) requires $\Omega(nk/\alpha^{2})$ communication for the matching problem and $\Omega(nk/\alpha)$ communication for the vertex cover problem, even when the input is randomly partitioned across the $k$ machines. For adversarial partitions, an $\Omega(nk/\alpha^{2})$ lower bound for matching was known previously even for protocols that are allowed multiple rounds of communication [Huang et al., 2015].

  42. A Randomized Composable Coreset for Matching

  47. A Randomized Coreset for Matching. Theorem: Any maximum matching is an $O(1)$-randomized composable coreset of size $n/2$ for the matching problem. Let $M_i$ be the maximum matching computed by machine $i \in [k]$. Consider running the greedy algorithm over the edges in $M_1, \ldots, M_k$, in this order, to obtain a matching $M$. We prove that $|M| = \Omega(opt)$, where $opt$ is the size of a maximum matching in $G$. This implies that there exists an $O(1)$-approximate matching in $M_1 \cup \ldots \cup M_k$.
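
The construction above (each machine sends a maximum matching $M_i$; the analysis scans $M_1, \ldots, M_k$ greedily) can be simulated with a small Python sketch. As a loudly stated simplification, the sketch assumes a bipartite graph so that a short augmenting-path routine (Kuhn's algorithm) suffices; general graphs would need a general maximum matching algorithm instead. The function names are hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (left vertex, right vertex); bipartite input is an assumption


def maximum_matching(edges: List[Edge]) -> List[Edge]:
    """Maximum matching of a bipartite graph via augmenting paths (Kuhn's algorithm)."""
    adj: Dict[int, List[int]] = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    match_right: Dict[int, int] = {}  # right vertex -> left vertex it is matched to

    def try_augment(u: int, seen: Set[int]) -> bool:
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return [(u, v) for v, u in match_right.items()]


def greedy_combine(coresets: List[List[Edge]]) -> List[Edge]:
    """Scan M_1, ..., M_k in order; keep an edge if both endpoints are still free."""
    used_left: Set[int] = set()
    used_right: Set[int] = set()
    matching: List[Edge] = []
    for m_i in coresets:
        for u, v in m_i:
            if u not in used_left and v not in used_right:
                used_left.add(u)
                used_right.add(v)
                matching.append((u, v))
    return matching


def simulate(edges: List[Edge], k: int, seed: int = 0) -> List[Edge]:
    """Randomly partition the edges over k machines and combine their coresets."""
    rng = random.Random(seed)
    parts: List[List[Edge]] = [[] for _ in range(k)]
    for e in edges:
        parts[rng.randrange(k)].append(e)
    return greedy_combine([maximum_matching(p) for p in parts])
```

Note that the greedy scan is the device used in the analysis; a coordinator is free to compute a maximum matching on $M_1 \cup \ldots \cup M_k$ instead, which can only be larger.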

  49. Analysis Sketch: A Key Lemma. Lemma: At any step $i \in [k]$, either the greedy matching is already of size $\Omega(opt)$, or, w.h.p., we can increase the size of the current matching by adding $\Omega(opt/k)$ edges from $M_i$ greedily. This immediately implies that the matching output by the greedy algorithm has size $\Omega(opt)$ w.h.p.
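
The step from the lemma to the final bound can be spelled out as follows. The constants $c, c' > 0$ and the notation $M^{(i)}$ for the greedy matching after step $i$ are introduced here for illustration only; they do not appear on the slides.

```latex
\[
  \text{Lemma (for each } i \in [k]\text{, w.h.p.):}\qquad
  |M^{(i-1)}| \ge c' \cdot opt
  \quad\text{or}\quad
  |M^{(i)}| \ge |M^{(i-1)}| + c \cdot \frac{opt}{k}.
\]
\[
  \text{Since } |M^{(i)}| \text{ never decreases, after all } k \text{ steps:}\qquad
  |M| = |M^{(k)}| \;\ge\; \min\!\left( c' \cdot opt,\; k \cdot c \cdot \frac{opt}{k} \right)
  = \min(c', c) \cdot opt = \Omega(opt).
\]
```

A union bound over the $k$ steps keeps the overall failure probability small, provided each step fails with sufficiently small probability.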

  52. Proof Sketch. Consider the set of $o(opt)$ vertices already matched by the greedy algorithm. Define $E_{old}$ as the set of edges in $G^{(i)}$ incident on these already matched vertices. Define $\mu_{old}$ as the size of a maximum matching in $G^{(i)}$ using only edges in $E_{old}$.

  58. Proof Sketch. Claim: W.h.p. there is a matching of size $\ge \mu_{old} + \Omega(opt/k)$ in $G^{(i)}$. Fix a maximum matching in $E_{old}$ (of size $\mu_{old}$): at most $o(opt)$ vertices that were previously unmatched are in this matching. Hence, $G$ contains a matching of size $\Omega(opt)$ outside the set of vertices matched by it. By random partitioning, w.h.p., $\Omega(opt/k)$ such edges appear in $G^{(i)}$. Together with the fixed matching, they form the desired matching of size $\mu_{old} + \Omega(opt/k)$. Corollary: Any maximum matching of $G^{(i)}$ contains $\Omega(opt/k)$ edges that can be added to the greedy matching.
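
The "by random partitioning, w.h.p." step above can be illustrated with a standard Chernoff bound. This is a simplified sketch under two assumptions that are not stated on the slides: the matching $M^{\star}$ of size $m' = \Omega(opt)$ is treated as fixed independently of which of its edges land in $G^{(i)}$, and $opt/k$ is at least a sufficiently large multiple of $\log n$. The symbols $M^{\star}$, $m'$, and $X$ are introduced here for illustration.

```latex
\[
  X := \bigl|\, M^{\star} \cap E(G^{(i)}) \,\bigr|,
  \qquad
  \mathbb{E}[X] = \frac{m'}{k}
  \quad \text{(each edge lands in } G^{(i)} \text{ independently with probability } 1/k\text{)},
\]
\[
  \Pr\!\left[ X \le \frac{m'}{2k} \right]
  \;\le\; \exp\!\left( -\frac{(1/2)^{2}}{2} \cdot \frac{m'}{k} \right)
  = \exp\!\left( -\frac{m'}{8k} \right).
\]
```

With $m' = \Omega(opt)$ and $opt/k \ge C \log n$ for a large enough constant $C$, the right-hand side is polynomially small in $n$, so $X = \Omega(opt/k)$ with high probability.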

  59. Randomized Composable Coreset for Matching. We showed that: Theorem: Any maximum matching is an $O(1)$-randomized composable coreset of size at most $n/2$ for the matching problem.

  60. A Randomized Composable Coreset for Vertex Cover

  63. A Randomized Coreset for Vertex Cover. Theorem: There exists an $O(\log n)$-approximation randomized composable coreset of size $O(n \cdot \log n)$ for the vertex cover problem. Each machine computes a coreset using the following peeling process: iteratively remove high-degree vertices and their neighboring edges; specify any removed vertex to be added to the final vertex cover.
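
The peeling process described above might look as follows on a single machine. The transcript does not give the degree thresholds or the stopping rule, so the halving schedule starting at `n / 2` and the cutoff near `log2(n)` are assumptions made only to obtain a runnable sketch; the function name `peel` is likewise hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def peel(edges: List[Edge], n: int) -> Tuple[List[Edge], Set[int]]:
    """Return (remaining edges, removed vertices) for one machine's subgraph.

    The removed vertices are the ones specified for the final vertex cover;
    the remaining edges are sent along to guide the coordinator.
    """
    remaining = list(edges)
    fixed: Set[int] = set()
    threshold = n / 2.0                       # assumed starting threshold
    stop_at = max(1.0, math.log2(max(n, 2)))  # assumed stopping point
    while threshold >= stop_at and remaining:
        # Degrees in the current residual graph.
        degree: Dict[int, int] = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        high = {x for x, d in degree.items() if d >= threshold}
        fixed |= high
        # Drop every edge incident on a removed vertex.
        remaining = [(u, v) for u, v in remaining if u not in high and v not in high]
        threshold /= 2.0
    return remaining, fixed
```

By construction, every input edge either survives in the returned edge list or is incident on a returned vertex, which is what lets the coordinator recover feasibility for this machine's edges.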
