Randomized Composable Coreset for Matching and Vertex Cover


SLIDE 1

Randomized Composable Coreset for Matching and Vertex Cover

Sepehr Assadi

University of Pennsylvania

Joint work with Sanjeev Khanna (Penn)

Sepehr Assadi (Penn) SPAA 2017

SLIDE 2

Massive Graphs

Massive graphs abound in a variety of applications: web graph, social networks, biological networks, etc.

SLIDE 3

Massive Graphs

Massive graphs abound in a variety of applications: web graph, social networks, biological networks, etc.

How can we handle computation over such massive graph inputs?

SLIDE 4

Distributed Computing

A common approach: distributed computing.

SLIDE 5

Distributed Computing

A common approach: distributed computing.

1. Distribute the edges of the graph between some machines.

SLIDE 6

Distributed Computing

A common approach: distributed computing.

1. Distribute the edges of the graph between some machines.
2. There is a communication network between the machines.

SLIDE 7

Distributed Computing

A common approach: distributed computing.

1. Distribute the edges of the graph between some machines.
2. There is a communication network between the machines.
3. The machines communicate with each other to compute the answer.

SLIDE 8

Distributed Computing

A common approach: distributed computing.

1. Distribute the edges of the graph between some machines.
2. There is a communication network between the machines.
3. The machines communicate with each other to compute the answer.

Main measures of efficiency: communication cost and round complexity.

SLIDE 9

The Simultaneous Communication Model

We are interested in the simultaneous communication model.

SLIDE 10

The Simultaneous Communication Model

We are interested in the simultaneous communication model.

1. There are k machines plus an additional coordinator.

[Figure: the coordinator and the machines]

SLIDE 11

The Simultaneous Communication Model

We are interested in the simultaneous communication model.

1. There are k machines plus an additional coordinator.
2. The input graph is edge-partitioned between the machines.

[Figure: the coordinator and the machines]

SLIDE 12

The Simultaneous Communication Model

We are interested in the simultaneous communication model.

1. There are k machines plus an additional coordinator.
2. The input graph is edge-partitioned between the machines.
3. Each machine sends a summary of its input to the coordinator.

[Figure: the coordinator and the machines]

SLIDE 13

The Simultaneous Communication Model

We are interested in the simultaneous communication model.

1. There are k machines plus an additional coordinator.
2. The input graph is edge-partitioned between the machines.
3. Each machine sends a summary of its input to the coordinator.
4. The coordinator computes the answer based on the summaries.

[Figure: the coordinator and the machines]
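The four steps of the model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `run_simultaneous_protocol`, the callbacks `summarize` and `solve`, and the random assignment rule are all hypothetical, and summaries are modeled as lists so that their length stands in for communication cost.

```python
import random

def run_simultaneous_protocol(edges, k, summarize, solve, seed=0):
    """One-round protocol: partition edges, summarize locally, solve centrally."""
    rng = random.Random(seed)
    # Step 2: edge-partition the input between the k machines.
    # (The assignment rule is an assumption; the model itself does not fix one.)
    machines = [[] for _ in range(k)]
    for edge in edges:
        machines[rng.randrange(k)].append(edge)
    # Step 3: each machine sends a single summary, with no interaction.
    summaries = [summarize(part) for part in machines]
    # Communication cost is the total size of the summaries.
    cost = sum(len(s) for s in summaries)
    # Step 4: the coordinator sees only the summaries.
    combined = [item for s in summaries for item in s]
    return solve(combined), cost

edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
# Trivial summary (send everything) and trivial problem (count the edges).
answer, cost = run_simultaneous_protocol(edges, k=3, summarize=list, solve=len)
```

With the trivial "send everything" summary the cost equals the input size; the interesting protocols are those whose summaries are much smaller than the input.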

SLIDE 14

Why Simultaneous Model?

1. Simultaneous protocols are inherently round-optimal.

SLIDE 15

Why Simultaneous Model?

1. Simultaneous protocols are inherently round-optimal.
2. Communication cost is simply determined by the size of the summary sent by each machine.

SLIDE 16

Why Simultaneous Model?

1. Simultaneous protocols are inherently round-optimal.
2. Communication cost is simply determined by the size of the summary sent by each machine.
3. Applications to other models of computation:
  • For example, lower bounds in dynamic streams.

SLIDE 17

Simultaneous Protocols

Many general techniques for designing simultaneous protocols, including:

  • Linear sketches
  • Composable coresets
  • Mergeable summaries
  • Sampling
  • . . .

SLIDE 18

Linear Sketches

We treat the input graph as a vector of edge multiplicities.

SLIDE 19

Linear Sketches

We treat the input graph as a vector of edge multiplicities. Summary of each machine is a linear projection of its input subgraph.

SLIDE 20

Linear Sketches

We treat the input graph as a vector of edge multiplicities. Summary of each machine is a linear projection of its input subgraph. Linearity of the sketches allows the coordinator to obtain a sketch of the combined input.

SLIDE 21

Linear Sketches

We treat the input graph as a vector of edge multiplicities. Summary of each machine is a linear projection of its input subgraph. Linearity of the sketches allows the coordinator to obtain a sketch of the combined input. The coordinator runs an arbitrary function on the combined sketch to obtain the final answer.

SLIDE 22

Linear Sketches

We treat the input graph as a vector of edge multiplicities. Summary of each machine is a linear projection of its input subgraph. Linearity of the sketches allows the coordinator to obtain a sketch of the combined input. The coordinator runs an arbitrary function on the combined sketch to obtain the final answer. Introduced for graph problems by Ahn, Guha, and McGregor [Ahn et al., 2012a].
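The linearity property can be checked on a toy example. The sketch below is an assumption-laden illustration, not the construction of [Ahn et al., 2012a]: it uses a random ±1 projection matrix (a hypothetical choice) purely to show that adding the machines' sketches gives the sketch of the combined input. All function names are made up for this example.

```python
import random
from itertools import combinations

def edge_vector(n, edges):
    """Vector of edge multiplicities, indexed by unordered vertex pairs."""
    index = {pair: i for i, pair in enumerate(combinations(range(n), 2))}
    vec = [0] * len(index)
    for u, v in edges:
        vec[index[(min(u, v), max(u, v))]] += 1
    return vec

def make_projection(rows, dim, seed=0):
    """Random +/-1 matrix shared by all machines (illustrative choice only)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(rows)]

def sketch(projection, vec):
    """Linear projection of the input vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in projection]

n = 5
part1 = [(0, 1), (1, 2), (3, 4)]   # edges held by machine 1
part2 = [(0, 2), (2, 3), (0, 1)]   # edges held by machine 2
A = make_projection(rows=4, dim=n * (n - 1) // 2)

s1 = sketch(A, edge_vector(n, part1))
s2 = sketch(A, edge_vector(n, part2))
sum_of_sketches = [a + b for a, b in zip(s1, s2)]
sketch_of_union = sketch(A, edge_vector(n, part1 + part2))
```

Because the projection is linear and all arithmetic is on integers, `sum_of_sketches` and `sketch_of_union` agree exactly.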

SLIDE 23

Composable Coresets

Summary of each machine is a suitably chosen subgraph of its input.

SLIDE 24

Composable Coresets

Summary of each machine is a suitably chosen subgraph of its input. Composability means that the union of the coresets for a collection of graphs yields a coreset for the union of the graphs.

SLIDE 25

Composable Coresets

Summary of each machine is a suitably chosen subgraph of its input. Composability means that the union of the coresets for a collection of graphs yields a coreset for the union of the graphs. The coordinator solves the original problem over the combined coreset to obtain the final answer.

SLIDE 26

Composable Coresets

Summary of each machine is a suitably chosen subgraph of its input. Composability means that the union of the coresets for a collection of graphs yields a coreset for the union of the graphs. The coordinator solves the original problem over the combined coreset to obtain the final answer. Introduced by Indyk, Mahabadi, Mahdian, and Mirrokni [Indyk et al., 2014].

SLIDE 27

Composable Coresets

Summary of each machine is a suitably chosen subgraph of its input. Composability means that the union of the coresets for a collection of graphs yields a coreset for the union of the graphs. The coordinator solves the original problem over the combined coreset to obtain the final answer. Introduced by Indyk, Mahabadi, Mahdian, and Mirrokni [Indyk et al., 2014]. Many graph problems admit natural composable coresets; for instance, connectivity, sparsifiers, and spanners.
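Connectivity gives a small concrete instance: a spanning forest of each machine's subgraph is a subgraph of its input, and the union of the forests has the same connected components as the whole graph. The sketch below illustrates this with a basic union-find; the function names are hypothetical and the partition is fixed by hand.

```python
def spanning_forest(edges):
    """Return a spanning forest of the given edge list (union-find based)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest

def num_components(vertices, edges):
    """Number of connected components, counting isolated vertices."""
    return len(set(vertices)) - len(spanning_forest(edges))

vertices = range(1, 8)
part1 = [(1, 2), (1, 3), (4, 5)]   # edges on machine 1
part2 = [(2, 3), (5, 6), (4, 6)]   # edges on machine 2

# Each machine sends a spanning forest of its own subgraph as the coreset.
coreset_union = spanning_forest(part1) + spanning_forest(part2)
components_from_coresets = num_components(vertices, coreset_union)
components_of_input = num_components(vertices, part1 + part2)
```

Each forest has fewer than n edges, so every summary has size O(n) regardless of how many edges the machine holds.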

SLIDE 28

Previous Work

Successful applications of these two techniques have yielded O(n) size summaries for several graph problems:

SLIDE 29

Previous Work

Successful applications of these two techniques have yielded O(n) size summaries for several graph problems: Connectivity, Minimum Spanning Tree, (Spectral) Sparsifiers, Spanners, Densest Subgraph, Subgraph Counting, . . .

SLIDE 30

Previous Work

Successful applications of these two techniques have yielded O(n) size summaries for several graph problems: Connectivity, Minimum Spanning Tree, (Spectral) Sparsifiers, Spanners, Densest Subgraph, Subgraph Counting, . . . Two prominent problems are missing however:

SLIDE 31

Previous Work

Successful applications of these two techniques have yielded O(n) size summaries for several graph problems: Connectivity, Minimum Spanning Tree, (Spectral) Sparsifiers, Spanners, Densest Subgraph, Subgraph Counting, . . . Two prominent problems are missing however: Maximum Matching and Minimum Vertex Cover

SLIDE 32

Matchings and Vertex Covers

Matching: A collection of vertex-disjoint edges.

SLIDE 33

Matchings and Vertex Covers

Matching: A collection of vertex-disjoint edges. Maximum Matching problem: Find a matching with a largest number of edges.

SLIDE 34

Matchings and Vertex Covers

Vertex Cover: A collection of vertices containing at least one end point of every edge.

SLIDE 35

Matchings and Vertex Covers

Vertex Cover: A collection of vertices containing at least one end point of every edge. Minimum Vertex Cover problem: Find a vertex cover with a smallest number of vertices.
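The two definitions translate directly into feasibility checks. The helper names below are hypothetical; the example graph is a path on four vertices.

```python
def is_matching(edges):
    """True if no two edges share an endpoint."""
    seen = set()
    for u, v in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True

def is_vertex_cover(vertices, edges):
    """True if every edge has at least one endpoint among the vertices."""
    chosen = set(vertices)
    return all(u in chosen or v in chosen for u, v in edges)

path = [(1, 2), (2, 3), (3, 4)]
good_matching = is_matching([(1, 2), (3, 4)])   # vertex-disjoint edges
bad_matching = is_matching([(1, 2), (2, 3)])    # both edges use vertex 2
good_cover = is_vertex_cover({2, 3}, path)      # touches every edge
bad_cover = is_vertex_cover({1, 4}, path)       # misses edge (2, 3)
```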

SLIDE 36

Previous Work: Matching and Vertex Cover

It turned out that matching and vertex cover do not admit efficient summaries!

SLIDE 37

Previous Work: Matching and Vertex Cover

It turned out that matching and vertex cover do not admit efficient summaries!

[Assadi et al., 2016]: Any simultaneous protocol that can compute an n^{o(1)}-approximation for these problems requires summaries of size n^{2−o(1)}.

SLIDE 38

Previous Work: Matching and Vertex Cover

It turned out that matching and vertex cover do not admit efficient summaries!

[Assadi et al., 2016]: Any simultaneous protocol that can compute an n^{o(1)}-approximation for these problems requires summaries of size n^{2−o(1)}.

As is traditional in this setting, this impossibility result is doubly worst case:

SLIDE 39

Previous Work: Matching and Vertex Cover

It turned out that matching and vertex cover do not admit efficient summaries!

[Assadi et al., 2016]: Any simultaneous protocol that can compute an n^{o(1)}-approximation for these problems requires summaries of size n^{2−o(1)}.

As is traditional in this setting, this impossibility result is doubly worst case: Both the underlying graph and the partitioning of the input are chosen adversarially!

SLIDE 40

Previous Work: Matching and Vertex Cover

It turned out that matching and vertex cover do not admit efficient summaries!

[Assadi et al., 2016]: Any simultaneous protocol that can compute an n^{o(1)}-approximation for these problems requires summaries of size n^{2−o(1)}.

As is traditional in this setting, this impossibility result is doubly worst case: Both the underlying graph and the partitioning of the input are chosen adversarially! Can we distribute the original input in a better way?

SLIDE 41

Our Results in a Nutshell

A natural data oblivious partitioning scheme completely alters this landscape.

SLIDE 42

Our Results in a Nutshell

A natural data oblivious partitioning scheme completely alters this landscape. Our work: Both matching and vertex cover admit efficient simultaneous protocols provided that the edges of the graph are partitioned randomly across the machines.

SLIDE 43

Our Results in a Nutshell

A natural data oblivious partitioning scheme completely alters this landscape. Our work: Both matching and vertex cover admit efficient simultaneous protocols provided that the edges of the graph are partitioned randomly across the machines. The idea that random partitioning can help was nicely illustrated by [Mirrokni and Zadimoghaddam, 2015] and [da Ponte Barbosa et al., 2015] on maximizing submodular functions.

SLIDE 44

Our Results in a Nutshell

A natural data oblivious partitioning scheme completely alters this landscape. Our work: Both matching and vertex cover admit efficient simultaneous protocols provided that the edges of the graph are partitioned randomly across the machines. The idea that random partitioning can help was nicely illustrated by [Mirrokni and Zadimoghaddam, 2015] and [da Ponte Barbosa et al., 2015] on maximizing submodular functions. Our work is the first illustration in the domain of graph problems.

SLIDE 45

Randomized Composable Coresets

Define G(1), . . . , G(k) as a random partitioning of a graph G: each edge e ∈ G is sent to one of the graphs uniformly at random.

SLIDE 46

Randomized Composable Coresets

Define G(1), . . . , G(k) as a random partitioning of a graph G: each edge e ∈ G is sent to one of the graphs uniformly at random. Consider an algorithm ALG that given any graph G computes a subgraph ALG(G) ⊆ G with at most s edges.

SLIDE 47

Randomized Composable Coresets

Define G(1), . . . , G(k) as a random partitioning of a graph G: each edge e ∈ G is sent to one of the graphs uniformly at random. Consider an algorithm ALG that given any graph G computes a subgraph ALG(G) ⊆ G with at most s edges. ALG outputs an α-approximation randomized composable coreset of size s for a problem P iff:

SLIDE 48

Randomized Composable Coresets

Define G(1), . . . , G(k) as a random partitioning of a graph G: each edge e ∈ G is sent to one of the graphs uniformly at random. Consider an algorithm ALG that given any graph G computes a subgraph ALG(G) ⊆ G with at most s edges. ALG outputs an α-approximation randomized composable coreset of size s for a problem P iff:

P(ALG(G(1)) ∪ . . . ∪ ALG(G(k))) is an α-approximation for P(G)

with high probability (over the randomness of the partitioning).

SLIDE 49

Randomized Composable Coresets

Define G(1), . . . , G(k) as a random partitioning of a graph G: each edge e ∈ G is sent to one of the graphs uniformly at random. Consider an algorithm ALG that given any graph G computes a subgraph ALG(G) ⊆ G with at most s edges. ALG outputs an α-approximation randomized composable coreset of size s for a problem P iff:

P(ALG(G(1)) ∪ . . . ∪ ALG(G(k))) is an α-approximation for P(G)

with high probability (over the randomness of the partitioning).

Defined originally by [Mirrokni and Zadimoghaddam, 2015] in the context of distributed submodular maximization.
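The random partitioning and the composition step in this definition can be sketched as follows. The names `random_partition` and `compose` are hypothetical, and the stand-in ALG (keep the first two edges) is only a placeholder to show the plumbing; it is not a coreset algorithm for any problem.

```python
import random

def random_partition(edges, k, seed=None):
    """Send each edge to one of k graphs independently and uniformly at random."""
    rng = random.Random(seed)
    parts = [[] for _ in range(k)]
    for edge in edges:
        parts[rng.randrange(k)].append(edge)
    return parts

def compose(parts, alg):
    """Union of the subgraphs ALG(G(1)), ..., ALG(G(k))."""
    union = []
    for part in parts:
        union.extend(alg(part))
    return union

edges = [(i, j) for i in range(6) for j in range(i + 1, 6)]  # complete graph on 6 vertices
parts = random_partition(edges, k=3, seed=1)
# Placeholder ALG with size bound s = 2 (illustration only).
coreset_union = compose(parts, alg=lambda g: g[:2])
```

The guarantee in the definition is then a statement about solving P on `coreset_union`, with the probability taken over the randomness inside `random_partition`.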

SLIDE 50

Upper Bound Results: Maximum Matching

Greedy and local search are typical choices for composable coresets.

SLIDE 51

Upper Bound Results: Maximum Matching

Greedy and local search are typical choices for composable coresets. However, one can show that the greedy algorithm for matching, i.e., picking a maximal matching, performs poorly in general.

SLIDE 52

Upper Bound Results: Maximum Matching

Greedy and local search are typical choices for composable coresets. However, one can show that the greedy algorithm for matching, i.e., picking a maximal matching, performs poorly in general. Our approach: pick a maximum matching!

SLIDE 53

Upper Bound Results: Maximum Matching

Greedy and local search are typical choices for composable coresets. However, one can show that the greedy algorithm for matching, i.e., picking a maximal matching, performs poorly in general. Our approach: pick a maximum matching!

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

SLIDE 54

Upper Bound Results: Vertex Cover

Can a minimum vertex cover also be used as a randomized composable coreset for this problem?

SLIDE 55

Upper Bound Results: Vertex Cover

Can a minimum vertex cover also be used as a randomized composable coreset for this problem? Not really; consider a star with k petals for example.
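One way to read the star example is sketched below, under two loudly stated assumptions that are not in the slides: each machine happens to receive exactly one edge of the star, and each machine breaks the tie between the two endpoints of its edge in the worst possible way (it picks the leaf). Both endpoints are valid minimum vertex covers of a single edge, so this choice is legitimate locally.

```python
k = 8
center = "c"
star = [(center, "leaf%d" % i) for i in range(k)]

# Assumption: machine i holds exactly the i-th edge of the star.
parts = [[edge] for edge in star]

# Assumption: every machine picks the leaf as its local minimum vertex cover.
local_covers = [{part[0][1]} for part in parts]

combined_cover = set().union(*local_covers)
optimal_cover = {center}
covers_all_edges = all(u in combined_cover or v in combined_cover for u, v in star)
```

Here the combined cover is feasible but has k vertices, while the optimum has one, so the approximation ratio grows with k.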

SLIDE 56

Upper Bound Results: Vertex Cover

Can a minimum vertex cover also be used as a randomized composable coreset for this problem? Not really; consider a star with k petals for example. Unlike most problems that admit a composable coreset, the vertex cover problem has a hard to verify feasibility constraint.

SLIDE 57

Upper Bound Results: Vertex Cover

Can a minimum vertex cover also be used as a randomized composable coreset for this problem? Not really; consider a star with k petals for example. Unlike most problems that admit a composable coreset, the vertex cover problem has a hard to verify feasibility constraint. This motivates a slightly more general notion of composable coresets.

SLIDE 58

Composable Coresets for Vertex Cover

A (randomized) composable coreset for the vertex cover problem contains both:

SLIDE 59

Composable Coresets for Vertex Cover

A (randomized) composable coreset for the vertex cover problem contains both:

1. A subset of edges of the input graph to guide the coordinator on the choice of the vertex cover.

SLIDE 60

Composable Coresets for Vertex Cover

A (randomized) composable coreset for the vertex cover problem contains both:

1. A subset of edges of the input graph to guide the coordinator on the choice of the vertex cover.
2. An explicitly specified subset of vertices to be always included in the final vertex cover.

SLIDE 61

Composable Coresets for Vertex Cover

A (randomized) composable coreset for the vertex cover problem contains both:

1. A subset of edges of the input graph to guide the coordinator on the choice of the vertex cover.
2. An explicitly specified subset of vertices to be always included in the final vertex cover.

Size of a coreset: number of edges + number of specified vertices.

SLIDE 62

Upper Bound Results: Vertex Cover

The vertex cover problem admits an efficient randomized composable coreset.

SLIDE 63

Upper Bound Results: Vertex Cover

The vertex cover problem admits an efficient randomized composable coreset.

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem.

SLIDE 64

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

SLIDE 65

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

  • O(n) space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer.

SLIDE 66

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

  • O(n) space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer.

However, such considerations only imply that the total size of all coresets together needs to be Ω(n).

SLIDE 67

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

  • O(n) space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer.

However, such considerations only imply that the total size of all coresets together needs to be Ω(n). Can we achieve coresets of size, say, Θ(n/k)?

SLIDE 68

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

  • O(n) space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer.

However, such considerations only imply that the total size of all coresets together needs to be Ω(n). Can we achieve coresets of size, say, Θ(n/k)? No!

Theorem

Any α-approximation randomized composable coreset requires Ω(n/α^2) space for the matching problem, and Ω(n/α) space for the vertex cover problem.

SLIDE 69

Lower Bound Results: Randomized Coresets

Why coresets of size O(n)?

  • O(n) space is a “sweet spot” for graph streaming algorithms: typically the space needed to even store the answer.

However, such considerations only imply that the total size of all coresets together needs to be Ω(n). Can we achieve coresets of size, say, Θ(n/k)? No!

Theorem

Any α-approximation randomized composable coreset requires Ω(n/α^2) space for the matching problem, and Ω(n/α) space for the vertex cover problem.

  • Remark. These bounds are tight for all values of α.

SLIDE 70

Upper Bound Results: Distributed Computing

Our randomized composable coresets immediately imply simultaneous distributed protocols:

SLIDE 71

Upper Bound Results: Distributed Computing

Our randomized composable coresets immediately imply simultaneous distributed protocols:

Theorem

There exist simultaneous protocols with approximation guarantee

1. O(1) for the matching problem, and
2. O(log n) for the vertex cover problem,

that require only O(k · n) total communication when the input is randomly partitioned between k machines.

SLIDE 72

Upper Bound Results: Distributed Computing

  • Remark. These results also imply MapReduce algorithms for matching and vertex cover with the same approximation guarantee in at most 2 rounds of computation and O(n√n) space per machine.

SLIDE 73

Upper Bound Results: Distributed Computing

  • Remark. These results also imply MapReduce algorithms for matching and vertex cover with the same approximation guarantee in at most 2 rounds of computation and O(n√n) space per machine.

Our MapReduce algorithms outperform the previous algorithms for these problems [Lattanzi et al., 2011, Ahn and Guha, 2015] in terms of the number of rounds, albeit with a larger approximation guarantee.

SLIDE 74

Upper Bound Results: Distributed Computing

  • Remark. These results also imply MapReduce algorithms for matching and vertex cover with the same approximation guarantee in at most 2 rounds of computation and O(n√n) space per machine.

Our MapReduce algorithms outperform the previous algorithms for these problems [Lattanzi et al., 2011, Ahn and Guha, 2015] in terms of the number of rounds, albeit with a larger approximation guarantee.

The number of rounds of a MapReduce algorithm usually determines the dominant cost of the computation.

SLIDE 75

Lower Bound Results: Distributed Computing

Our lower bound on size of randomized composable coresets implies that our distributed protocols are optimal among all coreset-based protocols.

SLIDE 76

Lower Bound Results: Distributed Computing

Our lower bound on size of randomized composable coresets implies that our distributed protocols are optimal among all coreset-based protocols. What about general protocols?

SLIDE 77

Lower Bound Results: Distributed Computing

Our lower bound on size of randomized composable coresets implies that our distributed protocols are optimal among all coreset-based protocols. What about general protocols?

Theorem

Any α-approximation simultaneous protocol (not necessarily a coreset) requires Ω(nk/α^2) communication for the matching problem, and Ω(nk/α) communication for the vertex cover problem, even when the input is randomly partitioned across the k machines.

SLIDE 78

Lower Bound Results: Distributed Computing

Our lower bound on size of randomized composable coresets implies that our distributed protocols are optimal among all coreset-based protocols. What about general protocols?

Theorem

Any α-approximation simultaneous protocol (not necessarily a coreset) requires Ω(nk/α^2) communication for the matching problem, and Ω(nk/α) communication for the vertex cover problem, even when the input is randomly partitioned across the k machines.

For adversarial partitions, an Ω(nk/α^2) lower bound for matching was known previously even for protocols that are allowed multiple rounds of communication [Huang et al., 2015].

SLIDE 79

A Randomized Composable Coreset for Matching

SLIDE 80

A Randomized Coreset for Matching

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

SLIDE 81

A Randomized Coreset for Matching

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

Let M_i be the maximum matching computed by machine i ∈ [k].

SLIDE 82

A Randomized Coreset for Matching

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

Let M_i be the maximum matching computed by machine i ∈ [k]. Consider running the greedy algorithm over the edges in M_1, . . . , M_k in this order to obtain a matching M.

SLIDE 83

A Randomized Coreset for Matching

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

Let M_i be the maximum matching computed by machine i ∈ [k]. Consider running the greedy algorithm over the edges in M_1, . . . , M_k in this order to obtain a matching M. We prove that |M| = Ω(opt), where opt is the size of a maximum matching in G.

SLIDE 84

A Randomized Coreset for Matching

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size n/2 for the matching problem.

Let M_i be the maximum matching computed by machine i ∈ [k]. Consider running the greedy algorithm over the edges in M_1, . . . , M_k in this order to obtain a matching M. We prove that |M| = Ω(opt), where opt is the size of a maximum matching in G. This implies that there exists an O(1)-approximate matching in M_1 ∪ . . . ∪ M_k.
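The process used in the analysis can be sketched end to end. To keep the example short, the code below assumes the input graph is bipartite and computes each machine's maximum matching with a simple augmenting-path routine (Kuhn's algorithm); the theorem itself is not restricted to bipartite graphs, and all function and variable names here are hypothetical.

```python
import random

def maximum_bipartite_matching(edges):
    """Maximum matching via augmenting paths; edges are (left, right) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    match_right = {}  # right vertex -> left vertex it is matched to

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_right or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return [(u, v) for v, u in match_right.items()]

def greedy_combine(matchings):
    """Run greedy over the edges of M_1, ..., M_k in this order."""
    used, combined = set(), []
    for matching in matchings:
        for u, v in matching:
            if u not in used and v not in used:
                combined.append((u, v))
                used.update((u, v))
    return combined

rng = random.Random(0)
n_side, k = 12, 3
left = [("L", i) for i in range(n_side)]
right = [("R", j) for j in range(n_side)]
edges = [(left[i], right[i]) for i in range(n_side)]   # planted perfect matching
edges += [(rng.choice(left), rng.choice(right)) for _ in range(40)]
edges = list(dict.fromkeys(edges))                     # drop duplicate edges

parts = [[] for _ in range(k)]                         # random edge partitioning
for e in edges:
    parts[rng.randrange(k)].append(e)

local = [maximum_bipartite_matching(p) for p in parts]  # coresets M_1, ..., M_k
combined = greedy_combine(local)
```

The greedy pass mirrors the analysis only; a coordinator is free to compute a maximum matching on M_1 ∪ . . . ∪ M_k instead, which can only be larger.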

SLIDE 85

Analysis Sketch: A Key Lemma

Lemma

At any step i ∈ [k], either the greedy matching is already of size Ω(opt), or w.h.p., we can increase the size of the current matching by adding Ω(opt/k) edges from M_i greedily.

SLIDE 86

Analysis Sketch: A Key Lemma

Lemma

At any step i ∈ [k], either the greedy matching is already of size Ω(opt), or w.h.p., we can increase the size of the current matching by adding Ω(opt/k) edges from M_i greedily.

This immediately implies that the matching output by the greedy algorithm has size Ω(opt) w.h.p.
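The counting behind this implication can be written out as follows. The constants c and c' are names introduced here for the unspecified constants hidden in the lemma's Ω(·) terms, and the argument assumes the per-step failure probability is small enough to survive a union bound over the k steps. It also uses the fact that the greedy matching never shrinks.

```latex
\begin{align*}
&\text{Case 1: at some step the greedy matching already has size} \ge c' \cdot \mathrm{opt}
  &&\Longrightarrow\quad |M| \ge c' \cdot \mathrm{opt}. \\
&\text{Case 2: otherwise every step } i \in [k] \text{ adds} \ge c \cdot \tfrac{\mathrm{opt}}{k} \text{ edges}
  &&\Longrightarrow\quad |M| \ge \sum_{i=1}^{k} c \cdot \frac{\mathrm{opt}}{k} = c \cdot \mathrm{opt}. \\
&\text{In either case:}
  &&\phantom{\Longrightarrow}\quad |M| \ge \min(c, c') \cdot \mathrm{opt} = \Omega(\mathrm{opt}).
\end{align*}
```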

SLIDE 87

Proof Sketch

Consider the set of o(opt) vertices already matched by the greedy algorithm.

SLIDE 88

Proof Sketch

Consider the set of o(opt) vertices already matched by the greedy algorithm. Define E_old as the set of edges in G(i) incident on these already matched vertices.

SLIDE 89

Proof Sketch

Consider the set of o(opt) vertices already matched by the greedy algorithm. Define E_old as the set of edges in G(i) incident on these already matched vertices. Define µ_old as the size of a maximum matching in G(i) using only edges in E_old.

SLIDE 90

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

SLIDE 91

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

Fix a maximum matching in E_old: at most o(opt) vertices that were previously unmatched are in the matching.

SLIDE 92

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

Fix a maximum matching in E_old: at most o(opt) vertices that were previously unmatched are in the matching. Hence, G contains a matching of size Ω(opt) outside the set of vertices covered by this fixed matching.

SLIDE 93

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

Fix a maximum matching in E_old: at most o(opt) vertices that were previously unmatched are in the matching. Hence, G contains a matching of size Ω(opt) outside the set of vertices covered by this fixed matching. By random partitioning, w.h.p., Ω(opt/k) such edges appear in G(i).

SLIDE 94

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

Fix a maximum matching in E_old: at most o(opt) vertices that were previously unmatched are in the matching. Hence, G contains a matching of size Ω(opt) outside the set of vertices covered by this fixed matching. By random partitioning, w.h.p., Ω(opt/k) such edges appear in G(i). Together, the µ_old edges of the fixed matching and these Ω(opt/k) edges form the desired matching.

SLIDE 95

Proof Sketch

  • Claim. W.h.p. there is a matching of size ≥ µ_old + Ω(opt/k) in G(i).

Fix a maximum matching in E_old: at most o(opt) vertices that were previously unmatched are in the matching. Hence, G contains a matching of size Ω(opt) outside the set of vertices covered by this fixed matching. By random partitioning, w.h.p., Ω(opt/k) such edges appear in G(i). Together, the µ_old edges of the fixed matching and these Ω(opt/k) edges form the desired matching.

  • Corollary. Any maximum matching of G(i) contains Ω(opt/k) edges that can be added to the greedy matching.

SLIDE 96

Randomized Composable Coreset for Matching

We showed that,

Theorem

Any maximum matching is an O(1)-randomized composable coreset of size at most n/2 for the matching problem.

SLIDE 97

A Randomized Composable Coreset for Vertex Cover

SLIDE 98

A Randomized Coreset for Vertex Cover

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem.

SLIDE 99

A Randomized Coreset for Vertex Cover

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem. Each machine computes a coreset using the following peeling process.

SLIDE 100

A Randomized Coreset for Vertex Cover

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem. Each machine computes a coreset using the following peeling process. Iteratively remove high degree vertices and their neighboring edges; specify any removed vertex to be added to the final vertex cover.

Sepehr Assadi (Penn) SPAA 2017

slide-101
SLIDE 101

A Randomized Coreset for Vertex Cover

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem. Each machine computes a coreset using the following peeling process. Iteratively remove high degree vertices and their neighboring edges; specify any removed vertex to be added to the final vertex cover. When the remaining graph is sufficiently sparse, send it as the coreset.

Sepehr Assadi (Penn) SPAA 2017

slide-102
SLIDE 102

A Randomized Coreset for Vertex Cover

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem.

Each machine computes a coreset using the following peeling process:

  • Iteratively remove high-degree vertices and their incident edges; specify any removed vertex to be added to the final vertex cover.
  • When the remaining graph is sufficiently sparse, send it as the coreset.

This peeling process was originally introduced by [Parnas and Ron, 2007] in the context of sublinear time algorithms.

slide-107
SLIDE 107

A Randomized Coreset for Vertex Cover

The algorithm to compute the coreset on each machine i ∈ [k]:

1. Pick all vertices in G(i) with degree more than n/(2k) and add them to the final vertex cover.
2. Remove these vertices from G(i) together with all their edges.
3. Continue with degree threshold n/(4k) and so on; stop when the degree of each vertex is O(log n).
4. Return all edges in the remaining graph as the coreset.

The size of the coreset is O(n · log n): at most n vertices remain, each of degree O(log n).
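The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `peeling_coreset`, the constant `c`, and the use of `c * log2(n)` as a stand-in for the "O(log n)" stopping degree are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict


def peeling_coreset(edges, n, k, c=1.0):
    """Peel high-degree vertices from one machine's subgraph G(i).

    edges: iterable of (u, v) pairs assigned to this machine.
    n, k:  number of vertices of G and number of machines.
    c:     hypothetical constant; c * log2(n) stands in for "O(log n)".
    Returns (specified, remaining): vertices fixed for the final vertex
    cover, and the leftover sparse edge set that is sent as the coreset.
    """
    remaining = {tuple(sorted(e)) for e in edges}
    specified = set()
    stop = c * math.log2(max(n, 2))
    threshold = n / (2 * k)
    while True:
        # Degrees are recomputed in the graph left after earlier rounds.
        degree = defaultdict(int)
        for u, v in remaining:
            degree[u] += 1
            degree[v] += 1
        high = {v for v, d in degree.items() if d > threshold}
        specified |= high
        remaining = {(u, v) for (u, v) in remaining
                     if u not in high and v not in high}
        if threshold <= stop:
            break  # every remaining degree is now at most c * log2(n)
        threshold /= 2
    return specified, remaining


# Toy input: a star centered at 0 with 40 leaves, plus a short path.
star = [(0, leaf) for leaf in range(1, 41)]
path = [(41, 42), (42, 43)]
specified, coreset = peeling_coreset(star + path, n=64, k=1)
```

In the toy input only the star center exceeds the first threshold, so it is the single specified vertex and the two path edges form the coreset. The loop always runs one round at a threshold no larger than the stopping value, which is what guarantees the sparsity of the returned edge set in this sketch.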

slide-112
SLIDE 112

Analysis Sketch

Define opt as the size of a minimum vertex cover in G.

  • It follows from known results that each coreset specifies only O(opt · log n) vertices to be added to the final vertex cover.
  • Using this directly only implies an approximation ratio of O(k · log n), i.e., a factor k worse than our goal.
  • We show that the set of all specified vertices across all coresets is of size O(opt · log n).
  • This finalizes the proof, as any edge not covered by any of the specified vertices is communicated in some coreset.
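The last point says that the coordinator only has to cover the edges that arrive inside the coresets. The sketch below shows one way this composition step could look. The slides do not say how the coordinator covers the communicated edges; the greedy maximal-matching step (the textbook 2-approximation for vertex cover) is an assumption made here, and `compose_vertex_cover` is a hypothetical name.

```python
def compose_vertex_cover(coresets):
    """Combine the k coresets into one vertex cover.

    coresets: list of (specified, remaining_edges) pairs, one per machine.
    """
    cover = set()
    leftover = set()
    for specified, remaining_edges in coresets:
        cover |= set(specified)
        leftover |= {tuple(sorted(e)) for e in remaining_edges}
    # Assumed final step: greedy maximal matching on still-uncovered edges,
    # taking both endpoints (the textbook 2-approximation).
    for u, v in sorted(leftover):
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Two machines: the first specified vertex 0, the second specified nothing.
coresets = [({0}, [(1, 2)]), (set(), [(2, 3), (0, 5)])]
cover = compose_vertex_cover(coresets)
```

Every edge of G is either incident on a specified vertex (it was removed during peeling on its machine) or present in some coreset, so the returned set covers all of G.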

slide-118
SLIDE 118

Analysis Sketch: A Key Lemma

Lemma

W.h.p. at most O(opt · log n) vertices are specified to be added to the final vertex cover in total.

Intuitively:

1. By random partitioning, the degree of each vertex is almost the same across the coresets.
2. Hence, the same set of vertices should be peeled across the machines in each iteration.

Any problem?

1. The peeling process is quite sensitive to the exact degrees.
2. Slight changes in the degree can move vertices across iterations, potentially leading to a cascading effect.
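A small simulation can make the first intuition concrete. The sketch assumes that each edge is sent to one of the k machines independently and uniformly at random, and it tracks only the edges of a single vertex; `partition_degrees` and its parameters are hypothetical names introduced for the example.

```python
import random


def partition_degrees(degree, k, seed=0):
    """Split the `degree` edges of one vertex across k machines at random.

    Returns the list of per-machine degrees of that vertex.
    """
    rng = random.Random(seed)
    counts = [0] * k
    for _ in range(degree):
        counts[rng.randrange(k)] += 1
    return counts


# A vertex of degree 10000 in G, with k = 10 machines.
counts = partition_degrees(10000, 10)
```

Each entry of `counts` concentrates around degree/k = 1000, but the entries are not identical. A vertex whose expected per-machine degree sits right at a threshold can therefore be peeled on some machines and not on others, which is the sensitivity described in the second list.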

slide-121
SLIDE 121

Analysis Sketch: A Key Lemma

Lemma

W.h.p. at most O(opt · log n) vertices are specified to be added to the final vertex cover in total.

Our approach:

1. Define a hypothetical peeling process that is aware of a minimum vertex cover in G.
2. Prove that this peeling process never picks more than O(opt · log n) vertices.
3. Show that the actual peeling process on each machine “faithfully” mimics this hypothetical process.

slide-127
SLIDE 127

Proof Sketch

Define O as a minimum vertex cover of G. The hypothetical peeling process is as follows:

  • Remove all edges inside O.
  • Remove the high-degree vertices, using degree threshold n/1.5 for vertices in O and n/2.5 for vertices in V \ O.
  • Remove all edges incident on these vertices; continue with degree threshold n/(2·1.5) for O and n/(2·2.5) for V \ O.
  • Repeat the above process until the degree threshold reaches Θ(k log n).

  • Claim. The number of vertices peeled from V \ O in each iteration is at most 2 |O|.

(Figure: the vertex set split into O and V \ O.)
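The hypothetical process can be sketched in the same style as the actual one. This is an illustration only: the process exists purely for the analysis, since it needs to know a minimum vertex cover O. The slides give only the thresholds, so comparing degrees with `>=` is an assumption, as are the name `hypothetical_peeling` and the constant `c` used for the Θ(k log n) stopping point.

```python
import math
from collections import defaultdict


def hypothetical_peeling(edges, opt_cover, n, k, c=1.0):
    """Run the analysis-only peeling process that knows the cover O.

    Returns (rounds, remaining), where rounds[j] is the pair
    (peeled from O, peeled from V \\ O) in iteration j.
    """
    O = set(opt_cover)
    # Removing edges inside O leaves edges with exactly one endpoint in O.
    remaining = {(u, v) for u, v in edges if (u in O) != (v in O)}
    t_in, t_out = n / 1.5, n / 2.5
    stop = c * k * math.log2(max(n, 2))  # assumed Theta(k log n) cutoff
    rounds = []
    while t_in > stop:
        degree = defaultdict(int)
        for u, v in remaining:
            degree[u] += 1
            degree[v] += 1
        peeled_in = {v for v, d in degree.items() if v in O and d >= t_in}
        peeled_out = {v for v, d in degree.items()
                      if v not in O and d >= t_out}
        gone = peeled_in | peeled_out
        remaining = {(u, v) for u, v in remaining
                     if u not in gone and v not in gone}
        rounds.append((peeled_in, peeled_out))
        t_in /= 2
        t_out /= 2
    return rounds, remaining


# Toy graph on 16 vertices with O = {0, 1}.
edges = [(0, j) for j in range(2, 14)] + [(1, 14), (1, 15), (0, 1)]
rounds, remaining = hypothetical_peeling(edges, {0, 1}, n=16, k=1)
```

In the toy graph, vertex 0 has degree 12 once the edge inside O is dropped, which clears the first threshold n/1.5, so it is peeled in the first iteration; vertex 1 never reaches a threshold and its two edges remain.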

slide-132
SLIDE 132

Proof Sketch

Main Claim. For any machine i ∈ [k] and any iteration of the peeling process:

  • For vertices in O: the set of peeled vertices in the hypothetical process is a subset of the vertices peeled in the actual coreset.
  • For vertices in V \ O: the set of peeled vertices in the hypothetical process is a superset of the vertices peeled in the actual coreset.

(Figure: the vertex set split into O and V \ O.)

slide-141
SLIDE 141

Proof Sketch

In the first iteration, by random partitioning:

  • For vertices in O: the degree threshold for peeling vertices in the hypothetical process is larger than in the actual coreset (after scaling by k).
  • For vertices in V \ O: the exact opposite.

In the next iterations:

  • For vertices in O: the degree of the remaining vertices after peeling is smaller in the hypothetical process than in the actual coreset.
  • For vertices in V \ O: the exact opposite.

(Figure: the vertex set split into O and V \ O.)

slide-146
SLIDE 146

Proof Sketch

To wrap up:

1. Across the machines, the set of vertices in V \ O peeled by the coresets is a subset of the vertices peeled by the hypothetical process.
2. The set of vertices in V \ O peeled by the hypothetical process is of size O(opt · log n).
3. The remaining peeled vertices across the machines belong to O, and hence there are O(opt) of them.

Lemma

W.h.p. at most O(opt · log n) vertices are specified to be added to the final vertex cover in total.
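The three wrap-up points combine into a short count. The symbols A (all vertices specified across the k coresets) and H (the vertices of V \ O peeled by the hypothetical process) are introduced here only for illustration. The bound on |H| uses the earlier claim of at most 2|O| peeled vertices of V \ O per iteration, together with the fact that halving the threshold from n/2.5 down to Θ(k log n) takes O(log n) iterations.

```latex
\begin{align*}
|H| &\le \underbrace{O(\log n)}_{\text{iterations}} \cdot\, 2\,|O|
      = O(\mathrm{opt} \cdot \log n), \\
|A| &= |A \cap O| + |A \setminus O|
      \le |O| + |H|
      = \mathrm{opt} + O(\mathrm{opt} \cdot \log n)
      = O(\mathrm{opt} \cdot \log n).
\end{align*}
```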

slide-147
SLIDE 147

Randomized Composable Coreset for Vertex Cover

We showed that,

Theorem

There exists an O(log n)-approximation randomized composable coreset of size O(n · log n) for the vertex cover problem.


slide-153
SLIDE 153

Concluding Remarks

  • We provided efficient simultaneous protocols for matching and vertex cover when the edges of the graph are partitioned randomly across the machines.
  • Our protocols bypass the strong impossibility results known for these problems under adversarially partitioned inputs.

Open problems:

  • Better approximation factors for matching and vertex cover?
  • Any super-linear (in n) lower bound for (1 + ε)-approximation of matching under random partitions?
  • Randomized composable coresets for other problems?
    ◮ In particular, for obtaining a maximal matching?

slide-154
SLIDE 154

Ahn, K. J. and Guha, S. (2015). Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints. In Proceedings of the 27th ACM on Symposium on Parallelism in Algorithms and Architectures, SPAA 2015, Portland, OR, USA, June 13-15, 2015, pages 202–211.

Ahn, K. J., Guha, S., and McGregor, A. (2012a). Analyzing graph structure via linear measurements. In Proceedings of the Twenty-third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’12, pages 459–467. SIAM.

Ahn, K. J., Guha, S., and McGregor, A. (2012b). Graph sketches: sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2012, Scottsdale, AZ, USA, May 20-24, 2012, pages 5–14.

slide-155
SLIDE 155

Assadi, S., Khanna, S., Li, Y., and Yaroslavtsev, G. (2016). Maximum matchings in dynamic graph streams and the simultaneous communication model. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1345–1364.

Badanidiyuru, A., Mirzasoleiman, B., Karbasi, A., and Krause, A. (2014). Streaming submodular maximization: massive data summarization on the fly. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA - August 24 - 27, 2014, pages 671–680.

Balcan, M., Ehrlich, S., and Liang, Y. (2013). Distributed k-means and k-median clustering on general communication topologies.

slide-156
SLIDE 156

In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 1995–2003.

Bateni, M., Bhaskara, A., Lattanzi, S., and Mirrokni, V. S. (2014). Distributed balanced clustering via mapping coresets. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 2591–2599.

Bhattacharya, S., Henzinger, M., Nanongkai, D., and Tsourakakis, C. E. (2015). Space- and time-efficient algorithm for maintaining dense subgraphs on one-pass dynamic streams.

slide-157
SLIDE 157

In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 173–182.

Bulteau, L., Froese, V., Kutzkov, K., and Pagh, R. (2016). Triangle counting in dynamic graph streams. Algorithmica, 76(1):259–278.

Chitnis, R., Cormode, G., Esfandiari, H., Hajiaghayi, M., McGregor, A., Monemizadeh, M., and Vorotnikova, S. (2016). Kernelization via sampling with applications to finding matchings and related problems in dynamic graph streams. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1326–1344.

da Ponte Barbosa, R., Ene, A., Nguyen, H. L., and Ward, J. (2015).

slide-158
SLIDE 158

The power of randomization: Distributed submodular maximization on massive datasets. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pages 1236–1244.

Huang, Z., Radunovic, B., Vojnovic, M., and Zhang, Q. (2015). Communication complexity of approximate matching in distributed graphs. In 32nd International Symposium on Theoretical Aspects of Computer Science, STACS 2015, March 4-7, 2015, Garching, Germany, pages 460–473.

Indyk, P., Mahabadi, S., Mahdian, M., and Mirrokni, V. S. (2014). Composable core-sets for diversity and coverage maximization. In Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS’14, Snowbird, UT, USA, June 22-27, 2014, pages 100–108.

slide-159
SLIDE 159

Kapralov, M., Lee, Y. T., Musco, C., Musco, C., and Sidford, A. (2014). Single pass spectral sparsification in dynamic streams. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 561–570.

Kapralov, M. and Woodruff, D. (2014). Spanners and sparsifiers in dynamic streams. PODC.

Lattanzi, S., Moseley, B., Suri, S., and Vassilvitskii, S. (2011). Filtering: a method for solving graph problems in mapreduce. In SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, San Jose, CA, USA, June 4-6, 2011 (Co-located with FCRC 2011), pages 85–94.

McGregor, A., Tench, D., Vorotnikova, S., and Vu, H. T. (2015). Densest subgraph in dynamic graph streams.

slide-160
SLIDE 160

In Mathematical Foundations of Computer Science 2015 - 40th International Symposium, MFCS 2015, Milan, Italy, August 24-28, 2015, Proceedings, Part II, pages 472–482.

Mirrokni, V. S. and Zadimoghaddam, M. (2015). Randomized composable core-sets for distributed submodular maximization. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 153–162.

Mirzasoleiman, B., Karbasi, A., Sarkar, R., and Krause, A. (2013). Distributed submodular maximization: Identifying representative elements in massive data. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 2049–2057.

slide-161
SLIDE 161

Parnas, M. and Ron, D. (2007). Approximating the minimum vertex cover in sublinear time and a connection to distributed algorithms. Theor. Comput. Sci., 381(1-3):183–196.