
slide-1
SLIDE 1

Graph Sketching, Sampling, Streaming, and Space Efficient Optimization (Part II)

Sudipto Guha and Andrew McGregor

slide-2
SLIDE 2

Space Efficient Optimization for Graphs

Impact of Dimensionality Reduction, Embeddings, Lp → Lq, etc. Thesis: Graph optimization problems are natural next candidates. (Part I): Building blocks: sketching, sampling in graphs. Why? How to use them? How do we think about these problems?

slide-3
SLIDE 3

Space Efficient Optimization for Graphs

Impact of Dimensionality Reduction, Embeddings, Lp → Lq, etc. Thesis: Graph optimization problems are natural next candidates. (Part I): Building blocks: sketching, sampling in graphs. Why? How to use them? How do we think about these problems? Space Efficient Optimization.

◮ Storage grows. Problem sizes grow larger.
◮ Streaming = Organizing accesses in an algorithm.
◮ Sketching = Organizing information.
◮ Partition of input, model, output and algorithm.
◮ Processing Space = Storage Space.

slide-4
SLIDE 4

Optimization?

Many frameworks to choose from. Linear/Convex programming.

1. A lot of general purpose techniques.
2. A rich history in graphs.
3. The connection to streaming is less well studied.

Correlation Clustering and Max Matchings (part I) as examples. Rephrasing papers in SODA 2014, ICML 2015, SPAA 2015.

slide-5
SLIDE 5

Tutorial Plan

slide-6
SLIDE 6

Tutorial Plan

(a) Recap of Multiplicative Weights Method.

slide-7
SLIDE 7

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version

slide-8
SLIDE 8

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm.

slide-9
SLIDE 9

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm. What is the basic idea behind the proof?

slide-10
SLIDE 10

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm. What is the basic idea behind the proof? Where/how do we start from?

slide-11
SLIDE 11

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm. What is the basic idea behind the proof? Where/how do we start from? (b) Application to Min Correlation Clustering.

slide-12
SLIDE 12

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm. What is the basic idea behind the proof? Where/how do we start from? (b) Application to Min Correlation Clustering. How to design an Oracle.

slide-13
SLIDE 13

Tutorial Plan

(a) Recap of Multiplicative Weights Method. Feasibility, LP version Multiple perspectives on the algorithm. What is the basic idea behind the proof? Where/how do we start from? (b) Application to Min Correlation Clustering. How to design an Oracle. “Drag and Drop” sparsification.

slide-14
SLIDE 14

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering.

slide-15
SLIDE 15

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering

slide-16
SLIDE 16

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit?

slide-17
SLIDE 17

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit? How to “round” the fractional solution?

slide-18
SLIDE 18

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit? How to “round” the fractional solution? (d) Multiple Passes I: Max Bipartite Matching

slide-19
SLIDE 19

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit? How to “round” the fractional solution? (d) Multiple Passes I: Max Bipartite Matching Optimization over fixed constraint matrices.

slide-20
SLIDE 20

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit? How to “round” the fractional solution? (d) Multiple Passes I: Max Bipartite Matching Optimization over fixed constraint matrices. Use of Approximation Algorithms for speedup.

slide-21
SLIDE 21

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering What if relaxations do not fit? How to “round” the fractional solution? (d) Multiple Passes I: Max Bipartite Matching Optimization over fixed constraint matrices. Use of Approximation Algorithms for speedup. “Primal-Dual meets Primal-Dual” .

slide-22
SLIDE 22

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching.

slide-23
SLIDE 23

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching.

slide-24
SLIDE 24

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark?

slide-25
SLIDE 25

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark? MWM with Sparsification? Solution of optimization versus the path.

slide-26
SLIDE 26

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark? MWM with Sparsification? Solution of optimization versus the path. (f) Multiple Passes III: Max Non-Bipartite Matching.

slide-27
SLIDE 27

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark? MWM with Sparsification? Solution of optimization versus the path. (f) Multiple Passes III: Max Non-Bipartite Matching. Few passes and a good algorithm. Compute in parallel; use sequentially.

slide-28
SLIDE 28

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark? MWM with Sparsification? Solution of optimization versus the path. (f) Multiple Passes III: Max Non-Bipartite Matching. Few passes and a good algorithm. Compute in parallel; use sequentially. Dual-primal versus primal-dual. New relaxations for matching.

slide-29
SLIDE 29

Tutorial Plan

(a) Recap of Multiplicative Weights Method. (b) Application to Min Correlation Clustering. (c) Semi-Definite Programming (SDPs): Max Correlation Clustering. (d) Multiple Passes I: Max Bipartite Matching. (e) Multiple Passes II: Max Non-Bipartite Matching. Exponentially many constraints. Constraint Sparsification! How to find your way in the dark? MWM with Sparsification? Solution of optimization versus the path. (f) Multiple Passes III: Max Non-Bipartite Matching. Few passes and a good algorithm. Compute in parallel; use sequentially. Dual-primal versus primal-dual. New relaxations for matching. (g) Wrap Up.

slide-30
SLIDE 30

(a) Recap of Multiplicative Weights Method

Basic version. A proof sketch. Alternate views.

slide-31
SLIDE 31

Multiplicative Weights Method: Basic Version

Ay ≤ b, y ≥ 0

slide-32
SLIDE 32

Multiplicative Weights Method: Basic Version

Ay ≤ b, y ≥ 0        Ay ≤ ρb, y ≥ 0

slide-33
SLIDE 33

Multiplicative Weights Method: Basic Version

Initially u = 1. Assume A, b ≥ 0.

Ay ≤ b, y ≥ 0        Ay ≤ ρb, y ≥ 0        uᵀAy ≤ (1 + ǫ) uᵀb

slide-34
SLIDE 34

Multiplicative Weights Method: Basic Version

Initially u = 1. Assume A, b ≥ 0. If Aiy < bi: lower ui, i.e., ui ← ui(1 − ǫ)^((bi − Aiy)/(bi ρ)). If Aiy > bi: raise ui, i.e., ui ← ui(1 + ǫ)^((Aiy − bi)/(bi ρ)). (≈ ui ← ui e^(ǫ(Aiy − bi)/ρ).)

Ay ≤ b, y ≥ 0        Ay ≤ ρb, y ≥ 0        uᵀAy ≤ (1 + ǫ) uᵀb
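To make the update rule concrete, here is a toy sketch (mine, not the tutorial's code) of the basic loop on the optimization form that appears later in the deck (Ay ≤ b together with a target cᵀy ≥ β). The oracle only has to handle the single aggregated constraint uᵀAy ≤ uᵀb, which over the box [0,1]^d is a fractional knapsack. numpy, the toy instance, and the box constraint are my illustration choices.

```python
# Toy sketch of the basic MW loop (not the tutorial's code):
# find y in [0,1]^d with Ay <= b and c^T y >= beta, given an oracle that only
# has to satisfy the single aggregated constraint  u^T A y <= u^T b.
import numpy as np

def oracle(c, w, budget):
    """Maximize c^T y over 0 <= y <= 1 subject to w^T y <= budget (greedy fractional knapsack)."""
    y = np.zeros_like(c)
    for j in np.argsort(-c / np.maximum(w, 1e-12)):
        take = min(1.0, budget / max(w[j], 1e-12))
        y[j] = take
        budget -= take * w[j]
        if budget <= 0:
            break
    return y

def mw_packing(A, b, c, beta, eps=0.1, T=4000):
    m, d = A.shape
    rho = (A.sum(axis=1) / b).max()          # width: A_i y <= rho * b_i for any y in [0,1]^d
    u = np.ones(m)
    total = np.zeros(d)
    for _ in range(T):
        y = oracle(c, u @ A, u @ b)          # satisfies u^T A y <= u^T b
        if c @ y < beta:
            return None                      # even the aggregated system cannot reach beta
        total += y
        u *= np.exp(eps * (A @ y - b) / (b * rho))   # multiplicative update on each constraint
    return total / T                         # average iterate: c^T y >= beta, Ay <~ (1+O(eps)) b

A = np.array([[1.0, 2.0, 1.0], [2.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])
c = np.array([1.0, 1.0, 1.0])
y = mw_packing(A, b, c, beta=1.5)
print(y, A @ y, c @ y)
```

For large enough T the averaged iterate approximately satisfies every row of Ay ≤ b, exactly in the spirit of the running-average view on the next slides.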

slide-43
SLIDE 43

Multiplicative Weights Method: Basic Version

Ay ≤ b, y ≥ 0        Ay ≤ ρb, y ≥ 0

⇒  Ay ≤ (1 + 3ǫ)b, y ≥ 0

slide-44
SLIDE 44

Multiplicative Weights Method: Basic Version

Number of rounds depends on ρ, ǫ and other specifics of updating u. ρ = width.

Ay ≤ b, y ≥ 0        Ay ≤ ρb, y ≥ 0

⇒  Ay ≤ (1 + 3ǫ)b, y ≥ 0

slide-45
SLIDE 45

How does the proof work?

Scale the RHS to get Ay ≤ 1. Let the solution at iteration t be y(t), and assume −ρ ≤ −ℓ ≤ Aiy(t) ≤ ρ. Define the "violation" of constraint i as Vi(y(t)) = Aiy(t) − 1; recall ui(t + 1) ≈ ui(t) e^(ǫ Vi(y(t))/ρ). Define the "average violation" as av(t) = Σ_i (ui / Σ_j uj) Vi(y(t)).

On the same side: av(t) ≤ 0 (easier case). For an approximation: av(t) ≤ δ. Define the "potential" at iteration t as Σ_i ui(t).

Now Σ_i ui(t + 1) ≤ (Σ_i ui(t)) e^(ǫ av(t)/ρ). Telescopes. For every constraint i,

ǫ Σ_t Vi(y(t))/ρ − 2ǫ²ℓT/ρ ≤ ln ui(T) ≤ ln(final potential) ≤ ln(initial potential) + (ǫ/ρ) Σ_t av(t),

and hence (1/T) Σ_t Vi(y(t)) ≤ · · · ≤ δ.
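One way to read an iteration bound out of the chain above (my own arithmetic using the slide's inequalities, with u(0) = 1 so the initial potential is the number of constraints, call it N):

```latex
\frac{\epsilon}{\rho}\sum_{t\le T} V_i(y(t)) - \frac{2\epsilon^2 \ell T}{\rho}
  \;\le\; \ln\Big(\sum_i u_i(T)\Big)
  \;\le\; \ln N + \frac{\epsilon}{\rho}\sum_{t\le T} \mathrm{av}(t).
```

If every oracle answer has av(t) ≤ 0, rearranging gives (1/T) Σ_t Vi(y(t)) ≤ 2ǫℓ + ρ ln N/(ǫT), which drops below δ once ǫ ≤ δ/(4ℓ) and T ≥ 2ρ ln N/(ǫδ).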
slide-46
SLIDE 46

Dantzig Decompositions

A (weighted) running average view (primal space).

Hard decision problem Easy decision problem

slide-60
SLIDE 60

Dantzig Decompositions

A (weighted) running average view (primal space).

Hard decision problem Easy decision problem

  • Apx

decision problem

slide-63
SLIDE 63

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-64
SLIDE 64

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Dual of a point?

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-65
SLIDE 65

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Point in dual space. Dual of a point?

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-66
SLIDE 66

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Point in dual space. Dual of a point? Hyperplane/constraint in dual space.

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-70
SLIDE 70

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Point in dual space. Dual of a point? Hyperplane/constraint in dual space. Suppose we prove [*]: ∃u s.t. ATu ≥ c and ρbTu < β.

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-71
SLIDE 71

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Point in dual space. Dual of a point? Hyperplane/constraint in dual space. Suppose we prove [*]: ∃u s.t. ATu ≥ c and ρbTu < β. Providing a y corresponds to: we have not yet proved [*].

Ay ≤ b, cᵀy ≥ β, y ≥ 0        Ay ≤ ρb, cᵀy ≥ β

slide-72
SLIDE 72

Multiplicative Weights: Optimization and Duals

Instead of tracking violations and averaging solutions at the end, Consider the process from the perspective of u Dual of a hyperplane/constraint? Point in dual space. Dual of a point? Hyperplane/constraint in dual space. Suppose we prove [*]: ∃u s.t. ATu ≥ c and ρbTu < β. Providing a y corresponds to: we have not yet proved [*]. Think trajectories. Decompositions on dual. What does y mean then?

Aᵀu ≥ c, ρbᵀu < β, u ≥ 0

slide-73
SLIDE 73

So the Dual or the Primal?

How do we choose which to start from?

slide-74
SLIDE 74

Which set of constraints would you rather solve?

The one with more variables! Lot more degrees of freedom. Easier to approximate. Maybe sparse solutions exist.

slide-75
SLIDE 75

Which set of constraints would you rather solve?

The one with more variables! Lot more degrees of freedom. Easier to approximate. Maybe sparse solutions exist. Rewrite relaxations to introduce freedom!

slide-76
SLIDE 76

(b) Application to Min. Correlation Clustering

Exponentially many constraints. How to design an Oracle. Drag and Drop application of Graph Sparsification/Sketching!

slide-77
SLIDE 77

Correlation Clustering: Motivation

Tutorial in KDD 2014. Bonchi, Garcia-Soriano, Liberty. Clustering of objects known only through relationships. (Can have wide ranges of edge weights, +ve/-ve.)

slide-78
SLIDE 78

Correlation Clustering: Motivation

Tutorial in KDD 2014. Bonchi, Garcia-Soriano, Liberty. Clustering of objects known only through relationships. (Can have wide ranges of edge weights, +ve/-ve.) Consider an Entity Resolution example. News article 1: Mr Smith is devoted to mountain climbing. . . . Mrs Smith is a diver and said that she finds diving to be a sublime

  • experience. . . . The goal is to reach new heights, said Smith.

Now consider a stream of such articles, with new as well as old entities.

slide-79
SLIDE 79

Correlation Clustering: Motivation

Tutorial in KDD 2014. Bonchi, Garcia-Soriano, Liberty. Clustering of objects known only through relationships. (Can have wide ranges of edge weights, +ve/-ve.) Consider an Entity Resolution example. News article 1: Mr Smith is devoted to mountain climbing. . . . Mrs Smith is a diver and said that she finds diving to be a sublime

  • experience. . . . The goal is to reach new heights, said Smith.

Now consider a stream of such articles, with new as well as old entities. Likely Mr Smith ≠ Mrs Smith. Large -ve weight. The other references can be either. Small weights depending on context. Weights are not a metric. Have a large range.

slide-80
SLIDE 80

Correlation Clustering: A Formulation

(Figure: an example graph with +ve and −ve edge weights.)

Find a grouping that disagrees least with the graph.

◮ Count +ve edges out of clusters. Count -ve edges in clusters. ◮ Use as many clusters as you like.

Alternatively we can find a grouping that agrees most. NP-Hard. Bansal, Blum, Chawla '04. Many approximation algorithms are known, for many variants. Approximation factors were known before; we will not focus on the factor.
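For concreteness, a tiny sketch (mine, not the tutorial's code) of the disagreement objective just described: the weight of +ve edges cut between clusters plus the weight of −ve edges kept inside a cluster.

```python
# Minimal illustration of the Min-Disagreement objective.
def disagreement(edges, cluster_of):
    """edges: iterable of (i, j, w) with w > 0 for '+' edges and w < 0 for '-' edges.
    cluster_of: dict mapping each node to a cluster id."""
    cost = 0.0
    for i, j, w in edges:
        same = cluster_of[i] == cluster_of[j]
        if w > 0 and not same:      # positive edge cut between clusters
            cost += w
        elif w < 0 and same:        # negative edge kept inside a cluster
            cost += abs(w)
    return cost

# e.g. disagreement([(1, 2, 10), (2, 3, -3), (1, 3, 0.5)], {1: "A", 2: "A", 3: "B"})
# pays 0.5 for the cut +ve edge (1, 3) and nothing else.
```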

slide-81
SLIDE 81

Correlation Clustering: A Formulation

(Figure: the example graph grouped into clusters C1 and C2.)

Find a grouping that disagrees least with the graph.

◮ Count +ve edges out of clusters. Count -ve edges in clusters. ◮ Use as many clusters as you like.

Alternatively we can find a grouping that agrees most. NP-Hard. Bansal, Blum, Chawla '04. Many approximation algorithms are known, for many variants. Approximation factors were known before; we will not focus on the factor.

slide-82
SLIDE 82

Global Sparsification: There and back again

Think of a problem on graph cuts.


slide-83
SLIDE 83

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Sparsification preserves all cuts within (1 ± ǫ).

slide-84
SLIDE 84

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? Sparsification preserves all cuts within (1 ± ǫ).

slide-85
SLIDE 85

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? NP Hard. ≥ 0.5 apx uses SDPs. Sparsification preserves all cuts within (1 ± ǫ).

slide-86
SLIDE 86

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? NP Hard. ≥ 0.5 apx uses SDPs. Sparsification preserves all cuts within (1 ± ǫ). (a) Does not imply anything about finding specific cuts.

slide-87
SLIDE 87

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? NP Hard. ≥ 0.5 apx uses SDPs. Sparsification preserves all cuts within (1 ± ǫ). (a) Does not imply anything about finding specific cuts. Yet.

slide-88
SLIDE 88

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? NP Hard. ≥ 0.5 apx uses SDPs. Sparsification preserves all cuts within (1 ± ǫ). (a) Does not imply anything about finding specific cuts. Yet. (b) Does not obviously save space either!

slide-89
SLIDE 89

Global Sparsification: There and back again

Think of a problem on graph cuts.


s t Min s-t Cut? Max s-t Cut? Max Cut? NP Hard. ≥ 0.5 apx uses SDPs. Sparsification preserves all cuts within (1 ± ǫ). (a) Does not imply anything about finding specific cuts. Yet. (b) Does not obviously save space either! We will see examples of both (a)–(b) and how to overcome them. Let's return to correlation clustering.

slide-90
SLIDE 90

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets.

min Σ_{(i,j)∈E(+)} wij (1 − xij) + Σ_{(i,j)∈E(−)} |wij| xij
s.t.  xij ≤ 1 ∀i, j;   xij ≥ 0 ∀i, j;   (1 − xij) + (1 − xjk) ≥ (1 − xik) ∀i, j, k

A linear program.
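A minimal sketch (mine, not the tutorial's code) of this LP on a toy instance, using scipy.optimize.linprog (assumed available); the triangle constraint is rewritten as xij + xjk − xik ≤ 1.

```python
# Toy instance of the Min-Disagreement LP, solved with scipy (assumed installed).
import itertools
import numpy as np
from scipy.optimize import linprog

n = 4
w = {(0, 1): 5.0, (1, 2): 4.0, (0, 2): -3.0, (2, 3): -2.0}   # +ve / -ve edges
pairs = list(itertools.combinations(range(n), 2))
idx = {p: k for k, p in enumerate(pairs)}

# objective: sum_{+} w(1 - x) + sum_{-} |w| x  ==  const + sum_{+} (-w) x + sum_{-} |w| x
c = np.zeros(len(pairs))
const = 0.0
for p, wt in w.items():
    if wt > 0:
        c[idx[p]] -= wt
        const += wt
    else:
        c[idx[p]] += abs(wt)

# triangle constraints: for each triple, bound each pair (i, k) via the middle node j
A_ub, b_ub = [], []
for i, j, k in itertools.permutations(range(n), 3):
    if i < k:
        row = np.zeros(len(pairs))
        row[idx[tuple(sorted((i, j)))]] += 1
        row[idx[tuple(sorted((j, k)))]] += 1
        row[idx[(i, k)]] -= 1
        A_ub.append(row)
        b_ub.append(1.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, 1), method="highs")
print("LP disagreement value:", const + res.fun)
```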

slide-91
SLIDE 91

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets.

  • Triangle constraints

min

  • (i,j)∈E(+)

wij(1 − xij) +

  • (i,j)∈E(−)

|wij|xij xij ≤ 1 ∀i, j xij ≥ 0 ∀i, j (1 − xij) + (1 − xjk) ≥ (1 − xik) ∀i, j, k

A linear program. Θ(n3) Constraints, Θ(n2) variables.

slide-92
SLIDE 92

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets.

  • Triangle constraints

min

  • (i,j)∈E(+)

wij(1 − xij) +

  • (i,j)∈E(−)

|wij|xij xij ≤ 1 ∀i, j xij ≥ 0 ∀i, j (1 − xij) + (1 − xjk) ≥ (1 − xik) ∀i, j, k

A linear program. Θ(n3) Constraints, Θ(n2) variables. 1 pass lower bound of |E(−)| for any apx via Communication Complexity.

slide-93
SLIDE 93

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets.

  • Triangle constraints

min

  • (i,j)∈E(+)

wij(1 − xij) +

  • (i,j)∈E(−)

|wij|xij xij ≤ 1 ∀i, j xij ≥ 0 ∀i, j (1 − xij) + (1 − xjk) ≥ (1 − xik) ∀i, j, k

A linear program. Θ(n3) Constraints, Θ(n2) variables. 1 pass lower bound of |E(−)| for any apx via Communication Complexity. Sparsify E(+), store E(−)? Will have ˜ O(n) + |E(−)| variables.

slide-94
SLIDE 94

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets.

  • Triangle constraints

min

  • (i,j)∈E(+)

wij(1 − xij) +

  • (i,j)∈E(−)

|wij|xij xij ≤ 1 ∀i, j xij ≥ 0 ∀i, j (1 − xij) + (1 − xjk) ≥ (1 − xik) ∀i, j, k

A linear program. Θ(n³) constraints, Θ(n²) variables. 1-pass lower bound of |E(−)| for any apx via Communication Complexity. Sparsify E(+), store E(−)? Will have Õ(n) + |E(−)| variables. Does not work: the triangle constraints need all (n choose 2) variables.
slide-95
SLIDE 95

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Set yij = 1 − xij for +ve edges. zij = xij for -ve edges.

min Σ_{(i,j)∈E(+)} wij yij + Σ_{(i,j)∈E(−)} |wij| zij
s.t.  yij, zij ≥ 0 ∀(i, j) ∈ E;   and what constraints on yij, zij?

Sparsify E(+). Store E(−). Θ(n²) → Õ(n) + |E(−)| variables? Θ(n³) constraints.

slide-96
SLIDE 96

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Set yij = 1 − xij for +ve edges. zij = xij for -ve edges.

min Σ_{(i,j)∈E(+)} wij yij + Σ_{(i,j)∈E(−)} |wij| zij
s.t.  yij, zij ≥ 0 ∀(i, j) ∈ E;   Σ_{(u,v)∈P(ij)} yuv + zij ≥ 1 ∀i, j and every i–j path P(ij)

Sparsify E(+). Store E(−). Θ(n²) → Õ(n) + |E(−)| variables. Θ(n³) constraints → exponentially many constraints!

(Figure: a −ve edge (i, j) of weight |wij| and an i–j path of +ve edges.)

slide-97
SLIDE 97

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Set yij = 1 − xij for +ve edges. zij = xij for -ve edges.

min

  • (i,j)∈E(+)

wijyij +

  • (i,j)∈E(−)

|wij|zij yij, zij ≥ 0 ∀(i, j) ∈ E

  • (u,v)∈P(ij)

yuv + zij ≥ 1 ∀i, j, and i-j path P(ij)

Sparsify E(+). Store E(−). Θ(n2) → ˜ O(n) + |E(−)| variables. Θ(n3) Constraints → Exponentially many constraints! Solve LP (ellipsoid) & Ball Growing: Garg, Vazirani, Yannakakis 93.

i j |wij|

slide-98
SLIDE 98

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Set yij = 1 − xij for +ve edges. zij = xij for -ve edges.

min

  • (i,j)∈E(+)

wijyij +

  • (i,j)∈E(−)

|wij|zij yij, zij ≥ 0 ∀(i, j) ∈ E

  • (u,v)∈P(ij)

yuv + zij ≥ 1 ∀i, j, and i-j path P(ij)

Sparsify E(+). Store E(−). Θ(n2) → ˜ O(n) + |E(−)| variables. Θ(n3) Constraints → Exponentially many constraints! Solve LP (ellipsoid) & Ball Growing: Garg, Vazirani, Yannakakis 93.

i j |wij|

MWM on the dual. ˜ O(n + |E(−)|) space and ˜ O(n2) time.

slide-99
SLIDE 99

Min Correlation Clustering

Equivalent to Max-Agreement at optimality. Not in approximation. xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Set yij = 1 − xij for +ve edges. zij = xij for -ve edges.

min

  • (i,j)∈E(+)

wijyij +

  • (i,j)∈E(−)

|wij|zij yij, zij ≥ 0 ∀(i, j) ∈ E

  • (u,v)∈P(ij)

yuv + zij ≥ 1 ∀i, j, and i-j path P(ij)

Sparsify E(+). Store E(−). Θ(n2) → ˜ O(n) + |E(−)| variables. Θ(n3) Constraints → Exponentially many constraints! Solve LP (ellipsoid) & Ball Growing: Garg, Vazirani, Yannakakis 93.

i j |wij|

MWM on the dual. ˜ O(n + |E(−)|) space and ˜ O(n2) time. Round infeasible primal (the running average). Success → done. Failure → violated constraint(s) → point needed for MWM on Dual.

slide-100
SLIDE 100

Algorithm in a Picture?

Reformulation Duality Graph Sparsification Duality

slide-101
SLIDE 101

(c) SDPs and Max Correlation Clustering

Much more powerful than linear relaxations. Recurring theme: Known relaxations will not fit. New problem: What do we do to round?

slide-102
SLIDE 102

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

max Σ_{(i,j)∈E(+)} wij xij + Σ_{(i,j)∈E(−)} |wij| (1 − xij)
s.t.  xii = 1 ∀i;   xij ≥ 0 ∀i, j;   x ⪰ 0
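A minimal sketch (mine, not the tutorial's code) of this vector-programming/SDP relaxation on a toy instance, assuming cvxpy with an SDP-capable solver such as SCS is installed.

```python
# Toy Max-Agreement SDP relaxation via cvxpy (assumed installed).
import numpy as np
import cvxpy as cp

n = 4
# signed weights: w[i][j] > 0 is a '+' edge, w[i][j] < 0 is a '-' edge
w = np.array([[0, 2, -1, 0],
              [2, 0, 1, -3],
              [-1, 1, 0, 1],
              [0, -3, 1, 0]], dtype=float)

X = cp.Variable((n, n), PSD=True)        # Gram matrix, x_ij = v_i . v_j
objective = 0
for i in range(n):
    for j in range(i + 1, n):
        if w[i, j] > 0:                  # +ve edge: reward x_ij
            objective += w[i, j] * X[i, j]
        elif w[i, j] < 0:                # -ve edge: reward 1 - x_ij
            objective += abs(w[i, j]) * (1 - X[i, j])

constraints = [cp.diag(X) == 1, X >= 0]  # unit vectors, nonnegative inner products
prob = cp.Problem(cp.Maximize(objective), constraints)
prob.solve()
print("SDP value:", prob.value)
```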

slide-103
SLIDE 103

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

max

  • (i,j)∈E(+)

wijxij +

  • (i,j)∈E(−)

|wij|(1 − xij) xii = 1 ∀i xij ≥ 0 ∀i, j x 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x provide a real symmetric A (satisfying some width bounds) (a) A ◦ x ≤ b − ǫ, note A ◦ x =

i,j Aijxij.

(b) A ◦ x′ ≥ b for all feasible x′ ∈ X.

slide-104
SLIDE 104

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

max

  • (i,j)∈E(+)

wijxij +

  • (i,j)∈E(−)

|wij|(1 − xij) xii = 1 ∀i xij ≥ 0 ∀i, j x 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x provide a real symmetric A (satisfying some width bounds) (a) A ◦ x ≤ b − ǫ, note A ◦ x =

i,j Aijxij.

(b) A ◦ x′ ≥ b for all feasible x′ ∈ X. Why??

slide-105
SLIDE 105

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

max

  • (i,j)∈E(+)

wijxij +

  • (i,j)∈E(−)

|wij|(1 − xij) xii = 1 ∀i xij ≥ 0 ∀i, j x 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x provide a real symmetric A (satisfying some width bounds) (a) A ◦ x ≤ b − ǫ, note A ◦ x =

i,j Aijxij.

(b) A ◦ x′ ≥ b for all feasible x′ ∈ X. Why.

slide-106
SLIDE 106

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

β ≤

  • (i,j)∈E(+)

wijxij +

  • (i,j)∈E(−)

|wij|(1 − xij) xii = 1 ∀i xij ≥ 0 ∀i, j x 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x provide a real symmetric A (satisfying some width bounds) (a) A ◦ x ≤ b − ǫ, note A ◦ x =

i,j Aijxij.

(b) A ◦ x′ ≥ b for all feasible x′ ∈ X.

  • Why. Does not work (width is high).
slide-107
SLIDE 107

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

β ≤

  • (i,j)∈E(+)

wijxij +

  • (i,j)∈E(−)

|wij| (1 − xij)xii + xjj − 2xij 2 xii = 1 ∀i xij ≥ 0 ∀i, j x 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x provide a real symmetric A (satisfying some width bounds) (a) A ◦ x ≤ b − ǫ, note A ◦ x =

i,j Aijxij.

(b) A ◦ x′ ≥ b for all feasible x′ ∈ X.

  • Why. Does not work (width is high). Linear Space. Linear time. 0.76-apx
slide-108
SLIDE 108

Max-Agreement and SDPs

xij = 1 if in same group, and 0 otherwise. E(+/−) = +/−ve edge sets. Think of vector programming over unit length vectors. xij = vi · vj ≤ 1.

β ≤ Σ_{(i,j)∈E(+)} wij xij + Σ_{(i,j)∈E(−)} |wij| (xii + xjj − 2xij)/2   [replacing (1 − xij)]
s.t.  xii = 1 ∀i;   xij ≥ 0 ∀i, j;   x ⪰ 0

MWM (in this context): Collection of constraints. Feasible set: X. Given x, provide a real symmetric A (satisfying some width bounds) with
(a) A ◦ x ≤ b − ǫ, where A ◦ x = Σ_{i,j} Aij xij;
(b) A ◦ x′ ≥ b for all feasible x′ ∈ X.

Why? Does not work (width is high). Linear space. Linear time. 0.76-apx.

Relaxation needs to be compatible with the trajectory. Single pass. Sparsify E(+) and E(−) separately.

slide-109
SLIDE 109

(d) Multiple Passes I: Max Bipartite Matching

Optimization over fixed constraint matrices. Columns revealed one at a time. Use of Approximation Algorithms for speedup of convergence. “Primal-Dual meets Primal-Dual”.

slide-110
SLIDE 110

MWM on Streams: Bipartite Matching

Integer and fractional optimums coincide. (yij = yji; writing (i, j) implicitly means (i, j) ∈ E.)

max Σ_{(i,j)} wij yij
s.t.  Σ_j yij ≤ 1 ∀i;   yij ≥ 0 ∀(i, j)

Streams: an arbitrary list of m edges, . . . , (i, j, wij), . . . for an n node graph. Different from online learning. The input itself is in small pieces.

slide-119
SLIDE 119

MWM on Streams: Bipartite Matching

Integer and fractional optimums coincide. (yij = yji, (i, j) implies ∈ E.)

  • (i,j)

yijwij ≥ β

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Streams: arbitrary list of m edges, . . . , i, j, wij, . . . for an n node graph. Applying MWM: Point = candidate set of edges, in m-dim space. Hyperplanes?

slide-120
SLIDE 120

MWM on Streams: Bipartite Matching

Integer and fractional optimums coincide. (yij = yji, (i, j) implies ∈ E.)

  • (i,j)

yijwij ≥ β ui →

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Streams: arbitrary list of m edges, . . . , i, j, wij, . . . for an n node graph. Applying MWM: Point = candidate set of edges, in m-dim space. Hyperplanes?

slide-121
SLIDE 121

MWM on Streams: Bipartite Matching

Integer and fractional optimums coincide. (yij = yji; writing (i, j) implicitly means (i, j) ∈ E.)

Σ_{(i,j)} wij yij ≥ β;   ui → Σ_j yij ≤ 1 ∀i;   yij ≥ 0 ∀(i, j)

Streams: an arbitrary list of m edges, . . . , (i, j, wij), . . . for an n node graph. Applying MWM: Point = candidate set of edges, in m-dim space. Hyperplanes?

Σ_i ui (Σ_j yij) ≤ Σ_i ui,  i.e.,  Σ_{(i,j)} yij (ui + uj) ≤ Σ_i ui.

Store & update u. O(n) storage.

slide-122
SLIDE 122

MWM on Streams: Bipartite Matching

Integer and fractional optimums coincide. (yij = yji, (i, j) implies ∈ E.)

  • (i,j)

yijwij ≥ β ui →

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Streams: arbitrary list of m edges, . . . , i, j, wij, . . . for an n node graph. Applying MWM: Point = candidate set of edges, in m-dim space. Hyperplanes?

  • i ui
  • j yij ≤

i ui

  • (i,j) yij(ui + uj) ≤

i ui.

Want:                 

  • (i,j)

yij(ui + uj)

  • i

ui ≤

  • i

ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

.

slide-123
SLIDE 123

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

,

slide-124
SLIDE 124

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Now ∃y, ∀λ ≥ 0

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

slide-125
SLIDE 125

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Now ∃y, ∀λ ≥ 0

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

slide-126
SLIDE 126

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Now ∃y, ∀λ ≥ 0

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard.

slide-127
SLIDE 127

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Have y, ∀λ ≥ 0

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)/c

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard. ◮ Find a streaming O(n) space c approximation on this filtered set.
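As a rough illustration (mine, not the authors' code) of Oracle(λ): filter the stream by reduced weight and hand the survivors to any O(n)-space c-approximate streaming matching routine. Below, a naive replace-if-better greedy stands in for that routine and makes no approximation claim of its own.

```python
# Sketch of Oracle(lambda): filter by reduced weight, then a one-pass greedy matching.
def oracle(stream, u, lam):
    partner, val = {}, {}                     # current matching: vertex -> partner / reduced weight
    for (i, j, w) in stream:                  # stream: iterable of (i, j, w_ij) triples
        rw = w - lam * (u[i] + u[j])
        if rw <= 0:
            continue                          # filtered out by the reduced weight
        if rw > val.get(i, 0.0) and rw > val.get(j, 0.0):
            for v in (i, j):                  # evict whatever currently covers i and j
                p = partner.pop(v, None)
                val.pop(v, None)
                if p is not None:
                    partner.pop(p, None)
                    val.pop(p, None)
            partner[i], partner[j] = j, i
            val[i] = val[j] = rw
    return {tuple(sorted(e)) for e in partner.items()}

# e.g. oracle([(1, 2, 5.0), (2, 3, 7.0), (3, 4, 1.0)], {1: 1, 2: 1, 3: 1, 4: 1}, lam=0.5)
```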

slide-128
SLIDE 128

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Have y,

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)/c

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard. ◮ Find a streaming O(n) space c approximation on this filtered set.

If Oracle(λ) for λ = 0 satisfies

(i,j) yij(ui + uj) ≤ i ui/c then we also

have:

(i,j) wijyij ≥ β/c. (easier case)

slide-129
SLIDE 129

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Have y,

          

  • (i,j)

(wij − λ(ui + uj))yij ≥ (β − λ

i ui)/c

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard. ◮ Find a streaming O(n) space c approximation on this filtered set.

For λ = 0 we have

(i,j) yij(ui + uj) ≥ i ui/c.

For λ =

i ui/β we have (i,j) yij(ui + uj) ≤ i ui/c. (Set y = 0)

slide-130
SLIDE 130

MWM on Streams: Bipartite Matching

Want:

                  

  • (i,j)

yij(ui + uj) ≤

i ui

  • (i,j)

yijwij ≥ β

  • j

yij ≤ ρ ∀i yij ≥ 0 ∀(i, j)

Have y,

          

  • (i,j)

(ui + uj)yij ≤

  • i

ui/c and

  • (i,j)

wijyij ≥ β/c

  • j

yij ≤ 1 ∀i yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard. ◮ Find a streaming O(n) space c approximation on this filtered set.

For λ = 0 we have

(i,j) yij(ui + uj) ≥ i ui/c.

For λ =

i ui/β we have (i,j) yij(ui + uj) ≤ i ui/c. (Set y = 0)

Binary search (or try values of λ in parallel).

slide-131
SLIDE 131

MWM on Streams: Bipartite Matching

Want:

Σ_{(i,j)} yij (ui + uj) ≤ Σ_i ui;   Σ_{(i,j)} wij yij ≥ β;   Σ_j yij ≤ ρ ∀i;   yij ≥ 0 ∀(i, j)

Have y:

Σ_{(i,j)} (ui + uj) yij ≤ Σ_i ui / c   and   Σ_{(i,j)} wij yij ≥ β/c;   Σ_j yij ≤ 1 ∀i;   yij ≥ 0 ∀(i, j)

Oracle(λ):

◮ Seeing (i, j) compute (wij − λ(ui + uj)). If -ve, discard.
◮ Find a streaming O(n) space c-approximation on this filtered set.

For λ = 0 we have Σ_{(i,j)} yij (ui + uj) ≥ Σ_i ui / c.

For λ = Σ_i ui / β we have Σ_{(i,j)} yij (ui + uj) ≤ Σ_i ui / c. (Set y = 0.)

Binary search (or try values of λ in parallel). Multiply y by c. Set ρ = c and we have a solution!

slide-132
SLIDE 132

MWM based Bipartite Matching for Map-Reduce?

More general than streaming. MapReduce-based 8-approximations in O(log n) rounds exist, e.g., Lattanzi, Moseley, Suri, Vassilvitskii '11. We can compose them: O(log n) rounds to get a c-approximation, repeated O(cǫ⁻² log n) times to get a (1 + ǫ)-approximate fractional solution. Can also round to an integral solution in small space. A story for some other time.

slide-133
SLIDE 133

(e) Multiple Passes II: Max Non-Bipartite Matching

Exponentially many constraints. Adaptive constraint sparsification. Perturbations. How to find your way at night in the dark?

slide-134
SLIDE 134

Revisiting Dantzig Decompositions

A running average view (primal space).

Hard decision problem Easy decision problem

slide-148
SLIDE 148

Revisiting Dantzig Decompositions

A running average view (primal space).

Hard decision problem Easy decision problem

  • Apx

decision problem

slide-151
SLIDE 151

Adaptive Sparsifications and Dantzig Decompositions

What if we sparsify u? What does that mean?

Hard decision problem Easy decision problem

slide-160
SLIDE 160

Perturbations

Focus on the violations which are close to max violation. Modify the polytope to find such violations faster.

slide-161
SLIDE 161

Cuts and Constraints

slide-162
SLIDE 162

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i yij ≥ 0

slide-163
SLIDE 163

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i (Cut constraint!) yij ≥ 0

slide-164
SLIDE 164

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i (Cut constraint!)

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

slide-165
SLIDE 165

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i (Cut constraint!)

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

Rules out, e.g., a triangle with yij = 1/2 on all three edges (total 3/2 > ⌊3/2⌋ = 1).

slide-166
SLIDE 166

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max Σ_{(i,j)} wij yij
s.t.  Σ_j yij ≤ 1 ∀i (Cut constraint!);   Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋ ∀U;   yij ≥ 0

Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋  ⟺  Σ_{i∈U} Σ_j yij − Σ_{i∈U, j∉U} yij ≤ 2⌊|U|/2⌋,  where Σ_{i∈U, j∉U} yij = Cut(U, V − U).
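A one-line counting identity (my rephrasing of the equivalence above) explains why only small odd cuts matter:

```latex
\sum_{i\in U}\sum_{j} y_{ij} \;=\; 2\sum_{\{i,j\}\subseteq U} y_{ij} \;+\; \mathrm{Cut}(U, V\setminus U).
```

Under the degree constraints Σ_j yij ≤ 1 the left side is at most |U|, so for odd |U| the odd-set constraint Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋ can only be violated when Cut(U, V − U) < 1; violated constraints correspond to small cuts over odd-sized sets.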
slide-167
SLIDE 167

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ⇐ ⇒

  • i∈U
  • j

yij

 

i∈U,j∈U

yij   ≤ 2 ⌊|U|/2⌋

  • i∈U,j∈U yij = Cut(U, V − U).

Find small cuts (with odd vertex sizes).

slide-168
SLIDE 168

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

Find small cuts (with odd vertex sizes).

slide-169
SLIDE 169

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

Find small cuts (with odd vertex sizes). Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible,. . . ,feasible (larger), . . . , feasible, (near) optimal

slide-170
SLIDE 170

Cuts and Constraints

(Again dropping (i, j) ∈ E in the subscripts, yij = yji.)

β∗ = max Σ_{(i,j)} wij yij
s.t.  Σ_j yij ≤ 1 ∀i;   Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋ ∀U;   yij ≥ 0

Find small cuts (with odd vertex sizes). Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible, . . . , feasible (larger), . . . , feasible, (near) optimal. Signature of this Algorithm: infeasible, . . . , infeasible (smaller), . . . , feasible, (near) optimal. Bipartite: O(ǫ⁻² log n) rounds. Non-Bipartite: O(ǫ⁻⁴ log n) rounds.

slide-171
SLIDE 171

How?

β̃ = max Σ_{(i,j)} wij yij
s.t.  Σ_j yij ≤ (1 − 4δ) ∀i;   Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋ − δ²|U|²/4 ∀U;   yij ≥ 0

Consider two odd sets with "density" similar to the densest set. They have to be disjoint or nested within each other (laminar)! Reduces to a bipartite problem with different "effective weights". Near linear time algorithm.

slide-172
SLIDE 172

How?

˜ β = max

  • (i,j)

wijyij

  • j

yij ≤ (1 − 4δ) ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ − δ2|U|2

4

∀U yij ≥ 0

Consider two odd sets with “density” similar to the densest set. Have to be disjoint or within each other (laminar)! Reduces to a bipartite problem with different “effective weights”. Near linear time algorithm. Extends to capacities on vertices and edges.

slide-173
SLIDE 173

(f) Multiple Passes III: Non-Bipartite Matching

For a few passes less ... Sparsify non-adaptively in parallel; use sequentially. Dual-primal versus primal-dual. New relaxations for Matching.

slide-174
SLIDE 174

Sparsify in Parallel, Use Sequentially

We saw a version of sketch in parallel, use sequentially in connectivity. Question: Where will we be after 5 steps of MWM? Recall: If Aiy > bi: raise ui, i.e., ui ← ui(1 + ǫ)(Aiy−bi)/biρ.

slide-175
SLIDE 175

Sparsify in Parallel, Use Sequentially

We saw a version of sketch in parallel, use sequentially in connectivity. Question: Where will we be after 5 steps of MWM? Recall: If Aiy > bi: raise ui, i.e., ui ← ui(1 + ǫ)(Aiy−bi)/biρ. ui(5) ∈ (1 ± ǫ)5ui. Construct 5 independent sparsifications of u.

slide-176
SLIDE 176

Sparsify in Parallel, Use Sequentially

We saw a version of sketch in parallel, use sequentially in connectivity. Question: Where will we be after 5 steps of MWM? Recall: If Aiy > bi: raise ui, i.e., ui ← ui(1 + ǫ)(Aiy−bi)/biρ. ui(5) ∈ (1 ± ǫ)5ui. Construct 5 independent sparsifications of u.

Let's exaggerate changes (for illustration).

slide-177
SLIDE 177

Sparsify in Parallel, Use Sequentially

We saw a version of sketch in parallel, use sequentially in connectivity. Question: Where will we be after 5 steps of MWM? Recall: If Aiy > bi: raise ui, i.e., ui ← ui(1 + ǫ)(Aiy−bi)/biρ. ui(5) ∈ (1 ± ǫ)5ui. Construct 5 independent sparsifications of u.

Let's exaggerate changes (for illustration). If u were not changing ... But they are. Need (small) corrections. Presparsifiers.

slide-182
SLIDE 182

Non-Bipartite Matching in Small Passes

A natural algorithm for non-bipartite matching.

1. Find an initial solution of the dual problem. (A trend.)
2. Assign uij = 1 for all edges.
3. For O(10/ǫ) steps:
   3.1 Compute t sparsifiers with n^1.1 edges using uij.
   3.2 Find the best weighted matching in the edges of the t sparsifications (wij unchanged).
   3.3 Keep the largest weight matching found (say β) so far.
   3.4 Recompute uij.

Recompute:
1. t = O((1/ǫ) log n).
2. Simulate t steps of a primal-dual algorithm trying to prove Feasible Dual ≤ β(1 + O(ǫ)).
3. Adjust the sparsification in between.

(A sketch of steps 3.1–3.3 appears below.)
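A rough sketch (mine, not the authors' implementation) of steps 3.1–3.3: sample t edge sets with probability proportional to uij, match on their union under the original weights wij, and keep the best; the dual recomputation of step 3.4 is omitted. networkx and the simple sampling scheme are my illustration choices.

```python
# Steps 3.1-3.3 only: sparsify according to u, match with the ORIGINAL weights w.
# Assumes networkx is available; edges are (i, j) tuples of ints with i < j.
import random
import networkx as nx

def sample_sparsifier(edges, u, target_size, rng):
    """Keep each edge with probability proportional to u, aiming for ~target_size edges."""
    total = sum(u[e] for e in edges)
    return [e for e in edges if rng.random() < min(1.0, target_size * u[e] / total)]

def matching_on_union(edges, w, u, t, target_size, seed=0):
    rng = random.Random(seed)
    union = set()
    for _ in range(t):                                   # t independent sparsifications
        union.update(sample_sparsifier(edges, u, target_size, rng))
    G = nx.Graph()
    G.add_weighted_edges_from((i, j, w[(i, j)]) for (i, j) in union)
    M = nx.max_weight_matching(G)                        # best matching on the sparsified graph
    return M, sum(w[tuple(sorted(e))] for e in M)
```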
slide-183
SLIDE 183

Cuts, Duals and Graph Sparsification

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ⇐ ⇒

  • i∈U
  • j

yij

 

i∈U,j∈U

yij   ≤ 2 ⌊|U|/2⌋

  • i∈U,j∈U yij = Cut(U, V − U).

Find small cuts (with odd vertex sizes).

slide-184
SLIDE 184

Cuts, Duals and Graph Sparsification

β∗ = max Σ_{(i,j)} wij yij
s.t.  Σ_j yij ≤ 1 ∀i;   Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋ ∀U;   yij ≥ 0

β∗ = min Σ_i xi + Σ_U ⌊|U|/2⌋ zU
s.t.  (uij:)  xi + xj + Σ_{U: i,j∈U} zU ≥ wij ∀(i, j) ∈ E;   xi, zU ≥ 0

Σ_{i,j∈U} yij ≤ ⌊|U|/2⌋  ⟺  Σ_{i∈U} Σ_j yij − Σ_{i∈U, j∉U} yij ≤ 2⌊|U|/2⌋,  where Σ_{i∈U, j∉U} yij = Cut(U, V − U).

Find small cuts (with odd vertex sizes).

slide-185
SLIDE 185

Cuts, Duals and Graph Sparsification

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0 β∗ = min

  • i

xi +

  • U

|U| 2

  • zU

uij : xi + xj +

  • i,j∈U

zU ≥ wij ∀(i, j) ∈ E xi, zU ≥ 0

Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible,. . . ,feasible (larger), . . . , feasible, (near) optimal

slide-186
SLIDE 186

Cuts, Duals and Graph Sparsification

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0 β∗ = min

  • i

xi +

  • U

|U| 2

  • zU

uij : xi + xj +

  • i,j∈U

zU ≥ wij ∀(i, j) ∈ E xi, zU ≥ 0

Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible,. . . ,feasible (larger), . . . , feasible, (near) optimal Signature of previous algorithm: infeasible,. . . ,infeasible (smaller), . . . , feasible, (near) optimal New algorithm?

slide-187
SLIDE 187

Cuts, Duals and Graph Sparsification

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0 β∗ = min

  • i

xi +

  • U

|U| 2

  • zU

uij : xi + xj +

  • i,j∈U

zU ≥ wij ∀(i, j) ∈ E xi, zU ≥ 0

Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible,. . . ,feasible (larger), . . . , feasible, (near) optimal Signature of previous algorithm: infeasible,. . . ,infeasible (smaller), . . . , feasible, (near) optimal New algorithm? infeasible dual, . . . , (estimate of β∗ is increasing), . . . , (near) optimal (O(1/ǫ) rounds, sparsification)

slide-188
SLIDE 188

Cuts, Duals and Graph Sparsification

β∗ = max

  • (i,j)

wijyij

  • j

yij ≤ 1 ∀i

  • i,j∈U

yij ≤ ⌊|U|/2⌋ ∀U yij ≥ 0 β∗ = min

  • i

xi +

  • U

|U| 2

  • zU

uij : xi + xj +

  • i,j∈U

zU ≥ wij ∀(i, j) ∈ E xi, zU ≥ 0

Standard Algorithm: Augment, contract blossoms, ... (many rounds). Signature: feasible,. . . ,feasible (larger), . . . , feasible, (near) optimal Signature of previous algorithm: infeasible,. . . ,infeasible (smaller), . . . , feasible, (near) optimal New algorithm? infeasible dual, . . . , (estimate of β∗ is increasing), . . . , (near) optimal (O(1/ǫ) rounds, sparsification) . . . . . . keep best matching seen so far, . . . . . . . . . (near) optimal

slide-189
SLIDE 189

New Relaxations for Maximum Matching, . . . 3, 2, 1

Let's consider wij = 1.

β∗ = max Σ_{(i,j)} yij − 3 Σ_i µi
s.t.  Σ_j yij − 2µi ≤ 1 ∀i;   Σ_{i,j∈U} yij − Σ_{i∈U} µi ≤ ⌊|U|/2⌋ ∀U;   yij ≥ 0

slide-190
SLIDE 190

New Relaxations for Maximum Matching, . . . 3, 2, 1

Let's consider wij = 1.

β∗ = max Σ_{(i,j)} yij − 3 Σ_i µi
s.t.  Σ_j yij − 2µi ≤ 1 ∀i;   Σ_{i,j∈U} yij − Σ_{i∈U} µi ≤ ⌊|U|/2⌋ ∀U;   yij ≥ 0

β∗ = min Σ_i xi + Σ_U ⌊|U|/2⌋ zU
s.t.  (uij:)  xi + xj + Σ_{U: i,j∈U} zU ≥ wij ∀(i, j) ∈ E;   2xi + Σ_{U: i∈U} zU ≤ 3 ∀i ∈ V;   xi, zU ≥ 0

slide-191
SLIDE 191

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification.

slide-192
SLIDE 192

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams.

slide-193
SLIDE 193

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams. (3) Remember a small number of weight values.

slide-194
SLIDE 194

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams. (3) Remember a small number of weight values. (4) Compute in sketch (sparsified) space entirely. Correlation clustering.

slide-195
SLIDE 195

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams. (3) Remember a small number of weight values. (4) Compute in sketch (sparsified) space entirely. Correlation clustering. (5) May need to change the natural relaxations (convergence speed).

slide-196
SLIDE 196

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams. (3) Remember a small number of weight values. (4) Compute in sketch (sparsified) space entirely. Correlation clustering. (5) May need to change the natural relaxations (convergence speed). (6) May need new relaxations for rounding.

slide-197
SLIDE 197

Wrap up

(1) Primitives: Sampling, Sketching and Sparsification. (2) LPs/SDPs (MWM) on Streams. (3) Remember a small number of weight values. (4) Compute in sketch (sparsified) space entirely. Correlation clustering. (5) May need to change the natural relaxations (convergence speed). (6) May need new relaxations for rounding. (7) Think differently. The real voyage of discovery ...

slide-198
SLIDE 198

Thank You