Optimisation While Streaming
Amit Chakrabarti, Dartmouth College
Joint work with S. Kale, A. Wirth
DIMACS Workshop on Big Data Through the Lens of Sublinear Algorithms, Aug 2015

Combinatorial Optimisation Problems
◮ 1950s, 60s: Operations research
◮ 1970s, 80s: NP-hardness
◮ 1990s, 2000s: Approximation algorithms, hardness of approximation
◮ 2010s: Space-constrained settings, e.g., streaming

Maximum Matching
[Figure: an example graph, shown first unweighted (the cardinality version), then with numeric edge labels 2, 1, 5, 2, 2, 8, 6, 1, 1, 2 (the weighted version).]

Graph Streams: Maximum Matching, Generalisations
Maximum cardinality matching (MCM)
◮ Input: stream of edges (u, v) ∈ [n] × [n]
◮ Describes graph G = (V, E): n vertices, m edges, undirected, simple
◮ Each edge appears exactly once in stream
◮ Goal
  • Output a matching M ⊆ E, with |M| maximal
  • Use sublinear (in m) working memory
  • Ideally O(n polylog n) ... "semi-streaming"
  • Need Ω(n log n) to store M
Maximum weight matching (MWM)
◮ Input: stream of weighted edges (u, v, w_uv) ∈ [n] × [n] × R+
◮ Goal: output matching M ⊆ E, with w(M) = Σ_{e ∈ M} w(e) maximal
Maximum submodular-function matching (MSM) [Chakrabarti-Kale'14]
◮ Input: unweighted edges (u, v), plus submodular f : 2^E → R+
◮ Goal: output matching M ⊆ E, with f(M) maximal
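As a point of reference for the MCM setting above, here is a minimal sketch of the folklore one-pass greedy algorithm. It is not taken from the slides. It keeps an edge whenever both endpoints are still free, so it stores at most n/2 edges and returns a maximal matching, which is at least half the size of a maximum one. The function name `greedy_matching` is illustrative.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def greedy_matching(stream: Iterable[Edge]) -> List[Edge]:
    """One pass over the edge stream; keep an edge iff both endpoints are free."""
    matched: Set[int] = set()      # vertices already covered by the matching
    matching: List[Edge] = []
    for u, v in stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


if __name__ == "__main__":
    # Path 1-2-3-4-5: greedy keeps (1, 2) and (3, 4).
    print(greedy_matching([(1, 2), (2, 3), (3, 4), (4, 5)]))
```

The result depends on arrival order: on the path 1-2-3-4, seeing (2, 3) first blocks both other edges, which is the worst case for the factor of 2.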

Set Cover

Set Cover with Sets Streamed
◮ Input: stream of m sets, each ⊆ [n]
◮ Goal: cover universe [n] using as few sets as possible
  • Use sublinear (in m) space
  • Ideally O(n polylog n) ... "semi-streaming"
  • Need Ω(n log n) space to certify: for each item, who covered it?
Think m ≥ n
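To make the "who covered it?" certificate concrete, the sketch below builds one for a given choice of sets. It only illustrates the size of the certificate (n entries, each an index into the stream of m sets) and is not a lower-bound argument. The helper name `cover_certificate` and the 0-based stream indices are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Sequence, Set


def cover_certificate(n: int, stream: Sequence[Set[int]],
                      chosen: Iterable[int]) -> List[int]:
    """For each item 1..n, return the stream index of a chosen set covering it.

    Raises ValueError if the chosen sets do not cover the universe [n].
    """
    chosen_idx = set(chosen)
    covered_by: Dict[int, int] = {}
    for idx, s in enumerate(stream):
        if idx in chosen_idx:
            for item in s:
                if 1 <= item <= n:
                    # Keep the first chosen set that covers this item.
                    covered_by.setdefault(item, idx)
    missing = [x for x in range(1, n + 1) if x not in covered_by]
    if missing:
        raise ValueError(f"items not covered: {missing}")
    return [covered_by[x] for x in range(1, n + 1)]


if __name__ == "__main__":
    sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
    print(cover_certificate(4, sets, chosen=[0, 2]))
```

The returned list has one entry per item, so writing it down takes about n log m bits, in line with the Ω(n log n) figure when m ≥ n.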

Road Map
◮ Results on Maximum Submodular Matching (MSM)
◮ Generalising MSM: constrained submodular maximisation
◮ Set Cover: upper bounds
◮ Set Cover: lower bounds, with proof outline

Maximum Submodular Matching
Input
◮ Stream of edges σ = ⟨e_1, e_2, ..., e_m⟩
◮ Valuation function f : 2^E → R+
  • Submodular: X ⊆ Y ⊆ E, e ∈ E ⟹ f(X + e) − f(X) ≥ f(Y + e) − f(Y)
  • Monotone: X ⊆ Y ⟹ f(X) ≤ f(Y)
  • Normalised: f(∅) = 0
◮ Oracle access to f: query at X ⊆ E, get f(X)
  • May only query at X ⊆ (stream so far)
Goal
◮ Output matching M ⊆ E, with f(M) "large" (rather than maximal)
◮ Store O(n) edges and f-values
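The three conditions on f can be checked by brute force on a tiny ground set. The sketch below does this for a coverage function (the number of vertices touched by a set of edges), a standard example of a monotone, normalised, submodular function. The helper names `make_coverage` and `check_properties` are illustrative and not from the slides, and the check takes exponential time, so it is only meant for toy inputs.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Set, Tuple


def make_coverage(covers: Dict[Hashable, Set]) -> Callable[[Iterable], int]:
    """Return f(X) = size of the union of covers[e] over e in X."""
    def f(X: Iterable) -> int:
        covered: Set = set()
        for e in X:
            covered |= covers[e]
        return len(covered)
    return f


def all_subsets(ground: Iterable) -> Iterable[FrozenSet]:
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def check_properties(f: Callable, ground: Iterable) -> Tuple[bool, bool, bool]:
    """Brute-force check; returns (normalised, monotone, submodular)."""
    ground = list(ground)
    subsets = list(all_subsets(ground))
    normalised = f(frozenset()) == 0
    monotone = all(f(X) <= f(Y) for X in subsets for Y in subsets if X <= Y)
    submodular = all(
        f(X | {e}) - f(X) >= f(Y | {e}) - f(Y)
        for X in subsets for Y in subsets if X <= Y
        for e in ground
    )
    return normalised, monotone, submodular


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 4)]
    vertices_touched = make_coverage({e: set(e) for e in edges})
    print(check_properties(vertices_touched, edges))        # coverage function
    print(check_properties(lambda X: len(X) ** 2, edges))   # not submodular
```

The second function, |X|², is monotone and normalised but has increasing marginal gains, so it fails the submodularity test.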

Some Results on MSM
Can't solve MSM exactly
◮ MCM, approx < e/(e − 1) ⟹ space ω(n polylog n) [Kapralov'13]
◮ Offline MSM, approx < e/(e − 1) ⟹ n^ω(1) oracle calls
  • Via cardinality-constrained submodular max [Nemhauser-Wolsey'78]
Positive results, using O(n) storage:
Theorem 1    MSM, one pass: 7.75-approx
Theorem 2    MSM, (3 + ε)-approx in O(ε^−3) passes
More importantly:
Meta-Thm 1   Every compliant MWM approx alg → MSM approx alg
Meta-Thm 2   Similarly, max weight independent set (MWIS) → MSIS

Compliant Algorithms for MWM
[Figure: a small weighted graph (edge labels 2, 3, 2, 1, 2) with picked and unpicked edges marked; a new edge of weight 8 then arrives.]
Maintain "current solution" M, update if new edge improves it sufficiently

Compliant Algorithms for MWM: Details
Update of "current solution" M + pool of "shadow edges" S
◮ Given new edge e, pick "augmenting pair" (A, J)
  • A ← {e}; later refined to A ← "best" subset of 3-neighbourhood of e
  • J ← M ⋓ A ... edges in M that conflict with A
  • Ensure w(A) ≥ (1 + γ) w(J)
◮ Update M ← (M \ J) ∪ A
◮ Update S ← appropriate subset of (S \ A) ∪ J
Choice of gain parameter
◮ γ = 1, approx factor 6 [Feigenbaum-K-M-S-Z'05]
◮ γ = 1/√2, approx factor 5.828 [McGregor'05]
◮ γ = 1.717, approx factor 5.585 [Zelke'08]
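The update rule can be sketched in a few lines for the simplest case, A = {e} with no shadow edges (S stays empty). This is an illustrative simplification, not the refined 3-neighbourhood variant; the function name `stream_mwm` and the tuple representation of weighted edges are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

WeightedEdge = Tuple[int, int, float]


def stream_mwm(stream: Iterable[WeightedEdge],
               gamma: float = 1.0) -> List[WeightedEdge]:
    """One-pass compliant update with A = {e} and no shadow edges."""
    at: Dict[int, WeightedEdge] = {}   # vertex -> edge of M covering it
    for u, v, w in stream:
        if u == v:
            continue
        # J: edges of the current solution M that conflict with the new edge.
        conflicts = {at[x] for x in (u, v) if x in at}
        # Accept only if w(A) >= (1 + gamma) * w(J).
        if w >= (1 + gamma) * sum(c[2] for c in conflicts):
            for c in conflicts:
                del at[c[0]]
                del at[c[1]]
            at[u] = at[v] = (u, v, w)   # M <- (M \ J) ∪ A
    return sorted(set(at.values()))


if __name__ == "__main__":
    stream = [(1, 2, 2.0), (2, 3, 3.0), (3, 4, 8.0), (4, 5, 1.0)]
    print(stream_mwm(stream, gamma=1.0))
    print(stream_mwm(stream, gamma=0.4))
```

A larger γ makes the algorithm more reluctant to swap. On this stream γ = 1 keeps (1, 2) and then adds (3, 4), while γ = 0.4 lets (2, 3) displace (1, 2) and is then itself displaced by (3, 4).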

Generic Compliant Algorithm and f-Extension for MSM
procedure Process-Edge(e, M, S, γ)
    w(e) ← f(M ∪ S + e) − f(M ∪ S)    ... the line added by the f-extension
    (A, J) ← a well-chosen augmenting pair for M
             with A ⊆ M ∪ S + e, w(A) ≥ (1 + γ) w(J)
    M ← (M \ J) ∪ A
    S ← a well-chosen subset of (S \ A) ∪ J
MWM alg A + submodular f → MSM alg A^f (the f-extension of A)
MWIS (arbitrary ground set E, independent sets ℐ ⊆ 2^E) + f → MSIS

Generalise: Submodular Maximization (MWIS, MSIS)
procedure Process-Element(e, I, S, γ)
    w(e) ← f(I ∪ S + e) − f(I ∪ S)
    (A, J) ← a well-chosen augmenting pair for I
             with A ⊆ I ∪ S + e, w(A) ≥ (1 + γ) w(J)
    I ← (I \ J) ∪ A
    S ← a well-chosen subset of (S \ A) ∪ J
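A rough sketch of the f-extension for matchings, under strong simplifying assumptions: A = {e}, no shadow edges (so S is empty and the marginal value is taken with respect to M alone), and f supplied as a Python callable standing in for the oracle. Each edge's weight is frozen at its marginal value on arrival, and the compliant rule then runs on those weights. The names `stream_msm` and `topics` are hypothetical; this is not the algorithm behind the stated approximation factors.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Edge = Tuple[int, int]


def stream_msm(stream: Iterable[Edge],
               f: Callable[[FrozenSet[Edge]], float],
               gamma: float = 1.0) -> List[Edge]:
    """f-extension of the basic compliant rule (A = {e}, S empty)."""
    at: Dict[int, Edge] = {}          # vertex -> edge of M covering it
    weight: Dict[Edge, float] = {}    # frozen marginal value of each edge in M
    for e in stream:
        u, v = e
        if u == v:
            continue
        current = frozenset(weight)               # the current solution M
        w_e = f(current | {e}) - f(current)       # w(e) <- f(M + e) - f(M)
        conflicts = {at[x] for x in e if x in at}
        if w_e >= (1 + gamma) * sum(weight[c] for c in conflicts):
            for c in conflicts:
                for x in c:
                    del at[x]
                del weight[c]
            at[u] = at[v] = e
            weight[e] = w_e
    return sorted(weight)


if __name__ == "__main__":
    # Hypothetical coverage valuation: each edge "covers" a set of topics.
    topics = {
        (1, 2): {"a", "b"},
        (2, 3): {"a", "b", "c", "d", "e", "g", "h"},
        (3, 4): {"a"},
        (4, 5): {"x"},
    }

    def f(X: FrozenSet[Edge]) -> float:
        return float(len(set().union(*(topics[e] for e in X))))

    print(stream_msm([(1, 2), (2, 3), (3, 4), (4, 5)], f, gamma=1.0))
```

In this run (2, 3) arrives with marginal value 5 and displaces (1, 2), whose frozen weight is 2. Edge (3, 4) adds nothing new and is rejected, and (4, 5) is accepted because both its endpoints are free.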

Further Applications: Hypermatchings
Stream of hyperedges e_1, e_2, ..., e_m ⊆ [n], each |e_i| ≤ p
Hypermatching = subset of pairwise disjoint edges
