Parameterized Streaming Algorithms Graham Cormode Rajesh Chitnis - - PowerPoint PPT Presentation



SLIDE 1

Towards a Theory of

Parameterized Streaming Algorithms

Graham Cormode Rajesh Chitnis

SLIDE 2

We increasingly have to deal with huge graphs…

Facebook graph

  • $10^9$ nodes

Brain graph

  • $10^9$ nodes

Web Graph

  • $2^{32}$ nodes

Google Maps in USA

  • $10^8$ intersection nodes

  • It is inconvenient or impossible to store the whole input for random access
  • “Solved” problems become hard under different models of data access
  • E.g. External memory, MapReduce, Streaming…
SLIDE 3
  • The paradigm of streaming algorithms is one attempt to deal with Big Data
  • The streaming model (for graphs) is as follows:
  • The vertex set $V = \{1, 2, \ldots, n\}$ is fixed, and known in advance
  • The edges arrive one-by-one (in arbitrary order)
  • For each edge arrival, we need to make a (fast) decision about what information to store
  • Cannot (do not want to) store all the edges
  • We allow unbounded computation at the end of the stream
  • Which graph problems can we solve efficiently in this model?
  • Naïve algorithm for any graph problem uses $O(n^2)$ bits by storing the whole adjacency matrix

[Figure: example with vertices labeled 1 to 5]
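A minimal sketch of the naïve baseline, assuming an undirected simple graph on vertices 1..n. The class name `AdjacencyMatrixStream` and the function `has_triangle` are illustrative names introduced here, not taken from the talk. Each arriving edge sets one bit, and any question (here, triangle detection) is answered after the stream ends.

```python
class AdjacencyMatrixStream:
    """Naive baseline: one bit per unordered vertex pair, about n^2 / 2 bits."""

    def __init__(self, n: int) -> None:
        self.n = n
        # Pairs (u, v) with u < v are packed into a flat bit array.
        self.bits = bytearray((n * (n - 1) // 2 + 7) // 8)

    def _index(self, u: int, v: int) -> int:
        if u == v or not (1 <= u <= self.n and 1 <= v <= self.n):
            raise ValueError("expected two distinct vertices in 1..n")
        if u > v:
            u, v = v, u
        # Row-major position inside the strict upper triangle.
        return (u - 1) * self.n - (u - 1) * u // 2 + (v - u - 1)

    def add_edge(self, u: int, v: int) -> None:
        """Process one edge arrival: set the bit for the pair {u, v}."""
        i = self._index(u, v)
        self.bits[i // 8] |= 1 << (i % 8)

    def has_edge(self, u: int, v: int) -> bool:
        i = self._index(u, v)
        return bool((self.bits[i // 8] >> (i % 8)) & 1)


def has_triangle(store: AdjacencyMatrixStream) -> bool:
    """Post-stream computation: brute-force search over vertex triples."""
    n = store.n
    for a in range(1, n + 1):
        for b in range(a + 1, n + 1):
            if not store.has_edge(a, b):
                continue
            for c in range(b + 1, n + 1):
                if store.has_edge(a, c) and store.has_edge(b, c):
                    return True
    return False
```

The point of the sketch is the memory footprint: the bit array grows quadratically in n regardless of how many edges actually arrive.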

SLIDE 4
  • Recall that the naïve algorithm for any graph problem uses $O(n^2)$ bits
  • Bad News: Many graph problems have a lower bound of $\Omega(n^2)$ space in the streaming model
  • E.g. Does the given graph have any triangle?
  • Typically use communication complexity to show lower bounds for streaming algorithms
  • INDEX problem: Alice has a string $X \in \{0,1\}^N$, Bob has an index $i \in [N]$, and they want to find the $i$-th bit of $X$
  • Lower bound of $\Omega(N)$ if Alice can send only one message to Bob, even with randomization
  • Communication complexity reductions: show that a streaming algorithm would solve INDEX

[Figure: Alice holds $X = 10010110$ and Bob holds $i = 5$; one-way communication from Alice to Bob]

SLIDE 5

  • Sketch of a simple INDEX reduction for triangle detection:
  • Let $N = r^2$, with vertex sets $Y = \{y_1, \ldots, y_r\}$ and $Z = \{z_1, \ldots, z_r\}$
  • Alice adds edges between $Y$ and $Z$ according to her string $X$
  • Then she sends her data structure to Bob
  • Bob has an index $I \in [N]$ corresponding to some $(j, \ell) \in [r] \times [r]$
  • Bob adds a new vertex $s$ and the edges $(s, y_j)$ and $(s, z_\ell)$

[Figure: bipartite graph with $y_1, \ldots, y_j, \ldots, y_r$ in $Y$ and $z_1, \ldots, z_\ell, \ldots, z_r$ in $Z$, plus the new vertex $s$]

The resulting graph has a triangle iff the edge $(y_j, z_\ell)$ is present, i.e., the $I$-th bit of $X$ is 1
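The reduction can be checked exhaustively on small instances. The sketch below assumes 0-indexed bits and the row-major correspondence I = j·r + ℓ (any fixed bijection would do). The function names are illustrative, and `has_triangle` stores all edges, so it only stands in for a hypothetical small-space streaming algorithm.

```python
from itertools import product


def alice_edges(x, r):
    """Alice's part of the stream: bit j*r + l of x adds the edge (y_j, z_l)."""
    return [(("y", j), ("z", l))
            for j in range(r) for l in range(r) if x[j * r + l] == 1]


def bob_edges(index, r):
    """Bob's part of the stream: new vertex s joined to y_j and z_l."""
    j, l = divmod(index, r)
    return [(("s", 0), ("y", j)), (("s", 0), ("z", l))]


def has_triangle(edges):
    """Stand-in for the streaming algorithm; this version keeps every edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # An edge whose endpoints share a neighbour closes a triangle.
    return any(adj[u] & adj[v] for u, v in edges)


def check_all(r):
    """Verify 'triangle iff bit is 1' for every string and every index."""
    n_bits = r * r
    return all(
        has_triangle(alice_edges(x, r) + bob_edges(i, r)) == (x[i] == 1)
        for x in product((0, 1), repeat=n_bits)
        for i in range(n_bits)
    )
```

Since the Y-Z part is bipartite, every triangle must pass through s, which is why the answer reveals exactly one bit of Alice's string.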

SLIDE 6
  • Bad News: Many graph problems require $\Omega(n^2)$ space in the streaming model
  • How can we cope with this (space) intractability?

[Figure: BIG Time, BIG Data]

  • Feigenbaum et al. [ICALP ‘04]: Finding (the size of) a minimum vertex cover (VC) needs $\Omega(n^2)$ space
  • But how much space does $k$-VC need?
  • We design a streaming algorithm using $O(k \cdot \log n)$ bits (with $2^k$ passes over the input)
  • Essentially, the standard branching FPT algorithm in the streaming model…

Fine-grained understanding via parameterized analysis

SLIDE 7
  • Streaming algorithm for $k$-VC with $O(k \cdot \log n)$ bits and $2^k$ passes

[Figure: binary search tree of depth $k$ rooted at $G$; branching on the edges $e = x_1y_1$, $e = x_2y_2$ and $e = x_3y_3$ gives the nodes $G - y_1$, $G - x_1$, $G - y_1 - x_3$, $G - y_1 - y_3$, $G - x_1 - x_2$, $G - x_1 - y_2$]

  • Consider all $2^k$ binary strings from $\{0,1\}^k$, one in each pass
  • The binary search tree has $2^k$ leaves
  • Each pass corresponds to a root → leaf path in the tree
  • 0 for left branch, and 1 for right branch
  • Algorithm only stores the current binary string and the corresponding VC
  • Storage is $O(k \cdot \log n)$ bits
  • Optimal if you also want to output a VC!

Streaming implementation of FPT algorithm via iterative compression: $(k \cdot 2^k)$-pass streaming algorithm for $k$-VC which uses $O(k \cdot \log n)$ bits

Reducing the number of passes: Chitnis et al. [SODA ‘15] designed a 1-pass streaming algorithm for $k$-VC using $O(k^2 \cdot \log n)$ bits
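A minimal sketch of the $2^k$-pass branching idea, under two assumptions that are not stated on the slide: the stream can be replayed (modelled here by a hypothetical zero-argument callable `stream`), and edges arrive in the same order in every pass. The function name is illustrative. Only the current choice string and a cover of at most k vertices are kept between edge arrivals.

```python
from itertools import product
from typing import Callable, Iterable, Optional

Edge = tuple[int, int]


def k_vertex_cover_multipass(
    stream: Callable[[], Iterable[Edge]], k: int
) -> Optional[set[int]]:
    """Return a vertex cover of size <= k, or None if no such cover exists."""
    for choices in product((0, 1), repeat=k):   # one binary string per pass
        cover: set[int] = set()
        feasible = True
        for u, v in stream():                   # a single pass over the edges
            if u in cover or v in cover:
                continue                        # edge is already covered
            if len(cover) == k:
                feasible = False                # budget exhausted on this branch
                break
            # The next bit picks the endpoint: 0 = first, 1 = second.
            cover.add((u, v)[choices[len(cover)]])
        if feasible:
            return cover
    return None
```

If a cover S of size at most k exists, the string that always picks the endpoint lying in S keeps the working cover inside S, so that pass succeeds.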

SLIDE 8

Towards a general theory of (space) parameterized streaming algorithms…

FPS: $f(k) \cdot \log n$; SubPS: $f(k) \cdot n^{1-\epsilon} \cdot \log n$; LinPS: $f(k) \cdot n \cdot \log n$; BrutePS: $O(n^2)$

  • FPS: Fixed-Parameter Streaming
  • SubPS: Sublinear dependence on $n$
  • LinPS: Linear dependence on $n$
  • BrutePS: Naïvely storing the whole graph

Goal: Develop algorithms and lower bounds to categorize graph problems in this hierarchy

[Figure: problems placed in the hierarchy: $k$-Vertex-Cover, $k$-MaxMatching; 1.5-approx. for MaxMatching on trees; $k$-Path, $k$-FVS, $k$-Treewidth; $k$-Girth, $k$-Clique, $k$-Dominating-Set]

We study all problems, not just NP-hard ones!

SLIDE 9

Towards a general theory of (space) parameterized streaming algorithms…

Picture is a bit more complicated: Any entry in this landscape is really a 6-tuple

[Problem, Parameter, Approximation Ratio, Type of Stream, Type of Algorithm, # of passes]

FPS: $f(k) \cdot \log n$; SubPS: $f(k) \cdot n^{1-\epsilon} \cdot \log n$; LinPS: $f(k) \cdot n \cdot \log n$; BrutePS: $O(n^2)$

  • FPS: Fixed-Parameter Streaming Algorithms
  • SubPS: Sublinear dependence on $n$
  • LinPS: Linear dependence on $n$
  • BrutePS: Naïvely storing the whole graph

Type of Stream: insertion-only or insertion-deletion. Type of Algorithm: deterministic or randomized.

SLIDE 10

Tight problems for the class LinPS via simple upper bounds

FPS: $f(k) \cdot \log n$; SubPS: $f(k) \cdot n^{1-\epsilon} \cdot \log n$; LinPS: $f(k) \cdot n \cdot \log n$; BrutePS: $O(n^2)$

$k$-Path, $k$-FVS, $k$-Treewidth

Store all edges until we have seen $O(k \cdot n)$ of them; hence this needs $O(k \cdot n \cdot \log n)$ bits

  • $k$-Path: If $|E| \geq k \cdot n$ then there is a $k$-path
  • $k$-FVS: If there is a feedback vertex set (FVS) of size $k$ then $|E| < (k+1) \cdot n$, since the remaining forest has fewer than $n$ edges and each removed vertex has fewer than $n$ incident edges
  • $k$-Treewidth: If treewidth is $\leq k$ then $|E| \leq k \cdot n$

These problems need $\Omega(n \cdot \log n)$ space (for constant $k$); hence, they are not in SubPS

Rules out any algorithm using space $f(k) \cdot o(n \cdot \log n)$ for any function $f$
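A minimal sketch of the "store edges up to a cap" upper bound, shown for $k$-FVS. The assumptions are mine: a simple graph on vertices 0..n-1 with no repeated edges, an insertion-only stream, and the cap $(k+1) \cdot n$. The cap is safe because deleting $k$ vertices leaves a forest with fewer than $n$ edges, and each deleted vertex has fewer than $n$ incident edges. The function names are illustrative, and the post-stream step is plain brute force, relying on the model's unbounded computation at the end of the stream.

```python
from itertools import combinations


def is_forest(n, edges, removed):
    """Union-find cycle check on the graph with `removed` vertices deleted."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for u, v in edges:
        if u in removed or v in removed:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # this edge closes a cycle
        parent[ru] = rv
    return True


def has_fvs_of_size_k(n, k, stream):
    """Decide k-FVS while buffering fewer than (k + 1) * n edges."""
    cap = (k + 1) * n
    stored = []
    for edge in stream:
        stored.append(edge)
        if len(stored) >= cap:
            return False                    # too dense for any FVS of size k
    # End of stream: the whole (sparse) graph is in memory.
    return any(
        is_forest(n, stored, set(subset))
        for size in range(k + 1)
        for subset in combinations(range(n), size)
    )
```

Each stored edge costs $O(\log n)$ bits, so the buffer fits in $O(k \cdot n \cdot \log n)$ bits.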

SLIDE 11

$\Omega(n \cdot \log n)$ bit lower bound for $k$-Path with $k = 6$

  • Hardness reduction: “Small” space streaming algorithm for 6-Path ⇒ 1-way communication protocol for PERMUTATION of “small” cost
  • PERMUTATION problem: Alice has a permutation $\delta: [N] \to [N]$ encoded as a bit-string of length $N \cdot \log N$. Bob has an index $I \in [N \cdot \log N]$ and wants to find the $I$-th bit of $\delta$
  • Sun and Woodruff [APPROX ‘15]: need $\Omega(N \cdot \log N)$ bits of one-way communication
  • Alice adds edges between $Y$ and $Z$ according to the permutation $\delta$
  • For each $i \in [N]$ she adds an edge from $y_i$ to $z_{\delta(i)}$
  • Bob’s index $I \in [N \cdot \log N]$ maps to the $\ell$-th bit of $\delta(j)$ for some $j, \ell$
  • Bob adds a new vertex $s$, and the edge $s - y_j$
  • Let $S_\ell = \{ z_{\delta(r)} : \text{the } \ell\text{-th bit of } \delta(r) \text{ is one} \}$
  • Bob adds a new vertex $t$, and edges from $t$ to each vertex of $S_\ell$

[Figure: $y_1, y_2, \ldots, y_j, \ldots, y_N$ in $Y$ matched to $z_{\delta(1)}, z_{\delta(2)}, \ldots, z_{\delta(j)}, \ldots, z_{\delta(N)}$ in $Z$, plus the new vertices $s$ and $t$]

The resulting graph has a 6-path iff $z_{\delta(j)} \in S_\ell$, i.e., the $I$-th bit of $\delta$ is 1
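The reduction can be sanity-checked by brute force on tiny instances. The sketch below makes assumptions of its own: values and bit positions are 0-indexed, a "6-path" is read as a simple path on 6 vertices, and every bit position is set in at least two values of the range (true for N = 4 with 2 bits). The function names are illustrative, and the exhaustive path search only stands in for a hypothetical streaming algorithm.

```python
from itertools import permutations


def build_graph(delta, j, l):
    """Alice's matching y_i - z_delta(i), plus Bob's vertices s and t."""
    n = len(delta)
    edges = [(("y", i), ("z", delta[i])) for i in range(n)]   # Alice
    edges.append((("s", 0), ("y", j)))                        # Bob: s - y_j
    # Bob: t is joined to every z_m whose l-th bit is one (the set S_l).
    edges += [(("t", 0), ("z", m)) for m in range(n) if (m >> l) & 1]
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def has_path_on(adj, num_vertices):
    """Exhaustive search for a simple path visiting `num_vertices` vertices."""
    def extend(v, visited):
        if len(visited) == num_vertices:
            return True
        return any(extend(w, visited | {w})
                   for w in adj[v] if w not in visited)

    return any(extend(v, {v}) for v in adj)


def check_all(n, bits):
    """Verify '6-path iff the l-th bit of delta(j) is 1' for all inputs."""
    return all(
        has_path_on(build_graph(delta, j, l), 6) == bool((delta[j] >> l) & 1)
        for delta in permutations(range(n))
        for j in range(n)
        for l in range(bits)
    )
```

When the bit is 1, the path s, y_j, z_δ(j), t, z_m, y_δ⁻¹(m) uses six vertices; when it is 0, the component of s has three vertices and the component of t is a spider whose longest path has five.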

SLIDE 12

Tight problems for the class BrutePS

FPS: $f(k) \cdot \log n$; SubPS: $f(k) \cdot n^{1-\epsilon} \cdot \log n$; LinPS: $f(k) \cdot n \cdot \log n$; BrutePS: $O(n^2)$

$k$-Girth, $k$-Clique, $k$-Dominating-Set

How do we show a problem does not belong to the smaller class LinPS?

  • Show $\Omega(n^2)$ bits lower bound for constant $k$
  • Rules out any algorithm using space $f(k) \cdot o(n^2)$
  • Next slide gives proof for 3-Girth…

Note that $k$-Girth is polynomial time solvable, but hard in terms of space!

SLIDE 13

$\Omega(n^2)$ bits lower bound for checking if the girth of a graph is $\leq 3$

INDEX problem requires $\Omega(N)$ bits of one-way communication from Alice to Bob. Alice has a string $X \in \{0,1\}^N$. Bob has an index $I \in [N]$ and wants to find the $I$-th bit of $X$

  • Same set up as previously:
  • Let $N = r^2$ and fix a bijection $\phi: [N] \to [r] \times [r]$
  • Alice adds edges between $Y$ and $Z$ according to string $X$
  • Then she sends her data structure to Bob
  • Bob’s index $I \in [N]$ corresponds to some $(j, \ell) \in [r] \times [r]$
  • Bob adds a new vertex $s$ and the edges $(s, y_j)$ and $(s, z_\ell)$
  • Lower bound of $\Omega(N)$ translates to $\Omega(n^2)$ for 3-girth on graphs with $n$ vertices

[Figure: bipartite graph with $y_1, \ldots, y_j, \ldots, y_r$ in $Y$ and $z_1, \ldots, z_\ell, \ldots, z_r$ in $Z$, plus the new vertex $s$]

The resulting graph has a triangle iff the edge $(y_j, z_\ell)$ is present, i.e., the $I$-th bit of $X$ is 1
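The translation from $\Omega(N)$ to $\Omega(n^2)$ is a vertex count. As a short worked step, assuming the graph consists of exactly the $r$ vertices of $Y$, the $r$ vertices of $Z$ and the single new vertex $s$:

```latex
\begin{align*}
n &= |Y| + |Z| + 1 = 2r + 1, \\
N &= r^2 = \left(\frac{n - 1}{2}\right)^2 = \Omega(n^2).
\end{align*}
```

So a streaming algorithm using $o(n^2)$ bits on this graph would give a one-way protocol for INDEX with $o(N)$ bits.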

SLIDE 14

Goal: Develop algorithms and lower bounds to categorize graph problems in this hierarchy

  • The story so far…
  • Can simulate parameterized techniques (branching, iterative compression, bidimensionality, etc.) in the streaming model
  • Developed new lower bounds using communication complexity

Looking forward…

  • Beyond “standard” graph problems? Game theory, machine learning, etc.
  • Connections with kernelization?
  • Implement and evaluate these new parameterized streaming algorithms?
  • Code for some of the $k$-VC algorithms available at http://projects.csail.mit.edu/dnd/

Streaming (space) algorithms and parameterized (time) algorithms: two-way flow of ideas

SLIDE 15

Lower bounds inspired by Kernel lower bounds

  • Connections with Kernelization – a different (but related) data-compression model
  • Kernelization versus streaming
  • Polytime computation versus unbounded computation
  • Full access to the input versus limited access to the input
  • AND-compression: No poly kernel unless NP ⊆ coNP/poly
  • New definition of AND-compatible, inspired by AND-compression

A problem $\Pi$ is AND-compatible if $\exists$ constant $k \in \mathbb{N}$ such that

  • $\forall n \in \mathbb{N}$ there is a graph $G_{YES}$ on $n$ vertices such that $\Pi(G_{YES}, k)$ is a YES instance
  • $\forall n \in \mathbb{N}$ there is a graph $G_{NO}$ on $n$ vertices such that $\Pi(G_{NO}, k)$ is a NO instance
  • $\forall t \in \mathbb{N}$ we have that $\Pi(G_1 \uplus G_2 \uplus \cdots \uplus G_t, k) = \bigwedge_{i=1}^{t} \Pi(G_i, k)$, where $\uplus$ denotes vertex disjoint union
  • Many natural graph problems are AND-compatible: $k$-coloring, $k$-treewidth, $k$-girth
  • Our result: If a problem $\Pi$ is AND-compatible then it does not admit a streaming algorithm using space $f(k) \cdot o(n)$, for any function $f$.
  • Unconditional, unlike kernel lower bounds
  • Similar definition and result for OR-compatible
SLIDE 16

Lower bounds inspired by Kernel lower bounds

A problem $\Pi$ is AND-compatible if $\exists$ constant $k \in \mathbb{N}$ such that

  • $\forall n \in \mathbb{N}$ there is a graph $G_{YES}$ on $n$ vertices such that $\Pi(G_{YES}, k)$ is a YES instance
  • $\forall n \in \mathbb{N}$ there is a graph $G_{NO}$ on $n$ vertices such that $\Pi(G_{NO}, k)$ is a NO instance
  • $\forall t \in \mathbb{N}$ we have that $\Pi(G_1 \uplus G_2 \uplus \cdots \uplus G_t, k) = \bigwedge_{i=1}^{t} \Pi(G_i, k)$, where $\uplus$ denotes vertex disjoint union
  • Our result: If a problem $\Pi$ is AND-compatible then it does not admit a streaming algorithm using space $f(k) \cdot o(n)$, for any function $f$.
  • Consider $t$ graphs $G_1, G_2, \ldots, G_t$ each having $n$ vertices
  • Let $G$ be the disjoint union $G_1 \uplus G_2 \uplus \cdots \uplus G_t$
  • By the pigeonhole principle, any (correct) algorithm for $G$ must use $\geq t$ bits
  • Otherwise two subsets $I, J$ of $[t]$ collide. Let $i^* \in I \setminus J$
  • Select $G_i = G_{YES}$ for each $i \in (I \cup J) \setminus \{i^*\}$ and $G_{i^*} = G_{NO}$
  • This violates correctness of the algorithm
  • Hence, we have that $f(k) \cdot o(n \cdot t) \geq t$
  • Contradiction since $k, n$ are constants and we can take $t$ as large as we want
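The final contradiction can be spelled out as a short derivation. The notation $s(m)$ for the space used on $m$-vertex graphs and the function $g$ are introduced here for illustration; the assumption being refuted is $s(m) \leq f(k) \cdot g(m)$ for some $g(m) = o(m)$.

```latex
\begin{align*}
t \;\le\; s(n t) \;\le\; f(k)\, g(n t)
  \quad&\Longrightarrow\quad
  \frac{g(n t)}{n t} \;\ge\; \frac{1}{n\, f(k)} \;>\; 0
  \quad\text{for every } t, \\
g(m) = o(m)
  \quad&\Longrightarrow\quad
  \frac{g(n t)}{n t} \;\longrightarrow\; 0
  \quad\text{as } t \to \infty .
\end{align*}
```

With $k$ and $n$ fixed, the first line bounds the ratio away from zero while the second forces it to zero, so no such $g$ can exist.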