Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation
Huacheng Yu
Oct 18, 2018
Harvard University
Joint work with Jelani Nelson

Warm-up
Consider the following dynamic problem:
• edges are inserted into an initially empty graph G on n vertices
• must output a spanning forest when queried
Goal: minimize space
Space complexity: $\Theta(n \log n)$ bits
• maintain list of edges in the spanning forest: $O(n \log n)$
• when the final graph is a tree itself, have to output the whole graph: $\Omega(n \log n)$
What if we allow edge deletions?
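
The insertion-only upper bound from the warm-up can be sketched as follows. This is a minimal illustration, not code from the talk: the class name `InsertOnlySpanningForest` and its methods are hypothetical, and a union-find structure is assumed as one standard way to decide whether a new edge joins two components.

```python
class InsertOnlySpanningForest:
    """Insertion-only spanning forest: remembers only the forest edges."""

    def __init__(self, n):
        # parent[] is a union-find structure: n vertex ids of O(log n) bits each.
        self.parent = list(range(n))
        # At most n - 1 forest edges, each stored as two vertex ids.
        self.forest = []

    def _find(self, v):
        # Find the component representative, shortening paths as we go.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def insert(self, u, v):
        # Keep the edge only if it connects two different components.
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            self.parent[ru] = rv
            self.forest.append((u, v))

    def query(self):
        # The stored edges form a spanning forest of everything inserted so far.
        return list(self.forest)
```

Edges that close a cycle are simply dropped, which is exactly what stops working once deletions are allowed: a dropped edge may be needed later when a forest edge is deleted.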

Fully dynamic spanning forest
Maintain a dynamic graph on n vertices, supporting
• edge insertions,
• edge deletions, and
• spanning forest queries
Goal: minimize space
Theorem (Ahn, Guha, McGregor’12) ... solvable using $O(n \log^3 n)$ bits of space with error probability $1/\mathrm{poly}(n)$.
More generally: solvable using $O(n \log(n/\delta) \log^2 n)$ bits of space with error probability $\delta$.
Only two more log factors! Why two more?
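
As a quick check that the two forms of the AGM bound agree, take $\delta = 1/n^c$; the constant $c > 0$ is an assumption introduced only for this illustration:

$$
\log(n/\delta) = \log\left(n^{c+1}\right) = (c+1)\log n
\qquad\Longrightarrow\qquad
O\left(n \log(n/\delta) \log^2 n\right) = O\left(n \log^3 n\right).
$$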

Main result I
Theorem (This paper) Any data structure for fully dynamic spanning forest with error probability $\delta$ must use $\Omega(n \log(n/\delta) \log^2 n)$ bits of memory, for any $2^{-n^{0.99}} < \delta < 0.99$.
When $\delta$ is a constant: $\Omega(n \log^3 n)$ bits of space, so exactly two more log factors are needed!
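
To see what the theorem gives at the other end of the allowed range, here is a worked instance (an illustration, not a statement from the slides). As $\delta$ approaches $2^{-n^{0.99}}$ from above:

$$
\log(n/\delta) = \log n + \log(1/\delta) \approx \log n + n^{0.99} = \Theta\left(n^{0.99}\right)
\qquad\Longrightarrow\qquad
\Omega\left(n \log(n/\delta) \log^2 n\right) = \Omega\left(n^{1.99} \log^2 n\right).
$$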

Simultaneous communication
The [Ahn, Guha, McGregor’12] solution also solves the following n-player communication problem.
A (fixed) graph on n vertices is given to n players with shared randomness:
• each player only sees one vertex and its neighborhood
• each player sends a message to a referee
• referee outputs a spanning forest with probability $1 - \delta$
Goal: minimize communication (compute a global function given small “sketches” of “local information”)
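
To make the model concrete, here is a minimal sketch of the trivial protocol, in which every player sends its whole neighborhood as an n-bit vector. It is not the AGM protocol; the function names `player_message` and `referee` are hypothetical and chosen only for this illustration. It costs n bits per player, which is the baseline that polylogarithmic sketches improve on.

```python
def player_message(neighbors, n):
    """A player sees only its own neighborhood; here it sends all of it (n bits)."""
    return [1 if u in neighbors else 0 for u in range(n)]


def referee(messages):
    """The referee sees only the n messages and outputs a spanning forest."""
    n = len(messages)
    parent = list(range(n))  # union-find over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for v, bits in enumerate(messages):
        for u, bit in enumerate(bits):
            if bit:
                rv, ru = find(v), find(u)
                if rv != ru:  # edge (v, u) joins two components
                    parent[rv] = ru
                    forest.append((v, u))
    return forest
```

The players never talk to each other and send a single message each, which is what "simultaneous" means here. This trivial protocol uses no randomness and never errs.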

AGM sketch for simultaneous communication
A graph on n vertices is given to n players with shared randomness:
• each player only sees one vertex and its neighborhood
• each player sends a message to a referee
• referee outputs a spanning forest with probability $1 - \delta$
Goal: minimize communication
Theorem (AGM’12) ... solvable using (worst-case) $O(\log(n/\delta) \log^2 n)$ bits of communication per player with error probability $\delta$.
Trivial: $\Omega(\log n)$, since the referee has to learn $\Omega(n \log n)$ bits
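
The trivial bound can be spelled out with a counting argument. This is a sketch for the zero-error case and is not taken from the slides. When the input graph is itself a tree, the referee must output exactly that tree, and by Cayley's formula there are $n^{n-2}$ labeled trees on $n$ vertices:

$$
\text{total bits received} \;\ge\; \log_2\left(n^{n-2}\right) = (n-2)\log_2 n
\qquad\Longrightarrow\qquad
\text{average bits per player} \;\ge\; \frac{(n-2)\log_2 n}{n} = \Omega(\log n).
$$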

Main result II
Theorem (This paper) Any simultaneous communication protocol for spanning forest with error probability 0.99 must use $\Omega(\log^3 n)$ bits of communication on average.
Exactly two more log factors are needed than the trivial information-theoretic lower bound.
Open: higher lower bounds when error probability $\delta$ is lower?

Graph sketching for spanning forest
[AGM’12] designed a (randomized) linear sketch $S : \mathbb{N}^{n^2} \to \mathbb{N}^{O(n \log^2 n)}$ such that
• $S$ is a linear mapping with poly-bounded coefficients
• $S(G)$ is a concatenation of $S_1(G), S_2(G), \ldots, S_n(G)$; each $S_i(G)$ has $O(\log^2 n)$ dimensions and is computed from the neighborhood of vertex $i$
• $S(G)$ determines a spanning forest with probability $1 - 1/n^c$
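
The bit counts that follow from these properties can be worked out directly. The step below assumes the input is a simple graph, so the vector being sketched has 0/1 entries; then every coordinate of the sketch is a poly-bounded combination of at most $n^2$ such entries, has magnitude at most $\mathrm{poly}(n)$, and fits in $O(\log n)$ bits:

$$
\underbrace{O(\log^2 n)}_{\text{coordinates of } S_i(G)} \times \underbrace{O(\log n)}_{\text{bits per coordinate}} = O(\log^3 n) \text{ bits per vertex},
\qquad
n \cdot O(\log^3 n) = O(n \log^3 n) \text{ bits in total}.
$$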

Streaming algorithm
Store $S(G)$ in memory:
• update: $S(G \pm (u,v)) = S(G) \pm S((u,v))$
• at end of stream: $S(G)$ determines a spanning forest w.h.p.
Uses $O(n \log^3 n)$ bits of space
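
The update rule relies only on linearity, which the following toy example illustrates. It is not the AGM sketch: the small random integer matrix, the class name `LinearSketchStream`, and its parameters are assumptions made for this illustration, and no spanning forest can be decoded from it. It only shows that a deletion exactly cancels an earlier insertion of the same edge.

```python
import random
from itertools import combinations


class LinearSketchStream:
    """Maintains state = M * x, where x is the 0/1 edge-indicator vector of G."""

    def __init__(self, n, rows, seed=0):
        rng = random.Random(seed)  # stands in for the random choice of the sketch
        pairs = list(combinations(range(n), 2))
        # Column index of each potential edge (u, v) with u < v.
        self.index = {p: j for j, p in enumerate(pairs)}
        # Toy sketch matrix with small integer coefficients.
        self.matrix = [[rng.randint(-3, 3) for _ in pairs] for _ in range(rows)]
        self.state = [0] * rows  # sketch of the empty graph

    def update(self, u, v, sign):
        """sign = +1 inserts edge (u, v); sign = -1 deletes it."""
        j = self.index[(min(u, v), max(u, v))]
        for r, row in enumerate(self.matrix):
            # S(G +/- e) = S(G) +/- S(e): add or subtract column j.
            self.state[r] += sign * row[j]
```

Because the state is a linear function of the current edge set, it depends only on which edges are present at the end of the stream, not on the order or history of updates.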

Communication protocol
Given graph G:
• player $i$ computes $S_i(G)$ and sends it to the referee
• referee concatenates all $S_i(G)$, obtaining $S(G)$
• referee outputs a spanning forest w.h.p.
Uses $O(\log^3 n)$ bits of communication per player
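
The shape of this protocol can be sketched as below. Again this is a toy stand-in rather than the AGM sketch: `player_sketch`, `run_protocol`, the random integer blocks, and the way the shared seed is split per player are all hypothetical choices for illustration, and the decoding step of the referee is omitted.

```python
import random


def player_sketch(i, neighbors, n, dims, seed):
    """Message of player i: a linear function of its neighborhood indicator vector."""
    # Block i of the sketch matrix is derived from the shared seed,
    # so every party could regenerate it without communication.
    rng = random.Random(seed * n + i)
    block = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(dims)]
    # Row * indicator vector = sum of the row's entries at the neighbors.
    return [sum(row[u] for u in neighbors) for row in block]


def run_protocol(adjacency, dims, seed):
    """Simulates all players; the referee receives the concatenated pieces."""
    n = len(adjacency)
    return [player_sketch(i, adjacency[i], n, dims, seed) for i in range(n)]
```

Each message has the same fixed number of coordinates and depends only on the sender's own neighborhood and the shared seed, so changing an edge elsewhere in the graph leaves it untouched.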

Simultaneous communication complexity of spanning forest

Recall...
An n-vertex graph is given to n players with shared randomness:
• each player only sees one vertex and its neighborhood
• each player sends a message to a referee
• referee outputs a spanning forest with probability $1 - \delta$
Goals:
• prove an average player must send $\Omega(\log^3 n)$ bits for constant $\delta$
• prove some player must send $\Omega(\log^3 n)$ bits for $\delta = 1/n^c$
