Graph & Geometry Problems in Data Streams
2009 Barbados Workshop on Computational Complexity
Andrew McGregor
Introduction
Models:
◮ Graph Streams: Stream of edges E = {e1, e2, . . . , em} describes a graph G on n nodes. Estimate properties of G.
◮ Geometric Streams: Stream of points X = {p1, p2, . . . , pm} from some metric space (X, d). Estimate properties of X.
Notes:
◮ Õ is our friend: we'll hide dependence on polylog(m, n) terms.
◮ Assume that each pi can be stored in Õ(1) space and that d(pi, pj) can be calculated if both pi and pj are stored in memory.
◮ The theory isn't as cohesive but we get to cherry-pick results. . .
Outline
Counting Triangles Matching Clustering Graph Distances
Triangles
Problem
Given a stream of edges, estimate the number of triangles T3 up to a factor (1 + ǫ) with probability 1 − δ, given the promise that T3 > t.
Warm-Up
What's an algorithm using O(ǫ^{-2}(n^3/t) log δ^{-1}) space?
Theorem
Ω(n^2) space is required to determine if T3 = 0 (with δ = 1/3).
Theorem (Sivakumar et al. 2002)
Õ(ǫ^{-2}(nm/t)^2 log δ^{-1}) space is sufficient.
Theorem (Buriol et al. 2006)
Õ(ǫ^{-2}(nm/t) log δ^{-1}) space is sufficient.
Lower Bound
Theorem
Ω(n^2) space is required to determine if T3 = 0 when δ = 1/3.
◮ Reduce from set-disjointness: Alice has an n × n binary matrix A, Bob has an n × n binary matrix B. Is Aij = Bij = 1 for some (i, j)? This needs Ω(n^2) bits of communication [Razborov 1992].
◮ Consider the graph G = (V, E) with V = {v1, . . . , vn, u1, . . . , un, w1, . . . , wn} and E = {(vi, ui) : i ∈ [n]}.
◮ Alice runs the algorithm on G and the edges {(ui, wj) : Aij = 1}.
◮ Bob continues running the algorithm on the edges {(vi, wj) : Bij = 1}.
◮ T3 > 0 iff Aij = Bij = 1 for some (i, j).
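The reduction can be checked directly on toy instances (the matrices below are illustrative, not from the slides): the gadget graph contains a triangle precisely when the two matrices intersect.

```python
# Sanity check of the reduction on toy instances: T3 > 0 iff
# Aij = Bij = 1 for some (i, j).

def triangles(edges):
    """Count triangles in the undirected graph given by a set of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # each triangle is counted once per edge, hence the division by 3
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def reduction_graph(A, B):
    n = len(A)
    edges = {(('v', i), ('u', i)) for i in range(n)}  # fixed edges (v_i, u_i)
    edges |= {(('u', i), ('w', j)) for i in range(n) for j in range(n) if A[i][j]}  # Alice's edges
    edges |= {(('v', i), ('w', j)) for i in range(n) for j in range(n) if B[i][j]}  # Bob's edges
    return edges

# intersecting instance (A and B share entry (0, 1)) vs. disjoint instance
print(triangles(reduction_graph([[0, 1], [0, 0]], [[0, 1], [1, 0]])))  # 1
print(triangles(reduction_graph([[1, 0], [0, 0]], [[0, 0], [0, 1]])))  # 0
```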
First Algorithm
Theorem (Sivakumar et al. 2002)
Õ(ǫ^{-2}(nm/T3)^2 log δ^{-1}) space is sufficient.
◮ Given the stream of edges, induce a stream of node-triples: edge (u, v) gives rise to {u, v, w} for each w ∈ V \ {u, v}.
◮ Consider Fk = Σ (freq. of {u,v,w})^k over all triples and note that
(F0, F1, F2)^T = [1 1 1; 1 2 3; 1 4 9] (T1, T2, T3)^T
where Ti is the number of node-triples having exactly i edges in the induced subgraph.
◮ T3 = F0 − 3F1/2 + F2/2, so good approximations of F0, F1, F2 suffice.
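The identity can be verified by brute force on small example graphs (chosen here for illustration): the frequency of a node-triple in the induced stream is exactly the number of edges inside it.

```python
# Brute-force check of T3 = F0 - 3*F1/2 + F2/2 on small hypothetical graphs.
from itertools import combinations

def moment_estimate(nodes, edges):
    edges = {frozenset(e) for e in edges}
    F0 = F1 = F2 = T3 = 0
    for triple in combinations(nodes, 3):
        # freq of a triple in the induced stream = number of edges inside it
        f = sum(1 for e in combinations(triple, 2) if frozenset(e) in edges)
        F0 += (f > 0)     # number of distinct triples appearing in the stream
        F1 += f           # first frequency moment
        F2 += f * f       # second frequency moment
        T3 += (f == 3)
    return T3, F0 - 3 * F1 / 2 + F2 / 2

print(moment_estimate(range(4), [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]))  # K4: (4, 4.0)
print(moment_estimate(range(3), [(0, 1), (1, 2)]))                                  # path: (0, 0.0)
```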
Second Algorithm
Theorem (Buriol et al. 2006)
Õ(ǫ^{-2}(nm/T3) log δ^{-1}) space is sufficient.
◮ Pick an edge ei = (u, v) uniformly at random from the stream.
◮ Pick w uniformly at random from V \ {u, v}.
◮ If ej = (u, w) and ek = (v, w) exist for some j, k > i, return 1; else return 0.
Lemma
The expected outcome of the algorithm is T3/(3m(n − 2)).
◮ Repeat O(ǫ^{-2}(mn/t) log δ^{-1}) times in parallel and scale the average up by 3m(n − 2).
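The success events can be enumerated exhaustively on a toy stream (illustrative, with distinct edges): each triangle contributes exactly one successful choice of (i, w), namely its first edge in the stream together with its third vertex.

```python
# Exhaustive enumeration of the algorithm's random choices (i, w): the number
# of choices that return 1 equals the number of triangles, since only a
# triangle's first stream edge sees both closing edges arrive later.
def successful_choices(stream, nodes):
    hits = 0
    for i, (u, v) in enumerate(stream):
        later = {frozenset(e) for e in stream[i + 1:]}
        for w in nodes:
            if w not in (u, v) and frozenset((u, w)) in later and frozenset((v, w)) in later:
                hits += 1
    return hits

# stream with two triangles, {0,1,2} and {0,1,3}
print(successful_choices([(0, 1), (1, 2), (0, 2), (0, 3), (1, 3)], range(4)))  # 2
```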
Outline
Counting Triangles Matching Clustering Graph Distances
Maximum Weight Matching
Problem
Stream of weighted edges (e, we): find M ⊆ E maximizing Σ_{e∈M} we such that no two edges in M share an endpoint.
Warm-Up
An easy 2-approx. for the unweighted case in Õ(n) space?
Theorem
3 + 2√2 = 5.83. . . approx. in Õ(n) space. Improved to 5.59. . . [Zelke 07] and 5.24. . . [Sarma et al. 09].
Open Problem
Prove a lower bound or find a much better algorithm!
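The warm-up is the classic greedy maximal matching, sketched below: every edge of a maximum matching shares an endpoint with some greedily kept edge, so a maximal matching has at least half the maximum size.

```python
# Greedy maximal matching in one pass and O(n) space: keep an edge iff
# neither endpoint is already matched. A 2-approx. for the unweighted case.
def greedy_matching(stream):
    matched = set()  # endpoints used by the matching so far
    M = []
    for u, v in stream:
        if u not in matched and v not in matched:
            M.append((u, v))
            matched.update((u, v))
    return M

print(greedy_matching([(0, 1), (1, 2), (2, 3), (3, 0)]))  # [(0, 1), (2, 3)]
```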
An Algorithm
◮ At all times maintain a matching M; initially M = ∅.
◮ On seeing e = (u, v), suppose e′ = (u, u1), e′′ = (v, u2) ∈ M.
◮ If we ≥ (1 + γ)(we′ + we′′), then M ← M ∪ {e} \ {e′, e′′}.
For the analysis we use the following definitions to describe the execution of the algorithm:
◮ An edge e kills an edge e′ if e′ was removed when e arrived.
◮ We say an edge is a survivor if it's in the final matching.
◮ For a survivor e, the trail of the dead is T(e) = C1 ∪ C2 ∪ . . ., where C0 = {e} and Ci = ∪_{e′∈Ci−1} {edges killed by e′}.
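The update rule above can be sketched as follows (a direct transcription with parameter gamma; the data layout is my own choice):

```python
# One-pass weighted matching: a new edge evicts its (at most two) conflicting
# matching edges when it is a (1 + gamma) factor heavier than their total.
def stream_matching(stream, gamma):
    owner = {}  # endpoint -> (edge, weight) of the matching edge covering it
    for (u, v), w in stream:
        conflicts = {owner[x] for x in (u, v) if x in owner}
        if w >= (1 + gamma) * sum(cw for _, cw in conflicts):
            for (a, b), _ in conflicts:  # the new edge kills these
                owner.pop(a, None)
                owner.pop(b, None)
            owner[u] = owner[v] = ((u, v), w)
    return set(owner.values())

# the middle edge is heavy enough to displace the first; the last is not
print(stream_matching([((0, 1), 1), ((1, 2), 10), ((2, 3), 1)], 0.5))
# {((1, 2), 10)}
```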
Analysis
Lemma
Let S be the set of survivors and w(S) the weight of the final matching.
1. w(T(S)) ≤ w(S)/γ
2. opt ≤ (1 + γ)(w(T(S)) + 2w(S))
The approximation factor is 1/γ + 3 + 2γ, and γ = 1/√2 gives the result.
Proof.
1. Consider e ∈ S. Since each e′ ∈ Ci−1 kills edges of total weight at most we′/(1 + γ),
(1 + γ) w(T(e)) = Σ_{i≥1} (1 + γ) w(Ci) ≤ Σ_{i≥0} w(Ci) = w(T(e)) + w(e).
2. Charge the weights of the edges in opt to S ∪ T(S) such that each edge e ∈ T(S) is charged at most (1 + γ)w(e) and each edge e ∈ S is charged at most 2(1 + γ)w(e).
Outline
Counting Triangles Matching Clustering Graph Distances
k-center
Problem
Given a stream of distinct points X = {p1, . . . , pn} from a metric space (X, d), find the set of k points Y ⊂ X that minimizes max_i min_{y∈Y} d(pi, y).
Warm-Up
◮ Find a 2-approx. if you're given opt.
◮ Find a (2 + ǫ)-approx. if you're given that a ≤ opt ≤ b.
Theorem (Khuller and McCutchen 2009, Guha 2009)
(2 + ǫ)-approx. for metric k-center in Õ(kǫ^{-1} log ǫ^{-1}) space.
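The warm-up with a known guess L ≥ opt can be sketched as the basic single-guess routine that the next slide instantiates at many guesses (a minimal version; the distance function is passed in explicitly):

```python
# Basic k-center routine for one guess L: open a point as a new center only
# if it is more than 2L from every existing center. If opt <= L, no optimal
# cluster can contain two opened centers, so at most k centers open and every
# point ends within 2L of one; otherwise the instantiation may "go bad".
def basic_kcenter(stream, k, L, d):
    centers = []
    for p in stream:
        if all(d(p, c) > 2 * L for c in centers):
            centers.append(p)
            if len(centers) > k:
                return None  # guess went bad: L < opt
    return centers

line = lambda a, b: abs(a - b)  # points on the real line
print(basic_kcenter([0, 1, 10, 11], k=2, L=1, d=line))    # [0, 10]
print(basic_kcenter([0, 1, 10, 11], k=2, L=0.1, d=line))  # None
```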
k-center: Algorithm and Analysis
◮ Consider the first k + 1 points: this gives a lower bound a on opt.
◮ Instantiate the basic algorithm with guesses ℓ1 = a, ℓ2 = (1 + ǫ)a, ℓ3 = (1 + ǫ)^2 a, . . . , ℓ_{1+t} = O(ǫ^{-1})a.
◮ Say an instantiation goes bad if it tries to open a (k + 1)-th center.
◮ Suppose the instantiation with guess ℓ goes bad when processing the (j + 1)-th point.
◮ Let q1, . . . , qk be the centers chosen so far.
◮ Then p1, . . . , pj are all at most 2ℓ from some qi.
◮ The optimum for {q1, . . . , qk, pj+1, . . . , pn} is at most opt + 2ℓ.
◮ Hence, an instantiation with guess 2ℓ/ǫ only incurs a small error if we use {q1, . . . , qk, pj+1, . . . , pn} rather than {p1, . . . , pn}.
Outline
Counting Triangles Matching Clustering Graph Distances
Distance Estimation
Problem
A stream of unweighted edges E defines a shortest-path graph metric dG : V × V → N. For u, v ∈ V, estimate dG(u, v).
Definition
An α-spanner of a graph G = (V, E) is a subgraph H = (V, E′) such that for all u, v: dG(u, v) ≤ dH(u, v) ≤ α dG(u, v).
Warm-Up
A (2t − 1)-spanner using Õ(n^{1+1/t}) space.
Theorem (Elkin 2007)
A (2t − 1)-stretch spanner using Õ(n^{1+1/t}) space with constant update time.
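One standard way to realize the warm-up is the greedy spanner, sketched below as a streaming pass (the BFS check makes it slow but illustrates the idea): keep (u, v) only if the edges kept so far give no u-v path of length at most 2t − 1; the kept graph then has stretch at most 2t − 1 and girth > 2t, hence Õ(n^{1+1/t}) edges.

```python
# Greedy (2t-1)-spanner: discard an arriving edge if the current spanner
# already connects its endpoints by a path of length <= 2t - 1.
from collections import deque

def dist(adj, s, goal, limit):
    """BFS distance from s to goal, exploring only up to depth `limit`."""
    seen = {s: 0}
    q = deque([s])
    while q:
        x = q.popleft()
        if x == goal:
            return seen[x]
        if seen[x] < limit:
            for y in adj.get(x, ()):
                if y not in seen:
                    seen[y] = seen[x] + 1
                    q.append(y)
    return float('inf')

def greedy_spanner(stream, t):
    adj, H = {}, []
    for u, v in stream:
        if dist(adj, u, v, 2 * t - 1) > 2 * t - 1:
            H.append((u, v))
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return H

# for t = 2 the third triangle edge is redundant (a path of length 2 exists)
print(greedy_spanner([(0, 1), (1, 2), (0, 2)], t=2))  # [(0, 1), (1, 2)]
```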
Towards better results if you're allowed multiple passes. . .
Problem
Can we get a better approximation for dG(u, v) with multiple passes?
Warm-Up
Find dG(u, v) exactly in Õ(n^{1+γ}) space and Õ(n^{1−γ}) passes.
Theorem
O(k)-approx. in Õ(n) space with O(n^{1/k}) passes.
Theorem (via Thorup, Zwick 2006)
(1 + ǫ)-approx. in Õ(n) space with n^{O(log ǫ^{-1})/log log n} passes.
Ramsey Partition Approach
Definition (Mendel, Naor 2006)
A Ramsey partition P∆ is a random partition of the metric space. Each cluster has diameter at most ∆ and, for t ≤ ∆/8,
Pr(BX(x, t) ∈ P∆) ≥ (|BX(x, ∆/8)| / |BX(x, ∆)|)^{16t/∆} ≥ (1/n)^{16t/∆}.
Can be constructed in the stream model in Õ(n) space and O(∆) passes.
Algorithm
1. Sample "beacons" b1, . . . , b_{n^{1−1/k}}, including s and t, from V.
2. Repeat O(n^{1/k} log n) times: