SLIDE 1
An introduction to chaining, and applications to sublinear algorithms
Jelani Nelson
Harvard
August 28, 2015
SLIDE 2
SLIDE 3
What’s this talk about?
Given a collection of random variables X1, X2, . . . ,, we would like to say that maxi Xi is small with high probability. (Happens all
- ver computer science, e.g. “Chernion” (Chernoff+Union) bound)
SLIDE 4
What’s this talk about?
Given a collection of random variables X1, X2, . . . ,, we would like to say that maxi Xi is small with high probability. (Happens all
- ver computer science, e.g. “Chernion” (Chernoff+Union) bound)
Today’s topic: Beating the Union Bound
SLIDE 5
What’s this talk about?
Given a collection of random variables X1, X2, . . . , we would like to say that maxi Xi is small with high probability. (Happens all over computer science, e.g. the “Chernion” (Chernoff+Union) bound.)
Today’s topic: Beating the Union Bound
Disclaimer: This is an educational talk, about ideas which aren’t mine.
SLIDE 10
A first example
- T ⊂ B_{ℓ2^n} (the unit ball of ℓ2^n)
- Random variables (Zx)x∈T, where Zx = ⟨g, x⟩ for a vector g with i.i.d. N(0, 1) entries
- Define the gaussian mean width g(T) = Eg supx∈T Zx
- How can we bound g(T)?
- This talk: four progressively tighter ways to bound g(T), then applications of the techniques to some TCS problems
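As a concrete warm-up (not from the talk), the gaussian mean width of a small finite T is easy to estimate by Monte Carlo. A minimal numpy sketch; the choice of T (100 random unit vectors) and all sizes are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 50, 2000

# T: 100 random unit vectors (a finite subset of the ell_2 unit ball)
T = rng.standard_normal((100, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)

# g(T) = E_g sup_{x in T} <g, x>, estimated by averaging over fresh draws of g
gT = np.mean([np.max(T @ rng.standard_normal(n)) for _ in range(trials)])

# Compare against the union-bound level sqrt(2 log |T|) from the next slides
print(gT, np.sqrt(2 * np.log(len(T))))
```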
SLIDE 13
Gaussian mean width bound 1: union bound
- g(T) = E supx∈T Zx = E supx∈T ⟨g, x⟩
- Zx is a gaussian with variance one
- E supx∈T Zx = ∫₀^∞ P(supx∈T Zx > u) du
             = ∫₀^{u∗} P(supx∈T Zx > u) du  [integrand ≤ 1]
               + ∫_{u∗}^∞ P(supx∈T Zx > u) du  [integrand ≤ |T|·e^{−u²/2} by the union bound]
             ≤ u∗ + |T| · e^{−u∗²/2}
             ≲ √(log |T|)  (setting u∗ = √(2 log |T|))
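For independent gaussians the √(2 log |T|) level is essentially tight, which a quick simulation (an illustration, not part of the slides; sizes and seed arbitrary) confirms:

```python
import numpy as np

rng = np.random.default_rng(1)

# E max of N i.i.d. N(0,1)'s versus the union-bound level sqrt(2 log N)
results = {}
for N in (10, 100, 1000):
    emax = np.mean([np.max(rng.standard_normal(N)) for _ in range(2000)])
    results[N] = (emax, np.sqrt(2 * np.log(N)))
    print(N, results[N])
```

The empirical maximum sits below but within a constant factor of √(2 log N), and the gap narrows as N grows.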
SLIDE 19
Gaussian mean width bound 2: ε-net
- g(T) = E supx∈T ⟨g, x⟩
- Let Sε be an ε-net of (T, ℓ2)
- ⟨g, x⟩ = ⟨g, x′⟩ + ⟨g, x − x′⟩, where x′ = argmin_{y∈Sε} ‖x − y‖2
- g(T) ≤ g(Sε) + Eg supx∈T ⟨g, x − x′⟩
       ≤ g(Sε) + ε · Eg ‖g‖2          [Cauchy–Schwarz: ⟨g, x − x′⟩ ≤ ε‖g‖2]
       ≲ √(log |Sε|) + ε · (Eg ‖g‖2²)^{1/2}
       = log^{1/2} N(T, ℓ2, ε) + ε√n  [take Sε of the smallest size N(T, ℓ2, ε)]
- Choose ε to optimize the bound; this can never be worse than the last slide (which amounts to choosing ε = 0)
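A small numerical sanity check of the net bound (not from the talk): build an ε-net greedily for points on the unit circle and compare g(T) against g(Sε) + ε√n. All sizes, ε, and the seed are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

def greedy_eps_net(points, eps):
    # Greedy construction: keep a point only if it is > eps away from every
    # point already kept; every discarded point is then within eps of some
    # kept point, so the kept set is an eps-net.
    net = []
    for p in points:
        if all(np.linalg.norm(p - q) > eps for q in net):
            net.append(p)
    return np.array(net)

# T: 500 points on the unit circle (subset of the ell_2 unit ball, n = 2)
theta = rng.uniform(0, 2 * np.pi, 500)
T = np.column_stack([np.cos(theta), np.sin(theta)])

eps = 0.3
S = greedy_eps_net(T, eps)

def mean_width(X, trials=3000):
    # Monte Carlo estimate of E_g sup_{x in X} <g, x>
    return np.mean([np.max(X @ rng.standard_normal(2)) for _ in range(trials)])

gT, gS = mean_width(T), mean_width(S)
# Net bound from the slide: g(T) <= g(S_eps) + eps * E||g||_2 <= g(S_eps) + eps*sqrt(n)
print(len(S), gT, gS + eps * np.sqrt(2))
```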
SLIDE 24
Gaussian mean width bound 3: ε-net sequence
- Sk is a (1/2^k)-net of T, k ≥ 0
  πk x is the closest point in Sk to x ∈ T; ∆k x = πk x − πk−1 x
- wlog |T| < ∞ (else apply this slide to an ε-net of T for ε small)
- ⟨g, x⟩ = ⟨g, π0 x⟩ + Σ_{k=1}^∞ ⟨g, ∆k x⟩
- g(T) ≤ Eg supx∈T ⟨g, π0 x⟩ + Σ_{k=1}^∞ Eg supx∈T ⟨g, ∆k x⟩
- |{∆k x : x ∈ T}| ≤ N(T, ℓ2, 1/2^k) · N(T, ℓ2, 1/2^{k−1}) ≤ (N(T, ℓ2, 1/2^k))²
- g(T) ≲ Σ_{k=1}^∞ (1/2^k) · log^{1/2} N(T, ℓ2, 1/2^k)
       ≲ ∫₀^∞ log^{1/2} N(T, ℓ2, u) du   (Dudley’s theorem)
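The discrete Dudley sum is easy to evaluate numerically for a concrete set. A sketch (not from the talk) using greedy nets to upper-bound the covering numbers, with points on the unit circle as an arbitrary illustrative T:

```python
import numpy as np

rng = np.random.default_rng(3)

def covering_number(points, eps):
    # Size of a greedy eps-net: an upper bound (within constants) on N(T, l2, eps)
    net = []
    for p in points:
        if all(np.linalg.norm(p - q) > eps for q in net):
            net.append(p)
    return len(net)

# T: 300 points on the unit circle
theta = rng.uniform(0, 2 * np.pi, 300)
T = np.column_stack([np.cos(theta), np.sin(theta)])

# Dudley-style sum: sum_k 2^{-k} * log^{1/2} N(T, l2, 2^{-k})
dudley = sum(
    2.0 ** -k * np.sqrt(np.log(covering_number(T, 2.0 ** -k)))
    for k in range(1, 10)
)

# Empirical g(T) for comparison (for the circle, g(T) = E||g||_2 ~ 1.25)
gT = np.mean([np.max(T @ rng.standard_normal(2)) for _ in range(3000)])
print(gT, dudley)
```

As the theorem predicts, the empirical width is within a constant factor of the Dudley sum.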
SLIDE 28
Gaussian mean width bound 4: generic chaining
- Again, wlog |T| < ∞. Define T0 ⊆ T1 ⊆ · · · ⊆ Tk∗ = T with
  |T0| = 1, |Tk| ≤ 2^{2^k} (call such a sequence “admissible”)
- Exercise: show Dudley’s theorem is equivalent to
  g(T) ≲ inf_{{Tk} admissible} Σ_{k=1}^∞ 2^{k/2} · supx∈T dℓ2(x, Tk)
  (should pick Tk to be the best ε = ε(k) net of size 2^{2^k})
- Fernique’76∗: can pull the supx outside the sum:
  g(T) ≲ inf_{{Tk}} supx∈T Σ_{k=1}^∞ 2^{k/2} · dℓ2(x, Tk) =: γ2(T, ℓ2)
∗ equivalent upper bound proven by Fernique (who minimized some integral over all measures on T), but reformulated in terms of admissible sequences by Talagrand
SLIDE 31
Gaussian mean width bound 4: generic chaining
Proof of Fernique’s bound
- g(T) ≤ Eg supx∈T ⟨g, π0 x⟩ + Eg supx∈T Σ_{k=1}^∞ ⟨g, ∆k x⟩  (from before; write Yk = ⟨g, ∆k x⟩)
- ∀t: P(Yk > t · 2^{k/2} ‖∆k x‖2) ≤ e^{−t² 2^k / 2}  (gaussian decay)
- P(∃x, k : Yk > t · 2^{k/2} ‖∆k x‖2) ≤ Σ_k (2^{2^k})² · e^{−t² 2^k / 2}  (union over the ≤ (2^{2^k})² possible values of ∆k x)
- Eg supx∈T Σ_k Yk = ∫₀^∞ P(supx∈T Σ_k Yk > u) du
SLIDE 36
Gaussian mean width bound 4: generic chaining
Eg supx∈T Σ_k Yk = ∫₀^∞ P(supx∈T Σ_k Yk > u) du
  = γ2(T, ℓ2) · ∫₀^∞ P(supx∈T Σ_k Yk > t · supx∈T Σ_k 2^{k/2} ‖∆k x‖2) dt
    (change of variables: u = t · supx∈T Σ_k 2^{k/2} ‖∆k x‖2 ≃ t · γ2(T, ℓ2))
  ≤ γ2(T, ℓ2) · [2 + ∫₂^∞ Σ_{k=1}^∞ (2^{2^k})² e^{−t² 2^k / 2} dt]
  ≃ γ2(T, ℓ2)
- Conclusion: g(T) ≲ γ2(T, ℓ2)
- Talagrand: g(T) ≃ γ2(T, ℓ2) (won’t show today)
  (“Majorizing measures theorem”)
SLIDE 41
Are these bounds really different?
- γ2(T, ℓ2) = inf_{{Tk}} supx∈T Σ_{k=1}^∞ 2^{k/2} · dℓ2(x, Tk)
- Dudley: inf_{{Tk}} Σ_{k=1}^∞ 2^{k/2} · supx∈T dℓ2(x, Tk) ≃ ∫₀^∞ log^{1/2} N(T, ℓ2, u) du
- Dudley not optimal: T = B_{ℓ1^n}
  - supx∈B_{ℓ1^n} ⟨g, x⟩ = ‖g‖∞, so g(T) ≃ √(log n)
  - Exercise: come up with an admissible {Tk} yielding γ2 ≲ √(log n) (must exist by majorizing measures)
  - Dudley: log N(B_{ℓ1^n}, ℓ2, u) ≃ (1/u²) log n for u not too small (consider just covering (1/u²)-sparse vectors with u² in each coordinate). Dudley can only give g(B_{ℓ1^n}) ≲ log^{3/2} n.
  - A simple vanilla ε-net argument gives g(B_{ℓ1^n}) ≲ poly(n).
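The √(log n) behavior for the ℓ1 ball is easy to see empirically (an illustration, not from the talk; sizes and seed arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# For T = B_{ell_1^n}: sup_{x in T} <g, x> = max_i |g_i| = ||g||_inf,
# so g(T) = E ||g||_inf ~ sqrt(2 log n) -- far below Dudley's log^{3/2} n.
results = {}
for n in (100, 10_000):
    width = np.mean([np.max(np.abs(rng.standard_normal(n))) for _ in range(500)])
    results[n] = width
    print(n, width, np.sqrt(2 * np.log(n)), np.log(n) ** 1.5)
```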
SLIDE 43
High probability
- So far we have only talked about g(T) = Eg supx∈T Zx.
  But what if we want to know that supx∈T Zx is small whp, not just in expectation?
- Usual approach: bound Eg supx∈T Zx^p for large p and apply Markov (the “moment method”). Moments can be bounded via chaining too; see (Dirksen’13).
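A tiny numeric illustration of the moment method (not from the talk; N, p, the threshold, and the seed are arbitrary): Markov on the p-th moment gives P(X > t) ≤ E[X₊^p]/t^p, which already beats a naive first-moment bound for tail levels t well above E X.

```python
import numpy as np

rng = np.random.default_rng(5)
N, trials, p = 100, 20000, 8

# X = sup of N iid gaussians; moment method: P(X > t) <= E[X_+^p] / t^p
sups = np.array([np.max(rng.standard_normal(N)) for _ in range(trials)])
moment = np.mean(np.clip(sups, 0, None) ** p)
t = 2 * sups.mean()
markov, empirical = moment / t ** p, np.mean(sups > t)
print(empirical, markov)  # empirical tail probability vs. moment-method bound
```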
SLIDE 44
Applications in computer science
- Fast RIP matrices (Candès, Tao’06), (Rudelson, Vershynin’06), (Cheraghchi, Guruswami, Velingker’13), (N., Price, Wootters’14), (Bourgain’14), (Haviv, Regev’15)
- Fast JL (Ailon, Liberty’11), (Krahmer, Ward’11), (Bourgain,
Dirksen, N.’15), (Oymak, Recht, Soltanolkotabi’15)
- Instance-wise JL bounds (Gordon’88), (Klartag,
Mendelson’05), (Mendelson, Pajor, Tomczak-Jaegermann’07), (Dirksen’14)
- Approximate nearest neighbor (Indyk, Naor’07)
- Deterministic algorithm to estimate graph cover time (Ding,
Lee, Peres’11)
- List-decodability of random codes (Wootters’13), (Rudra,
Wootters’14)
- . . .
SLIDE 46
A chaining result for quadratic forms
Theorem (Krahmer, Mendelson, Rauhut’14)
Let A ⊂ R^{n×n} be a family of matrices, and let σ1, . . . , σn be independent subgaussians. Then
E supA∈A | ‖Aσ‖2² − Eσ ‖Aσ‖2² | ≲ γ2²(A, ‖·‖_{ℓ2→ℓ2}) + γ2(A, ‖·‖_{ℓ2→ℓ2}) · ∆F(A) + ∆_{ℓ2→ℓ2}(A) · ∆F(A)
(∆X is the diameter under the X-norm)
Won’t show the proof today, but it is similar to bounding g(T) (with some extra tricks). See http://people.seas.harvard.edu/~minilek/madalgo2015/, Lecture 3.
SLIDE 47
Instance-wise bounds for JL
Corollary (Gordon’88; Klartag, Mendelson’05; Mendelson, Pajor, Tomczak-Jaegermann’07; Dirksen’14)
For T ⊆ S^{n−1} and 0 < ε < 1/2, let Π ∈ R^{m×n} have independent subgaussian entries with mean zero and variance 1/m, for m ≳ (g²(T) + 1)/ε². Then
EΠ supx∈T | ‖Πx‖2² − 1 | < ε
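A quick simulation of the corollary for finite T (so g²(T) ≲ log |T|); the dimensions and seed are arbitrary illustrative choices, and gaussian entries stand in for a general subgaussian distribution:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, N = 200, 120, 50

# T: N points on the sphere S^{n-1}
T = rng.standard_normal((N, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)

# Pi: subgaussian (here gaussian) entries, mean 0, variance 1/m
Pi = rng.standard_normal((m, n)) / np.sqrt(m)

# sup_x | ||Pi x||^2 - 1 | should be small once m >~ g^2(T)/eps^2 ~ (log N)/eps^2
distortion = np.max(np.abs(np.linalg.norm(T @ Pi.T, axis=1) ** 2 - 1))
print(distortion)
```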
SLIDE 53
Instance-wise bounds for JL
Proof of Gordon’s theorem
- For x ∈ T, let Ax denote the m × mn block-diagonal matrix with m copies of the row vector x⊤ = (x1 · · · xn) on the diagonal:
  Ax = (1/√m) · diag(x⊤, x⊤, . . . , x⊤)
- Then ‖Πx‖2² = ‖Ax σ‖2², where σ ∈ R^{mn} is formed by concatenating the rows of Π (multiplied by √m).
- ‖Ax − Ay‖_{ℓ2→ℓ2} = ‖A_{x−y}‖_{ℓ2→ℓ2} = (1/√m) · ‖x − y‖2
  ⇒ γ2(A_T, ‖·‖_{ℓ2→ℓ2}) = (1/√m) · γ2(T, ℓ2) ≃ g(T)/√m
- ∆F(A_T) = 1, ∆_{ℓ2→ℓ2}(A_T) = 1/√m
- Thus EΠ supx∈T | ‖Πx‖2² − 1 | ≲ g²(T)/m + g(T)/√m + 1/√m
- Set m ≳ (g²(T) + 1)/ε²
SLIDE 55
Consequences of Gordon’s theorem
m ≳ (g²(T) + 1)/ε²
- |T| < ∞: g²(T) ≲ log |T| (JL)
- T a d-dim subspace: g²(T) ≃ d (subspace embeddings)
- T all k-sparse vectors: g²(T) ≃ k log(n/k) (RIP)
- more applications to constrained least squares, manifold learning, model-based compressed sensing, . . . (see (Dirksen’14) and (Bourgain, Dirksen, N.’15))
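The subspace case g(T) ≃ √d is easy to check directly (an illustration, not from the talk; n, d, and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 100, 20

# T = unit sphere of a random d-dim subspace of R^n.
# sup_{x in T} <g, x> = || Q^T g ||_2 (projection of g onto the subspace),
# whose expectation is ~ sqrt(d).
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))  # orthonormal basis, n x d
gT = np.mean([np.linalg.norm(Q.T @ rng.standard_normal(n)) for _ in range(2000)])
print(gT, np.sqrt(d))
```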
SLIDE 56
Chaining isn’t just for gaussians
SLIDE 58
Chaining without gaussians: RIP (Rudelson, Vershynin’06)
“Restricted isometry property” (RIP), useful in compressed sensing. T = {x : ‖x‖0 ≤ k, ‖x‖2 = 1}.
Theorem (Candès-Tao’06, Donoho’06, Candès’08)
If Π satisfies (ε∗, k)-RIP for ε∗ < √2 − 1, then there is a linear program which, given Πx and Π as input, recovers x̃ in polynomial time such that ‖x − x̃‖2 ≤ O(1/√k) · min_{‖y‖0≤k} ‖x − y‖1.
Of interest to show sampling rows of discrete Fourier matrix is RIP
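The linear program in the theorem is ℓ1-minimization (basis pursuit). A minimal sketch of it via scipy's LP solver, using a dense gaussian Π rather than a sampled Fourier matrix; for an exactly k-sparse x the theorem's error bound is zero, so recovery should be exact (sizes and seed are arbitrary illustrative choices):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(9)
n, m, k = 100, 40, 3

# Gaussian measurement matrix (RIP whp at this m) and a k-sparse signal
Pi = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = Pi @ x

# Basis pursuit as an LP: min sum(u + v)  s.t.  Pi (u - v) = b,  u, v >= 0,
# where x = u - v splits into positive and negative parts
c = np.ones(2 * n)
A_eq = np.hstack([Pi, -Pi])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print(np.linalg.norm(x - x_hat))
```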
SLIDE 60
Chaining without gaussians: RIP (Rudelson, Vershynin’06)
- (Unnormalized) Fourier matrix F, with rows z∗1, . . . , z∗n
- δ1, . . . , δn independent Bernoulli with expectation m/n
- Want (z^{(T)} denotes the restriction of z to the coordinates in T):
  Eδ sup_{T⊂[n], |T|≤k} ‖ I_T − (1/m) Σ_{i=1}^n δi z_i^{(T)} z_i^{(T)∗} ‖ < ε
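An empirical version of this statement (an illustration, not the proof; all sizes and the seed are arbitrary): sample rows of the unitary DFT matrix with probability m/n and check near-isometry on random k-sparse unit vectors.

```python
import numpy as np

rng = np.random.default_rng(8)
n, m, k = 256, 120, 5

# Unitary DFT matrix; keep each row independently with probability m/n,
# rescaled so that E ||Pi x||^2 = ||x||^2
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
rows = rng.random(n) < m / n
Pi = F[rows] * np.sqrt(n / rows.sum())

# Worst distortion over random k-sparse unit vectors (a proxy for the sup over T)
worst = 0.0
for _ in range(200):
    x = np.zeros(n)
    x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    x /= np.linalg.norm(x)
    worst = max(worst, abs(np.linalg.norm(Pi @ x) ** 2 - 1))
print(worst)
```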
SLIDE 61
Chaining without gaussians: RIP (Rudelson, Vershynin’06)
LHS = Eδ sup_{T⊂[n], |T|≤k} ‖ Eδ′ [(1/m) Σ_{i=1}^n δ′_i z_i^{(T)} z_i^{(T)∗}] − (1/m) Σ_{i=1}^n δi z_i^{(T)} z_i^{(T)∗} ‖
    ≤ (1/m) · E_{δ,δ′} sup_T ‖ Σ_{i=1}^n (δ′_i − δi) z_i^{(T)} z_i^{(T)∗} ‖                    (Jensen)
    = √(π/2) · (1/m) · E_{δ,δ′,σ} sup_T ‖ Eg Σ_{i=1}^n |gi| σi (δ′_i − δi) z_i^{(T)} z_i^{(T)∗} ‖
    ≤ √(2π) · (1/m) · E_{δ,g} sup_T ‖ Σ_{i=1}^n gi δi z_i^{(T)} z_i^{(T)∗} ‖         (Jensen + triangle ineq.)
    ≃ (1/m) · Eδ Eg sup_{x∈B₂^{n,k}} | Σ_{i=1}^n gi δi ⟨zi, x⟩² |   (gaussian mean width!)
SLIDE 66
The End