
SLIDE 1

On Local Distributed Sampling and Counting

Yitong Yin (Nanjing University). Joint work with Weiming Feng (Nanjing University).

SLIDE 2

Counting and Sampling

[Jerrum-Valiant-Vazirani ’86]: For self-reducible problems,

  • approx. counting is tractable ⟺ (approx., exact) sampling is tractable.

SLIDE 3

Computational Phase Transition

Sampling almost-uniform independent sets in graphs with maximum degree ∆:

  • [Weitz, STOC’06]: If ∆ ≤ 5, poly-time.
  • [Sly, best paper in FOCS’10]: If ∆ ≥ 6, no poly-time algorithm unless NP=RP.

A phase transition occurs when ∆: 5→6.

Local Computation?

SLIDE 4

Local Computation

the LOCAL model [Linial ’87]: “What can be computed locally?” [Naor, Stockmeyer ’93]

  • Communications are synchronized.
  • In each round, each node can:
    • exchange unbounded messages with all neighbors;
    • perform unbounded local computation;
    • read/write to unbounded local memory.
  • In t rounds: each node can collect information up to distance t.
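As a concrete reference for the model, a minimal Python sketch of one way to simulate synchronized LOCAL rounds; all names (Node, run_local, the flooding messages) are illustrative, not from the talk. It also illustrates the last bullet: after t rounds each node knows exactly its distance-t ball.

```python
# Minimal sketch of a synchronized LOCAL-model simulation (illustrative).

class Node:
    def __init__(self, vid, neighbors):
        self.vid = vid              # unique node identifier
        self.neighbors = neighbors  # ids of adjacent nodes
        self.seen = {vid}           # unbounded local memory

    def step(self, inbox):
        """One synchronized round: read the neighbors' messages,
        update local state, emit one (unbounded) message per neighbor."""
        for msg in inbox.values():
            self.seen |= msg
        # flood everything known so far to every neighbor
        return {u: frozenset(self.seen) for u in self.neighbors}

def run_local(nodes, t):
    """Run t rounds; afterwards each node's `seen` holds exactly the
    ids within graph distance t (information travels one hop per round)."""
    inboxes = {v: {} for v in nodes}
    for _ in range(t):
        outboxes = {v: nodes[v].step(inboxes[v]) for v in nodes}
        inboxes = {v: {u: outboxes[u][v] for u in nodes[v].neighbors}
                   for v in nodes}
    for v in nodes:                 # absorb the final round's messages
        nodes[v].step(inboxes[v])

# e.g. on the path 0-1-2, two rounds suffice for node 0 to see everyone:
nodes = {v: Node(v, nbrs) for v, nbrs in {0: [1], 1: [0, 2], 2: [1]}.items()}
run_local(nodes, 2)
assert nodes[0].seen == {0, 1, 2}   # the distance-2 ball of node 0
```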

SLIDE 5

Example: Sample Independent Set

network G(V,E); µ: the uniform distribution over independent sets of G.

  • Each v∈V returns a Yv ∈ {0,1}, such that Y = (Yv)v∈V ∼ µ;
  • or approximately: dTV(Y, µ) < 1/poly(n).

Y ∈ {0,1}^V indicates an independent set.
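To pin down the target distribution, a brute-force (centralized, exponential-time) reference sampler for µ; it only specifies what the distributed algorithm must emulate, and every name in it is ours, not from the talk.

```python
import itertools
import random

def is_independent(Y, edges):
    """Y maps each vertex to {0,1}; independent iff no edge has both ends 1."""
    return all(not (Y[u] and Y[v]) for u, v in edges)

def sample_uniform_independent_set(V, edges):
    """Enumerate all 2^|V| indicator vectors, keep the independent ones,
    return one uniformly at random: a specification of mu, not an algorithm."""
    candidates = [dict(zip(V, bits))
                  for bits in itertools.product([0, 1], repeat=len(V))]
    ind_sets = [Y for Y in candidates if is_independent(Y, edges)]
    return random.choice(ind_sets)

# e.g. on a triangle, the 4 independent sets {}, {0}, {1}, {2} are equally likely
Y = sample_uniform_independent_set([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
```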

SLIDE 6

Inference (Local Counting)

network G(V,E); µ: the uniform distribution over independent sets of G.

µ^σ_v: the marginal distribution at v conditioning on σ ∈ {0,1}^S:

∀y ∈ {0,1}: µ^σ_v(y) = Pr_{Y∼µ}[Yv = y | YS = σ]

  • Each v ∈ S receives σv as input.
  • Each v ∈ V returns a marginal distribution µ̂^σ_v such that dTV(µ̂^σ_v, µ^σ_v) ≤ 1/poly(n).

Local counting via inference: with Z = # of independent sets,

1/Z = µ(∅) = ∏_{i=1}^{n} Pr_{Y∼µ}[Y_{vi} = 0 | ∀j < i: Y_{vj} = 0]

so n marginal queries recover Z (see the sketch below).
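A minimal sketch of that reduction, assuming a conditional-marginal oracle marginal_zero (in the distributed setting it would be provided by local inference; approximate marginals then yield an approximate Z):

```python
def count_independent_sets(order, marginal_zero):
    """Z = 1 / mu(emptyset), via the telescoping product
        mu(emptyset) = prod_i Pr[Y_{v_i} = 0 | Y_{v_1} = ... = Y_{v_{i-1}} = 0].

    marginal_zero(v, pinned) -> Pr[Y_v = 0 | Y_u = 0 for all u in pinned]
    is an assumed oracle (e.g. implemented by approx. local inference,
    in which case the returned Z is approximate as well).
    """
    prob_empty = 1.0
    pinned = []
    for v in order:                       # v_1, ..., v_n
        prob_empty *= marginal_zero(v, tuple(pinned))
        pinned.append(v)
    return 1.0 / prob_empty               # = Z, the number of ind. sets
```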

SLIDE 7

Gibbs Distribution

network G(V,E) (with pairwise interactions):

  • Each vertex corresponds to a variable with finite domain [q].
  • Each edge e=(u,v)∈E has a matrix (binary constraint) Ae: [q] × [q] → [0,1].
  • Each vertex v∈V has a vector (unary constraint) bv: [q] → [0,1].
  • Gibbs distribution µ: ∀σ∈[q]^V,

µ(σ) ∝ ∏_{e=(u,v)∈E} Ae(σu, σv) · ∏_{v∈V} bv(σv)
SLIDE 8

Gibbs Distribution

  • Gibbs distribution µ: ∀σ∈[q]^V,

µ(σ) ∝ ∏_{e=(u,v)∈E} Ae(σu, σv) · ∏_{v∈V} bv(σv)

  • independent set: Ae = [1 1; 1 0], bv = [1 1]
  • local conflict colorings [Fraigniaud, Heinrich, Kosowski, FOCS’16]: hard constraints Ae: [q] × [q] → {0,1}, bv: [q] → {0,1}
  • in general, soft constraints Ae: [q] × [q] → [0,1], bv: [q] → [0,1] (with pairwise interactions)
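To make the instantiation concrete, a small sketch evaluating the unnormalized Gibbs weight ∏_e Ae(σu, σv) · ∏_v bv(σv), instantiated with the independent-set constraints above; function and variable names are ours.

```python
def gibbs_weight(sigma, edges, A, b):
    """Unnormalized Gibbs weight of a configuration sigma: {vertex: spin}."""
    w = 1.0
    for (u, v) in edges:
        w *= A[(u, v)][sigma[u]][sigma[v]]   # binary constraint A_e
    for v in sigma:
        w *= b[v][sigma[v]]                  # unary constraint b_v
    return w

# Independent sets: A_e = [[1, 1], [1, 0]] forbids both endpoints being 1,
# b_v = [1, 1] is trivial, so mu is uniform over independent sets.
edges = [(0, 1), (1, 2)]
A = {e: [[1, 1], [1, 0]] for e in edges}
b = {v: [1, 1] for v in range(3)}
assert gibbs_weight({0: 1, 1: 0, 2: 1}, edges, A, b) == 1.0  # {0,2} independent
assert gibbs_weight({0: 1, 1: 1, 2: 0}, edges, A, b) == 0.0  # edge (0,1) violated
```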

SLIDE 9

Gibbs Distribution

  • Gibbs distribution µ: ∀σ∈[q]^V,

µ(σ) ∝ ∏_{(f,S)∈F} f(σS)

network G(V,E): each (f,S) ∈ F is a local constraint (factor): f: [q]^S → R≥0, with S ⊆ V and diamG(S) = O(1).

SLIDE 10

A Motivation: Distributed Machine Learning

  • Data are stored in a distributed system.
  • Distributed algorithms for:
    • sampling from a joint distribution (specified by a probabilistic graphical model);
    • inferring according to a probabilistic graphical model.

SLIDE 11

Computational Phase Transition

Sampling almost-uniform independent sets in graphs with maximum degree ∆:

  • [Weitz, STOC’06]: If ∆ ≤ 5, poly-time.
  • [Sly, FOCS’10]: If ∆ ≥ 6, no poly-time algorithm unless NP=RP.

A phase transition occurs when ∆: 5→6.

SLIDE 12

Decay of Correlation

strong spatial mixing (SSM): ∀ boundary condition B ∈ {0,1}^{r-sphere(v)}:

dTV(µ^σ_v, µ^{σ,B}_v) ≤ poly(n) · exp(−Ω(r))

where µ^σ_v is the marginal distribution at v conditioning on σ ∈ {0,1}^S, and µ^{σ,B}_v further conditions on B. (figure: the radius-r ball around v in G, with boundary condition B on the r-sphere)

(SSM holds iff ∆≤5 when µ is the uniform distribution over ind. sets.)

SSM ⟹ approx. inference is solvable in O(log n) rounds in the LOCAL model.
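Spelling out the routine radius calculation behind this implication: taking r = c·log n for a large enough constant c makes the SSM bound negligible, so v may fix an arbitrary boundary condition B on its r-sphere and compute its marginal exactly within the r-ball, an O(log n)-round LOCAL computation.

```latex
% SSM bound at radius r = c \log n, for a sufficiently large constant c:
d_{\mathrm{TV}}\bigl(\mu^{\sigma}_{v},\,\mu^{\sigma,B}_{v}\bigr)
  \le \mathrm{poly}(n)\cdot e^{-\Omega(r)}
  = n^{O(1)}\cdot n^{-\Omega(c)}
  \le \frac{1}{\mathrm{poly}(n)}.
```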

SLIDE 13

Locality of Counting & Sampling

For Gibbs distributions (defined by local factors):

  • SSM / Correlation Decay ⟹ local approx. inference (with additive error);
  • local approx. inference (with additive error) ⟹ local approx. sampling (easy);
  • local approx. inference (with additive error) ⟹ local approx. inference with multiplicative error;
  • local approx. inference (with multiplicative error) ⟹ local exact sampling (O(log² n) factor overhead).

SLIDE 14

Locality of Sampling

SSM / Correlation Decay ⟹ local approx. inference ⟹ local approx. sampling

Each v can compute a µ̂^σ_v within its O(log n)-ball s.t. dTV(µ̂^σ_v, µ^σ_v) ≤ 1/poly(n).

sequential O(log n)-local procedure (see the sketch below):

  • scan vertices in V in an arbitrary order v1, v2, …, vn
  • for i=1,2,…,n: sample Yvi according to µ̂_{vi}( · | Yv1, …, Yvi−1)

It returns a random Y = (Yv)v∈V whose distribution µ̂ ≈ µ, with dTV(µ̂, µ) ≤ 1/poly(n).
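A minimal Python sketch of the sequential procedure, assuming the oracle approx_marginal (returning µ̂_v(1 | ·), computable from the O(log n)-ball around v by SSM-based inference):

```python
import random

def sequential_local_sampler(order, approx_marginal):
    """Scan vertices in an arbitrary order; sample each Y_v from the
    estimated conditional marginal given the already-sampled values.

    approx_marginal(v, partial) -> hat{mu}_v(1 | partial), an assumed
    oracle that only inspects the O(log n)-ball around v.
    """
    Y = {}
    for v in order:                          # v_1, v_2, ..., v_n
        p = approx_marginal(v, dict(Y))      # conditions on Y_{v_1..v_{i-1}}
        Y[v] = 1 if random.random() < p else 0
    return Y    # distributed as hat{mu}, with d_TV(hat{mu}, mu) <= 1/poly(n)
```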

SLIDE 15

Network Decomposition

sequential r-local procedure, r = O(log n):

  • scan vertices in V in an arbitrary order v1, v2, …, vn
  • for i=1,2,…,n: sample Yvi according to µ̂_{vi}( · | Yv1, …, Yvi−1)

(C,D)-network-decomposition of G:

  • classifies vertices into clusters;
  • assigns each cluster a color in [C];
  • each cluster has diameter ≤ D;
  • clusters are properly colored.

(C,D)^r-ND: a (C,D)-ND of G^r.

Given a (C,D)^r-ND, the sequential r-local procedure can be simulated in O(CDr) rounds in the LOCAL model (see the sketch below).
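A sketch of the O(CDr)-round simulation logic, written as centralized pseudocode in Python. In the LOCAL model, same-color clusters run truly in parallel, which is sound because they are non-adjacent in G^r and hence their r-balls do not interact; process_vertex is the assumed per-vertex sampling step from the procedure above.

```python
def simulate_with_nd(clusters, process_vertex):
    """Simulate a sequential r-local procedure given a (C,D)^r-ND.

    clusters: list of (color, vertex_list). Same-color clusters are
    non-adjacent in G^r, so they can be processed in parallel; each
    cluster has diameter <= D in G^r, so a cluster leader can gather its
    r-neighborhood, run the scan locally, and write outputs back in
    O(Dr) rounds. Over C colors this costs O(CDr) LOCAL rounds.

    process_vertex(v, Y) samples Y[v] from the estimated marginal given Y,
    reading only the r-ball around v (assumed oracle).
    """
    Y = {}
    for color, vertex_list in sorted(clusters, key=lambda cl: cl[0]):
        for v in vertex_list:       # sequential inside one cluster
            process_vertex(v, Y)
    return Y
```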

SLIDE 16

Network Decomposition

[Linial, Saks ’93; Ghaffari, Kuhn, Maus ’17]: an (O(log n), O(log n))^r-ND can be constructed in O(r·log² n) rounds w.h.p.

Consequence: any r-local SLOCAL algorithm (∀ ordering π=(v1, v2, …, vn), returns a random vector Y(π)) yields an O(r·log² n)-round LOCAL algorithm that w.h.p. returns the Y(π) for some ordering π.

(C,D)-network-decomposition of G:

  • classifies vertices into clusters;
  • assigns each cluster a color in [C];
  • each cluster has diameter ≤ D;
  • clusters are properly colored.

(C,D)^r-ND: a (C,D)-ND of G^r.

SLIDE 17

Locality of Sampling

SSM / Correlation Decay ⟹ local approx. inference ⟹ local approx. sampling / local exact sampling:

  • local approx. inference with additive error: O(log n)-round;
  • local exact sampling, via local approx. inference with multiplicative error: O(log³ n)-round.

SLIDE 18

Local Exact Sampler

In LOCAL model:

  • Each v∈V returns within fixed t(n) rounds:
    • local output Yv∈{0,1};
    • local failure Fv∈{0,1}.
  • Succeeds w.h.p.: ∑v∈V E[Fv] = O(1/n).
  • Correctness: conditioned on success, Y ∼ µ.
SLIDE 19

Jerrum-Valiant-Vazirani Sampler

[Jerrum-Valiant-Vazirani ’86]: ∃ an efficient algorithm that samples from µ̂ and evaluates µ̂(σ) given any σ ∈ {0,1}^V, with multiplicative error:

∀σ ∈ {0,1}^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}

Self-reduction:

µ(σ) = ∏_{i=1}^{n} µ^{σ1,…,σi−1}_{vi}(σi) = ∏_{i=1}^{n} Z(σ1, …, σi) / Z(σ1, …, σi−1)

Let

µ̂^{σ1,…,σi−1}_{vi}(σi) = Ẑ(σ1, …, σi) / Ẑ(σ1, …, σi−1) ≈ e^{±1/n³} · µ^{σ1,…,σi−1}_{vi}(σi)

where, by approx. counting, e^{−1/2n³} ≤ Ẑ(···)/Z(···) ≤ e^{1/2n³}.
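A sketch of the self-reduction as an evaluator for µ̂; the approximate counter Z_hat is the assumed primitive (per-call multiplicative error e^{±1/2n³}, so each ratio carries error e^{±1/n³} and the n-term product carries e^{±1/n²}, matching the statement above).

```python
def mu_hat(sigma, order, Z_hat):
    """Evaluate hat{mu}(sigma) = prod_i Zhat(sigma_1..sigma_i) /
    Zhat(sigma_1..sigma_{i-1}) by self-reduction.

    Z_hat(pinning) -> approximate count of ind. sets consistent with the
    partial assignment `pinning` (assumed approx.-counting oracle).
    """
    value = 1.0
    pinning = {}
    for v in order:                      # v_1, ..., v_n
        z_prev = Z_hat(dict(pinning))    # Zhat(sigma_1, ..., sigma_{i-1})
        pinning[v] = sigma[v]
        z_cur = Z_hat(dict(pinning))     # Zhat(sigma_1, ..., sigma_i)
        value *= z_cur / z_prev
    return value                         # within e^{+-1/n^2} of mu(sigma)
```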

SLIDE 20

Jerrum-Valiant-Vazirani Sampler

[Jerrum-Valiant-Vazirani ’86]: ∃ an efficient algorithm that samples from µ̂ and evaluates µ̂(σ) given any σ ∈ {0,1}^V, with multiplicative error:

∀σ ∈ {0,1}^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}

The exact sampler: pick Y0 = ∅; sample a random Y ∼ µ̂; accept Y with probability

q = (µ̂(Y0)/µ̂(Y)) · e^{−3/n²} ∈ [e^{−5/n²}, 1],

and fail otherwise. Then ∀σ ∈ {0,1}^V:

Pr[Y = σ ∧ accept] = µ̂(σ) · (µ̂(∅)/µ̂(σ)) · e^{−3/n²} = µ̂(∅) · e^{−3/n²} ∝ { 1 if σ is an ind. set; 0 otherwise }

so, conditioned on acceptance, Y is an exact uniform sample.
SLIDE 21

Boosting Local Inference

SSM ⟹ local approx. inference: each v computes a µ̂^σ_v within its r-ball, with

  • additive error: dTV(µ̂^σ_v, µ^σ_v) ≤ 1/poly(n), or
  • multiplicative error: µ̂^σ_v(0)/µ^σ_v(0), µ̂^σ_v(1)/µ^σ_v(1) ∈ [e^{−1/poly(n)}, e^{1/poly(n)}].

Both are achievable with r = O(log n) (by SSM plus local self-reduction).

boosted sequential r-local sampler, r = O(log n):

  • scan vertices in V in an arbitrary order v1, v2, …, vn
  • for i=1,2,…,n: sample Yvi according to µ̂_{vi}( · | Yv1, …, Yvi−1)

Its output distribution µ̂ has multiplicative error: ∀σ ∈ {0,1}^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}.

SLIDE 22

SLOCAL JVV

pass 1: sample Y ∈ {0,1}^V by the boosted sequential r-local sampler, r = O(log n), whose output distribution µ̂ satisfies ∀σ ∈ [q]^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}.

pass 1’: construct a sequence of ind. sets ∅ = Y^0, Y^1, …, Y^n = Y by scanning the vertices in an arbitrary order v1, v2, …, vn, s.t. ∀ 0 ≤ i ≤ n:

  • Y^i agrees with Y over v1, …, vi;
  • Y^i and Y^{i−1} differ only at vi.

Each vi samples independently a failure bit Fvi ∈ {0,1} with Pr[Fvi = 0] = qvi, where

qvi = (µ̂(Y^{i−1})/µ̂(Y^i)) · e^{−3/n²} ∈ [e^{−5/n²}, 1],

which is O(log n)-local to compute (see the sketch below).

Each v∈V returns:

  • Yv ∈ {0,1} to indicate the ind. set;
  • Fv ∈ {0,1} to indicate failure at v.
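The second pass at a single vertex, as a sketch. Since Y^{i−1} and Y^i differ only at vi, the ratio µ̂(Y^{i−1})/µ̂(Y^i) is O(log n)-local to compute, as the slide asserts; here it is abstracted as an assumed oracle mu_hat_ratio.

```python
import math
import random

def failure_bit(i, mu_hat_ratio, n):
    """Vertex v_i's independent coin in pass 1':
        Pr[F_{v_i} = 0] = q_i = hat{mu}(Y^{i-1}) / hat{mu}(Y^i) * e^{-3/n^2}.

    mu_hat_ratio(i) -> hat{mu}(Y^{i-1}) / hat{mu}(Y^i); Y^{i-1} and Y^i
    differ only at v_i, so this ratio only depends on the O(log n)-ball
    around v_i (assumed oracle). q_i lies in [e^{-5/n^2}, 1]."""
    q = mu_hat_ratio(i) * math.exp(-3 / n**2)
    return 0 if random.random() < q else 1     # F_{v_i}
```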
SLIDE 23

For the SLOCAL JVV sampler (procedure as on SLIDE 22), ∀σ ∈ {0,1}^V:

Pr[Y = σ ∧ ∀i: Fvi = 0] = µ̂(σ) · ∏_{i=1}^{n} qvi = µ̂(σ) · ∏_{i=1}^{n} (µ̂(Y^{i−1})/µ̂(Y^i)) · e^{−3/n²}

= µ̂(σ) · (µ̂(∅)/µ̂(σ)) · e^{−3/n}    (telescoping, with Y^0 = ∅ and Y^n = Y = σ)

= µ̂(∅) · e^{−3/n} ∝ { 1 if σ is an ind. set; 0 otherwise }

SLIDE 24

Network Decomposition

[Linial, Saks ’93; Ghaffari, Kuhn, Maus ’17]: an (O(log n), O(log n))^r-ND can be constructed in O(r·log² n) rounds w.h.p.

Consequence: any r-local SLOCAL algorithm (∀ ordering π=(v1, v2, …, vn), returns a random vector Y(π)) yields an O(r·log² n)-round LOCAL algorithm that w.h.p. returns the Y(π) for some ordering π.

(C,D)-network-decomposition of G:

  • classifies vertices into clusters;
  • assigns each cluster a color in [C];
  • each cluster has diameter ≤ D;
  • clusters are properly colored.

(C,D)^r-ND: a (C,D)-ND of G^r.

SLIDE 25
Local Exact Sampler

[Feng, Sun, Y., PODC’17]: Uniform sampling of ind. sets in graphs with max-degree ∆ ≤ 5:

  • Each v∈V returns in O(log³ n) rounds:
    • local output Yv∈{0,1};
    • local failure Fv∈{0,1}.
  • Succeeds w.h.p.: ∑v∈V E[Fv] = O(1/n).
  • Correctness: conditioned on success, Y ∼ µ.

If ∆≥6, there is an infinite sequence of graphs G with diam(G) = n^{Ω(1)} such that even approx. sampling ind. sets requires Ω(diam) rounds.

SLIDE 26

Locality of Sampling

For Gibbs distributions (defined by local factors):

  • SSM / Correlation Decay ⟹ (exponential decay) local approx. inference with additive error: O(log n)-round;
  • local approx. inference ⟹ (easy) local approx. sampling;
  • local approx. inference with multiplicative error ⟹ (O(log² n) factor) local exact sampling: O(log³ n)-round.

SLIDE 27

Computational phase transitions hold for local computation!

SLIDE 28

Algorithmic Implications

  • O(√∆ log³ n)-round distributed algorithm for sampling matchings in graphs with max-degree Δ;
  • O(log³ n)-round distributed algorithms for sampling:
    • hardcore model (weighted independent sets) in the uniqueness regime;
    • antiferromagnetic Ising model in the uniqueness regime;
    • antiferromagnetic 2-spin systems in the uniqueness regime;
    • weighted hypergraph matchings in the uniqueness regime;
    • uniform q-coloring/list-coloring when q > 1.763…Δ in triangle-free graphs with max-degree Δ;
    • … …

(due to the state of the art of strong spatial mixing)

SLIDE 29

Thank you!