Modified log-Sobolev inequalities for strongly log-concave distributions


  1. Modified log-Sobolev inequalities for strongly log-concave distributions Heng Guo (University of Edinburgh) Tsinghua University Jun 25th, 2019 Joint with Mary Cryan and Giorgos Mousa (Edinburgh)

  2. Strongly log-concave distributions

  3. Discrete log-concave distribution. What is the correct definition of a log-concave distribution? What about 1 dimension: for π : [n] → R_⩾0, is it π(i+1) π(i−1) ⩽ π(i)²? Consider π(1) = 1/2, π(n) = 1/2, and all other π(i) = 0. This distribution satisfies the condition, but it is not even unimodal. And what about high dimensions?
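
A quick numerical check of this counterexample (a minimal sketch in Python; the value n = 10 is my choice, everything else is from the slide):

    # Check that pi(1) = pi(n) = 1/2 with all other pi(i) = 0 satisfies
    # pi(i+1) * pi(i-1) <= pi(i)^2 at every interior point, yet has two
    # separated modes, so it is not unimodal.
    n = 10
    pi = [0.0] * (n + 1)                  # index 0 unused; support is {1, ..., n}
    pi[1], pi[n] = 0.5, 0.5

    condition_holds = all(pi[i + 1] * pi[i - 1] <= pi[i] ** 2
                          for i in range(2, n))
    modes = [i for i in range(1, n + 1)
             if pi[i] > 0 and all(pi[j] < pi[i]
                                  for j in (i - 1, i + 1) if 1 <= j <= n)]
    print(condition_holds)                # True
    print(modes)                          # [1, 10]: two modes, hence not unimodal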

  4. Strongly log-concave polynomials. A polynomial p ∈ R_⩾0[x_1, …, x_n] is log-concave (at x) if the Hessian ∇² log p(x) is negative semi-definite, which implies that ∇² p(x) has at most one positive eigenvalue. A polynomial p ∈ R_⩾0[x_1, …, x_n] is strongly log-concave if for any index set I ⊆ [n], ∂^I p is log-concave at 1. Strong log-concavity was originally introduced by Gurvits (2009) and is equivalent to: completely log-concave polynomials (Anari, Oveis Gharan, and Vinzant, 2018); Lorentzian polynomials (Brändén and Huh, 2019+).
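
To see the eigenvalue criterion in action, here is a small numerical check (a sketch; the polynomial p = x1 x2 + x1 x3 + x2 x3 is my choice of example, not from the slides):

    import numpy as np

    # p(x) = x1*x2 + x1*x3 + x2*x3 is homogeneous of degree 2, so its Hessian
    # is the constant matrix with zeros on the diagonal and ones off it.
    H = np.ones((3, 3)) - np.eye(3)
    eigenvalues = np.linalg.eigvalsh(H)
    print(eigenvalues)                       # [-1. -1.  2.]
    print(int(np.sum(eigenvalues > 1e-9)))   # 1: at most one positive eigenvalue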

  5. Strongly log-concave distributions. A distribution π : 2^[n] → R_⩾0 is strongly log-concave if so is its generating polynomial g_π(x) = ∑_{S ⊆ [n]} π(S) ∏_{i ∈ S} x_i. An important example of homogeneous strongly log-concave distributions is the uniform distribution over bases of a matroid (Anari, Oveis Gharan, and Vinzant 2018; Brändén and Huh 2019+).
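
As a concrete instance of the definition, the generating polynomial of the uniform distribution over spanning trees of a triangle can be assembled directly (a sketch; representing π as a dict over frozensets is my convention):

    from itertools import combinations

    # Uniform distribution over the bases of the graphic matroid of a
    # triangle: the spanning trees are exactly the 2-edge subsets of 3 edges.
    bases = [frozenset(S) for S in combinations(range(3), 2)]
    pi = {S: 1.0 / len(bases) for S in bases}

    def g(x):
        """Generating polynomial g_pi(x) = sum_S pi(S) * prod_{i in S} x_i."""
        total = 0.0
        for S, weight in pi.items():
            term = weight
            for i in S:
                term *= x[i]
            total += term
        return total

    print(g([1.0, 1.0, 1.0]))   # 1.0: g_pi(1) sums the probabilities

Up to the factor 1/3 this is the polynomial whose Hessian was checked in the previous sketch.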

  6. Matroid. A matroid M = (E, I) consists of a finite ground set E and a collection I of subsets of E (independent sets) such that: • ∅ ∈ I; • if S ∈ I and T ⊆ S, then T ∈ I (downward closed); • if S, T ∈ I and |S| > |T|, then there exists an element i ∈ S \ T such that T ∪ {i} ∈ I. Maximum independent sets are the bases. For any two bases, there is a sequence of exchanges of ground set elements from one to the other. Let n = |E| and let r be the rank, namely the size of any basis.
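
The axioms can be verified exhaustively for a tiny example (a sketch; the uniform matroid U(2, 4), whose independent sets are all subsets of size at most 2 of a 4-element ground set, is my choice, not from the slides):

    from itertools import combinations

    E = range(4)
    I = {frozenset(S) for k in range(3) for S in combinations(E, k)}

    def exchange_axiom_holds(indep):
        """|S| > |T| must yield some i in S \\ T with T + {i} independent."""
        return all(any(T | {i} in indep for i in S - T)
                   for S in indep for T in indep if len(S) > len(T))

    print(frozenset() in I)                       # True: contains the empty set
    print(all(frozenset(T) in I                   # True: downward closed,
              for S in I if S                     # checked one level at a time
              for T in combinations(S, len(S) - 1)))
    print(exchange_axiom_holds(I))                # True: the exchange axiom
    print(len([S for S in I if len(S) == 2]))     # 6 = C(4, 2) bases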

  7. Example — graphic matroids. Spanning trees of a graph form the bases of its graphic matroid. Nelson (2018): almost all matroids are non-representable!

  8. Alternative characterisation for SLC. Real stable polynomials (and strongly Rayleigh distributions) capture only “balanced” matroids, whereas SLC polynomials capture all matroids. Brändén and Huh (2019+): an r-homogeneous multiaffine polynomial p with non-negative coefficients is strongly log-concave if and only if: • the support of p is a matroid; • after taking r − 2 partial derivatives, the resulting quadratic is real stable or 0. (Real stable: p(x) ≠ 0 whenever ℑ(x_i) > 0 for all i.)

  9. Bases-exchange walk. The following Markov chain P_BX,π converges to a homogeneous SLC distribution π: 1. remove an element uniformly at random from the current basis (call the resulting set S); 2. add i ∉ S with probability proportional to π(S ∪ {i}). The implementation of the second step may be non-trivial. The mixing time measures the convergence rate of a Markov chain: t_mix(P, ε) := min { t | ∥P^t(x_0, ·) − π∥_TV ⩽ ε }.
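
A literal implementation of one step (a sketch; names are mine, and finding the candidates in step 2 by scanning the whole support of π is the naive approach, which is exactly why the second step is non-trivial in general):

    import random
    from itertools import combinations

    def bases_exchange_step(B, pi):
        """One step of P_BX: drop a uniform element, re-add proportionally to pi."""
        e = random.choice(sorted(B))      # 1. remove an element u.a.r.
        S = B - {e}
        # 2. add i not in S with probability proportional to pi(S + {i});
        #    brute-force scan over the support, viable only for tiny examples.
        candidates, weights = [], []
        for basis, weight in pi.items():
            if len(basis) == len(B) and S < basis:
                (i,) = basis - S
                candidates.append(i)
                weights.append(weight)
        i = random.choices(candidates, weights=weights)[0]
        return S | {i}

    # Usage: the uniform distribution over spanning trees of a triangle.
    pi = {frozenset(S): 1 / 3 for S in combinations(range(3), 2)}
    B = frozenset({0, 1})
    for _ in range(5):
        B = bases_exchange_step(B, pi)
    print(sorted(B))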

  10. Example — bases-exchange. On the graphic matroid of a graph, one step of the walk is: 1. remove an edge uniformly at random; 2. add back one of the available choices uniformly at random.


  11. Example — bases-exchange. 1. Remove an edge uniformly at random; 2. add back one of the two choices uniformly at random. If we encode the state as a binary string, then this is just the lazy random walk on the Boolean hypercube {0, 1}^r. (The rank of this matroid is r and the ground set has size n = 2r.) The mixing time is Θ(r log r).

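Under this encoding, removing a uniform element picks a uniform coordinate and adding back one of the two choices resamples its bit, so one step of the walk is (a sketch; the simulation parameters are my choice):

    import random

    def lazy_hypercube_step(state):
        """One bases-exchange step on the rank-r pair matroid, as a bit string."""
        j = random.randrange(len(state))           # which pair is broken up
        return state[:j] + (random.randint(0, 1),) + state[j + 1:]

    # Coupon collector: after about r log r steps, every coordinate has been
    # resampled at least once with good probability, matching Theta(r log r).
    state = (0,) * 16
    for _ in range(100):
        state = lazy_hypercube_step(state)
    print(state)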

  12. Main result — mixing time. Theorem (mixing time): for any r-homogeneous strongly log-concave distribution π, t_mix(P_BX,π, ε) ⩽ r (log log (1/π_min) + log (1/(2ε²))), where π_min = min_{x ∈ Ω} π(x). Previously, Anari, Liu, Oveis Gharan, and Vinzant (2019) showed t_mix(P_BX,π, ε) ⩽ r (log (1/π_min) + log (1/ε)). E.g. for the uniform distribution over bases of matroids (with n elements and rank r), our bound is O(r (log r + log log n)), whereas the previous bound is O(r² log n). The bound is asymptotically optimal, as shown by the previous example.
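
To see the gap numerically, both bounds can be evaluated in log space for a concrete instance (a sketch; n = 10⁶, r = 100, ε = 0.01 are my choices, and log(1/π_min) is bounded by log C(n, r), computed via lgamma to avoid overflow):

    import math

    n, r, eps = 10**6, 100, 0.01
    # Uniform over bases: pi_min >= 1 / C(n, r), so log(1/pi_min) <= log C(n, r).
    log_inv_pi_min = (math.lgamma(n + 1) - math.lgamma(r + 1)
                      - math.lgamma(n - r + 1))

    new_bound = r * (math.log(log_inv_pi_min) + math.log(1 / (2 * eps ** 2)))
    old_bound = r * (log_inv_pi_min + math.log(1 / eps))
    print(f"new: {new_bound:.0f}")   # about 1.5e3, i.e. O(r (log r + log log n))
    print(f"old: {old_bound:.0f}")   # about 1.0e5, i.e. O(r^2 log n)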

  13. Main result — concentration. Theorem (concentration bounds): let π and P_BX,π be as before, and let Ω be the support of π. For any observable function f : Ω → R and any a ⩾ 0, Pr_{x∼π}( |f(x) − E_π f| ⩾ a ) ⩽ 2 exp( −a² / (2r·v(f)) ), where v(f) is the maximum of the one-step variances: v(f) := max_{x ∈ Ω} ∑_{y ∈ Ω} P_BX,π(x, y) (f(x) − f(y))². For a c-Lipschitz function f, v(f) ⩽ c². This generalises the concentration of Lipschitz functions in strongly Rayleigh distributions by Pemantle and Peres (2014); see also Hermon and Salez (2019+).
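
Computing v(f) is direct given the transition matrix; here is a check on the lazy walk on {0, 1}² with f the Hamming weight, which is 1-Lipschitz (a sketch; the example chain is my choice):

    import numpy as np

    def v(P, f):
        """v(f) = max_x sum_y P(x, y) * (f(x) - f(y))^2."""
        return float(np.max(np.sum(P * (f[:, None] - f[None, :]) ** 2, axis=1)))

    # Lazy walk on {0,1}^2 (states 00, 01, 10, 11): stay with probability 1/2,
    # move to each of the two neighbours with probability 1/4.
    P = np.array([[0.50, 0.25, 0.25, 0.00],
                  [0.25, 0.50, 0.00, 0.25],
                  [0.25, 0.00, 0.50, 0.25],
                  [0.00, 0.25, 0.25, 0.50]])
    f = np.array([0.0, 1.0, 1.0, 2.0])   # Hamming weight, 1-Lipschitz
    print(v(P, f))                        # 0.5, consistent with v(f) <= c^2 = 1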

  14. Dirichlet form. For a Markov chain P and two functions f and g over the state space Ω, E_P(f, g) := g^T diag(π) L f, where L := I − P is the Laplacian. For reversible Markov chains, E_P(f, g) = (1/2) ∑_{x,y ∈ Ω} π(x) P(x, y) (f(x) − f(y)) (g(x) − g(y)).
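
The two expressions can be checked against each other on a small reversible chain (a sketch; the 3-state lazy path walk and the test functions are my choices):

    import numpy as np

    def dirichlet_matrix(P, pi, f, g):
        """E_P(f, g) = g^T diag(pi) (I - P) f."""
        return float(g @ np.diag(pi) @ (np.eye(len(pi)) - P) @ f)

    def dirichlet_sum(P, pi, f, g):
        """Reversible form: (1/2) sum_{x,y} pi(x) P(x,y) (f(x)-f(y)) (g(x)-g(y))."""
        return float(0.5 * np.sum(pi[:, None] * P
                                  * (f[:, None] - f[None, :])
                                  * (g[:, None] - g[None, :])))

    # A lazy walk on a 3-vertex path, reversible w.r.t. pi = (1/4, 1/2, 1/4).
    P = np.array([[0.50, 0.50, 0.00],
                  [0.25, 0.50, 0.25],
                  [0.00, 0.50, 0.50]])
    pi = np.array([0.25, 0.50, 0.25])
    f = np.array([1.0, 2.0, 4.0])
    g = np.array([0.0, 1.0, 1.0])
    print(dirichlet_matrix(P, pi, f, g))   # 0.125
    print(dirichlet_sum(P, pi, f, g))      # 0.125: the two forms agree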

  15. Modified log-Sobolev inequality. Theorem (modified log-Sobolev inequality): for any f : Ω → R_⩾0, E_{P_BX,π}(f, log f) ⩾ (1/r) · Ent_π(f), where Ent_π(f) := E_π(f log f) − E_π f · log E_π f. If we normalise E_π f = 1, then Ent_π(f) = D(π ∘ f ∥ π), the relative entropy (Kullback–Leibler divergence) between π ∘ f (the distribution x ↦ π(x) f(x)) and π. Both main results are consequences of this inequality.
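
A numerical spot-check of the inequality (a sketch; the chain is the lazy walk on {0, 1}², i.e. the bases-exchange walk for the rank-2 pair matroid from the earlier example, and the random positive test functions are my choice):

    import numpy as np

    def ent(pi, f):
        """Ent_pi(f) = E_pi[f log f] - E_pi[f] * log E_pi[f]."""
        m = float(pi @ f)
        return float(pi @ (f * np.log(f))) - m * np.log(m)

    def dirichlet(P, pi, f, g):
        return float(0.5 * np.sum(pi[:, None] * P
                                  * (f[:, None] - f[None, :])
                                  * (g[:, None] - g[None, :])))

    # Lazy walk on {0,1}^2 = bases-exchange for the rank-2 pair matroid (r = 2).
    P = np.array([[0.50, 0.25, 0.25, 0.00],
                  [0.25, 0.50, 0.00, 0.25],
                  [0.25, 0.00, 0.50, 0.25],
                  [0.00, 0.25, 0.25, 0.50]])
    pi, r = np.full(4, 0.25), 2
    rng = np.random.default_rng(0)
    for _ in range(5):
        f = rng.uniform(0.1, 5.0, size=4)     # arbitrary positive test function
        assert dirichlet(P, pi, f, np.log(f)) >= ent(pi, f) / r
    print("MLSI with constant 1/r held for all sampled f")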
