

  1. A Conditional Information Inequality and Its Combinatorial Applications
Nikolay Vereshchagin¹, based on the joint paper with Tarik Kaced and Andrey Romashchenko
¹ Moscow State University, NRU Higher School of Economics, and Yandex
MIPT 2019

  2. Shannon entropy
H(A) = −Σ_a P[A = a] · log2 P[A = a]
H(A | B) = −Σ_{a,b} P[A = a, B = b] · log2 P[A = a | B = b]
Theorem. H(A) ≤ log2(the number of outcomes of A), and H(A) = log2(the number of outcomes of A) iff A has the uniform distribution.
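As a quick numerical sanity check (not part of the slides), the definition and the theorem above can be coded directly; `H` below is an assumed helper taking a list of probabilities:

```python
import math

def H(p):
    """Shannon entropy, in bits, of a distribution given as a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# The uniform distribution on 4 outcomes attains the maximum log2(4) = 2 bits.
uniform = H([0.25, 0.25, 0.25, 0.25])   # 2.0
# Any non-uniform distribution on 4 outcomes stays strictly below 2 bits.
skewed = H([0.5, 0.25, 0.125, 0.125])   # 1.75
```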

  3. Information inequalities
Definition (basic inequalities).
The chain rule: H(A, B) = H(A) + H(B | A), H(A, B | C) = H(A | C) + H(B | A, C).
Sub-additivity: H(A, B) ≤ H(A) + H(B), H(A, B | C) ≤ H(A | C) + H(B | C).
Definition. Linear combinations of basic inequalities are called Shannon-type inequalities.
Example. H(B | A) ≤ H(B), H(B | A, C) ≤ H(B | C).
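A minimal sketch of mine (not from the talk) checking sub-additivity on a toy joint distribution; the dict-based helper `H` and the distribution `pAB` are assumptions of this snippet:

```python
import math
from collections import defaultdict

def H(p):
    """Entropy in bits of a distribution given as a dict {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

# A toy joint distribution of (A, B).
pAB = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
pA, pB = defaultdict(float), defaultdict(float)
for (a, b), q in pAB.items():
    pA[a] += q
    pB[b] += q

# The chain rule holds by definition: H(B|A) is H(A,B) - H(A).
H_B_given_A = H(pAB) - H(pA)
# Sub-additivity is the non-trivial part: H(A,B) <= H(A) + H(B),
# equivalently H(B|A) <= H(B).
assert H(pAB) <= H(pA) + H(pB) + 1e-12
assert H_B_given_A <= H(pB) + 1e-12
```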

  4. Combinatorial applications of information inequalities (an example)
Theorem (Shearer's inequality). 2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C).

  5. Theorem (Shearer's inequality). 2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C).
Proof. Add the following inequalities:
H(A, B, C) = H(A, B) + H(C | A, B)
H(A, B, C) ≤ H(A) + H(B, C)
H(C | A, B) ≤ H(C | A)
H(A) + H(C | A) = H(A, C)
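Shearer's inequality can be sanity-checked on a random joint distribution; this is a sketch of mine (not from the slides), with `H` and `marginal` as assumed helpers:

```python
import itertools
import math
import random
from collections import defaultdict

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marginal(p, idx):
    """Marginal distribution of the coordinates listed in idx."""
    m = defaultdict(float)
    for outcome, q in p.items():
        m[tuple(outcome[i] for i in idx)] += q
    return m

random.seed(0)
weights = [random.random() for _ in range(8)]
total = sum(weights)
# A random joint distribution of (A, B, C) over {0,1}^3.
pABC = {o: w / total for o, w in zip(itertools.product((0, 1), repeat=3), weights)}

lhs = 2 * H(pABC)
rhs = H(marginal(pABC, (0, 1))) + H(marginal(pABC, (0, 2))) + H(marginal(pABC, (1, 2)))
assert lhs <= rhs + 1e-12
```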

  6. Theorem (Loomis–Whitney inequality). The volume of a 3-dimensional body is at most the square root of the product of the areas of its three 2-dimensional projections: V² ≤ S1·S2·S3.

  7. Proof of the discrete version of the Loomis–Whitney inequality.
Let (A, B, C) be a random pixel of the body. Then
H(A, B, C) = log2 V
H(A, B) ≤ log2 S1
H(A, C) ≤ log2 S2
H(B, C) ≤ log2 S3
Plug these values into Shearer's inequality 2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C).
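The discrete statement can be checked mechanically on a small voxel body; the body below is my own toy example, not one from the talk:

```python
# A "body" is a finite set of voxels (x, y, z); S1, S2, S3 are the sizes of its
# three 2-dimensional projections, V its volume (number of voxels).
body = {(x, y, z) for x in range(3) for y in range(2) for z in range(4) if x + y + z < 5}
V = len(body)
S1 = len({(x, y) for (x, y, z) in body})
S2 = len({(x, z) for (x, y, z) in body})
S3 = len({(y, z) for (x, y, z) in body})
# Here V = 20 and S1 * S2 * S3 = 6 * 11 * 8 = 528 >= 400 = V^2.
assert V * V <= S1 * S2 * S3
```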

  8. Mutual information
Definition (mutual information).
I(A : B) = H(A) + H(B) − H(A, B)
I(A : B | C) = H(A | C) + H(B | C) − H(A, B | C)
Theorem. I(A : B) = 0 iff A, B are independent; I(A : B | C) = 0 iff A, B are independent conditional on C.
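Both directions of the theorem can be illustrated numerically; in this sketch of mine, `I` computes I(A:B) from a joint distribution given as a dict:

```python
import math

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def I(pAB):
    """Mutual information I(A:B) = H(A) + H(B) - H(A,B), in bits."""
    pA, pB = {}, {}
    for (a, b), q in pAB.items():
        pA[a] = pA.get(a, 0.0) + q
        pB[b] = pB.get(b, 0.0) + q
    return H(pA) + H(pB) - H(pAB)

# Independent A, B (a product distribution): mutual information is 0.
indep = {(a, b): 0.5 * pb for a in (0, 1) for (b, pb) in ((0, 0.7), (1, 0.3))}
# Fully dependent A = B: mutual information equals H(A) = 1 bit.
copy = {(0, 0): 0.5, (1, 1): 0.5}
assert abs(I(indep)) < 1e-12
assert abs(I(copy) - 1.0) < 1e-12
```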

  9. Conditional inequalities (an example)
Proposition. The inequality I(A : B | C) ≤ I(A : B) is false for some A, B, C (e.g. let A, B be independent fair bits and C = A ⊕ B).
Proposition. However, I(B : C | A) = 0 ⇒ I(A : B | C) ≤ I(A : B). Moreover, I(A : B | C) ≤ I(A : B) + I(B : C | A) for all A, B, C.
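The counterexample C = A ⊕ B can be verified by direct computation; this is a sketch of mine with `H` and `marg` as assumed helpers:

```python
import math

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(p, idx):
    """Marginal distribution of the coordinates listed in idx."""
    m = {}
    for o, q in p.items():
        k = tuple(o[i] for i in idx)
        m[k] = m.get(k, 0.0) + q
    return m

# A, B independent fair bits, C = A xor B: the joint distribution of (A, B, C).
pABC = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}

# I(A:B) = H(A) + H(B) - H(A,B) = 0, since A and B are independent.
I_AB = H(marg(pABC, (0,))) + H(marg(pABC, (1,))) - H(marg(pABC, (0, 1)))
# I(A:B|C) = H(A,C) + H(B,C) - H(C) - H(A,B,C) = 1 bit.
I_AB_given_C = (H(marg(pABC, (0, 2))) + H(marg(pABC, (1, 2)))
                - H(marg(pABC, (2,))) - H(pABC))
assert abs(I_AB) < 1e-12 and abs(I_AB_given_C - 1.0) < 1e-12
```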

  10. Proof of the inequality I(A : B | C) ≤ I(A : B) + I(B : C | A).
Add the inequalities:
H(A, B) = H(A) + H(B | A)
H(B, C | A) = H(C | A) + H(B | A, C)
H(A | C) + H(B | A, C) = H(A, B | C)
H(B | C) ≤ H(B)

  11. Further evidence that mutual information is not material
Theorem (folklore). H(C) ≤ H(C | X) + H(C | Y) + I(X : Y) for all C, X, Y.
Theorem (Matúš; Romashchenko). The inequality I(A : B) ≤ I(A : B | X) + I(A : B | Y) + I(X : Y) (the Ingleton inequality) is false for some A, B, X, Y.
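The folklore inequality is Shannon-type, so it survives a check on an arbitrary random distribution; in this sketch of mine, conditional entropies are expressed via joint ones (`H`, `marg` are assumed helpers):

```python
import itertools
import math
import random

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(p, idx):
    """Marginal distribution of the coordinates listed in idx."""
    m = {}
    for o, q in p.items():
        k = tuple(o[i] for i in idx)
        m[k] = m.get(k, 0.0) + q
    return m

random.seed(1)
weights = [random.random() for _ in range(8)]
total = sum(weights)
# A random joint distribution of (C, X, Y) over {0,1}^3.
pCXY = {o: w / total for o, w in zip(itertools.product((0, 1), repeat=3), weights)}

H_C = H(marg(pCXY, (0,)))
H_C_given_X = H(marg(pCXY, (0, 1))) - H(marg(pCXY, (1,)))   # H(C|X) = H(C,X) - H(X)
H_C_given_Y = H(marg(pCXY, (0, 2))) - H(marg(pCXY, (2,)))   # H(C|Y) = H(C,Y) - H(Y)
I_XY = H(marg(pCXY, (1,))) + H(marg(pCXY, (2,))) - H(marg(pCXY, (1, 2)))
assert H_C <= H_C_given_X + H_C_given_Y + I_XY + 1e-12
```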

  12. A non-Shannon-type conditional inequality
Example (Zhang and Yeung (1997)).
I(X : Y | A) = I(X : Y) = 0 ⇒ I(A : B) ≤ I(A : B | X) + I(A : B | Y) + I(X : Y).
Remark. This inequality is essentially conditional: the inequality
I(A : B) ≤ I(A : B | X) + I(A : B | Y) + I(X : Y) + c · (I(X : Y) + I(X : Y | A))
is false in general for every constant c.

  13. A non-Shannon-type unconditional inequality
Theorem (Makarychev, Makarychev, Romashchenko, V., 2002).
I(A : B) ≤ I(A : B | X) + I(A : B | Y) + I(X : Y) + I(A : B | C) + I(B : C | A) + I(C : A | B)

  14. Another non-Shannon-type conditional inequality
Example (Kaced and Romashchenko (2013)).
I(X : Y | A) = H(A | X, Y) = 0 ⇒ I(A : B) ≤ I(A : B | X) + I(A : B | Y) + I(X : Y).
A reformulation:
I(X : Y | A) = H(A | X, Y) = 0 ⇒ H(A | X, B) + H(A | Y, B) ≤ H(A | B).
This talk: we "demystify" Kaced and Romashchenko's inequality and present a combinatorial application of it.

  15. Theorem. The inequality H(A | X, B) + H(A | Y, B) ≤ H(A | B) holds provided the supports of the distributions of the pairs (A, X) and (A, Y) have the following property:
P[A = a, X = x] > 0, P[A = a, Y = y] > 0, P[A = a′, X = x] > 0, P[A = a′, Y = y] > 0 ⇒ a = a′.
[The slide illustrates the condition with a diagram in which both a and a′ are linked to the same x and the same y.]
Remark. 1. The condition here is weaker than that of Kaced and Romashchenko. 2. The condition here relativizes.

  16. Proof.
Step 1. The general case reduces to the case of trivial B: H(A | X) + H(A | Y) ≤ H(A).
Step 2. The general case reduces further to the case when X, Y are independent conditional on A.
Proof: define new random variables A′, X′, Y′ so that the marginal distributions of (A′, X′) and (A′, Y′) are the same as the marginal distributions of (A, X) and (A, Y), but X′, Y′ are independent conditional on A′.
Step 3. We prove the Shannon-type inequality H(A | X) + H(A | Y) ≤ H(A) + H(A | X, Y) + I(X : Y | A).
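One instance where the Step 1 inequality holds with equality: take X, Y independent fair bits and A = (X, Y), so the support condition of the theorem is satisfied. A sketch of mine (not from the slides), with `H` and `marg` as assumed helpers:

```python
import math

def H(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(p, idx):
    """Marginal distribution of the coordinates listed in idx."""
    m = {}
    for o, q in p.items():
        k = tuple(o[i] for i in idx)
        m[k] = m.get(k, 0.0) + q
    return m

# X, Y independent fair bits and A = (X, Y); outcomes are triples (x, y, a).
pXYA = {(x, y, (x, y)): 0.25 for x in (0, 1) for y in (0, 1)}

H_A = H(marg(pXYA, (2,)))                                    # 2 bits
H_A_given_X = H(marg(pXYA, (0, 2))) - H(marg(pXYA, (0,)))    # H(A|X) = 1 bit
H_A_given_Y = H(marg(pXYA, (1, 2))) - H(marg(pXYA, (1,)))    # H(A|Y) = 1 bit
assert H_A_given_X + H_A_given_Y <= H_A + 1e-12
```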

  17. A combinatorial application of the inequality H(A | X) + H(A | Y) ≤ H(A)
Theorem. Assume that a finite family F of pairwise disjoint squares is given, each square a subset of [0, 1] × [0, 1]. Assume that every vertical line inside [0, 1] × [0, 1] intersects at least L squares of F, and similarly every horizontal line intersects at least R squares of F. Then |F| ≥ LR.

  18. The proof of a discrete version of the theorem (each square consists of pixels).
Let A = S × T be a randomly chosen square from F, where the probability of each square is proportional to its side length |S| = |T| (not its area!). Let (X, Y) be a random pixel of A (chosen with the uniform distribution). The conditions of the inequality are fulfilled, hence
H(A | X) + H(A | Y) ≤ H(A).
One can show that the conditional distributions of A given X and of A given Y are uniform, hence H(A | X) ≥ log R and H(A | Y) ≥ log L. It follows that log L + log R ≤ H(A). As H(A) ≤ log |F|, the theorem follows.
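The bound |F| ≥ LR is tight: tiling an N × N pixel grid with N² one-pixel squares gives L = R = N and |F| = LR exactly. A toy check of mine, not from the talk:

```python
N = 4
# Each "square" is a one-pixel set; together they tile the N x N grid.
F = [{(x, y)} for x in range(N) for y in range(N)]
# L: the minimum, over vertical lines x, of the number of squares that line meets.
L = min(sum(1 for sq in F if any(px[0] == x for px in sq)) for x in range(N))
# R: the same for horizontal lines y.
R = min(sum(1 for sq in F if any(px[1] == y for px in sq)) for y in range(N))
assert L == R == N and len(F) == L * R
```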

  19. Thank you for your attention!
