SLIDE 1

A Conditional Information Inequality and Its Combinatorial Applications

Nikolay Vereshchagin¹, based on the joint paper with Tarik Kaced and Andrey Romashchenko

¹Moscow State University, NRU Higher School of Economics, and Yandex

MIPT 2019

SLIDE 2

Shannon entropy

H(A) = −∑_a P[A = a] · log2 P[A = a]

H(A|B) = −∑_{a,b} P[A = a, B = b] · log2 P[A = a|B = b]

Theorem

H(A) ≤ log2(the number of outcomes of A) and H(A) = log2(the number of outcomes of A) iff A has the uniform distribution.
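The maximum-entropy bound can be checked numerically; a minimal sketch in Python (the distributions and the helper `H` are illustrative, not from the slides):

```python
import math

def H(p):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# A uniform distribution over 8 outcomes attains the bound log2(8) = 3.
uniform = [1/8] * 8
print(H(uniform))  # 3.0

# Any non-uniform distribution over the same outcomes has strictly smaller entropy.
skewed = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, 1/128]
print(H(skewed) < 3.0)  # True
```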

SLIDE 3

Information inequalities

Definition (Basic inequalities)

The chain rule: H(A, B) = H(A) + H(B|A), H(A, B|C) = H(A|C) + H(B|A, C).
Sub-additivity: H(A, B) ≤ H(A) + H(B), H(A, B|C) ≤ H(A|C) + H(B|C).

Definition

Nonnegative linear combinations of basic inequalities are called Shannon-type inequalities.

Example

H(B|A) ≤ H(B), H(B|A, C) ≤ H(B|C).

SLIDE 4

Combinatorial applications of information inequalities (an example)

Theorem (Shearer’s inequality)

2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C)

SLIDE 5

Theorem (Shearer’s inequality)

2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C)

Proof.

Add the following inequalities:

H(A, B, C) = H(A, B) + H(C|A, B)
H(A, B, C) ≤ H(A) + H(B, C)
H(C|A, B) ≤ H(C|A)
H(A) + H(C|A) = H(A, C)
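Shearer's inequality can be spot-checked on a random joint distribution; a Python sketch (the distribution and helper names are illustrative, not from the slides):

```python
import itertools
import math
import random

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marginal(joint, coords):
    """Marginalize a joint distribution over tuples onto the given coordinates."""
    m = {}
    for outcome, q in joint.items():
        key = tuple(outcome[i] for i in coords)
        m[key] = m.get(key, 0.0) + q
    return m

random.seed(0)
outcomes = list(itertools.product(range(3), repeat=3))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

lhs = 2 * H(joint)
rhs = (H(marginal(joint, (0, 1))) + H(marginal(joint, (0, 2)))
       + H(marginal(joint, (1, 2))))
print(lhs <= rhs + 1e-9)  # True
```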

SLIDE 6

Theorem (Loomis–Whitney inequality)

The volume V of a 3-dimensional body is at most the square root of the product of the areas S1, S2, S3 of its three 2-dimensional projections: V² ≤ S1 · S2 · S3.

SLIDE 7

Proof of the discrete version of Loomis–Whitney inequality.

Let (A, B, C) be a random pixel in the body (chosen uniformly). Then

H(A, B, C) = log2 V
H(A, B) ≤ log2 S1
H(A, C) ≤ log2 S2
H(B, C) ≤ log2 S3

Plug these values into Shearer’s inequality 2 · H(A, B, C) ≤ H(A, B) + H(A, C) + H(B, C).
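The discrete statement can also be tested directly on a set of voxels; a Python sketch (the 6×6×6 grid and the random body are made up for illustration):

```python
import random

random.seed(1)
# A random "body": a set of voxels in a 6x6x6 grid (illustrative data).
body = {(random.randrange(6), random.randrange(6), random.randrange(6))
        for _ in range(60)}

V = len(body)
S1 = len({(a, b) for a, b, c in body})  # projection onto the (x, y) plane
S2 = len({(a, c) for a, b, c in body})  # projection onto the (x, z) plane
S3 = len({(b, c) for a, b, c in body})  # projection onto the (y, z) plane

print(V * V <= S1 * S2 * S3)  # True
```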

SLIDE 8

Mutual information

Definition (mutual information)

I(A : B) = H(A) + H(B) − H(A, B)
I(A : B|C) = H(A|C) + H(B|C) − H(A, B|C)

Theorem

I(A : B) = 0 iff A, B are independent. I(A : B|C) = 0 iff A, B are conditionally independent given C.
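Both characterizations are easy to probe numerically; a Python sketch with two illustrative joint distributions (independent fair bits vs. perfectly correlated bits):

```python
import math

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def I(joint):
    """I(A : B) computed from a joint distribution over pairs (a, b)."""
    pa, pb = {}, {}
    for (a, b), q in joint.items():
        pa[a] = pa.get(a, 0.0) + q
        pb[b] = pb.get(b, 0.0) + q
    return H(pa) + H(pb) - H(joint)

# Independent fair bits: product distribution, so I(A : B) = 0.
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
print(abs(I(indep)) < 1e-9)  # True

# Fully correlated bits: A = B, so I(A : B) = H(A) = 1 bit.
equal = {(0, 0): 0.5, (1, 1): 0.5}
print(I(equal))  # 1.0
```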

SLIDE 9

Conditional inequalities (an example)

Proposition

The inequality I(A : B|C) ≤ I(A : B) is false for some A, B, C (e.g., take A, B independent fair bits and C = A ⊕ B).
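The XOR counterexample is small enough to compute exactly; a Python sketch (the entropy helper and marginalization are illustrative code, while the distribution is the one from the proposition):

```python
import math

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

# A, B independent fair bits; C = A XOR B (the counterexample above).
joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}

def marg(coords):
    m = {}
    for o, q in joint.items():
        k = tuple(o[i] for i in coords)
        m[k] = m.get(k, 0.0) + q
    return m

# I(A : B) = H(A) + H(B) - H(A, B)
i_ab = H(marg((0,))) + H(marg((1,))) - H(marg((0, 1)))
# I(A : B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C)
i_ab_c = H(marg((0, 2))) + H(marg((1, 2))) - H(joint) - H(marg((2,)))

print(i_ab)    # 0.0
print(i_ab_c)  # 1.0
```

Conditioning on C creates one full bit of mutual information between two independent bits, so I(A : B|C) ≤ I(A : B) fails.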

Proposition

However, I(B : C|A) = 0 ⇒ I(A : B|C) ≤ I(A : B).
Moreover, I(A : B|C) ≤ I(A : B) + I(B : C|A) for all A, B, C.

SLIDE 10

Proof of the inequality I(A : B|C) ≤ I(A : B) + I(B : C|A)

Add the inequalities:

H(A, B) = H(A) + H(B|A)
H(B, C|A) = H(C|A) + H(B|A, C)
H(A|C) + H(B|A, C) = H(A, B|C)
H(B|C) ≤ H(B)
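The resulting unconditional inequality I(A : B|C) ≤ I(A : B) + I(B : C|A) can be spot-checked on random three-variable distributions; an illustrative Python sketch (helper names and the random search are not from the slides):

```python
import itertools
import math
import random

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(joint, coords):
    m = {}
    for o, q in joint.items():
        k = tuple(o[i] for i in coords)
        m[k] = m.get(k, 0.0) + q
    return m

random.seed(3)
A, B, C = 0, 1, 2
outcomes = list(itertools.product(range(2), repeat=3))
for _ in range(100):
    w = [random.random() for _ in outcomes]
    joint = {o: v / sum(w) for o, v in zip(outcomes, w)}
    # I(A:B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
    i_ab_c = (H(marg(joint, (A, C))) + H(marg(joint, (B, C)))
              - H(joint) - H(marg(joint, (C,))))
    # I(A:B) = H(A) + H(B) - H(A,B)
    i_ab = H(marg(joint, (A,))) + H(marg(joint, (B,))) - H(marg(joint, (A, B)))
    # I(B:C|A) = H(A,B) + H(A,C) - H(A,B,C) - H(A)
    i_bc_a = (H(marg(joint, (A, B))) + H(marg(joint, (A, C)))
              - H(joint) - H(marg(joint, (A,))))
    assert i_ab_c <= i_ab + i_bc_a + 1e-9
print("ok")
```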

SLIDE 11

More evidence that mutual information is not material

Theorem (folklore)

H(C) ≤ H(C|X) + H(C|Y) + I(X : Y) for all C, X, Y.
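After expanding conditional entropies and the mutual information, the folklore inequality rearranges to H(C) + H(X, Y) ≤ H(C, X) + H(C, Y), which is easy to test on random distributions; an illustrative Python sketch:

```python
import itertools
import math
import random

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(joint, coords):
    m = {}
    for o, q in joint.items():
        k = tuple(o[i] for i in coords)
        m[k] = m.get(k, 0.0) + q
    return m

random.seed(2)
outcomes = list(itertools.product(range(2), repeat=3))  # (C, X, Y)
w = [random.random() for _ in outcomes]
joint = {o: v / sum(w) for o, v in zip(outcomes, w)}

# H(C) <= H(C|X) + H(C|Y) + I(X:Y) rearranges to
# H(C) + H(X,Y) <= H(C,X) + H(C,Y).
lhs = H(marg(joint, (0,))) + H(marg(joint, (1, 2)))
rhs = H(marg(joint, (0, 1))) + H(marg(joint, (0, 2)))
print(lhs <= rhs + 1e-9)  # True
```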

Theorem (Matúš, Romashchenko)

The inequality I(A : B) ≤ I(A : B|X) + I(A : B|Y) + I(X : Y) (Ingleton inequality) is false for some A, B, X, Y.

SLIDE 12

A non-Shannon-type conditional inequality

Example (Zhang and Yeung (1997))

I(X : Y|A) = I(X : Y) = 0 ⇒ I(A : B) ≤ I(A : B|X) + I(A : B|Y) + I(X : Y).

Remark

This inequality is essentially conditional: the inequality I(A : B) ≤ I(A : B|X) + I(A : B|Y) + I(X : Y) + c · (I(X : Y) + I(X : Y|A)) is false in general for every constant c.

SLIDE 13

A non-Shannon-type unconditional inequality

Theorem (Makarychev, Makarychev, Romashchenko, V., 2002)

I(A : B) ≤ I(A : B|X) + I(A : B|Y ) + I(X : Y ) + I(A : B|C) + I(B : C|A) + I(C : A|B)

SLIDE 14

Another non Shannon-type conditional inequality

Example (Kaced and Romashchenko (2013))

I(X : Y|A) = H(A|X, Y) = 0 ⇒ I(A : B) ≤ I(A : B|X) + I(A : B|Y) + I(X : Y).

A reformulation: I(X : Y|A) = H(A|X, Y) = 0 ⇒ H(A|X, B) + H(A|Y, B) ≤ H(A|B).

This talk: we “demystify” Kaced and Romashchenko’s inequality and present its combinatorial application.

SLIDE 15

Theorem

The inequality H(A|X, B) + H(A|Y, B) ≤ H(A|B) holds provided the supports of the distributions of the pairs (A, X) and (A, Y) have the following property:

P[A = a, X = x] > 0, P[A = a, Y = y] > 0, P[A = a′, X = x] > 0, P[A = a′, Y = y] > 0 ⇒ a = a′

(Figure: two outcomes a, a′ of A sharing both an X-value x and a Y-value y, the configuration the condition forbids unless a = a′.)

Remark

1. The condition here is weaker than that of Kaced and Romashchenko.
2. The condition here relativizes.
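A small sanity check of the theorem (with trivial B): taking A = (X, Y) satisfies the support condition, since knowing both x and y pins down a single a; the inequality then amounts to H(X, Y) ≤ H(X) + H(Y). An illustrative Python sketch (the distribution is made up):

```python
import itertools
import math
import random

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

random.seed(4)
# Random joint distribution of (X, Y); take A = (X, Y), so the support
# condition holds.
pairs = list(itertools.product(range(3), repeat=2))
w = [random.random() for _ in pairs]
pxy = {p: v / sum(w) for p, v in zip(pairs, w)}

h_xy = H(pxy)  # H(A) = H(X, Y)
h_x = H({x: sum(q for (u, v), q in pxy.items() if u == x) for x in range(3)})
h_y = H({y: sum(q for (u, v), q in pxy.items() if v == y) for y in range(3)})
# H(A|X) = H(Y|X) = H(X,Y) - H(X), and symmetrically for Y.
print((h_xy - h_x) + (h_xy - h_y) <= h_xy + 1e-9)  # True
```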

SLIDE 16

Proof

Step 1. The general case reduces to the case of trivial B: H(A|X) + H(A|Y) ≤ H(A).

Step 2. The general case reduces further to the case when X, Y are independent given A.

Proof.

Define new random variables A′, X′, Y′ so that the marginal distributions of (A′, X′) and (A′, Y′) are the same as the marginal distributions of (A, X) and (A, Y), but X′, Y′ are independent given A′.

Step 3. We prove the Shannon-type inequality H(A|X) + H(A|Y) ≤ H(A) + H(A|X, Y) + I(X : Y|A).
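Step 3's inequality is Shannon-type and can be verified by random search; expanding all conditional terms in fact reduces it to sub-additivity H(X, Y) ≤ H(X) + H(Y). An illustrative Python sketch:

```python
import itertools
import math
import random

def H(p):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marg(joint, coords):
    m = {}
    for o, q in joint.items():
        k = tuple(o[i] for i in coords)
        m[k] = m.get(k, 0.0) + q
    return m

random.seed(5)
A, X, Y = 0, 1, 2
outcomes = list(itertools.product(range(2), repeat=3))
for _ in range(100):
    w = [random.random() for _ in outcomes]
    joint = {o: v / sum(w) for o, v in zip(outcomes, w)}
    ha = H(marg(joint, (A,)))
    hax = H(marg(joint, (A, X)))
    hay = H(marg(joint, (A, Y)))
    hxy = H(marg(joint, (X, Y)))
    hx, hy = H(marg(joint, (X,))), H(marg(joint, (Y,)))
    haxy = H(joint)
    lhs = (hax - hx) + (hay - hy)                      # H(A|X) + H(A|Y)
    rhs = ha + (haxy - hxy) + (hax + hay - haxy - ha)  # H(A) + H(A|X,Y) + I(X:Y|A)
    assert lhs <= rhs + 1e-9
print("ok")
```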

SLIDE 17

A combinatorial application of the inequality H(A|X) + H(A|Y ) ≤ H(A)

Theorem

Assume that a finite family F of pairwise disjoint squares is given, each square being a subset of [0, 1] × [0, 1]. Assume that each vertical line inside [0, 1] × [0, 1] intersects at least L squares in F and, similarly, each horizontal line intersects at least R squares in F. Then |F| ≥ LR.
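An n × n grid of disjoint squares tiling the board shows the bound can be attained; a discrete (pixel) Python sketch, where the grid parameters are a hypothetical instance, not from the slides:

```python
# An n x n grid of disjoint k x k pixel squares tiling a W x W board
# (hypothetical instance; the theorem guarantees |F| >= L * R in general).
n, k = 4, 5
W = n * k
squares = [(i * k, j * k) for i in range(n) for j in range(n)]  # top-left corners

def squares_on_column(x):
    """Number of squares meeting the vertical pixel column x."""
    return sum(1 for (sx, sy) in squares if sx <= x < sx + k)

def squares_on_row(y):
    """Number of squares meeting the horizontal pixel row y."""
    return sum(1 for (sx, sy) in squares if sy <= y < sy + k)

L = min(squares_on_column(x) for x in range(W))
R = min(squares_on_row(y) for y in range(W))
print(L, R, len(squares))     # 4 4 16
print(len(squares) >= L * R)  # True
```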

SLIDE 18

The proof of a discrete version of the theorem (each square consists of pixels)

Let A = S × T be a randomly chosen square from F, where the probability of each square is proportional to its side length |S| = |T| (not its area!). Let (X, Y) be a random pair from A (chosen with the uniform distribution). The conditions of the inequality are fulfilled, hence H(A|X) + H(A|Y) ≤ H(A). One can show that the conditional distributions of A given X and of A given Y are uniform (over the squares meeting the corresponding vertical or horizontal line), hence H(A|X) ≥ log L and H(A|Y) ≥ log R. It follows that log L + log R ≤ H(A). As H(A) ≤ log |F|, the theorem follows.

SLIDE 19

Thank you for your attention!
