
6.975 Graduate Seminar in Area I: A Relationship Between Information Inequalities and Group Theory



  1. 6.975 Graduate Seminar in Area I. A Relationship Between Information Inequalities and Group Theory. Desmond Lun, 23 October 2002.

  2. The result. Let $N = \{1, 2, \ldots, n\}$ and let $\Omega$ be the set of all non-empty subsets of $N$. Let $\{b_\alpha\}_{\alpha \in \Omega}$ be a set of real numbers. If the information inequality
     $$\sum_{\alpha \in \Omega} b_\alpha H\big((X_i)_{i \in \alpha}\big) \ge 0$$
     holds for all discrete random variables $X_1, X_2, \ldots, X_n$, then the group inequality
     $$\sum_{\alpha \in \Omega} b_\alpha \log \frac{|G|}{\left|\bigcap_{i \in \alpha} G_i\right|} \ge 0$$
     holds for all finite groups $G$ and subgroups $G_i$ of $G$, and vice versa.

  3. An example. We know that, for all discrete random variables $X_1$ and $X_2$,
     $$H(X_1) + H(X_2) - H(X_1, X_2) \ge 0$$
     (since $I(X_1; X_2) \ge 0$). So for all finite groups $G$ and subgroups $G_1$ and $G_2$ of $G$,
     $$\log \frac{|G|}{|G_1|} + \log \frac{|G|}{|G_2|} \ge \log \frac{|G|}{|G_1 \cap G_2|}.$$

  4. An example (cont.) We can confirm that
     $$\log \frac{|G|}{|G_1|} + \log \frac{|G|}{|G_2|} \ge \log \frac{|G|}{|G_1 \cap G_2|}$$
     for all subgroups $G_1$ and $G_2$ of a finite group $G$. Let "$*$" be the operation on the group. Consider the subset of $G$
     $$G_1 * G_2 = \{a * b \mid a \in G_1 \text{ and } b \in G_2\}.$$
     Let us calculate $|G_1 * G_2|$. Fix $(a_1, a_2) \in G_1 \times G_2$. We wish to know how many pairs $(b_1, b_2) \in G_1 \times G_2$ satisfy $b_1 * b_2 = a_1 * a_2$.

  5. An example (cont.) Since $b_1 * b_2 = a_1 * a_2$, we have $b_2 * a_2^{-1} = b_1^{-1} * a_1$. Let $k = b_2 * a_2^{-1} = b_1^{-1} * a_1$, so $k \in G_1 \cap G_2$. Each $k \in G_1 \cap G_2$ gives rise to a unique pair $(b_1, b_2) = (a_1 * k^{-1}, k * a_2)$ such that $b_1 * b_2 = a_1 * a_2$, so every element of $G_1 * G_2$ is produced by exactly $|G_1 \cap G_2|$ pairs. Therefore,
     $$|G_1 * G_2| = \frac{|G_1||G_2|}{|G_1 \cap G_2|}.$$
     Since $G_1 * G_2 \subset G$,
     $$|G| \ge \frac{|G_1||G_2|}{|G_1 \cap G_2|}.$$

  6. An example (cont.) Upon rearrangement of
     $$|G| \ge \frac{|G_1||G_2|}{|G_1 \cap G_2|},$$
     we obtain
     $$\log \frac{|G|}{|G_1|} + \log \frac{|G|}{|G_2|} \ge \log \frac{|G|}{|G_1 \cap G_2|}.$$
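The counting argument in the example above can be checked by brute force on a small group. The sketch below is only an illustration: the choice of the symmetric group on 4 points and of two point stabilizers as `G1` and `G2` is an assumption made here, not something taken from the slides.

```python
from itertools import permutations
from math import log

def compose(p, q):
    """Return the permutation 'apply q first, then p' (both given as tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

# Assumed example: G is the symmetric group on {0, 1, 2, 3}.
G = set(permutations(range(4)))
G1 = {p for p in G if p[3] == 3}  # subgroup fixing the point 3
G2 = {p for p in G if p[0] == 0}  # subgroup fixing the point 0

# The product set G1 * G2 and the intersection G1 ∩ G2.
product_set = {compose(a, b) for a in G1 for b in G2}
intersection = G1 & G2

# Both sides of log(|G|/|G1|) + log(|G|/|G2|) >= log(|G|/|G1 ∩ G2|).
lhs = log(len(G) / len(G1)) + log(len(G) / len(G2))
rhs = log(len(G) / len(intersection))

print(len(product_set), len(G1) * len(G2) // len(intersection))
print(lhs >= rhs)
```

Here `G1 * G2` is a proper subset of `G`, so the inequality is strict for this particular choice.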

  7. Entropy functions. Suppose we have discrete random variables $X_1, X_2, \ldots, X_n$ with sample spaces $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_n$. Denote by $X_\alpha$, $\alpha \in \Omega$, the joint random variable $(X_i)_{i \in \alpha}$ with sample space $\mathcal{X}_\alpha = \prod_{i \in \alpha} \mathcal{X}_i$.
     Definition. Let $g$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $g_\alpha$ indexed by $\alpha \in \Omega$. Then $g$ is an entropy function if there exists a set of random variables $X_1, X_2, \ldots, X_n$ such that $g_\alpha = H(X_\alpha)$ for all $\alpha$.
     Let $\Gamma_n^*$ be the set of all entropy functions associated with $n$ random variables; that is,
     $$\Gamma_n^* = \{g \in \mathbb{R}^{|\Omega|} \mid g \text{ is an entropy function}\}.$$

  8. Entropy functions: An example. Suppose $X_1, X_2$ have the PMF given in the following table.
     $$\begin{array}{c|ccc}
     p_{X_1, X_2}(x_1, x_2) & x_1 = 1 & x_1 = 2 & x_1 = 3 \\ \hline
     x_2 = 1 & 1/6 & 1/6 & 0 \\
     x_2 = 2 & 0 & 1/6 & 1/6 \\
     x_2 = 3 & 1/6 & 0 & 1/6
     \end{array}$$
     Therefore $H(X_1) = \log 3$, $H(X_2) = \log 3$, and $H(X_1, X_2) = \log 6$. So $(\log 3, \log 3, \log 6) \in \Gamma_n^*$ (here $n = 2$).
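The three entropies in this example can be recomputed directly from the table. The following is a minimal sketch; the helper names `marginal` and `entropy` are introduced here for illustration, and natural logarithms are assumed (any base works as long as it is used consistently).

```python
from fractions import Fraction
from math import log, isclose

# Joint PMF from the table, keyed by (x1, x2); zero entries are omitted.
sixth = Fraction(1, 6)
joint = {
    (1, 1): sixth, (2, 1): sixth,
    (2, 2): sixth, (3, 2): sixth,
    (1, 3): sixth, (3, 3): sixth,
}

def marginal(pmf, axis):
    """Sum the joint PMF over every coordinate except `axis`."""
    out = {}
    for outcome, p in pmf.items():
        out[outcome[axis]] = out.get(outcome[axis], 0) + p
    return out

def entropy(pmf):
    """Shannon entropy in nats of a PMF given as {outcome: probability}."""
    return -sum(float(p) * log(float(p)) for p in pmf.values() if p > 0)

g = (entropy(marginal(joint, 0)), entropy(marginal(joint, 1)), entropy(joint))
print(g)
print(isclose(g[0], log(3)), isclose(g[1], log(3)), isclose(g[2], log(6)))
```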

  9. Group-characterizable functions. Let $G$ be a finite group and $G_1, G_2, \ldots, G_n$ be subgroups of $G$. We use the notation $G_\alpha = \bigcap_{i \in \alpha} G_i$, where $\alpha \in \Omega$.
     Definition. Let $h$ be a vector in $\mathbb{R}^{|\Omega|}$ with components $h_\alpha$ indexed by $\alpha \in \Omega$. Then $h$ is called group-characterizable if there exist subgroups $G_1, G_2, \ldots, G_n$ of a group $G$ such that $h_\alpha = \log(|G| / |G_\alpha|)$ for all $\alpha$.
     Let $\Upsilon_n$ be the set of all group-characterizable functions associated with $n$ groups; that is,
     $$\Upsilon_n = \{h \in \mathbb{R}^{|\Omega|} \mid h \text{ is group-characterizable}\}.$$

  10. Group-characterizable functions: An example. Let $A$ be the $2 \times 3$ matrix
     $$A = \begin{pmatrix} a & a & b \\ c & d & d \end{pmatrix},$$
     where $a, b, c, d$ are distinct elements of a field $K$. Let $G$ be the group of permutations of 3 elements, acting on the columns of $A$, so $|G| = 6$. For $i = 1, 2$, let $G_i$ be the subgroup of $G$ that keeps the $i$th row of $A$ unchanged. So $|G_1| = 2$, $|G_2| = 2$, and $|G_1 \cap G_2| = 1$. So $(\log 3, \log 3, \log 6) \in \Upsilon_n$ (here $n = 2$).
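The subgroup sizes in this example can be confirmed by enumerating all six column permutations. This sketch assumes that the permutations act on the columns of $A$; the strings `"a"` to `"d"` stand in for the distinct field elements, and `row_stabilizer` is a hypothetical helper name.

```python
from itertools import permutations
from math import log, isclose

# Symbolic stand-ins for the distinct field elements a, b, c, d.
A = [["a", "a", "b"],
     ["c", "d", "d"]]

# G: all permutations of the three column positions.
G = list(permutations(range(3)))

def row_stabilizer(row):
    """Permutations of column positions that leave `row` unchanged."""
    return {p for p in G if [row[p[j]] for j in range(len(row))] == row}

G1 = row_stabilizer(A[0])
G2 = row_stabilizer(A[1])

# h_alpha = log(|G| / |G_alpha|) for alpha = {1}, {2}, {1, 2}.
h = (log(len(G) / len(G1)),
     log(len(G) / len(G2)),
     log(len(G) / len(G1 & G2)))
print(len(G1), len(G2), len(G1 & G2))
print(h)
```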

  11. Information inequalities. From the definition of $\Gamma_n^*$, an information inequality
     $$\sum_{\alpha \in \Omega} b_\alpha H(X_\alpha) \ge 0$$
     is valid if and only if
     $$\Gamma_n^* \subset \{h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0\},$$
     where $b$ is a column vector with components $b_\alpha$.

  12. Group inequalities. Likewise, a group inequality
     $$\sum_{\alpha \in \Omega} b_\alpha \log \frac{|G|}{|G_\alpha|} \ge 0$$
     is valid if and only if
     $$\Upsilon_n \subset \{h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0\}.$$
     We are interested in relating $\Gamma_n^*$ with $\Upsilon_n$. Specifically, we want
     $$\Gamma_n^* \subset \{h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0\} \iff \Upsilon_n \subset \{h \in \mathbb{R}^{|\Omega|} \mid b^\top h \ge 0\}.$$
     Note: $\Upsilon_n$ is much sparser than $\Gamma_n^*$ because it has at most countably many points.

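  13. Main theorem.
     Theorem. $\overline{\mathrm{conv}}(\Upsilon_n) = \overline{\Gamma}_n^*$ for all natural numbers $n$ (overlines denote closure).
     Outline of proof:
     • Show that $\Upsilon_n \subset \Gamma_n^*$.
     • Show that $\overline{\Gamma}_n^*$ is a convex cone.
     • So $\overline{\mathrm{conv}}(\Upsilon_n) \subset \overline{\mathrm{conv}}(\Gamma_n^*) \subset \overline{\mathrm{conv}}(\overline{\Gamma}_n^*) = \overline{\Gamma}_n^*$.
     • Show that $\Gamma_n^* \subset \overline{\mathrm{conv}}(\Upsilon_n)$.
     • So $\overline{\Gamma}_n^* \subset \overline{\mathrm{conv}}(\Upsilon_n)$.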

  14. Showing $\Upsilon_n \subset \Gamma_n^*$.
     Lemma. If $h$ is group-characterizable, then it is an entropy function, i.e. $h \in \Gamma_n^*$.
     Proof.
     • Want: Find random variables $X_1, X_2, \ldots, X_n$ such that $H((X_i)_{i \in \alpha}) = \log(|G| / |G_\alpha|)$.
     • Let $\Lambda$ be uniformly distributed over the sample space $G$.
     • For $i \in N$, let $X_i = aG_i$ if $\Lambda = a$ ($aG_i$ is a left coset of $G_i$).
     • For $\alpha \in \Omega$,
       $$P\big((X_i = a_i G_i)_{i \in \alpha}\big) = P\Big(\Lambda \in \bigcap_{i \in \alpha} a_i G_i\Big) = \frac{\left|\bigcap_{i \in \alpha} a_i G_i\right|}{|G|}.$$

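  15. Showing $\Upsilon_n \subset \Gamma_n^*$ (cont.)
     • Consider $\bigcap_{i \in \alpha} a_i G_i$. If it is non-empty, let $b \in \bigcap_{i \in \alpha} a_i G_i$, so
       $$\bigcap_{i \in \alpha} a_i G_i = \bigcap_{i \in \alpha} b G_i = b \bigcap_{i \in \alpha} G_i = b G_\alpha.$$
     • The set $\bigcap_{i \in \alpha} a_i G_i$ is either empty or of size $|G_\alpha|$.
     • Hence
       $$P\big((X_i = a_i G_i)_{i \in \alpha}\big) = \begin{cases} \dfrac{|G_\alpha|}{|G|} & \text{if } \bigcap_{i \in \alpha} a_i G_i \ne \emptyset, \\ 0 & \text{otherwise.} \end{cases}$$
     • So $(X_i)_{i \in \alpha}$ is uniform over $|G| / |G_\alpha|$ values, and $H((X_i)_{i \in \alpha}) = \log(|G| / |G_\alpha|)$, as desired. $\square$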
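The coset construction in this proof can be tried out on a concrete group. The sketch below is an illustration under assumptions made here: $G$ is taken to be the symmetric group on 3 points and the two subgroups are point stabilizers; the names `left_coset`, `joint_entropy`, and `group_value` are hypothetical helpers.

```python
from collections import Counter
from itertools import permutations
from math import log, isclose

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

# Assumed example group and subgroups (indexed 1 and 2 as in the slides).
G = list(permutations(range(3)))
subgroups = {
    1: {p for p in G if p[2] == 2},
    2: {p for p in G if p[0] == 0},
}

def left_coset(a, H):
    """The left coset aH, as a hashable set."""
    return frozenset(compose(a, g) for g in H)

def joint_entropy(alpha):
    """Entropy of (X_i)_{i in alpha}, where X_i = Lambda * G_i and Lambda is uniform on G."""
    counts = Counter(tuple(left_coset(a, subgroups[i]) for i in alpha) for a in G)
    return -sum((c / len(G)) * log(c / len(G)) for c in counts.values())

def group_value(alpha):
    """log(|G| / |G_alpha|) with G_alpha the intersection of the G_i, i in alpha."""
    G_alpha = set.intersection(*(subgroups[i] for i in alpha))
    return log(len(G) / len(G_alpha))

for alpha in [(1,), (2,), (1, 2)]:
    print(alpha, joint_entropy(alpha), group_value(alpha))
```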

  16. Showing $\overline{\Gamma}_n^*$ is a convex cone.
     Lemma. $\overline{\Gamma}_n^*$ is a convex cone.
     Proof.
     • First show that $\overline{\Gamma}_n^*$ is convex.
     • Want: For any $0 < b < 1$ and $u, v \in \Gamma_n^*$, $bu + (1 - b)v \in \overline{\Gamma}_n^*$. (It is then straightforward to extend to points in the closure of $\Gamma_n^*$.)
     • Let $u$ be the entropy function for $Y_1, Y_2, \ldots, Y_n$.
     • Let $v$ be the entropy function for $Z_1, Z_2, \ldots, Z_n$.

  17. Showing $\overline{\Gamma}_n^*$ is a convex cone (cont.)
     • Let $(Y_1^{(i)}, Y_2^{(i)}, \ldots, Y_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors, each distributed identically to $(Y_1, Y_2, \ldots, Y_n)$.
     • Let $(Z_1^{(i)}, Z_2^{(i)}, \ldots, Z_n^{(i)})$ for $1 \le i \le k$ be $k$ independent vectors, each distributed identically to $(Z_1, Z_2, \ldots, Z_n)$.
     • Let $U$ be a random variable having the distribution
       $$p_U(u) = \begin{cases} 1 - \delta - \mu & \text{if } u = 0, \\ \delta & \text{if } u = 1, \\ \mu & \text{if } u = 2. \end{cases}$$
       So $H(U) \to 0$ as $\delta, \mu \to 0$.

  18. Showing $\overline{\Gamma}_n^*$ is a convex cone (cont.)
     • Now construct $X_1, X_2, \ldots, X_n$ by
       $$X_i = \begin{cases} 0 & \text{if } U = 0, \\ (Y_i^{(1)}, Y_i^{(2)}, \ldots, Y_i^{(k)}) & \text{if } U = 1, \\ (Z_i^{(1)}, Z_i^{(2)}, \ldots, Z_i^{(k)}) & \text{if } U = 2. \end{cases}$$
     • So for any $\alpha \in \Omega$, $H(X_\alpha \mid U) = \delta k H(Y_\alpha) + \mu k H(Z_\alpha)$.

  19. Showing $\overline{\Gamma}_n^*$ is a convex cone (cont.)
     • We have
       $$0 \le H(X_\alpha) - H(X_\alpha \mid U) \le H(U)$$
       $$\Rightarrow \; 0 \le H(X_\alpha) - \big(\delta k H(Y_\alpha) + \mu k H(Z_\alpha)\big) \le H(U).$$
     • Take $\delta = b/k$, $\mu = (1 - b)/k$, so
       $$0 \le H(X_\alpha) - \big(b H(Y_\alpha) + (1 - b) H(Z_\alpha)\big) \le H(U).$$
     • Taking $k = 1, 2, \ldots$ gives us a sequence of points in $\Gamma_n^*$ whose limit point is $bu + (1 - b)v$. So $bu + (1 - b)v \in \overline{\Gamma}_n^*$.
     • $\overline{\Gamma}_n^*$ is convex.
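The convexity argument relies on $H(U)$ vanishing as $k$ grows when $\delta = b/k$ and $\mu = (1 - b)/k$. A small numerical sketch of that bound follows; the function name `entropy_of_U` and the value $b = 0.3$ are assumptions chosen here for illustration only.

```python
from math import log

def entropy_of_U(b, k):
    """Entropy (in nats) of U with P(U=0) = 1 - delta - mu, P(U=1) = delta, P(U=2) = mu,
    where delta = b / k and mu = (1 - b) / k."""
    delta = b / k
    mu = (1 - b) / k
    probs = [1 - delta - mu, delta, mu]
    # Terms with non-positive probability contribute nothing (0 * log 0 := 0).
    return -sum(p * log(p) for p in probs if p > 0)

# The gap between H(X_alpha) and b*H(Y_alpha) + (1-b)*H(Z_alpha) is at most H(U).
for k in (10, 100, 1000, 10000):
    print(k, entropy_of_U(0.3, k))
```

The printed values shrink toward zero, which is what lets the sequence of entropy functions converge to $bu + (1 - b)v$.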

  20. Showing $\overline{\Gamma}_n^*$ is a convex cone (cont.)
     • It remains only to show that $\overline{\Gamma}_n^*$ is a cone.
     • If $v \in \Gamma_n^*$, consider $k$ independent copies of its associated random variables, and we see that $kv \in \Gamma_n^*$.
     • Straightforwardly extend to the closure: if $v \in \overline{\Gamma}_n^*$, then for any positive integer $k$, $kv \in \overline{\Gamma}_n^*$.
     • By letting $X_1, X_2, \ldots, X_n$ take constant values with probability 1, we see that $0 \in \Gamma_n^*$.

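  21. Showing $\overline{\Gamma}_n^*$ is a convex cone (cont.)
     • Consider a non-negative combination $\sum_i \alpha_i v_i$, with $\alpha_i \ge 0$ and $v_i \in \overline{\Gamma}_n^*$.
     • Let $\alpha = \sum_i \alpha_i$; then by convexity
       $$\overline{\Gamma}_n^* \ni \sum_i \frac{\alpha_i}{\lceil \alpha \rceil} v_i + \left(1 - \frac{\alpha}{\lceil \alpha \rceil}\right) 0 = \sum_i \frac{\alpha_i}{\lceil \alpha \rceil} v_i.$$
     • So
       $$\overline{\Gamma}_n^* \ni \lceil \alpha \rceil \sum_i \frac{\alpha_i}{\lceil \alpha \rceil} v_i = \sum_i \alpha_i v_i. \qquad \square$$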

  22. Showing $\Gamma_n^* \subset \overline{\mathrm{conv}}(\Upsilon_n)$.
     Lemma. For any $h \in \Gamma_n^*$, there exists a sequence $\{f^{(r)}\}$ in $\Upsilon_n$ such that $\lim_{r \to \infty} (f^{(r)} / r) = h$.
     Proof.
     • First consider the special case where $|\mathcal{X}_i| < \infty$ for all $i \in N$ and the joint distribution of $X_1, X_2, \ldots, X_n$ is rational.
     • For any $\alpha \in \Omega$, let $Q_\alpha$ be the marginal distribution of $X_\alpha$.
     • Assume w.l.o.g. that for any $\alpha \in \Omega$ and $x \in \mathcal{X}_\alpha$, $Q_\alpha(x)$ is rational with denominator $q$.
     • Want: Construct a sequence $\{f^{(r)}\}$ in $\Upsilon_n$ such that $\lim_{r \to \infty} (f^{(r)} / r) = h$, where $h_\alpha = H(X_\alpha)$.

  23. Showing $\Gamma_n^* \subset \overline{\mathrm{conv}}(\Upsilon_n)$ (cont.)
     • For $r = q, 2q, 3q, \ldots$, fix an $n \times r$ matrix $A$,
       $$A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,r} \\ \vdots & \ddots & \vdots \\ a_{n,1} & \cdots & a_{n,r} \end{pmatrix},$$
       such that for all $x \in \mathcal{X}_N$, the number of columns in $A$ equal to $x$ is $r Q_N(x)$.
     • Denote by $A_\alpha$ the submatrix of $A$ obtained by extracting the rows of $A$ indexed by $\alpha$.
     • For all $x \in \mathcal{X}_\alpha$, the number of columns of $A_\alpha$ equal to $x$ is $r Q_\alpha(x)$ ($Q_\alpha$ is the marginal distribution of $X_i$, $i \in \alpha$).
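The matrix $A$ described above can be built explicitly for a small rational distribution. The sketch below reuses the two-variable PMF from the earlier entropy example (so $q = 6$) and takes $r = 12$; the helper names `build_matrix` and `column_counts` are hypothetical, and rows are indexed from 0 in the code rather than from 1.

```python
from collections import Counter
from fractions import Fraction

# Joint PMF Q_N keyed by (x1, x2); all probabilities have denominator q = 6.
sixth = Fraction(1, 6)
Q_N = {
    (1, 1): sixth, (2, 1): sixth,
    (2, 2): sixth, (3, 2): sixth,
    (1, 3): sixth, (3, 3): sixth,
}

def build_matrix(joint, r):
    """Return an n x r matrix (list of rows) in which each outcome x
    appears as a column exactly r * joint[x] times."""
    columns = []
    for x, p in sorted(joint.items()):
        count = p * r
        if count.denominator != 1:
            raise ValueError("r must be a multiple of the common denominator q")
        columns.extend([x] * int(count))
    n = len(columns[0])
    return [[col[i] for col in columns] for i in range(n)]

def column_counts(A, alpha):
    """Count the distinct columns of the submatrix A_alpha (rows listed in alpha, 0-based)."""
    return Counter(zip(*(A[i] for i in alpha)))

r = 12
A = build_matrix(Q_N, r)
print(column_counts(A, (0, 1)))  # each support point of Q_N appears r/6 times
print(column_counts(A, (0,)))    # each value of x1 appears r/3 times
```

Column order is arbitrary in the construction; sorting the outcomes here is only a convenience to make the output reproducible.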
