Kolmogorov Complexity and Other Entropy Measures

Application of Information Theory, Lecture 8
Iftach Haitner, Tel Aviv University
December 16, 2014


1. Kolmogorov complexity

- For a string x ∈ {0,1}*, let K(x) be the length of the shortest C++ program (written in binary) that outputs x (on empty input).
- Now the term "described" is well defined.
- Why C++? All (complete) programming languages/computational models are essentially equivalent: if K′(x) is the description length of x in another complete language, then |K(x) − K′(x)| ≤ const.
- What is K(x) for x = 0101...01 (n pairs)?
  - "for i = 1 to n: print 01" (a sketch follows the slide)
  - K(x) ≤ log n + const
  - This is considered small complexity; we typically ignore log n factors.
- What is K(x) for x being the first n digits of π?
  - K(x) ≤ log n + const: a fixed-size program computes π to any precision, and writing down n takes log n bits.
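
A minimal sketch of the witness program for the alternating string, assuming the slide's C++ convention. Only the decimal constant n changes with the target string; everything else is a fixed-size wrapper, so the source length is log n + const (n = 1000 is a hypothetical choice for illustration):

    #include <cstdio>

    // Prints "01" n times. The only part of the source that depends on the
    // target string is the constant n, written in O(log n) digits, hence
    // K(0101...01) <= log n + const.
    int main() {
        const long n = 1000;  // hypothetical n; takes log n digits to write
        for (long i = 0; i < n; ++i)
            std::printf("01");
        return 0;
    }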

2. More examples

- What is K(x) for x ∈ {0,1}^n with k ones?
- Recall that (n choose k) ≤ 2^{n·h(k/n)}.
- Hence K(x) ≤ log n + n·h(k/n): describe k (log n bits) together with the index of x among the (n choose k) strings with exactly k ones (at most n·h(k/n) bits; a ranking sketch follows the slide).
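
A sketch of the encoding behind the bound, assuming nothing beyond the slide: map x to its lexicographic rank among the length-n strings with the same number of ones (the combinatorial number system). Storing (k, rank) takes about log n + log(n choose k) ≤ log n + n·h(k/n) bits, and a fixed-size unranking loop recovers x:

    #include <cstdint>
    #include <cstdio>
    #include <string>

    // binom(n, k) via the multiplicative formula; each intermediate division
    // is exact, and n <= 60 fits in 64 bits.
    uint64_t binom(unsigned n, unsigned k) {
        if (k > n) return 0;
        uint64_t r = 1;
        for (unsigned i = 1; i <= k; ++i)
            r = r * (n - k + i) / i;
        return r;
    }

    // Lexicographic rank of x among length-n strings with the same number of
    // ones: for each '1' at position i, all strings that agree so far but have
    // '0' there come first -- binom(n-i-1, k) of them, with k ones still unplaced.
    uint64_t rank(const std::string& x) {
        unsigned n = x.size(), k = 0;
        for (char c : x) k += (c == '1');
        uint64_t r = 0;
        for (unsigned i = 0; i < n; ++i)
            if (x[i] == '1') {
                r += binom(n - i - 1, k);
                --k;
            }
        return r;
    }

    int main() {
        std::string x = "0011010001";  // example 10-bit string with k = 4 ones
        std::printf("rank = %llu of %llu\n",
                    (unsigned long long)rank(x),
                    (unsigned long long)binom(10, 4));
        return 0;
    }

Decoding inverts the loop: walk the positions, compare the remaining rank against binom(n−i−1, k), and emit '0' or '1' accordingly.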

3. Bounds

- K(x) ≤ |x| + const. Proof: "output x".
- Most sequences have high Kolmogorov complexity:
  - There are fewer than 2^{n−1} (C++) programs of length ≤ n − 2,
  - but 2^n strings of length n.
  - Hence, at least half of the n-bit strings have Kolmogorov complexity at least n − 1 (the counting is written out below).
- In particular, a random sequence has Kolmogorov complexity ≈ n.
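
The counting behind the last two bullets, written out:

    \[
    \#\{\text{programs of length} \le n-2\} \;=\; \sum_{i=0}^{n-2} 2^{i} \;=\; 2^{n-1}-1 \;<\; 2^{n-1},
    \]
    so at most $2^{n-1}-1$ of the $2^n$ strings of length $n$ have a description of
    length at most $n-2$. Hence more than $2^n - 2^{n-1} = 2^{n-1}$ of them, i.e.
    more than half, satisfy $K(x) \ge n-1$.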

4. Conditional Kolmogorov complexity

- K(x|y) — the Kolmogorov complexity of x given y: the length of the shortest program that outputs x on input y.
- Chain rule: K(x,y) ≈ K(y) + K(x|y) (the easy direction is sketched below).
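
A sketch of the easy (≤) direction of the chain rule; the slide's ≈ hides logarithmic terms like the one below:

    \[
    K(x, y) \;\le\; K(y) + K(x \mid y) + O(\log K(x, y)).
    \]
    % Concatenate a shortest program p_y for y with a shortest program p_{x|y}
    % for x given y: run p_y to get y, feed it to p_{x|y}, and print the pair.
    % The O(log) term pays for delimiting p_y from p_{x|y}, e.g. by prefixing
    % |p_y| in a self-delimiting encoding.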

5. H vs. K

H(X) speaks about a random variable X and K(x) about a string x, but:

- Both quantities measure the amount of uncertainty, or randomness, in an object.
- Both measure the number of bits it takes to describe an object.
- Another property: let X_1, ..., X_n be i.i.d.; then whp K(X_1, ..., X_n) ≈ H(X_1, ..., X_n) = n·H(X_1).
- Proof: AEP — whp the sample falls in the typical set, which has ≈ 2^{n·H(X_1)} elements, so its index takes ≈ n·H(X_1) bits.
- Example: for coin flips with probabilities (0.7, 0.3), whp we get a string with K(x) ≈ n·h(0.3) (illustrated below).
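
A quick illustration of the (0.7, 0.3) example, using the earlier bound K(x) ≤ log n + n·h(k/n) as a computable proxy (K itself is uncomputable; this only shows the upper bound concentrating at n·h(0.3)):

    #include <cmath>
    #include <cstdio>
    #include <random>

    // Binary entropy in bits.
    double h(double p) {
        if (p <= 0.0 || p >= 1.0) return 0.0;
        return -p * std::log2(p) - (1.0 - p) * std::log2(1.0 - p);
    }

    int main() {
        const int n = 100000;
        std::mt19937 gen(42);                   // fixed seed for reproducibility
        std::bernoulli_distribution coin(0.3);  // Pr[bit = 1] = 0.3

        int k = 0;
        for (int i = 0; i < n; ++i) k += coin(gen);

        // whp k/n ~ 0.3, so the bound log n + n*h(k/n) is ~ n*h(0.3).
        double bound = std::log2((double)n) + n * h((double)k / n);
        std::printf("k/n = %.4f\n", (double)k / n);
        std::printf("K(x) <= %.0f bits; n*h(0.3) = %.0f bits (n = %d)\n",
                    bound, n * h(0.3), n);
        return 0;
    }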

6. Universal compression

- A program of length K(x) that outputs x compresses x into K(x) bits of information.
- Example: the length of the human genome is 6·10^9 bits.
- But the code is redundant.
- The relevant measure of the number of possible values is the Kolmogorov complexity of the code.
- No one knows its value...

7. Universal probability

K(x) = min_{p : p() = x} |p|, where p() is the output of the C++ program described by p.

Definition 1
The universal probability of a string x is
P_U(x) = Σ_{p : p() = x} 2^{-|p|} = Pr_{p ← {0,1}^∞}[p() = x]

- Namely, the probability that a program picked at random prints x.
- Insensitive (up to a constant factor) to the computational model.
- Interpretation: P_U(x) is the probability that you observe x in nature.
- Computer as an intelligence amplifier.

Theorem 2
∃ c > 0 such that 2^{-K(x)} ≤ P_U(x) ≤ c·2^{-K(x)} for every x ∈ {0,1}*.

- The interesting part is P_U(x) ≤ c·2^{-K(x)}; the lower bound is immediate, since the shortest program for x alone contributes 2^{-K(x)} to the sum.
- Hence, for X ∼ P_U, it holds that |E[K(X)] − H(X)| ≤ c (a short derivation follows).
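
Why Theorem 2 implies the last bullet: take logarithms in the two-sided bound and average over X ∼ P_U (the constant is log c, which is ≤ c since c ≥ 1):

    \[
    2^{-K(x)} \le P_U(x) \le c \cdot 2^{-K(x)}
    \;\Longrightarrow\;
    K(x) - \log c \;\le\; \log \frac{1}{P_U(x)} \;\le\; K(x),
    \]
    and since $H(X) = \mathbb{E}\bigl[\log \tfrac{1}{P_U(X)}\bigr]$ for $X \sim P_U$,
    \[
    \bigl|\,\mathbb{E}[K(X)] - H(X)\,\bigr| \;\le\; \log c .
    \]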

8. Proving Theorem 2

- We need to find c > 0 such that K(x) ≤ log(1/P_U(x)) + c for every x ∈ {0,1}*.
- In other words, find a program that outputs x whose length is log(1/P_U(x)) + c.
- Idea: the program names a leaf of the Shannon code for P_U (in which x sits at depth ⌈log(1/P_U(x))⌉).
- Problem: P_U is not computable.
- Solution: compute better and better estimates of the tree for P_U, along with the "mapping" from the tree nodes back to codewords.
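
For intuition, the computable case that this idea mimics (a standard fact; the constant c_P absorbs a fixed decoder for P):

    \[
    P \text{ computable} \;\Longrightarrow\;
    K(x) \;\le\; \left\lceil \log \frac{1}{P(x)} \right\rceil + c_P
    \quad \text{for all } x,
    \]
    % since x sits at depth ceil(log 1/P(x)) in the Shannon code tree for P,
    % and the program "decode this codeword under P" consists of the codeword
    % plus a fixed-size decoder.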

9. Proving Theorem 2 (cont.)

- Initialize T to be the infinite binary tree.

Program 3 (M)
Enumerate over all programs in {0,1}*: at round i, emulate the first i programs (one after the other), for i steps each, and do: if a program p outputs a string x and (∗, x, n(x)) ∉ T, place (p, x, n(x)) at an unused depth-n(x) node of T, for

    n(x) = ⌈log(1/P̂_U(x))⌉ + 1,  where  P̂_U(x) = Σ_{p′ : emulated p′ has output x} 2^{-|p′|}.

- The program never gets stuck (it can always add the node).
  Proof: fix x ∈ {0,1}*. The depths assigned to x are distinct, and since P̂_U(x) ≤ P_U(x), each satisfies 2^{-n(x)} ≤ P_U(x)/2. Hence, at each point during the execution of M, Σ_{(p,x,·) ∈ T} 2^{-n(x)} ≤ P_U(x). Since Σ_x P_U(x) ≤ 1, the proof follows from the Kraft inequality.
- ∀x ∈ {0,1}*: M eventually adds a node (·, x, ·) to T at depth ⌈log(1/P_U(x))⌉ + 2.
  Proof: P̂_U(x) converges to P_U(x), so eventually P̂_U(x) > P_U(x)/2.
- For x ∈ {0,1}*, let ℓ(x) be the location of its (⌈log(1/P_U(x))⌉ + 2)-depth node.
- Program for printing x: run M until it assigns the node at location ℓ(x), then output x. Its length is |ℓ(x)| + const ≤ log(1/P_U(x)) + const, as required (a structural sketch in C++ follows).
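
A structural sketch of Program M, with two stand-ins labeled as assumptions: emulate() interprets bit-strings under a toy machine (the proof assumes a real universal one), and the tree T is flattened to a set of (depth, index) slots (a real implementation must also keep the used nodes prefix-free; elided here). The dovetailing and the bookkeeping of P̂_U and n(x) follow the slide:

    #include <cmath>
    #include <cstdio>
    #include <map>
    #include <optional>
    #include <set>
    #include <string>

    // Toy machine (assumption, not a real universal one): a program starting
    // with '1' halts within |p| steps and outputs its remaining bits.
    std::optional<std::string> emulate(const std::string& p, int steps) {
        if (!p.empty() && p[0] == '1' && steps >= (int)p.size())
            return p.substr(1);
        return std::nullopt;
    }

    // j-th binary string in length-lexicographic order: "", "0", "1", "00", ...
    std::string nth_program(long long j) {
        int len = 0;
        long long count = 1;
        while (j >= count) { j -= count; ++len; count <<= 1; }
        std::string s(len, '0');
        for (int b = len - 1; b >= 0; --b, j >>= 1)
            s[b] = (j & 1) ? '1' : '0';
        return s;
    }

    int main() {
        std::map<std::string, double> p_hat;                 // \hat{P}_U(x) so far
        std::map<std::string, std::set<int>> depths;         // depths holding x in T
        std::set<std::pair<int, long long>> used;            // occupied (depth, index)
        std::set<std::pair<std::string, std::string>> seen;  // credited (p, x) pairs

        for (int i = 1; i <= 64; ++i) {          // M runs forever; truncated here
            for (long long j = 0; j < i; ++j) {  // round i: first i programs, i steps
                std::string p = nth_program(j);
                auto out = emulate(p, i);
                if (!out) continue;
                if (!seen.insert({p, *out}).second) continue;  // p already credited
                p_hat[*out] += std::pow(2.0, -(double)p.size());
                int n = (int)std::ceil(std::log2(1.0 / p_hat[*out])) + 1;
                if (depths[*out].count(n)) continue;  // (*, x, n(x)) already in T
                long long idx = 0;                    // Kraft: a free slot exists
                while (used.count({n, idx})) ++idx;
                used.insert({n, idx});
                depths[*out].insert(n);               // (n, idx) is x's new node
            }
        }
        for (auto& [x, q] : p_hat)                    // inspect the estimates
            std::printf("x = \"%s\"  p_hat = %g\n", x.c_str(), q);
        return 0;
    }

Under the toy machine every x has exactly one program ('1' followed by x), so p_hat(x) settles at 2^{-(|x|+1)} and x's node sits at depth |x| + 2; with a real universal machine the same loop yields the tree used in the proof.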

10. Applications

- (Another) proof that there are infinitely many primes:
  - Assume there are finitely many primes p_1, ..., p_m.
  - Any length-n integer x can be written as x = Π_{i=1}^m p_i^{d_i}.
  - Each d_i ≤ n, hence writing each d_i takes ≤ log n bits.
  - Hence, K(x) ≤ m·log n + const.
  - But for most numbers, K(x) ≥ n − 1 — a contradiction for large n (made concrete below).
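
Making the contradiction concrete (hypothetical numbers: m = 25, as if the primes below 100 were all of them, and a generous 100-bit constant for the decoder): m·log₂ n + const grows far more slowly than n − 1.

    #include <cmath>
    #include <cstdio>

    int main() {
        const int m = 25;   // hypothetical: suppose the 25 primes below 100 were all
        const int c = 100;  // generous constant for the fixed decoder program
        for (int n = 64; n <= 4096; n *= 4) {
            double desc = m * std::log2((double)n) + c;  // K(x) <= m log n + const
            std::printf("n = %4d: m*log2(n) + c = %6.1f vs n - 1 = %4d%s\n",
                        n, desc, n - 1,
                        desc < n - 1 ? "   <- contradicts K(x) >= n-1 for most x" : "");
        }
        return 0;
    }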

11. Computability of K

- Can we compute K(x)? Answer: no.
- Proof: assume K is computable by a program of length C.
- Let s be the smallest positive integer s.t. K(s) > 2C + 10,000.
- s can be computed by the following program (sketched below):
  1. x = 0
  2. While (K(x) ≤ 2C + 10,000): x++
  3. Output x
- Thus K(s) < C + log C + log 10,000 + const < 2C + 10,000 — a contradiction.
- Berry's paradox, revisited: "the smallest positive integer that cannot be described in fewer than twenty words" is itself a description in fewer than twenty words.
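
The proof's impossible program, written out as a C++ sketch. The declaration of K is the assumption (a hypothetical C-bit decider); the whole point of the argument is that it cannot be implemented, so this sketch deliberately does not link:

    #include <cstdio>

    // Hypothetical: the assumed C-bit program computing Kolmogorov complexity.
    // No implementation can exist -- that is what the contradiction shows.
    unsigned K(unsigned long long x);

    int main() {
        const unsigned C = 1234;         // hypothetical length of K's code
        const unsigned bound = 2 * C + 10000;
        unsigned long long x = 0;
        while (K(x) <= bound)            // smallest s with K(s) > 2C + 10,000
            ++x;
        std::printf("%llu\n", x);        // yet this source describes s in
        return 0;                        // ~ C + log C + const < 2C + 10,000 bits
    }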
