SLIDE 1

Information Theory

Don Fallis

SLIDE 2

Information in the Wild

SLIDE 3

Intentional Information Transfer

SLIDE 4

Data Storage

SLIDE 5

Measuring Information

SLIDE 6

Surprise!

SLIDE 7

Inversely Related to Probability

  • The lower the probability of event A, the more information you get by learning A.
  • The higher the probability of event A, the less information you get by learning A.
  • So, 1/p(A) is a plausible measure of the information you get by learning A.

SLIDE 8

Measuring Information

[Figure: sets of equally likely outcomes: {1, 2}, {1, 2, 3, 4}, and {1, 2, 3, 4, 5, 6, 7, 8}]

  • S(HEADS) = 1/p(HEADS) = 1/0.5 = 2
  • S(‘1’) = 1/p(‘1’) = 1/0.25 = 4
  • S(‘2’) = 1/p(‘2’) = 1/0.125 = 8
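
A quick numerical check of these values, as a minimal Python sketch (the probabilities are read straight off the slide):

```python
# Surprise as inverse probability: S(A) = 1 / p(A).
def surprise(p):
    return 1 / p

print(surprise(0.5))    # HEADS on a fair coin       -> 2.0
print(surprise(0.25))   # '1' among 4 equal outcomes -> 4.0
print(surprise(0.125))  # '2' among 8 equal outcomes -> 8.0
```
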
SLIDE 9

Measuring Information

[Figure: a 2-outcome experiment combined with a 4-outcome experiment yields 8 equally likely joint outcomes]

  • Surprises multiply for independent events: 2 * 4 = 8, so 2 + 4 ≠ 8
  • Taking logs restores additivity: log2(2) + log2(4) = 1 + 2 = 3 = log2(8)
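
A minimal check of that additivity:

```python
import math

# 1/p surprises multiply across independent events: 2 * 4 = 8.
# After taking log2, they add instead: bits are additive.
print(math.log2(2) + math.log2(4))  # 1.0 + 2.0 = 3.0
print(math.log2(2 * 4))             # log2(8)   = 3.0
```
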
SLIDE 10

Binary Search
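
Binary search makes the bit picture concrete: each comparison answers one yes/no question and halves the remaining candidates, so locating one item among n equally likely positions costs about log2(n) questions, i.e., log2(n) bits. A minimal illustrative sketch (the function and list are assumptions for the demo, not from the slides):

```python
import math

# Binary search over a sorted list: each comparison answers one yes/no
# question and halves the remaining candidates.
def binary_search(xs, target):
    lo, hi, questions = 0, len(xs) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        questions += 1
        if xs[mid] == target:
            return mid, questions
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, questions

xs = list(range(8))
print(binary_search(xs, 5))  # found within at most log2(8) = 3 comparisons
print(math.log2(len(xs)))    # 3.0 bits
```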

SLIDE 11

Surprise

  • Surprise of a Fair Coin coming up Heads
  • S(FC = HEADS) = log2( 1/(1/2) ) = log2(2) = 1 bit
  • Surprise of LLR being at the Left shrub at first time step
  • S(X1 = LEFT) = log2( 1/(1/3) ) = log2(3) = 1.58 bits
  • Surprise of a Fire Alarm going off
  • S(FA = ALARM) = log2( 1/(1/100) ) = log2(100) = 6.644 bits
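
The same calculations in code, as a minimal sketch (probabilities as given on the slide):

```python
import math

# Surprise in bits: S(A) = log2(1 / p(A)).
def surprise(p):
    return math.log2(1 / p)

print(surprise(1/2))    # fair coin HEADS      -> 1.0 bit
print(surprise(1/3))    # robot at Left shrub  -> 1.585 bits
print(surprise(1/100))  # fire alarm going off -> 6.644 bits
```
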
SLIDE 12

Bits versus Binary Digits

SLIDE 13

Entropy

  • Entropy is Average Surprise
  • Note that this is another example of an expected value.
  • Entropy of a Fair Coin
  • H(FC) = 1/2*log2(2) + 1/2*log2(2)
  • H(FC) = 1/2*1 + 1/2*1 = 1
  • Entropy of Robot Location at first time step
  • H(X1) = 1/3*log2(3) + 1/3*log2(3) + 1/3*log2(3)
  • H(X1) = 1/3*1.58 + 1/3*1.58 + 1/3*1.58 = 1.58
  • Entropy of a Fire Alarm
  • H(FA) = 0.01*log2(100) + 0.99*log2(1.01)
  • H(FA) = 0.01*6.644 + 0.99*0.014 = 0.081
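
A minimal sketch that reproduces these three entropies (using the convention that p = 0 outcomes contribute nothing):

```python
import math

# Entropy is average surprise: H = sum of p * log2(1/p) over outcomes,
# skipping p = 0 terms (they contribute 0 by convention).
def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

print(entropy([1/2, 1/2]))       # fair coin      -> 1.0
print(entropy([1/3, 1/3, 1/3]))  # robot location -> 1.585
print(entropy([0.01, 0.99]))     # fire alarm     -> 0.081
```
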
SLIDE 14

Uniform Maximizes Entropy
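
A quick numerical illustration of the claim (the skewed distributions are arbitrary examples, not from the slides): over three outcomes, entropy peaks at the uniform distribution and falls as probability concentrates.

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

print(entropy([1/3, 1/3, 1/3]))    # uniform: 1.585 bits, the maximum log2(3)
print(entropy([1/2, 1/4, 1/4]))    # mildly skewed: 1.5 bits
print(entropy([0.9, 0.05, 0.05]))  # heavily skewed: 0.569 bits
print(entropy([1.0, 0.0, 0.0]))    # certain outcome: 0.0 bits
```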

SLIDE 15

Amount of Information Transmitted

SLIDE 16

Noise

SLIDE 17

Information Channel

SLIDE 18

Binary Symmetric Channel

SLIDE 19

Probabilistic Graphical Model

𝚾_S      S0     S1
         q      1-q

𝛀_SR     R0     R1
S0       1-p    p
S1       p      1-p

  • the prior p(s)
  • the conditional p(r | s)
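
A minimal simulation of this channel (the parameter values q = 0.5 and p = 0.1 are arbitrary choices for the demo):

```python
import random

# Binary symmetric channel: the sender emits S0 with probability q
# (the prior p(s)), and the channel flips the symbol with probability p
# (the conditional p(r | s)).
def bsc_send(q, p):
    s = 0 if random.random() < q else 1
    r = s if random.random() > p else 1 - s
    return s, r

random.seed(0)
pairs = [bsc_send(q=0.5, p=0.1) for _ in range(10_000)]
flip_rate = sum(s != r for s, r in pairs) / len(pairs)
print(flip_rate)  # close to the flip probability p = 0.1
```
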
SLIDE 20

Mutual Information

SLIDE 21

Worst-Case Scenario (Independent)

SLIDE 22

Best-Case Scenario (Perfectly Correlated)

  • H(X) = H(Y) = MI(X & Y)
SLIDE 23

Everything In Between

  • MI(X & Y) = H(X) + H(Y) – H(X & Y)
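
This identity is easy to compute directly from a joint probability table. A minimal sketch, checked against the two extreme cases from the previous slides:

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

# MI(X & Y) = H(X) + H(Y) - H(X & Y), computed from the joint table.
def mutual_information(joint):
    p_x = [sum(row) for row in joint]         # marginal p(x)
    p_y = [sum(col) for col in zip(*joint)]   # marginal p(y)
    p_xy = [p for row in joint for p in row]  # joint p(x & y)
    return entropy(p_x) + entropy(p_y) - entropy(p_xy)

print(mutual_information([[1/4, 1/4], [1/4, 1/4]]))  # independent -> 0.0
print(mutual_information([[1/2, 0], [0, 1/2]]))      # correlated  -> 1.0
```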

SLIDE 24

Measuring Mutual Information

  • Mutual Information is Expected Reduction in Uncertainty
  • Note that this is another example of an expected value.
  • Suppose that you see a Yellow flash …
  • Your credences shift from (1/3, 1/3, 1/3) to (1/2, 1/2, 0)
  • The entropy of your credences shifts from 1.58 to 1
  • So, there is a reduction in entropy of 0.58
  • Suppose that you see a White flash …
  • Your credences shift from (1/3, 1/3, 1/3) to (0, 0, 1)
  • The entropy of your credences shifts from 1.58 to 0
  • So, there is a reduction in entropy of 1.58
  • Take a Weighted Average …
  • The probability of a Yellow flash is 2/3
  • The probability of a White flash is 1/3
  • So, the expected reduction in entropy is 2/3*0.58 + 1/3*1.58 = 0.92
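
The same expected-value calculation as a minimal sketch:

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

prior = [1/3, 1/3, 1/3]                  # credences before the flash
posterior = {"YELLOW": [1/2, 1/2, 0],    # credences after each flash
             "WHITE":  [0, 0, 1]}
p_flash = {"YELLOW": 2/3, "WHITE": 1/3}

# Mutual information as the expected reduction in uncertainty (entropy).
mi = sum(p_flash[f] * (entropy(prior) - entropy(posterior[f]))
         for f in posterior)
print(mi)  # about 0.92 bits
```
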
SLIDE 25

Firefly Entropy

H↓ E→     YELLOW   WHITE   total H
GOOD      1/3      0       1/3
BAD       1/3      0       1/3
UGLY      0        1/3     1/3
total E   2/3      1/3

  • H(H) = 1/3*log2(3) + 1/3*log2(3) + 1/3*log2(3)
  • H(H) = 1/3*1.58 + 1/3*1.58 + 1/3*1.58 = 1.58
  • H(E) = 2/3*log2(1.5) + 1/3*log2(3)
  • H(E) = 2/3*0.58 + 1/3*1.58 = 0.92
  • p(h & e), p(h), and p(e)
SLIDE 26

More Firefly Entropy

H↓ E→     YELLOW   WHITE   total H
GOOD      1/3      0       1/3
BAD       1/3      0       1/3
UGLY      0        1/3     1/3
total E   2/3      1/3

  • H(H&E) = 1/3*log2(3) + 0*log2(0) + 1/3*log2(3) + 0*log2(0) + 0*log2(0) + 1/3*log2(3)
  • H(H&E) = 1/3*1.58 + 0*(-∞) + 1/3*1.58 + 0*(-∞) + 0*(-∞) + 1/3*1.58 = 1.58
  • By convention, each 0*log2(0) term contributes 0, so the -∞ factors drop out.

  • p(h & e), p(h), and p(e)
SLIDE 27

Firefly Mutual Information

H↓ E→     YELLOW   WHITE   total H
GOOD      1/3      0       1/3
BAD       1/3      0       1/3
UGLY      0        1/3     1/3
total E   2/3      1/3

  • MI(H&E) = H(H) + H(E) – H(H&E)
  • MI(H&E) = 1.58 + 0.92 – 1.58 = 0.92
  • p(h & e), p(h), and p(e)
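
The identity route gives the same answer from the joint table above; a minimal check:

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

# Joint table p(h & e): rows GOOD, BAD, UGLY; columns YELLOW, WHITE.
joint = [[1/3, 0], [1/3, 0], [0, 1/3]]
H_H  = entropy([sum(row) for row in joint])        # 1.585
H_E  = entropy([sum(col) for col in zip(*joint)])  # 0.918
H_HE = entropy([p for row in joint for p in row])  # 1.585
print(H_H + H_E - H_HE)  # MI(H&E) -> about 0.92 bits
```
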
SLIDE 28

Robot Localization #1

X1↓ X2→   left   middle   right   total X1
left      1/12   1/4      0       1/3
middle    0      1/12     1/4     1/3
right     0      0        1/3     1/3
total X2  1/12   1/3      7/12

  • p(x1 & x2), p(x1), and p(x2)

  • H(X1) = 1/3*log2(3) + 1/3*log2(3) + 1/3*log2(3)
  • H(X1) = 1/3*1.58 + 1/3*1.58 + 1/3*1.58 = 1.58
  • H(X2) = 1/12*log2(12) + 1/3*log2(3) + 7/12*log2(1.71)
  • H(X2) = 1/12*3.58 + 1/3*1.58 + 7/12*0.78 = 1.28
SLIDE 29

Robot Localization #1

X1↓ X2→   left   middle   right   total X1
left      1/12   1/4      0       1/3
middle    0      1/12     1/4     1/3
right     0      0        1/3     1/3
total X2  1/12   1/3      7/12

  • p(x1 & x2), p(x1), and p(x2)

  • H(X1&X2) = 1/12*log2(12) + 1/4*log2(4) + 1/12*log2(12) + 1/4*log2(4) + 1/3*log2(3)
  • H(X1&X2) = 1/12*3.58 + 1/4*2 + 1/12*3.58 + 1/4*2 + 1/3*1.58 = 2.13

SLIDE 30

Robot Localization #1

X1↓ X2→   left   middle   right   total X1
left      1/12   1/4      0       1/3
middle    0      1/12     1/4     1/3
right     0      0        1/3     1/3
total X2  1/12   1/3      7/12

  • p(x1 & x2), p(x1), and p(x2)

  • MI(X1&X2) = H(X1) + H(X2) – H(X1&X2)
  • MI(X1&X2) = 1.58 + 1.28 – 2.13 = 0.74
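
The same computation from the joint table (zeros stand in for the blank cells):

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

# Joint table p(x1 & x2): rows X1, columns X2 (left, middle, right).
joint = [[1/12, 1/4, 0],
         [0, 1/12, 1/4],
         [0, 0, 1/3]]
H_X1   = entropy([sum(row) for row in joint])        # 1.585
H_X2   = entropy([sum(col) for col in zip(*joint)])  # 1.280
H_X1X2 = entropy([p for row in joint for p in row])  # 2.126
print(H_X1 + H_X2 - H_X1X2)  # MI(X1&X2) -> about 0.74 bits
```
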
SLIDE 31

Robot Localization #1

  • p(x1 & o1), p(x1), and p(o1)

  • H(X1) = 1/3*log2(3) + 1/3*log2(3) + 1/3*log2(3)
  • H(X1) = 1/3*1.58 + 1/3*1.58 + 1/3*1.58 = 1.58
  • H(O1) = 2/3*log2(1.5) + 1/3*log2(3)
  • H(O1) = 2/3*0.58 + 1/3*1.58 = 0.92

X1↓ O1→   hot   cold   total X1
left      1/3   0      1/3
middle    0     1/3    1/3
right     1/3   0      1/3
total O1  2/3   1/3

SLIDE 32

Robot Localization #1

  • H(X1&O1) = 1/3*log2(3) + 0*log2(0) + 0*log2(0) + 1/3*log2(3) + 1/3*log2(3) + 0*log2(0)
  • H(X1&O1) = 1/3*1.58 + 0*(-∞) + 0*(-∞) + 1/3*1.58 + 1/3*1.58 + 0*(-∞) = 1.58

  • p(x1 & o1), p(x1), and p(o1)

X1↓ O1→   hot   cold   total X1
left      1/3   0      1/3
middle    0     1/3    1/3
right     1/3   0      1/3
total O1  2/3   1/3

SLIDE 33

Robot Localization #1

X1↓ O1→   hot   cold   total X1
left      1/3   0      1/3
middle    0     1/3    1/3
right     1/3   0      1/3
total O1  2/3   1/3

  • MI(X1&O1) = H(X1) + H(O1) – H(X1&O1)
  • MI(X1&O1) = 1.58 + 0.92 – 1.58 = 0.92
  • p(x1 & o1), p(x1), and p(o1)
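
And the same check for the state/observation pair, again from the joint table:

```python
import math

def entropy(ps):
    return sum(p * math.log2(1 / p) for p in ps if p > 0)

# Joint table p(x1 & o1): rows X1 = left, middle, right; columns O1 = hot, cold.
joint = [[1/3, 0], [0, 1/3], [1/3, 0]]
H_X1   = entropy([sum(row) for row in joint])        # 1.585
H_O1   = entropy([sum(col) for col in zip(*joint)])  # 0.918
H_X1O1 = entropy([p for row in joint for p in row])  # 1.585
print(H_X1 + H_O1 - H_X1O1)  # MI(X1&O1) -> about 0.92 bits
```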