Introduction to Information



1. Introduction to Information by Erol Seke, for the course “Communications”, OSMANGAZI UNIVERSITY

2. The Goal

Transfer information from a source point to one or more destination points correctly (using the least amount of resources, in most cases).

Block diagram: Information Generator (source point) → Channel → Information User (destination point)

3. Information, Data and Signal

Chain: Information Generator → Data (a representation) → Signal (a representation) → to channel

Examples (info → data → signal):
idea → words → speech/voice
states → bits → electrical signals
voice → electrical signals → electrical signals

Several representation changes may occur before the signal is output to the channel, e.g. idea → words → speech/voice → electrical signals → bits → electrical signals. We are interested in the signals-to-signals and states-to-signals paths in this course.

4. Simple Example

Two states: A (day), represented by 0 and the signal V0(t); B (night), represented by 1 and the signal V1(t).

Fact 1: If it is always ‘night’, then nobody needs to share this information; that is, there is no information to share.
Fact 2: The information user must know what the signals mean (both sides must speak the same language/symbols/signals, etc.).

5. Simple Example

A sentence: "The sun will rise tomorrow". Meaning: the star that the earth revolves around will continue to exist, the earth will continue to spin, and no catastrophic event will occur to prevent that (probability = 1). The opposite of this event has probability 0. It turns out that there is no point in sharing this sentence, as it does not contain any information (unless the sentence has some epic meaning). For other meanings, of course, both sides must speak the same language. So, what is information?

6. Information, Data and Signal

Fact: In order for an event to count as information, its probability must lie within (0,1), excluding both ends. So, for the probability to be within (0,1), a complementary probability (the opposite of the event) must exist:
* so that the occurring event might change in the future,
* so that the representative data might change in the future,
* so that the representative signal might change in the future,
* so we cannot use constant/periodic signals; something in the signal must change in time.

[Figure: a constant signal V0(t) carries nothing; a signal that changes at some time T does.] Time is the most precious thing in the universe: if there is no time of event, there is no <put anything here>.

7. Information

"Stocks will drop 0.5% tomorrow": low information (happens every day).
"Stocks will drop 25% tomorrow": high information (rarely happens).

Self-information: I(E) = -log(P(E)). [Plot: I(E) falls from infinity toward 0 as P(E) goes from 0 to 1.]

Information is a unitless quantity, but in order to compare quantities we use the base of the logarithm as if it were a unit:
I(E) = -log2(P(E)) [bits] (information value in bits)
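A minimal sketch of this formula in Python (the function name and the example probabilities are illustrative, not from the slides):

```python
import math

def self_information(p):
    """Self-information I(E) = -log2(P(E)) in bits, for 0 < P(E) < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("P(E) must be strictly between 0 and 1")
    return -math.log2(p)

print(self_information(0.5))   # 1.0 bit: an everyday, fifty-fifty event
print(self_information(0.01))  # ~6.64 bits: a rare event carries much more information
```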

8. Example

Expected grades in the Communications course (approximately):
AA: 5%, BA: 10%, BB: 15%, CB: 20%, CC: 20%, DC: 15%, DD: 5%, FF: 10%

I_AA = -log2(P_AA) = -log2(0.05) ≈ 4.32 bits
I_CC = -log2(P_CC) = -log2(0.2) ≈ 2.32 bits

Meaning: when someone says "I got an AA", he/she actually transfers 4.32 bits worth of information to us.
Question: How much information does he/she transfer by telling all the grades?
Answer: SumOf(All_Info) = Info_Student1 + Info_Student2 + ... = Number_of_students × Average_Info_Per_Grade?
Question: What is the Average_Info_Per_Grade?
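As a quick check of the two figures above, a short sketch using the grade probabilities from the slide (the loop and variable names are illustrative):

```python
import math

grades = {"AA": 0.05, "BA": 0.10, "BB": 0.15, "CB": 0.20,
          "CC": 0.20, "DC": 0.15, "DD": 0.05, "FF": 0.10}

# Self-information of each grade announcement, in bits
for grade, p in grades.items():
    print(f"I_{grade} = {-math.log2(p):.2f} bits")  # e.g. I_AA ≈ 4.32, I_CC ≈ 2.32
```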

9. Average Information Per Source Output

The information generator emits symbols (like AA, BA, etc.). Since we know the probabilities, we can calculate the weighted average

I_avg = Σ p_n · I_n    (weighted average, sum over n = 1, ..., N_sym)

where N_sym is the number of possible grades (8 in our example). Substituting I_n = -log2(p_n):

I_avg = -Σ p_n · log2(p_n)

We give this a special name, the entropy of the source, which depends only on the symbol probabilities, and denote it H(z), where z = {p_n, n = 1, ..., N_sym} (in our example z = {0.05, 0.1, 0.15, 0.2, 0.2, 0.15, 0.05, 0.1}).
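A minimal sketch of the entropy calculation, applied to the grade distribution z from the example (the function name is illustrative):

```python
import math

def entropy(z):
    """H(z) = -sum(p_n * log2(p_n)) in bits per symbol."""
    return -sum(p * math.log2(p) for p in z)

z = [0.05, 0.1, 0.15, 0.2, 0.2, 0.15, 0.05, 0.1]
print(f"H(z) = {entropy(z):.2f} bits per grade")  # ≈ 2.85 bits per grade
```

This value is the Average_Info_Per_Grade asked about on the previous slide.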

10. Examples

We have 2 possible events, H and T, with equal probabilities (like a coin drop):

I_H = -log2(0.5) = 1 bit and I_T = -log2(0.5) = 1 bit
H(z) = I_avg = Σ p_n · I_n = 0.5·1 + 0.5·1 = 1 bit per symbol

H can be represented by binary 0 and T by binary 1 (coin drop → tell the truth in binary symbols: 0 = heads, 1 = tails).

Question: What if the coin is not a fair one (the probabilities are not equal)? Example: z = {0.25, 0.75}

I_H = -log2(0.25) = 2 bits, I_T = -log2(0.75) ≈ 0.415 bits

Oops, how do we use 0.415 bits?
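For the unfair coin, the same entropy formula shows why a plain 1-bit-per-flip code would be wasteful (a small sketch under the slide's z = {0.25, 0.75}):

```python
import math

z = [0.25, 0.75]
H = -sum(p * math.log2(p) for p in z)
print(f"H(z) = {H:.3f} bits/symbol")  # ≈ 0.811, less than the 1 bit a plain binary code spends per flip
```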

11. Examples

We have 8 possible symbols with equal probabilities of 0.125 each:

I_s = -log2(0.125) = 3 bits for each symbol (logical)

These can well be {000, 001, 010, 011, 100, 101, 110, 111} or {0, 1, 2, 3, 4, 5, 6, 7} or {a, b, c, d, e, f, g, h} or ...

The point is: the symbols do not need to be represented in binary (although their information can be measured in bits). However, we prefer binary since we use it all the time (in all digital systems). But that does not prevent us from creating symbols like "01011", which might conveniently be represented by the bit sequence 01011.

Question: What if the symbol information values are not integers?
Answer: No problem. It all depends on what we want to do with them or how we represent them.

12. Extensions

Extensions are constructed by putting symbols from a set side by side. Example: if A = {0, 1}, then the fixed-length B = {000, 001, 010, 011, 100, 101, 110, 111} is the 3rd extension of the binary alphabet A.

Why? To have more symbols, and thus more efficient representations.

The probabilities of the newly created symbols are u = {p_000, p_001, p_010, p_011, p_100, p_101, p_110, p_111}, where p_abc = p_a · p_b · p_c.

Example: for z = {0.25, 0.75}, p_011 = p_0 · p_1 · p_1 = 0.25 × 0.75 × 0.75 ≈ 0.14
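A minimal sketch of building the 3rd extension and its symbol probabilities (variable names are illustrative):

```python
from itertools import product
from math import prod

z = {"0": 0.25, "1": 0.75}  # probabilities of the original binary alphabet A

# 3rd extension: every 3-symbol combination, with probability p_abc = p_a * p_b * p_c
B = {"".join(syms): prod(z[s] for s in syms) for syms in product(z, repeat=3)}

print(B["011"])         # 0.25 * 0.75 * 0.75 ≈ 0.14
print(sum(B.values()))  # sanity check: the extended probabilities still sum to 1.0
```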

13. Extensions

Neither extensions nor the original alphabet need to have fixed-length codes.

Example alphabet constructed of fixed-length extensions of binary alphabet symbols:
B = {000, 001, 010, 011, 100, 101, 110, 111}

Example alphabets constructed of variable-length extensions of binary alphabet symbols:
C = {00, 01, 011, 1011, 101, 11001, 110, 111}
D = {0, 1, 10, 11, 100, 101, 110, 111}

We can have an infinite number of alphabets representing the same source symbol set.
Question: So, what are their differences, advantages, disadvantages, etc.?

14. Coding: Representations with Other Symbol Sets

Coding: representing symbols (or a sequence of symbols) from a symbol set with symbols (or a sequence of symbols) from another set, e.g. abc... → 123... (it is also good to have the reverse, 123... → abc...).

Symbol   Code-1   Code-2   Code-3   Code-4   Code-5   Code-6
s1       000      0        0        1        1        00
s2       001      1        01       01       10       01
s3       010      10       011      001      100      10
s4       011      11       0111     0001     1000     110
s5       100      100      01111    00001    10000    111

Code-1 is fixed length; Code-2 through Code-6 are variable-length codes.

Question: Why are we doing it?
Answer: For efficient representation.

15. Average Code Length

Symbol   p_i    Code-1   Code-2   Code-3   Code-4   Code-5   Code-6
s1       0.36   000      0        0        1        1        00
s2       0.18   001      1        01       01       10       01
s3       0.17   010      10       011      001      100      10
s4       0.16   011      11       0111     0001     1000     110
s5       0.13   100      100      01111    00001    10000    111

L_avg = Σ p_n · l_n = 3 bits for Code-1    (sum over n = 1, ..., N_sym)
L_avg = Σ p_n · l_n = 2.29 bits for Code-6

So, using Code-6 is better. Why not use Code-2 then? It looks like it would result in an even shorter average code length. Because Code-2 is not uniquely decodable when symbols are transferred consecutively.
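A small sketch of the L_avg computation for Code-1 and Code-6 (lists as read from the table above; names are illustrative):

```python
p = [0.36, 0.18, 0.17, 0.16, 0.13]
code1 = ["000", "001", "010", "011", "100"]  # fixed length
code6 = ["00", "01", "10", "110", "111"]     # variable length

def avg_code_length(probs, code):
    """L_avg = sum(p_n * l_n), where l_n is the codeword length in bits."""
    return sum(pn * len(cw) for pn, cw in zip(probs, code))

print(avg_code_length(p, code1))  # 3.0 bits
print(avg_code_length(p, code6))  # ≈ 2.29 bits
```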

16. Unique Decodability

Let us have an information source generating symbols from the alphabet A = {s1, s2, s3, s4, s5} with the probabilities u = {0.36, 0.18, 0.17, 0.16, 0.13}.

Assume that the source has generated the sequence s1 s2 s3 s1 s1 s5 s4. Coding the symbols with Code-2, we would have: 0, 1, 10, 0, 0, 100, 11, or the binary sequence 01100010011.

We would like to decode the sequence 01100010011 back to s1 s2 s3 s1 s1 s5 s4. Remembering that we do not have symbol separators, we see that it is impossible to decode it back to the original. So, Code-2 is not uniquely decodable (which means it is nearly useless).
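The failure can be seen with a tiny encoder: under Code-2, two different messages can produce the very same bit stream (a sketch; the symbol names follow the slides):

```python
code2 = {"s1": "0", "s2": "1", "s3": "10", "s4": "11", "s5": "100"}

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

print(encode(["s2", "s1"], code2))  # "10"
print(encode(["s3"], code2))        # "10"  -- same bits, different message
```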

17. Unique Decodability

How about using Code-6 on the same source? The sequence is s1 s2 s3 s1 s1 s5 s4.
Code-6 coder output: 00, 01, 10, 00, 00, 111, 110
Binary sequence without separators: 0001100000111110

On the receiver side we would like to decode the sequence 0001100000111110 back:
0: not in table, take another bit from the stream. Remaining: 01100000111110
00: in table, so output s1
0: not in table, take another bit from the stream. Remaining: 100000111110
01: in table, so output s2
1: not in table, take another bit from the stream. Remaining: 0000111110
10: in table, so output s3
... and so on, up to the end of the stream.

Therefore, Code-6 is uniquely decodable although its codewords are variable length.
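The bit-by-bit procedure above is essentially prefix decoding; a minimal sketch (table and stream from the slide, function name illustrative):

```python
code6 = {"00": "s1", "01": "s2", "10": "s3", "110": "s4", "111": "s5"}

def decode(bits, table):
    symbols, candidate = [], ""
    for b in bits:
        candidate += b
        if candidate in table:            # codeword complete: emit and restart
            symbols.append(table[candidate])
            candidate = ""
    if candidate:
        raise ValueError("stream ended in the middle of a codeword")
    return symbols

print(decode("0001100000111110", code6))
# ['s1', 's2', 's3', 's1', 's1', 's5', 's4']
```

This greedy match works because no Code-6 codeword is a prefix of another; with Code-2 the same loop would emit s1 or s2 before a longer codeword like 10 or 100 could complete.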
