SLIDE 1

Applied Information Theory

Daniel Bosk

Department of Information and Communication Systems, Mid Sweden University, Sundsvall.

14th March 2019

SLIDE 2

1 Introduction

History

2 Shannon entropy

Definition of Shannon entropy
Properties of Shannon entropy
Conditional entropy
Information density and redundancy
Information gain

3 Application in security

Passwords
Research about human-chosen passwords
Identifying information

SLIDE 4

History

The field was created in 1948 by Shannon's paper 'A Mathematical Theory of Communication' [Sha48]. There he starts using the term 'entropy' as a measure of information.

In physics, entropy measures the disorder of molecules; Shannon's entropy measures the disorder of information.

He used this theory to analyse communication.

What are the theoretical limits of different channels? How much redundancy is needed for a given level of noise?

SLIDE 7

History

This theory is interesting at the physical layer of networking. It is also interesting for security:

the field of information-theoretic security,
the 'efficiency' of passwords,
measuring identifiability,
. . .

SLIDE 10

Definition of Shannon entropy

Definition (Shannon entropy)
A stochastic variable X assumes values from X. The Shannon entropy H(X) is defined as

H(X) = −K ∑_{x∈X} Pr(X = x) log Pr(X = x).

Usually K = 1/log 2, to give the entropy in the unit bits (bit).
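A small sketch of this definition in Python (not from the slides; the function name is my own, and using log base 2 directly corresponds to the choice K = 1/log 2, i.e. entropy in bits):

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution.

    `probabilities` is an iterable of Pr(X = x) over all x; zero
    probabilities contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Fair coin: two outcomes with probability 1/2 each -> 1 bit.
print(shannon_entropy([0.5, 0.5]))   # 1.0
# Fair six-sided die -> log2(6) ≈ 2.585 bits.
print(shannon_entropy([1/6] * 6))    # 2.5849...
```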

SLIDE 11

Definition of Shannon entropy

Shannon entropy can be seen as . . .
. . . how much choice there is in each event.
. . . the uncertainty of each event.
. . . how many bits it takes to store each event.
. . . how much information it produces.

SLIDE 12

Definition of Shannon entropy

Example (Toss of a coin)
A stochastic variable S takes values from S = {h, t}. We have Pr(S = h) = Pr(S = t) = 1/2.

This gives H(S) as follows:

H(S) = −(Pr(S = h) log Pr(S = h) + Pr(S = t) log Pr(S = t)) = −2 × (1/2) log(1/2) = log 2 = 1.

SLIDE 13

Definition of Shannon entropy

Example (Roll of a die)
A stochastic variable D takes values from D = {1, 2, 3, 4, 5, 6} (the six faces of the die). We have Pr(D = d) = 1/6 for all d ∈ D.

The entropy H(D) is as follows:

H(D) = −∑_{d∈D} Pr(D = d) log Pr(D = d) = −6 × (1/6) log(1/6) = log 6 ≈ 2.585.

SLIDE 14

Definition of Shannon entropy

Remark
If we didn't know already, we now know that a roll of a die . . .

contains more 'choice' than a coin toss.
is more uncertain to predict than a coin toss.
requires more bits to store than a coin toss.
produces more information than a coin toss.

What if we modify the die a bit?

SLIDE 15

Definition of Shannon entropy

Example (Roll of a modified die)
A stochastic variable D′ takes values from D. We now have Pr(D′ = 6) = 9/10 and Pr(D′ = d) = (1/10) × (1/5) = 1/50 for every d ≠ 6.

This yields

H(D′) = −((9/10) log(9/10) + ∑_{d≠6} (1/50) log(1/50)) = −(9/10) log(9/10) − 5 × (1/50) log(1/50) = −(9/10) log(9/10) − (1/10) log(1/50) ≈ 0.701.

Note that the log function is the logarithm in base 2 (i.e. log2).
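A quick numeric check of this example (my own sketch, not part of the slides):

```python
import math

# Modified die: Pr(D' = 6) = 9/10, the other five faces share the
# remaining 1/10 equally, i.e. 1/50 each.
probs = [9/10] + [1/50] * 5

entropy = -sum(p * math.log2(p) for p in probs)
print(entropy)  # ≈ 0.7012 bits, less than the 1 bit of a fair coin
```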

SLIDE 16

Definition of Shannon entropy

Remark
This die is much easier to predict. It produces much less information (less than a coin toss!) and requires less data to store, etc.

SLIDE 17

Properties of Shannon entropy

Definition
A function f : R → R such that

t f(x) + (1 − t) f(y) ≤ f(tx + (1 − t)y) for all x, y and all 0 ≤ t ≤ 1

is called concave. With strict inequality for x ≠ y (and 0 < t < 1) we say that f is strictly concave.

Example
log : (0, ∞) → R is strictly concave.

SLIDE 18

Properties of Shannon entropy

[Figure: plot of log x, illustrating that the logarithm is concave.]

SLIDE 19

Properties of Shannon entropy

Theorem (Jensen's inequality)
Let f : R → R be a strictly concave function and let a1, a2, . . . , an > 0 be real numbers such that ∑_{i=1}^{n} ai = 1. Then we have

∑_{i=1}^{n} ai f(xi) ≤ f(∑_{i=1}^{n} ai xi).

We have equality iff x1 = x2 = · · · = xn.
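A small numeric illustration of Jensen's inequality for the (strictly concave) base-2 logarithm; the weights and points are arbitrary choices of mine:

```python
import math

# Weights a_i > 0 summing to 1, and some points x_i > 0.
a = [0.2, 0.3, 0.5]
x = [1.0, 4.0, 8.0]

lhs = sum(ai * math.log2(xi) for ai, xi in zip(a, x))   # sum a_i f(x_i)
rhs = math.log2(sum(ai * xi for ai, xi in zip(a, x)))   # f(sum a_i x_i)

print(lhs, rhs)   # lhs < rhs, since the x_i are not all equal
assert lhs <= rhs
```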

SLIDE 20

Properties of Shannon entropy

Theorem
Let X be a stochastic variable with probability distribution p1, p2, . . . , pn, where pi > 0 for 1 ≤ i ≤ n. Then H(X) ≤ log n, with equality iff p1 = p2 = · · · = pn = 1/n.

SLIDE 21

Properties of Shannon entropy

Proof.
The theorem follows directly from Jensen's inequality:

H(X) = −∑_{i=1}^{n} pi log pi = ∑_{i=1}^{n} pi log(1/pi) ≤ log ∑_{i=1}^{n} pi × (1/pi) = log n,

with equality iff p1 = p2 = · · · = pn. Q.E.D.

SLIDE 22

Properties of Shannon entropy

Corollary
H(X) = 0 iff Pr(X = x) = 1 for some x ∈ X and Pr(X = x′) = 0 for all x′ ≠ x in X.

Proof.
If Pr(X = x) = 1, then the distribution has effectively n = 1 outcome and thus H(X) = log 1 = 0. Conversely, if H(X) = 0, then every term −pi log pi must be 0, so each pi is either 0 or 1; since they sum to 1, exactly one pi = 1. Q.E.D.

SLIDE 23

Properties of Shannon entropy

Lemma
Let X and Y be stochastic variables. Then we have H(X, Y) ≤ H(X) + H(Y), with equality iff X and Y are independent.

SLIDE 24

Conditional entropy

Definition (Conditional entropy)
Define the conditional entropy H(Y | X) as

H(Y | X) = −∑_{x} ∑_{y} Pr(X = x) Pr(Y = y | X = x) log Pr(Y = y | X = x).

Remark
This is the uncertainty in Y which is not revealed by X.
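A sketch in Python of this definition and of the chain rule H(X, Y) = H(X) + H(Y | X) from the following slides; the joint distribution is a made-up example of mine:

```python
import math

def H(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A made-up joint distribution Pr(X = x, Y = y) over x in {0, 1}, y in {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginal distribution of X.
p_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}

# H(Y | X) = -sum_x sum_y Pr(X=x) Pr(Y=y|X=x) log Pr(Y=y|X=x),
# where Pr(X=x) Pr(Y=y|X=x) equals the joint probability Pr(x, y).
H_Y_given_X = -sum(
    p * math.log2(p / p_x[x])
    for (x, _), p in joint.items() if p > 0
)

print(H(joint.values()))              # H(X, Y)
print(H(p_x.values()) + H_Y_given_X)  # H(X) + H(Y | X), the same value
```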

SLIDE 26

Conditional entropy

Theorem
H(X, Y) = H(X) + H(Y | X).

[Figure: a bar representing H(X, Y), split into the parts H(X) and H(Y | X).]

SLIDE 27

Conditional entropy

Corollary
H(X | Y) ≤ H(X).

Corollary
H(X | Y) = H(X) iff X and Y are independent.

SLIDE 28

Information density and redundancy

Definition
Let L be a natural language with alphabet P_L, and let P^n_L be a stochastic variable over strings of length n from L. The entropy of L is defined as

H_L = lim_{n→∞} H(P^n_L)/n.

The redundancy of L is

R_L = 1 − H_L/log |P_L|.

SLIDE 29

Information density and redundancy

Remark
This means we have H_L bits of information per character in L.

Example ([Sha48])
English has an entropy of 1–1.5 bits per character, giving a redundancy of approximately 1 − 1.25/log 26 ≈ 0.73.
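The redundancy figure can be reproduced in a couple of lines (a sketch of mine; the 1.25 bits/character estimate is the one quoted above):

```python
import math

H_L = 1.25          # estimated bits per character of English
alphabet_size = 26  # a-z, ignoring case, spaces and punctuation

redundancy = 1 - H_L / math.log2(alphabet_size)
print(redundancy)   # ≈ 0.734
```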

SLIDE 30

Information density and redundancy

Example ([Sha48])
Two-dimensional crossword puzzles require a redundancy of approximately 0.5.

Example
The redundancy of 'SMS language' is lower than that of ordinary written language. Compare Swedish 'också' with its SMS form 'oxå' (both mean 'also').

Remark
Lower redundancy is more space-efficient, but it incurs more errors.

SLIDE 31

Information gain

Definition
Let U be the set of possible outcomes and denote the probability of outcome u ∈ U by p_u. Suppose we learn that some unknown outcome is in A ⊂ U. Then the information gain G(A | U) is defined as

G(A | U) = log(1/Pr(A)) = −log Pr(A), where Pr(A) = ∑_{u∈A} p_u.
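A small sketch of this definition in Python (the function name is my own; the die example anticipates the next slide):

```python
import math

def information_gain(p_outcomes, A):
    """Information gain -log2 Pr(A), where Pr(A) sums the probabilities
    of the outcomes in the subset A."""
    pr_A = sum(p_outcomes[u] for u in A)
    return -math.log2(pr_A)

# Fair die, and we learn that the outcome is even.
die = {u: 1/6 for u in range(1, 7)}
print(information_gain(die, {2, 4, 6}))   # 1.0 bit gained
```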

SLIDE 32

Information gain

Example (Roll of a die again)
Someone rolls a die and we should guess the result; we have a 1/6 chance.

If we learn that the result was an even number, we gain

−log(1/6 + 1/6 + 1/6) = −log(3/6) = log(6/3) = log 2 = 1 bit.

The remaining uncertainty is 1.58 bits.

Remark
With X′ = {2, 4, 6} we have

H(X′) = −∑_{x∈X′} Pr(X′ = x) log Pr(X′ = x) = −3 × (1/3) log(1/3) = log 3 ≈ 1.58.

SLIDE 34

Information gain

Example (The die yet again)
Suppose we instead learn that the die shows less than five, i.e. neither 5 nor 6.

This yields a gain of

−log(4 × 1/6) = log(6/4) ≈ 0.58 bits.

SLIDE 36

Passwords

Idea [Kom+11]
Look at different aspects of passwords individually, then sum up. We can use

H(x1, x2, . . . , xn) ≤ H(x1) + H(x2) + · · · + H(xn).

This allows us to reason about bounds.
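A sketch of how such per-aspect estimates could be combined (my own illustration; the aspect names and bit values are hypothetical, not taken from [Kom+11]):

```python
# Hypothetical per-aspect entropy estimates (in bits) for a password
# distribution; the aspects are not independent, so their sum is only
# an upper bound on the entropy of the whole password.
aspect_entropy = {
    "length": 2.0,
    "character classes and their placement": 3.5,
    "actual characters": 18.0,
}

upper_bound = sum(aspect_entropy.values())
print(f"H(password) <= {upper_bound} bits")
```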

SLIDE 37

Passwords

Example
We can look at properties such as:

the length,
the number and placement of character classes,
the actual characters,
. . .

Remark
These are not independent. The sum will be an upper bound.

SLIDE 39

Passwords

Remark
With an upper bound we know that it is not possible to do better. With an average we know how well most users will do. With a lower bound we would have a guarantee, but that is not possible.

SLIDE 40

Passwords

Remark
If a password policy yields low entropy, that implies it is bad. If a password policy yields high entropy, that does not imply it is good.

Exercise
Why?

SLIDE 42

Passwords

Figure: xkcd's comic on password strength. Image: xkcd [xkc].

SLIDE 43

Passwords

Example (Standard password)
We have

26 alphabetic characters (52 counting upper and lower case),
10 digits,
approximately 10 special characters.

This yields log(2 × 26 + 10 + 10) = log 72 ≈ 6 bits per password character. A 10-character uniformly randomly generated password thus contains about 60 bits.

Remark
What happens when we require that at least two upper-case characters, two lower-case characters and two digits must be included?
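The arithmetic, as a quick sketch (alphabet size 72 as above):

```python
import math

alphabet_size = 2 * 26 + 10 + 10   # upper, lower, digits, ~10 specials
bits_per_char = math.log2(alphabet_size)
print(bits_per_char)               # ≈ 6.17 bits per character
print(10 * bits_per_char)          # ≈ 61.7 bits for a random 10-character password
```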

SLIDE 45

Passwords

Example (Four-word passphrase)
There are about 125 000 words in the standard Swedish dictionary. This yields log 125 000 ≈ 17 bits per word. A four-word uniformly randomly generated passphrase thus contains about 68 bits.

SLIDE 46

Passwords

Example (Random sentence)
We estimated the entropy per character of a natural language above: approximately 1.25 bits per character for English. A randomly chosen 20-character English sentence would thus yield only about 20 × 1.25 = 25 bits.
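A sketch comparing the three estimates above (password, passphrase, natural-language sentence); the numbers are the ones quoted on the slides:

```python
import math

password_bits = 10 * math.log2(72)        # 10 random characters from a 72-symbol alphabet
passphrase_bits = 4 * math.log2(125_000)  # 4 random words from a 125 000-word dictionary
sentence_bits = 20 * 1.25                 # 20 characters of natural English at ~1.25 bits/char

print(f"password:   {password_bits:.1f} bits")    # ≈ 61.7
print(f"passphrase: {passphrase_bits:.1f} bits")  # ≈ 67.7
print(f"sentence:   {sentence_bits:.1f} bits")    # = 25.0
```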

SLIDE 47

Passwords

Remark All these require uniform randomness. Humans are bad at remembering random things. Thus they will choose non-randomly. The entropy will thus be (possibly much) lower.

SLIDE 48

Research about human-chosen passwords

Example ('Linguistic properties of multi-word passwords' [BS12])
Investigates how linguistics affect the choice of multi-word passphrases. Users do not choose them randomly; they prefer phrases adapted to natural language. 'correct horse battery staple' is preferred to 'horse correct battery staple', since the first is more grammatically correct.

SLIDE 49

Research about human-chosen passwords

Example (Human Selection of Mnemonic Phrase-based Passwords [KRC06])
Studied how users create easy-to-remember passwords, and also investigated the strength of phrase-based passwords. E.g. Google's example 'To be or not to be, that is the question'¹, which results in '2bon2btitq'. This particular password has apparently been used by many . . .

¹ URL: http://www.lightbluetouchpaper.org/2011/11/08/want-to-create-a-really-strong-password-dont-ask-google/.

SLIDE 50

Research about human-chosen passwords

Remark
There is a PhD thesis on the topic of guessing passwords: Joseph Bonneau. Guessing human-chosen secrets. Tech. rep. UCAM-CL-TR-819. University of Cambridge, Computer Laboratory, May 2012. URL: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-819.pdf [Bon12]. There is even a conference dedicated to passwords: PasswordsCon.

SLIDE 51

Identifying information

Example
Do we get more information from zodiac signs or from birthdays?

−∑_{zodiac signs} (1/12) log(1/12) = log 12 ≈ 3.58 < −∑_{days of the year} (1/365) log(1/365) = log 365 ≈ 8.51.
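The same comparison in a couple of lines (a sketch of mine):

```python
import math

zodiac_bits = math.log2(12)     # uniform over 12 zodiac signs
birthday_bits = math.log2(365)  # uniform over 365 days (ignoring leap years)

print(zodiac_bits)    # ≈ 3.58 bits
print(birthday_bits)  # ≈ 8.51 bits
```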

SLIDE 52

Identifying information

Exercise How much information do we need to uniquely identify an individual?

SLIDE 53

Identifying information

Example
Sometime during 2011 there were n = 6 973 738 433² people on Earth. To give everyone a unique identifier we need log n ≈ 32.7, i.e. 33 bits of information.

² According to the World Bank.
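A quick check of this figure (sketch of mine):

```python
import math

world_population_2011 = 6_973_738_433
print(math.log2(world_population_2011))             # ≈ 32.7
print(math.ceil(math.log2(world_population_2011)))  # 33 bits suffice for unique identifiers
```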

SLIDE 54

Identifying information

Identifying information in browsers
The Electronic Frontier Foundation (EFF) studied [Eck10] how much information a web browser shares. You can test your own browser at http://panopticlick.eff.org/.

Example (My browser)
My Firefox browser with all add-ons gave 21.45 bits of entropy. At that time the number of tested users was 2 860 696.
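For comparison, the number of bits needed to single out one user among the tested population (a sketch of mine):

```python
import math

tested_users = 2_860_696
print(math.log2(tested_users))   # ≈ 21.4 bits to uniquely identify one user among them
```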

SLIDE 56

Identifying information

Figure: Screenshot from Collusion (now Lightbeam) for Firefox: a map over all the pages that track me using this information.

SLIDE 57

References

[Bon12] Joseph Bonneau. Guessing human-chosen secrets. Tech. rep. UCAM-CL-TR-819. University of Cambridge, Computer Laboratory, May 2012. URL: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-819.pdf.

[BS12] Joseph Bonneau and Ekaterina Shutova. 'Linguistic properties of multi-word passwords'. In: USEC. 2012. URL: http://www.cl.cam.ac.uk/~jcb82/doc/BS12-USEC-passphrase_linguistics.pdf.

SLIDE 58

References

[Eck10] Peter Eckersley. 'How Unique Is Your Browser?' In: Privacy Enhancing Technologies. Springer, 2010, pp. 1–18. URL: https://panopticlick.eff.org/static/browser-uniqueness.pdf.

[Kom+11] Saranga Komanduri, Richard Shay, Patrick Gage Kelley, Michelle L. Mazurek, Lujo Bauer, Nicolas Christin, Lorrie Faith Cranor and Serge Egelman. 'Of passwords and people: Measuring the effect of password-composition policies'. In: CHI. 2011. URL: http://cups.cs.cmu.edu/rshay/pubs/passwords_and_people2011.pdf.

SLIDE 59

References

[KRC06] Cynthia Kuo, Sasha Romanosky and Lorrie Faith Cranor. Human Selection of Mnemonic Phrase-based Passwords. Tech. rep. 36. Institute of Software Research, 2006. URL: http://repository.cmu.edu/isr/36/.

[Sha48] C. E. Shannon. 'A Mathematical Theory of Communication'. In: The Bell System Technical Journal 27 (July 1948), pp. 379–423, 623–656.

[xkc] xkcd. Password Strength. URL: https://xkcd.com/936/.