

SLIDE 1

The Method of Types and Its Application to Information Hiding

Pierre Moulin
University of Illinois at Urbana-Champaign
www.ifp.uiuc.edu/~moulin/talks/eusipco05-slides.pdf
EUSIPCO, Antalya, September 7, 2005

SLIDE 2

Outline

  • Part I: General Concepts
– Introduction
– Definitions
– What is it useful for?

  • Part II: Application to Information Hiding
– Performance guarantees against an omnipotent attacker?
– Steganography, Watermarking, Fingerprinting

SLIDE 3

Part I: General Concepts


SLIDE 4

Reference Materials

  • I. Csiszár, “The Method of Types,” IEEE Trans. Information Theory, Oct. 1998 (commemorative Shannon issue)

  • A. Lapidoth and P. Narayan, “Reliable Communication under Channel Uncertainty,” same issue

  • Application areas:
– capacity analyses
– computation of error probabilities (exponential behavior)
– universal coding/decoding
– hypothesis testing

SLIDE 5

Basic Notation

  • Discrete alphabets $\mathcal{X}$ and $\mathcal{Y}$
  • Random variables $X$, $Y$ with joint pmf $p(x, y)$
  • The entropy of $X$ is
$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$$
(will sometimes be denoted by $H(p_X)$)
  • Joint entropy
$$H(X, Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(x, y)$$
  • The conditional entropy of $Y$ given $X$ is
$$H(Y|X) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log p(y|x) = H(X, Y) - H(X)$$

SLIDE 6

  • The mutual information between $X$ and $Y$ is
$$I(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} = H(Y) - H(Y|X)$$
  • The Kullback-Leibler divergence between pmfs $p$ and $q$ is
$$D(p \| q) = \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)}$$
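A minimal Python sketch of these three quantities (an editor's illustration, not part of the original slides; function names are hypothetical, natural logarithms throughout, with the convention $0 \log 0 = 0$):

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats, with the convention 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """D(p||q) in nats (infinite if q(x) = 0 while p(x) > 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_information(p_xy):
    """I(X;Y) for a joint pmf given as a nested list p_xy[x][y]."""
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(col) for col in zip(*p_xy)]
    return sum(
        p_xy[x][y] * math.log(p_xy[x][y] / (p_x[x] * p_y[y]))
        for x in range(len(p_x)) for y in range(len(p_y))
        if p_xy[x][y] > 0
    )

# Toy joint pmf on {0,1} x {0,1}
p_xy = [[0.4, 0.1], [0.1, 0.4]]
p_x = [sum(row) for row in p_xy]
print(entropy(p_x))                    # H(X) = log 2
print(mutual_information(p_xy))        # I(X;Y) = H(Y) - H(Y|X)
print(kl_divergence(p_x, [0.3, 0.7]))  # D(p_X || q) for a reference pmf q
```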

SLIDE 7

Types

  • Deterministic notion
  • Given a length-$n$ sequence $x \in \mathcal{X}^n$, count the frequency of occurrence of each letter of the alphabet $\mathcal{X}$
  • Example: $\mathcal{X} = \{0, 1\}$, $n = 12$; $x = 110100101110$ contains 5 zeroes and 7 ones $\Rightarrow$ the sequence $x$ has type $\hat{p}_x = \left(\tfrac{5}{12}, \tfrac{7}{12}\right)$
  • $\hat{p}_x$ is also called the empirical pmf. It may be viewed as a pmf over $\mathcal{X}$
  • Each $\hat{p}_x(x)$ is a multiple of $\tfrac{1}{n}$
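An editor's illustration in Python (the function name is hypothetical): computing the type of the example sequence with exact fractions:

```python
from collections import Counter
from fractions import Fraction

def type_of(x, alphabet):
    """Empirical pmf (type) of a sequence x over the given alphabet."""
    counts = Counter(x)
    n = len(x)
    return {a: Fraction(counts[a], n) for a in alphabet}

x = "110100101110"
print(type_of(x, "01"))  # {'0': Fraction(5, 12), '1': Fraction(7, 12)}
```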

SLIDE 8

Joint Types

  • Given two length-$n$ sequences $x \in \mathcal{X}^n$ and $y \in \mathcal{Y}^n$, count the frequency of occurrence of each pair $(x, y) \in \mathcal{X} \times \mathcal{Y}$
  • Example:
$x = 110100101110$
$y = 111100101110$
$(x, y)$ has joint type
$$\hat{p}_{xy} = \begin{pmatrix} 4/12 & 1/12 \\ 0 & 7/12 \end{pmatrix}$$
  • Empirical pmf over $\mathcal{X} \times \mathcal{Y}$

SLIDE 9

Conditional Types

  • By analogy with Bayes' rule, define the conditional type of $y$ given $x$ as
$$\hat{p}_{y|x}(y|x) = \frac{\hat{p}_{xy}(x, y)}{\hat{p}_x(x)}$$
which is an empirical conditional pmf
  • Example:
$x = 110100101110$
$y = 111100101110$
$$\Rightarrow \hat{p}_{y|x} = \begin{pmatrix} 4/5 & 1/5 \\ 0 & 1 \end{pmatrix}$$
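Continuing the editor's sketch from the Types slide, joint and conditional types of the example pair can be computed the same way (function names hypothetical):

```python
from collections import Counter
from fractions import Fraction

def joint_type(x, y):
    """Empirical pmf of the pairs (x_i, y_i)."""
    n = len(x)
    return {pair: Fraction(c, n) for pair, c in Counter(zip(x, y)).items()}

def conditional_type(x, y):
    """Empirical conditional pmf p_hat(y|x) = p_hat_xy(x, y) / p_hat_x(x)."""
    pxy = joint_type(x, y)
    px = Counter(x)
    n = len(x)
    return {(a, b): p / Fraction(px[a], n) for (a, b), p in pxy.items()}

x = "110100101110"
y = "111100101110"
print(joint_type(x, y))        # p_hat(1,1) = 7/12, p_hat(0,1) = 1/12, p_hat(0,0) = 4/12
print(conditional_type(x, y))  # p_hat(1|1) = 1, p_hat(1|0) = 1/5, p_hat(0|0) = 4/5
```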

SLIDE 10

Type Classes

  • The type class $T_x$ is the set of all sequences that have the same type as $x$. Example: all sequences with 5 zeroes and 7 ones
  • The joint type class $T_{xy}$ is the set of all sequence pairs that have the same joint type as $(x, y)$
  • The conditional type class $T_{y|x}$ is the set of all sequences $y'$ that have the same conditional type as $y$, conditioned on $x$
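An editor's sanity check for the running example: $|T_x|$ is the binomial coefficient $\binom{12}{5}$, which brute-force enumeration confirms:

```python
from itertools import product
from math import comb

# |T_x| for x = 110100101110: all length-12 binary sequences
# with exactly 5 zeroes and 7 ones.
n, n_zeros = 12, 5
print(comb(n, n_zeros))  # 792 = 12! / (5! 7!)

# Brute-force check by enumerating all of {0,1}^12:
count = sum(1 for s in product("01", repeat=n) if s.count("0") == n_zeros)
print(count)  # 792 again
```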

SLIDE 11

Information Measures

  • Any type may be represented by a dummy sequence
  • Can define empirical information measures:
$$H(x) \triangleq H(\hat{p}_x), \qquad H(y|x) \triangleq H(\hat{p}_{y|x}), \qquad I(x; y) \triangleq I(X; Y) \text{ for } (X, Y) \sim \hat{p}_{xy}$$
  • Will be useful to design universal decoders
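An editor's sketch of these empirical measures (function names hypothetical), evaluated on the example sequences from the previous slides:

```python
import math
from collections import Counter

def empirical_entropy(seq):
    """H(x) = H(p_hat_x): entropy of the type of seq, in nats."""
    n = len(seq)
    return -sum(c / n * math.log(c / n) for c in Counter(seq).values())

def empirical_mutual_information(x, y):
    """I(x; y) = H(x) + H(y) - H(x, y), with (X, Y) ~ joint type of (x, y)."""
    return (empirical_entropy(x) + empirical_entropy(y)
            - empirical_entropy(list(zip(x, y))))

x = "110100101110"
y = "111100101110"
print(empirical_entropy(x))                # H(5/12, 7/12) ~ 0.679 nats
print(empirical_mutual_information(x, y))  # ~ 0.428 nats
```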

SLIDE 12

Typicality

  • Consider a pmf $p$ over $\mathcal{X}$
  • Length-$n$ sequence $x \sim$ i.i.d. $p$. Notation: $x \sim p^n$
  • Example: $\mathcal{X} = \{0, 1\}$, $n = 12$, $x = 110100101110$
  • For large $n$, all typical sequences have approximately composition $p$
  • This can be measured in various ways:
– Entropy $\epsilon$-typicality: $\left| -\tfrac{1}{n} \log p^n(x) - H(X) \right| < \epsilon$
– Strong $\epsilon$-typicality: $\max_{x \in \mathcal{X}} |\hat{p}_x(x) - p(x)| < \epsilon$
Both define sets of typical sequences
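The two notions do not coincide. The editor's sketch below (function names hypothetical) checks both for the example sequence under a uniform pmf, where every sequence is entropy-typical but strong typicality still constrains the type:

```python
import math
from collections import Counter

def is_entropy_typical(x, p, eps):
    """Entropy eps-typicality: | -(1/n) log p^n(x) - H(p) | < eps."""
    n = len(x)
    log_pn = sum(math.log(p[a]) for a in x)
    h = -sum(pa * math.log(pa) for pa in p.values() if pa > 0)
    return abs(-log_pn / n - h) < eps

def is_strongly_typical(x, p, eps):
    """Strong eps-typicality: max_a | p_hat_x(a) - p(a) | < eps."""
    n = len(x)
    counts = Counter(x)
    return max(abs(counts[a] / n - pa) for a, pa in p.items()) < eps

x = "110100101110"
p = {"0": 0.5, "1": 0.5}
print(is_entropy_typical(x, p, 0.05))   # True  (deviation is exactly 0)
print(is_strongly_typical(x, p, 0.05))  # False (|5/12 - 1/2| = 1/12 > 0.05)
```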

SLIDE 13

Application to Channel Coding

  • Channel input $x = (x_1, \cdots, x_n) \in \mathcal{X}^n$, output $y = (y_1, \cdots, y_n) \in \mathcal{Y}^n$
  • Discrete Memoryless Channel (DMC): $p^n(y|x) = \prod_{i=1}^n p(y_i|x_i)$
  • Many fundamental coding theorems can be proven using the concept of entropy typicality. Examples:
– Shannon’s channel coding theorem (capacity of the DMC)
– Rate-distortion bound for memoryless sources

SLIDE 14

  • Many fundamental coding theorems cannot be proved using the concept of entropy typicality. Examples:
– precise calculations of error log-probability
– various kinds of unknown channels
  • So let’s derive some useful facts about types
  • Number of types $\le (n + 1)^{|\mathcal{X}|}$ (polynomial in $n$)
  • Size of type class $T_x$:
$$(n + 1)^{-|\mathcal{X}|} e^{nH(\hat{p}_x)} \le |T_x| \le e^{nH(\hat{p}_x)}$$
Ignoring polynomial terms, we write $|T_x| \doteq e^{nH(\hat{p}_x)}$
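These bounds are easy to check numerically. The following editor's sketch verifies them for the running example ($|\mathcal{X}| = 2$, natural logarithms):

```python
import math
from math import comb

n, k = 12, 5                               # the running example: 5 zeroes, 7 ones
p_hat = (k / n, (n - k) / n)
H = -sum(q * math.log(q) for q in p_hat)   # H(p_hat_x) in nats

size = comb(n, k)                          # |T_x| = 792
lower = (n + 1) ** (-2) * math.exp(n * H)  # (n+1)^{-|X|} e^{nH}, |X| = 2
upper = math.exp(n * H)
print(lower <= size <= upper)              # True

# For a binary alphabet there are exactly n+1 types (k zeroes, k = 0..n),
# comfortably below the polynomial bound (n+1)^|X|:
print(n + 1 <= (n + 1) ** 2)               # True
```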

SLIDE 15

  • Probability of $x$ under distribution $p^n$:
$$p^n(x) = \prod_{x \in \mathcal{X}} p(x)^{n \hat{p}_x(x)} = e^{-n \sum_{x \in \mathcal{X}} \hat{p}_x(x) \log \frac{1}{p(x)}} = e^{-n [H(\hat{p}_x) + D(\hat{p}_x \| p)]}$$
the same for all $x$ in the same type class
  • Probability of the type class $T_x$ under distribution $p^n$:
$$P^n(T_x) = |T_x| \, p^n(x) \doteq e^{-n D(\hat{p}_x \| p)}$$
  • Similarly:
$$|T_{y|x}| \doteq e^{n H(\hat{p}_{y|x})}, \qquad P^n_{Y|X}(T_{y|x} \,|\, x) \doteq e^{-n D(\hat{p}_{xy} \| p_{Y|X} \hat{p}_x)}$$
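The identity $p^n(x) = e^{-n[H(\hat{p}_x) + D(\hat{p}_x \| p)]}$ is exact, not just exponential-order. A small editor's check in Python (the pmf $p$ is an arbitrary choice for illustration):

```python
import math
from math import comb

p = {"0": 0.3, "1": 0.7}      # an arbitrary i.i.d. source pmf
x = "110100101110"
n, k = len(x), x.count("0")
p_hat = {"0": k / n, "1": (n - k) / n}

H = -sum(q * math.log(q) for q in p_hat.values())          # H(p_hat_x)
D = sum(q * math.log(q / p[a]) for a, q in p_hat.items())  # D(p_hat_x || p)

direct = math.prod(p[a] for a in x)      # p^n(x) multiplied out letter by letter
via_types = math.exp(-n * (H + D))       # e^{-n [H + D]}
print(math.isclose(direct, via_types))   # True: the identity is exact

# P^n(T_x) = |T_x| p^n(x) matches e^{-nD} up to a polynomial factor:
print(comb(n, k) * direct, math.exp(-n * D))
```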

SLIDE 16

Constant-Composition Codes

  • All codewords have the same type $\hat{p}_x$
  • Random coding: generate codewords $x_m$, $m \in \mathcal{M}$, randomly and independently from the uniform pmf on the type class $T_x$
  • Note that channel outputs have different types in general

SLIDE 17

Unknown DMCs – Universal Codes

  • Channel $p_{Y|X}$ is revealed neither to encoder nor to decoder $\Rightarrow$ neither encoding rule nor decoding rule may depend on $p_{Y|X}$
$$C = \max_{p_X} \min_{p_{Y|X}} I(X; Y)$$
  • Universal codes: same error exponent as in the known-$p_{Y|X}$ case (existence?)
  • Encoder: select $T_x$, use constant-composition codes
  • Decoder: uses the Maximum Mutual Information (MMI) rule
$$\hat{m} = \operatorname{argmax}_{m \in \mathcal{M}} I(x_m; y) = \operatorname{argmin}_{m \in \mathcal{M}} H(y|x_m)$$
  • Note: the GLRT decoder is in general not universal (GLRT: first estimate $p_{Y|X}$, then plug into the ML decoding rule)
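To make the MMI rule concrete, here is a small self-contained editor's sketch in Python; the three-codeword constant-composition codebook is hypothetical, and the decoder simply maximizes the empirical mutual information between each codeword and the channel output:

```python
import math
from collections import Counter

def empirical_entropy(seq):
    n = len(seq)
    return -sum(c / n * math.log(c / n) for c in Counter(seq).values())

def empirical_mi(x, y):
    """I(x; y) computed from the joint type of (x, y)."""
    return (empirical_entropy(x) + empirical_entropy(y)
            - empirical_entropy(list(zip(x, y))))

def mmi_decode(codebook, y):
    """MMI rule: pick the message whose codeword has the largest
    empirical mutual information with the received sequence y."""
    return max(codebook, key=lambda m: empirical_mi(codebook[m], y))

# Hypothetical codebook: all codewords have the same type (5 zeroes, 7 ones)
codebook = {1: "110100101110", 2: "011011100101", 3: "101011010110"}
y = "111100101110"              # a noisy version of codeword 1
print(mmi_decode(codebook, y))  # 1
```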

SLIDE 18

Key idea in proof

  • Denote by $D_m \subset \mathcal{Y}^n$ the decoding region for message $m$
  • Polynomial number of type classes, forming a partition of $\mathcal{Y}^n$
  • Given that $m$ was transmitted, partition the error event $y \in \mathcal{Y}^n \setminus D_m$ into a union over conditional type classes:
$$y \in \bigcup_{T_{y|x_m}} \left( T_{y|x_m} \setminus D_m \right)$$

SLIDE 19

  • The probability of the error event is therefore given by
$$\Pr[\text{error} \,|\, m] = \Pr\left[ \bigcup_{T_{y|x_m}} T_{y|x_m} \setminus D_m \right] \le \sum_{T_{y|x_m}} \Pr\left[ T_{y|x_m} \setminus D_m \right] \doteq \max_{T_{y|x_m}} \Pr\left[ T_{y|x_m} \setminus D_m \right]$$
$$= \max_{T_{y|x_m}} \Pr[T_{y|x_m}] \, \frac{|T_{y|x_m} \setminus D_m|}{|T_{y|x_m}|} \doteq \max_{T_{y|x_m}} e^{-n D(\hat{p}_{x_m y} \| p_{Y|X} \hat{p}_{x_m})} \, \frac{|T_{y|x_m} \setminus D_m|}{|T_{y|x_m}|}$$
$\Rightarrow$ the worst conditional type class dominates the error probability
  • The calculation mostly involves combinatorics: finding out $|T_{y|x_m} \setminus D_m|$

SLIDE 20

Extensions

  • Channels with memory
  • Arbitrarily Varying Channels (AVCs) $\Rightarrow$ randomized codes
  • Continuous alphabets (difficult!)

SLIDE 21

Part II: Applications to Watermarking

SLIDE 22

Reference Materials

[SM’03] A. Somekh-Baruch and N. Merhav, “On the Error Exponent and Capacity Games of Private Watermarking Systems,” IEEE Trans. Information Theory, March 2003
[SM’04] A. Somekh-Baruch and N. Merhav, “On the Capacity Game of Public Watermarking Systems,” IEEE Trans. Information Theory, March 2004
[MO’03] P. Moulin and J. O’Sullivan, “Information-Theoretic Analysis of Information Hiding,” IEEE Trans. Information Theory, March 2003
[MW’04] P. Moulin and Y. Wang, “Error Exponents for Channel Coding with Side Information,” preprint, Sep. 2004

SLIDE 23

Communication Model for Data Hiding

[Block diagram: message $M$, host $s$, and key $k$ enter the encoder $x = f(s, m, k)$; the attack channel $p(y|x)$ produces $y$; the decoder outputs $\hat{M} = g(y, k)$.]

  • Memoryless host sequence $s$
  • Message $M$ uniformly distributed over $\{1, 2, \cdots, 2^{nR}\}$
  • Unknown attack channel $p(y|x)$
  • Randomization via secret key sequence $k$, arbitrary alphabet $\mathcal{K}$

SLIDE 24

Attack Channel Model

  • The first information-theoretic formulations of this problem assumed a fixed attack channel (e.g., AWGN) or a family of memoryless channels (1998–1999)
  • The memoryless assumption was later relaxed (2001)
  • We’ll just require the following distortion constraint:
$$d^n(x, y) \triangleq \frac{1}{n} \sum_{i=1}^n d(x_i, y_i) \le D_2 \quad \forall x, y \text{ (w.p. 1)}$$
$\Rightarrow$ unknown channel with arbitrary memory
  • Similarly, the following embedding constraint will be assumed:
$$d^n(s, x) \le D_1 \quad \forall s, k, m, x \text{ (w.p. 1)}$$
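A small editor's sketch of this constraint, assuming the per-letter-averaged reading of $d^n$ above and Hamming distortion $d(a, b) = \mathbb{1}\{a \ne b\}$ (both assumptions, chosen for illustration):

```python
def per_letter_distortion(x, y, d=lambda a, b: 0 if a == b else 1):
    """d^n(x, y) = (1/n) sum_i d(x_i, y_i); Hamming distortion by default."""
    assert len(x) == len(y)
    return sum(d(a, b) for a, b in zip(x, y)) / len(x)

x = "110100101110"
y = "111100101110"
print(per_letter_distortion(x, y))  # 1/12: the sequences differ in one position
```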

SLIDE 25

Data-Hiding Capacity [SM’04]

  • Single-letter formula:
$$C(D_1, D_2) = \sup_{p(x,u|s) \in \mathcal{Q}(D_1)} \; \min_{p(y|x) \in \mathcal{A}(D_2)} \; [I(U; Y) - I(U; S)]$$
where $U$ is an auxiliary random variable and
$$\mathcal{Q}(D_1) = \Big\{ p_{XU|S} : \sum_{x,u,s} p(x, u|s) \, p(s) \, d(s, x) \le D_1 \Big\}$$
$$\mathcal{A}(D_2) = \Big\{ p_{Y|X} : \sum_{x,y} p(y|x) \, p(x) \, d(x, y) \le D_2 \Big\}$$
  • Same capacity formula as in [MO’03], where $p(y|x)$ was constrained to belong to the family $\mathcal{A}^n(D_2)$ of memoryless channels
  • Why?

SLIDE 26

Achievability – sketch of the proof

  • Random binning construction
  • Randomly permuted, constant-composition code
  • Given $n$, solve the minmax problem over types $\hat{p}(x, u|s)$ and $\hat{p}(y|x)$ $\Rightarrow$ solution $(\hat{p}_{x^* u^* | s^*}, \hat{p}_{y^* | x^*})$
  • All codewords $u(l, m)$ are drawn uniformly from the type class $T_{u^*}$
  • Given $m, s$, select a codeword $u(l, m)$ such that $(u(l, m), s) \in T_{u^* s^*}$. Then generate $x$ uniformly from the conditional type class $T_{x^* | u(l,m), s}$
  • Maximum mutual information decoder:
$$\hat{m} = \operatorname{argmax}_{l, m} I(u(l, m); y)$$

SLIDE 27

Achievability – What is the worst attack?

  • The worst $p(y|x)$ is uniform over a single conditional type
  • Example:
$x^* = 110100101110$
$y^* = 111100101110$
Then all sequences $y$ that differ from $x^*$ by exactly one bit are equally likely
  • For a memoryless attack, $p(y|x)$ is uniform over multiple conditional types
  • Memory does not help the attacker!
  • Capacity is the same as in the memoryless case
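As an editor's illustration of "uniform over a single conditional type": the sketch below enumerates one reading of the conditional type class of the example pair $(x^*, y^*)$, namely all $y$ obtained by flipping exactly one zero of $x^*$ to a one (flipping a one instead would give a different conditional type). The function name is hypothetical:

```python
def conditional_type_class_of_example(x):
    """T_{y*|x*} for the example pair: y agrees with x on every 1 of x
    and flips exactly one 0 of x to a 1."""
    zeros = [i for i, a in enumerate(x) if a == "0"]
    ys = []
    for i in zeros:
        y = list(x)
        y[i] = "1"
        ys.append("".join(y))
    return ys

x_star = "110100101110"
T = conditional_type_class_of_example(x_star)
# 5 sequences, each assigned probability 1/5 by this worst-case attack:
print(len(T), 1 / len(T))
```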

SLIDE 28

Converse

  • For any code with rate $R > C(D_1, D_2)$, there exists an attack $p(y|x)$ such that reliable decoding is impossible
  • The claim is proven by restricting the search to attack channels that are uniform over a single conditional type
  • The proof is similar to the memoryless case, using Fano’s inequality and Marton’s telescoping technique

SLIDE 29

Error Exponents [MW’04]

  • Obtain $E_r(R) \le E(R) \le E_{sp}(R)$ for all $R < C$, where
$$E(R) \triangleq \limsup_{n \to \infty} \left( -\frac{1}{n} \log P_{e,n} \right)$$
and
$$P_{e,n} = \min_{p_{F_n, G_n}} \max_{p_{Y|X}} P_e(F_n, G_n, p_{Y|X})$$
is the minmax probability of error
  • The random-coding exponent $E_r(R)$ is obtained using a modification of the binning method above; still randomly permuted, constant-composition codes
  • The sphere-packing exponent $E_{sp}(R)$ is obtained by restricting the search to attack channels that are uniform over a single conditional type

SLIDE 30

Communication Model for Steganography

[Block diagram: message $M \in \{1, \ldots, 2^{nR}\}$, host $S \sim p_S$, and secret key $K$ enter the encoder $f$ with embedding distortion $D_1$; a steganalyzer tests whether $p_X = p_S$; the attack channel $p(y|x)$ with distortion $D_2$ produces $Y$; the decoder $\phi$ outputs $\hat{M}$.]

  • Would like $p_X = p_S$ for perfect security
  • This is hard: matching $n$-dimensional pmfs
  • Can be done using the randomization techniques described earlier $\Rightarrow$ make $x$ uniform over type classes
  • The capacity formula is still of the form
$$C(D_1, D_2) = \sup_{p(x,u|s)} \min_{p(y|x)} [I(U; Y) - I(U; S)]$$

SLIDE 31

Conclusion

  • The method of types is based on combinatorics
  • Polynomial number of types
  • Useful to determine capacity and error exponents
  • Randomized codes, universal decoders
  • Natural concepts in the presence of an adversary