The Information Bottleneck Method
Naftali Tishby, Fernando C. Pereira, William Bialek
What is the information bottleneck?
It is a technique for finding the best tradeoff between accuracy (how much information about a relevant variable is preserved) and complexity (how compactly the input is represented).
Example: speech compression
A transcript of spoken words has much lower entropy than the acoustic waveform, so the signal can be compressed without losing the information about the words.
Problem definition
Input: signals $x \in X$ and $y \in Y$, a mapping function $f: X \to Y$, and the distributions $P(X = x)$ and $P(Y = y, X = x)$.
Output: a compressed representation $\tilde{X}$, with mappings $X \to \tilde{X}$ and $\tilde{X} \to Y$.
Examples
1. X = speech signal, Y = transcription
2. X = speech signal, Y = speaker's identity
Relevant quantization
Mapping $X \to \tilde{X}$:
Soft partitioning: a stochastic assignment $p(\tilde{x} \mid x)$.
Hard partitioning: the special case in which every $p(\tilde{x} \mid x)$ is 0 or 1, so each $x$ belongs to exactly one codeword.
The codeword marginal is $p(\tilde{x}) = \sum_x p(x)\, p(\tilde{x} \mid x)$.
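The difference between the two kinds of partition, and the marginal $p(\tilde{x})$, can be illustrated with a small sketch. The numbers and the helper name `codeword_marginal` are made up for illustration and do not come from the slides.

```python
# Illustrative distribution over 4 input symbols (an assumption, not data from the slides).
p_x = [0.4, 0.3, 0.2, 0.1]

# Soft partition: row i is p(x_tilde | x_i); every row sums to 1.
soft = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.0, 1.0]]

# Hard partition: every row is one-hot, so each x maps to exactly one codeword.
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]


def codeword_marginal(p_x, p_t_given_x):
    """Return p(x_tilde) = sum_x p(x) * p(x_tilde | x)."""
    n_codewords = len(p_t_given_x[0])
    return [
        sum(px * row[t] for px, row in zip(p_x, p_t_given_x))
        for t in range(n_codewords)
    ]


soft_marginal = codeword_marginal(p_x, soft)
hard_marginal = codeword_marginal(p_x, hard)
```

In both cases the marginal is a proper distribution over codewords; only the sharpness of the rows differs.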
What is a good quantization?
The first factor is the rate, or the average number of bits per message needed to specify an element in the codebook without confusion. This number per element of X is bounded from below by the mutual information $I(X; \tilde{X})$.
The quantities involved are $H(X)$, $I(X; \tilde{X})$ and $H(X \mid \tilde{X})$, related by $I(X; \tilde{X}) = H(X) - H(X \mid \tilde{X})$.
The average volume of the elements of X that are mapped to the same codeword is $2^{H(X \mid \tilde{X})}$.
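As a rough illustration of these quantities, the sketch below computes the entropy of X, the rate $I(X; \tilde{X})$, the conditional entropy $H(X \mid \tilde{X})$ and the resulting average volume for a given quantization. The function names and the toy distribution are assumptions chosen for the example.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def quantization_stats(p_x, p_t_given_x):
    """Return (H(X), I(X; X_tilde), H(X | X_tilde), 2 ** H(X | X_tilde))."""
    n_codewords = len(p_t_given_x[0])
    # Codeword marginal p(x_tilde) = sum_x p(x) p(x_tilde | x).
    p_t = [
        sum(px * row[t] for px, row in zip(p_x, p_t_given_x))
        for t in range(n_codewords)
    ]
    # H(X_tilde | X) = sum_x p(x) * H(p(. | x)).
    h_t_given_x = sum(px * entropy(row) for px, row in zip(p_x, p_t_given_x))
    rate = entropy(p_t) - h_t_given_x   # I(X; X_tilde)
    h_x = entropy(p_x)
    h_x_given_t = h_x - rate            # H(X | X_tilde)
    return h_x, rate, h_x_given_t, 2 ** h_x_given_t


# Toy case: 4 equally likely inputs, hard-partitioned into 2 pairs.
p_x = [0.25, 0.25, 0.25, 0.25]
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
h_x, rate, h_x_given_t, volume = quantization_stats(p_x, hard)
```

For this toy case the rate is 1 bit and the volume is 2, matching the intuition that each codeword stands for two input elements.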
Information rate alone is not enough to characterize a good quantization, since the rate can always be reduced by throwing away details of the original signal x. We therefore need some additional constraints.
The information bottleneck
The optimal assignment, which minimizes the previous equation, satisfies a self-consistent equation in which $p(y \mid \tilde{x})$ is computed by Bayes' rule and the Markov chain condition $\tilde{X} \leftarrow X \leftarrow Y$.
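For reference, the functional and the self-consistent equations can be written as below, in the form given in the original paper by Tishby, Pereira and Bialek. The equations themselves did not survive in the slide text, so the notation here is a reconstruction: $\beta$ is the Lagrange multiplier that sets the tradeoff, and $Z(x, \beta)$ is a normalization factor.

```latex
% Functional to minimize over the assignment p(\tilde{x} | x)
\mathcal{L}[p(\tilde{x} \mid x)] = I(\tilde{X}; X) - \beta\, I(\tilde{X}; Y)

% Optimal assignment (the exponent is a KL divergence between p(y|x) and p(y|\tilde{x}))
p(\tilde{x} \mid x) = \frac{p(\tilde{x})}{Z(x, \beta)}
  \exp\!\Big( -\beta \sum_y p(y \mid x) \log \frac{p(y \mid x)}{p(y \mid \tilde{x})} \Big)

% Bayes' rule together with the Markov chain \tilde{X} <- X <- Y
p(y \mid \tilde{x}) = \frac{1}{p(\tilde{x})} \sum_x p(y \mid x)\, p(\tilde{x} \mid x)\, p(x)

% Codeword marginal
p(\tilde{x}) = \sum_x p(x)\, p(\tilde{x} \mid x)
```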
The information bottleneck iterative algorithm
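A minimal sketch of one way to run such an iteration is given below. It assumes the alternating scheme from the original paper: recompute $p(\tilde{x})$, then $p(y \mid \tilde{x})$, then $p(\tilde{x} \mid x) \propto p(\tilde{x}) \exp(-\beta\, D_{KL}[p(y \mid x) \,\|\, p(y \mid \tilde{x})])$, and repeat. The function names, the random initialization, the fixed iteration count and the toy data are all assumptions made for illustration.

```python
import math
import random


def kl_nats(p, q):
    """D_KL(p || q) in nats; infinite if q is 0 somewhere p is not."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log(a / b)
    return total


def ib_iterate(p_x, p_y_given_x, n_codewords, beta, n_iter=200, seed=0):
    """Iterate the self-consistent equations; returns p(x_tilde | x) as rows.

    Assumes beta > 0 and strictly positive p(x).
    """
    rng = random.Random(seed)
    n_x, n_y = len(p_x), len(p_y_given_x[0])
    # Random soft initialization of p(x_tilde | x), one normalized row per x.
    q = []
    for _ in range(n_x):
        row = [rng.random() + 1e-3 for _ in range(n_codewords)]
        total = sum(row)
        q.append([v / total for v in row])
    for _ in range(n_iter):
        # p(x_tilde) = sum_x p(x) p(x_tilde | x)
        p_t = [sum(p_x[i] * q[i][t] for i in range(n_x))
               for t in range(n_codewords)]
        # p(y | x_tilde) = (1 / p(x_tilde)) sum_x p(y | x) p(x_tilde | x) p(x)
        p_y_given_t = []
        for t in range(n_codewords):
            if p_t[t] == 0:
                # Empty codeword: placeholder; its weight below is zero anyway.
                p_y_given_t.append([1.0 / n_y] * n_y)
                continue
            p_y_given_t.append([
                sum(p_y_given_x[i][y] * q[i][t] * p_x[i] for i in range(n_x)) / p_t[t]
                for y in range(n_y)
            ])
        # p(x_tilde | x) proportional to p(x_tilde) * exp(-beta * KL)
        new_q = []
        for i in range(n_x):
            w = [p_t[t] * math.exp(-beta * kl_nats(p_y_given_x[i], p_y_given_t[t]))
                 for t in range(n_codewords)]
            z = sum(w)
            new_q.append([v / z for v in w])
        q = new_q
    return q


# Toy data: inputs 0 and 1 predict y the same way, as do inputs 2 and 3.
p_x = [0.25, 0.25, 0.25, 0.25]
p_y_given_x = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
q = ib_iterate(p_x, p_y_given_x, n_codewords=2, beta=10.0)
```

On this toy input, inputs with the same $p(y \mid x)$ receive the same assignment row, since the update depends on $x$ only through $p(y \mid x)$. Like other alternating schemes of this kind, the iteration is only expected to reach a locally optimal solution, so results can depend on the initialization.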
The structure of the solutions
The formal solution of the self-consistent equations described above still requires a specification of the structure and cardinality of $\tilde{X}$, as in rate distortion theory.
A novel implementation of the information bottleneck method for unsupervised document clustering
Input: X = documents, Y = words, with $P(X)$ and $P(X, Y)$.
Hard clustering: $\beta \to \infty$
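To see why large $\beta$ leads to hard clustering, the sketch below evaluates a soft assignment of the form $p(\tilde{x} \mid x) \propto p(\tilde{x}) \exp(-\beta\, d(x, \tilde{x}))$ for increasing $\beta$, where $d$ stands for the KL divergence between $p(y \mid x)$ and $p(y \mid \tilde{x})$. The function name and the numbers are hypothetical and chosen only for illustration.

```python
import math


def soft_assignment(p_t, divergences, beta):
    """Return p(x_tilde | x) proportional to p(x_tilde) * exp(-beta * d).

    The smallest divergence is subtracted first for numerical stability;
    this cancels in the normalization and does not change the result.
    """
    d_min = min(divergences)
    weights = [pt * math.exp(-beta * (d - d_min))
               for pt, d in zip(p_t, divergences)]
    z = sum(weights)
    return [w / z for w in weights]


# Illustrative values: two codewords with equal prior, the first one closer to x.
p_t = [0.5, 0.5]
divergences = [0.2, 0.5]

at_zero = soft_assignment(p_t, divergences, beta=0.0)      # equals the prior p_t
moderate = soft_assignment(p_t, divergences, beta=1.0)     # still soft
large = soft_assignment(p_t, divergences, beta=1000.0)     # effectively one-hot
```

As $\beta$ grows, almost all of the probability mass moves to the codeword with the smallest divergence, so the assignment approaches a hard partition.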