The Information Bottleneck Method, Naftali Tishby, Fernando C. Pereira, William Bialek (PowerPoint presentation)



SLIDE 1

The Information Bottleneck Method

Naftali Tishby, Fernando C. Pereira, William Bialek

SLIDE 3

What is the information bottleneck?

It is a technique for finding the best tradeoff between accuracy and complexity.

SLIDE 4

Example

Speech compression: a transcript of spoken words has low entropy ⇒ it can be compressed without losing the information about the words.

SLIDE 5

Problem Definition

Input: signals x ∈ X and y ∈ Y, a mapping f: X → Y, and the distributions P(X = x), P(Y = y, X = x).

Output: a quantization X → X̃ such that X̃ → Y (X̃ remains informative about Y).

SLIDE 6

Example

1. X = Speech signal, Y = Transcription

2. X = Speech signal, Y = Speaker's identity

SLIDE 7

Relevant quantization

Mapping X → X̃, given by P(x̃|x) (soft partitioning; hard partitioning is the deterministic special case), with

P(x̃) = Σ_x p(x) p(x̃|x)
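As a quick sanity check of this marginalization (the three-symbol source and two-cluster quantizer below are toy numbers of my own, not from the slides), a Python sketch:

```python
# Toy soft partition: 3 source symbols x, 2 cluster symbols x_tilde.
p_x = [0.5, 0.3, 0.2]                       # p(x)
p_xt_given_x = [                            # p(x_tilde | x), rows sum to 1
    [0.9, 0.1],
    [0.4, 0.6],
    [0.0, 1.0],                             # a "hard" row: deterministic assignment
]

# p(x_tilde) = sum_x p(x) p(x_tilde | x)
p_xt = [sum(p_x[i] * p_xt_given_x[i][t] for i in range(3)) for t in range(2)]
print([round(v, 6) for v in p_xt])  # [0.57, 0.43]
```

Note that a valid quantizer always yields a normalized marginal, since each row of p(x̃|x) sums to 1.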

SLIDE 8

What is a good quantization?

The first factor is the rate: the average number of bits per message needed to specify an element in the codebook without confusion. This number, per element of X, is bounded from below by the mutual information I(X; X̃).

SLIDE 9

H(X), I(X; X̃), H(X | X̃), related by I(X; X̃) = H(X) − H(X | X̃)

SLIDE 10

The average volume of the elements of X that are mapped to the same codeword is 2^{H(X | X̃)}.
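To make these quantities concrete, here is a small Python sketch (the uniform four-symbol source and the two-way hard partition are my own illustration) that computes H(X), H(X | X̃), and I(X; X̃), and evaluates the 2^{H(X|X̃)} cell volume:

```python
import math

# Toy source: 4 equiprobable symbols, quantized by a hard 2-way partition.
p_x = [0.25, 0.25, 0.25, 0.25]
p_xt_given_x = [[1, 0], [1, 0], [0, 1], [0, 1]]   # p(x_tilde | x)

# Joint p(x, x_tilde) and marginal p(x_tilde)
p_joint = [[p_x[i] * p_xt_given_x[i][t] for t in range(2)] for i in range(4)]
p_xt = [sum(p_joint[i][t] for i in range(4)) for t in range(2)]

def H(dist):
    # Shannon entropy in bits, skipping zero-probability outcomes.
    return -sum(p * math.log2(p) for p in dist if p > 0)

H_x = H(p_x)                                       # H(X)
H_x_given_xt = sum(                                # H(X | X_tilde)
    p_xt[t] * H([p_joint[i][t] / p_xt[t] for i in range(4)])
    for t in range(2) if p_xt[t] > 0
)
I_x_xt = H_x - H_x_given_xt                        # I(X; X_tilde)

print(H_x, H_x_given_xt, I_x_xt)                   # 2.0 1.0 1.0
print(2 ** H_x_given_xt)                           # average cell volume: 2.0
```

Each codeword here covers 2^{H(X|X̃)} = 2 of the 4 source symbols, matching the partition by construction.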

SLIDE 11

Information rate alone is not enough to characterize a good quantization, since the rate can always be reduced by throwing away details of the original signal x. We therefore need additional constraints.

SLIDE 13

The information bottleneck

Minimize the functional L[p(x̃|x)] = I(X̃; X) − β I(X̃; Y), where the Lagrange multiplier β sets the tradeoff between compressing X and preserving the information about Y.

SLIDE 14

The optimal assignment, which minimizes the previous functional, satisfies p(x̃|x) = (p(x̃) / Z(x, β)) exp(−β D_KL[p(y|x) ‖ p(y|x̃)]). Here p(y|x̃) can be computed by Bayes' rule and the Markov chain condition X̃ ← X ← Y.
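A toy Python sketch of that last step (the distributions below are invented for illustration): Bayes' rule gives p(x | x̃), and the Markov condition turns it into p(y | x̃):

```python
# Toy distributions (illustrative, not from the paper).
p_x = [0.6, 0.4]                      # p(x)
p_y_given_x = [[0.9, 0.1],            # p(y | x), rows indexed by x
               [0.2, 0.8]]
p_xt_given_x = [[0.8, 0.2],           # p(x_tilde | x), the quantizer
                [0.1, 0.9]]

# Bayes: p(x | x_tilde) = p(x_tilde | x) p(x) / p(x_tilde)
p_xt = [sum(p_x[i] * p_xt_given_x[i][t] for i in range(2)) for t in range(2)]
p_x_given_xt = [[p_xt_given_x[i][t] * p_x[i] / p_xt[t] for i in range(2)]
                for t in range(2)]

# Markov chain X_tilde <- X <- Y: p(y | x_tilde) = sum_x p(y | x) p(x | x_tilde)
p_y_given_xt = [[sum(p_y_given_x[i][y] * p_x_given_xt[t][i] for i in range(2))
                 for y in range(2)] for t in range(2)]

print([[round(v, 4) for v in row] for row in p_y_given_xt])
```

The Markov condition matters: p(y|x̃) is fully determined by the quantizer and the given p(x, y), so it is not a free parameter of the optimization.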

SLIDE 15

The information bottleneck iterative algorithm
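The slide's pseudocode did not survive extraction, so below is a minimal pure-Python sketch of the self-consistent iterations described above (the function name, toy joint distribution, and convergence settings are my own assumptions):

```python
import math
import random

def kl(p, q):
    # KL divergence D(p || q) in nats; a tiny floor keeps q away from zero.
    return sum(pi * math.log(pi / (qi + 1e-300)) for pi, qi in zip(p, q) if pi > 0)

def iterative_ib(p_xy, n_clusters, beta, n_iter=200, seed=0):
    """Self-consistent IB iterations (sketch):
       p(t|x) ∝ p(t) exp(-beta * D_KL[p(y|x) || p(y|t)])
       p(t)   = sum_x p(x) p(t|x)
       p(y|t) = sum_x p(y|x) p(x) p(t|x) / p(t)
    """
    rng = random.Random(seed)
    nx, ny = len(p_xy), len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    p_y_given_x = [[p_xy[i][y] / p_x[i] for y in range(ny)] for i in range(nx)]

    # Random soft initialization of the quantizer p(t|x).
    p_t_given_x = []
    for _ in range(nx):
        w = [rng.random() for _ in range(n_clusters)]
        s = sum(w)
        p_t_given_x.append([wi / s for wi in w])

    for _ in range(n_iter):
        p_t = [sum(p_x[i] * p_t_given_x[i][t] for i in range(nx))
               for t in range(n_clusters)]
        p_y_given_t = [[sum(p_y_given_x[i][y] * p_x[i] * p_t_given_x[i][t]
                            for i in range(nx)) / (p_t[t] + 1e-300)
                        for y in range(ny)]
                       for t in range(n_clusters)]
        for i in range(nx):
            logits = [math.log(p_t[t] + 1e-300)
                      - beta * kl(p_y_given_x[i], p_y_given_t[t])
                      for t in range(n_clusters)]
            m = max(logits)
            w = [math.exp(l - m) for l in logits]
            s = sum(w)
            p_t_given_x[i] = [wi / s for wi in w]
    return p_t_given_x

# Toy joint over 4 "inputs" x and 2 "relevance" values y, two natural groups.
p_xy = [[0.20, 0.05], [0.20, 0.05], [0.05, 0.20], [0.05, 0.20]]
assignment = iterative_ib(p_xy, n_clusters=2, beta=10.0)
print([[round(v, 3) for v in row] for row in assignment])
```

Inputs with identical p(y|x) receive identical assignments after the first update, since the update depends on x only through p(y|x).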

SLIDE 16

The structure of the solutions

The formal solution of the self-consistent equations, described above, still requires a specification of the structure and cardinality of X̃, as in rate-distortion theory.

SLIDE 18

A novel implementation of the information bottleneck method for unsupervised document clustering. Input: X = documents, Y = words, with P(X) and P(X, Y).
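In this setting, the input distributions can be estimated by normalizing raw word counts; a toy sketch (the document and word names below are invented):

```python
# Toy corpus (illustrative): raw word counts n(x, y) per document.
counts = {
    "doc1": {"ball": 4, "game": 3, "vote": 0},
    "doc2": {"ball": 3, "game": 4, "vote": 1},
    "doc3": {"ball": 0, "game": 1, "vote": 5},
}

total = sum(c for doc in counts.values() for c in doc.values())

# Joint distribution p(x, y) = n(x, y) / N, and document marginal p(x).
p_xy = {(d, w): c / total for d, doc in counts.items() for w, c in doc.items()}
p_x = {d: sum(doc.values()) / total for d, doc in counts.items()}

print(round(p_x["doc1"], 4))  # 0.3333
```

With these estimates, clustering the documents X against the relevance variable Y (words) is exactly the IB problem from the earlier slides.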

SLIDE 20

Hard Clustering

β → ∞
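In the β → ∞ limit the soft assignment p(x̃|x) ∝ p(x̃) exp(−β D_KL[p(y|x) ‖ p(y|x̃)]) concentrates all mass on the cluster whose p(y|x̃) is closest in KL divergence, i.e. hard clustering. A small sketch (toy distributions of my own):

```python
import math

def kl(p, q):
    # KL divergence D(p || q) in nats.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def soft_assignment(p_y_given_x, p_y_given_t, p_t, beta):
    # p(t|x) ∝ p(t) exp(-beta * D_KL[p(y|x) || p(y|t)]), computed stably.
    logits = [math.log(p_t[t]) - beta * kl(p_y_given_x, p_y_given_t[t])
              for t in range(len(p_t))]
    m = max(logits)
    w = [math.exp(l - m) for l in logits]
    s = sum(w)
    return [wi / s for wi in w]

p_y_given_x = [0.7, 0.3]                  # the input's relevance profile
p_y_given_t = [[0.8, 0.2], [0.1, 0.9]]    # two cluster "centroids"
p_t = [0.5, 0.5]

for beta in (1.0, 10.0, 100.0):
    print(beta, [round(v, 4) for v in
                 soft_assignment(p_y_given_x, p_y_given_t, p_t, beta)])
# As beta grows, the assignment concentrates on the KL-nearest cluster.
```

At small β the assignment stays genuinely soft; by β = 100 it is effectively deterministic.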
