Clusters and features from combinatorial stochastic processes (PowerPoint presentation)



SLIDE 1

Clusters and features from combinatorial stochastic processes

Tamara Broderick, Michael I. Jordan, Jim Pitman UC Berkeley

SLIDE 2

Clustering/Partition

SLIDE 3

Clustering/Partition

“clusters”, “classes”, “blocks (of a partition)”

SLIDE 4

Clustering/Partition

“clusters”, “classes”, “blocks (of a partition)”

[Figure: example clusters labeled Cat, Dog, Mouse]

SLIDE 5

Clustering/Partition

[Figure: Pictures 1-7 partitioned into clusters Cat, Dog, Mouse, Lizard, Sheep]

SLIDE 6

Latent feature allocation

“features”, “topics”

[Figure: Pictures 1-7 allocated to features Cat, Dog, Mouse, Lizard, Sheep; a picture may exhibit several features]

  • Exchangeable
  • Finite number of features per data point

SLIDE 7

Characterizations

  • Exchangeable cluster distributions are characterized
  • What about exchangeable feature distributions?
SLIDE 8

Exchangeable probability functions

For an exchangeable clustering of data points n = 1, ..., N into clusters k = 1, ..., K,

P(partition) = p(S_{N,1}, . . . , S_{N,K})

where S_{N,k} is the size of the kth cluster. The function p is the exchangeable partition probability function (EPPF). [Pitman 1995]

SLIDE 12

Exchangeable probability functions

“Exchangeable feature probability function” (EFPF)?

SLIDE 13

Example: Indian buffet process [Griffiths, Ghahramani 2005]

For n = 1, 2, ..., N:

  • 1. Data point n takes each existing feature k, which has already occurred S_{n−1,k} times, with probability S_{n−1,k} / (θ + n − 1)
  • 2. The number of new features for data point n is K_n^+ ~ Poisson(γθ / (θ + n − 1))

(Feature matrix: rows n = 1, ..., N index data points; columns k = 1, ..., K index features.)
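The two-step process above can be sketched as a sampler (a minimal sketch, not the authors' code; `gamma` and `theta` are the mass and concentration parameters γ, θ from the slide):

```python
import numpy as np

def sample_ibp(N, gamma, theta, seed=0):
    """Sample a feature allocation from the two-parameter Indian buffet
    process: data point n joins existing feature k with probability
    S_{n-1,k} / (theta + n - 1), then adds
    K_n^+ ~ Poisson(gamma * theta / (theta + n - 1)) brand-new features."""
    rng = np.random.default_rng(seed)
    counts = []  # counts[k] = S_{n-1,k}: times feature k occurred so far
    Z = []       # Z[n-1] = set of feature indices held by data point n
    for n in range(1, N + 1):
        # step 1: join each existing feature independently
        row = {k for k, s in enumerate(counts)
               if rng.random() < s / (theta + n - 1)}
        # step 2: brand-new features for this data point
        k_new = rng.poisson(gamma * theta / (theta + n - 1))
        for _ in range(k_new):
            counts.append(0)
            row.add(len(counts) - 1)
        for k in row:
            counts[k] += 1
        Z.append(row)
    return Z
```

Note that the first data point has no existing features to join, so it simply picks up a Poisson(γ) number of new features.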


SLIDE 26

Exchangeable probability functions

“Exchangeable feature probability function” (EFPF)?

Example: Indian buffet process (IBP). For a feature matrix with rows n = 1, ..., N (data points) and columns k = 1, ..., K_N (features),

P(matrix) = (1 / K_N!) (θγ)^{K_N} exp( −θγ Σ_{n=1}^{N} (θ + n − 1)^{−1} ) Π_{k=1}^{K_N} Γ(S_{N,k}) Γ(N − S_{N,k} + θ) / Γ(N + θ)

          = p(N; S_{N,1}, S_{N,2}, . . . , S_{N,K})

a function of the number of data points N, the number of features K_N, and the feature sizes S_{N,k} alone: an “EFPF”. [Broderick, Jordan, Pitman 2012]
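The displayed formula can be evaluated directly; below is a sketch in log space using `math.lgamma` for stability. One sanity check it passes: for N = 1 the allocation is just some number of singleton features, and the formula collapses to a Poisson(γ) probability for any θ.

```python
import math

def ibp_efpf(N, sizes, gamma, theta):
    """Evaluate the IBP EFPF p(N; S_1, ..., S_K) from the slide, in log
    space; `sizes` is the list of feature sizes S_{N,k}."""
    K = len(sizes)
    log_p = (K * math.log(theta * gamma)
             - math.lgamma(K + 1)   # log K!
             - theta * gamma * sum(1.0 / (theta + n - 1)
                                   for n in range(1, N + 1)))
    for s in sizes:
        log_p += (math.lgamma(s) + math.lgamma(N - s + theta)
                  - math.lgamma(N + theta))
    return math.exp(log_p)
```

By construction the value depends only on (N; S_{N,1}, ..., S_{N,K}), not on which data points hold which features, which is exactly the EFPF property.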

SLIDE 34

Exchangeable probability functions

Counterexample [Broderick, Jordan, Pitman 2012]. Draw each data point's row over two features iid with

P(row = {feature 1 only}) = p1, P(row = {feature 2 only}) = p2, P(row = {both}) = p3, P(row = {neither}) = p4

If this distribution had an EFPF, any two matrices with the same N and the same feature sizes would be equally probable; comparing the two-row matrices ({1}, {2}) and ({1,2}, ∅), both with feature sizes (1, 1), forces p1 p2 = p3 p4. Choosing p1 p2 ≠ p3 p4 therefore gives an exchangeable feature distribution with no EFPF.
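A quick numeric check of the counterexample (the probability values are arbitrary, chosen only so that p1 p2 ≠ p3 p4):

```python
# Row distribution over two features: p1 = {1} only, p2 = {2} only,
# p3 = both, p4 = neither (illustrative values with p1*p2 != p3*p4).
p1, p2, p3, p4 = 0.4, 0.3, 0.2, 0.1
assert abs(p1 + p2 + p3 + p4 - 1.0) < 1e-12

# Two N = 2 matrices with identical feature sizes (S_1, S_2) = (1, 1):
P_A = p1 * p2   # rows: {1}, {2}
P_B = p3 * p4   # rows: {1,2}, {}
# An EFPF would force P_A == P_B; here they differ, so no EFPF exists.
assert P_A != P_B
```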

SLIDE 40

Exchangeable probability functions

Summary so far: exchangeable cluster distributions = cluster distributions with EPPFs. Feature distributions with EFPFs (e.g. the IBP) form a strict subset of exchangeable feature distributions, with the two-feature example outside. [Broderick, Jordan, Pitman 2012]

SLIDE 41

Paintboxes

Exchangeable partition: Kingman paintbox [Kingman 1978]

Split the unit interval into subintervals, one per cluster (Cat cluster, Dog cluster, Mouse cluster, Lizard cluster, Sheep cluster, Horse cluster, ...). Draw u_1, u_2, ..., u_7, ... iid Uniform[0, 1]; each data point joins the cluster whose subinterval its uniform lands in, so two data points are clustered together exactly when their uniforms fall in the same subinterval.
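Sampling from a Kingman paintbox can be sketched as follows (a minimal sketch; the interval lengths are illustrative and here sum to 1, so there is no leftover "dust" mass):

```python
import bisect
import random

def kingman_paintbox(lengths, N, seed=0):
    """Partition data points 1..N by dropping iid uniforms onto the unit
    interval split into subintervals of the given lengths (summing to 1);
    points landing in the same subinterval share a cluster label."""
    rng = random.Random(seed)
    edges, acc = [], 0.0
    for w in lengths:
        acc += w
        edges.append(acc)
    labels = []
    for _ in range(N):
        u = rng.random()
        # guard against float round-off at the right edge
        labels.append(min(bisect.bisect_right(edges, u), len(lengths) - 1))
    return labels

clusters = kingman_paintbox([0.5, 0.3, 0.2], N=7, seed=3)
```

Exchangeability is immediate: the uniforms are iid, so permuting the data points does not change the distribution of the induced partition.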

SLIDE 53

Paintboxes

Exchangeable feature allocation: feature paintbox [Broderick, Pitman, Jordan (submitted)]

Now each feature (Cat feature, Dog feature, Mouse feature, Lizard feature, Sheep feature, Horse feature, ...) corresponds to a subset of the unit interval, and the subsets may overlap. Draw u_1, u_2, ..., u_7, ... iid Uniform[0, 1]; data point n exhibits feature k exactly when u_n lands in feature k's subset.

SLIDE 62

Paintboxes

Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions. Exchangeable feature distributions = feature paintbox allocations; feature distributions with EFPFs (e.g. the IBP) are a strict subclass, with the two-feature example outside. [Broderick, Pitman, Jordan (submitted)]

SLIDE 64

Paintboxes

Two-feature example: a feature paintbox with two overlapping subsets, Feature 1 and Feature 2, splitting the unit interval into regions of lengths p1 (Feature 1 only), p2 (Feature 2 only), p3 (both), p4 (neither), so that P(row = {1}) = p1, P(row = {2}) = p2, P(row = {1,2}) = p3, P(row = ∅) = p4.

SLIDE 65

Paintboxes

Indian buffet process: beta feature frequencies [Thibaux, Jordan 2007]

For m = 1, 2, ...:

  • 1. Draw K_m^+ ~ Poisson(γθ / (θ + m − 1)) and set K_m = Σ_{j=1}^{m} K_j^+
  • 2. For k = K_{m−1} + 1, . . . , K_m: draw a feature frequency q_k ∼ Beta(1, θ + m − 1)
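The round-by-round construction can be sketched as follows (a sketch truncated at a finite number of rounds; numpy's generator supplies the Poisson and beta draws):

```python
import numpy as np

def ibp_frequencies(gamma, theta, rounds, seed=0):
    """Generate IBP feature frequencies via the beta-frequency rounds:
    round m contributes K_m^+ ~ Poisson(gamma*theta/(theta+m-1)) new
    features, each with frequency drawn from Beta(1, theta + m - 1)."""
    rng = np.random.default_rng(seed)
    q = []
    for m in range(1, rounds + 1):
        k_plus = rng.poisson(gamma * theta / (theta + m - 1))
        q.extend(rng.beta(1.0, theta + m - 1, size=k_plus).tolist())
    return q
```

Later rounds have larger beta second parameters, so the frequencies they contribute tend to be smaller, matching the picture of ever-thinner features.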

SLIDE 77

Paintboxes

Indian buffet process: beta feature frequencies

[Figure: the unit interval with feature frequencies q1, q2, q3, q4, q5, q6, ... marked]


SLIDE 83

Paintboxes

[Figure: a feature paintbox built from the frequencies q1, q2, ..., q6, ..., each feature occupying a subset of length q_k]

SLIDE 84

Paintboxes

“Frequency models”: each feature k has a frequency q_k, and each data point exhibits feature k independently with probability q_k. [Broderick, Pitman, Jordan (submitted)]
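A frequency model is straightforward to sample (a sketch; any frequencies in [0, 1] work, including beta-distributed ones):

```python
import numpy as np

def sample_frequency_model(q, N, seed=0):
    """Given feature frequencies q, each of N data points exhibits
    feature k independently with probability q[k]; returns the
    N x K binary feature matrix."""
    rng = np.random.default_rng(seed)
    return (rng.random((N, len(q))) < np.asarray(q)).astype(int)
```

Because the rows are iid given q, the resulting feature allocation is automatically exchangeable.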

SLIDE 85

Paintboxes

Two-feature example (row probabilities p1, p2, p3, p4 as before): not a frequency model in general, since the two features need not occur independently across data points.

SLIDE 87

Paintboxes

Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions. Exchangeable feature distributions = feature paintbox allocations; frequency models are a subclass containing the IBP but not the two-feature example. [Broderick, Pitman, Jordan (submitted)]

SLIDE 89

Frequency models: EFPFs?

For a frequency model with frequencies q1, q2, ... (drawn from some distribution) and a feature matrix with N data points, K features, and feature sizes S_{N,1}, ..., S_{N,K}:

P(matrix) = E[ Σ_{distinct i_1, ..., i_K} (1/K!) Π_{k=1}^{K} q_{i_k}^{S_{N,k}} (1 − q_{i_k})^{N − S_{N,k}} · Π_{j ∉ {i_k}_{k=1}^{K}} (1 − q_j)^N ]

          = p(N; S_{N,1}, S_{N,2}, . . . , S_{N,K})

a function of the number of data points, the number of features, and the feature sizes alone: an EFPF. So every frequency model has an EFPF. [Broderick, Pitman, Jordan (submitted)]
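With a fixed, finite list of frequencies the outer expectation is trivial, and the sum over distinct indices can be enumerated directly (a sketch; as a sanity check, for N = 1 the probabilities of seeing zero, one, or two singleton features must total 1):

```python
import itertools
import math

def frequency_efpf(N, sizes, q):
    """EFPF of a frequency model with fixed frequencies q: sum over
    distinct assignments i_1, ..., i_K of the K observed features to
    entries of q, following the displayed formula."""
    K = len(sizes)
    total = 0.0
    for idx in itertools.permutations(range(len(q)), K):
        term = 1.0 / math.factorial(K)
        # observed features, matched to frequencies q[i_k]
        for s, i in zip(sizes, idx):
            term *= q[i] ** s * (1.0 - q[i]) ** (N - s)
        # unused frequencies must produce no occurrences in N rows
        for j in set(range(len(q))) - set(idx):
            term *= (1.0 - q[j]) ** N
        total += term
    return total

q = [0.3, 0.6]
check = (frequency_efpf(1, [], q)
         + frequency_efpf(1, [1], q)
         + frequency_efpf(1, [1, 1], q))
```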

SLIDE 106

Frequency models: EFPFs?

Exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions. Exchangeable feature distributions = feature paintbox allocations; frequency models are contained in the feature distributions with EFPFs. Does the reverse containment hold? [Broderick, Pitman, Jordan (submitted)]

SLIDE 108

Distributions with EFPFs: frequencies?

Setup: a feature allocation of data points n = 1, ..., N with K = 2 features for all N, assumed to have an EFPF p(N; S_{N,1}, S_{N,2}). Want to show: there exist frequencies q1, q2 under which the rows are generated independently, e.g. P(row = {feature 1 only} | q1, q2) = q1(1 − q2).

SLIDE 117

Distributions with EFPFs: frequencies?

By the feature paintbox representation, the rows are iid given (possibly random) region sizes p1, p2, p3, p4: P(row = {1} | p1:4) = p1, P(row = {2} | p1:4) = p2, P(row = {1,2} | p1:4) = p3, P(row = ∅ | p1:4) = p4. Define q1 = p1 + p3 and q2 = p2 + p3.

SLIDE 121

Distributions with EFPFs: frequencies?

Assume the EFPF. Three four-row matrices, with rows ({1}, {1}, {2}, {2}), ({1,2}, {1,2}, ∅, ∅), and ({1}, {2}, {1,2}, ∅), all have N = 4 and feature sizes (2, 2), so the EFPF forces

P(4; 2, 2) = E[p1² p2²] = E[p3² p4²] = E[p1 p2 p3 p4]

Hence E[(p1 p2 − p3 p4)²] = E[p1² p2²] − 2 E[p1 p2 p3 p4] + E[p3² p4²] = 0, so

p1 p2 = p3 p4   a.s.

SLIDE 127

Distributions with EFPFs: frequencies?

Assume the EFPF. From p1 p2 = p3 p4 (a.s.) and p1 + p2 + p3 + p4 = 1, algebra gives

p1 = (p1 + p3)(1 − [p2 + p3])   a.s.

(indeed (p1 + p3)(1 − p2 − p3) = (p1 + p3)(p1 + p4) = p1² + p1 p4 + p1 p3 + p3 p4 = p1(p1 + p2 + p3 + p4) = p1, using p3 p4 = p1 p2), that is,

p1 = q1(1 − q2)   a.s.

SLIDE 134

Distributions with EFPFs: frequencies?

Feature distributions with EFPFs = frequency models. Combined picture: exchangeable cluster distributions = cluster distributions with EPPFs = Kingman paintbox partitions; exchangeable feature distributions = feature paintbox allocations, containing both the IBP and the two-feature example. [Broderick, Pitman, Jordan (submitted)]

slide-136
SLIDE 136

Conclusions

  • Feature paintbox: characterization of exchangeable

feature models

  • Limits of clustering characterizations in feature case?
  • Remaining connections to fill in
  • Other combinatorial structures

Exchangeable features

30

slide-137
SLIDE 137

Conclusions

  • Feature paintbox: characterization of exchangeable

feature models

  • Limits of clustering characterizations in feature case?
  • Remaining connections to fill in
  • Other combinatorial structures

Exchangeable features

30

slide-138
SLIDE 138

Conclusions

  • Feature paintbox: characterization of exchangeable feature models
  • Limits of clustering characterizations in feature case?
  • Characterization of alternative correlation structure
  • Remaining connections to fill in
  • Other combinatorial structures

Two-feature example; Exchangeable features; feature paintbox; IBP; Completely random measures; CRP; Normalized completely random measures; Models with EFPFs; frequency models; Exchangeable clusters; models with EPPFs; Kingman paintbox

30
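The IBP named in the diagram above is the Indian buffet process of Griffiths and Ghahramani (see References), the canonical exchangeable feature-allocation prior. As a concrete reminder of its generative process, here is a minimal sketch (function names and parameters are illustrative, not from the talk): customer i takes each previously sampled dish k with probability m_k / i, where m_k is the number of earlier customers who took dish k, then samples Poisson(alpha / i) new dishes.

```python
import math
import random

def _poisson(lam, rng):
    # Knuth's inversion-by-multiplication Poisson sampler (fine for small lam)
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def sample_ibp(n, alpha, seed=None):
    """One draw of an n-row binary feature matrix from the Indian buffet process."""
    rng = random.Random(seed)
    dishes = []  # dishes[k] = list of customers (1-indexed) who took dish k
    for i in range(1, n + 1):
        # take each existing dish k with probability m_k / i
        for takers in dishes:
            if rng.random() < len(takers) / i:
                takers.append(i)
        # then sample Poisson(alpha / i) brand-new dishes
        for _ in range(_poisson(alpha / i, rng)):
            dishes.append([i])
    # rows = customers (data points), columns = dishes (features)
    return [[1 if i in takers else 0 for takers in dishes]
            for i in range(1, n + 1)]
```

Note that, unlike a partition (each data point in exactly one block), each row here can have any finite number of ones: this is exactly the "finite # of features per data point" condition from the start of the talk.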

slide-153
SLIDE 153

References

31

  • T. Broderick, M. I. Jordan, and J. Pitman. Clusters and features from combinatorial stochastic processes. arXiv preprint arXiv:1206.5862, 2012.
  • T. Broderick, J. Pitman, and M. I. Jordan. Feature allocations, probability functions, and paintboxes. Submitted.
  • T. Broderick, L. Mackey, J. Paisley, and M. I. Jordan. Combinatorial clustering and the beta negative binomial process. arXiv preprint arXiv:1111.1802, 2011.
  • T. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian buffet process. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18, pages 475–482. MIT Press, Cambridge, MA, 2006.
  • N. L. Hjort. Nonparametric Bayes estimators based on beta processes in models for life history data. Annals of Statistics, 18(3):1259–1294, 1990.
  • Y. Kim. Nonparametric Bayesian estimators for counting processes. Annals of Statistics, 27(2):562–588, 1999.
  • J. F. C. Kingman. The representation of partition structures. Journal of the London Mathematical Society, 2(2):374, 1978.
  • J. Pitman. Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields, 102(2):145–158, 1995.
  • R. Thibaux and M. I. Jordan. Hierarchical beta processes and the Indian buffet process. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 11, 2007.
  • M. Zhou, L. Hannah, D. Dunson, and L. Carin. Beta-negative binomial process and Poisson factor analysis. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 15, 2012.