School of Computer Science

Infinite Mixture and Dirichlet Process

Probabilistic Graphical Models (10-708)

Lecture 20, Nov 28, 2007

Eric Xing

[Figure: graphical model over variables X1 to X8, with nodes labeled Receptor A, Receptor B, Kinase C, Kinase D, Kinase E, TF F, Gene G, Gene H]

Reading:


Clustering


Object Recognition and Tracking

[Figure: objects tracked over t=1, t=2, t=3, each observation a feature vector: (1.8, 7.4, 2.3), (1.9, 9.0, 2.1), (1.9, 6.1, 2.2), (0.9, 5.8, 3.1), (0.7, 5.1, 3.2), (0.6, 5.9, 3.2)]


Modeling The Mind …

[Figure: over t=1, …, t=T the subject reads a sentence, views a picture, and decides whether they are consistent; labeled rows show the latent brain processes and the fMRI scan]


The Evolution of Science

[Figure: PNAS papers, research topics, and research circles (CS, Bio, Phy) along a timeline from 1900 to 2000 and onward (?)]


Partially Observed, Open and Evolving Possible Worlds

  • Unbounded # of objects/trajectories
  • Changing attributes
  • Birth/death, merge/split
  • Relational ambiguity
  • The parametric paradigm:
  • Finite
  • Structurally unambiguous

[Figure: entity space and observation space, with states $\Xi^*_{t|t}$ and $\Xi^*_{t+1|t+1}$]

Sensor model: $p(x \mid \{\phi_k\})$

Motion model: $p(\{\phi_k\}^{t+1} \mid \{\phi_k\}^{t})$

Event model: $p(\{\phi_k\})$ or $p(\{\phi_k\}^{1:T})$

How to open it up?


A Classical Approach

Clustering as mixture modeling, then "model selection"


Model Selection vs. Posterior Inference

Model selection

  • "intelligent" guess: ???
  • cross validation: data-hungry
  • information theoretic: $\arg\min \mathrm{KL}\big(f(\cdot) \,\|\, g(\cdot \mid \hat{\theta}_{ML}, K)\big)$
  • AIC
  • TIC
  • MDL
  • Bayes factor: need to compute data likelihood

Parsimony, Occam's Razor

Posterior inference:

we want to handle uncertainty of model complexity explicitly

$$p(M \mid D) \propto p(D \mid M)\, p(M), \qquad M \equiv \{\theta, K\}$$

  • we favor a distribution that does not constrain M in a "closed" space!
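The relation p(M|D) ∝ p(D|M) p(M) can be made concrete for a finite list of candidate models. The sketch below is only an illustration: the log marginal likelihoods are hypothetical numbers, and the function names `model_posterior` and `aic` are not from the lecture. It also shows the limitation the slide points at, since the candidate set here is a "closed" space fixed in advance.

```python
import math

def model_posterior(log_evidence, log_prior):
    """Normalize p(M|D) ∝ p(D|M) p(M) over a finite list of candidate models."""
    log_joint = [le + lp for le, lp in zip(log_evidence, log_prior)]
    shift = max(log_joint)  # log-sum-exp shift for numerical stability
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

def aic(log_likelihood, num_params):
    """Akaike information criterion, 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood

# Hypothetical log marginal likelihoods ln p(D | K) for K = 1, 2, 3, 4.
log_evidence = [-120.0, -104.5, -103.9, -106.2]
log_prior = [math.log(0.25)] * 4  # uniform prior over the four candidates
posterior = model_posterior(log_evidence, log_prior)
best_k = 1 + max(range(4), key=lambda k: posterior[k])
```

With a uniform prior the posterior simply follows the evidence, so `best_k` picks the K with the largest log marginal likelihood.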


Two "Recent" Developments

First order probabilistic languages (FOPLs)

  • Examples: PRM, BLOG …
  • Lift graphical models to "open" world (#rv, relation, index, lifespan …)
  • Focus on complete, consistent, and operating rules to instantiate possible worlds, and formal language of expressing such rules
  • Operational way of defining distributions over possible worlds, via sampling methods

Bayesian Nonparametrics

  • Examples: Dirichlet processes, stick-breaking processes …
  • From finite, to infinite mixture, to more complex constructions (hierarchies, spatial/temporal sequences, …)
  • Focus on the laws and behaviors of both the generative formalisms and resulting distributions
  • Often offer explicit expression of distributions, and expose the structure of the distributions, which motivates various approximate schemes


Clustering

  • How to label them?
  • How many clusters?

Genetic Demography

  • Are there genetic prototypes among them?
  • What are they?
  • How many? (How many ancestors do we have?)


Genetic Polymorphisms


Biological Terms

Genetic polymorphism: a difference in DNA sequence among individuals, groups, or populations

Single Nucleotide Polymorphism (SNP): DNA sequence variation occurring when a single nucleotide - A, T, C, or G - differs between members of the species

  • Each variant is called an "allele"
  • Almost always bi-allelic
  • Account for most of the genetic diversity among different (normal) individuals, e.g. drug response, disease susceptibility


From SNPs to Haplotypes

Alleles of adjacent SNPs on a chromosome form haplotypes

  • Powerful in the study of disease association or genetic evolution

Haplotype and Genotype

Haplotype: a collection of alleles derived from the same chromosome

[Figure: the same individual's alleles shown as haplotypes (chromosome phase is known) and as genotypes (chromosome phase is unknown); haplotype re-construction goes from genotypes to haplotypes]
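To see why unknown phase makes haplotype re-construction ambiguous, the sketch below enumerates every unordered haplotype pair consistent with one unphased genotype. The 0/1/2 encoding (number of copies of one allele at each bi-allelic SNP) and the function name are assumptions made for illustration, not notation from the slides.

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype[j] is the assumed count (0, 1 or 2) of allele 1 at SNP j.
    """
    het_sites = [j for j, g in enumerate(genotype) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: count 0 -> allele 0, count 2 -> allele 1.
        h1 = [g // 2 for g in genotype]
        h2 = list(h1)
        # Each heterozygous site can be phased in two ways.
        for j, b in zip(het_sites, bits):
            h1[j], h2[j] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)

pairs = compatible_haplotype_pairs([1, 2, 1, 0])
```

A genotype with m heterozygous sites has 2^(m-1) compatible pairs when m ≥ 1, which is why the phase has to be inferred rather than read off.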


Ancestral Inference

  • Better recovery of the ancestors leads to better haplotyping results (because of more accurate grouping of common haplotypes)
  • True haplotypes are obtainable with high cost, but they can validate the model more objectively (as opposed to examining saliency of clustering)
  • Many other biological/scientific utilities

[Figure: graphical model with genotype Gn, haplotypes Hn1 and Hn2, ancestors Ak with parameters θk, and an unknown number (?) of ancestors]

Essentially a clustering problem, but …


A Finite (Mixture of) Allele Model

The probability of a genotype g:

$$p(g) = \sum_{h_1, h_2 \in H} p(h_1, h_2)\, p(g \mid h_1, h_2)$$

where $p(g \mid h_1, h_2)$ is the genotyping model, $p(h_1, h_2)$ is the haplotype model, and $H$ is the population haplotype pool.

Standard settings:

  • $|H| = K \ll 2^J$: fixed-sized population haplotype pool
  • $p(h_1, h_2) = p(h_1)\,p(h_2) = f_1 f_2$: Hardy-Weinberg equilibrium

Problem: K ? H ?

[Figure: graphical model over Gn, Hn1, Hn2]
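The genotype probability of the finite model, a sum over haplotype pairs from a fixed pool weighted by p(h1, h2) p(g | h1, h2), can be sketched in a few lines. Assumptions made here for illustration only: alleles are coded 0/1, genotypes are site-wise allele counts, the genotyping model is noiseless, and the pool and frequencies are made-up values.

```python
from itertools import product

def genotype_probability(genotype, pool, freqs):
    """p(g) = sum over ordered pairs (h1, h2) in the pool of f1 * f2 * p(g | h1, h2)."""
    total = 0.0
    for (h1, f1), (h2, f2) in product(zip(pool, freqs), repeat=2):
        # Assumed noiseless genotyping model: p(g | h1, h2) is 1 when g equals
        # the site-wise allele count of the pair, and 0 otherwise.
        if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
            total += f1 * f2  # Hardy-Weinberg: p(h1, h2) = f1 * f2
    return total

# Hypothetical pool of K = 3 haplotypes over J = 2 SNPs, with frequencies f.
pool = [(0, 0), (0, 1), (1, 1)]
freqs = [0.5, 0.3, 0.2]
p_het = genotype_probability((0, 1), pool, freqs)
```

The pool size K and its contents must be supplied up front, which is exactly the "K ? H ?" problem the slide raises.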


An Infinite (Mixture of) Allele Model

[Figure: graphical model over Gn, Hn1, Hn2, Ak, θk]

How?

  • Via a nonparametric hierarchical Bayesian formalism!

Stick-breaking Process

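$$G = \sum_{k=1}^{\infty} \pi_k\, \delta_{\theta_k}, \qquad \theta_k \sim G_0, \qquad \sum_{k=1}^{\infty} \pi_k = 1$$

$$\pi_k = \beta_k \prod_{j=1}^{k-1} (1 - \beta_j), \qquad \beta_k \sim \mathrm{Beta}(1, \alpha)$$

Location: $\theta_k$. Mass: $\pi_k$.

[Figure: breaking a unit stick. $\beta_1 = 0.4$ gives $\pi_1 = 0.4$ (0.6 remains); $\beta_2 = 0.5$ gives $\pi_2 = 0.3$ (0.3 remains); $\beta_3 = 0.8$ gives $\pi_3 = 0.24$]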
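The stick-breaking numbers on this slide (0.4, 0.3, 0.24) can be reproduced directly from the break fractions 0.4, 0.5, 0.8. The sketch below is a minimal illustration with hypothetical function names; sampling is truncated to a finite number of sticks, which is an approximation of the infinite construction.

```python
import random

def stick_breaking_weights(betas):
    """Turn break fractions beta_k into masses pi_k = beta_k * prod_{j<k} (1 - beta_j)."""
    weights, remaining = [], 1.0
    for beta in betas:
        weights.append(beta * remaining)  # take a fraction beta of what is left
        remaining *= 1.0 - beta           # the rest of the stick stays unbroken
    return weights, remaining

def sample_stick_breaking(alpha, num_sticks, rng):
    """Draw beta_k ~ Beta(1, alpha) and return the first num_sticks masses (a truncation)."""
    betas = [rng.betavariate(1.0, alpha) for _ in range(num_sticks)]
    return stick_breaking_weights(betas)

# Break fractions read off the slide's example.
weights, remaining = stick_breaking_weights([0.4, 0.5, 0.8])
```

Smaller α pushes the Beta(1, α) draws toward 1, so most of the mass lands on the first few sticks; larger α spreads it over many components.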


Graphical Model

[Figure: graphical model over Gn, Hn1, Hn2, Ak, θk]


Chinese Restaurant Process

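CRP defines an exchangeable distribution on partitions over an (infinite) sequence of samples; such a distribution is formally known as the Dirichlet Process (DP)

$P(c_i = k \mid \mathbf{c}_{-i})$ for tables $\theta_1$, $\theta_2$, …, and a new table:

  • customer 1: $1$
  • customer 2: $\frac{1}{1+\alpha}$, $\frac{\alpha}{1+\alpha}$
  • customer 3: $\frac{1}{2+\alpha}$, $\frac{1}{2+\alpha}$, $\frac{\alpha}{2+\alpha}$
  • customer 4: $\frac{1}{3+\alpha}$, $\frac{2}{3+\alpha}$, $\frac{\alpha}{3+\alpha}$
  • ....
  • customer i: $\frac{m_1}{i+\alpha-1}$, $\frac{m_2}{i+\alpha-1}$, ...., $\frac{\alpha}{i+\alpha-1}$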
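The seating rule of the Chinese Restaurant Process is short enough to simulate. In the sketch below, which uses hypothetical function names, customer i joins existing table k with probability m_k / (i - 1 + α) and opens a new table with probability α / (i - 1 + α), where m_k is the current occupancy of table k.

```python
import random

def crp_seating_probabilities(table_counts, alpha):
    """Return P(c_i = k | c_-i) for each existing table, followed by the new-table probability."""
    denom = sum(table_counts) + alpha  # equals i - 1 + alpha for customer i
    return [m / denom for m in table_counts] + [alpha / denom]

def sample_crp(num_customers, alpha, rng):
    """Seat customers one at a time; return table assignments and table occupancies."""
    counts, assignments = [], []
    for _ in range(num_customers):
        probs = crp_seating_probabilities(counts, alpha)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)   # open a new table
        else:
            counts[k] += 1     # join an existing table
        assignments.append(k)
    return assignments, counts

assignments, counts = sample_crp(20, 1.0, random.Random(0))
```

The number of occupied tables is not fixed in advance; it grows with the number of customers at a rate controlled by α.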


The DP Mixture of Ancestral Haplotypes

[Figure: customers 1 to 9 seated around tables, each table carrying its own {A, θ}]

The customers around a table form a cluster

  • associate a mixture component (i.e., a population haplotype) with a table
  • sample {a, θ} at each table from a base measure G0 to obtain the population haplotype and nucleotide substitution frequency for that component
  • With p(h|{A, θ}) and p(g|h1,h2), the CRP yields a posterior distribution on the number of population haplotypes (and on the haplotype configurations and the nucleotide substitution frequencies)


DP-haplotyper

Inference:

Markov Chain Monte Carlo (MCMC)

  • Gibbs sampling
  • Metropolis-Hastings

[Figure: graphical model with α and G0 feeding G, components {A, θ} (plate K), and Hn1, Hn2, Gn (plate N)]

DP infinite mixture components (for population haplotypes)

Likelihood model (for individual haplotypes and genotypes)


Model components

Choice of base measure:

$$G_0 \sim \mathrm{Unif}(a_j)\, \mathrm{Beta}(\theta_j)$$

Nucleotide-substitution model:

$$p(h_i \mid \{a, \theta\}_k) = \prod_j p(h_{i,j} \mid a_{k,j}, \theta_{k,j})$$

where

$$p(h_{i,j} \mid a_{k,j}, \theta_{k,j}) = \begin{cases} \theta_{k,j} & \text{if } h_{i,j} = a_{k,j} \\ 1 - \theta_{k,j} & \text{if } h_{i,j} \neq a_{k,j} \end{cases}$$

Noisy genotyping model:

$$p(g_i \mid h_{i_1}, h_{i_2}) = \prod_j p(g_{i,j} \mid h_{i_1,j}, h_{i_2,j})$$

where

$$p(g_{i,j} \mid h_{i_1,j}, h_{i_2,j}) = \begin{cases} \gamma & \text{if } h_{i_1,j} \oplus h_{i_2,j} = g_{i,j} \\ \dfrac{1-\gamma}{2} & \text{if } h_{i_1,j} \oplus h_{i_2,j} \neq g_{i,j} \end{cases}$$
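The two likelihood components on this slide, the nucleotide-substitution model and the noisy genotyping model, are both site-wise products and are easy to write down. The sketch below assumes alleles coded 0/1, genotypes coded as allele counts 0/1/2, and the ⊕ operation taken as the site-wise allele count; these encodings and the function names are illustrative choices, not the lecture's notation.

```python
def haplotype_likelihood(h, a, theta):
    """p(h | a, theta): each site contributes theta_j if h_j == a_j, else 1 - theta_j."""
    p = 1.0
    for hj, aj, tj in zip(h, a, theta):
        p *= tj if hj == aj else 1.0 - tj
    return p

def genotype_likelihood(g, h1, h2, gamma):
    """p(g | h1, h2): each site contributes gamma if h1_j + h2_j == g_j, else (1 - gamma) / 2."""
    p = 1.0
    for gj, x, y in zip(g, h1, h2):
        # Two of the three genotype values are wrong, so each gets (1 - gamma) / 2.
        p *= gamma if x + y == gj else (1.0 - gamma) / 2.0
    return p

# A haplotype that differs from its ancestor at the last of three sites.
lik_h = haplotype_likelihood((0, 1, 1), (0, 1, 0), (0.9, 0.9, 0.9))
# A genotype that matches both sites of its haplotype pair.
lik_g = genotype_likelihood((1, 2), (0, 1), (1, 1), 0.95)
```

Dividing the error mass by two keeps the per-site genotype distribution normalized over the three possible genotype values.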


Gibbs sampling

Starting from some initial haplotype reconstruction $H^{(0)}$, pick a first table with an arbitrary $a_1^{(0)}$, and form initial population-hap pool $A^{(0)} = \{a_1^{(0)}\}$:

i) Choose an individual $i$ and one of his/her two haplotypes $t$, uniformly and at random, from all ambiguous individuals;

ii) Sample $c_{i_t}^{(t+1)}$ from $p(c_{i_t} \mid \mathbf{c}_{-i_t}^{(t)}, H^{(t)}, A^{(t)})$, update $\mathbf{c}^{(t+1)}$;

iii) Sample $a_k^{(t+1)}$, where $k = c_{i_t}^{(t+1)}$, from $p(a_k \mid h_{i'_{t'}}^{(t)}\ \forall\, i'_{t'}\ \text{s.t.}\ c_{i'_{t'}}^{(t+1)} = k)$; update $A^{(t+1)}$;

iv) Sample $h_{i_t}^{(t+1)}$ from $p(h_{i_t} \mid A^{(t+1)}, c_{i_t}^{(t+1)}, H_{-i_t}^{(t)})$, update $H^{(t+1)}$.
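The assignment step of the sampler, drawing the ancestor index of one haplotype given everything else, can be sketched under simplifying assumptions. Here θ is a single fixed scalar rather than a per-ancestor quantity with a Beta prior, alleles are coded 0/1, and the base measure on ancestral alleles is uniform, so the marginal likelihood under a new ancestor is 0.5 per site. The function names are hypothetical, and this is not the full DP-haplotyper update.

```python
import random

def assignment_conditional(h, ancestors, counts, alpha, theta):
    """Normalized p(c = k | rest) over existing ancestors plus one new-ancestor slot.

    counts[k] must exclude the haplotype h that is being resampled.
    """
    weights = []
    for a, n in zip(ancestors, counts):
        lik = 1.0
        for hj, aj in zip(h, a):
            lik *= theta if hj == aj else 1.0 - theta
        weights.append(n * lik)  # CRP prior (occupancy) times likelihood
    # New ancestor: uniform binary allele and fixed theta give
    # 0.5 * theta + 0.5 * (1 - theta) = 0.5 per site (an assumption of this sketch).
    weights.append(alpha * 0.5 ** len(h))
    total = sum(weights)
    return [w / total for w in weights]

def resample_assignment(h, ancestors, counts, alpha, theta, rng):
    """Draw a new ancestor index; index len(ancestors) means 'open a new ancestor'."""
    probs = assignment_conditional(h, ancestors, counts, alpha, theta)
    return rng.choices(range(len(probs)), weights=probs)[0]

probs = assignment_conditional((0, 1, 1), [(0, 1, 1), (1, 0, 0)], [3, 2], 1.0, 0.9)
new_c = resample_assignment((0, 1, 1), [(0, 1, 1), (1, 0, 0)], [3, 2], 1.0, 0.9, random.Random(0))
```

In this example the haplotype matches the first ancestor exactly, so that ancestor receives most of the probability mass, while the new-ancestor slot still beats the badly mismatched second ancestor.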


Convergence of Ancestral Inference


Haplotyping Error

The Gabriel data