SLIDE 1

Background Gene Expression Analysis Replicated Microarray Analysis

CSci 8980: Advanced Topics in Graphical Models Application: Gene Expression Analysis

Instructor: Arindam Banerjee
November 20, 2007

SLIDE 2

Microarray Technology

SLIDE 12

Microarray Data

T gene expression profiles under M conditions
$x_i$ is the profile for the $i$th gene
$x_{im}$ is the value for gene $i$, condition $m$
Each profile is assumed to be generated by one cluster
Total $Q$ clusters, later $Q \to \infty$
Clustering is an assignment $(c_1, \ldots, c_T)$
  Each $c_i \in \{1, \ldots, Q\}$
  Want posterior probability over assignments
Idea: Use a Bayesian infinite mixture model

SLIDE 15

Infinite Mixture Model for Gene Expression

Level 1: Data generation
$$p(x_i \mid c_i = j, \{\mu_h, \sigma_h^2\}_{h=1}^{Q}) = N(x_i; \mu_j, \sigma_j^2 I)$$

Level 2: Priors for parameters
$$p(\mu_j \mid \lambda, r) = N(\mu_j; \lambda, r^{-1} I)$$
$$p(\sigma_j^{-2} \mid \beta, w) = f_G(\sigma_j^{-2}; \beta/2, \beta w/2)$$

Level 2: Prior for clustering
$$c_i \sim \mathrm{Discrete}(\pi)$$
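A minimal sketch of drawing data from the finite-$Q$ version of this model with the hyper-parameters held fixed. It assumes that $f_G(\cdot; a, b)$ is a Gamma density with shape $a$ and rate $b$, which the slides do not state; the function name, its arguments, and the use of Python's standard `random` module are illustrative choices, not part of the lecture.

```python
import random

def sample_finite_mixture(T, pi, lam, r, beta, w, seed=0):
    """Draw T profiles from the Q-cluster model (Q = len(pi), M = len(lam)).

    Assumption: f_G(.; a, b) is Gamma with shape a and RATE b, so the
    precision sigma_j^{-2} has shape beta/2 and scale 2/(beta*w).
    """
    rng = random.Random(seed)
    Q, M = len(pi), len(lam)
    # Level 2: cluster means mu_j ~ N(lam, (1/r) I)
    mu = [[rng.gauss(lam[m], (1.0 / r) ** 0.5) for m in range(M)]
          for _ in range(Q)]
    # Level 2: precisions sigma_j^{-2} ~ Gamma(shape=beta/2, rate=beta*w/2)
    precision = [rng.gammavariate(beta / 2.0, 2.0 / (beta * w))
                 for _ in range(Q)]
    sigma = [p ** -0.5 for p in precision]
    # Level 2: cluster labels c_i ~ Discrete(pi)
    c = rng.choices(range(Q), weights=pi, k=T)
    # Level 1: x_i ~ N(mu_{c_i}, sigma_{c_i}^2 I)
    x = [[rng.gauss(mu[j][m], sigma[j]) for m in range(M)] for j in c]
    return x, c, mu, sigma
```

For example, `sample_finite_mixture(T=5, pi=[0.5, 0.5], lam=[0.0, 0.0, 0.0], r=1.0, beta=4.0, w=1.0)` returns five profiles of length three together with their cluster labels.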

SLIDE 18

Infinite Mixture Model for Gene Expression (Contd.)

Prior for hyper-parameters
$$p(w \mid \sigma_x^2) = f_G(w; 1/2, 1/(2\sigma_x^2))$$
$$p(\beta) = f_G(\beta; 1/2, 1/2)$$
$$p(r \mid \sigma_x^2) = f_G(r; 1/2, 1/(2\sigma_x^2))$$
$$p(\lambda \mid \mu_x, \sigma_x^2) = f_N(\lambda; \mu_x, \sigma_x^2 I)$$
$(\mu_x, \sigma_x^2)$ are the empirical mean and variance

Prior on cluster-prior $\pi$
$$\pi \sim \mathrm{Dirichlet}(\alpha/Q, \ldots, \alpha/Q)$$
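The hyper-priors can be sampled in the same spirit. The sketch below is only illustrative: it assumes the shape/rate reading of $f_G$, takes the empirical mean as a per-condition vector and the empirical variance as a single pooled scalar (the slides do not say how these are computed), and draws the symmetric Dirichlet by normalizing Gamma variates. All names are hypothetical.

```python
import random

def empirical_moments(x):
    """Per-condition mean vector and one pooled variance (an assumed definition)."""
    T, M = len(x), len(x[0])
    mu_x = [sum(row[m] for row in x) / T for m in range(M)]
    var_x = sum((row[m] - mu_x[m]) ** 2
                for row in x for m in range(M)) / (T * M)
    return mu_x, var_x

def sample_hyperparameters(x, alpha, Q, seed=0):
    """Draw (w, beta, r, lam, pi) from the priors listed above."""
    rng = random.Random(seed)
    mu_x, var_x = empirical_moments(x)
    scale = 2.0 * var_x                      # rate 1/(2 var_x) -> scale 2 var_x
    w = rng.gammavariate(0.5, scale)
    beta = rng.gammavariate(0.5, 2.0)        # rate 1/2 -> scale 2
    r = rng.gammavariate(0.5, scale)
    lam = [rng.gauss(m, var_x ** 0.5) for m in mu_x]
    # pi ~ Dirichlet(alpha/Q, ..., alpha/Q) via normalized Gamma draws
    g = [rng.gammavariate(alpha / Q, 1.0) for _ in range(Q)]
    total = sum(g)
    pi = [v / total for v in g]
    return {"w": w, "beta": beta, "r": r, "lam": lam, "pi": pi}
```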

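SLIDE 22

Gibbs Sampler

As $Q \to \infty$, we get a DPM (Dirichlet process mixture)
Gibbs sampler for the model
$$p(c_i = j \mid c_{-i}, x_i, \mu_j, \sigma_j^2) \propto \frac{n_{-i,j}}{T - 1 + \alpha}\, N(x_i; \mu_j, \sigma_j^2 I)$$
$$p(c_i \neq c_j,\ j \neq i \mid c_{-i}, x_i, \mu_x, \sigma_x^2) \propto \frac{\alpha}{T - 1 + \alpha} \int N(x_i; \mu, \sigma^2 I)\, p(\mu, \sigma^2 \mid \ldots)\, d\mu\, d\sigma^2$$
Pairwise probability of being generated by the same pattern (with $S_B$ the number of post-burn-in samples)
$$P_{ij} = \frac{\#\,\text{samples after burn-in with } c_i = c_j}{S_B}$$
Distance
$$D_{ij} = 1 - P_{ij}$$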
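The two conditionals and the pairwise summary can be sketched as follows. This is a hedged illustration rather than the sampler used in the lecture: the integral for a new cluster is passed in as a precomputed log value (`log_new`, a hypothetical argument), empty clusters are assumed to have been dropped so every count is positive, and the denominator of $P_{ij}$ is taken to be the number of post-burn-in samples.

```python
import math

def log_normal_iso(x, mu, var):
    """log N(x; mu, var * I) for equal-length sequences x and mu."""
    M = len(x)
    sq = sum((a - b) ** 2 for a, b in zip(x, mu))
    return -0.5 * (M * math.log(2.0 * math.pi * var) + sq / var)

def assignment_probabilities(x_i, counts, mus, variances, alpha, log_new):
    """Normalized p(c_i = j | ...) per existing cluster, new cluster last.

    counts[j] is n_{-i,j} (must be > 0); log_new is the log of the
    integral term for a new cluster, supplied by the caller.
    """
    denom = sum(counts) + alpha              # T - 1 + alpha
    logs = [math.log(n / denom) + log_normal_iso(x_i, mu, v)
            for n, mu, v in zip(counts, mus, variances)]
    logs.append(math.log(alpha / denom) + log_new)
    top = max(logs)                          # subtract max for stability
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return [wt / total for wt in weights]

def coclustering(samples):
    """P[i][j]: fraction of post-burn-in samples with c_i == c_j; D = 1 - P."""
    S, T = len(samples), len(samples[0])
    P = [[sum(c[i] == c[j] for c in samples) / S for j in range(T)]
         for i in range(T)]
    D = [[1.0 - p for p in row] for row in P]
    return P, D
```

The matrix `D` can then be handed to any distance-based clustering routine to summarize the posterior over assignments.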

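SLIDE 27

Model for Replicated Data

Generate replicates of each expression
Account for variability in gene expression data
For $G$ replicates, $k = 1, \ldots, G$
$$p(x_{ik} \mid y_i, \psi_i) = N(x_{ik}; y_i, \psi_i^2 I)$$
$$p(y_i \mid c_i = j, \mu_j, \sigma_j) = N(y_i; \mu_j, \sigma_j^2 I)$$
Integrating out the mean expression profile $y_i$
$$p(x_{i1}, \ldots, x_{iG} \mid c_i = j, \mu_j, \sigma_j^2, \psi_i) = \int \prod_{k} N(x_{ik}; y_i, \psi_i^2 I)\, N(y_i; \mu_j, \sigma_j^2 I)\, dy_i \propto N\!\left(\bar{x}_i; \mu_j, \left(\sigma_j^2 + \frac{\psi_i^2}{G}\right) I\right)$$
where $\bar{x}_i$ is the mean of the $G$ replicates and the omitted factor does not depend on $\mu_j$ or $\sigma_j^2$
Gibbs sampler used for inference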
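A small numerical check of the integrated likelihood, offered as a sketch. The first function evaluates the Gaussian in the replicate mean; the second computes the exact joint log-density of the replicates for a single condition, using the fact that with $y_i$ integrated out they are jointly Gaussian with covariance $\psi_i^2 I + \sigma_j^2 \mathbf{1}\mathbf{1}^\top$. The function names are hypothetical. If the two differ only by a term free of $\mu_j$ and $\sigma_j^2$, the gap between them stays constant as those parameters change.

```python
import math

def replicate_log_likelihood(replicates, mu, var_cluster, var_rep):
    """log N(xbar; mu, (var_cluster + var_rep / G) I) for one gene."""
    G, M = len(replicates), len(mu)
    xbar = [sum(rep[m] for rep in replicates) / G for m in range(M)]
    var = var_cluster + var_rep / G
    sq = sum((a - b) ** 2 for a, b in zip(xbar, mu))
    return -0.5 * (M * math.log(2.0 * math.pi * var) + sq / var)

def exact_log_marginal_1d(xs, mu, var_cluster, var_rep):
    """Exact log p(x_1, ..., x_G) for one condition with y integrated out.

    Covariance is var_rep * I + var_cluster * 1 1^T; its determinant and
    inverse follow from the rank-one structure.
    """
    G = len(xs)
    d = [v - mu for v in xs]
    total = var_rep + G * var_cluster
    log_det = (G - 1) * math.log(var_rep) + math.log(total)
    quad = (sum(v * v for v in d) - var_cluster * sum(d) ** 2 / total) / var_rep
    return -0.5 * (G * math.log(2.0 * math.pi) + log_det + quad)

def gap(xs, mu, var_cluster, var_rep):
    """Difference between the exact log-marginal and the xbar form."""
    approx = replicate_log_likelihood([[v] for v in xs], [mu],
                                      var_cluster, var_rep)
    return exact_log_marginal_1d(xs, mu, var_cluster, var_rep) - approx
```

Calling `gap` with the same replicates and replicate variance but different cluster parameters should return the same number, which is why the replicate mean suffices when sampling cluster assignments.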

SLIDE 28

Experimental Results

Move to paper for results

SLIDE 29

Results

SLIDE 33

Gibbs Sampler

For the replicated model
$$p(c_i = q \mid C_{-i}, x_{i1}, \ldots, x_{iG}, \mu_q, \sigma_q^2, \psi_i, \alpha) \propto \frac{n_{-i,q}}{T - 1 + \alpha}\, N(\bar{x}_i; \mu_q, (\sigma_q^2 + \psi_i^2/G) I)$$
$$p(c_i \neq c_j,\ j \neq i \mid C_{-i}, x_{ik}, \psi_i, \alpha) \propto \frac{\alpha}{T - 1 + \alpha} \int N(\bar{x}_i; \mu, (\sigma^2 + \psi_i^2/G) I)\, p(\mu, \sigma^2 \mid \lambda, \tau, \ldots)\, d\mu\, d\sigma^2$$
The integral is implemented as follows
  Sample $\mu_p, \sigma_p^2$ from the prior
  Approximate the integral with $N(\bar{x}_i; \mu_p, (\sigma_p^2 + \psi_i^2/G) I)$
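The single-draw approximation of the integral might look like the sketch below. It assumes the prior on $(\mu, \sigma^2)$ is the Level 2 prior from the unreplicated model with hyper-parameters `lam`, `r`, `beta`, `w`, and the shape/rate reading of $f_G$; the slide's own hyper-parameter list is cut off, so this choice is a guess made for illustration, and the function name is hypothetical.

```python
import math
import random

def new_cluster_log_weight(replicates, psi2, lam, r, beta, w, rng):
    """One-sample estimate of log integral N(xbar; mu, (s2 + psi2/G) I) p(mu, s2).

    Draws (mu_p, sigma_p^2) once from the assumed prior and evaluates the
    Gaussian at the replicate mean, as described on the slide.
    """
    G, M = len(replicates), len(lam)
    xbar = [sum(rep[m] for rep in replicates) / G for m in range(M)]
    # mu_p ~ N(lam, (1/r) I)
    mu_p = [rng.gauss(lam[m], (1.0 / r) ** 0.5) for m in range(M)]
    # sigma_p^{-2} ~ Gamma(shape=beta/2, rate=beta*w/2), i.e. scale 2/(beta*w)
    precision_p = rng.gammavariate(beta / 2.0, 2.0 / (beta * w))
    var = 1.0 / precision_p + psi2 / G
    sq = sum((a - b) ** 2 for a, b in zip(xbar, mu_p))
    return -0.5 * (M * math.log(2.0 * math.pi * var) + sq / var)
```

A single draw gives a noisy estimate; averaging the exponentiated weights over several draws would reduce the variance at extra cost, though the slides describe only the one-sample version.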

SLIDE 37

Gibbs Sampler with 'Reverse Annealing'

Heuristic to handle mixing problems of Gibbs sampler
If $\pi$ is the target posterior, use flattened target
$$\pi^{(\xi)}(x) = \frac{\pi^{\xi}(x)}{K(\xi)}, \quad \xi < 1$$
The conditional probability can be flattened
$$p^{(\xi)}(c_i = j \mid C_{-i}, \Theta) = \frac{p(c_i = j \mid C_{-i}, \Theta)^{\xi}}{K(\xi)}, \quad \xi < 1$$
where $K(\xi)$ is the normalizing constant
Let $\xi \to 1$ as iterations $n \to \infty$
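Flattening a discrete conditional is a one-line operation, sketched here together with one possible schedule for $\xi$. The schedule is purely hypothetical (the slides only require $\xi \to 1$ as $n \to \infty$), as are the names `flatten`, `xi_schedule`, `xi0`, and `n0`.

```python
def flatten(probs, xi):
    """Return p^xi / K(xi) for a discrete distribution, with K(xi) = sum of p^xi."""
    powered = [p ** xi for p in probs]
    k = sum(powered)                  # normalizing constant K(xi)
    return [v / k for v in powered]

def xi_schedule(n, xi0=0.5, n0=100.0):
    """Hypothetical schedule: equals xi0 at n = 0 and rises toward 1 as n grows."""
    return 1.0 - (1.0 - xi0) / (1.0 + n / n0)
```

With $\xi = 0.5$, the distribution `[0.9, 0.1]` flattens to roughly `[0.75, 0.25]`, so the sampler moves between clusters more freely early on and approaches the true conditional as $\xi$ nears 1.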

SLIDE 38

Results
