Nonparametric Deconvolution Models. Allison J.B. Chaney, Princeton University. PowerPoint presentation.



SLIDE 1

Nonparametric Deconvolution Models

Allison J.B. Chaney Princeton University 


In collaboration with Barbara Engelhardt, Archit Verma and Young-Suk Lee

SLIDE 2

Objective

Model collections of convolved data points

SLIDE 5

Objective

Model collections of convolved data points

1 2 2 2 3 3 1 1 3 2 1 2 2 3 2

SLIDE 6

Objective

Model collections of convolved data points

General       Voting                Bulk RNA-seq            Images
observation   district vote tally   sample                  image
feature       issue or candidate    gene expression level   pixel
particle      individual voter      one cell                light particle
factor        voting cohort         cell type               visual pattern

SLIDE 7

“convolution”

individual particles observed together

1 2 2 3 2

(distinct from “convolution” as used in signal processing and convolutional neural nets)

SLIDE 9

Related Models

  • Mixture models assign each observation to one of K clusters, or factors.

SLIDE 10

Related Models

  • Admixture models represent groups of observations, each with its own mixture of K shared factors.
SLIDE 11

Related Models

  • Decomposition models decompose observations into constituent parts by representing observations as a product between group representations and factor features.

SLIDE 12

Our Model

  • Deconvolution models (this work) similarly decompose, or deconvolve, observations into constituent parts, but also capture group-specific (or local) fluctuations in factor features.

SLIDE 13

How do people usually vote?

1 2 2 2 3 3 1 1 3 2 1 2 2 3 2

SLIDE 14

1 2 2 2 3 3 1 1 3 2 1 2 2 3 2

  • How do people vote in district A?

(figure: voters grouped into districts A, B, C, D, E)

SLIDE 15

Our Model

SLIDE 16

Our Model

  • global factors

SLIDE 17

Our Model

  • local factors (observation-specific)

SLIDE 19

Our Model

HDP

SLIDE 22

Our Model

HDP (Paisley, 2012)

\beta'_k \mid \alpha_0 \sim \mathrm{Beta}(1, \alpha_0)

\beta_k = \beta'_k \prod_{\ell=1}^{k-1} (1 - \beta'_\ell)

\pi'_{n,k} \mid \alpha, \beta_k \sim \mathrm{Gamma}(\alpha \beta_k, 1)

\pi_{n,k} = \pi'_{n,k} \Big/ \textstyle\sum_{\ell=1}^{\infty} \pi'_{n,\ell}
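In code, this two-level construction reads as follows; this is a truncated numerical sketch (the truncation level K and the hyperparameter values are illustrative assumptions, not taken from the talk):

```python
import numpy as np

def stick_breaking(alpha0, K, rng):
    """Truncated stick-breaking: beta_k = beta'_k * prod_{l<k} (1 - beta'_l)."""
    beta_prime = rng.beta(1.0, alpha0, size=K)
    # product over l < k: the cumulative product shifted by one position
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - beta_prime)[:-1]))
    return beta_prime * remaining

def local_proportions(alpha, beta, rng):
    """pi'_{n,k} ~ Gamma(alpha * beta_k, 1), normalized across factors."""
    pi_prime = rng.gamma(alpha * beta, 1.0)
    return pi_prime / pi_prime.sum()

rng = np.random.default_rng(0)
beta = stick_breaking(alpha0=2.0, K=10, rng=rng)          # global factor weights
pi_n = local_proportions(alpha=5.0, beta=beta, rng=rng)   # one observation's proportions
```

Truncation leaves some stick mass unassigned, so the global weights sum to less than one, while each observation's normalized proportions sum to exactly one.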

SLIDE 26

Our Model

\mu_{k,m} \mid \mu_0, \sigma_0 \sim \mathcal{N}(\mu_0, \sigma_0)

\Sigma_k \mid \nu, \Psi \sim \mathcal{W}^{-1}(\Psi, \nu)

\bar{x}_{n,k} \mid \pi_{n,k}, \mu_k, \Sigma_k \sim \mathcal{N}_M\left(\mu_k, \; \Sigma_k / (P_n \pi_{n,k})\right)

P_n \mid \rho \sim \mathrm{Poisson}(\rho)

SLIDE 27

Our Model

y_{n,m} \mid \bar{x}_n, \pi_n \sim f\left( g\left( \textstyle\sum_{k=1}^{\infty} \pi_{n,k} \, \bar{x}_{n,k,m} \right) \right)
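Putting the pieces together, one observation can be simulated roughly as below. The identity link g and additive Gaussian noise f are placeholder assumptions (the talk leaves f and g generic), and the hyperparameter values are illustrative:

```python
import numpy as np

def draw_observation(beta, mu, Sigma, alpha, rho, rng):
    """Simulate one convolved observation from K global factors.

    beta:  (K,) global factor weights
    mu:    (K, M) global factor means
    Sigma: (K, M, M) global factor covariances
    """
    K, M = mu.shape
    # observation-level factor proportions (normalized gammas)
    pi_prime = rng.gamma(alpha * beta, 1.0)
    pi = pi_prime / pi_prime.sum()
    # number of particles in this observation
    P = max(int(rng.poisson(rho)), 1)
    # local factors: observation-specific copies of the global factors,
    # tighter around mu_k when more particles are assigned to factor k
    x_bar = np.stack([
        rng.multivariate_normal(mu[k], Sigma[k] / (P * pi[k]))
        for k in range(K)
    ])
    # convolve: proportion-weighted sum of local factors, then noise
    # (identity link g and Gaussian noise f are illustrative placeholders)
    return pi @ x_bar + 0.1 * rng.standard_normal(M)

rng = np.random.default_rng(1)
K, M = 3, 2
mu = rng.normal(size=(K, M))
Sigma = np.stack([0.5 * np.eye(M)] * K)
y = draw_observation(np.full(K, 1.0 / K), mu, Sigma, alpha=10.0, rho=20, rng=rng)
```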

SLIDE 28

Inference

SLIDE 29

Inference

(figure: latent quantities marked with question marks)

SLIDE 30

Variational Inference

intractable posterior p

SLIDE 32

Variational Inference

intractable posterior p

easy to compute approximation q

black box variational inference (Ranganath, 2014)

split-merge procedure (Bryant, 2012) to learn K
SLIDE 38

BBVI overview

Ranganath et al., 2014

\nabla_{\lambda[z]} \mathcal{L} = \mathbb{E}_q \left[ \nabla_{\lambda[z]} \log q(z \mid \lambda[z]) \, \left( \log p_z(y, z, \ldots) - \log q(z \mid \lambda[z]) \right) \right]

We want to estimate z, which has corresponding variational parameter \lambda[z]; \lambda is the set of all variational parameters. The left-hand side is the gradient of the ELBO.

\tilde{\nabla}_{\lambda[z]} \mathcal{L} = \frac{1}{S} \sum_{s=1}^{S} \left[ \nabla_{\lambda[z]} \log q(z^{[s]} \mid \lambda[z]) \, \left( \log p_z(y, z^{[s]}, \ldots) - \log q(z^{[s]} \mid \lambda[z]) \right) \right], \quad z^{[s]} \sim q(z \mid \lambda[z])

This averages over S samples from the variational distribution. If we can approximate this gradient, we can use standard stochastic gradient ascent to update \lambda[z].
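A minimal sketch of this score-function estimator on a toy model (one Gaussian latent z with a fixed-variance Gaussian q; the toy model and all values are illustrative stand-ins, not the deconvolution model):

```python
import numpy as np

def log_p(y, z):
    """Toy joint density, up to constants: z ~ N(0,1), y | z ~ N(z,1)."""
    return -0.5 * z**2 - 0.5 * (y - z)**2

def log_q(z, lam):
    """Variational family q(z | lam) = N(lam, 1), up to constants."""
    return -0.5 * (z - lam)**2

def bbvi_gradient(y, lam, S, rng):
    """Score-function estimate of the ELBO gradient w.r.t. lam,
    averaged over S samples z[s] ~ q(z | lam)."""
    z = rng.normal(lam, 1.0, size=S)
    score = z - lam  # grad_lam log q(z | lam) for a unit-variance Gaussian
    return np.mean(score * (log_p(y, z) - log_q(z, lam)))

# stochastic gradient ascent on the ELBO
rng = np.random.default_rng(0)
lam = 0.0
for _ in range(2000):
    lam += 0.05 * bbvi_gradient(y=4.0, lam=lam, S=64, rng=rng)
# for this conjugate toy model the exact posterior mean is y/2 = 2.0,
# so lam should settle near 2
```

Note that the estimator only needs samples from q and evaluations of log p and log q, which is what makes the method "black box": no model-specific gradients are required.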

SLIDE 39

split/merge overview

Bryant and Sudderth, 2012

initialize with fixed K
iterate until batch convergence:
  consider splitting each factor
  consider merging some factors
full convergence

SLIDE 42

For each candidate: initialize variational parameters, update variational parameters (one iteration), accept / reject based on ELBO.

SLIDE 43

Split updates (factor k into k′ and k′′):

\lambda^S[\beta_{k'}] = \rho_t \lambda[\beta_k], \quad \lambda^S[\beta_{k''}] = (1 - \rho_t) \lambda[\beta_k]

SLIDE 44

\lambda^S[\pi_{n,k'}] = \rho_t \lambda[\pi_{n,k}], \quad \lambda^S[\pi_{n,k''}] = (1 - \rho_t) \lambda[\pi_{n,k}]

SLIDE 45

\lambda^S[\mu_{k'}] = \lambda[\mu_k], \quad \lambda^S[\mu_{k''}] = \lambda[\mu_k] + \epsilon

SLIDE 46

\lambda^S[\Sigma_{k'}] = \lambda[\Sigma_k], \quad \lambda^S[\Sigma_{k''}] = \lambda[\Sigma_k]

SLIDE 47

\lambda^S[\bar{x}_{n,k'}] = \lambda[\bar{x}_{n,k}], \quad \lambda^S[\bar{x}_{n,k''}] = \lambda[\bar{x}_{n,k}]

SLIDE 48

Merge updates (factors k′ and k′′ into k):

\lambda^M[\beta_k] = \lambda[\beta_{k'}] + \lambda[\beta_{k''}]

\lambda^M[\pi_{n,k}] = \lambda[\pi_{n,k'}] + \lambda[\pi_{n,k''}]

\lambda^M[\mu_k] = \frac{\lambda[\beta_{k'}] \lambda[\mu_{k'}] + \lambda[\beta_{k''}] \lambda[\mu_{k''}]}{\lambda[\beta_{k'}] + \lambda[\beta_{k''}]}
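The split and merge updates can be sketched with one dict per factor. This follows the slides' rules for β and μ (the per-observation π updates have the same ρ_t form and are omitted); carrying the first factor's Σ through a merge is an assumption, since the slides give merge rules only for β, π, and μ:

```python
import numpy as np

def split_factor(lam_k, rho_t, rng, eps_scale=0.01):
    """Split one factor into k' and k'': divide the stick weight by rho_t,
    copy the mean twice (one copy perturbed by eps), copy the covariance."""
    eps = eps_scale * rng.standard_normal(lam_k["mu"].shape)
    k1 = {"beta": rho_t * lam_k["beta"],
          "mu": lam_k["mu"].copy(),
          "Sigma": lam_k["Sigma"].copy()}
    k2 = {"beta": (1.0 - rho_t) * lam_k["beta"],
          "mu": lam_k["mu"] + eps,
          "Sigma": lam_k["Sigma"].copy()}
    return k1, k2

def merge_factors(f1, f2):
    """Merge k' and k'': add stick weights, beta-weighted average of means.
    Reusing f1's covariance is a simplifying assumption."""
    beta = f1["beta"] + f2["beta"]
    mu = (f1["beta"] * f1["mu"] + f2["beta"] * f2["mu"]) / beta
    return {"beta": beta, "mu": mu, "Sigma": f1["Sigma"]}

rng = np.random.default_rng(0)
factor = {"beta": 0.6, "mu": np.array([1.0, 2.0]), "Sigma": np.eye(2)}
k1, k2 = split_factor(factor, rho_t=0.5, rng=rng)
merged = merge_factors(k1, k2)  # splitting then merging approximately round-trips
```

The round-trip property (merge of a fresh split recovers the original weight, and the mean up to the small ε perturbation) is what lets each proposal be cheaply accepted or rejected on the ELBO.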

SLIDE 49

Algorithm Pseudocode

set K to an initial value
initialize variational parameters
repeat until convergence:
  repeat until batch convergence:
    update variational parameters for x̄, π, P, β using BBVI
    update variational parameters for μ, Σ using analytic updates
  split/merge latent factors, defining a new K and updating variational parameters accordingly

SLIDE 57

Results on Simulated Data

SLIDE 58

Results on Simulated Data

SLIDE 59

2016 Election in California

https://github.com/datadesk/california-2016-election-precinct-maps

43.9% registered Democrats
28.9% registered Republicans
27.2% other parties / unregistered

caveat: these are very preliminary results

SLIDE 61

Prop 63: Background Checks for Ammunition Purchases and Large-Capacity Ammunition Magazine Ban

SLIDE 62

Prop 58: Non-English Languages Allowed in Public Education

SLIDE 64

(figure panels: gun control; non-English in schools)


SLIDE 66

Thank you!

This research was supported by an appointment to the Intelligence Community Postdoctoral Research Fellowship Program at Princeton University, administered by Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the Office of the Director of National Intelligence.