Nonparametric Bayesian Models
-- Learning and Reasoning in Open Possible Worlds --

Eric Xing (epxing@cs.cmu.edu)
Machine Learning Dept. / Language Technology Inst. / Computer Science Dept.
Carnegie Mellon University

VLPR09, Beijing, China, 8/6/2009

Outline

Motivation and challenge

Dirichlet Process and Infinite Mixture
  • Formulation
  • Approximate inference algorithms
  • Example: population clustering

Hierarchical Dirichlet Process and Multi-Task Clustering
  • Formulation
  • Transformed DP and HDP
  • Kernel stick-breaking process
  • Application: joint image segmentation

Dynamic Dirichlet Process
  • Hidden Markov DP
  • Temporal DPM
  • Application: evolutionary clustering of documents

Summary


Clustering


Image Segmentation

How to segment images?
  • Manual segmentation (very expensive)
  • Algorithmic segmentation: K-means, statistical mixture models, spectral clustering

Problems with most existing algorithms:
  • They ignore spatial information
  • They perform the segmentation one image at a time
  • They need the number of segments specified a priori


Discover Object Categories

Discover what objects are present in a collection of images, in an unsupervised way
Find those same objects in novel images
Determine which local image features correspond to which objects, i.e., segment the image


Learn and Recognize Natural Scene Categories


Object Recognition and Tracking

[Figure: objects tracked across frames t=1, 2, 3, each with an evolving attribute vector, e.g. (1.8, 7.4, 2.3) → (1.9, 9.0, 2.1) → (1.9, 6.1, 2.2) and (0.9, 5.8, 3.1) → (0.7, 5.1, 3.2) → (0.6, 5.9, 3.2)]

The Evolution of Science

[Figure: PNAS papers, research topics, and research circles (CS, Bio, Phy) evolving from 1900 to 2000, and beyond?]


A Classical Approach

Clustering as mixture modeling, then "model selection"


Partially Observed, Open and Evolving Possible Worlds

  • Unbounded # of objects/trajectories
  • Changing attributes
  • Birth/death, merge/split
  • Relational ambiguity

The parametric paradigm is finite and structurally unambiguous: a set of entities $\Xi^*_t$ evolves to $\Xi^*_{t+1}$ under a motion model $p\big(\{\phi_k^{t+1}\} \mid \{\phi_k^t\}\big)$; the entity space maps to the observation space through a sensor model $p\big(x \mid \{\phi_k\}\big)$; an event model $p\big(\{\phi_k\}\big)$ initializes the entities, giving an overall $p\big(\{\phi_k^{1:T}\}\big)$.

How to open it up?


Model Selection vs. Posterior Inference

Model selection, with $M \equiv \{\theta, K\}$:
  • "intelligent" guess: ???
  • cross validation: data-hungry
  • information theoretic (AIC, TIC, MDL): $\arg\min_K \mathrm{KL}\big(f(\cdot) \,\big\|\, g(\cdot \mid \hat{\theta}_{ML}, K)\big)$
  • Bayes factor: $p(M \mid D) \propto p(D \mid M)\, p(M)$; need to compute the data likelihood

Parsimony: Occam's Razor.

Posterior inference:
  • we want to handle uncertainty of model complexity explicitly
  • we favor a distribution that does not constrain M in a "closed" space!


Two "Recent" Developments

First order probabilistic languages (FOPLs)

Examples: PRM, BLOG … Lift graphical models to "open" world (#rv, relation, index, lifespan …) Focus on complete, consistent, and operating rules to instantiate possible

worlds, and formal language of expressing such rules

Operational way of defining distributions over possible worlds, via sampling

methods

Bayesian Nonparametrics

Examples: Dirichlet processes, stick-breaking processes … From finite, to infinite mixture, to more complex constructions (hierarchies,

spatial/temporal sequences, …)

Focus on the laws and behaviors of both the generative formalisms and

resulting distributions

Often offer explicit expression of distributions, and expose the structure of

the distributions --- motivate various approximate schemes


Outline

Motivation and challenge

Dirichlet Process and Infinite Mixture
  • Formulation
  • Approximate inference algorithms
  • Example: population clustering

Hierarchical Dirichlet Process and Multi-Task Clustering
  • Formulation
  • Transformed DP and HDP
  • Kernel stick-breaking process
  • Application: joint image segmentation

Dynamic Dirichlet Process
  • Hidden Markov DP
  • Temporal DPM
  • Application: evolutionary clustering of documents

Summary

Clustering

How to label them? How many clusters???


Random Partition of Probability Space

[Figure: the probability space partitioned into regions, region k carrying an atom-mass pair $\{\phi_k, \pi_k\}$, k = 1, …, 6]

centroid := φ; image element := (x, θ); (event, p_event)

Stick-breaking Process

$\beta_k \sim \mathrm{Beta}(1, \alpha), \qquad \pi_k = \beta_k \prod_{j=1}^{k-1} (1 - \beta_j), \qquad G(\theta) = \sum_{k=1}^{\infty} \pi_k\, \delta_{\theta_k}(\theta), \qquad \theta_k \sim G_0$

Each atom has a location $\theta_k$ (drawn from the base measure $G_0$) and a mass $\pi_k$.

[Figure: breaking a unit stick: take 0.4, leaving 0.6; break off 0.5 of the remainder, giving mass 0.3 and leaving 0.3; break off 0.8 of that, giving 0.24; …]
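To make the construction concrete, here is a minimal numerical sketch (not from the slides): a truncated stick-breaking sampler, with the truncation level and the N(0, 1) base measure chosen purely for illustration.

```python
import numpy as np

def stick_breaking(alpha, G0_sampler, truncation=100, rng=None):
    """Draw a truncated sample G ~ DP(alpha, G0) via stick-breaking.

    Returns atom locations theta_k and masses pi_k.
    """
    rng = np.random.default_rng(rng)
    betas = rng.beta(1.0, alpha, size=truncation)            # beta_k ~ Beta(1, alpha)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    pis = betas * remaining                                  # pi_k = beta_k * prod_{j<k}(1 - beta_j)
    thetas = np.array([G0_sampler(rng) for _ in range(truncation)])
    return thetas, pis

# Example: base measure G0 = N(0, 1); a small alpha concentrates mass on few atoms
thetas, pis = stick_breaking(alpha=1.0, G0_sampler=lambda r: r.normal(0.0, 1.0), rng=0)
print(pis[:5], pis.sum())   # first few masses; the total approaches 1 as truncation grows
```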


DP – a Pólya urn Process

  • Self-reinforcing property
  • Exchangeable partition of samples

$G \sim \mathrm{DP}(\alpha, G_0)$

Marginally, integrating out G (the Pólya urn scheme):

$\phi_i \mid \phi_1, \ldots, \phi_{i-1} \;\sim\; \sum_{k=1}^{K} \frac{n_k}{i - 1 + \alpha}\, \delta_{\phi_k^*} \;+\; \frac{\alpha}{i - 1 + \alpha}\, G_0$

[Figure: an urn holding 5 balls, 2 of one color and 3 of another; the next draw repeats the colors with probabilities $\frac{2}{5+\alpha}$ and $\frac{3}{5+\alpha}$, and introduces a new color with probability $\frac{\alpha}{5+\alpha}$]

Clustering and DP Mixture

We can associate mixture components with colors in the Pólya urn model and thereby define a clustering of the data.

[Figure: six data points assigned to colors, with next-draw probabilities $\frac{2}{5+\alpha}$, $\frac{3}{5+\alpha}$, and $\frac{\alpha}{5+\alpha}$ for a new color]


Chinese Restaurant Process

$P(c_i = k \mid c_1, \ldots, c_{i-1}) = \begin{cases} \dfrac{m_k}{i - 1 + \alpha} & \text{for an occupied table } k \text{ with } m_k \text{ customers} \\[6pt] \dfrac{\alpha}{i - 1 + \alpha} & \text{for a new table} \end{cases}$

[Figure: successive seatings, each table k serving a dish $\theta_k$: the second customer joins table 1 with probability $\frac{1}{1+\alpha}$ or opens a new table with probability $\frac{\alpha}{1+\alpha}$; the third chooses with probabilities $\frac{1}{2+\alpha}, \frac{1}{2+\alpha}, \frac{\alpha}{2+\alpha}$; the fourth with $\frac{1}{3+\alpha}, \frac{2}{3+\alpha}, \frac{\alpha}{3+\alpha}$; …]
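A small simulation of this seating scheme (a standard sketch; the choices n = 10 and α = 1 are arbitrary):

```python
import numpy as np

def crp_assignments(n, alpha, rng=None):
    """Seat n customers by the Chinese restaurant process; returns table labels."""
    rng = np.random.default_rng(rng)
    labels, counts = [], []                      # counts[k] = m_k, customers at table k
    for i in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= i + alpha                       # existing: m_k/(i+alpha); new: alpha/(i+alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)                     # open a new table
        else:
            counts[k] += 1
        labels.append(k)
    return labels

print(crp_assignments(10, alpha=1.0, rng=0))
```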


Dirichlet Process

A CDF G on a space of random partitions follows a Dirichlet process if, for any measurable finite partition (φ1, φ2, …, φm):

(G(φ1), G(φ2), …, G(φm)) ~ Dirichlet(αG0(φ1), …, αG0(φm))

where G0 is the base measure and α is the scale parameter.

[Figure: the space partitioned into cells φ1, …, φ6]

Thus a Dirichlet process G defines a distribution over distributions: a draw from a DP is itself a distribution.
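As a sanity check on this definition, one can verify numerically that the expected cell masses of G match the base measure on any finite partition (the Dirichlet marginals above have mean $G_0(\phi_k)$). This sketch reuses the stick_breaking sampler from earlier, checks only the mean, and uses an illustrative 3-cell partition of the real line:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 2.0
cells = [(-np.inf, -1.0), (-1.0, 1.0), (1.0, np.inf)]   # a 3-cell partition of R
masses = []
for _ in range(500):
    thetas, pis = stick_breaking(alpha, lambda r: r.normal(), truncation=500, rng=rng)
    masses.append([pis[(thetas > lo) & (thetas <= hi)].sum() for lo, hi in cells])
emp = np.mean(masses, axis=0)
G0 = np.diff([0.0, 0.1587, 0.8413, 1.0])                # N(0,1) cell probabilities
print(emp, G0)   # E[G(cell)] should be close to G0(cell)
```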


Graphical Model Representations of DP

[Graphical models: the Pólya urn construction ($G_0, \alpha \to G$; $\theta_i \sim G$; $x_i$, i = 1…N) and the stick-breaking construction ($\alpha \to \pi$; $y_i \sim \pi$; $\theta_k \sim G_0$; $x_i$, i = 1…N)]

Example: DP-haplotyper [Xing et al., 2004]

Clustering human populations.

Inference: Markov chain Monte Carlo (MCMC)
  • Gibbs sampling
  • Metropolis-Hastings

[Graphical model: a DP prior (G0, α → G) over infinite mixture components A_k, the population haplotypes; a likelihood model (θ) for individual haplotypes H_n1, H_n2 and genotypes G_n, n = 1…N]


Inheritance and Observation Models

Ancestral pool → haplotypes → genotype: $A_{C_i^e} \to H_i^e$ and $\{H_i^1, H_i^2\} \to G_i$.

Single-locus mutation model:

$P_H(h_t \mid a_t) = \begin{cases} \theta & \text{for } h_t = a_t \\[4pt] \dfrac{1 - \theta}{|B| - 1} & \text{for } h_t \neq a_t \end{cases}$

Noisy observation model:

$P_G(g_t \mid h_t^1, h_t^2): \quad g_t = h_t^1 \oplus h_t^2 \;\text{ with probability } \lambda.$

MCMC for Haplotype Inference

Gibbs sampling for exploring the posterior distribution under the proposed model:
  • Integrate out the parameters such as θ or λ, and sample $c_i^e$, $a_k$, and $h_i^e$.
  • Gibbs sampling algorithm: draw samples of each random variable in turn, given the values of all the remaining variables.

$p\big(c_i^e = k \mid \mathbf{c}_{[-ie]}, \mathbf{h}, \mathbf{a}\big) \;\propto\; p\big(c_i^e = k \mid \mathbf{c}_{[-ie]}\big)\; p\big(h_i^e \mid a_k, c_i^e = k\big)$

Posterior ∝ Prior (Pólya urn) × Likelihood


MCMC for Haplotype Inference

1. Sample $c_i^{e(j)}$ from its conditional (the Pólya-urn prior × likelihood above)
2. Sample $a_k$ from its conditional
3. Sample $h_i^{e(j)}$ from its conditional

  • For the DP scale parameter α: a vague inverse-Gamma prior
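A schematic of step 1 under simplifying assumptions (ancestor haplotypes treated as known, and the new-cluster term approximated with a single candidate ancestor drawn from a uniform base measure); the actual sampler in Xing et al. (2004) integrates over more quantities, so treat this only as an illustration of "Pólya-urn prior × mutation likelihood":

```python
import numpy as np

def sample_ancestor_indicator(h_i, ancestors, counts, alpha, theta, B, rng):
    """One collapsed-Gibbs step for a haplotype's ancestor indicator c_i (a sketch)."""
    def mutation_lik(h, a):
        # single-locus mutation model: theta on a match, (1-theta)/(|B|-1) otherwise
        return np.where(h == a, theta, (1.0 - theta) / (B - 1)).prod()

    K = len(ancestors)
    w = np.empty(K + 1)
    for k in range(K):
        w[k] = counts[k] * mutation_lik(h_i, ancestors[k])  # urn prior x likelihood
    a_new = rng.integers(0, B, size=h_i.size)               # candidate new ancestor
    w[K] = alpha * mutation_lik(h_i, a_new)
    k = rng.choice(K + 1, p=w / w.sum())
    return k, (a_new if k == K else ancestors[k])
```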

Convergence of Ancestral Inference


DP vs. Finite Mixture via EM

[Figure: individual error on five data sets, roughly in the range 0.05-0.45, comparing the DP infinite mixture with a finite mixture fit by EM]

Variational Inference [Blei & Jordan 2005; Kurihara et al. 2007]

The Gibbs sampling solution is not efficient enough to scale up to large problems. A truncated stick-breaking approximation can be formulated in the space of explicit, non-exchangeable cluster labels, and variational inference can then be applied to the resulting finite-dimensional distribution.

Variational inference: for a complicated P(X1, X2, …, Xn), approximate it with a Q(X):
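The formula on this slide did not survive extraction; a standard statement of the factorized (mean-field) approximation it refers to is:

$$Q(X) = \prod_{i} q_i(X_i), \qquad Q^{*} = \arg\min_{Q} \mathrm{KL}\big(Q(X) \,\big\|\, P(X_1, \ldots, X_n)\big)$$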


Approximations to DP

Truncated stick-breaking (TSB) representation: the joint distribution can be expressed with the stick-breaking construction truncated at a finite level K.

Finite symmetric Dirichlet (FSD) approximation: the joint distribution can be expressed with a K-dimensional symmetric Dirichlet prior on the mixing weights.
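The two joint distributions referred to here are standard (cf. Kurihara et al., 2007); written out for data x, labels z, and truncation level K:

$$p_{\mathrm{TSB}}(x, z, v, \theta) = \prod_{n} p(x_n \mid \theta_{z_n})\, \pi_{z_n}(v) \prod_{k=1}^{K-1} \mathrm{Beta}(v_k \mid 1, \alpha) \prod_{k=1}^{K} G_0(\theta_k), \qquad \pi_k(v) = v_k \prod_{l<k}(1 - v_l),\; v_K = 1$$

$$p_{\mathrm{FSD}}(x, z, \pi, \theta) = \prod_{n} p(x_n \mid \theta_{z_n})\, \pi_{z_n}\; \mathrm{Dir}\big(\pi \mid \tfrac{\alpha}{K}, \ldots, \tfrac{\alpha}{K}\big) \prod_{k=1}^{K} G_0(\theta_k)$$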

TDP vs. TSB

TDP is size-biased: cluster labels are NOT interchangeable under TDP, but are interchangeable under TSB.


Marginalization

In the variational Bayesian approximation we assume a factorized form for the posterior distribution. However, this is not a good assumption, since changes in π have a considerable impact on z.

If we can integrate out π, the joint distribution is given in collapsed form, once for the TSB representation and once for the FSD representation.

VB inference

We can then apply VB inference to each of the four approximations. The approximate posterior distributions for TSB and FSD take the factorized form above; depending on whether we marginalize, v and π may be integrated out.


Experimental results


Outline

Motivation and challenge

Dirichlet Process and Infinite Mixture
  • Formulation
  • Approximate inference algorithms
  • Example: population clustering

Hierarchical Dirichlet Process and Multi-Task Clustering
  • Formulation
  • Transformed DP and HDP
  • Kernel stick-breaking process
  • Application: joint image segmentation

Dynamic Dirichlet Process
  • Hidden Markov DP
  • Temporal DPM
  • Application: evolutionary clustering of documents

Summary

Solving Multiple Clustering Problems

[Figure: one draw $G \sim \mathrm{DP}(\alpha, G_0)$ per corpus of Nature articles, PNAS articles, Science articles, …, with question marks over how the draws relate]

Solving Multiple Clustering Problems

Solve separately (one $G \sim \mathrm{DP}(\alpha, G_0)$ per corpus):
  • Fails to capture correlation
  • Fails to cross-reinforce shared information (i.e., topic-specific lexicon)
  • Data fragmentation

Solve together (a single $G \sim \mathrm{DP}(\alpha, G_0)$ shared by the Nature, PNAS, Science, … articles):
  • Then what is the difference between all these journals?

Hierarchical Dirichlet Process [Teh et al., 2005; Xing et al., 2005]

Two-level Pólya urn scheme. At the i-th step in the j-th "group":

  • Choose an existing local dish $\theta_{jk}$ with prob. $\dfrac{m_{jk}}{\sum_k m_{jk} + \alpha}$, or go to the upper-level DP (the "oracle") with prob. $\dfrac{\alpha}{\sum_k m_{jk} + \alpha}$.
  • At the oracle: choose an existing global dish $\theta_k$ with prob. $\dfrac{n_k}{\sum_k n_k + \gamma}$, or draw a new sample with prob. $\dfrac{\gamma}{\sum_k n_k + \gamma}$.

[Figure: several group-level urns over dishes θ1, θ2, …, all backed by a single oracle urn]

  • Draws from the stock (upper-level) urn define a Dirichlet process DP(γ, H):

$\theta_i \mid \theta_{1:i-1} \;\sim\; \sum_{k=1}^{K} \frac{n_k}{i - 1 + \gamma}\, \delta_{\phi_k^*}(\theta_i) \;+\; \frac{\gamma}{i - 1 + \gamma}\, H(\theta_i)$

  • Conditioning on DP(γ, H), the m-th draw from the j-th bottom-level urn also forms a Dirichlet measure:

$\theta_{j,m} \mid \theta_{j,1:m-1} \;\sim\; \sum_{k} \frac{m_{jk}}{m - 1 + \alpha}\, \delta_{\theta_{jk}^*}(\theta_{j,m}) \;+\; \frac{\alpha}{m - 1 + \alpha} \left( \sum_{k} \frac{n_k}{n + \gamma}\, \delta_{\phi_k^*}(\theta_{j,m}) + \frac{\gamma}{n + \gamma}\, H(\theta_{j,m}) \right)$

[Teh et al., 2005; Xing et al., 2005]
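A compact simulation of this two-level scheme, read as the usual Chinese restaurant franchise (a sketch: dish identities are bare integers standing in for draws from H, and α, γ, and the group sizes are arbitrary):

```python
import numpy as np

def hdp_urn(group_sizes, alpha, gamma, rng=None):
    """Two-level Polya urn: returns, per group, the global dish of every draw."""
    rng = np.random.default_rng(rng)
    n = []                                     # n[k]: times global dish k served by oracle
    assignments = []
    for size in group_sizes:
        m, dish_of_table, labels = [], [], []  # m[t]: customers at local table t
        for i in range(size):
            probs = np.array(m + [alpha], dtype=float) / (i + alpha)
            t = rng.choice(len(probs), p=probs)
            if t == len(m):                    # go to the oracle
                op = np.array(n + [gamma], dtype=float) / (sum(n) + gamma)
                k = rng.choice(len(op), p=op)
                if k == len(n):
                    n.append(1)                # brand-new global dish
                else:
                    n[k] += 1
                m.append(1)
                dish_of_table.append(k)
            else:
                m[t] += 1
            labels.append(dish_of_table[t])
        assignments.append(labels)
    return assignments

print(hdp_urn([20, 20, 20], alpha=1.0, gamma=1.0, rng=0))
```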


Recall: Graphical Model Representations of DP

[Graphical models, as before: the Pólya urn construction and the stick-breaking construction]

Hierarchical DP Mixture

[Graphical models: urn view, $H, \gamma \to G_0$, $\alpha \to G_j$, $\theta_i \sim G_j$, $x_i$, over J groups; stick-breaking view, $\gamma \to \beta$, $\alpha \to \pi_j$, $\theta_k \sim H$, $y_i \sim \pi_j$, $x_i$]

$\beta \sim \mathrm{Stick}(\gamma), \qquad G_0 = \sum_{k=1}^{\infty} \beta_k\, \delta_{\theta_k}, \qquad \theta_k \sim H$

$\pi_j \sim \mathrm{Stick}(\alpha, \beta), \qquad G_j = \sum_{k=1}^{\infty} \pi_{jk}\, \delta_{\theta_k}$

where $\mathrm{Stick}(\alpha, \beta)$ means: $\;\pi'_{jk} \sim \mathrm{Beta}\Big(\alpha\beta_k,\; \alpha\big(1 - \textstyle\sum_{l=1}^{k}\beta_l\big)\Big), \qquad \pi_{jk} = \pi'_{jk} \prod_{l=1}^{k-1}\big(1 - \pi'_{jl}\big).$
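The Stick(α, β) line transcribes directly into code; a truncated sketch with an invented global weight vector β (the β values and α are illustrative only):

```python
import numpy as np

def group_weights(beta, alpha, rng=None):
    """Sample pi_j ~ Stick(alpha, beta): group-level reweighting of the shared
    global weights beta, truncated to len(beta) components."""
    rng = np.random.default_rng(rng)
    beta = np.asarray(beta, dtype=float)
    tail = 1.0 - np.cumsum(beta)                        # 1 - sum_{l<=k} beta_l
    v = rng.beta(alpha * beta[:-1], alpha * tail[:-1])  # pi'_jk ~ Beta(ab_k, a(1-sum))
    v = np.append(v, 1.0)                               # close the stick at truncation
    return v * np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])

beta = np.array([0.5, 0.3, 0.15, 0.05])                 # toy global weights
print(group_weights(beta, alpha=5.0, rng=0))            # group weights centered on beta
```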


Topic Models for Images

[Graphical model: Latent Dirichlet Allocation (LDA) adapted to images, e.g. a "beach" scene: per-image mixing proportions π, class c, topic indicators z, visual words w, over N words and D images]

Image Representation

Each image is a set of segments carrying a representation matrix $[\mathbf{r}_1, \ldots, \mathbf{r}_n]$ and an annotation vector $\mathbf{w} = [w_1, \ldots, w_{|V|}]$:

  • representation vectors $\mathbf{r}_d$: real-valued, one per image segment
  • annotation vector $\mathbf{w}$: binary, the same for each segment (e.g., cat, grass, tiger, water)

Infinite Topic Model for Image

  • A single image with k topics: an LDA, with a Dirichlet prior on the topic proportions π.
  • A single image with infinitely many topics (an "LDA ∞"): a DP, with a stick-breaking prior on π.
  • J images with infinitely many shared topics: an HDP, with β ~ Stick(γ) shared across images and a πj per image.

Problem with HDP

Every group (i.e., image) has exactly the same set of visual-vocabulary topics, albeit with different frequencies.

Transformed Dirichlet Process [Sudderth et al., 2005]

An extension of HDP in which the global mixture components undergo a set of random transformations ρjk before being reused in each group.

Synthetic Data Results

HDP uses a large set of global clusters to discretize the transformations underlying the data, and may generalize poorly when modeling visual scenes.


Analyzing Street Scenes


Kernel stick-breaking process [Dunson and Park, 2006]

For image analysis, we want to impose the belief that spatially proximate patches are more likely to be associated with the same cluster. We augment the stick-breaking representation of the DP with a kernel function that encodes this additional spatial prior.


KSBP for image analysis

Consider an image composed of N patches; the feature vectors $\{\mathbf{x}_n\}_{n=1}^{N}$ and the associated locations $\{\mathbf{r}_n\}_{n=1}^{N}$ can be modeled as follows:

$\mathbf{x}_n \overset{ind}{\sim} f(\phi_n), \qquad \phi_n \overset{ind}{\sim} G_{\mathbf{r}_n}$

$G_{\mathbf{r}} = \sum_{h=1}^{\infty} \pi_h(\mathbf{r};\, V_h, \Gamma_h, \psi_h)\, \delta_{\theta_h}$

$\pi_h(\mathbf{r};\, V_h, \Gamma_h, \psi_h) = V_h\, K(\mathbf{r}, \Gamma_h, \psi_h) \prod_{l=1}^{h-1} \big[\, 1 - V_l\, K(\mathbf{r}, \Gamma_l, \psi_l) \,\big]$

$V_h \overset{iid}{\sim} \mathrm{Beta}(a, b), \qquad \theta_h \overset{iid}{\sim} G_0, \qquad \Gamma_h \overset{iid}{\sim} H$
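A sketch of the location-dependent weights π_h(r): the slides leave the kernel K unspecified, so a Gaussian kernel exp(−ψ‖r − Γ‖²) is assumed here, and all parameter values are illustrative. Note that at a finite truncation the weights need not sum exactly to 1.

```python
import numpy as np

def ksbp_weights(r, V, Gamma, psi):
    """Location-dependent stick weights pi_h(r; V_h, Gamma_h, psi_h)."""
    r = np.asarray(r, dtype=float)
    K = np.exp(-psi * np.sum((Gamma - r) ** 2, axis=1))   # assumed Gaussian kernel
    VK = V * K
    # pi_h = V_h K(r, ...) * prod_{l<h} [1 - V_l K(r, ...)]
    return VK * np.concatenate([[1.0], np.cumprod(1.0 - VK)[:-1]])

rng = np.random.default_rng(0)
H_trunc = 50
V = rng.beta(1.0, 1.0, size=H_trunc)            # V_h ~ Beta(a, b), here a = b = 1
Gamma = rng.uniform(0, 1, size=(H_trunc, 2))    # atom locations, Gamma_h ~ Unif([0,1]^2)
psi = np.full(H_trunc, 40.0)                    # kernel bandwidths psi_h
print(ksbp_weights([0.5, 0.5], V, Gamma, psi))  # atoms near r receive larger sticks
```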

Multi-task Image Segmentation

Segmentation results [An et al., 2008]

Outline

Motivation and challenge

Dirichlet Process and Infinite Mixture
  • Formulation
  • Approximate inference algorithms
  • Example: population clustering

Hierarchical Dirichlet Process and Multi-Task Clustering
  • Formulation
  • Transformed DP and HDP
  • Kernel stick-breaking process
  • Application: joint image segmentation

Dynamic Dirichlet Process
  • Hidden Markov DP
  • Temporal DPM
  • Application: evolutionary clustering of documents

Summary

Object Recognition and Tracking

Each chain corresponds to the trajectory of a specific object.

[Graphical model: hidden state chains S1, S2, S3, …, SN for the scene and x1, x2, x3, …, xN with appearance models A; observed traces y_{1,1}, …, y_{1,N} for person 1 through y_{k,1}, …, y_{k,N} for person k; "the clipper" selects which person's chain generates the observations]

Hidden Markov Dirichlet Process (Xing and Sohn, Bayesian Analysis, 2007; Sohn and Xing, ISMB 2007)

Hidden Markov Dirichlet process mixtures:
  • Extension of the HMM to an infinite ancestral space
  • Infinite-dimensional transition matrix
  • Each row of the transition matrix is modeled with a DP:

$G \mid \gamma, H \sim \mathrm{DP}(\gamma, H), \qquad G_i \mid \alpha, G \sim \mathrm{DP}(\alpha, G)$

[Diagram: transitions $C_t \to C_{t+1}$ over an unbounded state set 1, 2, 3, …]
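Because G is discrete, each row $G_i \sim \mathrm{DP}(\alpha, G)$ is, conditioned on the weights β of G, simply a Dirichlet draw $\pi_i \sim \mathrm{Dirichlet}(\alpha\beta)$ over the shared atoms. A truncated sketch (all hyperparameter values illustrative):

```python
import numpy as np

def hmdp_transition_rows(n_states, alpha, gamma, rng=None):
    """Truncated sketch: shared top-level weights beta ~ Stick(gamma); each row
    pi_i ~ DP(alpha, beta) over a discrete base is Dirichlet(alpha * beta)."""
    rng = np.random.default_rng(rng)
    b = rng.beta(1.0, gamma, size=n_states)
    beta = b * np.concatenate([[1.0], np.cumprod(1.0 - b)[:-1]])
    beta /= beta.sum()                           # renormalize the truncated sticks
    rows = rng.dirichlet(alpha * beta, size=n_states)
    return beta, rows                            # rows[i]: transitions out of state i

beta, P = hmdp_transition_rows(20, alpha=5.0, gamma=2.0, rng=0)
print(P.shape, P.sum(axis=1)[:3])                # each row is a distribution over states
```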

HMDP as a Graphical Model

[Graphical model: β ~ Stick(γ) and infinite transition rows π (concentration α); hidden trajectory-segment indicators C1, C2, C3, …, CN (the hidden trajectories); observed traces y1, y2, y3, …, yN; emission parameters A, H]

Evolutionary Clustering

Adapts the number of mixture components over time:
  • Mixture components can die out
  • New mixture components can be born at any time
  • Retained mixture components' parameters evolve according to Markovian dynamics

[Figure: research papers and topics evolving from 1900 to 2000 across CS, Bio, and Phy]


The Chinese Restaurant Process

  • Customers correspond to data points
  • Tables correspond to clusters/mixture components
  • Dishes correspond to the parameters of the mixture components

[Figure: customers θ1, θ2, … seated at tables serving dishes φ1, φ2]

The Recurrent Chinese Restaurant Process

Temporal DPM [Ahmed and Xing 2008]

  • The restaurant operates in epochs
  • The restaurant is closed at the end of each epoch
  • The state of the restaurant at epoch t depends on its state at epoch t-1
  • Can be extended to higher-order dependencies


Generative Process

T=1: customers are seated as in an ordinary CRP:
  • Choose table j with probability ∝ N_{j,1}, and sample x_i ~ f(φ_{j,1})
  • Choose a new table K+1 with probability ∝ α, sample φ_{K+1,1} ~ G0, and sample x_i ~ f(φ_{K+1,1})

(φ_{3,1} denotes the dish eaten at table 3 in epoch 1, i.e., the parameters of cluster 3 at epoch 1.)

T=2: the restaurant reopens with the epoch-1 tables as a prior (N_{1,1}=2, N_{2,1}=3, N_{3,1}=1):
  • The first customer of epoch 2 chooses table k with probability N_{k,1}/(6+α), or a new table with probability α/(6+α).
  • When a table is reused, its dish evolves: sample φ_{1,2} ~ P(· | φ_{1,1}).
  • Subsequent customers count both the epoch-1 pseudo-counts and the epoch-2 seatings: e.g., after one customer has joined table 1, the probabilities become (2+1)/(6+1+α), 3/(6+1+α), 1/(6+1+α), and α/(6+1+α) for a new table. And so on.
  • At the end of epoch 2, a newly born cluster φ_{4,2} has appeared and cluster 3 has died out.

T=3: the process repeats, now with tables φ_{1,2}, φ_{2,2}, φ_{4,2} and counts N_{1,2}=2, N_{2,2}=2, N_{4,2}=1.
68 68

Temporal DPM

  • Can be extended to model higher-order dependencies
  • Can decay dependencies over time

The pseudo-count for table k at time t is

$\tilde{N}_{k,t} = \sum_{w=1}^{W} e^{-w/\lambda}\, N_{k,t-w}$

where W is the history size, λ the decay factor, and $N_{k,t-w}$ the number of customers sitting at table k in epoch t-w.


[Figure: at T=3, the pseudo-count for table 2 combines decayed counts from the two previous epochs, $\tilde{N}_{2,3} = e^{-1/\lambda} N_{2,2} + e^{-2/\lambda} N_{2,1}$]

TDPM Generative Power

  • W = T, λ = ∞: recovers the DPM
  • W = 0, any λ: independent DPMs at each epoch
  • In between (e.g., W = 4, λ = 0.4): the TDPM

[Figure: power-law curve of cluster sizes]


Experiments

Simulated data: chain dynamics modeled as a random walk with Gaussian emissions; 30 epochs simulated with 100 data points in each epoch.

Can the TDPM recover the ground-truth clustering?
  • Posterior inference run using Gibbs sampling [Ahmed and Xing 2008]
  • Compared against fixed-dimension dynamic models

TDPM Adaptability over Time


Results: NIPS 12

Building a simple dynamic topic model:
  • Chain dynamics as before
  • Emission model for document x_{k,t}: project φ_{k,t} onto the simplex and sample $x_{k,t} \mid c_{t,i} \sim \mathrm{Multinomial}\big(\cdot \mid \mathrm{Logistic}(\phi_{k,t})\big)$
  • Unlike LDA, here a document belongs to a single topic

We use this model to analyze the NIPS12 corpus: proceedings of the NIPS conference, 1987-1999.


The Big Picture

[Diagram: models arranged along two axes, model dimension and time: K-means / finite mixture (fixed K, static) → DPM (K = ∞, static); fixed-dimension dynamic clustering → TDPM (K = ∞, dynamic)]

Summary

A nonparametric Bayesian framework for pattern discovery:
  • Finite mixture models of latent patterns (e.g., image segments, objects)
  • Infinite mixtures of prototypes: an alternative to model selection
  • Hierarchical infinite mixtures
  • Infinite hidden Markov models
  • Temporal infinite mixture models

Applications in general data mining …

How to Model Semantics?

Q: What is it about?  A: Mainly MT, with syntax, some learning.

"A Hierarchical Phrase-Based Model for Statistical Machine Translation": We present a statistical phrase-based translation model that uses hierarchical phrases (phrases that contain sub-phrases). The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.

[Concept map: MT notions (source/target, alignment, BLEU score); syntax notions (parse tree, noun phrase, CFG, grammar); learning notions (likelihood, EM, argmax estimation of hidden parameters); topic models summarize a document by a mixing proportion (e.g., 0.6, 0.3, 0.1) over topics, each a unigram distribution over the vocabulary]

Admixture Models

  • Objects are bags of elements
  • Mixtures are distributions over elements
  • Each object has a mixing vector θ representing each mixture's contribution

An object is generated as follows (see the sketch below):
  • Pick a mixture component from θ
  • Pick an element from that component

[Figure: documents as bags of topic-tagged words (money1, bank1, loan1, river2, stream2, …), each document with its own mixing vector, e.g. (0.1, 0.1, 0.5, …), (0.1, 0.5, 0.1, …), (0.5, 0.1, 0.1, …)]
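A minimal sketch of this generative process on the slide's toy vocabulary; the two topic distributions are invented for illustration:

```python
import numpy as np

def generate_document(theta, topics, n_words, rng=None):
    """Generate one 'object' (document) from an admixture: for each word,
    pick a topic from theta, then a word from that topic's distribution."""
    rng = np.random.default_rng(rng)
    vocab = np.array(["money", "bank", "loan", "river", "stream"])
    words = []
    for _ in range(n_words):
        z = rng.choice(len(theta), p=theta)        # mixture component from theta
        w = rng.choice(len(vocab), p=topics[z])    # element from that component
        words.append(f"{vocab[w]}{z + 1}")         # tag each word with its topic, as on the slide
    return words

topics = np.array([[0.35, 0.35, 0.25, 0.03, 0.02],   # a "finance" topic (assumed numbers)
                   [0.02, 0.28, 0.05, 0.35, 0.30]])  # a "rivers" topic (assumed numbers)
print(generate_document(theta=[0.7, 0.3], topics=topics, n_words=12, rng=0))
```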

Topic Models = Admixture Models

Generating a document:
  • Draw θ from the prior
  • For each word n: draw $z_n \sim \mathrm{Multinomial}(\theta)$, then draw $w_n \sim \mathrm{Multinomial}(\beta_{z_n})$

Which prior to use?

[Graphical model: prior → θ → z → w, with topic parameters β; N_d words per document, D documents, K topics]

Variational Inference

Approximate the posterior $P(\gamma, z_{1:n} \mid \mathcal{D})$ with a factorized distribution

$q\big(\gamma, z_{1:n} \mid \mu^*, \Sigma^*, \phi^*_{1:n}\big) = q(\gamma \mid \mu^*, \Sigma^*) \prod_{n} q(z_n \mid \phi^*_n)$

Optimization problem: $\;(\mu^*, \Sigma^*, \phi^*_{1:n}) = \arg\min\, \mathrm{KL}(q \,\|\, p)$

  • Approximate the posterior
  • Approximate the integral
  • Solve for μ*, Σ*, φ1:n*