Prior-Driven Cluster Allocation in Bayesian Mixture Models Sally - PowerPoint PPT Presentation

Prior-Driven Cluster Allocation in Bayesian Mixture Models
Sally Paganin (sally.paganin@berkeley.edu)
JSM 2020, August 03, 2020

Amy Herring (Duke University), David Dunson (Duke University), Andrew Olshan (UNC at Chapel Hill)

Introduction
Clustering is one of the canonical data analysis goals in statistics.
• Distance-based methods: rely on a distance metric between data points
• Model-based clustering: relies on discrete mixture models
Bayesian perspective: allows prior information to be incorporated.

What if we have prior information on the clustering itself?

Motivating application: birth defects data
• Relate exposure factors to the risk of developing a defect
• Prior information is available (biology, experts' judgments)
→ We aim to provide methods that facilitate data-adaptive clustering, using both the information in the data and external knowledge.

National Birth Defects Prevention Study (http://www.nbdps.org/)
• Population-based case-control study
→ 300 controls / 100 cases per year since 1997
→ monthly number of controls ∝ number of births in the previous year
• Cases (37 major birth defects)
→ Birth defects surveillance system + clinical geneticist review
→ Cases with known etiology were excluded
• Controls
→ Non-malformed live births
→ Birth certificates or hospital delivery records
• Data collection
→ CATI (computer-assisted telephone interview, English/Spanish) within 24 months

We focus on Congenital Heart Defects (CHD), which are problems in the structure of the heart that are present at birth.

Congenital Heart Defects
Clinical importance: a priority in public health
→ most frequent class of defects
→ high impact on pediatric mortality
Statistical relevance: a challenge in birth defects modeling
→ Most defects are too rare for individual study
→ Difficult to determine how best to group birth defects

Experts have provided a mechanistic classification of the defects
→ relies on biological knowledge and embryologic development
→ translates into a prior guess $c_0$ for the clustering

Set partitions
A set partition $c$ of the set $[n] = \{1, \dots, n\}$ is a collection of non-empty disjoint subsets $\{B_1, B_2, \dots, B_K\}$ such that $\bigcup_{i=1}^{K} B_i = [n]$.
• Number of partitions of $[n]$ into $k$ blocks
→ Stirling numbers of the second kind $S(n, k)$
• Total number of set partitions
→ Bell number $B_n = \sum_{k=1}^{n} S(n, k)$
• Configuration $\lambda = \{|B_1|, \dots, |B_K|\}$
→ sequence of block cardinalities
→ identifies an integer partition, a collection of positive integers $\{\lambda_1, \dots, \lambda_K\}$ such that $\sum_{i=1}^{K} \lambda_i = n$

[Figure: the integer partitions of 5, labeled 11111, 2111, 311, 221, 41, 32, 5]
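The counts on this slide can be checked numerically for small $n$. The sketch below is illustrative only: the function names (`stirling2`, `bell`, `set_partitions`, `configuration`) are hypothetical helpers, not part of the presentation. It computes $S(n, k)$ by the standard recurrence, sums to the Bell number, and enumerates the set partitions of a small set by brute force.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind S(n, k)."""
    if n == k:
        return 1  # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # Element n either forms its own block or joins one of the k existing blocks.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def bell(n):
    """Bell number B_n for n >= 1, as the sum of S(n, k) over k = 1..n."""
    return sum(stirling2(n, k) for k in range(1, n + 1))


def set_partitions(items):
    """Yield every set partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Add `first` to each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or open a new singleton block.
        yield [[first]] + partition


def configuration(partition):
    """Block sizes in decreasing order, i.e. the integer partition of n."""
    return tuple(sorted((len(block) for block in partition), reverse=True))


if __name__ == "__main__":
    parts = list(set_partitions([1, 2, 3, 4, 5]))
    print(len(parts), bell(5))
    print(sorted({configuration(p) for p in parts}))
```

Brute-force enumeration is only feasible for small $n$, since the Bell numbers grow very quickly.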

Modeling birth defects
• $i = 1, \dots, N$ heart defects, $j = 1, \dots, n_i$ observations
• $y_{ij} = 1$ if observation $j$ has birth defect $i$, while $y_{ij} = 0$ if it is a control
• $x_{ij}^T = (x_{ij1}, \dots, x_{ijp})$: observed values of $p$ dichotomous variables

Grouped logistic regression
\[ y_{ij} \sim \mathrm{Ber}(\pi_{ij}), \qquad \mathrm{logit}(\pi_{ij}) = \alpha_i + x_{ij}^T \beta_{c_i}, \qquad j = 1, \dots, n_i, \]
\[ \alpha_i \sim N(a_0, \tau_0^{-1}), \qquad \beta_{c_i} \mid c \sim N_p(b, Q), \qquad i = 1, \dots, N, \]
where $c_i$ is the cluster label of defect $i$.

Bayesian framework: assign a prior probability $p(c)$
→ Exchangeable Partition Probability Function (EPPF)
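To make the role of the cluster labels concrete, here is a minimal sketch of the Bernoulli log-likelihood implied by the grouped logistic regression, in which defects sharing a cluster label share the same coefficient vector. The function names and the nested-list data layout are assumptions made for illustration; the priors on $\alpha_i$ and $\beta_{c_i}$ are not included.

```python
import math


def log1pexp(eta):
    """Numerically stable log(1 + exp(eta))."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))


def log_likelihood(y, x, alpha, beta, c):
    """Log-likelihood of the grouped logistic regression.

    y[i][j]  : 0/1 outcome of observation j for defect i
    x[i][j]  : list of p 0/1 covariates for that observation
    alpha[i] : defect-specific intercept
    beta[k]  : coefficient vector (length p) shared by cluster k
    c[i]     : cluster label of defect i (index into beta)
    """
    total = 0.0
    for i, (y_i, x_i) in enumerate(zip(y, x)):
        b = beta[c[i]]  # defects in the same cluster share coefficients
        for y_ij, x_ij in zip(y_i, x_i):
            eta = alpha[i] + sum(xv * bv for xv, bv in zip(x_ij, b))
            # log Ber(y | pi) with logit(pi) = eta equals y * eta - log(1 + exp(eta))
            total += y_ij * eta - log1pexp(eta)
    return total
```

Changing `c` changes which defects pool their data when estimating a coefficient vector, which is why the prior on the partition matters for rare defects.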

Dirichlet Process: $p(c) \propto \prod_{i=1}^{K} (|B_i| - 1)!$
Uniform distribution: $p(c) = 1 / B_N$
Pitman-Yor Process: $p(c) \propto \prod_{i=1}^{K} (1 - \sigma)_{|B_i| - 1}$, where $(x)_m$ denotes the rising factorial
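As a sanity check on the Dirichlet process case, the sketch below evaluates the standard normalized DP EPPF, $p(c) = \alpha^K \prod_{i=1}^{K} (|B_i| - 1)! \,/\, (\alpha)_N$ with $(\alpha)_N = \alpha(\alpha + 1)\cdots(\alpha + N - 1)$, over all partitions of a five-element set and compares it with the uniform $1/B_N$. The concentration parameter `alpha` and the helper names are illustrative assumptions; with `alpha = 1` the unnormalized weight reduces to the product shown on the slide.

```python
from math import factorial, prod


def set_partitions(items):
    """Yield every set partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def dp_eppf(partition, alpha=1.0):
    """Normalized Dirichlet process EPPF of a partition of N items."""
    n = sum(len(block) for block in partition)
    rising = prod(alpha + i for i in range(n))  # (alpha)_N
    weight = alpha ** len(partition) * prod(factorial(len(b) - 1) for b in partition)
    return weight / rising


if __name__ == "__main__":
    parts = list(set_partitions([1, 2, 3, 4, 5]))
    print(sum(dp_eppf(p, alpha=1.0) for p in parts))  # should be close to 1
    print(dp_eppf([[1, 2, 3, 4, 5]]), 1 / len(parts))  # DP vs. uniform
```

Unlike the uniform distribution, the DP weight depends on the block sizes, so partitions with the same configuration receive the same probability.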

How to account for $c_0$?
Base idea: penalize a baseline EPPF in order to center the prior distribution on the given partition $c_0$
\[ p(c \mid c_0, \psi) \propto p_0(c) \exp\{-\psi\, d(c, c_0)\} \tag{1} \]
• $p_0(c)$: a baseline distribution (EPPF) on $\Pi_N$
• $d(c, c_0)$: a suitable distance between partitions
→ ideally a metric on the set partition lattice
• $\psi$: penalization parameter controlling the centering
→ $\psi = 0$: $p(c \mid c_0, \psi) = p_0(c)$
→ $\psi \to \infty$: $p(c \mid c_0, \psi) \to \delta_{c_0}$

Choice of the distance
→ Variation of Information [Meila (2007)]
• $\mathrm{VI}(c, c') = -H(c) - H(c') + 2H(c \wedge c')$
• $H(\cdot)$: information entropy
• a metric on the set partition lattice
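The Variation of Information formula can be computed directly from the block sizes of the two partitions and of their meet $c \wedge c'$, whose blocks are the non-empty pairwise intersections. The sketch below is a minimal illustration; it assumes natural logarithms (the base only rescales the distance) and partitions represented as lists of blocks, and the function names are hypothetical.

```python
from math import log


def entropy(blocks, n):
    """Entropy H(c) of a partition of n items, from its block proportions."""
    return -sum(len(b) / n * log(len(b) / n) for b in blocks)


def meet(c1, c2):
    """Blocks of c1 ∧ c2: all non-empty intersections of a block of c1 with a block of c2."""
    return [set(a) & set(b) for a in c1 for b in c2 if set(a) & set(b)]


def variation_of_information(c1, c2):
    """VI(c1, c2) = -H(c1) - H(c2) + 2 H(c1 ∧ c2), using natural logs."""
    n = sum(len(b) for b in c1)
    return 2 * entropy(meet(c1, c2), n) - entropy(c1, n) - entropy(c2, n)


if __name__ == "__main__":
    one_block = [[1, 2, 3, 4, 5]]
    singletons = [[1], [2], [3], [4], [5]]
    c0 = [[1, 2], [3, 4], [5]]
    print(variation_of_information(c0, c0))
    print(variation_of_information(one_block, singletons), log(5))
```

The one-block partition and the all-singletons partition are at distance $\log n$, which is the largest VI value attainable on $n$ items.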

Centered Partition Processes
Define sets of partitions with distance $\delta_l$ from $c_0$ and configuration $\lambda_m$:
\[ s_{lm}(c_0) = \{ c \in \Pi_N : d(c, c_0) = \delta_l,\ \Lambda(c) = \lambda_m \}, \qquad l = 0, \dots, L, \quad m = 1, \dots, M. \]

Centered Partition Processes: analytic form
\[ p(c \mid c_0, \psi) = \frac{g(\lambda_m)\, e^{-\psi \delta_l}}{\sum_{u=0}^{L} \sum_{v=1}^{M} |s_{uv}(c_0)|\, g(\lambda_v)\, e^{-\psi \delta_u}}, \qquad \text{for } c \in s_{lm}(c_0). \]
• $g(\cdot)$: a function of the configuration $\Lambda(c)$
→ e.g. Uniform: $g(\Lambda(c)) = 1$; DP: $g(\Lambda(c)) = \alpha^K \prod_{j=1}^{K} \Gamma(\lambda_j)$
• $|\cdot|$: cardinality of the set $s_{lm}(c_0)$, not analytically tractable
→ the prior can nonetheless be used in Bayesian models relying on Monte Carlo methods
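For very small $N$ the analytic form can be checked by brute force: enumerate all partitions, weight each by $g(\Lambda(c))\, e^{-\psi\, d(c, c_0)}$ and normalize. The sketch below does this with the Variation of Information as distance and also tabulates the class sizes $|s_{lm}(c_0)|$. It is an illustration only, not the Monte Carlo approach mentioned on the slide; all function names are hypothetical, natural logs are assumed, and distances are rounded to 10 decimals purely to group equal values.

```python
from collections import Counter
from math import exp, gamma, log, prod


def set_partitions(items):
    """Yield every set partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in set_partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p


def entropy(blocks, n):
    return -sum(len(b) / n * log(len(b) / n) for b in blocks)


def vi(c1, c2):
    """Variation of Information between two partitions of the same items."""
    n = sum(len(b) for b in c1)
    meet = [set(a) & set(b) for a in c1 for b in c2 if set(a) & set(b)]
    return 2 * entropy(meet, n) - entropy(c1, n) - entropy(c2, n)


def g(c, alpha=None):
    """Baseline weight: 1 for the uniform EPPF, alpha^K * prod Gamma(lambda_j) for the DP."""
    if alpha is None:
        return 1.0
    return alpha ** len(c) * prod(gamma(len(b)) for b in c)


def cp_prior(c0, psi, alpha=None):
    """Return (partitions, probabilities) of the centered prior, by full enumeration."""
    items = sorted(x for b in c0 for x in b)
    parts = list(set_partitions(items))
    weights = [g(c, alpha) * exp(-psi * vi(c, c0)) for c in parts]
    z = sum(weights)
    return parts, [w / z for w in weights]


def class_sizes(c0):
    """|s_lm(c0)|: number of partitions per (distance, configuration) pair."""
    items = sorted(x for b in c0 for x in b)
    return Counter(
        (round(vi(c, c0), 10), tuple(sorted(map(len, c), reverse=True)))
        for c in set_partitions(items)
    )


if __name__ == "__main__":
    c0 = [[1, 2], [3, 4], [5]]
    for psi in (0.0, 1.0, 5.0):
        parts, probs = cp_prior(c0, psi)
        print(psi, max(probs))
```

Increasing `psi` moves mass toward $c_0$, while `psi = 0` recovers the baseline EPPF, matching the two limiting cases of equation (1).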

CP Process - Uniform EPPF
[Figure: two panels, $c_0 = \{1, 2, 3, 4, 5\}$ and $c_0 = \{1, 2\}\{3, 4\}\{5\}$]

CP Process - DP EPPF ($\alpha = 1$)
[Figure: two panels, $c_0 = \{1, 2, 3, 4, 5\}$ and $c_0 = \{1, 2\}\{3, 4\}\{5\}$]
