 
The continuous categorical: a novel simplex-valued exponential family
Elliott Gordon-Rodríguez, Gabriel Loaiza-Ganem, John P. Cunningham
https://arxiv.org/abs/2002.08563
ICML 2020
Motivation: compositional data
Definition (simplex): $\mathbb{S}^K := \{ x \in \mathbb{R}_+^K : \sum_{i=1}^K x_i = 1 \}$
Gordon-Rodriguez, E., Loaiza-Ganem, G., & Cunningham, J. P. (2020). The continuous categorical: a novel simplex-valued exponential family.
Examples:
◮ Geology
◮ Chemistry
◮ Microbiology
◮ Genetics
◮ Economics
◮ Politics
◮ Machine learning
Shortcomings of the Dirichlet
Definition: $x \sim \mathrm{Dirichlet}(\alpha)$ if $x \in \mathbb{S}^K$ with density:
$$p(x; \alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^K x_i^{\alpha_i - 1}. \qquad (1)$$
◮ Extrema. $\log p(x; \alpha) \to \pm\infty$ as $x_j \to 0$. ∴ log-likelihood is undefined in the presence of zeros.
◮ Bias. Re-write the density in canonical form $p(x; \alpha) = h(x) \exp\left( \sum_{i=1}^K \alpha_i \log x_i - A(\alpha) \right)$. By theory of exponential families, MLE is unbiased for $\mathbb{E} \log x_j$. ∴ MLE is biased for the mean $\mu_j = \mathbb{E} x_j$.
◮ Flexibility. If $x_0 \in \mathbb{S}^K$ is a single datapoint, then $\log p(x_0; \alpha) \to \infty$ as $\alpha \to \infty$ along $\alpha = k x_0$. ∴ the Dirichlet log-likelihood is ill-behaved under flexible predictive models (e.g. GLMs, neural networks).
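The Extrema and Flexibility points can be checked numerically. The sketch below is only an illustration, not code from the talk: `dirichlet_logpdf`, the datapoint `x0`, the scales `k` and the near-boundary point `x_edge` are all choices made here. It evaluates the Dirichlet log-density along $\alpha = k x_0$ and near a zero coordinate.

```python
import math

def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at x, for x strictly inside the simplex."""
    # log B(alpha) = sum_i lgamma(alpha_i) - lgamma(sum_i alpha_i)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    return sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x)) - log_beta

# Flexibility: log-likelihood of a single datapoint along alpha = k * x0.
x0 = [0.2, 0.3, 0.5]                      # hypothetical datapoint
scales = [1.0, 10.0, 100.0, 1000.0]       # hypothetical values of k
loglik = [dirichlet_logpdf(x0, [k * xi for xi in x0]) for k in scales]

# Extrema: push one coordinate towards zero under two different parameters.
eps = 1e-12
x_edge = [eps, 0.5, 0.5 - eps]
up = dirichlet_logpdf(x_edge, [0.5, 1.0, 1.0])    # alpha_1 < 1: large positive
down = dirichlet_logpdf(x_edge, [2.0, 1.0, 1.0])  # alpha_1 > 1: large negative

print(loglik, up, down)
```

The values in `loglik` keep growing with `k`, and the sign of the blow-up near the boundary depends on whether $\alpha_1$ is below or above 1. At an exact zero, `math.log(0.0)` raises a `ValueError`, which is the practical face of the undefined log-likelihood.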
Solution: a new exponential family
Definition: $x \in \mathbb{S}^K$ follows a continuous categorical (CC) distribution with parameter $\lambda \in \mathbb{S}^K$ if:
$$x \sim \mathrm{CC}(\lambda) \iff p(x; \lambda) \propto \prod_{i=1}^K \lambda_i^{x_i}$$
◮ Extrema. $\log p(x; \lambda)$ is finite at the extrema of the simplex. ∴ log-likelihood is well-defined in the presence of zeros.
◮ Bias. Re-write the CC density in canonical form $p(x; \lambda) \propto \exp\left( \sum_{i=1}^K \log(\lambda_i) \cdot x_i \right)$. ∴ by theory of exponential families, MLE is unbiased for the mean $\mu_j = \mathbb{E} x_j$.
◮ Flexibility. The CC density is convex in $x$. ∴ cannot represent interior modes, cannot concentrate mass on interior points and log-likelihood does not diverge.
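The Extrema point can be checked in one line from the definition. The bound below is a short worked step added here, assuming every $\lambda_i > 0$; it concerns the unnormalized log-density for a fixed $\lambda$ only.

```latex
\log \prod_{i=1}^K \lambda_i^{x_i} = \sum_{i=1}^K x_i \log \lambda_i ,
\qquad
\min_i \log \lambda_i \;\le\; \sum_{i=1}^K x_i \log \lambda_i \;\le\; \max_i \log \lambda_i
\quad \text{for all } x \in \mathbb{S}^K .
```

The inequalities hold because the sum is a convex combination of the values $\log \lambda_i$ (the $x_i$ are nonnegative and sum to 1). A coordinate $x_j = 0$ simply removes its term, so nothing diverges at the boundary, in contrast to the $\log x_j$ terms of the Dirichlet.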
Where did this come from?
◮ A probabilistic cross-entropy loss for compositional data.
◮ Multivariate generalization of the continuous Bernoulli distribution (Loaiza-Ganem & Cunningham, NeurIPS 2019): $x \sim \mathrm{CB}(\lambda) \iff p(x \mid \lambda) \propto \lambda^x (1 - \lambda)^{1 - x}$, for $x \in [0, 1] = \mathbb{S}^1$.
◮ A continuous relaxation of the categorical distribution.
◮ Switching the role of the parameter and the argument in the Dirichlet density.
◮ Restricting independent exponential RVs to the simplex.
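The last bullet can be unpacked in a few lines. This is a sketch rather than the authors' derivation: the rates $\beta_i$ and the unnormalized weights $\tilde\lambda_i$ are notation introduced here, and "restricting" is read as conditioning independent $y_i \sim \mathrm{Exp}(\beta_i)$ on $\sum_i y_i = 1$.

```latex
% Joint density of independent exponentials, for y_i >= 0:
p(y_1, \dots, y_K) = \prod_{i=1}^K \beta_i e^{-\beta_i y_i}
  \;\propto\; \prod_{i=1}^K \left( e^{-\beta_i} \right)^{y_i} .

% Restricted to the simplex (x = y with \sum_i x_i = 1):
p(x) \;\propto\; \prod_{i=1}^K \tilde\lambda_i^{\,x_i},
  \qquad \tilde\lambda_i := e^{-\beta_i} .

% Rescaling the weights onto the simplex only changes the constant, since \sum_i x_i = 1:
\lambda_i := \frac{\tilde\lambda_i}{\sum_j \tilde\lambda_j}
  \;\Longrightarrow\;
  \prod_{i=1}^K \lambda_i^{x_i}
  = \frac{\prod_{i=1}^K \tilde\lambda_i^{\,x_i}}{\sum_j \tilde\lambda_j} .
```

Under this reading the restricted density has exactly the CC form $\prod_i \lambda_i^{x_i}$ with $\lambda \in \mathbb{S}^K$.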
Normalizing constant
Theorem: Write $C(\lambda)$ for the normalizing constant of the $\mathrm{CC}(\lambda)$ distribution, i.e.
$$\int_{\mathbb{S}^K} C(\lambda) \prod_{i=1}^K \lambda_i^{x_i} \, d\mu(x) = 1. \qquad (2)$$
Then
$$C(\lambda) = \left( (-1)^{K+1} \sum_{k=1}^K \frac{\lambda_k}{\prod_{i \neq k} \log \frac{\lambda_i}{\lambda_k}} \right)^{-1}.$$
Remark:
◮ Closed-form in terms of elementary functions only.
◮ Can compute moments, MGF, and more, directly from $C(\cdot)$.
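As a sanity check on the closed form, the sketch below compares it with a brute-force integral for $K = 3$. It is illustrative only: the function names, the test value of `lam` and the grid size are choices made here, and $\mu$ is assumed to be Lebesgue measure on the first $K - 1$ coordinates. Because the formula divides by $\log(\lambda_i / \lambda_k)$, the code assumes all entries of `lam` are distinct.

```python
import math

def cc_normalizer(lam):
    """Closed-form C(lam) from the theorem; assumes distinct, positive entries."""
    K = len(lam)
    total = 0.0
    for k in range(K):
        denom = 1.0
        for i in range(K):
            if i != k:
                denom *= math.log(lam[i] / lam[k])
        total += lam[k] / denom
    return 1.0 / ((-1) ** (K + 1) * total)

def cc_integral_k3(lam, n=200):
    """Midpoint-rule estimate of the integral of prod_i lam_i^{x_i} over the
    simplex for K = 3, integrating over (x1, x2) with x3 = 1 - x1 - x2."""
    h = 1.0 / n
    log_lam = [math.log(v) for v in lam]
    total = 0.0
    for a in range(n):
        x1 = (a + 0.5) * h
        for b in range(n - a):
            x2 = (b + 0.5) * h
            x3 = 1.0 - x1 - x2
            # Cells cut by the edge x1 + x2 = 1 are half inside the simplex.
            weight = 0.5 if a + b == n - 1 else 1.0
            total += weight * math.exp(
                x1 * log_lam[0] + x2 * log_lam[1] + x3 * log_lam[2]
            )
    return total * h * h

lam = [0.2, 0.3, 0.5]                      # hypothetical parameter
ratio = cc_normalizer(lam) * cc_integral_k3(lam)
print(ratio)                               # should be close to 1
```

For $K = 2$ the same function reduces to $\log(\lambda / (1 - \lambda)) / (2\lambda - 1)$, the reciprocal of $\int_0^1 \lambda^x (1 - \lambda)^{1 - x} \, dx$.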
Related distributions
[Diagram: a 2×2 grid relating four distributions]
◮ [0,1]-valued (image data): Beta, with density $\propto x^{\alpha - 1} (1 - x)^{\beta - 1}$, and continuous Bernoulli (CB), with density $\propto \lambda^x (1 - \lambda)^{1 - x}$.
◮ Simplex-valued (compositional data): Dirichlet, with density $\propto \prod_{i=1}^K x_i^{\alpha_i - 1}$, and continuous categorical (CC), with density $\propto \prod_{i=1}^K \lambda_i^{x_i}$.
◮ Switch parameter and argument: Beta ↔ CB, Dirichlet ↔ CC.
◮ Generalize to simplex: Beta → Dirichlet, CB → CC.
◮ Beta and Dirichlet: unstable, biased, flexible.
◮ CB and CC: stable, unbiased, inflexible.
Application: UK 2019 general election
[Diagram: constituency-level predictors → regression function (linear or MLP) → voting outcomes]