The maximum likelihood degree of rank 2 matrices via Euler characteristics
Jose Israel Rodriguez, University of Chicago
Joint work with Botong Wang
AMS University of Loyola, October 3, 2015

A mixture of independence models

1. Consider a pair of four-sided dice: one red die and one blue die, $R_1, B_1$.
2. Consider a second pair of four-sided dice: one red die and one blue die, $R_2, B_2$.
3. Consider a biased coin $C = [c_1, c_2]$.

The following map induces a set of probability distributions, denoted $M_{44} \subset \Delta_{15} \subset \mathbb{R}^{16}$ and called the model:

\[
\Delta_1 \times (\Delta_3 \times \Delta_3) \times (\Delta_3 \times \Delta_3) \to M_{44} \subset \Delta_{15} \subset \mathbb{R}^{16},
\qquad
(C, R_1, B_1, R_2, B_2) \mapsto c_1 R_1 B_1^T + c_2 R_2 B_2^T = [p_{ij}].
\]

$M_{44}$ is the set of $4 \times 4$ nonnegative matrices of rank at most 2 whose entries sum to 1.
$M_{44}$ is a mixture of two independence models.
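The parametrization above can be sketched numerically. This is a minimal illustration, not part of the talk: the helper names `random_simplex_point` and `mixture_point` are hypothetical, and the dice and coin are drawn at random from their simplices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def random_simplex_point(k):
    """Draw a random probability vector of length k (a point of the simplex)."""
    return rng.dirichlet(np.ones(k))

def mixture_point(c, R1, B1, R2, B2):
    """Return the 4 x 4 matrix c_1 R_1 B_1^T + c_2 R_2 B_2^T."""
    return c[0] * np.outer(R1, B1) + c[1] * np.outer(R2, B2)

c = random_simplex_point(2)                                    # biased coin C
R1, B1, R2, B2 = (random_simplex_point(4) for _ in range(4))   # the four dice
p = mixture_point(c, R1, B1, R2, B2)

print("sum of entries:", p.sum())                    # entries form a distribution
print("rank:", np.linalg.matrix_rank(p, tol=1e-10))  # at most 2 by construction
```

The explicit tolerance in `matrix_rank` is a design choice: it keeps floating-point noise in the two vanishing singular values from being counted as rank.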

Collecting data and the likelihood function

Roll the dice. Rolling the dice we may observe the following data:

\[
u = [u_{ij}] =
\begin{bmatrix}
160 & 8 & 16 & 24 \\
32 & 200 & 16 & 8 \\
8 & 24 & 176 & 32 \\
16 & 40 & 8 & 232
\end{bmatrix}
\]

To each $p$ in the set of probability distributions $M_{44}$ we assign the likelihood of $p$ with respect to $u$ by the likelihood function

\[
\ell_u(p) = \binom{\sum_{ij} u_{ij}}{u_{11}, \dots, u_{44}}^{-1} \prod_{ij} p_{ij}^{u_{ij}}.
\]

The probability distribution maximizing $\ell_u(p)$ on the set of distributions $M_{44}$ is called the maximum likelihood estimate (mle). The mle is the best point of $M_{44}$ to describe the observed data. The statistics problem is to determine mle's.
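To make the likelihood function concrete, the following sketch evaluates its logarithm on the data from this slide. It is an illustration under assumptions, not the computation from the talk: the function name `log_likelihood` is hypothetical, the constant factor that does not depend on $p$ is dropped, and the two test points (the empirical distribution and the product of its marginals) are chosen only for comparison.

```python
import numpy as np

# Observed counts from the slide.
u = np.array([[160,   8,  16,  24],
              [ 32, 200,  16,   8],
              [  8,  24, 176,  32],
              [ 16,  40,   8, 232]], dtype=float)

def log_likelihood(u, p):
    """Log of prod_ij p_ij^(u_ij); the factor independent of p is omitted."""
    return float((u * np.log(p)).sum())

N = u.sum()

# Empirical distribution: maximizes the likelihood over the whole simplex,
# but it has rank 4 here, so it does not lie in the rank <= 2 model.
p_empirical = u / N

# Product of the marginals: a rank 1 distribution, which lies in M_44.
p_indep = np.outer(u.sum(axis=1), u.sum(axis=0)) / N**2

ll_empirical = log_likelihood(u, p_empirical)
ll_indep = log_likelihood(u, p_indep)
print(ll_empirical, ll_indep)
```

The mle on $M_{44}$ has log-likelihood somewhere between these two values, since the rank 1 point lies in the model and the empirical distribution is the unconstrained maximizer.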

Applied Algebraic Geometry

The mle can be determined by solving the likelihood equations. Instead of $M_{44}$, we consider its Zariski closure $X_{44}$. The Zariski closure is described by zero sets of homogeneous polynomials. The defining polynomials of $X_{44}$ are the $3 \times 3$ minors of

\[
\begin{bmatrix}
p_{11} & p_{12} & p_{13} & p_{14} \\
p_{21} & p_{22} & p_{23} & p_{24} \\
p_{31} & p_{32} & p_{33} & p_{34} \\
p_{41} & p_{42} & p_{43} & p_{44}
\end{bmatrix}
\]

and the linear constraint $p_{11} + p_{12} + \cdots + p_{44} - p_s = 0$. The equations define a projective variety in $\mathbb{P}^{16}$: matrices of rank at most 2.

We consider the homogenized likelihood function

\[
\ell_u(p) = \prod_{ij} \left( \frac{p_{ij}}{p_s} \right)^{u_{ij}}
\]

on $X_{44}$.
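The defining equations can be checked numerically. The sketch below is only an illustration (the helper name `three_by_three_minors` is hypothetical): it lists the sixteen $3 \times 3$ minors of a $4 \times 4$ matrix and confirms that they vanish, up to rounding, on a matrix built as a sum of two rank 1 matrices, while a generic matrix violates them.

```python
from itertools import combinations

import numpy as np

def three_by_three_minors(P):
    """Return all 3 x 3 minors of P as a flat array (16 of them for a 4 x 4 matrix)."""
    row_sets = list(combinations(range(P.shape[0]), 3))
    col_sets = list(combinations(range(P.shape[1]), 3))
    return np.array([np.linalg.det(P[np.ix_(r, c)])
                     for r in row_sets for c in col_sets])

rng = np.random.default_rng(1)

# A matrix of rank at most 2: sum of two outer products.
rank2 = (np.outer(rng.random(4), rng.random(4))
         + np.outer(rng.random(4), rng.random(4)))
generic = rng.random((4, 4))   # a generic matrix, expected to have rank 4

minors_rank2 = three_by_three_minors(rank2)
minors_generic = three_by_three_minors(generic)

p_s = rank2.sum()              # the linear constraint fixes the extra coordinate p_s

print(len(minors_rank2), np.abs(minors_rank2).max(), np.abs(minors_generic).max())
```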

Geometric definition of critical points

Critical points can be determined by solving a system of polynomial equations.

For the models in this talk, the mle is a critical point of the homogenized likelihood function. The solutions to the likelihood equations are critical points. One way to formulate the likelihood equations is to use Lagrange multipliers.
◮ We omit a formal description of the likelihood equations and instead give a geometric description of critical points.

Geometric definition of critical points (cont.)

Let $X^o$ denote the open variety $X \setminus \{\text{coordinate hyperplanes}\}$.
◮ $X^o$ is the set of points in $X$ which have nonzero coordinates.

The gradient of the likelihood function, up to scaling, equals

\[
\nabla \ell_u(p) = \left[ \frac{u_{11}}{p_{11}}, \frac{u_{12}}{p_{12}}, \dots, \frac{u_{44}}{p_{44}}, \frac{u_s}{p_s} \right],
\qquad
u_s := -\sum_{ij} u_{ij}.
\]

◮ The gradient is defined on $X^o$.

We say $p \in X^o$ is a complex critical point whenever $\nabla \ell_u(p)$ is orthogonal to the tangent space of $X$ at $p$ and $p \in X^o_{\mathrm{reg}}$. The mle is a critical point (in the cases we consider).
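The orthogonality condition can be tested numerically on a simpler variety than $X_{44}$. The sketch below is an assumption-laden illustration, not the method of the talk: it uses the rank 1 independence model, whose mle is the product of the marginals of $u$, and whose tangent directions at $p = a b^T$ have the form $x b^T + a y^T$. The helper name `pairing` is hypothetical. The change in $p_s$ along a direction is taken to be the sum of the changes in the $p_{ij}$, so that the linear constraint is preserved.

```python
import numpy as np

u = np.array([[160,   8,  16,  24],
              [ 32, 200,  16,   8],
              [  8,  24, 176,  32],
              [ 16,  40,   8, 232]], dtype=float)
N = u.sum()

a = u.sum(axis=1) / N          # row marginals
b = u.sum(axis=0) / N          # column marginals
p = np.outer(a, b)             # candidate critical point on the rank 1 model
p_s = p.sum()                  # equals 1 up to rounding
u_s = -N                       # u_s := -sum_ij u_ij

grad_p = u / p                 # entries u_ij / p_ij
grad_s = u_s / p_s             # last entry u_s / p_s

def pairing(dp):
    """Pair the gradient with the direction (dp, dp_s), where dp_s = sum(dp)."""
    return float((grad_p * dp).sum() + grad_s * dp.sum())

rng = np.random.default_rng(2)
x, y = rng.normal(size=4), rng.normal(size=4)
tangent = np.outer(x, b) + np.outer(a, y)   # tangent direction to rank 1 matrices at p

off_model = np.zeros((4, 4))
off_model[0, 0] = 1.0                       # a direction that leaves the rank 1 model

print(pairing(tangent))    # vanishes up to rounding
print(pairing(off_model))  # does not vanish
```

The pairing with tangent directions vanishes because each row of `u / p` sums against `b` to $N$, which cancels the contribution of the last coordinate.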

Two experiments and ML degree

Two experiments. Consider vectorized datasets $u$ for the likelihood function $\ell_u(p)$ on $X_{44}$.
◮ u = {160, 8, 16, 24, 32, 200, 16, 8, 8, 24, 176, 32, 16, 40, 8, 232}
  ⋆ 191 complex solutions: 25 real and 166 nonreal
◮ u = {292, 45, 62, 41, 142, 51, 44, 42, 213, 75, 67, 63, 119, 85, 58, 70}
  ⋆ 191 complex solutions: 3 real and 188 nonreal

The number of complex solutions was 191 in both experiments. For general choices of $u$ we get the same number of complex critical points.
◮ This number is called the ML degree of a variety.

Previous Computational Results

Consider the mixture model $M_{mn}$ for $m$-sided red dice and $n$-sided blue dice. Denote its Zariski closure by $X_{mn}$.

Theorem. The ML degrees of $X_{mn}$ include the following:

m \ n     3     4     5     6     7     8     9    10    11    12
  3      10    26    58   122   250   506  1018  2042  4090  8186
  4      26   191   843  3119  6776     ?     ?     ?     ?     ?

Reference: “Maximum likelihood for matrices with rank constraints”
◮ J. Hauenstein, [], and B. Sturmfels, using Bertini.

Any conjectures for the first row? (Hint: add 6.)

“Maximum likelihood geometry in the presence of sampling and model zeros” gave supporting evidence for up to n = 15.
◮ E. Gross and [], using Macaulay2.
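One way to read the hint is to add 6 to each entry of the first row and look for a pattern. The sketch below tests the guess $2^{n+1} - 6$ against the tabulated values for $m = 3$; the closed form is a guess suggested by the hint, not a formula stated on the slide.

```python
# First row of the table: ML degrees of X_{3n} for n = 3, ..., 12.
ns = list(range(3, 13))
first_row = [10, 26, 58, 122, 250, 506, 1018, 2042, 4090, 8186]

# Adding 6 to each entry gives 16, 32, 64, ..., which suggests powers of two.
shifted = [value + 6 for value in first_row]
guess = [2 ** (n + 1) - 6 for n in ns]

matches = guess == first_row
print(shifted)
print(matches)
```

The guess agrees with all ten tabulated entries, which is consistent with the supporting evidence mentioned above but is not a proof.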

Euler characteristics and ML degrees

Huh proves that the ML degree is an Euler characteristic in the smooth case.

Let $X$ be a smooth variety in $\mathbb{P}^{n+1}$ defined by homogeneous polynomials and the linear constraint $p_0 + p_1 + \cdots + p_n - p_s = 0$. Let $X^o$ denote the open variety $X \setminus \{\text{coordinate hyperplanes}\}$.

Theorem [Huh]. The ML degree of the smooth variety $X$ equals the signed Euler characteristic of $X^o$, i.e.

\[
\chi(X^o) = (-1)^{\dim X} \, \mathrm{MLdegree}(X).
\]

The independence model (one-sided coin) is smooth, but the mixture model is not.
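The theorem can be checked by hand in the simplest case. The sketch below is a worked example under stated assumptions, not a computation from the talk: it takes $X$ to be the whole hyperplane $p_0 + \cdots + p_n - p_s = 0$, so $X \cong \mathbb{P}^n$ and $X^o$ is the complement of $n + 2$ hyperplanes in general position. It uses the facts that $\chi(\mathbb{P}^k) = k + 1$ and that the Euler characteristic is additive, so inclusion-exclusion applies. The function names are hypothetical. For this model the only critical point is the empirical distribution, so the ML degree is 1.

```python
from math import comb

def euler_char_projective_space(k):
    """chi(P^k) = k + 1 for k >= 0; the empty set (k < 0) contributes 0."""
    return k + 1 if k >= 0 else 0

def euler_char_generic_complement(n, k):
    """chi of P^n minus k hyperplanes in general position, by inclusion-exclusion.

    Any j of the hyperplanes meet in a copy of P^(n - j), which is empty for j > n.
    """
    return sum((-1) ** j * comb(k, j) * euler_char_projective_space(n - j)
               for j in range(k + 1))

for n in range(1, 6):
    chi = euler_char_generic_complement(n, n + 2)  # n + 1 coordinates plus p_s
    ml_degree = (-1) ** n * chi                    # signed Euler characteristic
    print(n, chi, ml_degree)
```

For $n = 1$ this is the projective line minus three points, with Euler characteristic $2 - 3 = -1$ and ML degree 1.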
