SLIDE 1

Parameter Estimation in Mixtures of Truncated Exponentials

Helge Langseth¹, Thomas D. Nielsen², Rafael Rumí³, Antonio Salmerón³

  • ¹Dept. of Computer and Information Science, The Norwegian University of Science and Technology, Norway
  • ²Dept. of Computer Science, Aalborg University, Denmark
  • ³Dept. of Statistics and Applied Mathematics, University of Almería, Spain

PGM, September 2008

SLIDE 2

Outline

1. Background
   • Motivation
   • Mixtures of Truncated Exponentials

2. Learning MTEs from data
   • Background
   • Maximum likelihood estimation in MTEs
   • Constrained optimisation and Lagrange multipliers
   • The Newton-Raphson method
   • The initialisation procedure

3. Model selection
   • Locating splitpoints
   • Determining model complexity

4. Conclusions

SLIDE 3


Mixtures of Truncated Exponentials

[Figure: the standard normal density f(z) and the logistic function P(Y = 1|z)]

Z → Y

Calculate P(Y = 1) in Hugin: “Illegal link”

f(z) = (1/√(2π)) · exp(−z²/2)

P(Y = 1|z) = 1/(1 + exp(−z))

SLIDE 4


Mixtures of Truncated Exponentials

[Figure: MTE approximations of the standard normal density and of the logistic function]

Z → Y

Calculate P(Y = 1) with MTEs: P(Y = 1) ≈ 0.4996851

f(z) =
  −0.0172 + 0.931 · exp(1.27z)    if −3 ≤ z < −1
  0.442 − 0.0385 · exp(−1.64z)    if −1 ≤ z < 0
  0.442 − 0.0385 · exp(1.64z)     if 0 ≤ z < 1
  −0.0172 + 0.9314 · exp(−1.27z)  if 1 ≤ z < 3

P(Y = 1|z) =
  0                               if z < −5
  −0.0217 + 0.522 · exp(0.635z)   if −5 ≤ z < 0
  1.0217 − 0.522 · exp(−0.635z)   if 0 ≤ z ≤ 5
  1                               if z > 5
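A quick numerical sanity check of this number, assuming only the piecewise coefficients shown above. The product of two MTE potentials is again an MTE, so the exact computation is closed-form; plain numerical integration suffices for a check:

    import numpy as np

    def f_z(z):
        """MTE approximation of the standard normal density on [-3, 3]."""
        if -3 <= z < -1:
            return -0.0172 + 0.931 * np.exp(1.27 * z)
        if -1 <= z < 0:
            return 0.442 - 0.0385 * np.exp(-1.64 * z)
        if 0 <= z < 1:
            return 0.442 - 0.0385 * np.exp(1.64 * z)
        if 1 <= z <= 3:
            return -0.0172 + 0.9314 * np.exp(-1.27 * z)
        return 0.0

    def p_y1(z):
        """MTE approximation of the logistic link P(Y = 1|z)."""
        if z < -5:
            return 0.0
        if z < 0:
            return -0.0217 + 0.522 * np.exp(0.635 * z)
        if z <= 5:
            return 1.0217 - 0.522 * np.exp(-0.635 * z)
        return 1.0

    # Marginalise Z out numerically: P(Y = 1) = integral of f(z) * P(Y = 1|z) dz.
    zs = np.linspace(-3.0, 3.0, 60001)
    print(np.trapz([f_z(z) * p_y1(z) for z in zs], zs))   # ~0.4997, matching the slide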

SLIDE 5


The MTE model

Definition (Univariate MTE potential over a continuous variable)

Let Z be a continuous variable. A function f : Ω_Z → R₀⁺ is an MTE potential over Z if

1. f(z) = a_0 + Σ_{i=1}^m a_i · exp(b_i · z) for all z ∈ Ω_Z, where the a_i, b_i are real numbers, or

2. there is a partition of Ω_Z into intervals I_1, . . . , I_k such that f is defined as above on each I_j.

Generalization to arbitrary hybrid domains (Moral et al. 2001): the definition transfers to multivariate domains containing both continuous and discrete variables.
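The definition maps directly onto a small data structure. A minimal sketch with illustrative names (MTEPiece and mte_eval are not from the paper); the example instantiates the four-piece normal approximation from the earlier slide:

    import math
    from dataclasses import dataclass

    @dataclass
    class MTEPiece:
        lo: float     # interval treated as half-open [lo, hi)
        hi: float
        a0: float     # constant term
        terms: list   # list of (a_i, b_i) pairs

    def mte_eval(pieces, z):
        """Evaluate f(z) = a0 + sum_i a_i * exp(b_i * z) on the piece containing z."""
        for p in pieces:
            if p.lo <= z < p.hi:
                return p.a0 + sum(a * math.exp(b * z) for a, b in p.terms)
        return 0.0

    normal_mte = [
        MTEPiece(-3, -1, -0.0172, [(0.931, 1.27)]),
        MTEPiece(-1, 0, 0.442, [(-0.0385, -1.64)]),
        MTEPiece(0, 1, 0.442, [(-0.0385, 1.64)]),
        MTEPiece(1, 3, -0.0172, [(0.9314, -1.27)]),
    ]
    print(mte_eval(normal_mte, 0.0))   # ~0.40, close to the N(0,1) density at 0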

SLIDE 6


Outline

1. Background
   • Motivation
   • Mixtures of Truncated Exponentials

2. Learning MTEs from data
   • Background
   • Maximum likelihood estimation in MTEs
   • Constrained optimisation and Lagrange multipliers
   • The Newton-Raphson method
   • The initialisation procedure

3. Model selection
   • Locating splitpoints
   • Determining model complexity

4. Conclusions

SLIDE 7


Learning MTEs from data

[Figure: histogram of the observed data sample]

The MTE learning problem: How to find the MTE distribution that generated this data?

SLIDE 8


Learning MTEs from data

The learning task involves three basic steps:

1. Determine the intervals into which Ω_Z will be partitioned.

2. Determine the number of exponential terms in the mixture for each interval.

3. Estimate the parameters.

Simplifying assumptions: In this work we are concerned with the univariate case. For simplicity we will initially assume that:

  • The intervals into which Ω_Z will be partitioned are known;
  • The number of exponential terms in the mixture for each interval is fixed to 2, giving target density f(z) = k + a · exp(b · z) + c · exp(d · z), written out in the sketch below.
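For later reference, the simplified target density written as a Python helper; a sketch only, with parameter names taken from the formula above:

    import math

    def f_target(z, k, a, b, c, d):
        """Simplified target density on one interval: f(z) = k + a*exp(b*z) + c*exp(d*z)."""
        return k + a * math.exp(b * z) + c * math.exp(d * z)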

SLIDE 10


Learning MTEs from data by Maximum Likelihood

Why learn MTEs using Maximum Likelihood?

  • Well-developed core theory, including good asymptotic properties under regularity conditions.
  • ML parameters give access to a variety of model selection procedures: LRT or BIC for selecting the number of exponential terms; likelihood maximisation for locating split-points.

Problems

  • The likelihood equations cannot be solved analytically.
  • Identifiability of the parameters.

SLIDE 11


Initial observations

We will assume target density

f(z|θ_j) = k_j + a_j · exp(b_j · z) + c_j · exp(d_j · z),  z ∈ I_j

for interval I_j, with θ_j = {k_j, a_j, b_j, c_j, d_j}. Denote by n_j the number of observations in interval I_j and let N = Σ_j n_j. Then the ML solution θ̂_j must satisfy

∫_{z ∈ I_j} f(z|θ̂_j) dz = n_j / N.   (1)

Parameter independence: θ̂_k can be found independently of θ̂_l as long as Equation (1) is satisfied for all θ̂_j. We will therefore look at a single interval I from now on (and drop the index j when appropriate).
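Since f has this simple exponential form, the left-hand side of Equation (1) is available in closed form. A small sketch, with illustrative endpoint names lo and hi:

    import math

    def interval_mass(theta, lo, hi):
        """Closed-form integral of f(z|theta) = k + a*exp(b*z) + c*exp(d*z) over [lo, hi]."""
        k, a, b, c, d = theta
        def exp_int(coef, rate):
            # integral of coef*exp(rate*z) over [lo, hi]; the rate -> 0 limit is coef*(hi - lo)
            return coef * (hi - lo) if rate == 0 else (coef / rate) * (math.exp(rate * hi) - math.exp(rate * lo))
        return k * (hi - lo) + exp_int(a, b) + exp_int(c, d)

    # Equation (1) then reads: interval_mass(theta_hat_j, lo_j, hi_j) == n_j / N.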

SLIDE 12


Constrained optimisation

Maximize

log L(θ|z) = Σ_{i: z_i ∈ I} log L(θ|z_i) = Σ_{i: z_i ∈ I} log f(z_i|θ)

subject to

∫_{z ∈ I} f(z|θ) dz − n/N = 0,
f(e1|θ) ≥ 0,
f(e2|θ) ≥ 0,

where e1 and e2 denote the endpoints of the interval I.

SLIDE 13


Constrained optimisation

Maximize

log L(θ|z) = Σ_{i: z_i ∈ I} log L(θ|z_i) = Σ_{i: z_i ∈ I} log f(z_i|θ)

subject to

∫_{z ∈ I} f(z|θ) dz − n/N = 0,
f(e1|θ) − s1² = 0,
f(e2|θ) − s2² = 0,

where the slack variables s1, s2 turn the nonnegativity inequalities of the previous slide into equality constraints.

SLIDE 14


Constrained optimisation

Maximize

log L(θ|z) = Σ_{i: z_i ∈ I} log L(θ|z_i) = Σ_{i: z_i ∈ I} log f(z_i|θ)

subject to

∫_{z ∈ I} f(z|θ) dz − n/N = 0,
f(e1|θ) − s1² = 0,
f(e2|θ) − s2² = 0.

Notation:

φ = [θᵀ sᵀ]ᵀ,  ψ = [θᵀ sᵀ λᵀ]ᵀ = [φᵀ λᵀ]ᵀ,
g0(φ) = ∫_{z ∈ I} f(z|θ) dz − n/N,
g1(φ) = f(e1|θ) − s1²,  g2(φ) = f(e2|θ) − s2².

Lagrange multipliers: Find the root of ∇_ψ (log L(θ|z) + λᵀ g(φ)) to solve the constrained optimisation problem.
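To make the root-finding target concrete, a generic sketch that forms ∇_ψ (log L(θ|z) + λᵀ g(φ)) by central finite differences. Here loglik and g are assumed callables for log L(θ|z) and the constraint vector (g0, g1, g2); an analytic gradient would be preferable in practice:

    import numpy as np

    def lagrangian_gradient(loglik, g, psi, n_theta, n_s, eps=1e-6):
        """Numerical gradient w.r.t. psi = [theta, s, lambda] of log L + lambda^T g."""
        psi = np.asarray(psi, dtype=float)
        def objective(p):
            theta = p[:n_theta]
            phi = p[:n_theta + n_s]        # phi = [theta, s]
            lam = p[n_theta + n_s:]
            return loglik(theta) + lam @ g(phi)
        grad = np.zeros_like(psi)
        for j in range(len(psi)):
            d = np.zeros_like(psi)
            d[j] = eps
            grad[j] = (objective(psi + d) - objective(psi - d)) / (2 * eps)
        return grad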

SLIDE 15


The Newton-Raphson method

Example: Find x s.t. h(x) = 0.

SLIDE 16


The Newton-Raphson method

Example: Find x s.t. h(x) = 0. Initial “guess”: x = x0; approximate h(x) by its tangent at x0.

SLIDE 17


The Newton-Raphson method

Example: Find x s.t. h(x) = 0. New “guess” x1: the point where the tangent crosses the abscissa.

SLIDE 18


The Newton-Raphson method

Example: Find x s.t. h(x) = 0. Iterate using the general formula x_{t+1} ← x_t − {h′(x_t)}⁻¹ · h(x_t).
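The iteration fits in a few lines. A minimal sketch, with an illustrative test function h(x) = x³ − 2 that is not from the slides:

    def newton_raphson(h, h_prime, x0, tol=1e-12, max_iter=100):
        """Find a root of h via x_{t+1} = x_t - h(x_t) / h'(x_t)."""
        x = x0
        for _ in range(max_iter):
            step = h(x) / h_prime(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    print(newton_raphson(lambda x: x**3 - 2, lambda x: 3 * x**2, x0=1.0))   # ~1.2599 = 2**(1/3)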

SLIDE 19


The Lagrange Multipliers method

Maximise likelihood given constraints: Use the multivariate Newton-Raphson method to solve

A(ψ|z) ≡ ∇_ψ (log L(θ|z) + λᵀ g(φ)) = 0:

ψ_{t+1} ← ψ_t − J(A(ψ_t|z))⁻¹ · A(ψ_t|z).

Initialisation of Newton-Raphson: Choose θ0 “randomly”, giving s0 = [√f(e1|θ0), √f(e2|θ0)]ᵀ; λ0 = [1 1]ᵀ (chosen rather arbitrarily).
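The multivariate step replaces the reciprocal derivative with a Jacobian solve. A generic sketch, assuming A is a callable returning the gradient vector (for instance the finite-difference lagrangian_gradient sketched earlier) and approximating the Jacobian numerically; an analytic Jacobian, where available, is cheaper and more stable:

    import numpy as np

    def jacobian(A, psi, eps=1e-6):
        """Central finite-difference Jacobian of the vector field A at psi."""
        n = len(psi)
        J = np.zeros((n, n))
        for j in range(n):
            d = np.zeros(n)
            d[j] = eps
            J[:, j] = (A(psi + d) - A(psi - d)) / (2 * eps)
        return J

    def newton_system(A, psi0, tol=1e-8, max_iter=100):
        """Solve A(psi) = 0 via psi_{t+1} = psi_t - J(A(psi_t))^{-1} A(psi_t)."""
        psi = np.asarray(psi0, dtype=float)
        for _ in range(max_iter):
            step = np.linalg.solve(jacobian(A, psi), A(psi))
            psi = psi - step
            if np.linalg.norm(step) < tol:
                break
        return psi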

SLIDE 20


Example run, Lagrange multipliers

[Figure: log-likelihood surface over (b0, d0); both axes range from −6 to 6]

Likelihood of the example data D = {z1, . . . , zn}; the value at the point (b0, d0) is given as

max_{k,a,c} Σ_i log (k + a · exp(b0 · z_i) + c · exp(d0 · z_i)).

SLIDE 21


Initialisation of the Newton-Raphson method

Initialisation procedure – Main idea: Instead of maximising over 5 parameters under the constraint ∫_{z ∈ I} f(z|θ) dz = n/N, we iteratively maximise over pairs of parameters. One parameter is varied freely; the other is chosen to make sure that the constraint is fulfilled. A high-dimensional constrained optimisation problem is thus replaced by a series of “unconstrained” optimisation problems, each in one dimension.

SLIDE 22


Initialisation algorithm

Initialisation: Choose some “random” starting values for θ, making sure that ∫_{z ∈ I} f(z|θ) dz = n/N.

[Diagram: the three components of f: the constant k, a · exp(b · z), and c · exp(d · z)]

SLIDE 23


Initialisation algorithm

Maximise over a; compensate using k: k is determined by a, to make sure that ∫_{z ∈ I} f(z|θ) dz = n/N.

a ← arg max_{a′} Σ_{i: z_i ∈ I} log f(z_i | k′ = func(θ, a′), a′, θ).

SLIDE 24


Initialisation algorithm

Maximise over c; compensate using k: k is determined by c, to make sure that ∫_{z ∈ I} f(z|θ) dz = n/N.

c ← arg max_{c′} Σ_{i: z_i ∈ I} log f(z_i | k′ = func(θ, c′), c′, θ).

SLIDE 25


Initialisation algorithm

Maximise over b; compensate using a: a is determined by b, to make sure that ∫_{z ∈ I} f(z|θ) dz = n/N.

b ← arg max_{b′} Σ_{i: z_i ∈ I} log f(z_i | a′ = func(θ, b′), b′, θ).

SLIDE 26


Initialisation algorithm

Maximise over d; compensate using c: c is determined by d, to make sure that ∫_{z ∈ I} f(z|θ) dz = n/N.

d ← arg max_{d′} Σ_{i: z_i ∈ I} log f(z_i | c′ = func(θ, d′), d′, θ).

SLIDE 27


Initialisation algorithm

Check for convergence: At this point all parameters have been updated at least once. Calculate the likelihood and check whether there is a significant improvement. If improved, iterate again; otherwise return.
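Taken together, slides 22–27 amount to a coordinate-style search. The sketch below simplifies one detail for brevity: it always compensates with k, whereas the slides compensate b with a and d with c. A grid search stands in for the one-dimensional maximisations; grids maps each free-parameter index (1: a, 2: b, 3: c, 4: d) to candidate values (rates of exactly 0 excluded), and all names are illustrative:

    import numpy as np

    def k_for_mass(a, b, c, d, lo, hi, target):
        """Choose k so that f integrates to target (= n/N) on [lo, hi]."""
        def exp_int(coef, rate):
            return (coef / rate) * (np.exp(rate * hi) - np.exp(rate * lo))
        return (target - exp_int(a, b) - exp_int(c, d)) / (hi - lo)

    def loglik(theta, z):
        k, a, b, c, d = theta
        f = k + a * np.exp(b * z) + c * np.exp(d * z)
        return np.log(f).sum() if np.all(f > 0) else -np.inf

    def initialise(z, lo, hi, target, theta0, grids, max_sweeps=20):
        # theta0 is assumed to already satisfy the mass constraint (slide 22)
        theta, best = list(theta0), loglik(theta0, z)
        for _ in range(max_sweeps):
            improved = False
            for idx in (1, 3, 2, 4):              # a, c, b, d in turn, as on the slides
                for val in grids[idx]:
                    cand = list(theta)
                    cand[idx] = val
                    cand[0] = k_for_mass(*cand[1:], lo, hi, target)   # compensate with k
                    ll = loglik(cand, z)
                    if ll > best:
                        theta, best, improved = cand, ll, True
            if not improved:                      # no significant improvement: return
                break
        return theta, best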

SLIDE 28


Example run (with initialisation)

[Figure: example run with initialisation: log-likelihood surface over (b0, d0), both axes from −6 to 6]

SLIDE 29


Outline

1. Background
   • Motivation
   • Mixtures of Truncated Exponentials

2. Learning MTEs from data
   • Background
   • Maximum likelihood estimation in MTEs
   • Constrained optimisation and Lagrange multipliers
   • The Newton-Raphson method
   • The initialisation procedure

3. Model selection
   • Locating splitpoints
   • Determining model complexity

4. Conclusions

SLIDE 30


Model selection: Split-point for dataset

New data set: 50 samples from the standard Normal distribution.

[Figure: likelihood of the data using ML estimators for different split-points; split-point location from −2.5 to 2 on the x-axis, log-likelihood from −155 to −120 on the y-axis]

SLIDE 31


Model selection: No. parameters per interval

[Figure: fitted density over the data; x-axis from −2.5 to 2]

Interval [−2.5200, −0.1303):
  • Constant term: L(θ̂_1|z) = −77.641.
  • 1 exponential term: L(θ̂_1|z) = −55.317 ⇒ p = 0.000.
  • 2 exponential terms: L(θ̂_1|z) = −55.314 ⇒ p = 0.996.
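These p-values are consistent with a likelihood-ratio test in which each added exponential term contributes two free parameters (df = 2; an assumption, not stated on the slide). A sketch whose output matches the slide up to rounding of the printed log-likelihoods:

    from scipy.stats import chi2

    def lrt_pvalue(loglik_small, loglik_big, df=2):
        """LRT: 2*(logL_big - logL_small) is compared against a chi-squared(df)."""
        return chi2.sf(2.0 * (loglik_big - loglik_small), df)

    print(lrt_pvalue(-77.641, -55.317))   # constant vs. 1 term:  ~0.000
    print(lrt_pvalue(-55.317, -55.314))   # 1 term vs. 2 terms:   ~0.997 (slide: 0.996)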

SLIDE 32


Model selection: No. parameters per interval

[Figure: fitted density over the data; x-axis from −2.5 to 2]

Interval [−0.1303, 2.2368):
  • Constant term: L(θ̂_2|z) = −77.742.
  • 1 exponential term: L(θ̂_2|z) = −64.490 ⇒ p = 0.000.
  • 2 exponential terms: L(θ̂_2|z) = −64.490 ⇒ p = 1.000.

SLIDE 33


Outline

1. Background
   • Motivation
   • Mixtures of Truncated Exponentials

2. Learning MTEs from data
   • Background
   • Maximum likelihood estimation in MTEs
   • Constrained optimisation and Lagrange multipliers
   • The Newton-Raphson method
   • The initialisation procedure

3. Model selection
   • Locating splitpoints
   • Determining model complexity

4. Conclusions

SLIDE 34


Conclusions

We have described an efficient method for learning ML estimates of univariate MTEs.

  • The ML estimates are fairly robust, and the improvement over the traditional (regression-based) method is substantial.
  • ML estimates can be used for model selection:
      • the number of exponential terms in each interval;
      • the number of split-points, and their location.

Ongoing work: extension to conditional distributions.
  • Learning the parameters of conditional distributions (“solved”).
  • Locating split-points (difficult; some progress has been made).
