
Maximum Likelihood vs. Least Squares for Estimating Mixtures of Truncated Exponentials

Helge Langseth¹, Thomas D. Nielsen², Rafael Rumí³, Antonio Salmerón³

¹ Department of Computer and Information Science, The Norwegian University of Science and Technology, Trondheim (Norway)

² Department of Computer Science, Aalborg University, Aalborg (Denmark)

³ Department of Statistics and Applied Mathematics, University of Almería, Almería (Spain)

INFORMS, Seattle, November 2007


Outline

1. Motivation.
2. The MTE (Mixture of Truncated Exponentials) model.
3. Maximum Likelihood (ML) estimation for MTEs.
4. Least Squares (LS) estimation of MTEs.
5. Experimental analysis.
6. Conclusions.


Motivation

Graphical models are a common tool for decision analysis, and problems in which continuous and discrete variables interact are frequent. MTE models offer a very general solution. When learning from data, the only existing method in the literature is based on least squares (LS); the feasibility of maximum likelihood (ML) estimation is therefore worth studying.


Bayesian networks

[Figure: a DAG over the nodes X1, X2, X3, X4, X5.]

A Bayesian network is a D.A.G. whose nodes represent random variables; an arc indicates dependence. The joint density factorises as

p(x) = ∏_{i=1}^{n} p(x_i | π_i),   x ∈ Ω_X,

where π_i denotes the parents of X_i in the graph.


For the network above, the factorisation reads

p(x1, x2, x3, x4, x5) = p(x1) p(x2|x1) p(x3|x1) p(x5|x3) p(x4|x2, x3).
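As a small illustration, this factorisation can be evaluated term by term; the binary variables and the probability tables below are made up for the example.

```python
# A toy evaluation of the factorisation above, with hypothetical binary
# variables and made-up conditional probability tables.
p_x1 = {0: 0.6, 1: 0.4}
p_x2_x1 = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}   # key: (x2, x1)
p_x3_x1 = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.9, (1, 1): 0.1}   # key: (x3, x1)
p_x5_x3 = {(0, 0): 0.4, (1, 0): 0.6, (0, 1): 0.1, (1, 1): 0.9}   # key: (x5, x3)
p_x4_x2x3 = {(0, 0, 0): 0.3, (1, 0, 0): 0.7, (0, 0, 1): 0.6, (1, 0, 1): 0.4,
             (0, 1, 0): 0.5, (1, 1, 0): 0.5, (0, 1, 1): 0.2, (1, 1, 1): 0.8}

def joint(x1, x2, x3, x4, x5):
    """p(x1,...,x5) = p(x1) p(x2|x1) p(x3|x1) p(x5|x3) p(x4|x2,x3)."""
    return (p_x1[x1] * p_x2_x1[(x2, x1)] * p_x3_x1[(x3, x1)]
            * p_x5_x3[(x5, x3)] * p_x4_x2x3[(x4, x2, x3)])

print(joint(0, 1, 0, 1, 0))  # 0.6 * 0.3 * 0.5 * 0.4 * 0.5 = 0.018
```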


The MTE model (Moral et al. 2001)

Definition (MTE potential). Let X be a mixed n-dimensional random vector, and let Y = (Y₁, ..., Y_d) and Z = (Z₁, ..., Z_c) be its discrete and continuous parts. A function f : Ω_X → R₀⁺ is a Mixture of Truncated Exponentials potential (MTE potential) if, for each fixed value y ∈ Ω_Y of the discrete variables Y, the potential over the continuous variables Z is defined as

f(z) = a₀ + Σ_{i=1}^{m} a_i exp( Σ_{j=1}^{c} b_i^{(j)} z_j )   for all z ∈ Ω_Z,

where the a_i and b_i^{(j)} are real numbers. Also, f is an MTE potential if there is a partition D₁, ..., D_k of Ω_Z into hypercubes and, on each D_i, f is defined as above.
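A minimal sketch of one way to represent and evaluate such a potential in the univariate case; the piece layout (interval bounds, constant, list of (a_i, b_i) terms) is an assumption for illustration, and the numbers are taken from the marginal f(x) in the example further on.

```python
import math

# Each piece is (low, high, a0, [(a_i, b_i), ...]) so that on [low, high)
# f(x) = a0 + sum_i a_i * exp(b_i * x).  Hypothetical layout.
pieces = [
    (0.4, 4.0, 1.16, [(-1.12, -0.02)]),
    (4.0, 19.0, 0.0, [(0.9, -0.35)]),
]

def mte(x, pieces):
    for low, high, a0, terms in pieces:
        if low <= x < high:
            return a0 + sum(a * math.exp(b * x) for a, b in terms)
    return 0.0  # outside the support

print(mte(2.0, pieces), mte(10.0, pieces))
```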

Example. Consider a model with continuous variables X and Y, and a discrete variable Z. [Figure: a DAG over the variables X, Y and Z.]


One example of conditional densities for this model is given by the following expressions:

f(x) =
  1.16 − 1.12 e^{−0.02x}    if 0.4 ≤ x < 4,
  0.9 e^{−0.35x}            if 4 ≤ x < 19.

f(y|x) =
  1.26 − 1.15 e^{0.006y}                       if 0.4 ≤ x < 5, 0 ≤ y < 13,
  1.18 − 1.16 e^{0.0002y}                      if 0.4 ≤ x < 5, 13 ≤ y < 43,
  0.07 − 0.03 e^{−0.4y} + 0.0001 e^{0.0004y}   if 5 ≤ x < 19, 0 ≤ y < 5,
  −0.99 + 1.03 e^{0.001y}                      if 5 ≤ x < 19, 5 ≤ y < 43.

f(z|x) =
  0.3   if z = 0, 0.4 ≤ x < 5,
  0.7   if z = 1, 0.4 ≤ x < 5,
  0.6   if z = 0, 5 ≤ x < 19,
  0.4   if z = 1, 5 ≤ x < 19.


Learning MTEs from data

In this work we are concerned with the univariate case. The learning task involves three basic steps:

1. Determining the splits into which Ω_X will be partitioned.
2. Determining the number of exponential terms in the mixture for each split.
3. Estimating the parameters.


Learning MTEs from data by ML

Why ML? The core theory is well developed; the estimators have good asymptotic properties under regularity conditions; and several procedures connected to Bayesian networks rely on ML estimates.

Problems for applying ML to MTEs: the likelihood equations cannot be solved analytically for the MTE model, and numerical methods are slow and potentially unstable.

Question: can the LS estimates be used as approximations to the ML ones?
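To make the numerical route concrete, here is a sketch (not the authors' implementation) of direct log-likelihood maximisation for a one-interval model f(x) = k + a e^{bx} on [lo, hi]; the synthetic data, truncation interval, feasibility penalty and optimiser choice are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic truncated data for the demo (an assumption, not the paper's data).
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=1000)
lo, hi = 0.0, 10.0
data = data[(data >= lo) & (data < hi)]

def neg_loglik(theta):
    a, b = theta
    # Integral of a*exp(b*x) over [lo, hi]; used to eliminate k via the
    # normalisation constraint (f must integrate to 1 over [lo, hi]).
    integral = a * (np.exp(b * hi) - np.exp(b * lo)) / b if b != 0 else a * (hi - lo)
    k = (1.0 - integral) / (hi - lo)
    f = k + a * np.exp(b * data)
    if np.any(f <= 0):
        return 1e10  # crude penalty: the density must stay non-negative
    return -np.sum(np.log(f))

res = minimize(neg_loglik, x0=[0.5, -0.5], method="Nelder-Mead")
print(res.x, -res.fun)  # fitted (a, b) and the attained log-likelihood
```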

Learning MTEs from data

Starting point:

R. Rumí, A. Salmerón, S. Moral (2006). Estimating mixtures of truncated exponentials in hybrid Bayesian networks. Test 15:397–421. Estimation based on least squares (LS); the empirical density is approximated by a histogram.

Improved version:

V. Romero, R. Rumí, A. Salmerón (2006). Learning hybrid Bayesian networks using mixtures of truncated exponentials. International Journal of Approximate Reasoning 42:54–68. The empirical density is approximated by a kernel estimate.


Learning MTEs from data

The split points are determined by inspecting the extreme and inflexion points of the empirical density. The number of points N to locate is fixed beforehand, and the N largest changes of concavity/convexity or of increase/decrease are selected, as sketched below. The number of exponential terms can be fixed beforehand or decided during the parameter estimation procedure.
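A sketch of the detection step under stated assumptions: the empirical density is a Gaussian kernel estimate on a grid, and the "size" of a change is measured by the jump of the first or second numerical derivative across its sign change; the grid size and kernel are illustrative choices, not the paper's.

```python
import numpy as np
from scipy.stats import gaussian_kde

def split_points(sample, N=3, grid_size=200):
    grid = np.linspace(sample.min(), sample.max(), grid_size)
    dens = gaussian_kde(sample)(grid)
    d1 = np.gradient(dens, grid)          # increase/decrease changes
    d2 = np.gradient(d1, grid)            # concavity/convexity changes
    candidates = []
    for deriv in (d1, d2):
        for i in np.where(np.diff(np.sign(deriv)) != 0)[0]:
            candidates.append((abs(deriv[i] - deriv[i + 1]), grid[i]))
    candidates.sort(reverse=True)         # biggest changes first
    return sorted(x for _, x in candidates[:N])

rng = np.random.default_rng(1)
print(split_points(rng.normal(size=1000)))
```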


Learning MTEs: estimating the parameters by LS

Target density: f(x) = k + a e^{bx} + c e^{dx}. Assume we have initial estimates a₀, b₀ and k₀. Then c and d are estimated by fitting the function w = c e^{dx} to the points (x, w), where w = y − a₀ e^{b₀x} − k₀, minimising the weighted mean squared error.


Taking logarithms reduces the problem to linear regression:

ln w = ln( c e^{dx} ) = ln c + d x,

which can be written as w* = c* + d x, where c* = ln c and w* = ln w. The solution is

(c*, d) = arg min_{c*, d} Σ_{i=1}^{n} ( w*_i − c* − d x_i )² f(x_i).


The solution can be obtained by analytical means:

c* = ( Σ_{i=1}^{n} w*_i x_i f(x_i) − d Σ_{i=1}^{n} x_i² f(x_i) ) / Σ_{i=1}^{n} x_i f(x_i),

d = ( Σ_{i=1}^{n} w*_i f(x_i) · Σ_{i=1}^{n} x_i f(x_i) − Σ_{i=1}^{n} f(x_i) · Σ_{i=1}^{n} w*_i x_i f(x_i) ) / ( ( Σ_{i=1}^{n} x_i f(x_i) )² − Σ_{i=1}^{n} f(x_i) · Σ_{i=1}^{n} x_i² f(x_i) ).
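The closed form transcribes directly into code. A sketch, where x, w and f are arrays and the weights f are the empirical density values at the sample points; the toy usage with uniform weights is an assumption for the demo.

```python
import numpy as np

# Closed-form weighted LS fit of w* = c* + d*x with weights f(x_i).
# Requires w > 0 so that the log-transform is defined.
def fit_exponential_term(x, w, f):
    ws = np.log(w)                       # w* = ln w
    Sf, Sxf = np.sum(f), np.sum(x * f)
    Sx2f = np.sum(x * x * f)
    Swf, Swxf = np.sum(ws * f), np.sum(ws * x * f)
    d = (Swf * Sxf - Sf * Swxf) / (Sxf**2 - Sf * Sx2f)
    c_star = (Swxf - d * Sx2f) / Sxf
    return np.exp(c_star), d             # back-transform: c = exp(c*)

# Toy check: recover c = 2, d = -0.5 from noiseless data, uniform weights.
x = np.linspace(0.1, 5.0, 50)
print(fit_exponential_term(x, 2.0 * np.exp(-0.5 * x), np.ones_like(x)))
```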


Once a, b, c and d are known, we go for k in f*(x) = k + a e^{bx} + c e^{dx}, where k ∈ R minimises the error

E(k) = (1/n) Σ_{i=1}^{n} ( f(x_i) − a e^{b x_i} − c e^{d x_i} − k )² f(x_i).

This is optimised for

k̂ = Σ_{i=1}^{n} ( f(x_i) − a e^{b x_i} − c e^{d x_i} ) f(x_i) / Σ_{i=1}^{n} f(x_i).
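In code, k̂ is just an f-weighted mean of the residuals; a sketch, with f holding the empirical density values at the sample points.

```python
import numpy as np

# Closed-form k-update: the f(x_i)-weighted mean of the residuals
# f(x_i) - a*exp(b*x_i) - c*exp(d*x_i), following the formula above.
def fit_constant(x, f, a, b, c, d):
    residual = f - a * np.exp(b * x) - c * np.exp(d * x)
    return np.sum(residual * f) / np.sum(f)
```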


The contribution of each exponential term can be refined. Assume that we have an estimated model

f̂(x) = â₀ e^{b̂₀ x} + ĉ₀ e^{d̂₀ x} + k̂₀.

The impact of the second exponential term can be determined by introducing a factor h in the regression equation, for a sample (x, y), given by

y = â₀ e^{b̂₀ x} + h ĉ₀ e^{d̂₀ x} + k̂₀,

and the value of h is computed by least squares, obtaining

h = Σ_{i=1}^{n} ( y_i − â₀ e^{b̂₀ x_i} − k̂₀ ) e^{d̂₀ x_i} f(x_i) / Σ_{i=1}^{n} ĉ₀ e^{2 d̂₀ x_i} f(x_i).
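A sketch of the corresponding update, with the same array conventions as before:

```python
import numpy as np

# Refinement factor h: a weighted least-squares rescaling of the second
# exponential term, following the formula above.
def refine_h(x, y, f, a0, b0, c0, d0, k0):
    num = np.sum((y - a0 * np.exp(b0 * x) - k0) * np.exp(d0 * x) * f)
    den = np.sum(c0 * np.exp(2 * d0 * x) * f)
    return num / den
```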


Initialising a, b and k

The initial values of a, b and k can be arbitrary, but a good choice speeds up the convergence of the method. They can be initialised by fitting a curve y = a e^{bx} to the modified sample by exponential regression, and computing k as before. Another alternative is to force the empirical density and the initial model to have the same derivative.
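A sketch of the exponential-regression initialisation (the derivative-matching alternative is not shown); regressing ln y on x is only possible where y > 0, which is an assumption of this shortcut.

```python
import numpy as np

# Fit y ≈ a*exp(b*x) by linear regression of ln(y) on x, then set k to
# the f-weighted mean residual, as in the k-update above.
def init_abk(x, y, f):
    mask = y > 0
    b, log_a = np.polyfit(x[mask], np.log(y[mask]), deg=1)
    a = np.exp(log_a)
    k = np.sum((y - a * np.exp(b * x)) * f) / np.sum(f)
    return a, b, k
```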


Experimental setting

Tested distributions: Normal, Log-normal, χ², Beta, and MTE.

Sample size: 1000.

Split point detection: manually determined for ML; automatic detection for LS.


Graphical comparison: MTE

[Figure: fitted densities for the artificial MTE target, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: Log-normal

[Figure: fitted densities for the Log-normal target, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: χ²

[Figure: fitted densities for the χ² target, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: Normal (2 splits)

[Figure: fitted densities for the Normal target with 2 splits, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: Normal (4 splits)

[Figure: fitted densities for the Normal target with 4 splits, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: Beta

[Figure: fitted densities for the Beta target, sample size 1000. Black = original, red = LS estimate, blue = ML estimate.]


Graphical comparison: Beta, kernel fitting

[Figure: Beta target, sample size 1000; kernel estimate values compared against the LS fit (red).]


Comparison in terms of likelihood

Log-likelihoods of the fitted models (the Beta column is the one with positive values, since a density on [0, 1] can exceed 1; this pairing is also the one consistent with the p-values below):

       Artificial         χ²       Beta   Normal-2   Normal-4   Log-normal
ML      −2263.132  −2685.765    160.687  −1373.499  −1364.643    −1398.300
LS      −2293.473  −2726.950     90.234  −1533.889  −1404.747    −1451.568

Comparison through a two-sided paired t-test: including all the data, p = 0.02039; excluding the Beta, p = 0.05418.
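The reported p-values can be reproduced from the table; a sketch using scipy:

```python
import numpy as np
from scipy.stats import ttest_rel

# Per-distribution log-likelihoods from the table, in column order
# Artificial, χ², Beta, Normal-2, Normal-4, Log-normal.
ml = np.array([-2263.132, -2685.765, 160.687, -1373.499, -1364.643, -1398.300])
ls = np.array([-2293.473, -2726.950, 90.234, -1533.889, -1404.747, -1451.568])

print(ttest_rel(ml, ls).pvalue)              # all six columns: p ≈ 0.020
keep = [0, 1, 3, 4, 5]                       # drop the Beta column (index 2)
print(ttest_rel(ml[keep], ls[keep]).pvalue)  # p ≈ 0.054
```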


Conclusions

LS is highly dependent on the choice of empirical density estimate (kernel or histogram). LS and ML are very close except for the Beta case, ML reaches higher likelihood values, and LS is efficient from a computational point of view.


Ongoing work

Refining the LS estimation. A more exhaustive comparison with ML. Extension to the conditional case. A possible mixed ML/LS approach.
