Likelihood-Based Statistical Decisions Marco Cattaneo Seminar for - - PowerPoint PPT Presentation



SLIDE 1

Likelihood-Based Statistical Decisions

Marco Cattaneo
Seminar for Statistics, ETH Zürich, Switzerland
July 23, 2005

SLIDE 4

Likelihood Function

set of statistical models {P_θ : θ ∈ Θ}

observation A

⇝ likelihood function lik : θ ↦ P_θ(A)

The likelihood function lik measures the relative plausibility of the models P_θ, on the basis of the observation A alone.

The likelihood function lik is not calibrated: only ratios lik(θ_1)/lik(θ_2) are well determined.

Example. X ∼ Binomial(n, θ), n = 5, θ ∈ Θ = [0, 1]
x = 3 ⇒ lik(θ) ∝ θ^3 (1 − θ)^2

[Figure: lik(θ) plotted against θ ∈ [0, 1]]
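
As a minimal sketch of the binomial example, the likelihood can be tabulated on a grid over Θ = [0, 1]. The grid resolution and the names `binomial_likelihood`, `grid`, and `theta_ml` are illustrative assumptions, not part of the slides. The last lines illustrate the "not calibrated" remark: the binomial coefficient cancels in any ratio, so the kernel θ^3 (1 − θ)^2 carries the same information.

```python
from math import comb

def binomial_likelihood(theta, n=5, x=3):
    """P_theta(X = x) for X ~ Binomial(n, theta), viewed as a function of theta."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Tabulate the likelihood on an (assumed) equally spaced grid over [0, 1].
grid = [i / 1000 for i in range(1001)]
lik = [binomial_likelihood(t) for t in grid]

# The grid maximizer is the maximum likelihood estimate x/n.
theta_ml = grid[lik.index(max(lik))]

# Only ratios are well determined: the constant comb(n, x) cancels.
ratio = binomial_likelihood(0.6) / binomial_likelihood(0.3)
kernel_ratio = (0.6**3 * 0.4**2) / (0.3**3 * 0.7**2)

print(theta_ml, ratio, kernel_ratio)
```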

SLIDE 7

Statistical Decision Problem

set of statistical models {P_θ : θ ∈ Θ}
set of possible decisions D
loss function L : Θ × D → [0, ∞)

L(θ, d) is the loss we would incur, according to the model P_θ, by making the decision d.

observation A ⇝ likelihood function lik on Θ

MPL criterion: minimize sup_θ lik(θ) L(θ, d)
minimax criterion: minimize sup_θ L(θ, d)

MPL = minimax if lik is constant (i.e., complete ignorance about Θ)

MPL: Minimax Plausibility-weighted Loss
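
The two criteria can be compared numerically with a small grid search. This is only a sketch under stated assumptions: the suprema are approximated by maxima over finite grids, and the loss |d − θ| and the helper name `mpl_decision` are hypothetical choices made for illustration. With a constant likelihood the search returns the minimax decision. With the binomial kernel θ^3 (1 − θ)^2 the decision moves toward the more plausible models.

```python
def mpl_decision(thetas, lik, decisions, loss):
    """Return the decision d minimizing max over theta of lik(theta) * loss(theta, d)."""
    weights = [lik(t) for t in thetas]

    def worst_case(d):
        return max(w * loss(t, d) for t, w in zip(thetas, weights))

    return min(decisions, key=worst_case)

# Assumed grids for the parameter and the decision, both on [0, 1].
thetas = [i / 200 for i in range(201)]
decisions = [i / 200 for i in range(201)]

def abs_loss(theta, d):
    # Hypothetical loss: absolute error when estimating theta itself.
    return abs(d - theta)

# Constant likelihood (complete ignorance): MPL coincides with minimax.
d_minimax = mpl_decision(thetas, lambda t: 1.0, decisions, abs_loss)

# Likelihood kernel of the binomial example (n = 5, x = 3).
d_mpl = mpl_decision(thetas, lambda t: t**3 * (1 - t)**2, decisions, abs_loss)

print(d_minimax, d_mpl)
```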

SLIDE 10

Example

lik(θ) ∝ θ^3 (1 − θ)^2
L(θ, d) = |d − θ^2|
d_ML = 0.36, d_MPL ≈ 0.385, d_BU ≈ 0.335

[Figure: lik(θ) plotted against θ ∈ [0, 1]]

τ = θ^2, lik(τ) ∝ τ^(3/2) (1 − √τ)^2
L(τ, d) = |d − τ|
d_ML = 0.36, d_MPL ≈ 0.385, d_BU ≈ 0.404

[Figure: lik(τ) plotted against τ ∈ [0, 1]]

L(τ, d) = 2 |d − τ| if d ≤ τ
L(τ, d) = |d − τ| if d ≥ τ
d_ML = 0.36, d_MPL ≈ 0.468, d_BU ≈ 0.502 (d_BU ≈ 0.435 using θ)
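
The numbers in this example can be approximated with a plain grid search. The sketch below rests on assumptions that the slides do not state: d_BU is read as the Bayes decision under a uniform prior (on θ or on τ), suprema and quantiles are approximated on a grid of step 0.001, and all function names are illustrative. For absolute loss the Bayes decision is the posterior median. For the asymmetric loss it is the posterior 2/3 quantile, because underestimation costs twice as much as overestimation.

```python
N = 1000
grid = [i / N for i in range(N + 1)]  # shared grid on [0, 1]

# Likelihood kernels in the theta and tau = theta^2 parametrizations.
lik_theta = [t**3 * (1 - t)**2 for t in grid]
lik_tau = [u**1.5 * (1 - u**0.5)**2 for u in grid]
theta_sq = [t * t for t in grid]  # tau values of the theta grid

def abs_loss(tau, d):
    return abs(d - tau)

def asym_loss(tau, d):
    # Underestimating tau (d <= tau) costs twice as much as overestimating it.
    return 2 * (tau - d) if d <= tau else d - tau

def mpl(taus, weights, loss):
    """Grid minimizer over d of max_i weights[i] * loss(taus[i], d)."""
    pairs = list(zip(taus, weights))
    return min(grid, key=lambda d: max(w * loss(u, d) for u, w in pairs))

def quantile(points, density, p):
    """Approximate p-quantile of a density tabulated on an equally spaced grid."""
    target = p * sum(density)
    acc = 0.0
    for x, w in zip(points, density):
        acc += w
        if acc >= target:
            return x
    return points[-1]

# Maximum likelihood: plug the maximizer of lik into theta^2.
d_ml = grid[lik_theta.index(max(lik_theta))] ** 2

# MPL decisions: the same criterion evaluated in both parametrizations.
d_mpl_theta = mpl(theta_sq, lik_theta, abs_loss)
d_mpl_tau = mpl(grid, lik_tau, abs_loss)
d_mpl_asym = mpl(theta_sq, lik_theta, asym_loss)

# Bayes decisions under a uniform prior on theta or on tau (assumed meaning of d_BU).
d_bu_theta = quantile(grid, lik_theta, 0.5) ** 2
d_bu_tau = quantile(grid, lik_tau, 0.5)
d_bu_theta_asym = quantile(grid, lik_theta, 2 / 3) ** 2
d_bu_tau_asym = quantile(grid, lik_tau, 2 / 3)

print(d_ml, d_mpl_theta, d_mpl_tau, d_bu_theta, d_bu_tau)
print(d_mpl_asym, d_bu_tau_asym, d_bu_theta_asym)
```

Up to grid resolution, the MPL decision is the same in both parametrizations, whereas the uniform-prior Bayes decision depends on whether the prior is uniform in θ or in τ.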

SLIDE 13

Relative Plausibility

The likelihood function can be easily updated by multiplying it by the (conditional) likelihood functions based on the new observations.

Prior information can be encoded in a “prior likelihood function” assumed to be based on past (independent) observations.

The relative plausibility is the extension of the likelihood function to the subsets H of Θ by means of the supremum: rp(H) ∝ sup_{θ∈H} lik(θ).

The relative plausibility is thus a quantitative description of the uncertain knowledge about the models P_θ that can start with complete ignorance or with prior information, that can be easily updated when new data are observed, and that can be used for inference and decision making.
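
Updating by multiplication and the supremum extension can both be sketched on a grid. The second observation (1 success in 5 further trials) is a hypothetical data set introduced only for illustration, and the normalization rp(Θ) = 1 is an assumed convention, since rp is defined only up to a constant factor.

```python
def kernel(x, n):
    """Binomial likelihood kernel theta^x (1 - theta)^(n - x)."""
    return lambda t: t**x * (1 - t)**(n - x)

grid = [i / 1000 for i in range(1001)]

first = kernel(3, 5)   # 3 successes in 5 trials (the binomial example)
second = kernel(1, 5)  # hypothetical new, independent observation

# Updating: multiply the likelihood functions pointwise.
combined = [first(t) * second(t) for t in grid]
# Same as the likelihood of the pooled data: 4 successes in 10 trials.
pooled = [kernel(4, 10)(t) for t in grid]

def relative_plausibility(lik_values, in_H):
    """rp(H) = max of lik over H, divided by max of lik over the whole grid."""
    top = max(lik_values)
    return max(v for t, v in zip(grid, lik_values) if in_H(t)) / top

lik1 = [first(t) for t in grid]
rp_low = relative_plausibility(lik1, lambda t: t <= 0.5)   # H = [0, 0.5]
rp_high = relative_plausibility(lik1, lambda t: t >= 0.5)  # H = [0.5, 1]

print(rp_low, rp_high)
```

Because the extension uses the supremum rather than a sum, rp(H) and rp of the complement of H need not add up to 1. Here the set containing the maximum likelihood estimate has relative plausibility 1.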
SLIDE 16

Imprecise Probabilities

The relative plausibility is a non-calibrated possibility measure on Θ.

MPL criterion: minimize sup_θ rp{θ} L(θ, d)
sup_θ rp{θ} L(θ, d) is the Shilkret integral of L(·, d) with respect to rp.

If Γ is a set of probability measures on Θ, the consideration of the (second-order) relative plausibility on Γ leads to a non-calibrated possibilistic hierarchical model, which allows non-vacuous conclusions even if Γ is the set of all probability measures on Θ.
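
The link between the MPL objective and the Shilkret integral can be checked numerically on a finite grid. The sketch assumes the usual definition of the Shilkret integral of a nonnegative function f with respect to a measure μ, namely sup over t > 0 of t · μ{f ≥ t}, and takes μ to be the maxitive measure generated by the normalized likelihood. The decision d = 0.385 and the grid are illustrative choices.

```python
from math import isclose

grid = [i / 200 for i in range(201)]

# Normalized likelihood of the binomial example, used as possibility density rp{theta}.
raw = [t**3 * (1 - t)**2 for t in grid]
rp_density = [v / max(raw) for v in raw]

# Loss L(theta, d) = |d - theta^2| at a fixed, illustrative decision d.
d = 0.385
loss_values = [abs(d - t * t) for t in grid]

def shilkret_integral(f_values, possibility):
    """sup over t of t * mu{f >= t}, where mu(H) is the max of `possibility` over H."""
    best = 0.0
    for t in set(f_values):
        # mu of the level set {f >= t}; nonempty because t is a value of f.
        mu_level = max(p for f, p in zip(f_values, possibility) if f >= t)
        best = max(best, t * mu_level)
    return best

integral = shilkret_integral(loss_values, rp_density)
mpl_objective = max(p * f for p, f in zip(rp_density, loss_values))

print(integral, mpl_objective)
```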

SLIDE 24

Properties

The relative plausibility and the MPL criterion:

  • are simple and intuitive.
  • are parametrization invariant.
  • lead to decision functions that are equivariant (if the problem is invariant) and asymptotically optimal (if some regularity conditions are satisfied).
  • satisfy the strong likelihood principle.
  • can use pseudo likelihood functions.
  • can represent complete (or partial) ignorance.
  • can handle prior information in a natural way.
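
One of these properties, the strong likelihood principle, can be illustrated with a small sketch. Two sampling designs with proportional likelihood functions should give the same MPL decision, because a constant factor does not change the minimizer. The negative binomial design below (sampling until 2 failures, with 3 successes observed) is a hypothetical comparison case not taken from the slides, and the grid and the helper `mpl` are illustrative.

```python
from math import comb

grid = [i / 200 for i in range(201)]

def binomial_lik(t):
    # 3 successes in a fixed number n = 5 of trials.
    return comb(5, 3) * t**3 * (1 - t)**2

def negative_binomial_lik(t):
    # Hypothetical design: sample until 2 failures; 3 successes were observed.
    return comb(4, 3) * t**3 * (1 - t)**2

def loss(theta, d):
    # Loss of the example slides: absolute error when estimating theta^2.
    return abs(d - theta * theta)

def mpl(lik):
    """Grid minimizer over d of max over theta of lik(theta) * loss(theta, d)."""
    weights = [lik(t) for t in grid]
    return min(grid, key=lambda d: max(w * loss(t, d) for t, w in zip(grid, weights)))

d_binomial = mpl(binomial_lik)
d_negative_binomial = mpl(negative_binomial_lik)

print(d_binomial, d_negative_binomial)
```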
SLIDE 25

Example

Estimation of the variance components in the 3 × 3 random effect one-way layout, under normality assumptions and weighted squared error loss.

[Figure, two panels, each plotted against SSa/(SSa+SSe). First panel: v̂e/(SSa+SSe), with curves MPL, ANOVA = ANOVA+ = MINQU, ML, ReML. Second panel: v̂a/(SSa+SSe), with curves MPL, ANOVA, ML, ReML = ANOVA+, nonneg. MINQ min. bias.]

SLIDE 26

Example

[Figure, two panels, each plotted against Va/(Va+Ve). First panel: 3 E[(v̂e − ve)^2] / ve^2, with curves MPL, ANOVA = ANOVA+ = MINQU, ML, ReML. Second panel: E[(v̂a − va)^2] / (va + (1/3) ve)^2, with curves MPL, ANOVA, ML, ReML = ANOVA+, nonneg. MINQ min. bias.]