Likelihood-Based Statistical Decisions
Marco Cattaneo
Seminar for Statistics, ETH Zürich, Switzerland
July 23, 2005

Likelihood Function
set of statistical models {P_θ : θ ∈ Θ}
observation A ⇝ likelihood function lik : θ ↦ P_θ(A)
The likelihood function lik measures the relative plausibility of the models P_θ, on the basis of the observation A alone.
The likelihood function lik is not calibrated: only ratios lik(θ1)/lik(θ2) are well determined.
Example. X ∼ Binomial(n, θ), n = 5, θ ∈ Θ = [0, 1]
x = 3 ⇒ lik(θ) ∝ θ³(1 − θ)²
[Figure: plot of lik(θ) for θ from 0 to 1]
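A minimal Python sketch of the binomial example above, illustrating why only likelihood ratios are well determined: the binomial coefficient cancels in any ratio, so the kernel θ³(1 − θ)² carries the same information as the full probability. The function names, the comparison points 0.6 and 0.3, and the grid are illustrative choices, not part of the slides.

```python
from math import comb

def binomial_likelihood(theta, n=5, x=3):
    """P_theta(X = x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def kernel(theta):
    """The same likelihood up to a constant factor: theta^3 (1 - theta)^2."""
    return theta**3 * (1 - theta)**2

# The constant factor comb(5, 3) cancels, so both ratios agree.
ratio_full = binomial_likelihood(0.6) / binomial_likelihood(0.3)
ratio_kernel = kernel(0.6) / kernel(0.3)

# Most plausible model on a grid of theta values (illustrative resolution).
grid = [i / 1000 for i in range(1001)]
theta_hat = max(grid, key=kernel)

print(ratio_full, ratio_kernel, theta_hat)
```

The two printed ratios coincide up to rounding, and the grid maximizer is the maximum likelihood estimate x/n = 3/5.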

Statistical Decision Problem
set of statistical models {P_θ : θ ∈ Θ}
set of possible decisions D
loss function L : Θ × D → [0, ∞)
L(θ, d) is the loss we would incur, according to the model P_θ, by making the decision d.
observation A ⇝ likelihood function lik on Θ
MPL criterion: minimize sup_θ lik(θ) L(θ, d)
minimax criterion: minimize sup_θ L(θ, d)
MPL = minimax if lik is constant (i.e., complete ignorance about Θ)
MPL: Minimax Plausibility-weighted Loss
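The two criteria can be compared on a small finite problem. The sketch below assumes three models and three decisions with made-up loss and likelihood values (all numbers and names are hypothetical, chosen only so that the two criteria disagree); with a constant likelihood the same function reduces to minimax.

```python
def mpl_decision(lik, loss):
    """Return the decision d minimizing max over theta of lik[theta] * loss[theta][d]."""
    decisions = list(next(iter(loss.values())))

    def worst_weighted_loss(d):
        return max(lik[theta] * loss[theta][d] for theta in lik)

    return min(decisions, key=worst_weighted_loss)

# Hypothetical loss table L(theta, d).
loss = {
    "theta1": {"d1": 0.0, "d2": 4.0, "d3": 3.0},
    "theta2": {"d1": 5.0, "d2": 1.0, "d3": 3.0},
    "theta3": {"d1": 9.0, "d2": 4.0, "d3": 0.0},
}

# Hypothetical likelihood values; only their ratios matter.
lik = {"theta1": 0.2, "theta2": 1.0, "theta3": 0.5}

# Constant likelihood (complete ignorance): MPL coincides with minimax.
constant = {theta: 1.0 for theta in loss}

print(mpl_decision(lik, loss))       # MPL decision
print(mpl_decision(constant, loss))  # minimax decision
```

Rescaling lik by any positive constant leaves the chosen decision unchanged, which is why the lack of calibration of the likelihood function does not matter for the MPL criterion.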

Example
lik(θ) ∝ θ³(1 − θ)², L(θ, d) = |d − θ²|
d_ML = 0.36, d_MPL ≈ 0.385, d_BU ≈ 0.335
[Figure: plot over θ from 0 to 1]
τ = θ², lik(τ) ∝ τ^(3/2) (1 − √τ)², L(τ, d) = |d − τ|
d_ML = 0.36, d_MPL ≈ 0.385, d_BU ≈ 0.404
[Figure: plot over τ from 0 to 1]
L(τ, d) = 2|d − τ| if d ≤ τ, |d − τ| if d ≥ τ
d_ML = 0.36, d_MPL ≈ 0.468, d_BU ≈ 0.502 (d_BU ≈ 0.435 using θ)
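The MPL values in this example can be approximated numerically. The following is a rough sketch, not the author's computation: it assumes a uniform grid over θ to approximate the supremum and uses a ternary search over d, which is justified because the supremum of functions convex in d is itself convex in d. All helper names are illustrative.

```python
def lik(theta):
    """Likelihood kernel for x = 3 successes in n = 5 trials."""
    return theta**3 * (1 - theta)**2

# Grid over Theta = [0, 1] used to approximate the supremum (assumed resolution).
THETAS = [i / 2000 for i in range(2001)]

def worst_weighted_loss(d, loss):
    """sup over theta of lik(theta) * L(tau, d), with tau = theta^2."""
    return max(lik(t) * loss(t * t, d) for t in THETAS)

def minimize_convex(f, lo=0.0, hi=1.0, iters=60):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def abs_loss(tau, d):
    return abs(d - tau)

def asym_loss(tau, d):
    # Underestimating tau costs twice as much as overestimating it.
    return 2 * abs(d - tau) if d <= tau else abs(d - tau)

d_ml = max(THETAS, key=lik) ** 2
d_mpl = minimize_convex(lambda d: worst_weighted_loss(d, abs_loss))
d_mpl_asym = minimize_convex(lambda d: worst_weighted_loss(d, asym_loss))

print(round(d_ml, 3), round(d_mpl, 3), round(d_mpl_asym, 3))
```

Under these assumptions the results should come out close to the values d_ML = 0.36, d_MPL ≈ 0.385 and d_MPL ≈ 0.468 quoted above.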

Relative Plausibility
The likelihood function can be easily updated by multiplying it with the (conditional) likelihood functions based on the new observations.
Prior information can be encoded in a “prior likelihood function” assumed to be based on past (independent) observations.
The relative plausibility is the extension of the likelihood function to the subsets H of Θ by means of the supremum: rp(H) ∝ sup_{θ ∈ H} lik(θ).
The relative plausibility is thus a quantitative description of the uncertain knowledge about the models P_θ, that can start with complete ignorance or with prior information, that can be easily updated when new data are observed, and that can be used for inference and decision making.
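A small sketch of both ideas, reusing the binomial likelihood from the earlier example. It assumes interval hypotheses H = [lo, hi], a grid approximation of the supremum, and the normalization rp(Θ) = 1 (a convenience only, since rp is defined up to a constant). The second batch of data, 1 success in 4 further independent trials, is hypothetical.

```python
def lik_first(theta):
    """Kernel for the first observation: 3 successes in 5 trials."""
    return theta**3 * (1 - theta)**2

def lik_new(theta):
    """Kernel for hypothetical new data: 1 success in 4 further trials."""
    return theta * (1 - theta)**3

def lik_updated(theta):
    """Updating: multiply the old likelihood by the new one."""
    return lik_first(theta) * lik_new(theta)

GRID = [i / 1000 for i in range(1001)]

def rp(lik, lo, hi):
    """Relative plausibility of H = [lo, hi], scaled so that rp(Theta) = 1."""
    top = max(lik(t) for t in GRID)
    return max(lik(t) for t in GRID if lo <= t <= hi) / top

# Before the update, theta >= 0.5 contains the most plausible model (0.6).
print(rp(lik_first, 0.0, 0.5), rp(lik_first, 0.5, 1.0))

# After the update the maximum moves to 4/9, so theta <= 0.5 becomes fully plausible.
print(rp(lik_updated, 0.0, 0.5), rp(lik_updated, 0.5, 1.0))
```

Because rp is defined through a supremum, the plausibility of a union of hypotheses is the maximum of their plausibilities rather than their sum.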

Imprecise Probabilities
The relative plausibility is a non-calibrated possibility measure on Θ.
MPL criterion: minimize sup_θ rp{θ} L(θ, d), where sup_θ rp{θ} L(θ, d) is the Shilkret integral of L(·, d) with respect to rp.
If Γ is a set of probability measures on Θ, the consideration of the (second-order) relative plausibility on Γ leads to a non-calibrated possibilistic hierarchical model, which allows non-vacuous conclusions even if Γ is the set of all probability measures on Θ.
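For reference, the identification with the Shilkret integral can be written out as follows. This uses the usual definition of the Shilkret integral of a nonnegative function with respect to a maxitive measure, which the slides do not state explicitly; the second equality relies on rp(H) = sup_{θ ∈ H} rp{θ}.

```latex
\[
  (\mathrm{Sh})\!\int L(\cdot, d)\,\mathrm{d}rp
  \;=\; \sup_{t > 0} \; t \cdot rp\{\theta \in \Theta : L(\theta, d) \ge t\}
  \;=\; \sup_{\theta \in \Theta} \; rp\{\theta\}\, L(\theta, d).
\]
```

The second equality follows by exchanging the two suprema: for each fixed θ, the largest admissible t is L(θ, d).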

Properties
The relative plausibility and the MPL criterion:
• are simple and intuitive.
• are parametrization invariant.
• lead to decision functions that are equivariant (if the problem is invariant) and asymptotically optimal (if some regularity conditions are satisfied).
• satisfy the strong likelihood principle.
• can use pseudo likelihood functions.
• can represent complete (or partial) ignorance.
• can handle prior information in a natural way.
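Parametrization invariance can be checked numerically on the earlier binomial example with loss |d − θ²|. The sketch below computes the MPL decision once over θ and once over τ = θ², and contrasts it with a Bayes decision under a uniform prior placed on θ or on τ. Reading d_BU as "Bayes decision with uniform prior" is an assumption (the slides do not expand the abbreviation), as are the grid, the search routine and all names.

```python
from math import comb, sqrt

GRID = [i / 2000 for i in range(2001)]

def minimize_convex(f, lo=0.0, hi=1.0, iters=60):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def mpl_decision(lik, target):
    """Minimize over d the sup over the grid of lik(p) * |d - target(p)|."""
    return minimize_convex(lambda d: max(lik(p) * abs(d - target(p)) for p in GRID))

# MPL in the theta parametrization and in the tau = theta^2 parametrization.
d_mpl_theta = mpl_decision(lambda t: t**3 * (1 - t)**2, lambda t: t * t)
d_mpl_tau = mpl_decision(lambda u: u**1.5 * (1 - sqrt(u))**2, lambda u: u)

def beta_cdf(x, a, b):
    """CDF of Beta(a, b) for integer a, b, via the binomial tail identity."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))

def beta_median(a, b):
    """Median of Beta(a, b) by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Under absolute error loss the Bayes decision is the posterior median of theta^2,
# i.e. the squared posterior median of theta.
# Uniform prior on theta: posterior Beta(4, 3).
d_bu_theta = beta_median(4, 3) ** 2
# Uniform prior on tau: induced prior density 2 * theta, posterior Beta(5, 3).
d_bu_tau = beta_median(5, 3) ** 2

print(round(d_mpl_theta, 3), round(d_mpl_tau, 3))
print(round(d_bu_theta, 3), round(d_bu_tau, 3))
```

The two MPL values agree up to grid error, whereas the two Bayes values differ and should land near the quoted 0.335 and 0.404, because a prior that is uniform in θ is not uniform in τ.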

Example
Estimation of the variance components in the 3 × 3 random effect one-way layout, under normality assumptions and weighted squared error loss.
[Figure: two panels plotting the estimates v̂e/(SSa + SSe) and v̂a/(SSa + SSe) against SSa/(SSa + SSe); legend entries: MPL, ANOVA, ANOVA+, MINQU, ML, ReML, nonneg. MINQ min. bias]

Example
[Figure: two panels plotting the scaled expected squared errors E[(v̂e − ve)²] and E[(v̂a − va)²] of the estimators against Va/(Va + Ve); legend entries: MPL, ANOVA, ANOVA+, MINQU, ML, ReML, nonneg. MINQ min. bias]
