
Faculty of Life Sciences

Frequentist and Bayesian statistics

Claus Ekstrøm

E-mail: ekstrom@life.ku.dk

Outline

1 Frequentists and Bayesians

  • What is a probability?
  • Interpretation of results / inference

2 Comparisons

3 Markov chain Monte Carlo

Slide 2— PhD (Aug 23rd 2011) — Frequentist and Bayesian statistics

What is a probability?

Two schools in statistics: frequentists and Bayesians.


Frequentist school

School of Jerzy Neyman, Egon Pearson and Ronald Fisher.


Bayesian school

“School” of Thomas Bayes

$$ P(H \mid D) = \frac{P(D \mid H) \cdot P(H)}{\int P(D \mid H) \cdot P(H) \, dH} $$


Frequentists

Frequentists talk about probabilities in relation to experiments with a random component. The relative frequency of an event, A, is defined as

$$ P(A) = \frac{\text{number of outcomes consistent with } A}{\text{number of experiments}} $$

The probability of event A is the limiting relative frequency.

[Figure: relative frequency (0.0 to 1.0) plotted against the number of experiments, n (up to 100).]
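As a minimal sketch of the limiting relative frequency, the following Python snippet simulates repeated experiments and tracks the running relative frequency of an event A. The event probability of 0.3, the seed and the function name `running_relative_frequency` are illustrative assumptions, not values from the slides.

```python
import random


def running_relative_frequency(p_true, n_experiments, seed=1):
    """Return the relative frequency of event A after each experiment."""
    rng = random.Random(seed)
    hits = 0
    freqs = []
    for n in range(1, n_experiments + 1):
        # Event A occurs with probability p_true in each independent experiment.
        if rng.random() < p_true:
            hits += 1
        freqs.append(hits / n)
    return freqs


freqs = running_relative_frequency(p_true=0.3, n_experiments=100_000)
for n in (10, 100, 1_000, 100_000):
    print(n, round(freqs[n - 1], 4))
```

For small n the relative frequency fluctuates considerably; as n grows it should settle close to the assumed probability.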


Frequentists — 2

The definition restricts the things we can assign probabilities to: What is the probability of there being life on Mars 100 billion years ago?

We assume that there is an unknown but fixed underlying parameter, θ, for a population (e.g., the mean height of Danish men).

Random variation (environmental factors, measurement errors, ...) means that each observation does not result in the true value.


The meta-experiment idea

Frequentists think of meta-experiments and consider the current dataset as a single realization from all possible datasets.

[Figure: successive realizations give 167.2 cm, 175.5 cm, 187.7 cm and 182.0 cm.]
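The meta-experiment can be sketched in a few lines of Python, assuming each dataset is summarised by its sample mean. The population values (mean 180 cm, standard deviation 7 cm), the sample size and the seed are hypothetical choices for illustration only.

```python
import random
import statistics

# Hypothetical population: heights with mean 180 cm and SD 7 cm (assumed values).
POP_MEAN, POP_SD, SAMPLE_SIZE = 180.0, 7.0, 25


def sample_mean(rng):
    """Draw one dataset and return its sample mean."""
    data = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(data)


rng = random.Random(2011)
# Four realizations of the meta-experiment, each from a fresh dataset.
means = [sample_mean(rng) for _ in range(4)]
for m in means:
    print(f"{m:.1f} cm")
```

Each run of `sample_mean` stands for one possible dataset; the parameter stays fixed while the estimate varies from realization to realization.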


Confidence intervals

Thus a frequentist believes that a population mean is real, but unknown, and unknowable, and can only be estimated from the data. Knowing the distribution for the sample mean, he constructs a confidence interval, centered at the sample mean.

  • Either the true mean is in the interval or it is not. We can’t say there’s a 95% probability (long-run fraction having this characteristic) that the true mean is in this interval, because it’s either already in, or it’s not.
  • Reason: the true mean is a fixed value, which doesn’t have a distribution.
  • The sample mean does have a distribution! Thus we must use statements like “95% of similar intervals would contain the true mean, if each interval were constructed from a different random sample like this one.”
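The "95% of similar intervals" statement can be checked by simulation. The sketch below assumes a normal population with known standard deviation, so that a z-based interval applies; the population values, sample size and seed are illustrative assumptions.

```python
import math
import random
import statistics

TRUE_MEAN, TRUE_SD, N = 180.0, 7.0, 30  # assumed population and sample size
Z_975 = 1.96  # normal quantile for a two-sided 95% interval (known SD assumed)


def interval_covers(rng):
    """Build one 95% interval from a fresh sample; report whether it covers TRUE_MEAN."""
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    centre = statistics.fmean(sample)
    half_width = Z_975 * TRUE_SD / math.sqrt(N)
    return centre - half_width <= TRUE_MEAN <= centre + half_width


rng = random.Random(42)
n_repeats = 10_000
coverage = sum(interval_covers(rng) for _ in range(n_repeats)) / n_repeats
print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

The true mean never moves in this simulation; only the intervals do. The 95% describes the procedure over many samples, not any single interval.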


Maximum likelihood

How will the frequentist estimate the parameter? Answer: maximum likelihood.

Basic idea

Our best estimates of the parameter(s) are the one(s) that make our observed data most likely. We know what we have observed so far (our data). Our best “guess” would therefore be to select parameters that make our observations most likely.

Binomial distribution:

$$ P(Y = y) = \binom{n}{y} p^{y} (1-p)^{n-y} $$
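To make the idea concrete, here is a minimal sketch that maximises the binomial likelihood over a grid of candidate values for p and compares the result with the closed-form estimate y/n. The data (7 successes in 20 trials) and the grid resolution are hypothetical.

```python
import math


def binomial_log_likelihood(p, n, y):
    """Log of P(Y = y) for a binomial(n, p) model."""
    return math.log(math.comb(n, y)) + y * math.log(p) + (n - y) * math.log(1 - p)


n, y = 20, 7  # hypothetical data: 7 successes in 20 trials

# Evaluate the log-likelihood on a grid of candidate values for p (endpoints excluded).
grid = [i / 1000 for i in range(1, 1000)]
p_hat_grid = max(grid, key=lambda p: binomial_log_likelihood(p, n, y))

p_hat_closed_form = y / n  # the analytical maximiser of the binomial likelihood
print(p_hat_grid, p_hat_closed_form)
```

The grid search is only for illustration; for the binomial model the maximiser is available analytically, and the two values should agree up to the grid spacing.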


Bayesians

  • Each investigator is entitled to his or her personal belief ... the prior information.
  • No fixed values for parameters but a distribution.
  • All distributions are subjective. Yours is as good as mine.
  • Can still talk about the mean, but it is the mean of my distribution.
  • In many cases one tries to circumvent the subjectivity by using vague priors.

Thumb tack pin pointing down:

[Figure: prior distribution plotted against Theta, for Theta between 0 and 1.]


Credibility intervals

Bayesians have an altogether different world-view. They say that only the data are real. The population mean is an abstraction, and as such some values are more believable than others based on the data and their prior beliefs.

The Bayesian constructs a credibility interval, centered near the sample mean, but tempered by “prior” beliefs concerning the mean.

Now the Bayesian can say what the frequentist cannot: “There is a 95% probability (degree of believability) that this interval contains the mean.”
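A credibility interval can be sketched with a simple grid approximation of the posterior. The example below uses a proportion rather than a mean, in the spirit of the thumb tack; the Beta(2, 2) prior, the data (7 "pin down" outcomes in 20 tosses) and the grid size are assumptions made for illustration, not values from the slides.

```python
# Hypothetical thumb-tack data: 7 "pin down" outcomes in 20 tosses.
n, y = 20, 7
# Assumed Beta(2, 2) prior on theta; its density is proportional to theta * (1 - theta).
a, b = 2.0, 2.0

N_GRID = 10_000
thetas = [(i + 0.5) / N_GRID for i in range(N_GRID)]  # midpoints of a grid on (0, 1)

# Bayes' theorem on the grid: posterior is proportional to likelihood times prior.
unnormalised = [t ** (y + a - 1) * (1 - t) ** (n - y + b - 1) for t in thetas]
total = sum(unnormalised)  # plays the role of the normalising integral
posterior = [u / total for u in unnormalised]

posterior_mean = sum(t * w for t, w in zip(thetas, posterior))


def quantile(prob):
    """Smallest grid value whose cumulative posterior mass reaches prob."""
    cumulative = 0.0
    for t, w in zip(thetas, posterior):
        cumulative += w
        if cumulative >= prob:
            return t
    return thetas[-1]


lower, upper = quantile(0.025), quantile(0.975)
print(f"posterior mean {posterior_mean:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

With these assumptions the posterior mean lies between the prior mean (0.5) and the observed proportion (0.35), which is the "tempering" by prior beliefs described above. The interval is an equal-tailed one; other constructions, such as highest posterior density intervals, are also common.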


Comparison

              Advantages                                    Disadvantages
Frequentist   Objective                                     Confidence intervals (not quite the desired)
              Calculations
Bayesian      Credibility intervals (usually the desired)   Subjective
              Complex models                                Calculations


In summary

  • A frequentist is a person whose long-run ambition is to be wrong 5% of the time.
  • A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.

“A frequentist uses impeccable logic to answer the wrong question, while a Bayesian answers the right question by making assumptions that nobody can fully believe in.” (P. G. Hamer)


Jury duty


Example: speed of light

What is the speed of light in vacuum “really”?

Results (m/s):

299792459.2
299792460.0
299792456.3
299792458.1
299792459.5
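The slide does not say how these measurements are to be analysed. As one possible frequentist treatment, and only as a sketch, the snippet below computes the sample mean and a 95% confidence interval based on Student's t distribution, assuming the five results are independent and approximately normally distributed.

```python
import math
import statistics

# Measurements copied from the slide (m/s).
results = [299792459.2, 299792460.0, 299792456.3, 299792458.1, 299792459.5]

n = len(results)
mean = statistics.fmean(results)
std_err = statistics.stdev(results) / math.sqrt(n)

T_975_DF4 = 2.776  # 97.5% quantile of Student's t with n - 1 = 4 degrees of freedom
lower = mean - T_975_DF4 * std_err
upper = mean + T_975_DF4 * std_err
print(f"mean {mean:.2f} m/s, 95% CI ({lower:.2f}, {upper:.2f})")
```

In the frequentist reading the speed of light is a fixed value, and it is the interval that would change if the five measurements were repeated.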
