Identifying Parametric Prior Distributions
Stephanie Kovalchik
UCLA, Department of Biostatistics
UseR! 2008 Conference, August 14, 2008

In a Bayesian analysis the statistician must specify prior densities for the model parameters. If the statistician is bold enough to choose an informative prior for the model parameter θ, then this prior should represent beliefs about θ well.

Why be bold?
If there are experts or historical data that have accurate information about the model parameters, then we should use this information in choosing a prior, as it will make our posterior inferences more precise. This point is well made by Garthwaite, Kadane and O’Hagan: ‘An aim of much statistical research is to wring as much from data as we possibly can, but using expert opinion better (or using it at all) could add more information than slight improvement in efficiency through better techniques of data analysis’ (p. 698, 2005).

Being bold...
The challenge for the statistician is that beliefs are most commonly expressed as typical values, an average, or a set of cumulative probabilities. It is not always clear how to translate these beliefs into a specific density.

We need tools to help in this translation process.
Belief → p(θ)
As a start towards this end, a set of R functions has been written to identify a prior distribution for θ when a continuous parametric density can adequately represent prior beliefs about θ.

Available Densities
◮ Normal
◮ Beta
◮ Gamma
◮ Inverse-Gamma
◮ Student’s T

General Function Syntax
For a prior density f with parameters α, β, the function form is f.prior(args), which returns the vector (α, β) and, if desired, a plot.

What are the args?
Each function attempts to find a matching prior based on some combination of mean, mode, variance and coverage probabilities. Essentially, the functions are an R version of the modal interval approach discussed by Garthwaite, Kadane and O’Hagan (2005).

Usage by example
Consider the following problem posed by Jim Albert: ‘If p denotes the probability of flipping a head, then your ‘best guess’ at p is .5. Moreover, you believe that it is highly likely that the coin is close to fair, which you quantify by P(.44 < p < .56) = .9’ (p. 55, 2007).

If we want a Beta prior to reflect our beliefs about p, what are the appropriate parameter settings? If we take ‘best guess’ to mean the mode, then we could use the following as a first attempt at identifying a suitable Beta.

    beta.prior(mode=.5,p=.05,q=.44)

Here q is the specified quantile and p its associated probability, so the call encodes P(p < .44) = .05 for the coin probability p.
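The search that such a call has to perform can be sketched in base R. This is only an illustration of the idea of matching a mode and one quantile, not the author's implementation of beta.prior; the helper name beta_from_mode_quantile and the search interval for the root finder are assumptions made here.

```r
# Hypothetical helper (not the author's beta.prior): find Beta(alpha, beta)
# with a given mode and a single quantile constraint P(theta < q) = p.
beta_from_mode_quantile <- function(mode, q, p) {
  # For alpha, beta > 1 the mode is (alpha - 1) / (alpha + beta - 2).
  # Writing k = alpha + beta - 2 fixes the mode and leaves only k unknown.
  params <- function(k) c(alpha = 1 + mode * k, beta = 1 + (1 - mode) * k)
  f <- function(k) {
    ab <- params(k)
    pbeta(q, ab[["alpha"]], ab[["beta"]]) - p
  }
  # Assumed search range for k; a larger k gives a more concentrated density.
  k <- uniroot(f, interval = c(1e-6, 1e4), tol = 1e-10)$root
  params(k)
}

fit <- beta_from_mode_quantile(mode = 0.5, q = 0.44, p = 0.05)
print(fit)
# Check the quantile constraint at the solution.
print(pbeta(0.44, fit[["alpha"]], fit[["beta"]]))
```

Fixing the mode reduces the problem to a one-dimensional root search, which is why a single quantile is enough to pin down both parameters.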

With this specification, we assume that the density is fairly symmetric. To check that this is reasonable we can use the plot option and mark the 90% coverage interval.

    beta.prior(mode=.5,p=.05,q=.44,plot=T)

So we find that Beta(93.5, 93.5) represents the prior beliefs about p well. Note that Albert suggests Beta(100, 100), which

    curve(dbeta(x,100,100),add=T,lty=2)

indicates places slightly more mass on values in (.44, .56) than the prior beliefs warrant.
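The visual comparison can also be checked numerically. The short sketch below uses only base R's pbeta to compute the mass each candidate prior places on (.44, .56); the helper name coverage is introduced here for illustration.

```r
# Mass that a Beta(a, b) density places on the interval (lower, upper).
coverage <- function(a, b, lower = 0.44, upper = 0.56) {
  pbeta(upper, a, b) - pbeta(lower, a, b)
}

cov_fitted <- coverage(93.5, 93.5)  # prior reported in the slides
cov_albert <- coverage(100, 100)    # prior suggested by Albert
print(c(fitted = cov_fitted, albert = cov_albert))
```

A value above .9 for Beta(100, 100) corresponds to the extra mass noted in the text.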

Again, from Albert, suppose our data are Y_i | λ ∼ Poisson(λ). Consider the following beliefs and choice of density for the rate λ.

Density            Belief
Gamma              E[λ] = 3 and P(λ ≤ 2.1) = .25
log(λ) ∼ Normal    P(1.94 ≤ λ ≤ 3.81) = .5
log(λ) ∼ Normal    E[λ] = 3 and P(λ ≥ 8) = .02
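Each row of the table can be turned into parameter values with a few lines of base R. The sketch below is not the author's code, and the helper names are hypothetical. It makes two assumptions that the slides do not state: in the second row the interval (1.94, 3.81) is read as the central 50% interval, so its endpoints are the quartiles, and in the third row the smaller of the two admissible values of σ is chosen.

```r
# Row 1: Gamma with a fixed mean and one quantile, P(lambda <= q) = p.
# With mean = shape / rate fixed, only the shape remains unknown.
gamma_from_mean_quantile <- function(mean, q, p) {
  f <- function(shape) pgamma(q, shape = shape, rate = shape / mean) - p
  shape <- uniroot(f, interval = c(0.01, 1000), tol = 1e-10)$root
  c(shape = shape, rate = shape / mean)
}

# Row 2: log(lambda) ~ Normal(mu, sigma) from a central interval of lambda.
# Assumption: the interval is symmetric in probability (equal tails).
lognormal_from_central_interval <- function(lower, upper, coverage) {
  z <- qnorm(0.5 + coverage / 2)
  c(mu = (log(lower) + log(upper)) / 2,
    sigma = (log(upper) - log(lower)) / (2 * z))
}

# Row 3: log(lambda) ~ Normal(mu, sigma) from E[lambda] and an upper tail.
# E[lambda] = exp(mu + sigma^2 / 2) and P(lambda >= q) = upper_p give
# sigma^2 / 2 - z * sigma + log(q / mean) = 0 with z = qnorm(1 - upper_p).
lognormal_from_mean_tail <- function(mean, q, upper_p) {
  z <- qnorm(1 - upper_p)
  disc <- z^2 - 2 * log(q / mean)
  stopifnot(disc >= 0)
  sigma <- z - sqrt(disc)  # assumption: take the smaller root
  c(mu = log(mean) - sigma^2 / 2, sigma = sigma)
}

g  <- gamma_from_mean_quantile(mean = 3, q = 2.1, p = 0.25)
l1 <- lognormal_from_central_interval(1.94, 3.81, coverage = 0.5)
l2 <- lognormal_from_mean_tail(mean = 3, q = 8, upper_p = 0.02)
print(g); print(l1); print(l2)
```

Under the quartile reading, the second row gives values close to µ = 1 and σ = 0.5 on the log scale.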

Enhancements
◮ Use of effective sample size for the Beta distribution, where n_effective = α + β for θ ∼ Beta(α, β).
◮ Transformations for Normal and T distributions, so beliefs can be expressed in terms of X though the actual model is f(X) ∼ N(µ, σ²) or f(X) ∼ T(µ, σ² = 1, ν).
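As a small illustration of the effective sample size idea, the sketch below pairs n_effective with a prior mean to recover (α, β). The slides do not say which other belief the effective sample size is combined with, so the pairing with the mean and the helper name beta_from_mean_ess are assumptions made here.

```r
# Hypothetical helper: Beta parameters from a prior mean and an effective
# sample size, using mean = alpha / (alpha + beta) and n_eff = alpha + beta.
beta_from_mean_ess <- function(mean, n_eff) {
  stopifnot(mean > 0, mean < 1, n_eff > 0)
  c(alpha = mean * n_eff, beta = (1 - mean) * n_eff)
}

# A prior mean of .5 with n_eff = 187 reproduces Beta(93.5, 93.5).
prior <- beta_from_mean_ess(mean = 0.5, n_eff = 187)
print(prior)
```

Read this way, the coin example's prior carries about as much weight as 187 earlier flips.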

Further Extensions
◮ More densities.
◮ Identifying the prior density that best matches prior conditions rather than requiring an exact match for a minimally sufficient set of arguments.
◮ Mixture priors.

Source Code and Documentation
http://skoval.bol.ucla.edu/R.html

References
Jim Albert. Bayesian Computation with R. Springer, New York, 2007.
Paul H. Garthwaite, Joseph B. Kadane, and Anthony O’Hagan. Statistical methods for eliciting probability distributions. JASA, 100:680–700, 2005.
