Selecting priors
Applied Bayesian Statistics
- Dr. Earvin Balderama
Department of Mathematics & Statistics Loyola University Chicago
September 19, 2017
Selecting priors Last edited September 8, 2017 by Earvin Balderama <ebalderama@luc.edu>
Selecting the prior is one of the most important steps in a Bayesian analysis, but there are many schools of thought on this. The choices often depend on the objective of the study and the nature of the data:

1. Conjugate vs. non-conjugate
2. Informative vs. uninformative
3. Subjective vs. objective
4. Proper vs. improper
A prior is conjugate if the posterior is a member of the same parametric family as the prior. The advantage is that the posterior is available in closed form (and conjugate full conditionals are very convenient in MCMC). There is a long list of conjugate priors: https://en.wikipedia.org/wiki/Conjugate_prior
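A minimal sketch of conjugacy in action, using the classic Beta-Binomial pair (the hyperparameter and data values below are made up for illustration):

```python
# Conjugacy sketch: Beta prior + Binomial likelihood -> Beta posterior.
# If theta ~ Beta(a, b) and Y | theta ~ Binomial(n, theta), then
# theta | Y ~ Beta(a + Y, b + n - Y): the posterior is available in closed form.
a, b = 2.0, 2.0      # assumed prior hyperparameters (illustration only)
n, y = 20, 14        # assumed data: 14 successes in 20 trials

a_post, b_post = a + y, b + n - y        # closed-form conjugate update
post_mean = a_post / (a_post + b_post)   # Beta mean: a/(a+b)
print(a_post, b_post, post_mean)         # 16.0 8.0 ~0.667
```

No sampling or numerical integration was needed; the update is just arithmetic on the hyperparameters.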
For many likelihoods/parameters, there is no known conjugate prior.

A silly example: Say Y ∼ Poisson(λ) and λ ∼ Beta(a, b). Then the posterior is

f(λ | Y) ∝ e^(−λ) λ^Y · λ^(a−1) (1 − λ)^(b−1).

This is not a Beta pdf, therefore the prior is not conjugate. (In fact, this doesn't look like a member of any known family of distributions.)
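With no closed form available, one simple workaround is to normalize the posterior numerically on a grid. A sketch of that for the posterior above, with a, b, Y set to assumed values purely for illustration:

```python
import numpy as np

# Grid approximation of the non-conjugate posterior from the slide:
# f(lambda | Y) ∝ exp(-lambda) * lambda^Y * lambda^(a-1) * (1-lambda)^(b-1),
# supported on (0, 1) because of the Beta prior.
a, b, Y = 2.0, 2.0, 1                      # assumed values (illustration only)
lam = np.linspace(1e-6, 1 - 1e-6, 10_000)  # grid over the support
unnorm = np.exp(-lam) * lam**Y * lam**(a - 1) * (1 - lam)**(b - 1)

dlam = lam[1] - lam[0]
post = unnorm / (unnorm.sum() * dlam)      # normalize by a Riemann sum
post_mean = (lam * post).sum() * dlam      # posterior mean, numerically
```

Note that the Beta prior restricts λ to (0, 1), which is part of what makes the example "silly" for a Poisson rate.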
Conjugacy is only about the family of distributions that the prior belongs to. However, we can utilize outside knowledge to make a prior (conjugate or non-conjugate) more informative. Potential sources include literature reviews, pilot studies, and expert opinion.
Prior elicitation is the process of converting expert information into a prior distribution.
For example, the expert may not know what an inverse gamma pdf is, but you can choose a and b so that the distribution reflects the beliefs of the expert.
Any time informative priors are used, you should conduct a sensitivity analysis; that is, evaluate how the posterior changes under several different priors.
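One common elicitation device is moment matching: translate the expert's stated mean and standard deviation into hyperparameters. A sketch for an inverse-gamma prior on a variance parameter (the function name and the numbers fed to it are mine, not from the slides):

```python
# Elicitation sketch: convert an expert's mean m and standard deviation s
# for a variance parameter into Inverse-Gamma(a, b) hyperparameters.
# For Inverse-Gamma(a, b): mean = b/(a-1) for a > 1,
#                          var  = b^2 / ((a-1)^2 (a-2)) for a > 2.
def elicit_inverse_gamma(m, s):
    a = (m / s) ** 2 + 2.0   # solves var = m^2/(a-2) = s^2
    b = m * (a - 1.0)        # solves mean = b/(a-1) = m
    return a, b

# Hypothetical expert: "I think the variance is about 4, give or take 2."
a, b = elicit_inverse_gamma(m=4.0, s=2.0)
mean = b / (a - 1.0)
sd = (b**2 / ((a - 1.0) ** 2 * (a - 2.0))) ** 0.5   # recovers m and s
```

The expert never needs to know what an inverse-gamma pdf is; they only state a center and a spread.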
FDA recommendations: http://www.fda.gov/RegulatoryInformation/Guidances/ucm071072.htm

"Discussions with FDA regarding study design will include an evaluation … analysis."

"We recommend you identify as many sources of good prior information as possible. The evaluation of 'goodness' of the prior information is …"

"… approval of a medical device, you should present and discuss your choice … statistical) before your study begins."
"Prior distributions based directly on data from other studies are the easiest to evaluate. While we recognize that two studies are never exactly alike, we nonetheless recommend the studies used to construct the prior be similar to the current study..."

"Prior distributions based on expert opinion rather than data can be … advisory panel members or other clinical evaluators do not agree with the …"
Another word for a very informative prior is a strong prior. Very strong priors are typically used only for nuisance or other special parameters; strong priors for the main parameters of interest can be tough to defend (even with expert opinion involved). Most of the time, uninformative priors are used. Other terms used to describe uninformative priors: vague, weak, flat, diffuse, etc.

Question: How vague or weak can we make a prior? Can we always select a prior that is completely uninformative?

The idea is to let the likelihood overwhelm the prior. We can select priors with large variance, e.g., µ ∼ Normal(0, 10⁶). (Plot this density and it will look nearly flat.)
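A quick sketch of the likelihood overwhelming the prior, using the conjugate normal-normal model (the simulated data and the prior variances are assumed values for illustration):

```python
import numpy as np

# Vague prior in action: mu ~ Normal(0, prior_var) with y_i ~ Normal(mu, 1).
# With prior_var = 10^6 the posterior mean is essentially the MLE (sample mean);
# with a very small prior_var, the prior drags the posterior mean toward 0.
rng = np.random.default_rng(0)
y = rng.normal(3.0, 1.0, size=50)   # assumed data: true mu = 3

def posterior_mean(y, prior_var, sigma2=1.0):
    # Conjugate normal update with prior mean 0:
    # posterior precision = n/sigma2 + 1/prior_var
    precision = len(y) / sigma2 + 1.0 / prior_var
    return (y.sum() / sigma2) / precision

vague = posterior_mean(y, prior_var=1e6)    # ~ sample mean
strong = posterior_mean(y, prior_var=0.01)  # shrunk hard toward 0
mle = y.mean()
```

With the vague prior, the posterior mean agrees with the MLE to several decimal places; the strong prior pulls the estimate far toward its center.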
Subjective decisions include picking the likelihood, treatment of outliers, transformations, ... and prior specification. For example, should we select σ ∼ Uniform(0, 10) or σ² ∼ Uniform(0, 100)? A prior chosen subjectively, before collecting data, may be rejected by a reader who does not share the same view, so uninformative priors and a sensitivity analysis are common.
An objective analysis is one that requires no subjective decisions by the analyst. A completely objective analysis may be feasible in tightly controlled experiments, but is impossible in many analyses. An objective Bayesian attempts to replace the subjective choice of prior with an algorithm that determines the prior. Any objective prior must give the same results (e.g., the same posterior mean) under any parameterization. For example, σ ∼ Uniform(0, 10) and σ² ∼ Uniform(0, 100) are quite different: σ ∼ Uniform ⟹ f(σ²) ∝ 1/σ.
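A quick Monte Carlo check of that last claim (the sample size and bin edges are my choices):

```python
import numpy as np

# If sigma ~ Uniform(0, 10), the implied density of sigma^2 is NOT uniform on
# (0, 100): it is proportional to 1/sigma, so mass piles up near 0.
rng = np.random.default_rng(1)
sigma = rng.uniform(0.0, 10.0, size=200_000)
s2 = sigma**2

# P(sigma^2 < 25) = P(sigma < 5) = 0.5, so half the draws land in the first
# quarter of the range; each later bin gets progressively fewer draws.
counts, edges = np.histogram(s2, bins=[0, 25, 50, 75, 100])
print(counts)
```

So a "flat" prior on σ is an informative (decreasing) prior on σ², and vice versa; flatness is not parameterization-free.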
The Jeffreys prior is invariant to transformations. For a parameter θ, the Jeffreys prior is

f(θ) ∝ √I(θ),

where I(θ) = −E_{Y|θ}[ d²/dθ² log f(Y | θ) ] is the Fisher information.
Note: Once you specify the likelihood, the Jeffreys prior is determined with no additional input. Therefore you do not have to make a subjective decision about the prior (other than the subjective decision to use a Jeffreys prior).
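That "no additional input" claim can be checked symbolically. A sketch of the Binomial case, deriving the Fisher information and hence the Jeffreys prior with sympy (a generic computer-algebra route, not anything from the slides):

```python
import sympy as sp

theta, n, y = sp.symbols('theta n y', positive=True)

# Binomial log-likelihood, dropping the theta-free binomial coefficient:
loglik = y * sp.log(theta) + (n - y) * sp.log(1 - theta)

# Fisher information: I(theta) = -E[d^2/dtheta^2 log f(Y|theta)].
# The expectation only involves E[Y] = n*theta, so substitute it in.
d2 = sp.diff(loglik, theta, 2)
info = sp.simplify(-d2.subs(y, n * theta))   # n / (theta * (1 - theta))

# Jeffreys prior ∝ sqrt(I(theta)) ∝ theta^(-1/2) * (1-theta)^(-1/2),
# i.e., the Beta(1/2, 1/2) kernel.
jeffreys = sp.sqrt(info)
```

Everything followed mechanically from the likelihood; the only subjective step was deciding to use a Jeffreys prior at all.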
Examples:

  Likelihood                        Jeffreys prior
  Y ∼ Binomial(n, θ)                θ ∼ Beta(1/2, 1/2)
  Y ∼ Normal(µ, σ²), σ known        f(µ) ∝ 1
  Y ∼ Normal(µ, σ²), µ known        f(σ) ∝ 1/σ

Many of these priors are improper priors, meaning they do not integrate to one. In this case, you have to check that the posterior is proper, i.e., that it integrates to one.
In Empirical Bayes, priors are chosen based on the data. For example, one might set the prior mean for σ² to the sample variance s². In general, one can use marginal maximum likelihood to fix nuisance parameters. Note: Empirical Bayes is usually criticized for "using the data twice."
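A sketch of marginal maximum likelihood in the normal-means problem, where the marginal likelihood has closed-form maximizers (the simulation settings are assumed values for illustration):

```python
import numpy as np

# Empirical Bayes sketch: y_i ~ N(theta_i, 1) with theta_i ~ N(mu, tau^2).
# Marginally, y_i ~ N(mu, 1 + tau^2), so marginal maximum likelihood gives
# closed-form hyperparameter estimates from the data ("using the data twice").
rng = np.random.default_rng(42)
mu_true, tau_true = 2.0, 1.5                 # assumed truth for the simulation
theta = rng.normal(mu_true, tau_true, size=500)
y = rng.normal(theta, 1.0)

mu_hat = y.mean()                            # marginal MLE of mu
tau2_hat = max(y.var() - 1.0, 0.0)           # marginal MLE of tau^2 (truncated at 0)

# Plug-in posterior means for each theta_i shrink y_i toward mu_hat:
w = tau2_hat / (tau2_hat + 1.0)
theta_post = mu_hat + w * (y - mu_hat)
```

The hyperparameters µ and τ² were never given priors of their own; they were estimated from the marginal distribution of the data, which is exactly the move the "using the data twice" criticism targets.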