
Workshop 5: Introduction to Bayesian models

Murray Logan

April 9, 2016


0.1. Frequentist

  • P(D|H)
  • long-run frequency
  • simple analytical methods to solve roots
  • conclusions pertain to data, not parameters or hypotheses
  • compared to theoretical distribution when NULL is true
  • probability of obtaining observed data or MORE EXTREME data

P(D|H): can a null ever actually be true?

0.2. Frequentist

  • P-value

    – probability of rejecting the NULL
    – NOT a measure of the magnitude of an effect or degree of significance!
    – a measure of whether the sample size is large enough

  • 95% CI

    – NOT about the parameter; it is about the interval
    – does not tell you the range of values likely to contain the true mean

0.3. Frequentist vs Bayesian

                Frequentist          Bayesian
Obs. data       One possible         Fixed, true
Parameters      Fixed, true          Random, distribution
Inferences      Data                 Parameters
Probability     Long-run frequency   Degree of belief
                P(D|H)               P(H|D)


0.4. Frequentist vs Bayesian

[Figure: three example regressions of y against x, one panel per data set; summary statistics below]

  • n: 10, Slope: -0.1022, t: -2.3252, p: 0.0485
  • n: 10, Slope: -10.2318, t: -2.2115, p: 0.0579
  • n: 100, Slope: -10.4713, t: -6.6457, p: 1.7101362 × 10⁻⁹


0.5. Frequentist vs Bayesian

[Figure: regressions of y against x for Population A and Population B]

  • Percentage change: Population A 0.46, Population B 45.46
  • Prob. >5% decline: 0.86

0.6. Bayesian

0.6.1. Bayes rule

P(H|D) = P(D|H) × P(H) / P(D)

posterior belief (probability) = (likelihood × prior probability) / normalizing constant

0.7. Bayesian


0.7.1. Bayes rule

P(H|D) = P(D|H) × P(H) / P(D)

posterior belief (probability) = (likelihood × prior probability) / normalizing constant

The normalizing constant is required to yield a probability - it turns a frequency distribution into a probability distribution.
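As a tiny numerical sketch (the prior and likelihood values below are made up for illustration), the normalizing constant is simply the sum of likelihood × prior over the hypotheses, and dividing by it makes the posterior sum to 1:

> prior      <- c(H1 = 0.5, H2 = 0.5)     # P(H): prior probability of each hypothesis
> likelihood <- c(H1 = 0.20, H2 = 0.05)   # P(D|H): likelihood of the data under each
> unnormalized <- likelihood * prior
> P.D <- sum(unnormalized)                # normalizing constant P(D)
> unnormalized / P.D                      # P(H|D): the posterior, now sums to 1
 H1  H2 
0.8 0.2 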

0.8. Estimation: OLS

0.9. Estimation: Likelihood

P(D|H)
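As an illustrative sketch (not from the workshop), the two estimation approaches can be compared in R using the small data set that appears later in this workshop: OLS via lm(), and maximum likelihood by directly maximizing P(D|H) under normally distributed errors.

> x <- 0:6; y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)
> coef(lm(y ~ x))                           # OLS: minimizes the sum of squared residuals
> negLL <- function(p)                      # negative log-likelihood assuming normal errors
+   -sum(dnorm(y, mean = p[1] + p[2]*x, sd = exp(p[3]), log = TRUE))
> optim(c(0, 0, 0), negLL)$par[1:2]         # ML intercept and slope; match OLS up to optimizer tolerance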


0.10. Bayesian

  • conclusions pertain to hypotheses
  • computationally robust (sample size, balance, collinearity)
  • inferential flexibility - derive any number of inferences

0.11. Bayesian

  • subjectivity?
  • intractable

P(H|D) = P(D|H) × P(H) / P(D)

P(D): the probability of the data over all possible hypotheses

0.12. MCMC sampling

Markov Chain Monte Carlo sampling

  • draw samples proportional to the likelihood
    – two parameters $\alpha$ and $\beta$
    – infinitely vague priors - posterior determined by the likelihood only
    – likelihood multivariate normal



0.14. MCMC sampling

Markov Chain Monte Carlo sampling

  • draw samples proportional to likelihood

0.15. MCMC sampling

Markov Chain Monte Carlo sampling

  • chain of samples

0.16. MCMC sampling

Markov Chain Monte Carlo sampling

  • 1000 samples

0.17. MCMC sampling

Markov Chain Monte Carlo sampling

  • 10,000 samples

0.18. MCMC sampling

Markov Chain Monte Carlo sampling

  • Aim: the samples reflect the posterior frequency distribution
  • samples are used to construct the posterior probability distribution
  • the sharper the multidimensional “features”, the more samples are needed
  • the chain should have traversed the entire posterior
  • the initial location should not influence the result
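A minimal random-walk Metropolis sketch (not the workshop's own code) illustrating the idea: it uses the small data set that appears later in this workshop, treats the priors as flat so the posterior is proportional to the likelihood, fixes σ at 1 so only the intercept and slope are sampled, and uses an arbitrary proposal step of 0.3.

> x <- 0:6; y <- c(3, 2.5, 6, 5.5, 9, 8.6, 12)
> logpost <- function(b) sum(dnorm(y, mean = b[1] + b[2]*x, sd = 1, log = TRUE))
> n.samples <- 10000
> chain <- matrix(NA, n.samples, 2, dimnames = list(NULL, c("beta0", "beta1")))
> current <- c(0, 0)
> for (i in 1:n.samples) {
+   proposal <- current + rnorm(2, 0, 0.3)           # symmetric random-walk proposal
+   if (log(runif(1)) < logpost(proposal) - logpost(current)) current <- proposal
+   chain[i, ] <- current                            # store the current state each iteration
+ }
> colMeans(chain)   # posterior means for the intercept and slope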

0.19. MCMC diagnostics


0.19.1. Trace plots
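A minimal sketch of producing trace plots with the coda package, assuming the 'chain' matrix from the Metropolis sketch above (an mcmc object returned by MCMCpack or R2jags can be plotted the same way):

> library(coda)
> chain.mcmc <- as.mcmc(chain)   # convert the matrix of samples to an mcmc object
> traceplot(chain.mcmc)          # one panel per parameter; look for good mixing and no trends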

0.20. MCMC diagnostics

0.20.1. Autocorrelation

  • Summary stats on non-independent values are biased
  • Thinning factor = 1

0.21. MCMC diagnostics

0.21.1. Autocorrelation

  • Summary stats on non-independent values are biased
  • Thinning factor = 10

0.22. MCMC diagnostics

0.22.1. Autocorrelation

  • Summary stats on non-independent values are biased
  • Thinning factor = 10, n=10,000
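A sketch of checking autocorrelation and applying a thinning factor with coda, continuing from the chain.mcmc object above:

> autocorr.diag(chain.mcmc)                    # autocorrelation of each parameter at increasing lags
> chain.thin <- window(chain.mcmc, thin = 10)  # retain every 10th sample (thinning factor = 10)
> autocorr.diag(chain.thin)                    # should now be much closer to zero beyond lag 0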

0.23. MCMC diagnostics


0.23.1. Plot of Distributions
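A sketch of plotting the posterior distributions with coda, again using the chain.mcmc object from above:

> densplot(chain.mcmc)    # kernel density estimate of each parameter's posterior
> summary(chain.mcmc)     # posterior means, SDs and quantiles of the samples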

0.24. Native options in R

  • MCMCpack
  • MCMCglmm
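As a sketch (not from the workshop), MCMCpack's MCMCregress() fits a Bayesian linear regression in a single call; the data frame below mirrors the small example used later in this workshop.

> library(MCMCpack)
> DATA <- data.frame(Y = c(3, 2.5, 6, 5.5, 9, 8.6, 12), X = 0:6)
> fit <- MCMCregress(Y ~ X, data = DATA, burnin = 1000, mcmc = 10000, thin = 1)
> summary(fit)   # posterior summaries for the intercept, slope and sigma2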

0.25. JAGS/BUGS

  • WinBUGS - Object Pascal
    – made Bayesian analyses ‘available’ to the masses
    – models mirror written definitions
    – very slow

  • JAGS - C++
    – same declarative language
    – much faster

0.26. JAGS/BUGS

  • stand-alone application
  • file input

    – model declaration
    – data as a list

  • R2jags - interface from R

0.27. JAGS/BUGS

  • yᵢ = β₀ + β₁xᵢ + ϵᵢ, where ϵ ∼ N(0, σ²)
  • yᵢ ∼ N(β₀ + β₁xᵢ, σ²)
  • yᵢ ∼ N(β₀ + β₁xᵢ, τ) - τ is the precision (1/σ²)
  • yᵢ ∼ N(µᵢ, τ), µᵢ = β₀ + β₁xᵢ
    – β₀ ∼ N(0, 0.000001)
    – β₁ ∼ N(0, 0.000001)
    – τ = 1/σ²
    – σ ∼ Uniform(0, 100)

0.28. JAGS/BUGS

0.28.1. Define the model

> modelString="
+ model {
+   #Likelihood
+   for (i in 1:n) {
+     y[i]~dnorm(mu[i],tau)
+     mu[i] <- beta0+beta1*x[i]
+   }
+
+   #Priors
+   beta0 ~ dnorm(0,1.0E-6)
+   beta1 ~ dnorm(0,1.0E-6)
+   tau <- 1 / (sigma * sigma)
+   sigma~dunif(0,100)
+ }
+ "

yᵢ ∼ N(µᵢ, τ), µᵢ = β₀ + β₁xᵢ
β₀ ∼ N(0, 0.000001)
β₁ ∼ N(0, 0.000001)
τ = 1/σ²
σ ∼ Uniform(0, 100)

> writeLines(modelString,con="BUGSscripts/regression.txt")

Error in file(con, "w"): cannot open the connection

0.29. JAGS/BUGS

0.29.1. Create the data list

    Y  X
  3.0  0
  2.5  1
  6.0  2
  5.5  3
  9.0  4
  8.6  5
 12.0  6


0.30. JAGS/BUGS

0.30.1. Create the data list

      Y  X
1   3.0  0
2   2.5  1
3   6.0  2
4   5.5  3
5   9.0  4
6   8.6  5
7  12.0  6

> data.list <- with(DATA,
+                   list(y=Y,
+                        x=X, n=nrow(DATA))
+                   )
> data.list

$y
[1]  3.0  2.5  6.0  5.5  9.0  8.6 12.0

$x
[1] 0 1 2 3 4 5 6

$n
[1] 7

0.31. JAGS/BUGS

0.31.1. Define the chain parameters

> #params <- c("beta0","beta1","sigma")
> #burnInSteps = 2000
> #nChains = 3
> #numSavedSteps = 50000
> #thinSteps = 1
> #nIter = ceiling((numSavedSteps * thinSteps)/nChains)

0.32. JAGS/BUGS

0.32.1. Perform MCMC sampling

> library(R2jags)
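The workshop's own sampling call is not included in this excerpt; the following is only a sketch of how the data list, model file and chain parameters defined above would typically be passed to R2jags' jags() function, with illustrative iteration settings.

> fit.jags <- jags(data = data.list,
+                  parameters.to.save = c("beta0", "beta1", "sigma"),
+                  model.file = "BUGSscripts/regression.txt",
+                  n.chains = 3,        # illustrative settings; the workshop's values may differ
+                  n.iter = 20000,
+                  n.burnin = 2000,
+                  n.thin = 1)
> print(fit.jags)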