

SLIDE 1

Workshop 7.2b: Introduction to Bayesian models

Murray Logan 07 Feb 2017

SLIDE 2

Section 1 Frequentist vs Bayesian

SLIDE 3

Frequentist

  • P(D|H)
  • long-run frequency
  • simple analytical methods to solve roots
  • conclusions pertain to data, not parameters or hypotheses
  • compared to theoretical distribution when NULL is true
  • probability of obtaining observed data or MORE EXTREME data

SLIDE 4

Frequentist

  • P-value
  • probability of obtaining the observed (or more extreme) data when the NULL is true
  • NOT a measure of the magnitude of an effect or degree of significance!
  • largely a measure of whether the sample size is large enough
  • 95% CI
  • NOT about the parameter - it is about the interval
  • does not tell you the range of values likely to contain the true mean

SLIDE 5

Frequentist vs Bayesian

              Frequentist          Bayesian
Obs. data     One possible         Fixed, true
Parameters    Fixed, true          Random, distribution
Inferences    Data                 Parameters
Probability   Long-run frequency   Degree of belief
              P(D|H)               P(H|D)

SLIDE 6

Frequentist vs Bayesian

[Figure: three scatterplots with fitted regression lines; x axis 2-10, y axis 50-250]

n: 10    Slope: -0.1022    t: -2.3252    p: 0.0485
n: 10    Slope: -10.2318   t: -2.2115    p: 0.0579
n: 100   Slope: -10.4713   t: -6.6457    p: 1.7101362 × 10⁻⁹

SLIDE 7

Frequentist vs Bayesian

[Figure: two scatterplots (Population A, Population B) with fitted regression lines; x axis 2-10, y axis 50-250]

Percentage change: Population A 0.46, Population B 45.46

  • Prob. >5% decline: 0.86

SLIDE 8

Section 2 Bayesian Statistics

SLIDE 9

Bayesian

Bayes' rule:

    P(H | D) = P(D | H) × P(H) / P(D)

    posterior belief (probability) = (likelihood × prior probability) / normalizing constant

SLIDE 10

Bayesian

Bayes' rule:

    P(H | D) = P(D | H) × P(H) / P(D)

    posterior belief (probability) = (likelihood × prior probability) / normalizing constant

The normalizing constant is required to turn a frequency distribution into a probability distribution.
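Bayes' rule can be checked with a small numeric example. This is a sketch in Python (the workshop itself uses R); the prior and likelihood values are invented purely for illustration:

```python
# Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D)
# All numbers below are hypothetical, chosen only to illustrate the arithmetic.
p_h = 0.5              # prior: degree of belief in H before seeing data
p_d_given_h = 0.8      # likelihood of the data if H is true
p_d_given_not_h = 0.2  # likelihood of the data if H is false

# Normalizing constant: probability of the data over all hypotheses
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

posterior = p_d_given_h * p_h / p_d
print(posterior)  # 0.8
```

Note that the posterior only becomes a proper probability because we divide by P(D), the normalizing constant.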

SLIDE 11

Estimation: OLS
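A minimal sketch of what OLS estimation does, written in Python rather than the workshop's R, using the fertilizer data that appears later in the deck: choose the β₀, β₁ that minimize the sum of squared residuals, which has a closed form.

```python
# Ordinary least squares for y = b0 + b1*x, via the closed-form solution.
# Data: the fertilizer example used later in this workshop.
x = [25, 50, 75, 100, 125, 150, 175, 200, 225, 250]
y = [84, 80, 90, 154, 148, 169, 206, 244, 212, 248]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

# b1 = cov(x, y) / var(x); b0 makes the line pass through (mean x, mean y)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
     sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx

print(round(b0, 3), round(b1, 3))  # 51.933 0.811
```

These are the same estimates `lm(YIELD ~ FERTILIZER)` would return in R.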

SLIDE 12

Estimation: Likelihood

P(D | H)
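For a normal linear model, P(D|H) is the Gaussian likelihood of the data given candidate parameters. A Python sketch (again using the fertilizer data from later in the deck; the two candidate parameter sets are arbitrary choices for illustration) evaluates the log-likelihood at a good and a poor guess - maximizing it over β₀, β₁ reproduces the OLS line:

```python
import math

def log_lik(b0, b1, sigma, x, y):
    """Gaussian log-likelihood of data D given parameters H = (b0, b1, sigma)."""
    ll = 0.0
    for xi, yi in zip(x, y):
        mu = b0 + b1 * xi
        ll += -0.5 * math.log(2 * math.pi * sigma ** 2) - (yi - mu) ** 2 / (2 * sigma ** 2)
    return ll

x = [25, 50, 75, 100, 125, 150, 175, 200, 225, 250]
y = [84, 80, 90, 154, 148, 169, 206, 244, 212, 248]

# The OLS estimates (~51.93, ~0.81) score higher than an arbitrary worse guess
good = log_lik(51.93, 0.81, 20.0, x, y)
poor = log_lik(0.0, 1.0, 20.0, x, y)
print(good > poor)  # True
```

This likelihood is exactly the P(D|H) term that Bayes' rule multiplies by the prior.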

SLIDE 13

Bayesian

  • conclusions pertain to hypotheses
  • computationally robust (sample size, balance, collinearity)
  • inferential flexibility - derive any number of inferences

SLIDE 14

Bayesian

  • subjectivity?
  • intractable

    P(H | D) = P(D | H) × P(H) / P(D)

P(D) - probability of the data over all possible hypotheses

SLIDE 15

MCMC sampling

Markov chain Monte Carlo sampling

  • draw samples proportional to the likelihood

With two parameters and infinitely vague priors, the posterior is the likelihood only - here a multivariate normal.

SLIDE 16

MCMC sampling

Markov chain Monte Carlo sampling

  • draw samples proportional to the likelihood

SLIDE 17

MCMC sampling

Markov chain Monte Carlo sampling

  • draw samples proportional to the likelihood

SLIDE 18

MCMC sampling

Markov chain Monte Carlo sampling

  • chain of samples

SLIDE 19

MCMC sampling

Markov chain Monte Carlo sampling

  • 1000 samples

SLIDE 20

MCMC sampling

Markov chain Monte Carlo sampling

  • 10,000 samples
SLIDE 21

MCMC sampling

Markov chain Monte Carlo sampling

  • Aim: samples reflect the posterior frequency distribution
  • samples are used to construct the posterior probability distribution
  • the sharper the multidimensional "features", the more samples needed
  • chain should have traversed the entire posterior
  • initial location should not influence the outcome
SLIDE 22

MCMC diagnostics

  • Trace plots

SLIDE 23

MCMC diagnostics

  • Autocorrelation
  • Summary stats on non-independent values are biased
  • Thinning factor = 1
SLIDE 24

MCMC diagnostics

  • Autocorrelation
  • Summary stats on non-independent values are biased
  • Thinning factor = 10
SLIDE 25

MCMC diagnostics

  • Autocorrelation
  • Summary stats on non-independent values are biased
  • Thinning factor = 10, n=10,000
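The effect of thinning can be illustrated with a toy autocorrelated series standing in for real MCMC output (a Python sketch; the AR(1) coefficient and chain length are arbitrary choices): keeping every 10th sample sharply reduces the lag-1 autocorrelation that biases naive summary statistics.

```python
import random

random.seed(1)

# Toy "chain": an AR(1) series with strong serial dependence,
# standing in for autocorrelated MCMC output.
chain = [0.0]
for _ in range(99_999):
    chain.append(0.9 * chain[-1] + random.gauss(0, 1))

def lag1_autocorr(xs):
    """Sample autocorrelation at lag 1."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((v - m) ** 2 for v in xs)
    cov = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(n - 1))
    return cov / var

thinned = chain[::10]  # thinning factor = 10
print(round(lag1_autocorr(chain), 2), round(lag1_autocorr(thinned), 2))
```

The full chain has lag-1 autocorrelation near 0.9; thinning by 10 drops it to roughly 0.9¹⁰ ≈ 0.35 - the price being that only a tenth of the samples are kept, which is why the slide pairs a thinning factor of 10 with a larger n.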
SLIDE 26

MCMC diagnostics

  • Plots of distributions

SLIDE 27

Sampler types

Metropolis-Hastings
http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/
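A minimal Metropolis-Hastings sampler can be sketched in a few lines of Python (the target density, a standard normal, and the proposal scale are chosen purely for illustration): propose a random step, accept it with probability min(1, ratio of target densities), and accepted draws accumulate in proportion to the target.

```python
import math
import random

random.seed(42)

def target(x):
    # Unnormalized target density (standard normal).
    # MH never needs the normalizing constant - it cancels in the ratio.
    return math.exp(-0.5 * x * x)

def metropolis(n_samples, step=1.0, start=0.0):
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + random.gauss(0, step)              # symmetric random-walk proposal
        if random.random() < target(proposal) / target(x):
            x = proposal                                  # accept
        samples.append(x)                                 # rejection keeps the old value
    return samples

draws = metropolis(50_000)
burned = draws[5_000:]                                    # discard burn-in (warmup)
mean = sum(burned) / len(burned)
var = sum((d - mean) ** 2 for d in burned) / len(burned)
print(round(mean, 1), round(var, 1))
```

The retained draws have mean near 0 and variance near 1, matching the target - this is the "draw samples proportional to the likelihood" idea from the preceding slides.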

SLIDE 28

Sampler types

Gibbs

SLIDE 29

Sampler types

NUTS

SLIDE 30

Sampling

  • thinning
  • burn-in (warmup)
  • chains
SLIDE 31

Bayesian software (for R)

  • MCMCpack
  • WinBUGS (R2WinBUGS)
  • JAGS (R2jags)
  • Stan (rstan, brms)
SLIDE 32

BRMS

Extractor              Description
residuals()            Residuals
fitted()               Predicted values
predict()              Predict new responses
coef()                 Extract model coefficients
plot()                 Diagnostic plots
stanplot(, type=)      More diagnostic plots
marginal_effects()     Partial effects
logLik()               Extract log-likelihood
LOO() and WAIC()       Calculate LOO and WAIC
influence.measures()   Leverage, Cook's D
summary()              Model output
stancode()             Model passed to Stan
standata()             Data list passed to Stan

SLIDE 33

Section 3 Worked Examples

SLIDE 34

Worked Examples

> fert <- read.csv('../data/fertilizer.csv', strip.white=T)
> fert
   FERTILIZER YIELD
1          25    84
2          50    80
3          75    90
4         100   154
5         125   148
6         150   169
7         175   206
8         200   244
9         225   212
10        250   248
> head(fert)
  FERTILIZER YIELD
1         25    84
2         50    80
3         75    90
4        100   154
5        125   148
6        150   169
> summary(fert)
   FERTILIZER         YIELD
 Min.   : 25.00   Min.   : 80.0
 1st Qu.: 81.25   1st Qu.:104.5
 Median :137.50   Median :161.5
 Mean   :137.50   Mean   :163.5
 3rd Qu.:193.75   3rd Qu.:210.5
 Max.   :250.00   Max.   :248.0
> str(fert)
'data.frame': 10 obs. of 2 variables:
 $ FERTILIZER: int 25 50 75 100 125 150 175 200 225 250
 $ YIELD     : int 84 80 90 154 148 169 206 244 212 248

SLIDE 35

Worked Examples

Question: is there a relationship between fertilizer concentration and grass yield?

Linear model:

Frequentist:
    yi = β0 + β1xi + εi
    εi ∼ N(0, σ²)

Bayesian:
    yi ∼ N(ηi, σ²)
    ηi = β0 + β1xi
    β0 ∼ N(0, 1000)
    β1 ∼ N(0, 1000)
    σ² ∼ Cauchy(0, 4)
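The Bayesian model above can be fitted with nothing more than the Metropolis idea from the MCMC slides. Below is a Python sketch, not the workshop's R/brms route; several choices are assumptions made for illustration: N(0, 1000) is read as a standard deviation of 1000 (the Stan/brms convention), the σ prior is treated as a half-Cauchy, the predictor is centred to help the sampler mix, and the proposal scales and chain length are ad hoc.

```python
import math
import random

random.seed(7)

x = [25, 50, 75, 100, 125, 150, 175, 200, 225, 250]
y = [84, 80, 90, 154, 148, 169, 206, 244, 212, 248]

# Centre the predictor so b0 and b1 are roughly uncorrelated (a standard
# trick that helps a naive random-walk sampler; the slope is unchanged).
mx = sum(x) / len(x)
xc = [xi - mx for xi in x]

def log_posterior(b0, b1, sigma):
    if sigma <= 0:
        return -math.inf
    # Likelihood: y_i ~ N(b0 + b1*xc_i, sigma^2), constants dropped
    ll = sum(-math.log(sigma) - (yi - (b0 + b1 * xi)) ** 2 / (2 * sigma ** 2)
             for xi, yi in zip(xc, y))
    # Priors from the slide, with N(0, 1000) read as sd = 1000 and the
    # sigma prior treated as half-Cauchy(0, 4) - both assumptions.
    lp = -b0 ** 2 / 2e6 - b1 ** 2 / 2e6 - math.log(1 + (sigma / 4) ** 2)
    return ll + lp

# Random-walk Metropolis over (b0, b1, sigma); proposal scales are ad hoc
theta = [150.0, 1.0, 30.0]
scales = [6.0, 0.08, 4.0]
lp = log_posterior(*theta)
samples = []
for _ in range(20_000):
    prop = [t + random.gauss(0, s) for t, s in zip(theta, scales)]
    lp_prop = log_posterior(*prop)
    if math.log(random.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    samples.append(theta)

burned = samples[5_000:]  # discard burn-in (warmup)
mean_b1 = sum(s[1] for s in burned) / len(burned)
print(round(mean_b1, 2))
```

The posterior mean slope lands near the OLS estimate (~0.81), as expected with priors this vague - and unlike a frequentist fit, the retained samples can be used directly for derived inferences such as P(β1 > 0).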