The temperature in a Normal lake: Fundamentals of Bayesian Data Analysis in R


SLIDE 1

The temperature in a Normal lake

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Rasmus Bååth

Data Scientist

SLIDE 2

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The model we've used so far

SLIDE 3

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 4

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Some temperature data

# Water temperatures in degrees Celsius
temp <- c(19, 23, 20, 17, 23)
# The same readings in degrees Fahrenheit
temp_f <- c(66, 73, 68, 63, 73)

SLIDE 5

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The Normal distribution

Normal(μ,σ)

SLIDE 6

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The Normal distribution in R

rnorm(n = , mean = , sd = )

SLIDE 7

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The Normal distribution in R

rnorm(n = 5, mean = 20, sd = 2)
20.3 24.1 22.4 24.7 21.6
rnorm(n = 5, mean = 20, sd = 2)
16.3 22.1 23.1 18.9 16.3
rnorm(n = 5, mean = 20, sd = 2)
20.3 20.9 18.0 16.8 22.6

temp <- c(19, 23, 20, 17, 23)
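With only five draws per call, the samples above vary a lot. The following is a minimal sketch, using base R only, of how a much larger sample from the same distribution settles near its parameters. The seed and the sample size are arbitrary choices for illustration, not values from the slides.

```r
set.seed(42)  # arbitrary seed so the draws are reproducible

# A large sample from Normal(mean = 20, sd = 2)
draws <- rnorm(n = 100000, mean = 20, sd = 2)

# Summary statistics of the sample should land near the parameters
sample_mean <- mean(draws)
sample_sd <- sd(draws)

# Share of draws within two standard deviations (2 * 2 = 4) of the mean
share_within_2sd <- mean(abs(draws - 20) <= 4)
```

For a Normal distribution, about 95% of the probability lies within two standard deviations of the mean, so share_within_2sd should come out close to 0.95.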

SLIDE 8

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The Normal distribution in R

temp <- c(19, 23, 20, 17, 23)
like <- dnorm(x = temp, mean = 20, sd = 2)
like
0.176 0.065 0.199 0.065 0.065
prod(like)
9.536075e-06
log(like)
-1.737086 -2.737086 -1.612086 -2.737086 -2.737086

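The product of many small densities can underflow for larger data sets, which is a common reason to work with log(like) instead. A minimal sketch, using the slide's own numbers and base R only, of how the sum of log-densities relates to the product:

```r
temp <- c(19, 23, 20, 17, 23)

# Density of each observation under Normal(mean = 20, sd = 2)
like <- dnorm(x = temp, mean = 20, sd = 2)

# The same densities computed directly on the log scale
log_like <- dnorm(x = temp, mean = 20, sd = 2, log = TRUE)

# Joint likelihood as a product, and the same quantity via a sum of logs
joint <- prod(like)
joint_from_logs <- exp(sum(log_like))
```

Both routes give the same joint likelihood here; the log-scale version is the one that stays numerically stable as the number of observations grows.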
SLIDE 9

Try out using rnorm and dnorm!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 10

A Bayesian model of water temperature

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Rasmus Bååth

Data Scientist

SLIDE 11

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's define the model

SLIDE 12

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's define the model

SLIDE 13

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's define the model

SLIDE 14

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's define the model

SLIDE 15

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

n_ads_shown <- 100
n_visitors <- 13
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)
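Once pars$probability is normalised, the grid itself can be queried directly. The sketch below repeats the ad-click fit from this slide and then reads off two summaries. The names posterior_mode and prob_above_10pct, and the 10% threshold, are illustrative choices rather than part of the original slides.

```r
n_ads_shown <- 100
n_visitors <- 13
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown,
                          prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

# Most probable grid value of proportion_clicks
posterior_mode <- pars$proportion_clicks[which.max(pars$probability)]

# Posterior probability that more than 10% of shown ads get clicked
prob_above_10pct <- sum(pars$probability[pars$proportion_clicks > 0.1])
```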

SLIDE 16

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 17

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <-
sigma <-
pars <- expand.grid(proportion_clicks = proportion_clicks)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 18

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(proportion_clicks = proportion_clicks)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 19

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 20

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The parameter space

plot(pars, pch=19)
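The plot shows every combination of mu and sigma that the grid approximation will evaluate. A small sketch, using the same grid definition as the fitting slides, of how expand.grid builds that parameter space (n_combinations is an illustrative name, not from the slides):

```r
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)

# One row per (mu, sigma) pair: every mu is combined with every sigma
pars <- expand.grid(mu = mu, sigma = sigma)

# The grid size is the product of the two sequence lengths
n_combinations <- nrow(pars)

# Range of each parameter covered by the grid
mu_range <- range(pars$mu)
sigma_range <- range(pars$sigma)
```

A finer step in either sequence gives a smoother posterior but multiplies the number of rows, which is the main cost of grid approximation as parameters are added.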

SLIDE 21

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 22

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 23

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 24

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- pars$mu_prior * pars$sigma_prior
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 25

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- pars$mu_prior * pars$sigma_prior
for(i in 1:nrow(pars)) {
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 26

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

temp <- c(19, 23, 20, 17, 23)
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- pars$mu_prior * pars$sigma_prior
for(i in 1:nrow(pars)) {
  likelihoods <- dnorm(temp, pars$mu[i], pars$sigma[i])
pars$likelihood <- dbinom(n_visitors, size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

SLIDE 27

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's fit the model

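temp <- c(19, 23, 20, 17, 23)
# Grid of candidate values for the mean and the standard deviation
mu <- seq(8, 30, by = 0.5)
sigma <- seq(0.1, 10, by = 0.3)
pars <- expand.grid(mu = mu, sigma = sigma)
# Independent priors on mu and sigma, combined by multiplication
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- pars$mu_prior * pars$sigma_prior
# Likelihood of all the data for each parameter combination
for(i in 1:nrow(pars)) {
  likelihoods <- dnorm(temp, pars$mu[i], pars$sigma[i])
  pars$likelihood[i] <- prod(likelihoods)
}
# Posterior probability: likelihood times prior, normalised to sum to 1
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)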
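The loop multiplies raw densities, which works for five data points but can underflow with more data. As a hedged alternative, not taken from the slides, the same grid posterior can be computed on the log scale. The sketch below assumes the same data, grid, and priors; the names log_prior, log_likelihood, and log_post are illustrative.

```r
temp <- c(19, 23, 20, 17, 23)
pars <- expand.grid(mu = seq(8, 30, by = 0.5),
                    sigma = seq(0.1, 10, by = 0.3))

# Log-prior: independent Normal prior on mu and Uniform prior on sigma
log_prior <- dnorm(pars$mu, mean = 18, sd = 5, log = TRUE) +
  dunif(pars$sigma, min = 0, max = 10, log = TRUE)

# Log-likelihood of all observations for each parameter combination
log_likelihood <- mapply(
  function(m, s) sum(dnorm(temp, mean = m, sd = s, log = TRUE)),
  pars$mu, pars$sigma
)

# Unnormalised log-posterior; subtract the maximum before exponentiating
# so that the largest term becomes exp(0) = 1
log_post <- log_prior + log_likelihood
pars$probability <- exp(log_post - max(log_post))
pars$probability <- pars$probability / sum(pars$probability)
```

On this data set the result should agree with the loop version up to floating-point error; the log-scale form mainly matters when the product of densities would become too small to represent.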

SLIDE 28

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 29

Replicate this analysis using zombie data!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 30

Answering the question: Should I have a beach party?

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Rasmus Bååth

Data Scientist

SLIDE 31

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The questions

What's the likely average water temperature on the 20th of July?
What's the probability that the water temperature is going to be 18 or more on the next 20th of July?

SLIDE 32

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The posterior distribution

pars
  mu  sigma  probability
17.5    1.9       0.0001
18.0    1.9       0.0003
18.5    1.9       0.0014
19.0    1.9       0.0043
19.5    1.9       0.0094
20.0    1.9       0.0142
20.5    1.9       0.0151
21.0    1.9       0.0112
21.5    1.9       0.0058
22.0    1.9       0.0021
 ...    ...          ...

sample_indices <- sample(1:nrow(pars), size = 10000, replace = TRUE, prob = pars$probability)

SLIDE 33

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

sample_indices <- sample(1:nrow(pars), size = 10000, replace = TRUE, prob = pars$probability)
head(sample_indices)
430 428 1010 383 343 385

pars_sample <- pars[sample_indices, c("mu", "sigma")]
head(pars_sample)
      mu  sigma
 1  20.0    2.8
 2  19.0    2.8
 3  17.5    6.7
 4  19.0    2.5
 5  21.5    2.2
 6  20.0    2.5
 7  20.0    2.8
 8  20.5    1.6
 9  19.0    2.5
10  17.0    4.0

SLIDE 34

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The probability distribution over the mean temperature

hist(pars_sample$mu, 30)

SLIDE 35

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

The probability distribution over the mean temperature

quantile(pars_sample$mu, c(0.05, 0.95))
  5%  95%
17.5 22.5

SLIDE 36

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Is the temperature 18 or above on the 20th?

pred_temp <- rnorm(10000, mean = , sd = )

SLIDE 37

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Is the temperature 18 or above on the 20th?

pred_temp <- rnorm(10000, mean = pars_sample$mu, sd = pars_sample$sigma)

SLIDE 38

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Is the temperature 18 or above on the 20th?

pred_temp <- rnorm(10000, mean = pars_sample$mu, sd = pars_sample$sigma)
hist(pred_temp, 30)

SLIDE 39

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Is the temperature 18 or above on the 20th?

pred_temp <- rnorm(10000, mean = pars_sample$mu, sd = pars_sample$sigma)
hist(pred_temp, 30)
sum(pred_temp >= 18) / length(pred_temp)
0.73
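The steps from the last few slides can be put together in one script. The following is a minimal end-to-end sketch under the same data, grid, and priors as the fitting slides. The seed is an arbitrary choice, sapply stands in for the slides' for loop, and the exact numbers will differ slightly from those shown on the slides because the sampling is random.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

temp <- c(19, 23, 20, 17, 23)
pars <- expand.grid(mu = seq(8, 30, by = 0.5),
                    sigma = seq(0.1, 10, by = 0.3))

# Grid approximation of the posterior, as on the model-fitting slides
pars$prior <- dnorm(pars$mu, mean = 18, sd = 5) *
  dunif(pars$sigma, min = 0, max = 10)
pars$likelihood <- sapply(seq_len(nrow(pars)), function(i) {
  prod(dnorm(temp, pars$mu[i], pars$sigma[i]))
})
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

# Draw parameter pairs in proportion to their posterior probability
sample_indices <- sample(seq_len(nrow(pars)), size = 10000,
                         replace = TRUE, prob = pars$probability)
pars_sample <- pars[sample_indices, c("mu", "sigma")]

# 90% credible interval for the mean temperature
mu_interval <- quantile(pars_sample$mu, c(0.05, 0.95))

# Posterior predictive draws: one simulated temperature per parameter pair
pred_temp <- rnorm(10000, mean = pars_sample$mu, sd = pars_sample$sigma)
prob_18_or_more <- mean(pred_temp >= 18)
```

Because each predictive draw uses its own sampled mu and sigma, pred_temp reflects both the day-to-day variation in temperature and the remaining uncertainty about the parameters.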

SLIDE 40

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 41

What about the IQ of zombies?

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 42

You've fitted a Bayesian Normal model!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Rasmus Bååth

Data Scientist

SLIDE 43

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

BEST

A Bayesian model developed by John Kruschke.
Assumes the data comes from a t-distribution.

SLIDE 44

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 45

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 46

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

BEST

A Bayesian model developed by John Kruschke.
Assumes the data comes from a t-distribution.
Estimates the mean, standard deviation and degrees-of-freedom parameter.

library(BEST)

Uses Markov chain Monte Carlo (MCMC).
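One way to see what the t-distribution assumption buys is to compare its tails with those of the Normal distribution. The sketch below uses base R only and does not call the BEST package. The evaluation point (4 standard units from the centre) and the degrees-of-freedom values 3 and 30 are arbitrary illustrative choices.

```r
# Point far out in the tail, in standard units
x <- 4

# Density at x under a standard Normal and under t-distributions
normal_density <- dnorm(x)
t_density_df3 <- dt(x, df = 3)
t_density_df30 <- dt(x, df = 30)

# Probability of a value at least as large as x
normal_tail <- pnorm(x, lower.tail = FALSE)
t_tail_df3 <- pt(x, df = 3, lower.tail = FALSE)
t_tail_df30 <- pt(x, df = 30, lower.tail = FALSE)
```

A small degrees-of-freedom parameter gives much heavier tails, so a single extreme observation pulls the estimated mean less than it would under a Normal model; as the degrees of freedom grow, the t-distribution approaches the Normal.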

SLIDE 47

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's use BEST!

library(BEST)
iq <- c(55, 44, 34, 18, 51, 40, 40, 49, 48, 46)

SLIDE 48

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's use BEST!

library(BEST)
iq <- c(55, 44, 34, 18, 51, 40, 40, 49, 48, 46)
fit <- BESTmcmc(iq)

SLIDE 49

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's use BEST!

library(BEST)
iq <- c(55, 44, 34, 18, 51, 40, 40, 49, 48, 46)
fit <- BESTmcmc(iq)
fit
MCMC fit results for BEST analysis:
        mean      sd  median   HDIlo  HDIup
mu     43.15   3.810   43.28  35.367  50.49
nu     27.42  26.647   18.91   1.001  81.59
sigma  11.00   3.754   10.44   4.857  18.38

SLIDE 50

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Let's use BEST!

library(BEST)
iq <- c(55, 44, 34, 18, 51, 40, 40, 49, 48, 46)
fit <- BESTmcmc(iq)
plot(fit)

SLIDE 51

Try out BEST yourself!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 52

What have you learned? What did we miss?

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Rasmus Bååth

Data Scientist

SLIDE 53

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

SLIDE 54

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

SLIDE 55

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

SLIDE 56

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

SLIDE 57

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

Computational methods:
Rejection sampling
Grid approximation
Markov chain Monte Carlo (MCMC)
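Grid approximation and MCMC both appear in code in this deck; rejection sampling does not. The following is a minimal sketch of rejection sampling, assuming the ad-click model from the earlier fitting slide (100 ads shown, 13 visitors, Uniform(0, 0.2) prior). The seed, the number of draws, and the variable names sim_visitors and posterior are illustrative choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
n_draws <- 100000
n_ads_shown <- 100
n_visitors <- 13

# 1. Draw candidate parameter values from the prior
proportion_clicks <- runif(n_draws, min = 0, max = 0.2)

# 2. Simulate data from the generative model for each candidate
sim_visitors <- rbinom(n_draws, size = n_ads_shown, prob = proportion_clicks)

# 3. Keep only the candidates whose simulated data match the observed data
posterior <- proportion_clicks[sim_visitors == n_visitors]

# The kept values are samples from the posterior distribution
posterior_mean <- mean(posterior)
acceptance_rate <- length(posterior) / n_draws
```

Most candidates are rejected, which is why this method becomes impractical as data sets grow and why grid approximation or MCMC is usually preferred.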

SLIDE 58

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

We have covered

Generative models:

SLIDE 59

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Working with samples representing probability distributions:

> head(sample)
   mu  sigma
39.39  10.18
39.39  21.77
40.90  20.26
45.45  13.20
34.84  12.70
40.90  12.70
pred_iq <- rnorm(10000, mean = sample$mu, sd = sample$sigma)
sum(pred_iq >= 60) / length(pred_iq)
0.0901

SLIDE 60

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Things we didn't cover

That a Bayesian approach can be used for much more than simple models.
How to decide what priors and models to use.
How Bayesian statistics relate to classical statistics.
More advanced computational methods.
More advanced computational tools.

SLIDE 61

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Things we didn't cover

SLIDE 62

Go explore Bayes!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

SLIDE 63

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

Bye and thanks!

SLIDE 64

Let's practice!

FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R