Bayesian Estimation & Information Theory
Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314), Spring 2016, lecture 18
Bayesian Estimation
three basic ingredients:
- 1. Likelihood
- 2. Prior
- 3. Loss function

The likelihood and prior jointly determine the posterior. The loss function gives the "cost" of making an estimate $\hat\theta$ if the true value is $\theta$. Together, the three ingredients fully specify how to generate an estimate from the data.
The Bayesian estimator is defined as:

$$\hat{\theta}(m) = \arg\min_{\hat{\theta}} \int L(\hat{\theta}, \theta)\, p(\theta \mid m)\, d\theta$$

where $L(\hat{\theta}, \theta)$ is the loss; the integral, the expected loss under the posterior, is known as the "Bayes' risk".
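This definition can be checked numerically. Below is a minimal sketch, assuming the posterior has already been discretized on a grid; the helper name `bayes_estimate` and the trapezoidal integration are illustrative choices, not from the lecture:

```python
import numpy as np

def bayes_estimate(theta_grid, posterior, loss):
    """Brute-force Bayesian estimator on a grid.

    For each candidate estimate, approximate the Bayes' risk (the
    expected loss under the posterior) by trapezoidal integration
    over theta, then return the risk-minimizing candidate.
    """
    risks = [np.trapz(loss(cand, theta_grid) * posterior, theta_grid)
             for cand in theta_grid]
    return theta_grid[np.argmin(risks)]
```

Restricting the candidate estimates to the grid itself keeps the sketch short; a finer search over $\hat\theta$ would be needed for high accuracy.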
Typical loss functions and Bayesian estimators

1. Squared-error loss: $L(\hat\theta, \theta) = (\hat\theta - \theta)^2$

We need to find the $\hat\theta$ minimizing the expected loss $\int (\hat\theta - \theta)^2\, p(\theta \mid m)\, d\theta$. Differentiate with respect to $\hat\theta$ and set to zero:

$$\int 2\,(\hat\theta - \theta)\, p(\theta \mid m)\, d\theta = 0 \quad\Longrightarrow\quad \hat\theta = \int \theta\, p(\theta \mid m)\, d\theta$$

This is the "posterior mean", also known as the Bayes' least squares (BLS) estimator.
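As a numerical check (a self-contained sketch with a made-up Gaussian posterior on a grid), brute-force minimization of the expected squared-error loss lands on the posterior mean:

```python
import numpy as np

theta = np.linspace(-5, 5, 1001)                   # grid over theta
post = np.exp(-0.5 * ((theta - 1.5) / 0.8) ** 2)   # toy posterior, peak at 1.5
post /= np.trapz(post, theta)                      # normalize to integrate to 1

# Bayes' risk of each candidate estimate under squared-error loss
risk = [np.trapz((c - theta) ** 2 * post, theta) for c in theta]

print(theta[np.argmin(risk)])           # brute-force minimizer: ~1.5
print(np.trapz(theta * post, theta))    # posterior mean: also ~1.5
```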
Typical loss functions and Bayesian estimators

2. "Zero-one" loss: $L(\hat\theta, \theta) = 1 - \delta(\hat\theta - \theta)$ (equal to 1 unless $\hat\theta = \theta$)

Expected loss: $\int \big(1 - \delta(\hat\theta - \theta)\big)\, p(\theta \mid m)\, d\theta = 1 - p(\hat\theta \mid m)$, which is minimized by:
- the posterior maximum (or "mode")
- known as the maximum a posteriori (MAP) estimate.
MAP vs. posterior mean estimate:

[Figure: a gamma pdf over $\theta$, with the mode (MAP estimate) and the mean (BLS estimate) marked at different locations.]

Note: the posterior maximum and mean are not always the same!
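A quick numerical illustration of this point (a sketch; the gamma shape and scale are made-up values, and scipy is used only for the pdf):

```python
import numpy as np
from scipy.stats import gamma

theta = np.linspace(1e-3, 10, 2001)
post = gamma.pdf(theta, a=3.0, scale=1.0)    # toy gamma "posterior"

theta_map = theta[np.argmax(post)]           # posterior mode: (a-1)*scale = 2
theta_bls = np.trapz(theta * post, theta)    # posterior mean: a*scale = 3 (approx.)
print(theta_map, theta_bls)                  # mode < mean for this skewed pdf
```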
Typical loss functions and Bayesian estimators

3. "L1" loss: $L(\hat\theta, \theta) = |\hat\theta - \theta|$

Expected loss: $\int |\hat\theta - \theta|\, p(\theta \mid m)\, d\theta$

HW problem: What is the Bayesian estimator for this loss function?
Simple Example: Gaussian noise & prior
- 1. Likelihood: additive Gaussian noise, $m = \theta + n$ with $n \sim \mathcal{N}(0, \sigma_{\text{noise}}^2)$
- 2. Prior: zero-mean Gaussian, $\theta \sim \mathcal{N}(0, \sigma_{\text{prior}}^2)$
- 3. Loss function: doesn't matter (all estimators agree here, since the posterior is Gaussian and its mean, mode, and median coincide)

The posterior distribution is Gaussian, with MAP estimate (equal to the posterior mean)
$$\hat\theta = \frac{\sigma_{\text{prior}}^2}{\sigma_{\text{prior}}^2 + \sigma_{\text{noise}}^2}\, m$$
and variance
$$\frac{\sigma_{\text{prior}}^2\, \sigma_{\text{noise}}^2}{\sigma_{\text{prior}}^2 + \sigma_{\text{noise}}^2}.$$
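A minimal sketch of these closed-form expressions with hypothetical numbers (sig_prior and sig_noise stand for the prior and noise standard deviations above):

```python
sig_noise = 1.0   # measurement noise std dev (hypothetical value)
sig_prior = 2.0   # prior std dev, zero-mean prior (hypothetical value)
m = 3.0           # observed measurement (hypothetical value)

# Posterior mean (= MAP, since the posterior is Gaussian): shrink m toward 0
w = sig_prior**2 / (sig_prior**2 + sig_noise**2)
theta_hat = w * m                                                       # 0.8 * 3 = 2.4
post_var = sig_prior**2 * sig_noise**2 / (sig_prior**2 + sig_noise**2)  # 0.8
print(theta_hat, post_var)
```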
Likelihood

[Figure: animation panels showing the likelihood $p(m \mid \theta)$ as a Gaussian bump over $\theta$, centered on the measurement $m$.]
Prior

[Figure: the zero-mean Gaussian prior $p(\theta)$, plotted over the same $\theta$ axis.]
Computing the posterior

posterior ∝ likelihood × prior

[Figure: the posterior, proportional to likelihood × prior, peaks at $m^*$, which is shifted from the measurement $m$ toward the prior; the size of the shift is the bias.]
Making a Bayesian estimate:

High measurement noise: large bias

[Figure: a broad likelihood (noisy measurement) multiplied by the prior gives a posterior pulled strongly toward the prior, i.e. a larger bias.]
Low measurement noise: small bias

[Figure: a narrow likelihood multiplied by the prior gives a posterior that stays close to $m$, i.e. a small bias.]
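Both regimes follow directly from the Gaussian shrinkage formula above; a sketch with hypothetical values:

```python
sig_prior, m = 2.0, 3.0                  # hypothetical prior std dev and measurement
for sig_noise in (0.5, 1.0, 2.0, 4.0):   # low -> high measurement noise
    theta_hat = sig_prior**2 / (sig_prior**2 + sig_noise**2) * m
    print(sig_noise, theta_hat, m - theta_hat)   # bias m - theta_hat grows with noise
```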
Bayesian Estimation:
- Likelihood and prior combine to form the posterior
- The Bayesian estimate is always biased toward the prior (relative to the ML estimate)
Application #1: Biases in Motion Perception

Which grating moves faster?
Explanation from Weiss, Simoncelli & Adelson (2002):

- In the limit of a zero-contrast grating, the likelihood becomes infinitely broad, so the percept goes to zero motion.

[Figure: two panels, each showing prior, likelihood, and posterior; the broader likelihood yields a posterior shifted farther toward zero.]

- Noisier measurements make the likelihood broader, so the posterior shifts farther toward 0 (the prior favors no motion); see the toy sketch below.
- Claim: this explains why people actually speed up when driving in fog!
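A toy numerical version of this account (all values hypothetical, including the assumption that likelihood width scales as 1/contrast; the actual model in Weiss, Simoncelli & Adelson (2002) is a full Bayesian observer, not this two-line shrinkage rule):

```python
sig_prior = 3.0     # prior std dev, centered on zero motion (hypothetical)
true_speed = 5.0    # actual stimulus speed (hypothetical)

for contrast in (1.0, 0.5, 0.1):
    sig_like = 1.0 / contrast    # toy assumption: noisier likelihood at low contrast
    perceived = sig_prior**2 / (sig_prior**2 + sig_like**2) * true_speed
    print(contrast, perceived)   # lower contrast -> slower perceived speed
```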
summary
- 3 ingredients for Bayesian estimation (prior, likelihood, loss)
- Bayes’ least squares (BLS) estimator (posterior mean)
- maximum a posteriori (MAP) estimator (posterior mode)
- accounts for stimulus-quality-dependent bias in motion perception