

SLIDE 1

Bayesian Estimation & Information Theory

Jonathan Pillow
Mathematical Tools for Neuroscience (NEU 314)
Spring 2016, Lecture 18

SLIDE 2

Bayesian Estimation

three basic ingredients:

  • 1. Likelihood
  • 2. Prior
  • 3. Loss function

The likelihood and prior jointly determine the posterior. The loss function $L(\hat\theta, \theta)$ gives the "cost" of making the estimate $\hat\theta$ if the true value is $\theta$. Together, the three ingredients fully specify how to generate an estimate from the data.

The Bayesian estimator is defined as:

$$\hat\theta(m) = \arg\min_{\hat\theta} \int L(\hat\theta, \theta)\, p(\theta \mid m)\, d\theta$$

The integral, the expected loss under the posterior, is known as the "Bayes' risk".
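A minimal numerical sketch of this definition, assuming a posterior already discretized on a grid (the grid, the example posterior, and all names here are illustrative, not from the slides):

```python
import numpy as np

def bayes_estimator(theta_grid, posterior, loss):
    """Estimate minimizing the expected posterior loss (Bayes' risk).

    theta_grid : 1D array of candidate parameter values
    posterior  : p(theta | m) on theta_grid, normalized to sum to 1
    loss       : vectorized function loss(theta_hat, theta) -> cost
    """
    # Bayes' risk for each candidate estimate theta_hat
    risk = [np.sum(loss(th, theta_grid) * posterior) for th in theta_grid]
    return theta_grid[np.argmin(risk)]

theta = np.linspace(0, 10, 1001)
post = np.exp(-0.5 * (theta - 4.0) ** 2)   # unnormalized Gaussian posterior
post /= post.sum()

# Squared-error loss: the minimizer approximates the posterior mean (~4.0)
print(bayes_estimator(theta, post, lambda th_hat, th: (th_hat - th) ** 2))
```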

SLIDE 3

Typical Loss functions and Bayesian estimators

1. Squared-error loss:

$$L(\hat\theta, \theta) = (\hat\theta - \theta)^2$$

We need to find the $\hat\theta$ minimizing the expected loss:

$$E[L] = \int (\hat\theta - \theta)^2\, p(\theta \mid m)\, d\theta$$

Differentiate with respect to $\hat\theta$ and set to zero:

$$\frac{\partial E[L]}{\partial \hat\theta} = 2 \int (\hat\theta - \theta)\, p(\theta \mid m)\, d\theta = 0 \quad\Longrightarrow\quad \hat\theta = \int \theta\, p(\theta \mid m)\, d\theta$$

This is the "posterior mean", also known as the Bayes' Least Squares (BLS) estimator.
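A quick grid check that the posterior mean is exactly the Bayes' risk minimizer for squared error (the example posterior is illustrative, not from the slides):

```python
import numpy as np

theta = np.linspace(0, 10, 1001)
post = np.exp(-0.5 * ((theta - 3.0) / 1.5) ** 2)  # unnormalized Gaussian posterior
post /= post.sum()

# Brute-force minimizer of the expected squared-error loss
risk = [np.sum((th - theta) ** 2 * post) for th in theta]
print(theta[np.argmin(risk)])   # ~3.0
print(np.sum(theta * post))     # posterior mean -- the same value
```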

SLIDE 4

Typical Loss functions and Bayesian estimators

2. "Zero-one" loss:

$$L(\hat\theta, \theta) = 1 - \delta(\hat\theta - \theta)$$

(the loss is 1 unless $\hat\theta = \theta$)

Expected loss:

$$E[L] = \int \left(1 - \delta(\hat\theta - \theta)\right) p(\theta \mid m)\, d\theta = 1 - p(\hat\theta \mid m),$$

which is minimized by:

  • the posterior maximum (or "mode")
  • known as the maximum a posteriori (MAP) estimate
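On a grid, zero-one loss reduces the estimator to an argmax over the posterior; a sketch with an illustrative skewed posterior:

```python
import numpy as np

theta = np.linspace(0.01, 20, 2000)
post = theta ** 2 * np.exp(-theta)   # unnormalized, gamma-shaped posterior
post /= post.sum()

theta_map = theta[np.argmax(post)]   # posterior mode = MAP estimate (~2.0)
theta_bls = np.sum(theta * post)     # posterior mean = BLS estimate (~3.0)
print(theta_map, theta_bls)          # the two estimators disagree here
```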

SLIDE 5

MAP vs. Posterior Mean estimate:

[figure: gamma pdf over θ, with the posterior maximum (MAP) and posterior mean marked]

Note: posterior maximum and mean are not always the same!
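This mismatch is easy to see analytically for a gamma pdf (the shape and scale values here are illustrative):

```python
from scipy.stats import gamma

a, scale = 3.0, 1.0                  # illustrative gamma shape and scale
mode = (a - 1) * scale               # analytic mode of the gamma pdf (valid for a >= 1)
mean = gamma(a, scale=scale).mean()  # analytic mean = a * scale
print(mode, mean)                    # 2.0 vs 3.0: MAP != posterior mean
```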

SLIDE 6

Typical Loss functions and Bayesian estimators

3. "L1" loss:

$$L(\hat\theta, \theta) = |\hat\theta - \theta|$$

Expected loss:

$$E[L] = \int |\hat\theta - \theta|\, p(\theta \mid m)\, d\theta$$

HW problem: What is the Bayesian estimator for this loss function?

SLIDE 7

Simple Example: Gaussian noise & prior

  • 1. Likelihood: additive Gaussian noise, $m = \theta + n$ with $n \sim \mathcal{N}(0, \sigma_n^2)$
  • 2. Prior: zero-mean Gaussian, $\theta \sim \mathcal{N}(0, \sigma_p^2)$
  • 3. Loss function: doesn't matter (all the standard estimators agree here, since the posterior is symmetric and unimodal)

Writing $\sigma_n^2$ for the noise variance and $\sigma_p^2$ for the prior variance, the posterior distribution is Gaussian,

$$p(\theta \mid m) = \mathcal{N}\!\left(\frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m,\ \frac{\sigma_p^2 \sigma_n^2}{\sigma_p^2 + \sigma_n^2}\right),$$

with MAP estimate (= posterior mean) $\hat\theta(m) = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_n^2}\, m$ and variance $\frac{\sigma_p^2 \sigma_n^2}{\sigma_p^2 + \sigma_n^2}$.
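A minimal numerical companion to these formulas (the function name and example values are illustrative):

```python
def gaussian_posterior(m, sigma_n, sigma_p):
    """Posterior mean and variance for m = theta + noise,
    noise ~ N(0, sigma_n^2), prior theta ~ N(0, sigma_p^2)."""
    w = sigma_p**2 / (sigma_p**2 + sigma_n**2)   # weight placed on the data
    post_mean = w * m                            # MAP = BLS estimate here
    post_var = sigma_p**2 * sigma_n**2 / (sigma_p**2 + sigma_n**2)
    return post_mean, post_var

print(gaussian_posterior(m=2.0, sigma_n=1.0, sigma_p=1.0))  # (1.0, 0.5)
```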

SLIDE 8

Likelihood

[figure: Gaussian likelihood $p(m \mid \theta)$, plotted over θ for a measurement m]


SLIDE 12

Prior

[figure: zero-mean Gaussian prior $p(\theta)$, plotted over θ]

SLIDE 13

Computing the posterior

posterior ∝ likelihood × prior

[figure: likelihood and prior multiplied pointwise over θ to give the posterior, for a fixed measurement m]
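The same product can be computed by brute force on a grid; a sketch assuming the Gaussian setup from the example above (all values illustrative):

```python
import numpy as np

theta = np.linspace(-5, 5, 1001)
m, sigma_n, sigma_p = 2.0, 1.0, 1.0                       # illustrative values

likelihood = np.exp(-0.5 * ((m - theta) / sigma_n) ** 2)  # p(m | theta), up to a constant
prior = np.exp(-0.5 * (theta / sigma_p) ** 2)             # zero-mean Gaussian prior
posterior = likelihood * prior
posterior /= posterior.sum()                              # normalize on the grid

print(theta[np.argmax(posterior)])  # MAP ~1.0: shifted from m = 2 toward the prior at 0
```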

SLIDE 14

Making a Bayesian Estimate

[figure: likelihood, prior, and posterior over θ; the estimate m* is shifted (biased) away from the measurement m, toward the prior]

SLIDE 15

High Measurement Noise: large bias

[figure: broad likelihood × prior → posterior shifted far toward the prior]

SLIDE 16

Low Measurement Noise: small bias

[figure: narrow likelihood × prior → posterior shifted only slightly toward the prior]

SLIDE 17

Bayesian Estimation:

  • Likelihood and prior combine to form the posterior
  • The Bayesian estimate is always biased toward the prior (relative to the ML estimate)
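In the Gaussian case this bias can be read directly off the posterior-mean formula; a sketch sweeping the noise level (measurement and widths illustrative):

```python
m, sigma_p = 2.0, 1.0                       # measurement and prior width
for sigma_n in [0.1, 1.0, 10.0]:            # low -> high measurement noise
    w = sigma_p**2 / (sigma_p**2 + sigma_n**2)
    estimate = w * m                        # posterior mean / MAP estimate
    print(sigma_n, estimate, m - estimate)  # bias toward the prior grows with noise
```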

SLIDE 18

Application #1: Biases in Motion Perception

[figure: two drifting gratings around a central fixation cross] Which grating moves faster?

slide-19
SLIDE 19

+ Which grating moves faster? Application #1: Biases in Motion Perception

slide-20
SLIDE 20

Explanation from Weiss, Simoncelli & Adelson (2002):

  • Lower contrast gives noisier measurements, so the likelihood is broader ⇒ the posterior has a larger shift toward 0 (the prior favors no motion).
  • In the limit of a zero-contrast grating, the likelihood becomes infinitely broad ⇒ the percept goes to zero motion.
  • Claim: explains why people actually speed up when driving in fog!

[figure: prior, likelihood, and posterior for the low- and high-contrast cases]
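A toy illustration of this account, not the actual Weiss et al. (2002) model: assume the measurement noise scales inversely with contrast and compute the posterior-mean speed under a zero-motion Gaussian prior. All numbers, and the 1/contrast scaling itself, are invented for illustration:

```python
def perceived_speed(v_true, contrast, sigma_p=1.0):
    """Toy Bayesian observer: posterior-mean speed under a zero-motion prior.
    Assumes noise sigma_n = 1/contrast (illustrative scaling only)."""
    sigma_n = 1.0 / contrast
    w = sigma_p**2 / (sigma_p**2 + sigma_n**2)   # shrinkage toward zero motion
    return w * v_true

for c in [1.0, 0.5, 0.1]:                        # high -> low contrast
    print(c, perceived_speed(v_true=5.0, contrast=c))  # percept slows as contrast drops
```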
SLIDE 21

summary

  • 3 ingredients for Bayesian estimation (prior, likelihood, loss)
  • Bayes' least squares (BLS) estimator (posterior mean)
  • maximum a posteriori (MAP) estimator (posterior mode)
  • accounts for stimulus-quality-dependent bias in motion perception (Weiss, Simoncelli & Adelson 2002)