From the Bayesian Brain to Active Inference...
...and the other way round.
Kai Ueltzhöffer, 9.10.2017
Disclaimer
- Today:
Overview Talk! 100% *NOT* my own work. But important to give some context and motivation for…
- Next week:
Mostly my own work (+ some basics) :-)
How do we perceive the world?
Senses: Vision, Hearing, Smell, Taste, Touch, Nociception, Interoception, Proprioception
A (possible) solution
Predictions & interaction
Senses: Vision, Hearing, Smell, Taste, Touch, Interoception, Proprioception
(Implicit) prior knowledge
Hermann von Helmholtz, “Handbuch der physiologischen Optik”, 1867
How to formalise such a theory?
- Probability theory allows us to make exact statements about uncertain information.
- Among other things, it gives a recipe to optimally combine a priori knowledge (a “prior”) with observations → Bayes’ Theorem
Bayes’ Theorem
Thomas Bayes, 1701-1761
$P(H \mid D)\, P(D) = P(H, D) = P(D \mid H)\, P(H) \;\Rightarrow\; P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}$
- P(H): “Prior” probability that hypothesis H about the world is true.
- P(D): Probability of observing the data D.
- P(D|H): Probability of observing D, given that hypothesis H is true → the “likelihood” function.
- P(H|D): Probability that hypothesis H is true, given that D was observed → the “posterior”.
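As a minimal numeric sketch of this recipe (all numbers are hypothetical), Bayes’ Theorem for a single binary hypothesis:

```python
# Bayes' Theorem for a binary hypothesis H and a single observation D.
# The probabilities below are made-up illustration values.

def posterior(prior_h, lik_d_given_h, lik_d_given_not_h):
    """P(H|D) = P(D|H) P(H) / P(D), with P(D) obtained by marginalization."""
    p_d = lik_d_given_h * prior_h + lik_d_given_not_h * (1.0 - prior_h)
    return lik_d_given_h * prior_h / p_d

# An a priori unlikely hypothesis...
p_h = 0.1
# ...becomes much more plausible after an observation that is four times
# more probable under H than under not-H.
p_h_given_d = posterior(p_h, lik_d_given_h=0.8, lik_d_given_not_h=0.2)
```

With these numbers the posterior rises to roughly 0.31: the data shift, but do not fully determine, the belief.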
A (possible) solution
Predictions & interaction
Senses: Vision, Hearing, Smell, Taste, Touch, Interoception, Proprioception
(Implicit) prior knowledge
In Bayesian terms: prior P(H), likelihood P(D|H), posterior P(H|D)
Hermann von Helmholtz, “Handbuch der physiologischen Optik”, 1867
Optimal perception with Bayes’ Theorem
“Tock, tock, tock, …“
- P(X): Prior probability of the hypothesis “The woodpecker* sits at position X”. A woodpecker should be somewhere close to the trunk of the tree.
- P(A|X): Probability of hearing “tock, tock, tock” from the left side of the tree, given the bird’s position is X. The likelihood function allows us to imagine the sensory consequences of hypotheses about the world.
- P(X|A): Posterior probability of the bird’s position X, given that the “tock, tock, tock” sound is heard at the left side of the tree.
Combined: $P(X \mid A) = \frac{P(A \mid X)\, P(X)}{P(A)}$
*woodpecker = Specht
“Tock, tock, tock, …“
- P(X|A): Posterior probability of the bird’s position X, given that the “tock, tock, tock” sound is heard at the left side of the tree.
- P(V|X): Probability of observing the woodpecker at the left side of the trunk, given its position X.
- P(X|A,V): Posterior probability of the bird’s position X, given auditory and visual information.
Combined: $P(X \mid A, V) = \frac{P(V \mid X)\, P(X \mid A)}{P(V \mid A)}$
Optimal perception with Bayes’ Theorem
Sounds reasonable, but might it be true?
[Figure: posterior densities for auditory-only, visual-only, and combined conditions, with varying offsets of the visual relative to the auditory information and decreasing accuracy of the visual information.]
Alais & Burr, The ventriloquist effect results from near-optimal bimodal integration, Curr. Biol., 2004
Ernst& Banks, Humans integrate visual and haptic information in a statistically optimal fashion, Nature, 2002
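In these studies, “statistically optimal” integration means precision-weighted averaging of Gaussian cues, which follows directly from Bayes’ Theorem. A sketch with hypothetical numbers (not the papers’ data):

```python
import numpy as np

def fuse(mu_a, sigma_a, mu_v, sigma_v):
    """Bayes-optimal fusion of two independent Gaussian cues:
    precision-weighted mean, and a variance below either single cue's."""
    prec_a, prec_v = 1.0 / sigma_a**2, 1.0 / sigma_v**2
    mu = (prec_a * mu_a + prec_v * mu_v) / (prec_a + prec_v)
    sigma = np.sqrt(1.0 / (prec_a + prec_v))
    return mu, sigma

# Imprecise auditory cue at -2.0 (left), precise visual cue at 0.0:
# the combined estimate is dominated by vision (the ventriloquist effect).
mu_c, sigma_c = fuse(mu_a=-2.0, sigma_a=2.0, mu_v=0.0, sigma_v=0.5)
```

Degrading the visual cue (larger sigma_v) shifts the combined estimate back towards the auditory cue, which is exactly the manipulation in the Alais & Burr and Ernst & Banks experiments.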
Sounds reasonable, but might it be true?
Adams, Graf & Ernst, Experience can change the ‘light-from-above’ prior, Nat. Neuroscience, 2004
Sounds reasonable, but might it be true?
- F. Petzschner, https://bitbucket.org/fpetzschner/cpc2016
How might Bayesian Inference be implemented in the Brain?*
- Dynamic
- Complex
- Hierarchically Structured
Friston, Phil. Trans. R. Soc. B, 2005
*Disclaimer: Now it gets speculative!
Some Assumptions about Model Structure
Generative Model: $p(o(t), x(t)) = p(o(t) \mid x(t))\, p(x(t))$
Observations:
- Vision: “A large pink thing in the shape of an elephant”
- Hearing: “Trooeeeet”
- Touch: The ground is vibrating.
Hypothesis: “A pink elephant is right in front of me.”
“Likelihood”: What would a pink elephant look like?
“Prior”: Pink elephants are not very common.
Some Assumptions about Model Structure
Hidden Variables: $x = \{\theta, s(t)\}$
- ”Parameters” $\theta$: encode slowly changing dependencies, physical laws, general rules.
- ”States” $s(t)$: encode hidden causes of observations on a fast timescale: object identities, positions, physical properties, …
Hierarchy: $p(\theta, s(t)) = p(s(t) \mid \theta)\, p(\theta)$
The parameters (general laws) govern how the hidden states of the world (which might have a hierarchy of their own) evolve.
Factorization: $p(o(t) \mid \theta, s(t' \le t)) = p(o(t) \mid \theta, s(t))$
My sensory input right now depends only on the general laws of the world and the state of the world right now.
Three very hard problems:
- 1. Perception: Invert the generative model
- 2. Learning: Optimize the generative model
- 3. Action: Optimize behavior (later)
Invert the generative model using Bayes’ Theorem:
$p(s(t) \mid o(t)) = \frac{p(o(t) \mid s(t))\, p(s(t))}{p(o(t))}$
Problem 1: Perception (Inference on States)
Observations:
- Vision: “A large pink thing in the shape of an elephant”
- Hearing: A loud trumpet.
- Touch: The ground is vibrating.
“Maybe there really is a pink elephant right in front of me.” But it is not very likely to make such observations:
$p(o(t)) = \int p(o(t) \mid s(t), \theta)\, p(s(t) \mid \theta)\, p(\theta)\, \mathrm{d}s(t)\, \mathrm{d}\theta$
“Likelihood”: What would a pink elephant look like?
$p(o(t) \mid s(t)) = \int p(o(t) \mid s(t), \theta)\, p(\theta)\, \mathrm{d}\theta$
“Prior”: Pink elephants are not very common.
$p(s(t)) = \int p(s(t) \mid \theta)\, p(\theta)\, \mathrm{d}\theta$
Buuuuut: Extremely high-dimensional integrals! Not even highly parallel computational architectures, such as the brain, can solve these exactly.
Problem 2: Learning (Inference on Parameters)
Given some observations $o(t_1), \ldots, o(t_n)$ at times $t_1 < t_2 < \cdots < t_n$, use Bayes’ Theorem to update the parameters $\theta$:
$p(\theta \mid o(t_1), \ldots, o(t_n)) = \frac{p(o(t_1), \ldots, o(t_n) \mid \theta)\, p(\theta)}{p(o(t_1), \ldots, o(t_n))}$
“Now that I’ve seen a pink elephant, maybe they are not that unlikely after all…”
$p(o(t_1), \ldots, o(t_n) \mid \theta) = \int p(o(t_1), \ldots, o(t_n), s(t_1), \ldots, s(t_n) \mid \theta)\, \mathrm{d}s(t_1) \cdots \mathrm{d}s(t_n)$
Buuuuuut (again):
$p(o(t_1), \ldots, o(t_n)) = \int p(o(t_1), \ldots, o(t_n), s(t_1), \ldots, s(t_n), \theta)\, \mathrm{d}s(t_1) \cdots \mathrm{d}s(t_n)\, \mathrm{d}\theta$
Extremely high-dimensional integrals! Not even highly parallel computational architectures, such as the brain, can solve these.
In „real time“ the agent could update its parameters recursively:
$p(\theta \mid o(t_1), \ldots, o(t_n)) = \frac{p(o(t_n) \mid \theta, o(t_1), \ldots, o(t_{n-1}))\, p(\theta \mid o(t_1), \ldots, o(t_{n-1}))}{p(o(t_n) \mid o(t_1), \ldots, o(t_{n-1}))}$
This leads to comparatively „slow” update dynamics, compared to the dynamics of the hidden states, which might change completely according to the current observation.
Timescale of Perception
Given observations $o(t_1), \ldots, o(t_n)$ at times $t_1 < t_2 < \cdots < t_n$, the posterior probability of the state $s(t_n)$ at time $t_n$,
$p(s(t_n) \mid o(t_1), \ldots, o(t_n)) = p(s(t_n) \mid o(t_n)),$
only depends on the current observation $o(t_n)$ at this time, and on the time-invariant parameters $\theta$. I.e. as the state of the world changes very quickly (e.g. a tiger jumping into your field of view), the dynamics of the representation of the corresponding posterior distribution over states $s(t)$ are also very fast.
Timescale of Learning
As the agent makes observations $o(t_1), \ldots, o(t_n)$ at times $t_1 < t_2 < \cdots < t_n$, the posterior probability of the parameters, given the observations, gets a Bayesian update
$p(\theta \mid o(t_1), \ldots, o(t_n)) = \frac{p(o(t_n) \mid \theta, o(t_1), \ldots, o(t_{n-1}))\, p(\theta \mid o(t_1), \ldots, o(t_{n-1}))}{p(o(t_n) \mid o(t_1), \ldots, o(t_{n-1}))}$
for each new observation, here shown for the last observation at $t_n$. The more observations the agent has made before, the more constrained its estimate $p(\theta \mid o(t_1), \ldots, o(t_{n-1}))$ of the true parameters $\theta$ already is. I.e. while the representation of the posterior density over parameters, given observations, might initially change rather quickly, its dynamics will slow down the more the agent sees, and therefore learns, from its environment. Later on, strong evidence or many observations are required for large changes in the parameter estimates. Thus, the dynamics of the representation of the posterior density over the parameters will be rather slow.
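This slowing of parameter dynamics can be seen in a toy conjugate-Gaussian model (my own illustrative setup, not from the talk): each observation updates a Gaussian posterior over a single parameter, and the size of the update shrinks as evidence accumulates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: observations o_n ~ N(theta, sigma_o^2), Gaussian prior on theta.
theta_true, sigma_o = 1.0, 1.0
mu, var = 0.0, 10.0  # broad initial prior over the parameter

shifts = []
for _ in range(50):
    o = rng.normal(theta_true, sigma_o)
    # Conjugate Bayesian update: precisions add, means are precision-weighted.
    prec = 1.0 / var + 1.0 / sigma_o**2
    mu_new = (mu / var + o / sigma_o**2) / prec
    shifts.append(abs(mu_new - mu))
    mu, var = mu_new, 1.0 / prec

# Early observations move the estimate a lot, later ones barely at all:
# the posterior over the parameter becomes ever more constrained.
```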
(A possible) solution: Variational Inference*
Recipe:
- Given observations $o = \{o(t_1), \ldots, o(t_n)\}$ and a generative model $p(o, \theta, s) = p(o \mid \theta, s)\, p(\theta, s)$, where $s = \{s(t_1), \ldots, s(t_n)\}$.
- Introduce an approximation $r_\nu(\theta, s)$ to the posterior density $p(\theta, s \mid o)$, parameterized by sufficient statistics $\nu = \{\nu_\theta, \nu_{s(t_1)}, \ldots, \nu_{s(t_n)}\}$.
- Minimize the variational free energy
$F(o, \nu) = -\ln p(o) + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s \mid o)\,]$
The KL divergence is always ≥ 0, and equal to 0 if and only if both distributions are equal. (But it is not symmetric!)
- This will maximize the evidence $p(o)$ of the agent‘s model of the world, while simultaneously driving $r_\nu(\theta, s)$ towards the true posterior $p(\theta, s \mid o)$. It converts a complex integration into an optimization problem. (Feynman, 1972)
*Disclaimer: Will be combined with Monte-Carlo sampling next week! :-)
Short interrupt: KL-Divergence
$D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s \mid o)\,] = \left\langle \ln \frac{r_\nu(\theta, s)}{p(\theta, s \mid o)} \right\rangle_{r_\nu(\theta, s)}$
(the expectation is taken with respect to $r_\nu(\theta, s)$)
It is really easy to evaluate for Gaussians:
$D_{\mathrm{KL}}[\, \mathcal{N}(x; \mu_1, \sigma_1)\, \|\, \mathcal{N}(x; \mu_2, \sigma_2)\,] = \left\langle \ln \frac{\mathcal{N}(x; \mu_1, \sigma_1)}{\mathcal{N}(x; \mu_2, \sigma_2)} \right\rangle_{\mathcal{N}(x; \mu_1, \sigma_1)} = \ln \frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2 \sigma_2^2} - \frac{1}{2}$
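The closed-form Gaussian expression can be checked against a brute-force Monte-Carlo estimate of the expectation (a quick sanity check, not part of the talk):

```python
import numpy as np

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL divergence between two univariate Gaussians."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2.0 * s2**2) - 0.5

mu1, s1, mu2, s2 = 0.0, 1.0, 1.0, 2.0

# Monte-Carlo estimate of <ln N1(x) - ln N2(x)> under x ~ N1.
rng = np.random.default_rng(1)
x = rng.normal(mu1, s1, 200_000)
log_n1 = -0.5 * np.log(2 * np.pi * s1**2) - (x - mu1) ** 2 / (2 * s1**2)
log_n2 = -0.5 * np.log(2 * np.pi * s2**2) - (x - mu2) ** 2 / (2 * s2**2)
kl_mc = np.mean(log_n1 - log_n2)
```

The divergence is zero only for identical Gaussians, and swapping the two arguments changes its value: it is not symmetric.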
What have we won?
To minimize, we have to evaluate the variational free energy
$F(o, \nu) = -\ln p(o) + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s \mid o)\,]$
How hard to evaluate: 🐷 🐎* 🐊
It can be rewritten as
$F(o, \nu) = \langle -\ln p(o \mid \theta, s) \rangle_{r_\nu(\theta, s)} + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s)\,]$
“Accuracy” and “Complexity”. How hard to evaluate: 🐷 🐤
(The term we got rid of, $p(\theta, s \mid o)$, is just the posterior that we want to approximate!)
Or (for physicists):
$F(o, \nu) = \langle -\ln p(o, \theta, s) \rangle_{r_\nu(\theta, s)} - \langle -\ln r_\nu(\theta, s) \rangle_{r_\nu(\theta, s)}$
Expected Energy minus Entropy. How hard to evaluate: 🐩 🐺
*illustration of variational inference with emojis from: http://www.inference.vc/choice-of-recognition-models-in-vaes-a-regularisation-view/
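That the three forms really are the same quantity is easy to verify numerically in a tiny discrete model (my own sanity check, with made-up numbers):

```python
import numpy as np

# Two hidden states s, one fixed observation o; parameters omitted for brevity.
prior = np.array([0.7, 0.3])     # p(s)
lik = np.array([0.1, 0.8])       # p(o | s) for the observed o
evidence = np.sum(lik * prior)   # p(o)
post = lik * prior / evidence    # p(s | o)

r = np.array([0.5, 0.5])         # some approximate posterior r(s)
kl = lambda a, b: np.sum(a * np.log(a / b))

# Form 1: F = -ln p(o) + KL[r || p(s|o)]
f1 = -np.log(evidence) + kl(r, post)
# Form 2: F = <-ln p(o|s)>_r + KL[r || p(s)]   ("accuracy" + "complexity")
f2 = np.sum(r * -np.log(lik)) + kl(r, prior)
# Form 3: F = <-ln p(o,s)>_r - <-ln r(s)>_r    (energy - entropy)
f3 = np.sum(r * -np.log(lik * prior)) - np.sum(r * -np.log(r))
```

All three evaluate to the same number, and since the KL term in form 1 is non-negative, F is always an upper bound on the surprise −ln p(o).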
Predictive Coding
Assume the simplest possible way of minimizing F: gradient descent. The sufficient statistics $\nu_\theta$ and $\nu_s$ change to minimize the free energy $F(\nu_\theta, \nu_s, o)$ via:
$\dot{\nu}_\theta \propto -\alpha\, \partial_{\nu_\theta} F(\nu_\theta, \nu_s, o)$
$\dot{\nu}_s \propto -\alpha\, \partial_{\nu_s} F(\nu_\theta, \nu_s, o)$
The dynamics of the sufficient statistics $\nu_\theta$ of the approximate posterior density over parameters $\theta$ of the generative model are very slow:
→ $\nu_\theta$ can be represented in terms of synaptic connectivity.
The dynamics of the sufficient statistics $\nu_s$ of the approximate posterior density over hidden states $s$ are fast:
→ $\nu_s$ can be represented in terms of neural activity.
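A minimal single-level sketch of this gradient descent (in the spirit of standard predictive-coding tutorials; a point-estimate, Laplace-style simplification with hypothetical numbers): the estimate of the hidden state descends the free-energy gradient, which is just a sum of precision-weighted prediction errors, and settles at the Bayes-optimal combination of prior and observation.

```python
# Generative model: s ~ N(mu_prior, sig_prior^2), o ~ N(s, sig_obs^2).
# Gradient descent on a point estimate nu of the hidden state s.
mu_prior, sig_prior = 0.0, 1.0
o, sig_obs = 2.0, 1.0

nu, alpha = 0.0, 0.1
for _ in range(1000):
    eps_prior = (mu_prior - nu) / sig_prior**2  # prior prediction error
    eps_obs = (o - nu) / sig_obs**2             # sensory prediction error
    nu += alpha * (eps_prior + eps_obs)         # nu_dot proportional to -dF/dnu

# Fixed point: the precision-weighted combination of prior and observation.
expected = (mu_prior / sig_prior**2 + o / sig_obs**2) / (
    1.0 / sig_prior**2 + 1.0 / sig_obs**2
)
```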
Predictive Coding:
Additional assumptions about the structure and implementation of the states $s(t)$:
- Probabilities are represented by Gaussians, whose sufficient statistics $\nu_{s(t)}$ and $\nu_\theta$ represent means and covariance matrices. Inverse covariance matrix = “precision”.
- Hierarchical temporal structure of the states $s(t)$.
[Figure: predictive-coding hierarchy. Predictions are passed down from prefrontal cortex via secondary and primary sensory cortex towards the sensory input; prediction errors are passed back up at every level. Recurrent neural dynamics at each level can implement attractor networks, winner-take-all networks, winnerless competition, …]
c.f. Friston, Phil. Trans. R. Soc. B, 2005
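The message passing in such a hierarchy can be sketched with a hypothetical two-level linear toy model (identity predictions, made-up precisions): each level sends a prediction down, receives a prediction error from below, and all estimates descend the shared free-energy gradient.

```python
# Two-level toy hierarchy: x2 ~ N(mu_top, sig2^2), x1 ~ N(x2, sig1^2),
# o ~ N(x1, sig0^2). Point estimates nu1, nu2 follow -dF by gradient descent.
mu_top, sig2, sig1, sig0 = 0.0, 1.0, 1.0, 0.5
o = 3.0  # a surprising sensory input

nu1, nu2, alpha = 0.0, 0.0, 0.05
for _ in range(5000):
    e0 = (o - nu1) / sig0**2       # sensory prediction error (bottom level)
    e1 = (nu1 - nu2) / sig1**2     # error of the level-2 prediction of nu1
    e2 = (nu2 - mu_top) / sig2**2  # error of the top-level prior prediction
    nu1 += alpha * (e0 - e1)       # pushed up by input, held down by prediction
    nu2 += alpha * (e1 - e2)       # adjusts to partially explain away e1
```

At the fixed point the prediction errors balance: each level’s estimate is a compromise between the evidence ascending from below and the prior descending from above.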
Reality check
Adams et al., The Computational Anatomy of Psychosis, Front. Psychiatry, 2014
Bendixen et al., Prediction in the service of comprehension: Modulated early brain responses to omitted speech segments, Cortex, 2014
Friston & Kiebel, Attractors in Song, New Mathematics and Natural Computation, 2009
[Figure: P300 and mismatch responses to standard vs. deviant stimuli, and their difference. Nagai et al., Front. Psychiatry, 2013; Zevin et al., Front. Hum. Neurosci., 2010]
Reality check
Predictive Coding Summary
- Our brain uses a variational approximation to
invert and optimize a generative model of its sensations.
- The model corresponds to the world, i.e. it is
nonlinear, dynamic and hierarchically structured.
- The posterior on states is represented by means of
neural activity, the posterior on parameters is represented by means of synaptic connectivity.
- Using simple assumptions about the hierarchical form, the distributions (Gaussians), and the optimization (gradient descent), the resulting predictive coding scheme matches cortical hierarchies, behavioral data, and neurophysiological responses, such as repetition suppression, omission responses, and mismatch negativity.
Bastos et al., Canonical Microcircuits for Predictive Coding, Neuron, 2012
Active Inference: Predictive Coding with Reflex Arcs
Friston, Daunizeau, Kilner, Kiebel, Action and behavior: a free-energy formulation, Biol. Cybernetics, 2011
Corresponds nicely to the architecture of our motor system.
Remember the following form of the variational free energy:
$F(o, \nu) = \langle -\ln p(o \mid \theta, s) \rangle_{r_\nu(\theta, s)} + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s)\,]$
How hard to evaluate:
🐷 🐤 🐺
“Accuracy” and “Complexity”
The accuracy term depends on observations, which in turn depend on the current, true state of the world, which again depends on the agent’s actions.
How to formulate this? By choosing actions $a(t)$, in terms of the states of its output organs (muscles, mainly…), to minimize the variational free energy, the agent will seek out sensations that are likely under its generative model of the world and its current beliefs about the state of the world.
Summary: Active Inference
The sufficient statistics
- $\nu_\theta$ of the parameters of the generative model,
- $\nu_s$ of the hidden states of the world,
- $\nu_a$ of the states of the agent’s effector organs
all change to minimize the variational free energy:
$\{\nu_\theta, \nu_s, \nu_a\} = \operatorname{argmin}_{\nu_\theta^*,\, \nu_s^*,\, \nu_a^*} F(o(\nu_a^*), \nu_\theta^*, \nu_s^*)$
where $F(o, \nu) = \langle -\ln p(o \mid \theta, s) \rangle_{r_\nu(\theta, s)} + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s)\,]$
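As a toy illustration (my own construction, not from the slides): assume the agent’s action directly shifts the observed quantity, o = x0 + a, and the agent holds a very precise prior that the corresponding state sits at a setpoint. Descending the free-energy gradient with respect to both the belief and the action then drags the actual observation towards the preferred value.

```python
# Active-inference toy: the action a shifts the observation, o = x0 + a.
# A tight prior "the state equals mu_pref" makes action fulfil the
# prediction, rather than perception giving in to the data.
x0 = 5.0                       # initial state of the world (e.g. too warm)
mu_pref, sig_prior = 0.0, 0.1  # very precise prior, i.e. a setpoint
sig_obs = 1.0

nu, a, alpha = 0.0, 0.0, 0.005
for _ in range(20000):
    o = x0 + a
    e_obs = (o - nu) / sig_obs**2            # sensory prediction error
    e_prior = (nu - mu_pref) / sig_prior**2  # deviation from the setpoint prior
    nu += alpha * (e_obs - e_prior)          # perception: update the belief
    a += alpha * (-e_obs)                    # action: change o to match belief

o = x0 + a  # final observation after acting
```

Because the prior is far more precise than the likelihood, the prediction error is resolved by acting on the world rather than by revising the belief, which is the basic logic of reflex arcs in active inference.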
Some preliminary thoughts…
- Right now Active Inference gives an abstract
account of the hierarchical architecture of the cortex, the basic architecture of the motor system, perceptual phenomena, and macroscopic neural responses.
- But we used a looooooong list of assumptions and seemingly counter-intuitive arguments, e.g.:
Does this view of action not imply that I should retire to a dark room and turn off the light? I would be able to exactly predict my sensory input and all would be fine. Well, … at some point, you would get thirsty.
Evolutionary Argument
- In order to survive, an agent has to keep certain inner parameters within very
strict bounds.
- Thus, it has to constrain the entropy of the probability distributions over these
parameters.
- But entropy is just:
$H(S) = \langle -\ln p(s) \rangle_{p(s)}$
- Assuming we have sensory systems that give us access to the relevant parameters (glomus caroticus, osmoreceptors in the hypothalamus, macula densa, …), this can be upper-bounded by: $H(S) \le H(O) + \text{const.}$
- Where, assuming ergodicity,
$H(O) = \langle -\ln p(o) \rangle_{p(o)} = \lim_{T \to \infty} \frac{1}{T} \int_0^T -\ln p(o(t))\, \mathrm{d}t$
The agent can keep its physiological variables within viable bounds by minimizing sensory surprise at all times (Euler-Lagrange equation).
Closing the circle…
Variational Free Energy is just:
$F(o, \nu) = -\ln p(o) + \underbrace{D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s \mid o)\,]}_{\ge 0}$
How hard to evaluate: 🐷 🐎 🐊
- By minimizing Free Energy using action, an agent upper-bounds its sensory surprise.
- Thereby, it can counteract the dispersive effects of the environment, to keep its physiological variables (e.g. its inner milieu) within viable bounds.
- So the Bayes-optimal learning and perception that we started with is only a by-product, required to make the Free Energy, which can be evaluated and influenced by the agent, a tight bound on sensory surprise, allowing for the agent’s survival.
Closing the circle…
Variational Free Energy is just:
$F(o, \nu) = -\ln p(o) + \underbrace{D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s \mid o)\,]}_{\ge 0}$
$= \langle -\ln p(o \mid \theta, s) \rangle_{r_\nu(\theta, s)} + D_{\mathrm{KL}}[\, r_\nu(\theta, s)\, \|\, p(\theta, s)\,]$
$= \langle -\ln p(o, \theta, s) \rangle_{r_\nu(\theta, s)} - \langle -\ln r_\nu(\theta, s) \rangle_{r_\nu(\theta, s)}$
$= \langle -\ln p(o \mid \theta, s)\, p(\theta, s) \rangle_{r_\nu(\theta, s)} - \langle -\ln r_\nu(\theta, s) \rangle_{r_\nu(\theta, s)}$
“Goals” or “utility” can be encoded in terms of prior expectations on the states to be in, $p(\theta, s)$: states to be highly frequented are associated with “high reward”. → Next Week
Maximizing the entropy of the variational density → keeping your options open, novelty seeking, curiosity.
Some First Evidence
Schwartenbeck et al., Evidence for surprise minimization over value maximization in choice behavior, Scientific Reports, 2015