Bayesian Updating
Peter Bossaerts, Caltech

8/3/12

Goals
- Relation With Reinforcement Learning
- To highlight core characteristics of Bayesian updating:
  1. Optimal Integration of Prior belief and Evidence (via Likelihood)
  2. Optimality: Martingales
  3. Model-Based Learning Approach
  4. Integration of Hypotheses ("Marginalization")
  5. Polyvalent Uncertainty
- Humans Are Not Bayesians?
- Monty Hall

Reinforcement Learning
- Most of the examples in psychology/neuroscience are about the formation of beliefs about events/stimuli that have a (fixed) affective value (reward/loss).
- In such a context, psychologists/neuroscientists usually talk about reinforcement learning.
- One distinguishes two types (Daw, Niv, Dayan 2005):
  - Model-free:
    - Pure Pavlovian: TD learning (see before)
    - Instrumental: Q learning (Watkins-Dayan 1992)
  - Model-based:
    - Example: Bayesian learning
- I will casually talk about model-free learning as "reinforcement learning" (RL) while identifying model-based learning as "Bayesian."

1. Integration of Prior belief and Evidence (via Likelihood)
- Posterior ∝ Prior * Likelihood (an equality once the right-hand side is normalized to sum to 1 across hypotheses)
- (Compare to Prediction Error based learning: New Belief = Old Belief + Learning Rate * Prediction Error)
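The two update rules on this slide can be put side by side in a minimal sketch. The function names and all numbers below are hypothetical and chosen only for illustration; nothing here comes from the studies cited in these slides.

```python
def bayes_update(prior, likelihoods):
    """Return the posterior over hypotheses: prior * likelihood, renormalized."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def delta_rule_update(old_belief, outcome, learning_rate):
    """Prediction-error update: new = old + learning_rate * (outcome - old)."""
    prediction_error = outcome - old_belief
    return old_belief + learning_rate * prediction_error


# Hypothetical example: two hypotheses with equal prior; the observed datum
# is three times as likely under the first hypothesis as under the second.
posterior = bayes_update(prior=[0.5, 0.5], likelihoods=[0.6, 0.2])

# Hypothetical prediction-error learner: belief 0.5, outcome 1, learning rate 0.1.
new_belief = delta_rule_update(old_belief=0.5, outcome=1.0, learning_rate=0.1)
```

Note the difference in what each rule needs: the Bayesian update requires a likelihood for every hypothesis, whereas the prediction-error update needs only the outcome and a learning rate.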

Sensorimotor Learning Example (Körding/Wolpert, Nature 2004)
- Prior: unobserved lateral shift
- Noisy observation
[Figure]

Results
[Figure]

Results (cont'd)
[Figure]
(Note: Posterior MEAN only; recent evidence: trade-off posterior mean-variance)

Drawing from an Urn: Conservatism
- Bet whether right urn was selected…
[Figure, panels a and b: Observed beliefs against Bayesian beliefs (0.0 to 1.0), and belief Update after "See Orange Ball" and "See Green Ball"; series: Observed, Bayesian, Robust Bayesian]
(D'Acremont et al., under review)

2. Bayesian Beliefs Form A Martingale
- What is a martingale? E[X(t+1) | Past Data] = X(t).
- "One cannot predict direction+magnitude of changes in X."
- (Still possible: predict E[(X(t+1)-X(t))^2 | Past Data]!)
- Fundamental concept in stochastic process theory (and mathematical finance)

Doob's Lemma
- Bayesian beliefs form a martingale.
- That is: E[Posterior(outcome) | Past Data] = Prior(outcome).
- Intuition: If this were violated, one could predict changes in one's own beliefs, which means that one's own beliefs have not been updated "enough."
- This is the essence of "rational learning."
- Remarks:
  - Martingale Convergence Theorem: Bayesian beliefs are expected to converge.
  - When beliefs are a martingale, updates "maximize surprise," and hence beliefs incorporate as much information as possible (information theory).
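The martingale property can be checked exactly in a small two-urn setting. The sketch below is an illustration under assumed numbers: the urn compositions and the prior are hypothetical, and only the ball colours (orange, green) echo the urn task mentioned in these slides.

```python
def posterior_urn_a(prior_a, p_orange_a, p_orange_b, saw_orange):
    """Posterior probability of urn A after one draw (Bayes' rule)."""
    like_a = p_orange_a if saw_orange else 1.0 - p_orange_a
    like_b = p_orange_b if saw_orange else 1.0 - p_orange_b
    return prior_a * like_a / (prior_a * like_a + (1.0 - prior_a) * like_b)


def expected_posterior(prior_a, p_orange_a, p_orange_b):
    """Expected posterior under the learner's own predictive distribution."""
    # Probability the learner assigns to drawing an orange ball next
    p_orange = prior_a * p_orange_a + (1.0 - prior_a) * p_orange_b
    post_orange = posterior_urn_a(prior_a, p_orange_a, p_orange_b, True)
    post_green = posterior_urn_a(prior_a, p_orange_a, p_orange_b, False)
    return p_orange * post_orange + (1.0 - p_orange) * post_green


# Hypothetical setup: urn A holds 70% orange balls, urn B holds 40%.
prior = 0.3
after_orange = posterior_urn_a(prior, 0.7, 0.4, True)
after_green = posterior_urn_a(prior, 0.7, 0.4, False)
mean_posterior = expected_posterior(prior, 0.7, 0.4)
```

Each individual draw moves the belief (up after orange, down after green), yet the probability-weighted average of the two possible posteriors equals the prior, which is the statement E[Posterior | Past Data] = Prior.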

Why are Bayesian beliefs a martingale?
- Because Bayesians update based on the likelihood (ratio): likelihood of observed data ("stimulus/signal") given one hypothesis compared to likelihood of observed data given alternatives.
- (Contrast this with standard prediction-error based learning schemes like Rescorla-Wagner, which are based on: PE = Outcome - Prediction)

Still, prediction-error learning models can be made to "emulate" Bayesian learning
- Nicest example (I think): Sutton 1992.
- He sets the learning rate ("gain") such that one expects to minimize the size of the subsequent prediction errors.
- Sutton proves that this is the same as to minimize the correlation (over time) of the prediction error.
  - If prediction errors are positively correlated, one's learning rate is TOO LOW;
  - If negatively correlated, the learning rate is TOO HIGH.
- If predictions form a martingale, changes in predictions are uncorrelated.
- So, Sutton attempts to generate a martingale…
- (Sutton's algorithm works very well!!)
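The link between the learning rate and the sign of the error correlation can be seen in a small simulation. This is not Sutton's algorithm: it is a fixed-gain delta rule tracking a hypothetical random walk observed with noise (both with unit variance, an assumption made here), run once with a low gain and once with a high gain.

```python
import random


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def prediction_errors(learning_rate, n_steps=20000, seed=0):
    """Track a noisy random walk with a fixed-gain delta rule; return the errors."""
    rng = random.Random(seed)
    state, prediction, errors = 0.0, 0.0, []
    for _ in range(n_steps):
        state += rng.gauss(0.0, 1.0)            # hidden quantity drifts
        outcome = state + rng.gauss(0.0, 1.0)   # noisy observation of it
        error = outcome - prediction            # PE = Outcome - Prediction
        prediction += learning_rate * error     # delta-rule update
        errors.append(error)
    return errors


corr_low_gain = lag1_autocorrelation(prediction_errors(0.1))
corr_high_gain = lag1_autocorrelation(prediction_errors(0.95))
```

With the low gain the learner lags behind the drift, so successive errors share a sign and `corr_low_gain` comes out positive; with the high gain the learner chases observation noise and `corr_high_gain` comes out negative. A scheme that adjusts the gain until this correlation vanishes is, in this sense, pushing the predictions toward a martingale.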

Back to Urn Betting…
- Martingale test accepted…
[Figure, panels a and b: Update and Covariance against Sample Size (2 to 10)]

…despite conservatism
- …because participants used a robust prior, not the "true" (announced) prior, unlike in Körding-Wolpert.
- (Robust prior: mixtures-of-binomials)
[Figure: prior Density against Probability (0.0 to 1.0); curves: High range, Low range; annotations: Expected prior, More conservatism, Less conservatism]

Remarks
- Truth is more complicated: Bayesian beliefs are a martingale only from the perspective of the learner.
- Specifically, they may not be a martingale from the perspective of an observer who knows more (e.g., which urn is more likely to be correct?)
- Doob's result can be extended (Bossaerts, REStud 2004)...

Neurobiological basis?
- Yang-Shadlen (Nature, 2007): recordings in monkey parietal cortex show updating based on likelihood ratio
- (In their task, information is not I.I.D. conditional on correct target location.)
[Figure, panel a: trial timeline (0 ms: targets and 1st shape on; 500 ms: 2nd shape on; 1,000 ms: 3rd shape on; 1,500 ms: 4th shape on; 2,000 ms: shapes off; 2,400 to 2,600 ms: fixation off, saccade) and the weights assigned to the shapes, from +∞ (favouring red) through 0.9, 0.7, 0.5, 0.3, -0.3, -0.5, -0.7, -0.9 to -∞ (favouring green)]
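Updating on likelihood ratios becomes simple addition on a log scale. The sketch below assumes that the shape weights in the figure act as log10 likelihood ratios (bans) in favour of red, that they add across shapes, and that the prior odds are even. These are idealizing assumptions made here, not statements about the actual task design, and the shape sequence is hypothetical.

```python
def probability_red(shape_weights):
    """Convert summed log10 likelihood ratios (bans) into P(red), even prior odds."""
    total_log_lr = sum(shape_weights)
    odds = 10.0 ** total_log_lr
    return odds / (1.0 + odds)


# Hypothetical four-shape sequence built from the finite weights in the figure.
sequence = [0.9, -0.3, 0.5, -0.7]

# Belief in red after the 1st, 2nd, 3rd and 4th shape.
beliefs = [probability_red(sequence[:k]) for k in range(1, len(sequence) + 1)]
```

Each new shape shifts the running sum up or down by its weight, so the belief rises after a shape favouring red and falls after one favouring green. The sketch handles only finite weights; the ±∞ shapes would need to be treated separately as settling the question outright.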

Results…
[Figure, panels a to c: Response (sp s^-1) against Time (ms) for T in and T out; Epochs 1 to 4 (targets and 1st shape on, 2nd shape on, 3rd shape on, 4th shape on, all shapes off); Response against logLR (ban) for T in, annotated 6.2 ± 0.7, 5.8 ± 0.7, 4.9 ± 0.5, 6.2 ± 0.5]

3. Bayesian Learning Is Model-Based
- Bayesian learning is about "inverting beliefs" (Laplace) to assess the veracity of underlying "causes"
- This requires a model (of the hidden causes); S(t) (medication) and Y(t) (symptoms) are not just correlated, but S(t) causes X(t) (infection) which causes Y(t).
- This contrasts with Reinforcement Learning, which only involves observables (certain S(t) (medication) help Y(t) (symptoms), but the RL agent does not care to probe why?)
- (But model-based learning does not need Bayesian updating…)
[Diagram: S(t) → X(t) → Y(t)]
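To make the "inversion" concrete, here is a minimal sketch of the chain S(t) → X(t) → Y(t) with binary variables. All conditional probabilities are hypothetical numbers chosen for illustration, as are the function names.

```python
# Hypothetical conditional probabilities for the chain S -> X -> Y
P_INFECTED_GIVEN_MEDICATION = {True: 0.2, False: 0.6}   # P(X = infected | S)
P_SYMPTOMS_GIVEN_INFECTED = {True: 0.9, False: 0.1}     # P(Y = symptoms | X)


def posterior_infected(medicated, symptoms):
    """P(X = infected | S, Y): invert the Y-given-X model with Bayes' rule."""
    prior = P_INFECTED_GIVEN_MEDICATION[medicated]
    like_inf = P_SYMPTOMS_GIVEN_INFECTED[True]
    like_not = P_SYMPTOMS_GIVEN_INFECTED[False]
    if not symptoms:
        like_inf, like_not = 1.0 - like_inf, 1.0 - like_not
    joint_inf = prior * like_inf
    joint_not = (1.0 - prior) * like_not
    return joint_inf / (joint_inf + joint_not)


def p_symptoms(medicated):
    """P(Y = symptoms | S): all that a learner restricted to observables tracks."""
    prior = P_INFECTED_GIVEN_MEDICATION[medicated]
    return (prior * P_SYMPTOMS_GIVEN_INFECTED[True]
            + (1.0 - prior) * P_SYMPTOMS_GIVEN_INFECTED[False])


belief_infected = posterior_infected(medicated=True, symptoms=True)
symptom_rate_medicated = p_symptoms(True)
symptom_rate_unmedicated = p_symptoms(False)
```

The model-based learner ends up with a belief about the hidden infection, `belief_infected`. A learner that only relates medication to symptoms can at best estimate the two symptom rates, which tell it that medication helps but not why.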

Neurobiological Foundation?
- Reversal Task: Does the (human) brain record that when one option goes bad, the other must be better?
(Hampton et al., JN 2006; three-option case: Beierholm et al., NeuroImage 2011)

More Challenging… see correlation study in Class 3
- Underlying correlation changes
- Do humans learn by trial and error (reinforcement) or by explicitly tracking correlation (Bayesian)?
(Wunderlich et al., Neuron 2011)

Choices…
[Figure: choices of Subject and of Complete Info Model]

Brain Activation…
[Figure: activation for Correlation and for Correlation Prediction Error (z = 7)]

4. Bayesians Follow Evidence For ALL Hypotheses
- …as opposed to "attention gating" (hypothesis testing): pick one hypothesis and accept it until evidence gathers against it.
- Bayesians "marginalize" across hypotheses.

The Task.
- Two modalities ("dimensions") may "cause" reward; choose Top or Bottom (Wunderlich et al., J Neurophys 2011)

Analysis: Weight on each dimension
- Subject could choose based on motion even if she is more confident that color is right, because confidence in choice conditional on motion is higher…
[Figure: trial-by-trial traces (trials 0 to 300) for COLOR (green/red), DIMENSION (color/motion) and MOTION (right/left)]

Activation…
- To be able to weigh appropriately the evidence for the two dimensions in final choice, you need a signal of confidence (left) or uncertainty (right) for the two dimensions (summed here)
[Figure, panels A and B: x = 2, z = 35; x = 0, z = 10]
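Weighing the evidence for both dimensions amounts to marginalizing over them. The sketch below contrasts that with committing to the single most probable dimension; the beliefs are hypothetical numbers chosen to reproduce the case described above, and the function names are illustrative.

```python
def marginal_p_top(p_dimension, p_top_given_dimension):
    """P(Top rewarded) = sum over dimensions of P(dimension) * P(Top | dimension)."""
    return sum(p_dimension[d] * p_top_given_dimension[d] for d in p_dimension)


def gated_p_top(p_dimension, p_top_given_dimension):
    """'Attention gating': commit to the single most probable dimension."""
    best = max(p_dimension, key=p_dimension.get)
    return p_top_given_dimension[best]


# Hypothetical beliefs: color is the more likely relevant dimension, but it
# barely favours Top, while motion points strongly toward Bottom.
p_dimension = {"color": 0.6, "motion": 0.4}
p_top_given_dimension = {"color": 0.55, "motion": 0.10}

p_top_bayes = marginal_p_top(p_dimension, p_top_given_dimension)
bayes_choice = "Top" if p_top_bayes > 0.5 else "Bottom"
gated_choice = ("Top" if gated_p_top(p_dimension, p_top_given_dimension) > 0.5
                else "Bottom")
```

Here the marginalizing learner picks Bottom, in line with motion, even though color is believed more likely to be the relevant dimension; the gating learner follows color alone and picks Top. Computing the marginal requires, for each dimension, both how probable it is and how confident the choice is conditional on it.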
