SLIDE 1
Conditioning
Notes on "Explaining away in Weight Space" by Dayan and Kakade
Geoff Gordon
ggordon@cs.cmu.edu
February 5, 2001
SLIDE 2
Overview
HUGE literature of experiments on conditioning in animals
HUGE literature on optimal statistical inference
SLIDE 3
Conditioning
Most famous example: Pavlov's dogs
Learned to associate stimulus (bell) with reward (food)
Can get much more elaborate (B = bell, L = light, R = reward):

Name                Stimulus 1   Stimulus 2   Test
classical           B → R        —            B → •
sharing             B, L → R     —            B → ◦, L → ◦
forward blocking    B → R        B, L → R     B → •, L → ·
backward blocking   B, L → R     B → R        B → •, L → ·

• = expectation of reward
◦ = weak expectation
· = no expectation
SLIDE 4
Statistical explanations
Simple models can explain some conditioning results
We'll discuss two: gradient descent and the Kalman filter
Models ignore (important) details:
- animals learn in continuous time
- animals have to sense stimuli and rewards
- animals filter out lots of irrelevant percepts
- . . .
But they're still interesting as a simplification, or as an explanation of a piece of a larger system
SLIDE 5
Assumptions in both models
Trials presented as (stimulus, reward) pairs
Goal is to predict reward from stimulus
Learning is updating the prediction rule
Stimulus ∈ R^n (in our case, 2 binary vars B and L)
Reward ∈ R
Reward is a linear function of stimulus, plus Gaussian error
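As a concrete illustration of these shared assumptions, here is a minimal sketch in Python/NumPy (function names, the stimulus ordering [bell, light], and all parameter values are mine, not from the slides): a binary stimulus vector, and a reward that is a linear function of the stimulus plus Gaussian error.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_trial(w_true, stimulus, tau=0.1):
    """One (stimulus, reward) pair: reward = w_true . x plus N(0, tau^2) noise."""
    x = np.asarray(stimulus, dtype=float)   # e.g. [1, 0] = bell on, light off
    y = float(w_true @ x) + rng.normal(0.0, tau)
    return x, y

# hypothetical ground truth: the bell alone predicts one unit of reward
w_true = np.array([1.0, 0.0])
x, y = make_trial(w_true, [1, 1])           # a bell-plus-light trial
```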
SLIDE 6
Gradient descent
Define
  x_t   input on trial t
  y_t   reward on trial t
  w_t   internal state (weights) after trial t
  η     arbitrary learning rate
Write expected reward ŷ_t = x_t · w_t, error ε_t = y_t − ŷ_t
Gradient descent model says: w_{t+1} = w_t + η x_t ε_t
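The update transcribes directly into code (a sketch reusing the NumPy setup above; this is the familiar delta / Rescorla-Wagner / LMS rule):

```python
import numpy as np

def gd_update(w, x, y, eta=0.1):
    """One gradient-descent step on squared prediction error:
    w <- w + eta * error * x."""
    y_hat = float(x @ w)   # expected reward
    eps = y - y_hat        # prediction error
    return w + eta * eps * x
```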
SLIDE 7
Conditioning explained by gradient descent
In classical conditioning or sharing, +ve correlation between inputs and outputs causes relevant components of xy to be +ve, so those components of w become +ve
In forward blocking, stimulus 2 is explained perfectly by weights learned from stimulus 1, so no learning happens in phase 2 (error signal ε is 0)
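A quick forward-blocking run using the gd_update sketch above (noiseless rewards for clarity; trial counts and the [bell, light] ordering are my choices):

```python
import numpy as np

w = np.zeros(2)                               # [w_bell, w_light]
for _ in range(50):                           # phase 1: B -> R
    w = gd_update(w, np.array([1.0, 0.0]), 1.0)
for _ in range(50):                           # phase 2: B, L -> R
    w = gd_update(w, np.array([1.0, 1.0]), 1.0)
# w ~ [1, 0]: the bell already predicts the reward when the light first
# appears, so eps ~ 0 and the light's weight is blocked from growing
```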
SLIDE 8
Backward blocking
Gradient descent fails to explain backward blocking!
In stimulus 2 of backward blocking, the element of x_t corresponding to the light is always 0
So gradient descent predicts that the learned weight for the light won't change
Contradicted by experiments
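The failure is visible directly in the update: since the light's component of x is 0 in phase 2, the term η ε x leaves its weight untouched. A sketch, again reusing gd_update from above:

```python
import numpy as np

w = np.zeros(2)
for _ in range(50):                           # phase 1: B, L -> R
    w = gd_update(w, np.array([1.0, 1.0]), 1.0)
w_light_after_phase1 = w[1]                   # ~0.5: credit shared with bell
for _ in range(50):                           # phase 2: B -> R, light absent
    w = gd_update(w, np.array([1.0, 0.0]), 1.0)
assert w[1] == w_light_after_phase1           # light's weight never revised
```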
SLIDE 9
Kalman filter explanation
Sutton (1992) proposed that classical conditioning could be explained as optimal Bayesian inference in a simple statistical model
The model:
- trial stimuli represented by vectors as before
- reward is linear function of stimuli plus Gaussian error
- in absence of information, weights of linear function drift over time in a Gaussian random walk
Inference in this model is called Kalman filtering
SLIDE 10
Kalman filter
Recall
  x_t   input on trial t
  y_t   reward on trial t
  w_t   weights after trial t
Assume
- w_0 ~ N(0, Σ_0)
- w_{t+1} | w_t ~ N(w_t, σ²I)
- y_t ~ N(x_t · w_t, τ²)
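A generative sampler for this model might look as follows (a sketch; Σ_0 = I and the parameter values are my choices, not from the slides):

```python
import numpy as np

def sample_trials(x_seq, sigma=0.03, tau=0.1, seed=0):
    """Sample (x, y) pairs with weights drifting as a Gaussian random walk."""
    rng = np.random.default_rng(seed)
    n = len(x_seq[0])
    w = rng.multivariate_normal(np.zeros(n), np.eye(n))  # w_0 ~ N(0, Sigma_0)
    trials = []
    for x in x_seq:
        w = w + rng.normal(0.0, sigma, size=n)   # w_{t+1} | w_t ~ N(w_t, sigma^2 I)
        y = float(x @ w) + rng.normal(0.0, tau)  # y_t ~ N(x . w_t, tau^2)
        trials.append((np.asarray(x, float), y))
    return trials
```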
SLIDE 11
Kalman filter cont’d
Write expected reward ŷ_t = x_t · w_t, error ε_t = y_t − ŷ_t
Calculate "learning rate" η_t = 1/(τ² + x_tᵀ Σ_t x_t)
Equations for new weights w_{t+1} and their covariance Σ_{t+1}:
  z_t = Σ_t x_t
  w_{t+1} = w_t + η_t ε_t z_t
  Σ_{t+1} = Σ_t + σ²I − η_t z_t z_tᵀ
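These equations also transcribe directly (a sketch following the slide's update exactly as written; the default parameter values are mine):

```python
import numpy as np

def kf_update(w, Sigma, x, y, tau2=0.01, sigma2=0.001):
    """One Kalman-filter step for the drifting-weights model."""
    eps = y - float(x @ w)                       # prediction error
    eta = 1.0 / (tau2 + float(x @ Sigma @ x))    # trial-specific learning rate
    z = Sigma @ x                                # "whitened" input (see below)
    w_new = w + eta * eps * z
    Sigma_new = Sigma + sigma2 * np.eye(len(w)) - eta * np.outer(z, z)
    return w_new, Sigma_new
```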
SLIDE 12
Comparison to GD
Update w_{t+1} = w_t + η_t ε_t z_t looks like GD, except:
- η_t is a variable learning rate determined by variances of y_t and w_t
- z_t instead of x_t plays the role of the input vector
SLIDE 13
Whitening
How to interpret z? (Recall z = Σx)
z is a whitened or decorrelated version of x
To see why: fixed point of the update for Σ is σ²I = η z zᵀ, which can only be true on average if z has spherical covariance
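One way to check this numerically (a sketch reusing kf_update from above; the correlated-stimulus generator and trial counts are my choices): feed correlated inputs through the filter and verify that the trial average of η z zᵀ settles near σ²I.

```python
import numpy as np

rng = np.random.default_rng(1)
w, Sigma = np.zeros(2), np.eye(2)
acc, count = np.zeros((2, 2)), 0
for t in range(20000):
    b = rng.random() < 0.5                    # bell
    l = b if rng.random() < 0.8 else (rng.random() < 0.5)  # light, correlated
    x = np.array([float(b), float(l)])
    y = x[0] + rng.normal(0.0, 0.1)           # bell alone drives reward
    eta = 1.0 / (0.01 + float(x @ Sigma @ x))
    z = Sigma @ x
    if t > 10000:                             # average after burn-in
        acc += eta * np.outer(z, z)
        count += 1
    w, Sigma = kf_update(w, Sigma, x, y)
print(acc / count)                            # approximately 0.001 * I = sigma2 * I
```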
SLIDE 14
Conditioning
[Dayan & Kakade, 2000]: Kalman filter model explains all conditioning results from above
Classical, sharing, and forward blocking all work exactly as they did with the gradient descent model
But now backward blocking works too
SLIDE 15
Backward blocking
In sharing, +ve correlation between components of x_t makes off-diagonal elements of Σ become -ve in order to whiten
Interpretation: don't know whether it's B or L that's causing R
I.e., if we find out one weight is large, the other must be small
I.e., evidence for B → R is evidence against L → R
"Explaining away"
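A minimal replication sketch of backward blocking with the kf_update function above (trial counts follow the experimental-results slide; noiseless rewards and all other values are my choices):

```python
import numpy as np

w, Sigma = np.zeros(2), np.eye(2)             # prior: w_0 ~ N(0, I)
BL, B = np.array([1.0, 1.0]), np.array([1.0, 0.0])

for _ in range(20):                           # phase 1: B, L -> R
    w, Sigma = kf_update(w, Sigma, BL, 1.0)
w_light_after_phase1 = w[1]                   # ~0.5, and Sigma[0, 1] < 0
for _ in range(20):                           # phase 2: B -> R
    w, Sigma = kf_update(w, Sigma, B, 1.0)
# w[1] < w_light_after_phase1: z = Sigma @ B has a negative light component,
# so credit assigned to the bell is taken away from the light
```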
SLIDE 16
Incremental version
D&K propose a network architecture, using only fast computations, which approximates the Kalman filter
Uses a whitening network from [Goodall, 1960] to get Σ and z, then z and the error signal to get changes to w
Requires distribution of x_t to change only slowly (so whitening network converges)
Gets direction but not magnitude of update
SLIDE 17
Experimental results
D&K implemented the Kalman filter as well as the incremental network
Presented the backward blocking stimulus: 20 trials of B, L → R, then 20 trials of B → R
Exact and incremental results qualitatively similar
Both show strong blocking effect
SLIDE 18
Discussion
What is the essential difference between GD and KF?
- GD could simulate backward blocking by using weight decay to "forget" L → ◦ (sketched after this list)
- But KF allows blocking and forgetting to happen on 2 different time scales (blocking is much faster)
- Works because KF can represent uncertainty separately for different directions in weight space
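For contrast, the weight-decay variant of GD from the first bullet might look like this (a sketch; the decay rate lam is my choice). Note the single decay rate ties blocking and forgetting to one time scale, which is exactly the limitation the second bullet points out:

```python
import numpy as np

def gd_decay_update(w, x, y, eta=0.1, lam=0.01):
    """Delta rule plus uniform weight decay: absent stimuli slowly
    'forget' their weights, but at the same rate in every direction."""
    eps = y - float(x @ w)
    return (1.0 - lam) * w + eta * eps * x
```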
SLIDE 19
Discussion
What’s important about KF?
- Gaussian assumption is clearly false, so that's not it
- Instead, idea that animals believe concept to be learned is changing over time

Improvements to KF:
- Use non-Gaussian distributions
- Use "punctuated equilibrium" rather than steady drift: concept is likely to stay same for a while, then change quickly to a new concept
- Use mixture models to remember previous concepts, switch between them
SLIDE 20
Conclusions
Simple statistical models can help explain experimental results on conditioning in animals (even if they gloss over important details)