Operant Conditioning
Learning & Memory
Arlo Clark-Foos
Instrumental or Operant
- Law of Effect
– “Operates” on environment to cause an outcome
– Behavior is “instrumental” in causing outcome
- Priscilla the Fastidious Pig
- Thorndike & Skinner
https://www.youtube.com/watch?v=LSv992Ts6as
Classical vs. Instrumental
- Differences
– Classical
- Reflexive, automatic behavior
- Reinforcement follows CS, regardless of response
– Instrumental
- Voluntary behavior
- Reinforcement only follows the response
- Similarities
- Negative acceleration, blocking, conditioned inhibition,
spontaneous recovery, generalization and discrimination…
History of Instrumental Cond.
- Edward Thorndike’s (1898) puzzle boxes
– Initially random acts
– Decrease in time to escape
– Law of Effect (S-R Association)
- “Annoying” vs. “Satisfying” events
- Believed reinforcer is not part of association!
SD → R
Superstitious Behavior
B.F. Skinner (1948) showed that nearly any behavior a pigeon happens to be performing when reinforcement arrives will increase in frequency.
Belongingness
- Breland & Breland (1961)
– What makes Sammy dance?
- Shettleworth (1975)
“Reinforcing with food only reinforces feeding behaviors”
Learned Helplessness
- Seligman & Maier (1967)
– Dogs and yoked shocks
– Later extended to college students and anagrams
– Also extended to depression
Losing Streaks
- Detroit Lions, 2008
- Detroit Lions, 2015?
METHODOLOGY
Studying/Observing Instrumental Learning
Willard Small
- 1901: Introduced mazes to animal research
Hampton Court, London
Mazes in Research
- T-Maze
– Alternation learning
– Better at win-shift than win-stay
- Radial Arm Maze
– Random without repetition
– Memory Load: 16+
Mazes in Research
- Morris Water Maze
– Cued (Response) Learning
- Rats can see the platform: S-R Association
– Place Learning
- Platform is below surface: Explicit, cognitive memory
Conditioning Takes Time
- Skinner’s Free Operant Protocol (vs. Discrete Trials)
– Skinner box (automatizing data collection)
- Cumulative recorder (akin to Odometer)
– Secondary Reinforcer
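The odometer comparison for the cumulative recorder can be made concrete with a small sketch. This is only an illustration: the function name `cumulative_record` and the lever-press timestamps are hypothetical, not taken from the lecture. The record is the running count of responses, so it never decreases and its slope is the response rate.

```python
from bisect import bisect_right


def cumulative_record(response_times, sample_times):
    """Return the running response count at each sample time.

    Like an odometer, the count only goes up; a steeper rise between
    two sample times means a higher response rate in that window.
    """
    ordered = sorted(response_times)
    # bisect_right counts how many responses occurred at or before t.
    return [bisect_right(ordered, t) for t in sample_times]


# Hypothetical lever-press timestamps, in seconds.
presses = [1.0, 2.5, 2.7, 6.0, 6.1, 6.2]
record = cumulative_record(presses, sample_times=[0, 2, 4, 6, 8])
print(record)  # [0, 1, 3, 4, 6]
```

A flat stretch in the record (here between 4 s and just before 6 s) corresponds to a pause in responding.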
What is Learned?
- Discriminative Stimuli (SD)
– SD (light on) → R (press lever) → O (get food)
– SD (light off) → R (press lever) → O (no food)
– Habit Slips (Slips of Action; Reason, 1975)
- Responses (R)
– Lashley’s rats swimming mazes (different motor responses)
- Outcomes (O)
– Reinforcers and Punishments
Shaping Behavior
- Shaping
– Requires skilled trainer
- Physical rehabilitation and language in autism
- Bomb/drug detecting dogs
- Chaining
– Backward chaining
Twiggy
https://www.youtube.com/watch?v=dVfXF8O-lHw
Human Skills and Habits
- Walking
– feedback from vision/muscles?
- 1. Lashley (1951): RTs > 100ms
- Pianists: 16+ movements per second
- 2. Damage to sensory feedback
- 3. Sequencing errors
- 4. Time to initiate depends on length
Human Skills and Habits
- Motor Programs
– Initiated complete
– General outline, malleable (Schmidt, 1988)
- Skill Acquisition (Anderson, 1982)
- 1. Cognitive Stage
- 2. Associative Stage
- 3. Autonomous Stage
Reinforcers
- Primary
– Food, water, sleep, sex, shelter (temp control)
- Secondary
– Predict arrival of primary
– Token Economies (Conestogas)
- Drive Reduction Theory (Hull, 1943)
– Primary not always reinforcing
- Negative contrast
– Nipple sucking for sugar water
– Lame treats on Halloween
Punishers
- Determinants of effectiveness
- 1. Punishment leads to more variable behavior
- Hot stove
- 2. SD can encourage cheating
- Speeding or my dog and Krispy Kreme
- 3. Concurrent reinforcement
- Class clowns
- 4. Intensity matters
- Child rearing or criminal justice
Differential Reinforcement of Alternative Behaviors (DRA)
- Cinemark (2011)
Building SD → R → O
- Timing
– Immediate is best
- Criminal Justice, Punishment
- Self Control
– Immediate vs. Delayed Reward
– Diets, Studying, etc.
– Precommitment (SI)
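One common way to illustrate why immediate rewards win and why precommitment helps is hyperbolic discounting. The slides do not name this model, so treat it as an assumption made for illustration. The formula V = A / (1 + kD), the discount rate `k`, and the reward amounts below are all hypothetical choices.

```python
def discounted_value(amount, delay, k=0.5):
    """Subjective value of a reward under hyperbolic discounting (assumed model)."""
    return amount / (1.0 + k * delay)


def preferred(small, large, wait, k=0.5):
    """Pick between two (amount, delay) rewards when the choice is made `wait` units early."""
    v_small = discounted_value(small[0], small[1] + wait, k)
    v_large = discounted_value(large[0], large[1] + wait, k)
    return "small-soon" if v_small > v_large else "large-late"


small = (5, 0)    # hypothetical: 5 units available right away
large = (10, 5)   # hypothetical: 10 units after a delay of 5

print(preferred(small, large, wait=0))   # small-soon
print(preferred(small, large, wait=10))  # large-late
```

Under these assumed numbers the preference reverses: when both rewards are far away the larger one is chosen, which is the point at which a precommitment can lock that choice in.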
Positive vs Negative Reinforcement
Positive vs Negative Punishment
Reinforcement Schedules
- Continuous vs. Partial
- Fixed-ratio (FR)
– Postreinforcement pause
- Variable-ratio (VR)
– Slot machine (keep playing)
- Fixed-interval (FI)
– TBPM
- Variable-interval (VI)
– Waiting is the hardest part
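The four partial schedules differ only in the rule that decides whether a given response earns a reinforcer. The sketch below is a minimal illustration, not a model of any particular apparatus: all function names are hypothetical, the variable-ratio rule is approximated by reinforcing each response with probability 1/mean (a random-ratio simplification), and the variable-interval waits are drawn from an exponential distribution as an assumption.

```python
import random


def fixed_ratio(n):
    """FR: reinforce every n-th response."""
    count = 0

    def schedule(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """VR (random-ratio approximation): each response pays off with probability 1/mean_n."""
    def schedule(now):
        return rng.random() < 1.0 / mean_n

    return schedule


def fixed_interval(seconds):
    """FI: reinforce the first response at least `seconds` after the last reinforcer."""
    last = 0.0

    def schedule(now):
        nonlocal last
        if now - last >= seconds:
            last = now
            return True
        return False

    return schedule


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait is redrawn (exponential, assumed) after each reinforcer."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_seconds)

    def schedule(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.expovariate(1.0 / mean_seconds)
            return True
        return False

    return schedule


def run(schedule, response_times):
    """Count the reinforcers earned by responses made at the given times."""
    return sum(schedule(t) for t in response_times)


# One response per second for a minute.
times = range(1, 61)
rng = random.Random(0)
print("FR5 :", run(fixed_ratio(5), times))         # 12
print("FI10:", run(fixed_interval(10), times))     # 6
print("VR5 :", run(variable_ratio(5, rng), times))
print("VI10:", run(variable_interval(10, rng), times))
```

On the ratio schedules, responding faster earns more reinforcers; on the interval schedules, extra responses before the interval elapses earn nothing, which is one way to see why the schedules produce different response patterns.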
Choosing Between Behaviors
- Concurrent reinforcement schedules
– Football on Saturdays
- Matching Law
– Behavioral Economics (Thaler wins Nobel Prize, 2017)
– Bliss point and Sunfish (observation of behavior)
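The matching law is usually written as an equation. In the form below, the symbols are notation chosen here rather than taken from the slides: B_1 and B_2 are the response rates on two concurrently available options, and R_1 and R_2 are the reinforcement rates those options deliver.

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
```

As a hypothetical worked example, if option 1 pays 40 reinforcers per hour and option 2 pays 20, the law predicts that 40 / (40 + 20) = 2/3 of responses go to option 1.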
Why do I watch football?
- Behaviors with no primary reinforcers
- Premack Principle (1959)
– Rats with water/wheel, Children with candy/pinball
- For me: Grading/Cleaning
– Response Deprivation Hypothesis
- Illegal Drugs?
BRAIN SUBSTRATES
SD → R
- Basal ganglia
– Dorsal Striatum (caudate nucleus, putamen)
- Receives highly processed sensory info
- Projects to M1
- Lesioned rats fail to learn behaviors in response to stimuli
SD (light) → R (lever press) → O (food)
- Habitual and Automatic Behaviors
– Bike riding, playing instruments, running past food in a maze
R → O
- Prefrontal Cortex
– Orbitofrontal cortex (OPFC)
- Receives sensory input (senses and visceral)
- Projects to dorsal striatum
- Grape juice neurons (Tremblay & Schultz, 1999)
“I want you to want me” by Cheap Trick
- James Olds (1954)
– Electrical current in lateral hypothalamus
- 700 times an hour, physical exhaustion, starvation
- Ventral Tegmental Area (VTA)
– Pleasure center?
– Excitement/anticipation?
– Motivational value
– Projects to SNc
Wanting in the VTA/SNc
- VTA → SNc
– Dopaminergic System
– Incentive Salience Hypothesis
– Working for pleasure (want/drive)
- What if there is no drive (no dopamine)?
- Addiction, cues, and precommitment
Endogenous Opioids
- Exogenous Opiates: Opium, Morphine, Heroin
– May mediate Hedonic value
- Increases liking of other stimuli
- Decreases perception of pain
– Endogenous released in response to primary reinforcers
- Which and how many activated may determine preference
– Nipple Suckers
– Play Halo or Watch Cartoons
Punishment Signaling
- Somatosensory Cortex (S1)
– Nociceptors
- Social Rejection
– Insular Cortex (Insula)
- Dorsal posterior insula
- Degree of activation correlates with magnitude of punisher
– Dorsal Anterior Cingulate Cortex
- Motivational value of punishment
Drug Addiction
- Pathological
– Known harmful consequences
– Concurrent reinforcement
- “Yay drugs” & “Boo withdrawals”
- Dopaminergic System
– Stroke damage to insula can wipe out addiction
“Might as well face it, you’re addicted to love”
- Behavioral Addiction
– Gambling, VR Schedules (Skinner), and Gambler’s Fallacy
– Parkinson’s patients and dopamine agonists
– Cognitive and Behavioral Therapies based on Conditioning
Not All Conditioning is Equal
- Partial Reinforcement Effect
– Partial Reinforcement Extinction Effect (PREE)
- Frustration (Amsel) vs. Sequential (Capaldi) Theories
- Fixed vs. Variable & Ratio vs. Interval
– Child rearing, pet training, gambling, superstition
What explains the PREE?
Frustration Theory (Amsel)
CRF (R+) → Extinction (R-) → Frustration → Punishes Response
Evidence for Frustration:
- Behavior of pigeons
- Children tantrums
CRF: R+ R+ R+ R+ R+ R+
- Develop (R-O) expectancy
PRF: R+ R+ R- R+ R- R-
- Develop (R-O) and (R-no O) expectancy
S (frustration) → R → O
What explains the PREE?
Sequential Theory (Capaldi)
Outcome of previous trial serves as a cue for subsequent behavior
PRF: R+ R+ R-  R+ R-  R-
     Fm Fm NFm Fm NFm NFm
- NFm – R (S-R) strengthened by next R+
What happens with long ITI?....Decay
- Frustration?
- Memory?
Stronger PREE with long ITI
Complex Behavior
- Response Chaining
– Backward Chaining
– Breaks in the “chain”
– Animal intelligence
Striatum and Skill/Habit
- Caudate, putamen, nucleus accumbens
- Organizes somatosensory representations and motor responses for planning and executing goal-oriented behavior.
Double Dissociation
- Broca vs. Wernicke
Packard et al. (1989)
- Radial Arm Maze (8 arms)
- Win-Stay vs. Win-Shift
Response vs. Place Learning
Habit Learning in Humans
- Parkinson’s Disease
– Impaired dopaminergic system in striatum
- Huntington’s Disease
– Loss of some striatal function
(Gabrieli, 1995)
Weather Prediction Game
- Knowlton et al. (1996)
Weather Prediction Game
- Poldrack et al. (1999)
Neurophysiological Data
- Mink (1996)
– Neurons in striatum fire in anticipation of movement
- Schultz (2006)
– DA Neurons from brain stem into striatum
– Fire with expectation and reception of rewards
- Blocking and expectation
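The link between blocking and expectation can be sketched with a prediction-error learning rule. The slides do not specify a model, so the Rescorla-Wagner style update used here is an assumption, and the cue names, learning rate `alpha`, and trial counts are hypothetical. Learning on each trial is driven by the gap between the reward received and the reward predicted by all cues present, so a cue that adds nothing to an already accurate prediction gains almost no strength.

```python
def rescorla_wagner(trials, alpha=0.3, reward=1.0):
    """Train cue weights on rewarded trials; each trial is a set of cues present."""
    weights = {}
    for cues in trials:
        # Expectation: summed strength of every cue present on this trial.
        prediction = sum(weights.get(c, 0.0) for c in cues)
        # Prediction error: reward received minus reward expected.
        error = reward - prediction
        for c in cues:
            weights[c] = weights.get(c, 0.0) + alpha * error
    return weights


# Blocking group: light alone is trained first, then light + tone together.
blocked = rescorla_wagner([{"light"}] * 30 + [{"light", "tone"}] * 30)
# Control group: light + tone together from the start.
control = rescorla_wagner([{"light", "tone"}] * 30)

print(round(blocked["tone"], 3))  # 0.0
print(round(control["tone"], 3))  # 0.5
```

In the blocking group the light already predicts the reward, so the error is near zero when the tone is introduced and the tone learns almost nothing; in the control group the two cues share the prediction.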
Loose Ends
- Addiction and Drug Use
– Dopamine and Reward
- Stress and Memory
– Anxiogenics → Response Strategy (Packard & Wingard, 2004)
- Peripheral or Intra-Basolateral Amygdala (Hippocampus)
- Yohimbine, RS 79948-197, Vehicle (Placebo)