Operant Conditioning Learning & Memory Arlo Clark-Foos



slide-1
SLIDE 1

Operant Conditioning

Learning & Memory Arlo Clark-Foos

slide-2
SLIDE 2

Instrumental or Operant

  • Law of Effect

– Operant: behavior “operates” on the environment to cause an outcome
– Instrumental: behavior is “instrumental” in causing the outcome

  • Priscilla the Fastidious Pig
  • Thorndike & Skinner

https://www.youtube.com/watch?v=LSv992Ts6as

slide-3
SLIDE 3

Classical vs. Instrumental

  • Differences

– Classical

  • Reflexive, automatic behavior
  • Reinforcement follows CS, regardless of response

– Instrumental

  • Voluntary behavior
  • Reinforcement only follows the response
  • Similarities
  • Negative acceleration, blocking, conditioned inhibition, spontaneous recovery, generalization and discrimination…

slide-4
SLIDE 4

History of Instrumental Cond.

  • Edward Thorndike’s (1898) puzzle boxes

– Initially random acts
– Decrease in time to escape
– Law of Effect (S-R Association)

  • “Annoying” vs. “Satisfying” events
  • Believed reinforcer is not part of association!

SD  R

slide-5
SLIDE 5

Superstitious Behavior

B.F. Skinner (1948) showed that nearly any behavior a pigeon happens to be performing when a reinforcer is delivered will increase in frequency.

slide-6
SLIDE 6

Belongingness

  • Breland & Breland (1961)

– What makes Sammy dance?

  • Shettleworth (1975)

“Reinforcing with food only reinforces feeding behaviors”

slide-7
SLIDE 7

Learned Helplessness

  • Seligman & Maier (1967)

– Dogs and yoked shocks
– Later extended to college students and anagrams
– Also extended to depression

slide-8
SLIDE 8

Losing Streaks

Detroit Lions, 2008
Detroit Lions, 2015?

slide-9
SLIDE 9

METHODOLOGY

Studying/Observing Instrumental Learning

slide-10
SLIDE 10

Willard Small

  • 1901: Introduced mazes to animal research

Hampton Court, London

slide-11
SLIDE 11

Mazes in Research

slide-12
SLIDE 12

Mazes in Research

  • T-Maze

– Alternation learning
– Better at win-shift than win-stay

  • Radial Arm Maze

– Random without repetition
– Memory Load: 16+

slide-13
SLIDE 13

Mazes in Research

  • Morris Water Maze

– Cued (Response) Learning

  • Rats can see the platform: S-R Association

– Place Learning

  • Platform is below surface: Explicit, cognitive memory
slide-14
SLIDE 14

Conditioning Takes Time

  • Skinner’s Free Operant Protocol (vs. Discrete Trials)

– Skinner box (automatizing data collection)

  • Cumulative recorder (akin to Odometer)

– Secondary Reinforcer
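
The odometer comparison for the cumulative recorder can be made concrete with a short sketch. The record is a running total of responses that never decreases: steep stretches mean fast responding and flat stretches mean pauses. The function name, the bin width, and the lever-press times below are all hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def cumulative_record(response_times, duration, bin_width=1.0):
    """Return the running total of responses at the end of each time bin.

    response_times: times (in seconds) at which a response occurred.
    duration: length of the session in seconds.
    bin_width: width of each time bin in seconds (an illustrative choice).
    """
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in response_times:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    # Like an odometer, the running total can only stay flat or climb.
    return list(accumulate(counts))


# Hypothetical lever-press times (seconds) in a 6-second session.
presses = [0.5, 1.2, 1.8, 4.1, 4.3, 4.9]
record = cumulative_record(presses, duration=6.0)
print(record)  # [1, 3, 3, 3, 6, 6]
```

The flat run in the middle of the output is a pause in responding; the jump from 3 to 6 is a burst of three presses within one second.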

slide-15
SLIDE 15

What is Learned?

  • Discriminative Stimuli (SD)

SD (light on) → R (press lever) → O (get food)
SD (light off) → R (press lever) → O (no food)
Habit Slips (Slips of Action; Reason, 1975)

  • Responses (R)

– Lashley’s rats swimming mazes (different motor responses)

  • Outcomes (O)

– Reinforcers and Punishments

slide-16
SLIDE 16

Shaping Behavior

  • Shaping

– Requires skilled trainer

  • Physical rehabilitation and language in autism
  • Bomb/drug detecting dogs
  • Chaining

– Backward chaining

Twiggy

https://www.youtube.com/watch?v=dVfXF8O-lHw

slide-17
SLIDE 17

Human Skills and Habits

  • Walking

– feedback from vision/muscles?

  • 1. Lashley (1951): RTs > 100ms
  • Pianists: 16+ movements per second
  • 2. Damage to sensory feedback
  • 3. Sequencing errors
  • 4. Time to initiate depends on length

slide-18
SLIDE 18

Human Skills and Habits

  • Motor Programs

– Initiated complete
– General outline, malleable (Schmidt, 1988)

  • Skill Acquisition (Anderson, 1982)
  • 1. Cognitive Stage
  • 2. Associative Stage
  • 3. Autonomous Stage
slide-19
SLIDE 19

Reinforcers

  • Primary

– Food, water, sleep, sex, shelter (temp control)

  • Secondary

– Predict arrival of primary
– Token Economies (Conestogas)

  • Drive Reduction Theory (Hull, 1943)

– Primary not always reinforcing

  • Negative contrast

– Nipple sucking for sugar water
– Lame treats on Halloween

slide-20
SLIDE 20

Punishers

  • Determinants of effectiveness
  • 1. Punishment → variable behavior
  • Hot stove
  • 2. SD can encourage cheating
  • Speeding or my dog and Krispy Kreme
  • 3. Concurrent reinforcement
  • Class clowns
  • 4. Intensity matters
  • Child rearing or criminal justice
slide-21
SLIDE 21

Differential Reinforcement of Alternative Behaviors (DRA)

  • Cinemark (2011)
slide-22
SLIDE 22

Building SD → R → O

  • Timing

– Immediate is best

  • Criminal Justice, Punishment
  • Self Control

– Immediate vs. Delayed Reward
– Diets, Studying, etc.
– Precommitment (SI)

slide-23
SLIDE 23

Positive vs Negative Reinforcement

slide-24
SLIDE 24

Positive vs Negative Punishment

slide-25
SLIDE 25

Reinforcement Schedules

  • Continuous vs. Partial
  • Fixed-ratio (FR)

– Postreinforcement pause

  • Variable-ratio (VR)

– Slot machine (keep playing)

  • Fixed-interval (FI)

– TBPM

  • Variable-interval (VI)

– Waiting is the hardest part
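
The four partial schedules listed above differ only in the rule that decides whether a given response earns a reinforcer. The sketch below is a minimal illustration of those rules, not a model of animal behavior: the function names, the uniform distributions used for the variable schedules, and the one-response-per-second session are all assumptions made for the example.

```python
import random


def fixed_ratio(n):
    """FR n: reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: reinforce after a random number of responses averaging mean_n.

    The uniform draw on 1..(2*mean_n - 1) is an illustrative assumption.
    """
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response made at least `seconds` after the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait is redrawn after every reinforcer.

    The uniform draw on 0..(2*mean_seconds) is an illustrative assumption.
    """
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_reinforcers(schedule, response_times):
    """Feed each response time to a schedule and count the reinforcers earned."""
    return sum(schedule(t) for t in response_times)


# Hypothetical session: one response per second for 60 seconds.
times = [float(t) for t in range(1, 61)]
rng = random.Random(0)
print("FR 5  :", count_reinforcers(fixed_ratio(5), times))        # 12
print("FI 10 :", count_reinforcers(fixed_interval(10.0), times))  # 6
print("VR 5  :", count_reinforcers(variable_ratio(5, rng), times))
print("VI 10 :", count_reinforcers(variable_interval(10.0, rng), times))
```

On the ratio schedules, responding faster earns reinforcers sooner; on the interval schedules, extra responses before the interval has elapsed earn nothing.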

slide-26
SLIDE 26

Choosing Between Behaviors

  • Concurrent reinforcement schedules

– Football on Saturdays

  • Matching Law

– Behavioral Economics (Thaler wins Nobel Prize, 2017)

– Bliss point and Sunfish (observation of behavior)
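
In its simplest (strict) form, the matching law says that the share of responses given to one option equals the share of reinforcers obtained from it. The symbols below are introduced here for illustration: B_1 and B_2 are the response rates on the two options, and R_1 and R_2 are the reinforcement rates obtained from them.

```latex
\[
  \frac{B_1}{B_1 + B_2} \;=\; \frac{R_1}{R_1 + R_2}
\]
% Worked example with hypothetical numbers:
% R_1 = 40 reinforcers/hour, R_2 = 20 reinforcers/hour
\[
  \frac{B_1}{B_1 + B_2} \;=\; \frac{40}{40 + 20} \;=\; \frac{2}{3}
\]
```

With these made-up rates, about two thirds of responses would be predicted to go to the first option.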

slide-27
SLIDE 27

Why do I watch football?

  • Behaviors with no primary reinforcers
  • Premack Principle (1959)

– Rats with water/wheel, Children with candy/pinball

  • For me: Grading/Cleaning

– Response Deprivation Hypothesis

  • Illegal Drugs?
slide-28
SLIDE 28

BRAIN SUBSTRATES

slide-29
SLIDE 29

SD  R

  • Basal ganglia

– Dorsal Striatum (caudate nucleus, putamen)

  • Receives highly processed sensory info
  • Projects to M1
  • Lesioned rats fail to learn behaviors in response to stimuli

SD (light) → R (lever press) → O (food)

  • Habitual and Automatic Behaviors

– Bike riding, playing instruments, running past food in a maze

slide-30
SLIDE 30

R  O

  • Prefrontal Cortex

– Orbitofrontal cortex (OPFC)

  • Receives sensory input (senses and visceral)
  • Projects to dorsal striatum
  • Grape juice neurons (Tremblay & Schultz, 1999)
slide-31
SLIDE 31

“I want you to want me” by Cheap Trick

  • James Olds (1954)

– Electrical current in lateral hypothalamus

  • 700 times an hour, physical exhaustion, starvation
  • Ventral Tegmental Area (VTA)

– Pleasure center?
– Excitement/anticipation?
– Motivational value
– Projects to SNc

slide-32
SLIDE 32

Wanting in the VTA/SNc

  • VTA  SNc

– Dopaminergic System – Incentive Salience Hypothesis – Working for pleasure (want/drive)

  • What if there is no drive (no dopamine)?
  • Addiction, cues, and precommitment
slide-33
SLIDE 33

Endogenous Opioids

  • Exogenous Opiates: Opium, Morphine, Heroin

– May mediate Hedonic value

  • Increases liking of other stimuli
  • Decreases perception of pain

– Endogenous released in response to primary reinforcers

  • Which and how many activated may determine preference

– Nipple Suckers
– Play Halo or Watch Cartoons

slide-34
SLIDE 34

Punishment Signaling

  • Somatosensory Cortex (S1)

– Nociceptors

  • Social Rejection

– Insular Cortex (Insula)

  • Dorsal posterior insula
  • Degree of activation correlates with magnitude of punisher

– Dorsal Anterior Cingulate Cortex

  • Motivational value of punishment
slide-35
SLIDE 35

Drug Addiction

  • Pathological

– Known harmful consequences
– Concurrent reinforcement

  • “Yay drugs” & “Boo withdrawals”
  • Dopaminergic System

– Stroke damage to insula can wipe out addiction

slide-36
SLIDE 36

“Might as well face it, you’re addicted to love”

  • Behavioral Addiction

– Gambling, VR Schedules (Skinner), and Gambler’s Fallacy
– Parkinson’s patients and dopamine agonists
– Cognitive and Behavioral Therapies based on Conditioning

slide-37
SLIDE 37

Not All Conditioning is Equal

  • Partial Reinforcement Effect

– Partial Reinforcement Extinction Effect (PREE)

  • Frustration (Amsel) vs. Sequential (Capaldi) Theories
  • Fixed vs. Variable & Ratio vs. Interval

– Child rearing, pet training, gambling, superstition

slide-38
SLIDE 38

What explains the PREE?

Frustration Theory (Amsel)

CRF R+ → Extinction R- → Frustration → Punishes Response
Evidence for Frustration:

  • Behavior of pigeons
  • Children tantrums

CRF: R+ R+ R+ R+ R+ R+

  • Develop (R-O) expectancy

PRF: R+ R+ R- R+ R- R-

  • Develop (R-O) and (R-no O) expectancy

S (frustration) R O

slide-39
SLIDE 39

What explains the PREE?

Sequential Theory (Capaldi)

Outcome of previous trial serves as a cue for subsequent behavior
PRF: R+ R+ R- R+ R- R-
Fm Fm NFm Fm NFm NFm

  • NFm – R (S-R) strengthened by next R+

What happens with long ITI?....Decay

  • Frustration?
  • Memory?

Stronger PREE with long ITI

slide-40
SLIDE 40

Complex Behavior

  • Response Chaining

– Backward Chaining
– Breaks in the “chain”
– Animal intelligence

slide-41
SLIDE 41

Striatum and Skill/Habit

  • Caudate, putamen, nucleus accumbens
  • Organizes somatosensory representations and motor responses for planning and executing goal-oriented behavior.

slide-42
SLIDE 42

Double Dissociation

  • Broca vs. Wernicke
slide-43
SLIDE 43

Packard et al. (1989)

  • Radial Arm Maze (8 arms)
  • Win-Stay vs. Win-Shift
slide-44
SLIDE 44

Response vs. Place Learning

slide-45
SLIDE 45

Habit Learning in Humans

  • Parkinson’s Disease

– Impaired dopaminergic system in striatum

  • Huntington’s Disease

– Loss of some striatal function

(Gabrieli, 1995)

slide-46
SLIDE 46

Weather Prediction Game

  • Knowlton et al. (1996)
slide-47
SLIDE 47

Weather Prediction Game

  • Knowlton et al. (1996)
slide-48
SLIDE 48

Weather Prediction Game

  • Poldrack et al. (1999)
slide-49
SLIDE 49

Neurophysiological Data

  • Mink (1996)

– Neurons in striatum fire in anticipation of movement

  • Schultz (2006)

– DA Neurons from brain stem into striatum
– Fire with expectation and reception of rewards

  • Blocking and expectation
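
One common way to illustrate how expectation and blocking fit together is a prediction-error learning rule in the style of Rescorla and Wagner. The slides do not name a specific model, so this is an assumption made for illustration. In the sketch, learning on each trial is driven by the gap between the reward received and the reward expected from all cues present. The cue names, learning rate, and trial counts are hypothetical.

```python
def rescorla_wagner(trials, alpha=0.2, reward=1.0):
    """Run a prediction-error update over a sequence of trials.

    trials: list of (cues_present, rewarded) pairs.
    alpha: learning rate (an illustrative value).
    Returns the final associative strength of each cue and the
    prediction error recorded on every trial.
    """
    strength = {}
    errors = []
    for cues, rewarded in trials:
        expected = sum(strength.get(c, 0.0) for c in cues)
        error = (reward if rewarded else 0.0) - expected
        errors.append(error)
        # Every cue present on the trial shares the same error signal.
        for c in cues:
            strength[c] = strength.get(c, 0.0) + alpha * error
    return strength, errors


# Phase 1: cue A alone is followed by reward.
phase1 = [(("A",), True)] * 50
# Phase 2: cues A and B together are followed by the same reward.
phase2 = [(("A", "B"), True)] * 50

strength, errors = rescorla_wagner(phase1 + phase2)
print("first-trial error:", errors[0])             # 1.0: reward fully unexpected
print("strength of A:", round(strength["A"], 3))
print("strength of B:", round(strength["B"], 3))   # stays near zero: B is blocked
```

Because A already predicts the reward by the time B is introduced, the error in phase 2 is close to zero and B gains almost no strength. That is the sense in which blocking depends on expectation.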
slide-50
SLIDE 50

Loose Ends

  • Addiction and Drug Use

– Dopamine and Reward

  • Stress and Memory

– Anxiogenics → Response Strategy (Packard & Wingard, 2004)

  • Peripheral or Intra-Basolateral Amygdala (Hippocampus)
  • Yohimbine, RS78848-197, Vehicle (Placebo)

– “Autopilot”