Action and Adaptation: Lessons from Neurobiology and Challenges for Robot Cognitive Architectures



SLIDE 1

INSTITUTO DE SISTEMAS E ROBÓTICA

Action and Adaptation:

Lessons from Neurobiology and Challenges for Robot Cognitive Architectures

Rodrigo Ventura

Institute for Systems and Robotics
Instituto Superior Técnico
Lisbon, Portugal
yoda@isr.ist.utl.pt

SLIDE 2

Motivation

  • Design constraints for robot cognitive architectures:
    • robots are embodied agents
    • situated in a physical environment
    • they receive raw sensory input
    • their actions are constrained by their physical structure

[Figure: iCub humanoid robot (robotcub.org)]

SLIDE 3

Neurobiology

[Figure: the cerebellum]



SLIDE 8

Model of cerebellum function

[Figure: block diagram (Kawato 1999). Blocks: controller, forward model, inverse model, world. Signals: desired trajectory, motor command, sensory feedback. Marked paths: open loop, fast feedback loop, slow feedback loop.]


SLIDE 12

Motor skill development

  1. Learn the controller: feedback, sensorimotor loop
  2. Learn the forward model: prediction
    • similar in function to the Smith regulator (1958)
  3. Learn the inverse model: feedforward, open loop

[Figure: block diagram. Blocks: controller, forward model, inverse model, world. Signals: desired trajectory, motor command, sensory feedback.]
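The three learning steps can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the slides: the scalar linear plant, the proportional gain, the LMS update, and the function names. In particular, the inverse model is obtained by inverting the learned forward model, which is only one of several ways an inverse model could be acquired.

```python
import random

# Hypothetical scalar plant, chosen only for illustration:
# x[t+1] = A_TRUE * x[t] + B_TRUE * u[t]
A_TRUE, B_TRUE = 0.9, 0.5


def plant(x, u):
    """The 'world': next state for state x and motor command u."""
    return A_TRUE * x + B_TRUE * u


def feedback_controller(x, x_desired, gain=0.8):
    """Step 1: closed-loop command computed from the sensed error."""
    return gain * (x_desired - x)


def learn_forward_model(steps=2000, lr=0.05, seed=0):
    """Step 2: fit x_next ~ a*x + b*u with LMS updates on random commands."""
    rng = random.Random(seed)
    a, b, x = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)          # random exploratory command
        x_next = plant(x, u)
        error = x_next - (a * x + b * u)    # prediction error
        a += lr * error * x
        b += lr * error * u
        x = x_next
    return a, b


def inverse_model(x, x_desired_next, a, b):
    """Step 3: feedforward command, here by inverting the learned model."""
    return (x_desired_next - a * x) / b


a, b = learn_forward_model()
u_ff = inverse_model(0.2, 1.0, a, b)
print("learned a, b:", round(a, 3), round(b, 3))
print("state reached with the open-loop command:", round(plant(0.2, u_ff), 3))
```

Once the forward model is accurate, the feedforward command reaches the target in one step without waiting for sensory feedback, whereas the feedback controller only reduces the error gradually.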


SLIDE 14

Multiple models in the cerebellum

[Figure: block diagram (Wolpert et al. 1998). Blocks: controller, forward model, inverse model, world. Signals: desired trajectory, motor command, sensory feedback.]

Context → responsibility estimation

weight plasticity
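One possible reading of "responsibility estimation" with several models is sketched below. The Gaussian soft-max, the value of sigma, the scalar gain modules, and all names are assumptions made for illustration. Each module predicts the sensory outcome, the module that predicts best receives the highest responsibility, and each module's weight change is scaled by its responsibility.

```python
import math


def responsibilities(prediction_errors, sigma=0.1):
    """Soft-max of Gaussian log-likelihoods of each module's prediction error."""
    scores = [-(e * e) / (2.0 * sigma * sigma) for e in prediction_errors]
    top = max(scores)                       # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def step(gains, u, observed, lr=0.5, sigma=0.1):
    """Estimate responsibilities, then adapt each module in proportion to them."""
    errors = [observed - g * u for g in gains]      # forward-model errors
    resp = responsibilities(errors, sigma)
    new_gains = [g + lr * r * e * u for g, r, e in zip(gains, resp, errors)]
    return resp, new_gains


# Two hypothetical modules, e.g. tuned to two different contexts.
resp, gains = step([1.0, 2.0], u=0.5, observed=1.0)
print("responsibilities:", resp)
print("updated gains:", gains)
```

Because the second module already predicts the observation, it takes almost all of the responsibility and the first module is left nearly unchanged.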


SLIDE 18

Decision making: BG-DA

[Figure: diagram (Frank and Claus 2006). Blocks: action option considered (premotor cortex); go, which facilitates the response, and no-go, which suppresses it (BG, basal ganglia); dopamine, signaling match / mismatch with reward expectancy.]

  • unexpected reward → dopamine release → promotes go
  • reward missing → dopamine drop → promotes no-go

Hebbian learning
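The two rules above can be written as a small update for a single action option. This is a simplified delta-style sketch, not the Hebbian rule used in the cited model; the learning rate and all names are assumptions made for illustration.

```python
def dopamine_update(go, nogo, expected, reward, lr=0.1):
    """One trial for a single action option.

    delta stands in for the dopamine signal: positive when the reward is
    better than expected (release), negative when it is missing (drop).
    """
    delta = reward - expected
    if delta > 0:
        go += lr * delta            # release strengthens the go pathway
    elif delta < 0:
        nogo += lr * -delta         # drop strengthens the no-go pathway
    expected += lr * delta          # expectancy tracks the average reward
    return go, nogo, expected


def run(rewards, lr=0.1):
    """Apply the update over a sequence of trial outcomes."""
    go, nogo, expected = 0.0, 0.0, 0.0
    for reward in rewards:
        go, nogo, expected = dopamine_update(go, nogo, expected, reward, lr)
    return go, nogo, expected


print("after 50 rewarded trials:", run([1.0] * 50))
print("then one omitted reward: ", run([1.0] * 50 + [0.0]))
```

With repeated rewards the expectancy rises and the mismatch shrinks, so go stops growing; a single omitted reward then produces a large negative mismatch and strengthens no-go.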

SLIDE 19

Decision making: BG-DA

  • Requires several trials
    • relies on an underlying estimate of reward probability
  • Propagation of rewards backwards in time
    • reward expectancy gets transferred to the cause
  • Promotes instrumental learning
    • the BG then no longer plays an active role
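The backward propagation of reward expectancy can be illustrated with temporal-difference learning, a standard computational account that is brought in here as an assumption; the slides do not name it. The three-state chain (cue, delay, reward), the learning rate, and the function name are hypothetical.

```python
def run_trials(n_trials, n_states=3, reward=1.0, lr=0.5):
    """TD(0) on a chain of states; the reward arrives only in the last state."""
    values = [0.0] * n_states
    history = []
    for _ in range(n_trials):
        for s in range(n_states):
            next_value = values[s + 1] if s + 1 < n_states else 0.0
            r = reward if s == n_states - 1 else 0.0
            delta = r + next_value - values[s]   # prediction error
            values[s] += lr * delta
        history.append(list(values))
    return history


for trial, values in enumerate(run_trials(3), start=1):
    print("trial", trial, values)
```

Expectancy appears first at the rewarded state and reaches the earliest state only after several trials, which is why this kind of learning is slow.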


SLIDE 22

Decision making: OFC

[Figure: diagram (Frank and Claus 2006). Blocks: action option considered (premotor cortex); go, which facilitates the response, and no-go, which suppresses it (basal ganglia); dopamine, signaling match / mismatch with reward expectancy; OFC; amygdala.]

  • Orbitofrontal cortex (OFC): short-term memory of gain-loss information
    • copes with non-stationary environments (e.g., reversal learning)
  • Amygdala: provides valuation of possible outcomes (e.g., desirable)

SLIDE 23

Two learning paradigms

  • Based on the probability of future rewards:
    • slow adaptation, performed by the BG
  • Based on past events:
    • quick adaptation, performed by the OFC
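The contrast between the two paradigms shows up clearly in a reversal, where a previously rewarded action stops paying off. The sketch below is an illustration under stated assumptions: the slow learner is an incremental average with a small learning rate, the fast learner averages only the last three outcomes, and the class names and parameters are hypothetical rather than part of the cited model.

```python
from collections import deque


class SlowEstimator:
    """Incremental estimate of reward probability (small learning rate)."""

    def __init__(self, lr=0.05):
        self.lr = lr
        self.value = 0.0

    def update(self, outcome):
        self.value += self.lr * (outcome - self.value)


class RecentMemory:
    """Average over the last few outcomes only (short-term memory)."""

    def __init__(self, size=3):
        self.outcomes = deque(maxlen=size)

    def update(self, outcome):
        self.outcomes.append(outcome)

    @property
    def value(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0


def simulate(outcomes):
    """Feed the same outcomes to both learners and return (slow, fast) values."""
    slow, fast = SlowEstimator(), RecentMemory()
    for outcome in outcomes:
        slow.update(outcome)
        fast.update(outcome)
    return slow.value, fast.value


# Reversal: an action is rewarded 50 times, then the reward stops.
print("before reversal (slow, fast):", simulate([1.0] * 50))
print("after reversal  (slow, fast):", simulate([1.0] * 50 + [0.0] * 5))
```

Five unrewarded trials are enough to bring the short-term estimate to zero, while the slow estimate still rates the action as mostly rewarding.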

SLIDE 24

Challenges

  • Binding problem
    • how does the brain integrate information processed in different brain regions? (multi-modal, different time scales)
  • Hypothesis:
    • event files (Hommel 2004): associate the neural coding of perception (features integrated in object files) with related actions (action files)

SLIDE 25

Challenges

  • Integration of continuous-time motor control with discrete-time events
  • Hypothesis:
    • segmentation of perception into events (Kurby and Zacks 2008)
    • local predictors of perception
    • prediction error triggers segmentation
    • predictors require internal models
    • models acquired by experience
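The hypothesis above can be sketched as a small segmentation routine: a local predictor tracks a continuous signal, and a large prediction error marks the start of a new discrete event. The one-dimensional signal, the running-estimate predictor, the threshold, and the function name are assumptions made for illustration.

```python
def segment(signal, threshold=1.0, lr=0.5):
    """Return the indices at which a new event starts.

    The predictor is a running estimate of the current signal level. A
    prediction error above the threshold triggers a boundary and resets the
    predictor; smaller errors are used to refine it.
    """
    if not signal:
        return []
    boundaries = []
    prediction = signal[0]
    for t in range(1, len(signal)):
        error = signal[t] - prediction
        if abs(error) > threshold:
            boundaries.append(t)          # prediction failed: new event
            prediction = signal[t]
        else:
            prediction += lr * error      # model refined by experience
    return boundaries


samples = [0.0, 0.1, 0.0, 5.0, 5.1, 5.0, 0.0, 0.1]
print("event boundaries at indices:", segment(samples))
```

The output of such a routine is a sequence of discrete events that a decision-making layer could consume, while the underlying motor control continues to run on the continuous signal.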

SLIDE 26

Future directions

  • Perception
    • perception of object function is more basic than object qualities (Merleau-Ponty 1945)
    • affordances (Gibson 1979)
  • Non-utilitarian approaches to decision
    • case of regret (Coricelli et al. 2007)
    • insensitivity to the probability of negative events (Loewenstein et al. 2001)
    • neuroeconomics (Glimcher and Rustichini 2004)

SLIDE 27

Q & A

Thank you!

SLIDE 28

References

Coricelli, G.; Dolan, R. J.; and Sirigu, A. 2007. Brain, emotion and decision making: the paradigmatic example of regret. Trends in Cognitive Sciences 11(6):258–265.

Frank, M. J., and Claus, E. D. 2006. Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychological Review 113(2):300–326.

Glimcher, P. W., and Rustichini, A. 2004. Neuroeconomics: The consilience of brain and decision. Science 306(5695):447–452.

Hommel, B. 2004. Event files: feature binding in and across perception and action. Trends in Cognitive Sciences 8(11):494–500.

Kawato, M. 1999. Internal models for motor control and trajectory planning. Current Opinion in Neurobiology 9(6):718–727.

Kurby, C. A., and Zacks, J. M. 2008. Segmentation in the perception and memory of events. Trends in Cognitive Sciences 12(2):72–79.

LeDoux, J. 1996. The Emotional Brain. Simon & Schuster.

Loewenstein, G. F.; Weber, E. U.; Hsee, C. K.; and Welch, N. 2001. Risk as feelings. Psychological Bulletin 127(2):267–286.

Wolpert, D. M.; Miall, R. C.; and Kawato, M. 1998. Internal models in the cerebellum. Trends in Cognitive Sciences 2(9):338–347.