Cognition for Intelligent Robotics: Architectures and Action Selection

Cognition for Intelligent Robotics: Architectures and Action Selection. Joanna J. Bryson, University of Bath, United Kingdom. Why Action Selection? Functionalist Assumption: All we care about is producing intelligent


  1. Soar • Soar has serious engineering. • “Evolution of Soar” is my favorite paper (Laird & Rosenbloom 1996). [Figure: timeline of Soar versions; labels not recoverable from the extraction.]

  2. Soar • Soar has serious engineering. • “Evolution of Soar” is my favorite paper (Laird & Rosenbloom 1996). • Admits problems! [Figure: timeline of Soar versions; labels not recoverable from the extraction.]

  3. Soar • Soar has serious engineering. • “Evolution of Soar” is my favorite paper (Laird & Rosenbloom 1996). • Admits problems! • Not enough applications for human-like AI. [Figure: timeline of Soar versions; labels not recoverable from the extraction.]

  4. Architecture Lessons (from CMU) • An architecture needs: • action from perception, and • further structure to combat combinatorics. • Dealing with time is hard.

  5. ACT-R • Learns (& executes) productions. • For arbitration, rely on (Bayesian probabilistic) utility. • Call it implicit knowledge.
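
To make "arbitration by utility" concrete, here is a minimal sketch. It assumes that each production carries a single scalar utility estimate and that, among the productions whose conditions match, the one with the highest (optionally noisy) utility fires. The names `Production` and `select_production` are hypothetical and this is not ACT-R's actual implementation.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Production:
    """A condition-action rule with a utility estimate (hypothetical structure)."""
    name: str
    condition: Callable[[Dict[str, object]], bool]  # tested against the current state
    utility: float                                   # assumed to be learned elsewhere


def select_production(
    productions: List[Production],
    state: Dict[str, object],
    noise_sd: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Optional[Production]:
    """Return the matching production with the highest noisy utility, or None."""
    rng = rng or random.Random()
    matching = [p for p in productions if p.condition(state)]
    if not matching:
        return None
    # Gaussian noise on utilities is an assumption of this sketch; with
    # noise_sd=0.0 the choice is deterministic.
    return max(matching, key=lambda p: p.utility + rng.gauss(0.0, noise_sd))


rules = [
    Production("default", lambda s: True, 1.0),
    Production("answer", lambda s: s.get("goal") == "respond", 2.0),
]
chosen = select_production(rules, {"goal": "respond"})
```

With zero noise the choice reduces to a plain argmax over the matching rules; raising `noise_sd` lets lower-utility rules fire occasionally.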

  6. ACT-R Research Programme • Replicate lots of Cognitive Science results. • See if the brain does what you think it needs to. • Win Rumelhart Prize (John Anderson, 2000). [Figure: ACT-R modules mapped to brain regions: Intentional Module (not identified); Declarative Module (Temporal / Hippocampus); Goal Buffer (DLPFC); Retrieval Buffer (VLPFC); Productions (Basal Ganglia) with Matching (Striatum), Selection (Pallidum), Execution (Thalamus); Visual Buffer (Parietal); Manual Buffer (Motor); Visual Module (Occipital/Parietal); Manual Module (Motor/Cerebellum); External World.]

  7. Architecture Lessons (from CMU) • Architectures need productions and problem spaces. • Real-time is hard. • Being easy to use can be a win.

  9. Spreading Activation Networks • “Maes Nets” (Adaptive Neural Arch.; Maes 1989) • Activation spreads from senses and from goals through net of actions. • Highest activated

  10. Spreading Activation Networks

  11. Spreading Activation Networks • Sound good: • easy • brain-like (priming, action potential). • Still influential (Franklin 2000, Shanahan 2006).

  12. Spreading Activation Networks • Sound good: • easy • brain-like (priming, action potential). • Still influential (Franklin 2000, Shanahan 2006). • Can't do full action selection: • Don't scale; don't converge on consummatory acts (Tyrrell 1993).
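
The idea of activation spreading from senses and goals through a net of actions can be sketched as follows. This is a heavily simplified illustration, not Maes's published algorithm: it assumes only one kind of link (an action passes a share of its activation back to actions that achieve its unmet preconditions), and the names `select_action`, `preconditions`, `achieves`, and `share` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def select_action(
    preconditions: Dict[str, Set[str]],  # action -> propositions it needs
    achieves: Dict[str, Set[str]],       # action -> propositions it makes true
    senses: Set[str],                    # propositions currently sensed as true
    goals: Set[str],                     # propositions the agent wants true
    steps: int = 3,
    share: float = 0.5,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Spread activation, then pick the most activated executable action."""
    actions = list(preconditions)
    # Input activation: one unit per satisfied precondition and per goal achieved.
    base = {
        a: float(len(preconditions[a] & senses) + len(achieves[a] & goals))
        for a in actions
    }
    activation = dict(base)
    for _ in range(steps):
        incoming = {a: 0.0 for a in actions}
        for a in actions:
            unmet = preconditions[a] - senses
            for b in actions:
                # Pass activation backwards to actions that could enable `a`.
                if b != a and achieves[b] & unmet:
                    incoming[b] += share * activation[a]
        activation = {a: base[a] + incoming[a] for a in actions}
    executable = [a for a in actions if preconditions[a] <= senses]
    if not executable:
        return None, activation
    return max(executable, key=lambda a: activation[a]), activation


pre = {"eat": {"food_in_hand"}, "grasp": {"food_seen"}, "wander": set()}
ach = {"eat": {"fed"}, "grasp": {"food_in_hand"}, "wander": {"food_seen"}}
action, levels = select_action(pre, ach, senses={"food_seen"}, goals={"fed"})
```

In this toy net the goal `fed` activates `eat`, which cannot run yet and so passes activation back to `grasp`. Even this reduced version needs hand-tuned parameters (`steps`, `share`), which hints at why tuning larger nets is difficult.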

  13. Tyrrell (1993) [Figure: Extended Rosenblatt and Payton Free-Flow Hierarchy. Stimuli (Distance from Den, Night Prox, Low Health, Dirtiness) feed top-level nodes (Keep Clean, Sleep in Den, Reproduce), which pass weighted activation down through intermediate nodes (e.g. Approach Den, Explore For Mates, Court, Mate) to the actions Clean, Sleep, Mate, Court and Move (N, NE, E, SE, S, SW, W, NW). Legend: link weights range from small negative to large positive (1.0) activation.]

  14. Subsumption (Brooks 1986)

  15. Subsumption (Brooks 1986) • Emphasis on sensing to action (via Augmented FSM).

  16. Subsumption (Brooks 1986) • Emphasis on sensing to action (via Augmented FSM). • Very complicated, distributed arbitration.

  17. Subsumption (Brooks 1986) • Emphasis on sensing to action (via Augmented FSM). • Very complicated, distributed arbitration. • No learning.

  18. Subsumption (Brooks 1986) • Emphasis on sensing to action (via Augmented FSM). • Very complicated, distributed arbitration. • No learning. • Worked.
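
As a rough illustration of mapping sensing directly to action, the sketch below runs a stack of behaviors and lets the first one that produces a command suppress the rest. This is a deliberate simplification: Brooks's architecture wires augmented finite state machines together with suppression and inhibition links, so its arbitration is distributed rather than a single loop. All behavior names, sensor keys, and thresholds here are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Sensors = Dict[str, float]
Behavior = Callable[[Sensors], Optional[str]]


def avoid(sensors: Sensors) -> Optional[str]:
    """Turn away when an obstacle is close; otherwise stay silent."""
    if sensors.get("obstacle_distance", float("inf")) < 0.5:
        return "turn_away"
    return None


def seek_light(sensors: Sensors) -> Optional[str]:
    """Head for a bright light when one is sensed."""
    if sensors.get("light", 0.0) > 0.8:
        return "turn_to_light"
    return None


def wander(sensors: Sensors) -> Optional[str]:
    """Default behavior: always has something to do."""
    return "move_forward"


def arbitrate(behaviors: List[Behavior], sensors: Sensors) -> Optional[str]:
    """Return the command of the first behavior that fires.

    Behaviors earlier in the list suppress those after it. The ordering is a
    design choice of this sketch, not something taken from the slides.
    """
    for behavior in behaviors:
        command = behavior(sensors)
        if command is not None:
            return command
    return None


stack = [avoid, seek_light, wander]
command = arbitrate(stack, {"obstacle_distance": 0.2, "light": 0.9})
```

There is no world model and no learning in this sketch: each behavior reads the sensors and either claims the actuators or stays silent.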

  19. Architecture Lessons (Subsumption)

  20. Architecture Lessons (Subsumption) • Action from perception can provide the further structure -- modules (behaviors). • Modules also support iterative development / continuous integration.

  21. Architecture Lessons (Subsumption) • Action from perception can provide the further structure -- modules (behaviors). • Modules also support iterative development / continuous integration. • Real time should be a core organizing principle -- start in the real world.

  22. Architecture Lessons (Subsumption) • Action from perception can provide the further structure -- modules (behaviors). • Modules also support iterative development / continuous integration. • Real time should be a core organizing principle -- start in the real world. • Good ideas can carry bad ideas a long way (no learning, hard action selection).

  23. Architecture Lesson? • Goal ordering needs to be flexible.

  24. Architecture Lesson? • Goal ordering needs to be flexible. • Maybe spreading activation is good for this.

  25. SA: Layers vs. Behaviours • Relationship not evident except in development!

  30. Layered or Hybrid Architectures 1. Incorporate behaviors/modules (action from sensing) as “smart” primitives. 2. Use hierarchical dynamic plans for behavior sequencing. 3. (Allegedly) some have an automated planner to make plans for layer 2. • Examples: Firby/RAPS/3T ('97); PRS (1992-2000); Hexmoor '95; Gat '91-98

  31. Belief, Desires, Intentions (BDI) • Beliefs: predicates • Desires: goals & related dynamic plans • Intentions: current goal
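
The three BDI components can be sketched as a small data structure. This is a minimal illustration that assumes beliefs are a set of true predicates, each desire pairs a goal with a plan and an applicability test, and the intention is the highest-priority applicable desire. The class and field names are hypothetical and do not come from any particular BDI system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Desire:
    """A goal plus the plan that pursues it (hypothetical structure)."""
    goal: str
    priority: int
    applicable: Callable[[Set[str]], bool]  # tested against current beliefs
    plan: List[str]                         # steps of the associated plan


@dataclass
class BDIAgent:
    beliefs: Set[str]                   # predicates currently held true
    desires: List[Desire]               # all goals the agent could pursue
    intention: Optional[Desire] = None  # the goal currently committed to

    def deliberate(self) -> Optional[Desire]:
        """Commit to the highest-priority desire that the beliefs allow."""
        options = [d for d in self.desires if d.applicable(self.beliefs)]
        self.intention = max(options, key=lambda d: d.priority, default=None)
        return self.intention


agent = BDIAgent(
    beliefs={"battery_ok"},
    desires=[
        Desire("patrol", 1, lambda b: "battery_ok" in b, ["plan_route", "drive"]),
        Desire("evacuate", 9, lambda b: "fire_detected" in b, ["sound_alarm", "exit"]),
    ],
)
agent.deliberate()                 # commits to "patrol"
agent.beliefs.add("fire_detected")
agent.deliberate()                 # a new belief changes the intention
```

Calling `deliberate` again after the beliefs change is how this sketch models reacting to an emergency by changing intentions.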

  32. Procedural Reasoning System

  33. Procedural Reasoning System • BDI

  34. Procedural Reasoning System • BDI • And reactive (responds to emergencies by changing intentions.)

  35. Procedural Reasoning System • BDI • And reactive (responds to emergencies by changing intentions.) • Er... once or twice (Bryson ATAL 2000).

  36. Architecture Lessons • Structured dynamic plans make it easier to get your robot to do complicated stuff. • Automated planning (or for Soar, chunking/learning) is seldom actually used. • To facilitate that automated planning, modularity is often compromised. (Bryson JETAI 2000, Brom & Bryson 2006)

  37. Soar as a 3LA J. Laird & P. Rosenbloom, “The Evolution of the Soar Cognitive Architecture”, Mind Matters, D. Steier and T. Mitchell eds., 1996.

  38. CogAff • Reflection on Top. • Sense & Action separated! • (Davis & Sloman 1995)

  39. CogAff • Reflection on Top. • Sense & Action separated! • Hierarchy in AS; Goal Swapping (Alarms). • (Sloman 2000)

  40. CogAff • Reflection on Top. • Sense & Action separated! • Hierarchy in AS, Goal Swapping (now reactive). • Current Web

  41. Separate Sense & Action • Something we higher mammals do. • Central Sulcus Chance for Cognition? (pictures from Carlson)

  42. Architecture Lessons (CogAff)

  43. Architecture Lessons (CogAff) • Maybe you don’t really want productions as your basic representation -- you may want to come between a sense and an act sometimes.

  44. Architecture Lessons (CogAff) • Maybe you don’t really want productions as your basic representation -- you may want to come between a sense and an act sometimes. • Your architecture looks very different if you really worry about adult human literary-level behaviour rather than just making something work.

  45. Outline • Introduction: Intelligence, Cognition & Architecture • A Brief History of AI Cognitive Architectures • Behavior Oriented Design

  46. Outline • Introduction: Intelligence, Cognition & Architecture • A Brief History of AI Cognitive Architectures • Behavior Oriented Design • Conclusion / Recommendations

  47. Behavior Oriented Design • All search (learning, planning) is done within modules with specialized representations. • Specialized representations promote reliability of search; also determine decomposition. • Modules provide perception, action, memory. Arbitration via hierarchical dynamic plans. • Iterative / agile test & development cycle. (Bryson 2001, 2003)

  48. BOD Applications

  49. BOD Applications (ATAL 1997)

  50. BOD Applications (ATAL 1997) (SAB 2000) [Figures: free-flow hierarchy as in Tyrrell (1993); reactive plan rooted at life (D) with elements flee (C), mate (C), triangulate, home, check, exploit (C) and sleep at home; bar chart of Fitness for (Sparse)Std, (Sparse)Var1, (Sparse)Var2, (Sparse)Var3.]

  51. BOD Applications (ATAL 1997) (SAB 2000) (VR(J) 2000) [Figures: free-flow hierarchy as in Tyrrell (1993); reactive plan rooted at life (D) with elements flee (C), mate (C), triangulate, home, check, exploit (C) and sleep at home; bar chart of Fitness for (Sparse)Std, (Sparse)Var1, (Sparse)Var2, (Sparse)Var3.]

  52. BOD Applications (ATAL 1997) (SAB 2000) (VR(J) 2000) (Animal Cog 2007, CogSci 2009) [Figures: free-flow hierarchy as in Tyrrell (1993); reactive plan rooted at life (D) with elements flee (C), mate (C), triangulate, home, check, exploit (C) and sleep at home; bar chart of Fitness for (Sparse)Std, (Sparse)Var1, (Sparse)Var2, (Sparse)Var3; module diagram of a monkey task-learning model with Action Selection, apparatus, monkey, visual-attention, rule-learner and sequence modules.]

  53. BOD Applications (ATAL 1997) (SAB 2000) (VR(J) 2000) (WRAC 2003, PTRS B 2007, BICA 2008) (Animal Cog 2007, CogSci 2009) [Figures: free-flow hierarchy as in Tyrrell (1993); reactive plan rooted at life (D) with elements flee (C), mate (C), triangulate, home, check, exploit (C) and sleep at home; bar chart of Fitness for (Sparse)Std, (Sparse)Var1, (Sparse)Var2, (Sparse)Var3; module diagram of a monkey task-learning model with Action Selection, apparatus, monkey, visual-attention, rule-learner and sequence modules.]

  54. BOD Applications (ATAL 1997) (SAB 2000) (VR(J) 2000) (WRAC 2003, PTRS B 2007, BICA 2008) (Animal Cog 2007, CogSci 2009) (IVA 2005, CGames 2006, IEEE SMC 2007) [Figures: free-flow hierarchy as in Tyrrell (1993); reactive plan rooted at life (D) with elements flee (C), mate (C), triangulate, home, check, exploit (C) and sleep at home; bar chart of Fitness for (Sparse)Std, (Sparse)Var1, (Sparse)Var2, (Sparse)Var3; module diagram of a monkey task-learning model with Action Selection, apparatus, monkey, visual-attention, rule-learner and sequence modules.]

  55. Modularity is not Enough Get Fuzzy (Conley 2006)

  56. BOD Action Selection Parallel-rooted, Ordered, Slip-stack Hierarchical (POSH) action selection: • Some things need to be checked at all times: drive collection. • Some things only need considering in particular context: competences. • Some things reliably follow from others: action patterns.
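
The three POSH aggregates can be sketched as nested data structures. This is a minimal illustration under simplifying assumptions: it re-evaluates the whole plan from the root on every step, ignoring the slip-stack and any scheduling of drives, and every class, trigger, and action name in it is hypothetical rather than taken from a POSH implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

State = Dict[str, bool]
Trigger = Callable[[State], bool]


@dataclass
class ActionPattern:
    """Actions that reliably follow one another: a fixed sequence."""
    actions: List[str]


@dataclass
class Competence:
    """Elements considered only in a particular context, highest priority first."""
    elements: List["Element"]


@dataclass
class Element:
    name: str
    trigger: Trigger
    child: Union[str, ActionPattern, Competence]


@dataclass
class DriveCollection:
    """Root of the plan: drives are checked on every cycle, in priority order."""
    drives: List[Element]


def expand(child: Union[str, ActionPattern, Competence], state: State) -> List[str]:
    """Resolve a plan element down to a list of primitive actions."""
    if isinstance(child, str):
        return [child]
    if isinstance(child, ActionPattern):
        return list(child.actions)
    for element in child.elements:
        if element.trigger(state):
            return expand(element.child, state)
    return []


def step(plan: DriveCollection, state: State) -> List[str]:
    """Fire the highest-priority drive whose trigger holds."""
    for drive in plan.drives:
        if drive.trigger(state):
            return expand(drive.child, state)
    return []


flee = Competence([
    Element("freeze", lambda s: s.get("covered", False), "hold_still"),
    Element("run_away", lambda s: True, ActionPattern(["pick_safe_dir", "go_fast"])),
])
life = DriveCollection([
    Element("flee", lambda s: s.get("see_predator", False), flee),
    Element("explore", lambda s: True, "wander"),
])
actions = step(life, {"see_predator": True, "covered": False})
```

Because the drive collection is rechecked on every call to `step`, a high-priority drive such as `flee` can take over as soon as its trigger becomes true, while the lower-priority default runs the rest of the time.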

  57. POSH plan in ABODE (for UT: Capture the Flag) • Advanced BOD Environment. • Initial development funded by industry.
