Categorization by Sensory-Motor Interaction in Artificial Agents



SLIDE 1

Categorization by Sensory-Motor Interaction in Artificial Agents

Martin Takáč

Dept. of Applied Informatics

Faculty of Mathematics, Physics, and Informatics
Comenius University, Slovakia
takac@ii.fmph.uniba.sk
http://www.fmph.uniba.sk/~takac

SLIDE 2

Abstract

We propose a computational model of categorization that grounds categories in sensory-motor interaction with a dynamic environment. Simple perceptual categories are represented as discrimination criteria: membership functions based on distributional information about the intra-cluster variances of a category's properties, a representation that is more effective for predictions than prototypes. Complex categories are represented as cross-categorial associations of criteria of objects, actions and changes; hence they support action-based inferences and can serve as grounded meanings not only for nouns and adjectives, but also for at least elementary verbs in models of language evolution and acquisition.

SLIDE 3

The Goal

1. Propose and test a cognitively relevant representation of various types of categories that could serve as grounded meanings in language models.

2. Propose and test mechanisms of category formation by sensory-motor interaction with a (simulated) dynamic environment.

Cognitive Relevance

The proposed model is consistent with the following findings and hypotheses:

Perceptual symbols, context sensitivity of representation (Barsalou).
Basic-level categories, prototype effects (Rosch).
Geometric conceptual representation (Gärdenfors).
Importance of similarity (Tversky).
Representation of affordances (Gibson) and verb-islands (Tomasello).
Neuroscience: simple categories in perceptual subsystems (Ungerleider and Mishkin, Orban et al., Rizzolatti et al.), complex categories in association areas.

SLIDE 4

The model

Environment

A non-toroidal 2D grid with objects whose properties can change in (discrete) time. The agent senses objects on the grid within some distance from itself.

Actions

The agent has a repertoire of actions (e.g. touch, lift, move), which can be performed with particular parameters, e.g. pushing with different forces, stretching the arm at different angles, or walking with steps of different sizes. Actions performed on an object change its attribute values.

Perception

The agent's sensations take the form of perceptual frames of objects, actions and changes.

Object frame example: {weight: 10, size: 3, posX: 2, posY: 6}

Action frame example: moveBy, {x: 0, y: −8}
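One way to picture the three kinds of frames is as plain key-value records. The sketch below is only an illustration: the helper name `change_frame`, the `Δ` key prefix, and the idea that a change frame lists just the attributes whose values differ before and after an action are assumptions, not details given in the slides.

```python
def change_frame(before, after):
    """Return a change frame holding only the attributes that changed.

    Assumption: a change is the numeric difference after - before,
    stored under the attribute name prefixed with a delta sign.
    """
    return {
        f"Δ{attr}": after[attr] - before[attr]
        for attr in before
        if attr in after and after[attr] != before[attr]
    }


# Object frame from the example above, and a hypothetical later state
# in which the object has moved two cells to the left.
obj_before = {"weight": 10, "size": 3, "posX": 2, "posY": 6}
obj_after = {"weight": 10, "size": 3, "posX": 0, "posY": 6}

# An action frame: an action type plus the parameters of this execution.
action = ("moveBy", {"x": 0, "y": -8})

print(change_frame(obj_before, obj_after))
```

Keeping only the changed attributes makes "no change" an empty frame, which is one possible convention among several.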

SLIDE 5

The action type (e.g. moveBy) is an abstraction of the agent's non-declarative knowledge of the action, a motor stereotype of the action's invariant characteristics, while the action parameters are the varying characteristics of a particular execution of the action.

Change frame example: {∆posX: −2}

Representation of Categories

Categories are represented by discrimination criteria. Each discrimination criterion representing a particular concept is a membership function that takes a perceptual frame as an argument and returns a value from the closed interval [0, 1], expressing to what extent the frame is an instance of the concept (0 means not at all, 1 means the best, prototypical example).

A discrimination criterion records the mean and variance of each attribute common to all instances of the concept seen so far. The membership function r evaluates the similarity of the input percept f to the mean case of the category, with the contribution of each attribute a inversely weighted by its variance:

$$r(f) = \mathrm{sim}(r, f) = e^{-k \cdot \mathrm{dist}(r, f)}$$

SLIDE 6

where

$$\mathrm{dist}(r, f) = \frac{1}{|A_r|} \sum_{a \in A_r} \frac{(f.a - r.a)^2}{\sigma^2(r.a)}$$

Categorization Process

Objects and actions are grouped into categories by the change. That is, if an action leads to the same change on several objects, they will all fall in the same category, and vice versa. All action categories associated with an object category represent the agent's knowledge of the affordances of the object, while all object categories associated with an action category form the precursor of a verb-centered semantic representation, a verb island.

For a perceived object, action and change, the most similar of the stored cross-categorial associations is found. If the change category of the association is similar enough to the perceived change (the similarity is greater than the threshold θ(t)), the percepts are considered instances of the associated categories and all three categories are updated by the percepts. Otherwise, a new category is created for the less similar percept, either the object or the action.

The prediction threshold θ(t) increases in time to model the child's growing ability to distinguish differences in the environment.
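The criterion and the matching step described above can be sketched roughly as follows. This is a minimal illustration under several assumptions that the slides do not state: the class and function names are hypothetical, variances are floored at 1.0 to avoid division by zero, k is set to 0.5, the distance is taken as the variance-weighted mean of squared differences, the "most similar association" is chosen by the product of object and action memberships, action types are ignored (action frames are plain parameter dictionaries), and a failed match creates a whole new association rather than a new category for only the less similar percept.

```python
import math


class Criterion:
    """Discrimination criterion: running mean and variance per attribute."""

    def __init__(self, frame):
        self.n = 1
        self.mean = {a: float(v) for a, v in frame.items()}
        self.m2 = {a: 0.0 for a in frame}  # summed squared deviations

    def update(self, frame):
        # Keep only attributes common to all instances seen so far.
        for a in [a for a in self.mean if a not in frame]:
            del self.mean[a], self.m2[a]
        self.n += 1
        for a in self.mean:  # Welford's incremental update
            delta = frame[a] - self.mean[a]
            self.mean[a] += delta / self.n
            self.m2[a] += delta * (frame[a] - self.mean[a])

    def dist(self, frame):
        if not self.mean or any(a not in frame for a in self.mean):
            return math.inf  # assumption: missing attributes mean no match
        total = 0.0
        for a in self.mean:
            # Assumption: floor the variance at 1.0 so attributes that
            # have not varied yet do not cause division by zero.
            var = max(self.m2[a] / self.n, 1.0)
            total += (frame[a] - self.mean[a]) ** 2 / var
        return total / len(self.mean)

    def membership(self, frame, k=0.5):  # k = 0.5 is an arbitrary choice
        return math.exp(-k * self.dist(frame))


def categorize(associations, obj, action, change, theta):
    """One step over a list of (object, action, change) criterion triples."""
    best = max(associations,
               key=lambda t: t[0].membership(obj) * t[1].membership(action),
               default=None)
    if best is not None and best[2].membership(change) > theta:
        for criterion, frame in zip(best, (obj, action, change)):
            criterion.update(frame)
        return best
    # Simplification: start a whole new association instead of creating a
    # new category only for the less similar of the object or the action.
    new = (Criterion(obj), Criterion(action), Criterion(change))
    associations.append(new)
    return new
```

Calling `categorize` repeatedly with a fixed `theta` grows the list of associations only when the predicted change is missed; a time-dependent θ(t) would be passed in by the caller.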

SLIDE 7

Experiment

The 25 × 25 environment contained the agent and 30 randomly placed objects: 10 "fruits", 10 "toys" and 10 "pieces of furniture". The initial values of object attributes were randomly generated from the respective intervals of a predefined pattern, e.g.

{weight: [1, 3], size: [1, 49], color: [0, 4], roundness: [0, 9], posX: [0, 24], posY: [0, 24], posZ: 0}

for fruits, and

{weight: [20, 49], size: [20, 49], legs: [0, 4], material: [0, 9], posX: [0, 24], posY: [0, 24], posZ: 0}

for pieces of furniture.

In each time step, the agent randomly chose one of the objects and performed on it an action randomly generated from the pattern

actionType: liftUp, {armPosIncrease: [1, 9], force: [1, 19]}

or

actionType: putDown, {armPosDecrease: [1, 9]}.

The effects of the action on the chosen object were simulated by the environment: the liftUp action increased the posZ attribute of the object by the value of armPosIncrease if the force was greater than the weight of the object; otherwise the action had no effect. The putDown action set the posZ attribute of the object to max(0, posZ − armPosDecrease).
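The environment rules for the two actions are simple enough to write down directly. The sketch below follows the description above; the function names, the use of integer values with inclusive interval bounds, and the equal chance of picking either action type are assumptions made for illustration.

```python
import random


def apply_action(obj, action_type, params):
    """Return a new object frame after the environment applies the action."""
    new_obj = dict(obj)
    if action_type == "liftUp":
        # The lift succeeds only if the force exceeds the object's weight.
        if params["force"] > obj["weight"]:
            new_obj["posZ"] = obj["posZ"] + params["armPosIncrease"]
    elif action_type == "putDown":
        # The object cannot sink below the ground level.
        new_obj["posZ"] = max(0, obj["posZ"] - params["armPosDecrease"])
    else:
        raise ValueError(f"unknown action type: {action_type}")
    return new_obj


def random_action(rng):
    """Draw an action from the patterns above (assumed: integers, 50/50)."""
    if rng.random() < 0.5:
        return "liftUp", {"armPosIncrease": rng.randint(1, 9),
                          "force": rng.randint(1, 19)}
    return "putDown", {"armPosDecrease": rng.randint(1, 9)}


fruit = {"weight": 2, "posZ": 0}
furniture = {"weight": 30, "posZ": 0}
lift = {"armPosIncrease": 4, "force": 5}

print(apply_action(fruit, "liftUp", lift))      # lifted: force 5 > weight 2
print(apply_action(furniture, "liftUp", lift))  # unchanged: too heavy
```

Since the force never exceeds 19 while furniture weighs at least 20, liftUp can never move a piece of furniture under these rules.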

SLIDE 8

The Results

Threshold Effect

[Figure (a): prediction, generality and threshold, and the total number of criteria, plotted against time.]

(a) While the prediction threshold is low, the agent uses only a few basic criteria. Then the number of criteria starts to increase rapidly, which leads to more accurate predictions. As the threshold stabilizes, the total number of criteria slowly saturates, while the generality decays exponentially to a certain value. The prediction value converges to approximately 0.7, which corresponds to the average distance σ, i.e. the average intra-cluster distance of the category; hence the criteria give correct predictions.

SLIDE 9

Merging

[Figure (b): prediction, generality and threshold, and the total number of criteria, plotted against time.]

(b) Merging of similar criteria keeps the number of criteria lower at the cost of lower accuracy (higher generality).

In the second experiment, the agent merges similar criteria every 50th time step from time 1000 onward. This decreases the total number of criteria at the cost of more general predictions. The prediction value again converges to approximately 0.7, i.e. the criteria give correct predictions.
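The slides do not say how two criteria are combined when they are merged. One plausible approach, shown here purely as an assumption, is to pool the per-attribute statistics with the standard pairwise formula for combining means and variances of two samples. The function name `merge_stats` is hypothetical, and variances are treated as population variances.

```python
def merge_stats(n1, mean1, var1, n2, mean2, var2):
    """Pool count, mean and population variance of two samples."""
    n = n1 + n2
    delta = mean2 - mean1
    mean = mean1 + delta * n2 / n
    # Sum of squared deviations of the union: within-sample parts plus
    # a between-sample term that grows with the gap between the means.
    m2 = var1 * n1 + var2 * n2 + delta ** 2 * n1 * n2 / n
    return n, mean, m2 / n


# Two criteria for one attribute, built from the samples [1, 3] and [5, 7].
n, mean, var = merge_stats(2, 2.0, 1.0, 2, 6.0, 1.0)
print(n, mean, var)
```

In this example each criterion has variance 1, but the merged one has variance 5: the merged criterion accepts a wider range of values, which matches the higher generality reported for the merging experiment.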

SLIDE 10

Comparison to Prototypes

[Figure (c): prediction, generality and threshold, and the total number of criteria, plotted against time.]

(c) The proposed representation copes with the different importance of attributes (or the different scaling of dimensions) by recording their intra-category variances.

To compare it with a standard prototype representation, we ran an experiment in which criteria behaved like prototypes in conceptual spaces (the terms of the sum in the distance formula were not divided by the variances). Even though the criteria were merged, the number of necessary criteria is almost double and the prediction value is lower than in the case with variances.
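A small numeric example may help show why dividing by the variances matters. The category statistics and test frames below are made up for illustration, and the two distance functions are sketches (both averaged over the attributes) rather than the exact formulas used in the experiment.

```python
def weighted_dist(mean, var, frame):
    """Mean squared difference, each term divided by the attribute variance."""
    return sum((frame[a] - mean[a]) ** 2 / var[a] for a in mean) / len(mean)


def prototype_dist(mean, frame):
    """Same sum without the variances, as in a plain prototype."""
    return sum((frame[a] - mean[a]) ** 2 for a in mean) / len(mean)


# Hypothetical category: weight is tightly clustered, position is not.
mean = {"weight": 2.0, "posX": 12.0}
var = {"weight": 1.0, "posX": 49.0}

moved = {"weight": 2.0, "posX": 19.0}  # typical weight, ordinary position
heavy = {"weight": 9.0, "posX": 12.0}  # very unusual weight

print(prototype_dist(mean, moved), prototype_dist(mean, heavy))
print(weighted_dist(mean, var, moved), weighted_dist(mean, var, heavy))
```

Without variances both frames are equally far from the prototype (24.5), so they cannot be told apart. With variances the displaced object stays close (0.5) while the unusually heavy one is far away (24.5), because position varies widely within the category and weight does not.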

SLIDE 11

Example of resulting categories

Object criteria:

      posX     posY     posZ      weight    color
C1    13 ± 7   13 ± 8   0 ± 0     37 ± 23
C2    14 ± 7   13 ± 8   4 ± 13    39 ± 22
C3    11 ± 6   10 ± 7   35 ± 28   4 ± 3     2 ± 2
C4    11 ± 6   10 ± 7   25 ± 23   4 ± 3

Associations (rows: object category; columns: action):

Category   putDown(5 ± 3)        liftUp(6 ± 2, 10 ± 6)
C1         no change
C2         no change
C3         ∆={posZ: −6 ± 1}      ∆={posZ: 7 ± 1}
C4         ∆={posZ: −4 ± 2}      ∆={posZ: 5 ± 2}

Number of objects of each type, by the category they are most similar to (categories C1 to C4):

agent: 1
fruit: 8, 2
toy: 1, 3, 6
furniture: 5, 5

SLIDE 12

Conclusion

The proposed representation allows for:

graded category membership and prototype effects,
sensitivity to differences in intra-category variability of particular properties,
hierarchical categories of different generality,
asymmetric similarity judgements,
situated perceptual representation of objects, changes and actions,
more effective predictions than prototypes without variances.

Once an agent can represent predictions about the outcome of its actions, it can use them for planning sequences of actions (macro-affordances) to satisfy its needs and goals.

Cross-categorial representation is inherently contextual: different affordances are picked up depending on the desired effect on the environment.