Guiding Interaction Behaviors for Multi-modal Grounded Language - - PowerPoint PPT Presentation



SLIDE 1

Guiding Interaction Behaviors for Multi-modal Grounded Language Learning

Jesse Thomason, Jivko Sinapov & Raymond J. Mooney

Presented by Siliang Lu

SLIDE 2

Multi-modal grounded language learning

Modalities: audio, haptics, visual colors and shapes

Language predicates map to physical properties of objects in the world

Multiple modalities

Visual predicates (e.g. “red”) vs. non-visual predicates (e.g. “empty”)

Interaction behaviors

Behaviors: look, drop, grasp, hold, lift, lower, press, push
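The behaviors above combine with the sensory modalities to form sensorimotor contexts. A minimal sketch of enumerating those contexts, assuming (as an illustration, not from the slides) that “look” yields only visual signals while the manipulation behaviors yield audio and haptic signals:

```python
from itertools import product

# Behaviors and modalities from the slides; the pairing rules below are
# an assumption for illustration, not stated in the presentation.
behaviors = ["look", "drop", "grasp", "hold", "lift", "lower", "press", "push"]
modalities = ["audio", "haptics", "visual_color", "visual_shape"]

def valid_contexts():
    # A sensorimotor context is a (behavior, modality) pair.
    for b, m in product(behaviors, modalities):
        if b == "look" and m.startswith("visual"):
            yield (b, m)
        elif b != "look" and m in ("audio", "haptics"):
            yield (b, m)

contexts = list(valid_contexts())
print(len(contexts))  # -> 16
```

Under these assumed pairing rules, “look” contributes 2 visual contexts and each of the 7 manipulation behaviors contributes 2 non-visual contexts.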

SLIDE 3

Classification

SLIDE 4

Consideration of only validation confidence

Method:

  • An SVM trained on the feature space of each sensorimotor context (a combination of a behavior and a sensory modality)
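A minimal sketch of the per-context classifier idea: one binary predicate classifier per sensorimotor context. The paper trains an SVM per context; a nearest-centroid classifier stands in here so the example needs only NumPy, and the feature vectors are made up:

```python
import numpy as np

class ContextClassifier:
    """Stand-in for the per-context SVM: nearest-centroid over the
    context's feature space (illustrative only)."""

    def fit(self, X, y):
        X, y = np.asarray(X, float), np.asarray(y)
        self.pos = X[y == 1].mean(axis=0)  # centroid of positive examples
        self.neg = X[y == 0].mean(axis=0)  # centroid of negative examples
        return self

    def predict(self, X):
        X = np.asarray(X, float)
        d_pos = np.linalg.norm(X - self.pos, axis=1)
        d_neg = np.linalg.norm(X - self.neg, axis=1)
        return (d_pos < d_neg).astype(int)

# Toy features for a hypothetical (grasp, haptics) context and the
# predicate "heavy".
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
clf = ContextClassifier().fit(X, y)
print(clf.predict([[0.85, 0.85], [0.15, 0.15]]))  # -> [1 0]
```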

SLIDE 5

Consideration of only validation confidence

SLIDE 6

Confidence and behavior annotations

SLIDE 7

Confidence and behavior annotations

SLIDE 8

Confidence and multi-modality annotations

Modalities: auditory, haptic, visual color, and visual shape (FPFH: Fast Point Feature Histograms)

SLIDE 9

Sharing confidence between related predicates

  • Calculating cosine distance in word-embedding space using Word2Vec

SLIDE 10

Sharing confidence between related predicates

e.g. if kappa of the grasp/haptic context is high for the predicate “thin”, we should also trust the grasp/haptic sensorimotor context for the related predicate “narrow”
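A minimal sketch of this sharing step: weight each neighbor predicate's per-context kappa by its embedding similarity to the target predicate. The paper uses Word2Vec vectors; the tiny vectors and kappa scores below are made up for illustration:

```python
import numpy as np

def cosine_sim(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings and per-context kappa scores (not from the paper).
emb = {"narrow": [0.9, 0.1], "thin": [0.85, 0.15], "red": [0.0, 1.0]}
kappa = {"thin": {("grasp", "haptics"): 0.8},
         "red": {("look", "visual_color"): 0.9}}

def shared_kappa(predicate, context):
    # Similarity-weighted average of the neighbors' kappa for this context.
    num = den = 0.0
    for other, scores in kappa.items():
        if other == predicate or context not in scores:
            continue
        w = max(cosine_sim(emb[predicate], emb[other]), 0.0)
        num += w * scores[context]
        den += w
    return num / den if den else 0.0

print(round(shared_kappa("narrow", ("grasp", "haptics")), 2))  # -> 0.8
```

With a single relevant neighbor (“thin”), the shared score reduces to that neighbor's kappa; with several neighbors it becomes a similarity-weighted blend.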

SLIDE 11

Results

SLIDE 12

Results

  • Adding behavior annotations or modality annotations improves performance over using kappa alone
  • Sharing kappa information improves recall at the cost of precision
  • The trade-off is due to real-world “noise” in specific domains (e.g. “water” correlated with object weights)

SLIDE 13

Future work

  • Apply behavior annotations in an embodied dialog agent
  • Explore other methods of sharing information between predicates, such as using a maximally similar neighbor word (e.g. the best neighbor of “narrow” is “thin”)
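The maximally-similar-neighbor idea can be sketched as an argmax over embedding similarity; the toy vectors below are made up for illustration:

```python
import numpy as np

# Hypothetical word embeddings (not real Word2Vec vectors).
emb = {"narrow": np.array([0.9, 0.1]),
       "thin": np.array([0.85, 0.15]),
       "red": np.array([0.0, 1.0])}

def best_neighbor(word):
    # Pick the single most similar predicate in embedding space.
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    others = [w for w in emb if w != word]
    return max(others, key=lambda w: cos(emb[word], emb[w]))

print(best_neighbor("narrow"))  # -> thin
```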
SLIDE 14

Thanks!