SLIDE 3 15/09/2017 3
- Simply running a DSM on a bilingual corpus would result in
clustering by language → vectors do not reflect semantics
- (Most?) current work in bilingual DSMs:
− Get two monolingual vector spaces
− Combine them such that translation equivalents receive similar vectors
- No way to tell the languages apart
− Not evaluated against human processing data
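The clustering-by-language problem in the first bullet can be seen in a deliberately extreme toy case. The sketch below is only an illustration: the four-sentence English/Dutch corpus, the sentence-level co-occurrence counts, and the helper `cosine` are assumptions made for this example, not part of the original slides.

```python
import numpy as np

# Hypothetical toy corpus mixing English and Dutch sentences.
corpus = [
    "the dog barks", "the cat sleeps",
    "de hond blaft", "de kat slaapt",
]
vocab = sorted({w for s in corpus for w in s.split()})
index = {w: i for i, w in enumerate(vocab)}

# Count-based DSM: a word's vector counts the words it shares a sentence with.
counts = np.zeros((len(vocab), len(vocab)))
for sentence in corpus:
    words = sentence.split()
    for w in words:
        for c in words:
            if w != c:
                counts[index[w], index[c]] += 1

def cosine(a, b):
    """Cosine similarity between the co-occurrence vectors of two words."""
    va, vb = counts[index[a]], counts[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

same_language = cosine("dog", "cat")   # both occur with "the"
translation = cosine("dog", "hond")    # no shared context words at all
```

Here `same_language` is 0.5 while `translation` is 0.0: because English and Dutch words never share a context, the space separates by language, and the translation pair looks less related than two different English words.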
Challenge #2
Models of the bilingual lexicon
Artetxe, Labaka, & Agirre (ACL, 2017)
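The two-step recipe on this slide (two monolingual spaces, then a combination that brings translation equivalents together) can be sketched with an orthogonal Procrustes mapping. This is one standard way to do the combination step and is not claimed to be the exact procedure of the cited work; the random toy vectors, the seed dictionary of four word pairs, and the function name `procrustes_map` are all assumptions made for illustration.

```python
import numpy as np

def procrustes_map(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    Rows of src and tgt are vectors of known translation pairs.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

rng = np.random.default_rng(0)

# Hypothetical monolingual space for language A: 5 words, 3 dimensions.
lang_a = rng.normal(size=(5, 3))

# Idealized language B: the same geometry under an unknown rotation q.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
lang_b = lang_a @ q

# Learn the mapping from a seed dictionary of the first 4 translation pairs.
w = procrustes_map(lang_a[:4], lang_b[:4])

# Map all of language A into B's space, including the held-out 5th word.
mapped = lang_a @ w
held_out_error = float(np.linalg.norm(mapped[4] - lang_b[4]))
```

In this idealized case `w` recovers `q`, so the held-out word lands on its translation. Note that after mapping, the vectors of both languages sit in one space with nothing marking which language a vector came from, which is the limitation the slide points out.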
- The bilingual mental lexicon: Languages are integrated but can
still be told apart
- Psycholinguistic models of the bilingual mental lexicon:
‒ Account for response/naming times, cognate/homograph effects, interlingual priming, etc.
‒ But: small vocabularies, not trainable, no realistic semantics
Costa et al. (Cognitive Science, 2017)