Slide 10
Embeddings
Chen and Manning (2014) use the following embeddings.

Word embeddings S^w (18 elements):
1. the top three words on the stack and buffer: s1, s2, s3, b1, b2, b3;
2. the first and second leftmost/rightmost children of the top two words on the stack: lc1(si), rc1(si), lc2(si), rc2(si), for i = 1, 2;
3. the leftmost child of the leftmost child and the rightmost child of the rightmost child of the top two words on the stack: lc1(lc1(si)), rc1(rc1(si)), for i = 1, 2.

Tag embeddings S^t (18 elements): the POS tags of the same words.

Arc label embeddings S^l (12 elements): the arc labels of the same words, excluding the six words on the stack/buffer.
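To make the 18 word positions concrete, here is a minimal sketch of how they could be extracted from a parser configuration. The `Config` class, its field names, and the padding convention are illustrative assumptions, not Chen and Manning's actual implementation; missing positions are returned as `None` (in practice these map to a special NULL embedding).

```python
class Config:
    """Hypothetical parser configuration: stack, buffer, and child lists."""
    def __init__(self, stack, buffer, left_children, right_children):
        self.stack = stack              # word ids, top of stack at the end
        self.buffer = buffer            # word ids, front of buffer first
        self.left = left_children      # word id -> children, leftmost first
        self.right = right_children    # word id -> children, rightmost first

    def s(self, i):
        """i-th word from the top of the stack (1-based), or None."""
        return self.stack[-i] if len(self.stack) >= i else None

    def b(self, i):
        """i-th word from the front of the buffer (1-based), or None."""
        return self.buffer[i - 1] if len(self.buffer) >= i else None

    def lc(self, w, k=1):
        """k-th leftmost child of word w, or None."""
        kids = self.left.get(w, [])
        return kids[k - 1] if w is not None and len(kids) >= k else None

    def rc(self, w, k=1):
        """k-th rightmost child of word w, or None."""
        kids = self.right.get(w, [])
        return kids[k - 1] if w is not None and len(kids) >= k else None


def word_features(c):
    """Return the 18 word positions S^w as a list (None = padding)."""
    # (1) top three words on stack and buffer
    feats = [c.s(1), c.s(2), c.s(3), c.b(1), c.b(2), c.b(3)]
    # (2) first and second leftmost/rightmost children of s1, s2
    for i in (1, 2):
        si = c.s(i)
        feats += [c.lc(si), c.rc(si), c.lc(si, 2), c.rc(si, 2)]
    # (3) leftmost-of-leftmost and rightmost-of-rightmost children of s1, s2
    for i in (1, 2):
        si = c.s(i)
        feats += [c.lc(c.lc(si)), c.rc(c.rc(si))]
    return feats
```

The tag features S^t would be the POS tags at the same 18 positions, and the label features S^l the arc labels at the 12 child positions (groups 2 and 3 above).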
Frank Keller Natural Language Understanding 9