 
TRAPACC and TRAPACCS at PARSEME Shared Task 2018: Neural Transition Tagging of Verbal Multiword Expressions
Regina Stodden, Behrang Qasemizadeh, Laura Kallmeyer
regina.stodden@uni-duesseldorf.de
SFB991, Heinrich Heine University Düsseldorf, Germany
25.08.2018
Modified arc-standard Transition System
based on [Constant and Nivre, 2016, Saied et al., 2017]
• The n input tokens are moved from a buffer to a stack and to a set of processed lexical units P, using 2 × n 'transitions' of the following types: Shift, Reduce, Merge and Complete.
• Transitions and their types are learned/predicted using a classifier.

| Transition | Stack | Buffer | MWEs |
|---|---|---|---|
| INIT | [] | ['ROOT', 'thank', 'you', 'for', 'you', 'time', 'and', 'attention', '.'] | [] |
| SHIFT | ['ROOT'] | ['thank', 'you', 'for', 'you', 'time', 'and', 'attention', '.'] | [] |
| SHIFT | ['ROOT', 'thank'] | ['you', 'for', 'you', 'time', 'and', 'attention', '.'] | [] |
| SHIFT | ['ROOT', 'thank', 'you'] | ['for', 'you', 'time', 'and', 'attention', '.'] | [] |
| MERGE AS ID | ['ROOT', ['thank', 'you']] | ['for', 'you', 'time', 'and', 'attention', '.'] | [] |
| COMPLETE AS ID | ['ROOT'] | ['for', 'you', 'time', 'and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', 'for'] | ['you', 'time', 'and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | ['you', 'time', 'and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', 'you'] | ['time', 'and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | ['time', 'and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', 'time'] | ['and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | ['and', 'attention', '.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', 'and'] | ['attention', '.'] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | ['attention', '.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', 'attention'] | ['.'] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | ['.'] | ['thank', 'you', 'VID'] |
| SHIFT | ['ROOT', '.'] | [] | ['thank', 'you', 'VID'] |
| REDUCE | ['ROOT'] | [] | ['thank', 'you', 'VID'] |
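The transition sequence on this slide can be replayed with a small state machine. The following is a minimal sketch, not the authors' implementation: the `State` class, the `flatten` helper and the way the MWE category is passed to `COMPLETE` are assumptions made for illustration. The transition order is hard-coded here, whereas in the system it is predicted by a classifier.

```python
from typing import List, Optional, Tuple


def flatten(node) -> List[str]:
    """Return the tokens of a (possibly nested) stack element, left to right."""
    if isinstance(node, str):
        return [node]
    tokens: List[str] = []
    for child in node:
        tokens.extend(flatten(child))
    return tokens


class State:
    """Hypothetical parser state: a stack, a buffer and the MWEs found so far."""

    def __init__(self, tokens: List[str]) -> None:
        self.stack: list = []
        self.buffer: List[str] = list(tokens)
        self.mwes: List[Tuple[List[str], Optional[str]]] = []

    def apply(self, transition: str, category: Optional[str] = None) -> None:
        if transition == "SHIFT":
            # Move the first buffer token onto the stack.
            self.stack.append(self.buffer.pop(0))
        elif transition == "REDUCE":
            # Discard the top stack element (it is not part of an MWE).
            self.stack.pop()
        elif transition == "MERGE":
            # Combine the two topmost stack elements into one unit.
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append([left, right])
        elif transition == "COMPLETE":
            # Move the top stack element to the MWE list with its category.
            self.mwes.append((flatten(self.stack.pop()), category))
        else:
            raise ValueError(f"unknown transition: {transition}")


TOKENS = ["ROOT", "thank", "you", "for", "you", "time", "and", "attention", "."]

# Transition sequence from the slide; "VID" is the category shown in the MWEs column.
TRANSITIONS = (
    [("SHIFT", None)] * 3
    + [("MERGE", None), ("COMPLETE", "VID")]
    + [("SHIFT", None), ("REDUCE", None)] * 6
)

state = State(TOKENS)
for name, category in TRANSITIONS:
    state.apply(name, category)

print(state.stack, state.buffer, state.mwes)
```

In this sketch the ROOT element is never reduced, so it remains on the stack at the end, as in the last row of the table.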
Feature Reduction and Classifiers
• The classifier operates on (often) sparse and high-dimensional feature vectors that encode system states.
• We compress these high-dimensional feature vectors to vectors of low dimensionality (500 in our experiments) using the method proposed in [QasemiZadeh and Kallmeyer, 2017].
• We use convolutional neural networks to classify these low-dimensional feature vectors.

[Figure: network architecture. Components shown: Input Layer, Convolution Layer, Max-Pooling Layer, Flatten, Dropout, Dense Layer. TRAPACC output: Dense Layer. TRAPACCS output: Kernel SVM.]
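To make the compression step concrete, here is a minimal sketch of mapping a sparse feature set to a fixed 500-dimensional vector. It uses the generic signed hashing trick as a stand-in; it is not the method of [QasemiZadeh and Kallmeyer, 2017], and the feature names and the `hashed_vector` function are hypothetical.

```python
import hashlib
from typing import Dict, List


def hashed_vector(features: Dict[str, float], dim: int = 500) -> List[float]:
    """Project named sparse features onto `dim` dimensions via signed hashing.

    Assumption: a generic hashing trick, used only to illustrate compressing
    a high-dimensional sparse vector to a dense low-dimensional one.
    """
    vector = [0.0] * dim
    for name, value in features.items():
        digest = hashlib.md5(name.encode("utf-8")).digest()
        # The first four bytes choose the target dimension.
        index = int.from_bytes(digest[:4], "big") % dim
        # The fifth byte chooses a sign, so collisions tend to cancel out.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[index] += sign * value
    return vector


# Hypothetical features describing one system state (stack and buffer content).
state_features = {
    "stack0_lemma=you": 1.0,
    "stack1_lemma=thank": 1.0,
    "buffer0_lemma=for": 1.0,
    "stack0_pos=PRON": 1.0,
}

compressed = hashed_vector(state_features)
print(len(compressed))
```

The resulting dense vector has a fixed size regardless of how many distinct features occur in the training data, which is the property the classifier input needs.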
Results
• With respect to the MWE-based F1 metric averaged across languages, TRAPACC ranks third and TRAPACCS second.
• Compared to other systems, TRAPACC and TRAPACCS reach the best performance for 8 languages, whereas both systems perform poorly in a few languages.
• After a careful feature selection for the [Saied et al., 2017] system, our systems can outperform it only for four languages.

| Lang | TRAPACC MWE-based F1 | TRAPACC MWE-based Rank | TRAPACC Tok-based F1 | TRAPACC Tok-based Rank | TRAPACCS MWE-based F1 | TRAPACCS MWE-based Rank | TRAPACCS Tok-based F1 | TRAPACCS Tok-based Rank | Best using TRAPACC F1 MWE | Best using TRAPACC F1 Token | Best using TRAPACCS F1 MWE | Best using TRAPACCS F1 Token | Comparison to [Saied et al., 2017] F1 MWE | Comparison to [Saied et al., 2017] F1 Token |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BG | 60.83 | 2/9 | 62.35 | 2/10 | 52.57 | 8/9 | 53.47 | 8/10 | 61.05 | 64.18 | 52.57 | 53.47 | 0.87 | 1.73 |
| DE | 44.05 | 2/11 | 48.37 | 4/11 | 45.27 | 1/11 | 49.97 | 3/11 | 45.14 | 48.09 | 45.27 | 49.97 | -9.07 | -4.93 |
| EL | 46.43 | 3/10 | 49.14 | 5/11 | 49.76 | 1/10 | 53.10 | 3/11 | 50.60 | 52.73 | 51.55 | 54.60 | -12.78 | -12.52 |
| EN | 32.88 | 1/10 | 34.37 | 1/10 | 30.28 | 3/10 | 30.23 | 2/10 | 32.88 | 34.37 | 30.28 | 30.23 | 2.58 | 4.23 |
| ES | 31.64 | 3/10 | 38.04 | 4/11 | 33.98 | 1/10 | 39.75 | 2/11 | 32.33 | 38.47 | 33.98 | 39.75 | -7.38 | -6.27 |
| EU | 73.23 | 2/9 | 74.40 | 3/10 | 75.80 | 1/9 | 76.83 | 1/10 | 73.23 | 74.40 | 75.80 | 76.83 | -5.38 | -4.52 |
| FA | 75.48 | 3/9 | 78.12 | 4/10 | 74.23 | 4/9 | 77.03 | 5/10 | 75.48 | 78.12 | 74.23 | 77.03 | -2 | 0.63 |
| FR | 46.97 | 3/12 | 52.93 | 4/13 | 45.96 | 6/12 | 50.12 | 7/13 | 48.28 | 53.86 | 47.95 | 52.09 | -10.54 | -8.81 |
| HE | 20.37 | 3/8 | 24.26 | 4/10 | 16.95 | 6/8 | 18.97 | 6/10 | 21.97 | 26.08 | 18.21 | 20.23 | -16.95 | -14.89 |
| HI | 69.38 | 3/7 | 71.22 | 3/8 | 68.39 | 4/8 | 68.31 | 5/8 | 69.38 | 71.22 | 68.39 | 68.31 | 2.23 | 4.22 |
| HR | 43.39 | 3/9 | 49.97 | 2/10 | 44.27 | 2/9 | 47.95 | 4/10 | 47.94 | 52.39 | 46.36 | 50.66 | -6.74 | -6.24 |
| HU | 90.31 | 1/10 | 88.00 | 1/10 | 90.12 | 2/10 | 87.25 | 2/10 | 90.31 | 88.00 | 90.12 | 87.25 | -3.84 | -4.09 |
| IT | 38.52 | 2/11 | 40.64 | 4/12 | 33.33 | 4/11 | 34.02 | 8/12 | 39.70 | 41.45 | 33.48 | 36.01 | -16.01 | -16.20 |
| LT | 30.82 | 2/7 | 34.43 | 1/10 | 32.17 | 1/7 | 34.10 | 2/10 | 33.63 | 36.00 | 33.21 | 34.32 | 1.43 | 2.64 |
| PL | 60.54 | 2/10 | 63.68 | 3/11 | 59.86 | 4/10 | 61.51 | 6/11 | 62.50 | 65.05 | 59.71 | 62.43 | -7.85 | -5.42 |
| PT | 52.73 | 4/12 | 59.96 | 5/13 | 52.29 | 5/12 | 54.51 | 7/13 | 56.96 | 59.35 | 52.29 | 54.51 | -14.72 | -12.85 |
| RO | 85.28 | 1/9 | 85.69 | 1/10 | 83.36 | 2/9 | 83.45 | 5/10 | 85.28 | 85.69 | 83.36 | 83.45 | -3.50 | -3.09 |
| SL | 23.04 | 9/9 | 34.72 | 9/10 | 31.36 | 7/9 | 38.33 | 7/10 | 36.63 | 44.74 | 31.36 | 38.33 | -18.93 | -12.45 |
| TR | 1.61 | 8/9 | 4.68 | 9/10 | 0.78 | 9/9 | 3.01 | 10/10 | 39.34 | 44.09 | 35.88 | 39.87 | -21.50 | -17.22 |
| Avg. | 49.57 | 3/13 | 53.09 | 3/13 | 49.74 | 2/13 | 52.13 | 4/13 | 53.28 | 56.42 | 51.84 | 54.45 | -8.84 | -7.12 |
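The two counts in the bullets (8 languages, four languages) can be checked against the table. The sketch below copies the MWE-based rank positions and the MWE column of the comparison to [Saied et al., 2017] from the slide. It assumes that "best performance" means rank 1 on the MWE-based metric for either system, and that a positive difference means our systems score higher; both readings are assumptions.

```python
# MWE-based rank position per language: (TRAPACC, TRAPACCS), copied from the table.
MWE_RANK = {
    "BG": (2, 8), "DE": (2, 1), "EL": (3, 1), "EN": (1, 3), "ES": (3, 1),
    "EU": (2, 1), "FA": (3, 4), "FR": (3, 6), "HE": (3, 6), "HI": (3, 4),
    "HR": (3, 2), "HU": (1, 2), "IT": (2, 4), "LT": (2, 1), "PL": (2, 4),
    "PT": (4, 5), "RO": (1, 2), "SL": (9, 7), "TR": (8, 9),
}

# MWE-based F1 difference to [Saied et al., 2017], copied from the table.
SAIED_MWE_DIFF = {
    "BG": 0.87, "DE": -9.07, "EL": -12.78, "EN": 2.58, "ES": -7.38,
    "EU": -5.38, "FA": -2.0, "FR": -10.54, "HE": -16.95, "HI": 2.23,
    "HR": -6.74, "HU": -3.84, "IT": -16.01, "LT": 1.43, "PL": -7.85,
    "PT": -14.72, "RO": -3.50, "SL": -18.93, "TR": -21.50,
}

# Languages where at least one of the two systems is ranked first.
best_languages = sorted(lang for lang, ranks in MWE_RANK.items() if 1 in ranks)

# Languages where the difference to [Saied et al., 2017] is positive.
better_than_saied = sorted(lang for lang, diff in SAIED_MWE_DIFF.items() if diff > 0)

print(len(best_languages), best_languages)
print(len(better_than_saied), better_than_saied)
```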
Thanks
Thanks for your attention!
Contact: regina.stodden@hhu.de