
Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman - PowerPoint PPT Presentation

Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions. Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman, Nathan Schneider. August 4, 2017, *SEM, Vancouver.


  1. Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions
     Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman, Nathan Schneider
     August 4, 2017, *SEM, Vancouver

  2. Most languages have adpositions.
     adposition = preposition | postposition
     • Prepositions: in, on, at, by, for, to, of, with, from, about, …; bə-, lə-, mi-, ‘al, ‘im, …
     • Postpositions: kā, ko, ne, se, mẽ, par, tak, …; (n)eun, i/ga, do, (r)eul, …

  3. Feature 85A: Order of Adposition and Noun Phrase
     Dryer in WALS, http://wals.info/chapter/85

  4. We know PPs are challenging for syntactic parsing.
     ‣ a talk at the workshop on prepositions
     But what about the meaning beyond linking governor & modifier?

  5. “I study preposition semantics.”

  6. Adpositions have semantics?! https://michaelspiro.wordpress.com/author/michaelspiro/page/4/

  7. Based on the COCA list of the 5000 most frequent English words.

  8. Polysemy
     • With great frequency comes great polysemy.
     • in
       ‣ in the box
       ‣ in the afternoon
       ‣ in love, in trouble
       ‣ in fact
       ‣ …

  9. Cross-linguistically interesting
     • Small number of grammatical categories
     • Language-specific partitioning of functions
     • Translations are many-to-many

  10. Bewildering to learn in an L2

  11. Shared functions
      ‣ They ran to the roof for a quick escape. (to: Destination; for: Purpose)
      ‣ They made for the roof to escape the cops. (for: Destination; to: Purpose)

  12. Design Principles
      1. Coverage: Wicked polysemy and rare senses make it hard to annotate all tokens in a corpus.
      2. Cross-linguistic adequacy: Adpositions/case markers work differently in different languages. Ideally, our semantic functions should be language-independent.

  13. Design Principles
      1. Coverage: Annotate all adposition types and tokens in a corpus.
      2. Cross-linguistic adequacy: Adpositions/case markers work differently in different languages. Ideally, our semantic functions should be language-independent.

  14. Design Principles
      1. Coverage: Annotate all adposition types and tokens in a corpus.
      2. Cross-linguistic adequacy: Our semantic functions should be as language-independent as possible.

  15. Senses vs. Supersenses
      [Figure: a semantic network of one preposition's senses. Nodes: 1. Protoscene; 2. A-B-C trajectory cluster (2.A On-the-other-side-of, 2.B Above-and-beyond (excess I), 2.C Completion, 2.D Transfer); 3. Covering; 5. Up cluster (5.A More, 5.A.1 Over-and-above (excess II), 5.B Control, 5.C Preference).]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)

  16. Senses vs. Supersenses
      [Figure: the same semantic network, with a chart of N = 4073 tokens: Spatial 25%, Temporal 13%, Neither 62%.]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)

  17. Senses vs. Supersenses
      [Figure: the same semantic network.]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)
      • Supersenses: cross-lexical classes; coarse; interpretable names like Topic

  18. Preposition Supersenses
      • Location: We met in Paris at a shop on a street by the Seine
      • Time: at 6:00 in the evening on Saturday.

  19. Supersense Hierarchy 1.0 [LAW 2015]
      75 preposition supersense categories, http://tiny.cc/prepwiki
      [Figure: hierarchy diagram. Node labels: Superset, Co-Agent, Creator, Possessor, StartTime, EndTime, ClockTimeCxn, Agent, Whole, Elements, Instance, DeicticTime, RelativeTime, Function, Species, Causer, Quantity, Reciprocation, Purpose, Age, Time, Frequency, Configuration, Affector, Duration, Explanation, Attribute, Temporal, Participant, Co-Participant, Circumstance, Accompanier, Patient, Experiencer, Stimulus, Place, Comparison/Contrast, Undergoer, Co-Patient, ProfessionalAspect, Value, Scalar/Rank, Theme, Path, Manner, Locus, Activity, Extent, Co-Theme, Topic, ValueComparison, Instrument, Contour, Beneficiary, Location, Source, State, Approximator, Direction, StartState, Means, Via, InitialLocation, Material, Traversed, Goal, Transit, Donor/Speaker, 1DTrajectory, EndState, Destination, 2DArea, 3DMedium, Course, Recipient.]

  20. English Annotation in STREUSLE [LAW 2016]
      • Online reviews corpus previously annotated for multiword expressions and noun & verb supersenses. 55,000 words, including 4,250 preps.
      • Comprehensive annotation: first dataset with all prepositions (types + tokens) semantically annotated
        ‣ Sentences not hand-selected
        ‣ Sentences fully annotated
        ‣ Preposition types not constrained by a lexicon (labels generalize)
        ‣ All sentences seen by multiple annotators

  21. Comparing resources [LAW 2016]
      [Table: resources compared on several criteria. The column symbols and the column position of each checkmark did not survive extraction; checkmarks are listed per row in the order extracted.]
      • TPP: The Preposition Project (Litkowski & Hargraves 2005, SemEval 2007 shared task): ✓ ✓ (✓)
      • D+ 7: TPP senses for 7 preposition types in PropBank WSJ data (Dahlmeier et al. 2009): ✓ ✓
      • Tratz 34: Annotator-optimized revised senses for 34 TPP SemEval prepositions (Tratz 2011): ✓ ✓ (✓)
      • S&R 34: 32 hard clusters of TPP senses for 34 SemEval prepositions (Srikumar & Roth 2013): ✓
      • Ours: Preposition supersenses (Schneider et al. LAW 2015, 2016): ✓ ✓ ✓ ✓ ✓

  22. A Vexing Problem
      • Drawing clean boundaries between semantic categories is always difficult.
      • But we were surprised by the frequency of apparent overlaps between semantic role labels.
      • These overlaps proved pervasive in the other languages we looked at.

  23. Destination/Location
      • The prepositions to, into, onto, and for explicitly encode Destination.
      • Destination masquerading as static Location:
        ‣ Put the pen in the box. (= into)
        ‣ He threw his cards on the table. (= onto)
        ‣ The ball rolled behind the trash can.
      • Extremely productive for motion/caused motion!
      • We could stipulate one or the other, but annotators would still get confused.

  24. Fictive Motion
      • In the other direction, we know that static locative relations can be described using dynamic language (Talmy 1996):
        ‣ The road runs through the trees.
        ‣ I heard him from the room next door.
        ‣ The school is around the corner.
      • In assigning a semantic label, is it sufficient to “choose sides” between the static nature of the spatial scene and the dynamic way that relation is portrayed by the preposition?

  25. Stimulus/Topic
      • Another conundrum:
        ‣ I thought about getting my ears pierced: Topic (cf. know, talk, read)
        ‣ I feared getting my ears pierced: Stimulus (cf. see, hurt)
        ‣ I was scared about getting my ears pierced: ???
      • Again, two labels are competing for semantic territory.
      • Should we add more categories with double inheritance? (Problem: Proliferation of categories.)
      • Should we just allow annotators to specify multiple labels if they’re unsure? (Problem: Would create inconsistency.)

  26. Construal
      • Assumption thus far: preposition token’s semantics = role in a scene
        ‣ I thought about getting my ears pierced. (role in scene: Topic; preposition: Topic)
      • But it’s not always so simple:
        ‣ I was scared about getting my ears pierced. (role in scene: Stimulus; preposition: Topic)

  27. Construal
      • Observation: The preposition can frame or construe the situation in a way that differs from the predicate or scene.
      • Solution: Allow tokens to receive two labels from the hierarchy, one for the scene role and one for the preposition’s semantic function, when warranted.

  28. Construal
      • In fact, Stimulus can be interpreted differently by different prepositions:
        ‣ I was scared by the bear. (scene role: Stimulus; function: Causer)
        ‣ I was scared about getting my ears pierced. (scene role: Stimulus; function: Topic)

  29. Experiencer Dative
      • Experiencers can be realized as recipients/datives:
        ‣ The bear felt scary to me. (scene role: Experiencer; function: Recipient)
      • In some languages, this is the main way Experiencers are realized:
        ‣ koev li ha-roš. [Hebrew]
          hurts to.me the-head
          ‘My head hurts.’
        ‣ mujh-ko garmii lag rahii hai. [Hindi]
          I-DAT heat feel PROG PRES
          ‘I’m feeling hot.’

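  30. Employment
      • The ProfessionalAspect label is used for employer–employee and other professional relationships.
      • It participates in several different preposition construals:
        ‣ He works for XYZ Inc. (scene role: ProfessionalAspect; function: Beneficiary)
        ‣ He works at XYZ Inc. (scene role: ProfessionalAspect; function: Location)
        ‣ He’s from XYZ Inc. (scene role: ProfessionalAspect; function: Source)
        ‣ He’s with XYZ Inc. (scene role: ProfessionalAspect; function: Accompanier)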

  31. Null Functions?
      • Sometimes it’s hard to tell whether the adposition has any semantic contribution:
        ‣ I’m angry/*mad with my mom. (scene role: Stimulus; function: ?)
        ‣ She’s interested/*fascinated in politics. (scene role: Topic; function: ?)
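
The two-label scheme from the Construal slides (one label for the scene role, one for the preposition's function) could be represented roughly as follows. This is a minimal sketch, not the authors' annotation format: the `AdpositionAnnotation` record, its field names, and the `functions_by_scene_role` helper are hypothetical names introduced for illustration, and `None` is an assumed way to encode a function the slides leave open ("?"). The example sentences and labels are the ones shown on the slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdpositionAnnotation:
    """One annotated adposition token (hypothetical record layout)."""
    sentence: str
    adposition: str
    scene_role: str          # role the phrase plays in the scene
    function: Optional[str]  # how the adposition frames it; None = left open ("?")

    @property
    def is_construal(self) -> bool:
        """True when a known function differs from the scene role."""
        return self.function is not None and self.function != self.scene_role


# Sentences and labels as given on the slides.
EXAMPLES = [
    AdpositionAnnotation("I thought about getting my ears pierced.", "about", "Topic", "Topic"),
    AdpositionAnnotation("I was scared about getting my ears pierced.", "about", "Stimulus", "Topic"),
    AdpositionAnnotation("I was scared by the bear.", "by", "Stimulus", "Causer"),
    AdpositionAnnotation("The bear felt scary to me.", "to", "Experiencer", "Recipient"),
    AdpositionAnnotation("He works for XYZ Inc.", "for", "ProfessionalAspect", "Beneficiary"),
    AdpositionAnnotation("He works at XYZ Inc.", "at", "ProfessionalAspect", "Location"),
    AdpositionAnnotation("He's from XYZ Inc.", "from", "ProfessionalAspect", "Source"),
    AdpositionAnnotation("He's with XYZ Inc.", "with", "ProfessionalAspect", "Accompanier"),
    # Function marked "?" on the Null Functions slide, encoded here as None.
    AdpositionAnnotation("I'm angry with my mom.", "with", "Stimulus", None),
]


def functions_by_scene_role(annotations):
    """Map each scene role to the sorted functions that construe it differently."""
    grouped = defaultdict(set)
    for ann in annotations:
        if ann.is_construal:
            grouped[ann.scene_role].add(ann.function)
    return {role: sorted(funcs) for role, funcs in grouped.items()}


if __name__ == "__main__":
    for role, funcs in functions_by_scene_role(EXAMPLES).items():
        print(f"{role}: {', '.join(funcs)}")
```

Grouping by scene role makes the pattern from the slides visible at a glance: a single scene role such as Stimulus or ProfessionalAspect can surface under several different preposition functions, while a token whose two labels agree (Topic/Topic) is not counted as a construal.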
