
A Two-Stage Parsing Method for Text-Level Discourse Analysis

Yizhong Wang, Sujian Li, Houfeng Wang. Peking University. ACL, August 2, 2017.


  1. A Two-Stage Parsing Method for Text-Level Discourse Analysis
     Yizhong Wang, Sujian Li, Houfeng Wang
     Peking University
     ACL, August 2, 2017

  2. Background: Text-Level Discourse Analysis
     • Task: identifying the discourse structure of text.
     • Rhetorical Structure Theory [Mann and Thompson, 1988]
     Example (wsj-0699), segmented into elementary discourse units e1 to e5:
       e1: [The European Community's consumer price index rose a provisional 0.6% in September from August]
       e2: [and was up 5.3% from September 1988,]
       e3: [according to Eurostat, the EC's statistical agency.]
       e4: [The month-to-month rise in the index was the largest since April,]
       e5: [Eurostat said.]
     Tree: e1:5 splits into e1:3 and e4:5; e1:3 splits into e1:2 and e3; e1:2 splits into e1 and e2; e4:5 splits into e4 and e5.
     Labels in the figure: each child is marked Nucleus or Satellite, and the relations shown are Evaluation, Attribution (twice), and Comparison.
     Background | Motivation | Method | Experiments

  3. Background: Text-Level Discourse Analysis
     • Task: identifying the discourse structure of text.
     • Rhetorical Structure Theory [Mann and Thompson, 1988]
     • Goal: parse a text into a tree with nuclearity and relation labels.
     Example: the wsj-0699 tree from slide 2.

  4. Background: Transition-Based Method [Daniel Marcu, 1999; Kenji Sagae, 2009]
     • Initial state:
       Stack: (empty) | Queue: e1, e2, e3, …
     • Shift action: moves the first element of the queue onto the stack.
       Before: Stack: e1:3, e4 | Queue: e5, e6, e7, …
       After:  Stack: e1:3, e4, e5 | Queue: e6, e7, …
     • Reduce action: merges the top two elements of the stack into one subtree.
       After:  Stack: e1:3, e4:5 (children e4 and e5) | Queue: e6, e7, …
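
The shift and reduce steps above can be sketched in a few lines of Python. This is only a minimal illustration of the generic stack-and-queue mechanics, not the authors' implementation: the `Span` class, the `run` helper, and the plain `"shift"` / `"reduce"` action strings are names assumed here, and no classifier is involved (the action sequence is given by hand).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    """A subtree covering EDUs start..end (1-based, inclusive). Hypothetical helper."""
    start: int
    end: int
    left: Optional["Span"] = None
    right: Optional["Span"] = None

    def label(self) -> str:
        if self.start == self.end:
            return f"e{self.start}"
        return f"e{self.start}:{self.end}"


def run(n_edus: int, actions: List[str]) -> Tuple[List[Span], List[Span]]:
    """Apply a hand-written sequence of actions and return (stack, queue)."""
    stack: List[Span] = []
    queue: List[Span] = [Span(i, i) for i in range(1, n_edus + 1)]
    for action in actions:
        if action == "shift":
            if not queue:
                raise ValueError("cannot shift from an empty queue")
            # Move the first queue element onto the stack.
            stack.append(queue.pop(0))
        elif action == "reduce":
            if len(stack) < 2:
                raise ValueError("reduce needs two subtrees on the stack")
            # Merge the top two stack elements into one subtree.
            right = stack.pop()
            left = stack.pop()
            stack.append(Span(left.start, right.end, left, right))
        else:
            raise ValueError(f"unknown action: {action}")
    return stack, queue


# Seven EDUs; stop at the state drawn for the reduce step on the slide.
actions = ["shift", "shift", "reduce", "shift", "reduce", "shift", "shift", "reduce"]
stack, queue = run(7, actions)
print([s.label() for s in stack], [q.label() for q in queue])
```

With these actions the stack ends as e1:3, e4:5 and the queue as e6, e7, which is the configuration shown for the reduce action.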

  5. Background: Transition-Based Method [Daniel Marcu, 1999; Kenji Sagae, 2009]
     • The unified framework: 42 reduce actions are designed with 3 different nuclearity types (e.g. NS) and 18 relation labels (e.g. cause).
     • Reduce action combined with nuclearity and relation:
       Stack: e1:3, e4:5 | Queue: e6, e7, …
       e4:5 is built from e4 and e5, labelled N S with the relation Attribution.

  6. Background: Transition-Based Method [Daniel Marcu, 1999; Kenji Sagae, 2009]
     • The unified framework: 42 reduce actions are designed with 3 different nuclearity types (e.g. NS) and 18 relation labels (e.g. cause).
     • Reduce action combined with nuclearity and relation: for the state Stack: e1:3, e4:5 | Queue: e6, e7, …, a classifier chooses among Shift, Reduce-SN-Cause, Reduce-NS-Summary, Reduce-NN-Contrast, Reduce-NS-Temporal, ......

  7. Motivation: Naked Tree for Reducing Sparsity
     [Figure: two bar charts. One shows the distribution of the 42 actions in previous transition-based parsing systems, with Shift and Reduce-NS-Elaboration among the labelled bars. The other shows the number of the 4 actions that we need to build a naked tree (without relation). Bar values visible: 19443, 11702, 4329, 3065.]
     [Figure: removing the relation turns a complete discourse parse tree into a naked discourse parse tree. Both trees show e4:5 over e4 and e5 with N/S marks; only the complete tree carries the relation Attribution.]
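
One way to picture the "remove relation" step is as a mapping from the unified action labels to the four naked-tree actions. The sketch below assumes action names follow the `Reduce-<nuclearity>-<relation>` pattern seen on the slides; the `strip_relation` function is a hypothetical helper, not part of any released code.

```python
NUCLEARITY_TYPES = ("NN", "NS", "SN")


def strip_relation(action: str) -> str:
    """Map a unified action label to its naked-tree counterpart (hypothetical helper)."""
    if action == "Shift":
        return action
    # Split at most twice so relation names containing hyphens stay intact.
    parts = action.split("-", 2)
    if len(parts) != 3 or parts[0] != "Reduce" or parts[1] not in NUCLEARITY_TYPES:
        raise ValueError(f"unexpected action label: {action}")
    return f"Reduce-{parts[1]}"


# Labels taken from the slides.
unified = [
    "Shift",
    "Reduce-SN-Cause",
    "Reduce-NS-Summary",
    "Reduce-NN-Contrast",
    "Reduce-NS-Temporal",
    "Reduce-NS-Elaboration",
]
naked = sorted({strip_relation(a) for a in unified})
print(naked)
```

However many relation-specific labels go in, at most four distinct actions come out (Shift plus one reduce per nuclearity type), which is the sparsity reduction the slide argues for.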

  8. Motivation: Level-Specific Relation Labelling
     • Discourse relations distribute differently at different linguistic levels.
     [Figure: three bar charts of the top-5 most frequent relations: Inner-Sentential, Inter-Sentential, and Inter-Paragraph. Elaboration is the most frequent relation in all three charts (values 32.70%, 43.10%, and 44.4%); the remaining top-5 entries are drawn from Joint, Attribution, Explanation, Same-Unit, Contrast, Evaluation, and Enablement.]

  9. Motivation: Level-Specific Relation Labelling
     • Some discourse relations tend to occur at specific linguistic levels.
     [Figure: grouped bar chart of the counts of Condition, Manner-Means, Textual-Organization, and Topic-Change at the Inner-Sentential, Inter-Sentential, and Inter-Paragraph levels.]

  10. Method: Two-Stage Parsing Algorithm
      • Stage 1: A transition-based parsing system with only 4 actions is adopted to construct the naked tree (without labels).
        Example: e1:3 with children e1:2 and e3.
      • Stage 2: Three dedicated classifiers are trained for labelling relations at three linguistic levels:
        a) intra-sentential
        b) inter-sentential
        c) inter-paragraph
        Example: the same node e1:3, now labelled Attribution.

  11. Method: Two-Stage Parsing Algorithm
      • Stage 1: A transition-based parsing system with only 4 actions is adopted to construct the naked tree (without labels).
        Naked tree structure could help with relation classification.
      • Stage 2: Three dedicated classifiers are trained for labelling relations at three linguistic levels:
        a) intra-sentential
        b) inter-sentential
        c) inter-paragraph
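
Stage 2 needs to know which of the three classifiers a tree node belongs to. The slides do not say how the level is decided, so the rule below is an assumption: a span is intra-sentential if its first and last EDU share a sentence, inter-sentential if they share a paragraph, and inter-paragraph otherwise. The function name and the sentence/paragraph index lists are hypothetical.

```python
from typing import Sequence


def level_of_span(start: int, end: int,
                  sent_ids: Sequence[int], para_ids: Sequence[int]) -> str:
    """Pick the linguistic level of a span of EDUs (1-based, inclusive).

    sent_ids[i] / para_ids[i] give the sentence / paragraph of EDU i+1.
    Assumed rule, not taken from the slides.
    """
    first, last = start - 1, end - 1
    if sent_ids[first] == sent_ids[last]:
        return "intra-sentential"
    if para_ids[first] == para_ids[last]:
        return "inter-sentential"
    return "inter-paragraph"


# wsj-0699 example: e1 to e3 form the first sentence, e4 and e5 the second.
sent_ids = [0, 0, 0, 1, 1]
# Paragraph boundaries are not shown on the slides; one paragraph is assumed here.
para_ids = [0, 0, 0, 0, 0]

for start, end in [(1, 2), (1, 3), (4, 5), (1, 5)]:
    print(f"e{start}:{end}", level_of_span(start, end, sent_ids, para_ids))
```

Under these assumptions e1:2, e1:3, and e4:5 would go to the intra-sentential classifier and e1:5 to the inter-sentential one.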

  12. Method: Features and Classifiers
      • We use manually-extracted features, including:
        a) Parsing status, position features (only for stage 1)
        b) N-gram features, dependency features, structural features, nucleus features
        c) Tree features (only for stage 2), e.g. Height=1, Depth=2, SelfIsNucleus=True, ParentIsNucleus=False
      • Four SVM classifiers are trained for the four classification tasks (one action classifier and three relation classifiers).
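
The four tree features named above can be read off a naked tree once stage 1 has finished. The sketch below is an illustration under stated assumptions: Height is taken to be the number of edges down to the deepest leaf, Depth the number of edges up to the root, and the root is treated as having no nucleus parent. The `Node` class and the nuclearity values in the example tree are chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass(eq=False)
class Node:
    """Binary tree node with a nuclearity flag (hypothetical helper)."""
    name: str
    is_nucleus: bool = True
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Wire up parent pointers so depth can be computed bottom-up.
        for child in (self.left, self.right):
            if child is not None:
                child.parent = self


def height(node: Node) -> int:
    """Edges on the longest path from node down to a leaf (assumed definition)."""
    children = [c for c in (node.left, node.right) if c is not None]
    return 0 if not children else 1 + max(height(c) for c in children)


def depth(node: Node) -> int:
    """Edges from node up to the root (assumed definition)."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d


def tree_features(node: Node) -> Dict[str, Union[int, bool]]:
    return {
        "Height": height(node),
        "Depth": depth(node),
        "SelfIsNucleus": node.is_nucleus,
        "ParentIsNucleus": node.parent.is_nucleus if node.parent else False,
    }


# Tree shape from the wsj-0699 example; nuclearity flags are illustrative assumptions.
e12 = Node("e1:2", True, Node("e1"), Node("e2"))
e13 = Node("e1:3", False, e12, Node("e3", False))
e45 = Node("e4:5", True, Node("e4"), Node("e5", False))
root = Node("e1:5", True, e13, e45)

print(tree_features(e12))
```

With these assumed flags, node e1:2 yields Height=1, Depth=2, SelfIsNucleus=True, ParentIsNucleus=False, the same values as the example on the slide.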

  13. Experiments: Performance Comparison
      • We evaluate our method on the RST Discourse Treebank and report the (micro-averaged) F-score:

        Model                         Span   Nuclearity   Relation
        Joty et al. (2013)            82.7   68.4         55.7
        Feng and Hirst (2014)         85.7   71.0         58.2
        Li et al. (2014)              84.0   70.8         58.6
        Li et al. (2016)              85.8   71.1         58.9
        Transition-based systems:
          Ji and Eisenstein (2014)    82.1   71.1         61.6
          Heilman and Sagae (2015)    83.5   69.3         57.4
          Ours                        86.0   72.4         59.7
        Human                         88.7   77.7         65.8
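
For readers unfamiliar with micro-averaging, a generic sketch follows. It pools matched, predicted, and gold items over all documents before computing precision and recall. The slides do not spell out the exact evaluation protocol, so the span representation (tuples of start, end, and an optional label) and the `micro_f1` function are assumptions made for illustration.

```python
from typing import Hashable, Iterable, Set, Tuple


def micro_f1(docs: Iterable[Tuple[Set[Hashable], Set[Hashable]]]) -> float:
    """Micro-averaged F-score over (predicted, gold) item sets, one pair per document."""
    matched = predicted = gold = 0
    for pred_items, gold_items in docs:
        # Pool the counts across documents instead of averaging per-document scores.
        matched += len(pred_items & gold_items)
        predicted += len(pred_items)
        gold += len(gold_items)
    if matched == 0:
        return 0.0
    precision = matched / predicted
    recall = matched / gold
    return 2 * precision * recall / (precision + recall)


# Toy data: spans are (start, end) pairs; add a label to the tuple to score
# nuclearity or relation instead of bare spans.
docs = [
    ({(1, 2), (1, 3)}, {(1, 2), (1, 3)}),  # both spans correct
    ({(1, 2)}, {(2, 3)}),                  # one wrong span
]
print(round(micro_f1(docs), 3))
```

In this toy case two of three predicted spans match two of three gold spans, so precision, recall, and F-score all equal 2/3.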

  14. Experiments: Incremental Analysis of Our Method

        Configuration               Span           Nuclearity     Relation
        Simple Unified Framework    84.4           70.7           57.7
        Two-Stage Parsing (Basic)   86.0 (+1.6%)   72.4 (+1.7%)   58.6 (+0.9%)
        + Three-Level Relation      86.0           72.4           59.4 (+0.8%)
        + Tree Features             86.0           72.4           59.7 (+0.3%)
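
The gains quoted on this slide follow directly from the bar values. A short check, using only the numbers on the slide (the dictionary layout and the `gains` helper are illustrative):

```python
scores = {
    "Simple Unified Framework":  {"span": 84.4, "nuclearity": 70.7, "relation": 57.7},
    "Two-Stage Parsing (Basic)": {"span": 86.0, "nuclearity": 72.4, "relation": 58.6},
    "+ Three-Level Relation":    {"span": 86.0, "nuclearity": 72.4, "relation": 59.4},
    "+ Tree Features":           {"span": 86.0, "nuclearity": 72.4, "relation": 59.7},
}


def gains(previous: dict, current: dict) -> dict:
    """Per-metric difference between two configurations, rounded to one decimal."""
    return {metric: round(current[metric] - previous[metric], 1) for metric in current}


names = list(scores)
for prev_name, cur_name in zip(names, names[1:]):
    print(cur_name, gains(scores[prev_name], scores[cur_name]))
```

Span and nuclearity improve only at the first step, when the unified framework is replaced by two-stage parsing; the two later additions change the relation score alone.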

  15. Conclusions
      • Summary:
        • A pipelined two-stage discourse parsing method;
        • Three-level relation classification with tree features;
        • State-of-the-art performance.
      • Future work:
        • Update the features and classifiers with latest models;
        • Incorporate data from other sources.

  16. Thank you!
      Contact: yizhong@pku.edu.cn
      Code is available: https://github.com/EastonWang/StageDP
