A Two-Stage Parsing Method for Text-Level Discourse Analysis




SLIDE 1

A Two-Stage Parsing Method for Text-Level Discourse Analysis

Yizhong Wang, Sujian Li, Houfeng Wang Peking University

ACL, August 2, 2017

SLIDE 2

Background: Text-Level Discourse Analysis

Background | Motivation | Method | Experiments

  • Rhetorical Structure Theory [Mann and Thompson, 1988]
  • Task: Identifying the discourse structure of text.

Example (wsj-0699):

[The European Community’s consumer price index rose a provisional 0.6% in September from August]e1 [and was up 5.3% from September 1988,]e2 [according to Eurostat, the EC's statistical agency.]e3 [The month-to-month rise in the index was the largest since April,]e4 [Eurostat said.]e5

[Figure: discourse tree over e1 to e5 with spans e1:2, e1:3, e4:5, and e1:5; relation labels Comparison, Attribution, Attribution, and Evaluation; each child marked (Nucleus) or (Satellite).]
SLIDE 3

Background: Text-Level Discourse Analysis


Goal: parse a text into a tree with nuclearity and relation labels

SLIDE 4

Background: Transition-Based Method

[Marcu, 1999; Sagae, 2009]

  • Initial state:

Stack: (empty)    Queue: e1, e2, e3, …

  • Shift action:

Stack: e1:3, e4    Queue: e5, e6, e7, …
→ Stack: e1:3, e4, e5    Queue: e6, e7, …

  • Reduce action:

Stack: e1:3, e4, e5    Queue: e6, e7, …
→ Stack: e1:3, e4:5    Queue: e6, e7, …
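
The stack and queue transitions on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `ParserState`, `initial_state`, `shift`, and `reduce` are hypothetical, spans are stored as (first EDU, last EDU) pairs numbered from 1, and the action sequence is hard-coded rather than chosen by a classifier.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # (first EDU index, last EDU index), 1-based and inclusive


@dataclass
class ParserState:
    stack: List[Span] = field(default_factory=list)
    queue: List[Span] = field(default_factory=list)


def initial_state(num_edus: int) -> ParserState:
    """Empty stack; the queue holds every EDU as a single-unit span."""
    return ParserState(stack=[], queue=[(i, i) for i in range(1, num_edus + 1)])


def shift(state: ParserState) -> None:
    """Move the first span in the queue onto the top of the stack."""
    state.stack.append(state.queue.pop(0))


def reduce(state: ParserState) -> None:
    """Merge the top two stack spans into one span covering both."""
    right = state.stack.pop()
    left = state.stack.pop()
    state.stack.append((left[0], right[1]))


# Hard-coded action sequence (an assumption for illustration only).
state = initial_state(7)
for action in (shift, shift, reduce, shift, reduce, shift, shift, reduce):
    action(state)

print(state.stack)  # spans e1:3 and e4:5
print(state.queue)  # e6 and e7 are still waiting
```

After the last reduce, the stack holds e1:3 and e4:5 and the queue holds e6 and e7, which matches the state drawn for the reduce action.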

SLIDE 5

Background: Transition-Based Method

  • The unified framework: 42 reduce actions are designed with 3 different nuclearity types (e.g. NS) and 18 relation labels (e.g. Cause). [Marcu, 1999; Sagae, 2009]

  • Reduce action combined with nuclearity and relation:

Stack: e1:3, e4, e5    Queue: e6, e7, …
→ Stack: e1:3, e4:5 (e4 = N, e5 = S, relation = Attribution)    Queue: e6, e7, …
SLIDE 6

Background: Transition-Based Method


  • Classifier output: Shift, Reduce-SN-Cause, Reduce-NS-Summary, Reduce-NN-Contrast, Reduce-NS-Temporal, ......

SLIDE 7

Motivation: Naked Tree for Reducing Sparsity

[Chart: distribution of the 42 actions in previous transition-based parsing systems; labelled bars include Shift and Reduce-NS-Elaboration.]

A complete discourse parse tree: e4:5 with children e4 (N) and e5 (S), labelled Attribution.
→ remove relation →
A naked discourse parse tree: e4:5 with children e4 (N) and e5 (S).

[Chart: number of the 4 actions that we need to build a naked tree (without relation): 19443, 4329, 11702, 3065.]
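
One way to picture the "remove relation" step is as a mapping from unified action labels to naked-tree actions. The sketch below assumes labels are written as on the slides ("Shift", "Reduce-NS-Elaboration"); the helper names `strip_relation` and `naked_action_counts` are hypothetical, and the action sequence is a toy example, not treebank data.

```python
from collections import Counter
from typing import Iterable


def strip_relation(action: str) -> str:
    """Map a unified action label to its naked-tree counterpart.

    "Shift" is returned unchanged; "Reduce-NS-Elaboration" becomes "Reduce-NS".
    """
    parts = action.split("-")
    if parts[0] == "Shift":
        return "Shift"
    # parts = ["Reduce", nuclearity, relation words...]; keep the first two.
    return "-".join(parts[:2])


def naked_action_counts(actions: Iterable[str]) -> Counter:
    """Count how often each naked-tree action occurs in a sequence."""
    return Counter(strip_relation(a) for a in actions)


# Toy action sequence (hypothetical, not taken from the treebank).
unified = [
    "Shift", "Shift", "Reduce-NS-Elaboration",
    "Shift", "Reduce-NS-Attribution",
    "Shift", "Shift", "Reduce-SN-Cause", "Reduce-NN-Contrast",
]
counts = naked_action_counts(unified)
print(counts)
```

Four distinct relation-bearing reduce labels in the toy sequence collapse into three reduce actions, so each remaining action has more training examples behind it.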

SLIDE 8
Motivation: Level-Specific Relation Labelling

  • Discourse relations distribute differently at different linguistic levels:

Top-5 Frequent Relations

Inner-Sentential         Inter-Sentential         Inter-Paragraph
Elaboration  32.70%      Elaboration  44.4%       Elaboration  43.10%
Attribution  23.00%      Joint        12.7%       Joint        13.80%
Same-Unit    10.90%      Explanation   9.2%       Explanation   7.60%
Joint         6.60%      Contrast      7.6%       Contrast      6.40%
Enablement    4.30%      Evaluation    5.3%       Evaluation    5.90%

SLIDE 9

Motivation: Level-Specific Relation Labelling

  • Some discourse relations tend to occur at specific linguistic levels:

[Bar chart: counts of Condition, Manner-Means, Textual-Organization, and Topic-Change at the Inner-Sentential, Inter-Sentential, and Inter-Paragraph levels. Extracted bar labels: 231, 174, 5, 33, 17, 41, 15, 33, 8, 102, 177.]

SLIDE 10

Method: Two-Stage Parsing Algorithm

  • Stage 1: Transition-based parsing system with only 4 actions is adopted to construct the naked tree (without labels).
  • Stage 2: Three dedicated classifiers are trained for labelling relations at three linguistic levels:
    a) intra-sentential
    b) inter-sentential
    c) inter-paragraph

[Figure: the naked tree e1:3 (children e1:2 and e3) from Stage 1 receives the label Attribution in Stage 2.]

SLIDE 11

Method: Two-Stage Parsing Algorithm


  • Naked tree structure could help with relation classification.
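
Stage 2 needs to decide which of the three classifiers handles a given tree node. The sketch below shows one plausible routing rule; it is an assumption, since the slides do not state the exact criterion. A span counts as intra-sentential if its first and last EDU share a sentence, inter-sentential if they share a paragraph, and inter-paragraph otherwise. The function `relation_level`, the boundary maps, and the stand-in classifiers are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Span = Tuple[int, int]  # (first EDU, last EDU), 1-based and inclusive


def relation_level(span: Span,
                   sentence_of: Dict[int, int],
                   paragraph_of: Dict[int, int]) -> str:
    """Pick the linguistic level of a span (assumed rule, see lead-in)."""
    first, last = span
    if sentence_of[first] == sentence_of[last]:
        return "intra-sentential"
    if paragraph_of[first] == paragraph_of[last]:
        return "inter-sentential"
    return "inter-paragraph"


# Hypothetical layout: e1..e3 form sentence 1 and e4..e5 sentence 2
# (both in paragraph 1); e6 is sentence 3 and opens paragraph 2.
sentence_of = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
paragraph_of = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2}

# Stand-ins for the three trained relation classifiers; each one just
# reports which classifier was called.
classifiers: Dict[str, Callable[[Span], str]] = {
    level: (lambda span, level=level: f"<label from {level} classifier>")
    for level in ("intra-sentential", "inter-sentential", "inter-paragraph")
}

for span in [(1, 3), (1, 5), (1, 6)]:
    level = relation_level(span, sentence_of, paragraph_of)
    print(span, level, classifiers[level](span))
```

Because the naked tree already fixes the spans, the level of every node is known before any relation is predicted, which is what makes level-specific classifiers possible in a pipeline.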

SLIDE 12

Method: Features and Classifiers

  • We use manually-extracted features, including:
    a) Parsing status, position features (only for stage 1)
    b) N-gram features, dependency features, structural features, nucleus features
    c) Tree features (only for stage 2), e.g. Height=1, Depth=2, SelfIsNucleus=True, ParentIsNucleus=False
  • Four SVM classifiers are trained for the four classification tasks (one action classifier and three relation classifiers).
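
The tree features named on this slide can be read off a naked tree once Stage 1 has finished. The sketch below is illustrative only and rests on assumed definitions: height is the longest path from a node down to a leaf, depth is the distance from the root, and the root is treated as a nucleus. The `Node` class and the toy tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass(eq=False)
class Node:
    name: str
    is_nucleus: bool
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child and return it so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child


def height(node: Node) -> int:
    """Longest path from this node down to a leaf (a leaf has height 0)."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def depth(node: Node) -> int:
    """Number of edges between this node and the root (the root has depth 0)."""
    steps = 0
    while node.parent is not None:
        node = node.parent
        steps += 1
    return steps


def tree_features(node: Node) -> Dict[str, Union[int, bool]]:
    return {
        "Height": height(node),
        "Depth": depth(node),
        "SelfIsNucleus": node.is_nucleus,
        "ParentIsNucleus": node.parent.is_nucleus if node.parent else False,
    }


# Toy naked tree with hypothetical nuclearity labels.
root = Node("root", is_nucleus=True)
mid = root.add(Node("mid", is_nucleus=False))
root.add(Node("leaf_x", is_nucleus=True))
target = mid.add(Node("target", is_nucleus=True))
mid.add(Node("leaf_y", is_nucleus=False))
target.add(Node("leaf_1", is_nucleus=True))
target.add(Node("leaf_2", is_nucleus=False))

print(tree_features(target))
```

For the node `target`, the sketch yields the same values as the slide's example: Height=1, Depth=2, SelfIsNucleus=True, ParentIsNucleus=False.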

SLIDE 13

Experiments: Performance Comparison

  • We evaluate our method on RST Discourse Treebank, and report the (micro-averaged) F-score:

Model                       Span   Nuclearity   Relation
Joty et al. (2013)          82.7   68.4         55.7
Feng and Hirst (2014)       85.7   71.0         58.2
Li et al. (2014)            84.0   70.8         58.6
Li et al. (2016)            85.8   71.1         58.9
Ji and Eisenstein (2014)    82.1   71.1         61.6
Heilman and Sagae (2015)    83.5   69.3         57.4
Ours                        86.0   72.4         59.7
Human                       88.7   77.7         65.8

[Table annotation: a bracket labelled "Transition-Based Systems" marks a group of rows.]
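
Micro-averaging pools the counts from all documents before computing precision, recall, and F-score, so long documents weigh more than short ones. The sketch below shows that pooling on toy data. It is a generic illustration: the exact span-matching convention used in the evaluation is not stated on the slide, and the function name `micro_f1` and the (start, end, label) tuples are hypothetical.

```python
from typing import Hashable, List, Set, Tuple


def micro_f1(gold_docs: List[Set[Hashable]],
             pred_docs: List[Set[Hashable]]) -> Tuple[float, float, float]:
    """Pool matches over all documents, then compute precision, recall, F1."""
    matched = sum(len(gold & pred) for gold, pred in zip(gold_docs, pred_docs))
    n_pred = sum(len(pred) for pred in pred_docs)
    n_gold = sum(len(gold) for gold in gold_docs)
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Toy data: each item is (first EDU, last EDU, nuclearity label).
gold = [
    {(1, 2, "NS"), (1, 3, "NS"), (4, 5, "NS")},
    {(1, 2, "NN")},
]
pred = [
    {(1, 2, "NS"), (1, 3, "SN"), (4, 5, "NS")},
    {(1, 2, "NN"), (2, 3, "NS")},
]

precision, recall, f1 = micro_f1(gold, pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In the toy data, 3 of 5 predicted items and 3 of 4 gold items match, so the pooled precision is 0.6 and the pooled recall is 0.75.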

SLIDE 14

Experiments: Incremental Analysis of Our Method

[Bar chart, F-score axis from 50 to 90; values per system:]

System                       Span   Nuclearity   Relation
Simple Unified Framework     84.4   70.7         57.7
Two-Stage Parsing (Basic)    86.0   72.4         58.6
+ Three-Level Relation       86.0   72.4         59.4
+ Tree Features              86.0   72.4         59.7

  • Two-Stage Parsing (Basic) over Simple Unified Framework: Span +1.6%, Nuclearity +1.7%, Relation +0.9%
  • + Three-Level Relation: Relation +0.8%
  • + Tree Features: Relation +0.3%

SLIDE 15

Conclusions

  • Summary:
    • A pipelined two-stage discourse parsing method;
    • Three-level relation classification with tree features;
    • State-of-the-art performance.
  • Future work:
    • Update the features and classifiers with latest models;
    • Incorporate data from other sources.
SLIDE 16

Thank you!

Contact: yizhong@pku.edu.cn
Code is available: https://github.com/EastonWang/StageDP