Towards the Naive Classification of Rhetorical Relations at Scale



SLIDE 1

Towards the Naive Classification of Rhetorical Relations at Scale

Georg Rehm DFKI GmbH Alt-Moabit 91c, 10559 Berlin

Workshop on Coherence Relations Humboldt-Universität zu Berlin January 17-18, 2020

SLIDE 2

Storytelling Theory

  • Storytelling = human technique to order a series of events in the world and find meaningful patterns in them (Bruner 1991)

  • Organise events into a schematic structure, for example, in terms of topic, locality or causal relationships, and construct explanatory models of the world and the events happening in it

  • Semantic Storytelling = attempt to translate the theories of storytelling into a formal and machine-processable scheme

SLIDE 3

Semantic Storytelling

  • Develop a system that, given an incoming document collection, is able to (semi-)automatically extract or generate different story paths or plot lines

  • The goal is to support knowledge workers (journalists, authors, scholars, politicians, business analysts etc.) in their daily work of processing huge amounts of incoming content

  • Helps to quickly grasp what is going on in a collection

SLIDE 4

Previous work: NLP-Pipeline based approach

  • Combine various text analysis procedures in a pipeline (NER, Coreference Resolution, Relation Extraction, etc.)

  • Connect extracted entities to knowledge bases
  • Use rule-based story grammars

SLIDE 5

Previous work: NLP-Pipeline based approach

  • 1. NER: Entities like Persons, Locations, Organizations, Temporal Expressions
  • 2. Relation Extraction: Detect relations between Entities
  • 3. Timelining: Anchor Entities and Relations in Time
  • 4. Event Detection
  • 5. Topic Detection
  • 6. Building Datasets for Patterns of Narration
  • 6. Train Model on basis of Dataset
  • 7. Visualizing Results

SLIDE 6

Now: Discourse-parsing inspired approach

  • Scalable: text segments can be phrases, sentences, paragraphs, texts
  • Relates text segments to each other by using sense taxonomies from research on coherence relations
  • Goal: automate storytelling by detecting discourse relations between text segments of different sources on the same topic
  • Makes it possible to detect and create new storylines extracted from a document collection

  • In future work: Combine both approaches

SLIDE 7

Semantic Storytelling: Technical Description

  • Initialization: User defines Topic T, initialized as a sentence, keyword or named entity

  • Semantic Storytelling tool will:
  • 1. Determine the Relevance of a Segment for a Topic
  • 2. Determine the Importance of a Segment
  • 3. Determine the Discourse or Semantic Relation between two Segments

SLIDE 8

[Figure: Semantic Storytelling Architecture. Incoming content (web content, self-contained document collection, Wikipedia) and a topic T (possible instantiations: complete document, summary, claim or fact, event, named entity) pass through three steps: (1) determine the relevance of a segment for T, with (a) document relevance and (b) segment relevance; (2) determine the importance of a segment (isMoreImportantThan, isLessImportantThan); (3) determine the discourse relation between segment and topic (e.g. Comparison, Expansion). Further elements: ranked list of text segments, “Explore The Neighbourhood!” GUI, user generating stories.]

SLIDE 9

Step 1: Relevance of a Segment

  • Is segment_x relevant for segment_t?
  • Use for example:
    – Topic modelling
    – Topic overlap or entity overlap
    – Text similarity or document similarity
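As a rough illustration of the overlap option listed above, the following sketch scores a segment against a topic by the Jaccard overlap of their word sets. Plain lowercased tokens stand in for topics or entities here; that substitution, the function names and the example sentences are assumptions made for illustration and are not taken from the presented system.

```python
import re


def tokens(text):
    """Lowercased word tokens; a simple stand-in for extracted entities or topics."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_relevance(segment_x, segment_t):
    """Jaccard overlap between the token sets of a segment and the topic."""
    a, b = tokens(segment_x), tokens(segment_t)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical topic and candidate segments.
topic = "Kurt Tucholsky in Moabit"
candidates = [
    "The author Kurt Tucholsky has a connection to Moabit.",
    "Steam turbines were in demand all over the world.",
]

# Rank candidates by their overlap with the topic, most relevant first.
ranked = sorted(candidates, key=lambda s: overlap_relevance(s, topic), reverse=True)
```

A real system would likely replace `tokens` with named entity recognition or a topic model, so that function words such as "in" do not contribute to the score.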

SLIDE 10

Step 2: Importance of a Segment

  • How important or central is the information contained in a segment for a topic?
  • In RST terms: Determine the nucleus (vs. satellite)
  • Possible application in question answering tasks: Is segment_x a potential answer for segment_t?

SLIDE 11

Step 3: Discourse Relations

  • Find the discourse or semantic relation between a text segment and the Topic T
  • From Rhetorical Structure Theory (Mann and Thompson 1988) we borrow the idea that discourse relations exist between larger sequences of text (i.e., non-elementary discourse units)

  • These relations contribute to the coherence of a text

SLIDE 12

Step 3: Discourse Relations

  • For our experiments, we adopt the top-level senses of the Penn Discourse Treebank, with which we can describe those discourse relations, and an additional label:
    – Temporal
    – Contingency
    – Comparison
    – Expansion
    – None (additional label)

SLIDE 13


Discourse Relations according to Penn Discourse Treebank (2.0)

SLIDE 14

Step 3: Discourse Relations

  • For training, we use the two arguments of a relation, but at a later point we deploy it using individual sentences
  • We argue that the sentence-level is the most appropriate level to use as input for our classifier and that the discrepancy between argument shapes and typical sentence lengths is tolerable

SLIDE 15

Use-Case: “Explore the Neighbourhood!”

  • Goal is to help a knowledge worker to develop a mobile app which includes interesting stories about important persons, places, etc. related to a district in Berlin
  • The district Moabit was chosen due to its rich history and lively present
  • Here, a story about the author Kurt Tucholsky and his connection to Moabit is shown
  • Screenshots for a demo app are provided by 3pc

SLIDE 16

Use-Case: “Explore the Neighbourhood!”

  • Curated stories can be published to the app
  • Stories may contain geographical points of interest within Moabit which are connected through an overall story arc, such as a biography

SLIDE 17

Use-Case: “Explore the Neighbourhood!”

SLIDE 18

Use-Case: “Explore the Neighbourhood!”

SLIDE 19

Use-Case: “Explore the Neighbourhood!”

SLIDE 20

Use-Case: “Explore the Neighbourhood!”

SLIDE 21


User Interface for creating curated stories:

SLIDE 22

Experiments: Dataset “Moabit Stories”

  • Created data set “Moabit Stories” from crawled English webpages
  • Used focused crawling methods based on keywords (= topics) and manual postprocessing
  • Boilerplated content and metadata (author, date, URL, language, etc.)
  • Result: data set of more than 100 documents containing relevant information and stories connected to the district of Moabit in Berlin, grouped by topics

SLIDE 23

Experiments: Discourse Relations Classifier

  • The discourse relations classifier is trained on PDTB2 (Prasad et al. 2008)
  • The text is encoded as deep contextual representations with a language model based on the transformer architecture (pre-trained DistilBERT language model, Sanh et al. 2019)

SLIDE 24

Architecture of Siamese BERT model

  • Architecture of the Siamese BERT model used for the classification of discourse relations between two text segments d1 and d2
  • The output of the classification layer ŷ holds the predicted semantic relation according to the top-level PDTB2 senses: Temporal, Contingency, Comparison, Expansion and additionally None

SLIDE 25

Architecture of Siamese BERT model

  • BERT used in a Siamese fashion, 6 hidden layers, each consisting of 768 units, with last hidden states h1, h2
  • Concatenation layer takes both last hidden states h1, h2 as input; its output is the concatenation of the two text representations
  • Multi-layer perceptron consisting of two fully connected layers, where each layer has 100 units

  • Activation with ReLU
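The data flow described on this slide can be sketched at the level of shapes as follows. This is not the authors' implementation: the encoder is replaced by a hypothetical `fake_encoder` that returns a deterministic pseudo-random 768-dimensional vector instead of running DistilBERT, all weights are untrained random values, and the final projection to five labels (`out_layer`) is an assumption about how the classification layer ŷ is attached.

```python
import random

LABELS = ["Temporal", "Contingency", "Comparison", "Expansion", "None"]
HIDDEN = 768      # size of each last hidden state h1, h2 (from the slide)
MLP_UNITS = 100   # units per fully connected layer (from the slide)

rng = random.Random(0)


def dense(n_in, n_out):
    """Untrained weight matrix and zero bias; only the shapes are meaningful."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layer, x, relu=True):
    """Affine transformation, optionally followed by ReLU."""
    weights, bias = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


def fake_encoder(text):
    """Hypothetical stand-in for the shared encoder: one 768-dim vector per text."""
    r = random.Random(text)
    return [r.uniform(-1.0, 1.0) for _ in range(HIDDEN)]


fc1 = dense(2 * HIDDEN, MLP_UNITS)
fc2 = dense(MLP_UNITS, MLP_UNITS)
out_layer = dense(MLP_UNITS, len(LABELS))  # assumed output projection


def classify(d1, d2):
    """Encode both segments with the same encoder, concatenate, apply the MLP."""
    h1, h2 = fake_encoder(d1), fake_encoder(d2)
    combined = h1 + h2  # concatenation: 1536 values
    hidden = forward(fc2, forward(fc1, combined))
    logits = forward(out_layer, hidden, relu=False)
    best = max(range(len(LABELS)), key=logits.__getitem__)
    return LABELS[best], logits
```

Because both inputs pass through the same encoder, the two text representations live in the same vector space before they are concatenated, which is the point of the Siamese setup.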

SLIDE 26

Experiments: Results PDTB2 Training

SLIDE 27

Experiments: “Moabit Stories”

Steps:

  • Group documents by topics based on the query terms for the focused crawler
  • Split documents into sentences
  • Find document pairs among the topic groups by representing documents as tf-idf vectors and using cosine similarity with cosine(d_a, d_b) > 0.15 for document pairs

  • 19,796 sentence pairs passed to the classifier
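The pairing steps above can be sketched in plain Python as follows. The 0.15 threshold comes from the slide; the tokenizer, the smoothed idf formula, the regex sentence splitter and the choice to pair every sentence of one document with every sentence of the other are assumptions made for this sketch, since the slides do not specify them.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """One sparse tf-idf vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf, so terms found in every document keep a non-zero weight.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def document_pairs(docs, threshold=0.15):
    """Index pairs (a, b) of documents whose cosine similarity exceeds the threshold."""
    vecs = tfidf_vectors(docs)
    return [(a, b) for a, b in combinations(range(len(docs)), 2)
            if cosine(vecs[a], vecs[b]) > threshold]


def split_sentences(doc):
    """Naive splitter on sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def sentence_pairs(doc_a, doc_b):
    """Assumed pairing: every sentence of doc_a with every sentence of doc_b."""
    return [(sa, sb) for sa in split_sentences(doc_a) for sb in split_sentences(doc_b)]


# Hypothetical documents within one topic group.
docs = [
    "The turbine factory stands in Moabit. The factory has glass walls.",
    "The turbine factory in Moabit built steam turbines.",
    "Recipes for apple cake need butter and sugar.",
]
pairs = document_pairs(docs)
```

In this toy collection only the two factory documents form a pair; the third shares no terms with them and is filtered out by the threshold.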

SLIDE 28

SLIDE 29

Manual Evaluation and Conclusion

We manually inspected example sentence pairs for a qualitative evaluation and to motivate our next steps. We observed the following:

  • Labels often depend on lexical markers for discourse relations, such as dates for Temporal or "however", "but", or "while" for Comparison
  • The training and test sets come from different sources and different domains, which can lead to wrong predictions
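One way to probe the first observation is to flag surface cues in a sentence and compare them with the predicted label. The sketch below is a hypothetical diagnostic, not part of the presented system: the Comparison markers are the three named above, while the four-digit-year pattern is an assumed, crude proxy for "dates".

```python
import re

COMPARISON_MARKERS = {"however", "but", "while"}  # markers named on the slide
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")  # assumed date cue: a four-digit year


def lexical_cues(sentence):
    """Return the top-level senses whose surface cues occur in the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    cues = set()
    if YEAR.search(sentence):
        cues.add("Temporal")
    if words & COMPARISON_MARKERS:
        cues.add("Comparison")
    return cues
```

Counting how often a predicted Temporal or Comparison label coincides with a flagged cue would give a rough sense of how strongly the classifier relies on such markers.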

SLIDE 30

Manual Evaluation and Conclusion

  • Model is often not able to handle coreferences
  • In the following example, the pronoun “Its” in seg_b refers to the AEG turbine factory, while the subject of seg_a is the steam turbines used inside the factory

Predicted label: Expansion
Seg_a: Steam turbines were more effective than conventional steam engines and were in demand all over the world.
Seg_b: Its revolutionary design features 100m long and 15m tall glass and steel walls on either side.

SLIDE 31

Manual Evaluation and Conclusion

  • In future work, we will expand the number of pre-processing steps to better group text segments which have the same content and talk about the same entities (coreference resolution, NER), events (Event Detection) or topics (Topic Detection)
  • As data sets are still limited, we will expand the data set for our needs and, in the longer run, create annotations to develop a gold standard
  • Keep dataset annotations rather simple (e.g., by using only the four top-level senses of the PDTB) to speed up annotation

SLIDE 32

Literature


Bruner, J.: The narrative construction of reality. Critical Inquiry 18(1), 1–21 (1991)
Mann, W.C., Thompson, S.A.: Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text 8, 243–281 (1988)
Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter. CoRR pp. 1–5 (2019)
Prasad, R., Dinesh, N., Lee, A., Miltsakaki, E., Robaldo, L., Joshi, A., Webber, B.: The Penn Discourse Treebank 2.0. In: Proceedings of LREC (2008)

SLIDE 33

Thank you!

The project QURATOR is supported by the German Federal Ministry of Education and Research (BMBF), “Unternehmen Region”, instrument “Wachstumskern” (grant no. 03WKDA1A).