Towards the Naive Classification of Rhetorical Relations at Scale

Georg Rehm
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
Workshop on Coherence Relations, Humboldt-Universität zu Berlin
January 17-18, 2020

Storytelling Theory
• Storytelling = human technique to order a series of events in the world and find meaningful patterns in them (Bruner 1991)
• Organise events into a schematic structure, for example in terms of topic, locality or causal relationships, and construct explanatory models of the world and the events happening in it
• Semantic Storytelling = attempt to translate the theories of storytelling into a formal and machine-processable scheme

Semantic Storytelling
• Develop a system that, given an incoming document collection, is able to (semi-)automatically extract or generate different story paths or plot lines
• The goal is to support knowledge workers (journalists, authors, scholars, politicians, business analysts etc.) in their daily work of processing huge amounts of incoming content
• Helps to quickly grasp what is going on in a collection

Previous work: NLP-Pipeline based approach
• Combine various text analysis procedures in a pipeline (NER, Coreference Resolution, Relation Extraction, etc.)
• Connect extracted entities to knowledge bases
• Use rule-based story grammars

Previous work: NLP-Pipeline based approach
1. NER: entities like Persons, Locations, Organizations, Temporal Expressions
2. Relation Extraction: detect relations between entities
3. Timelining: anchor entities and relations in time
4. Event Detection
5. Topic Detection
6. Building Datasets for Patterns of Narration
6. Train Model on basis of Dataset
7. Visualizing Results

Now: Discourse-parsing inspired approach
• Scalable: text segments can be phrases, sentences, paragraphs, texts
• Relate text segments to each other by using sense taxonomies from research on coherence relations
• Goal: automate storytelling by detecting discourse relations between text segments of different sources on the same topic
• Makes it possible to detect and create new storylines extracted from a document collection
• In future work: combine both approaches

Semantic Storytelling: Technical Description
• Initialization: User defines Topic T, initialized as a sentence, keyword or named entity
• Semantic Storytelling tool will:
1. Determine the Relevance of a Segment for a Topic
2. Determine the Importance of a Segment
3. Determine the Discourse or Semantic Relation between two Segments

Semantic Storytelling Architecture (figure)
• Incoming content: self-contained document collection, web content, Wikipedia
• Possible instantiations of Topic T: complete document, summary, claim or fact, event, named entity
• 1 Determine the relevance of a segment for T (document relevance, segment relevance); ranked list of text segments
• 2 Determine importance of a segment (isLessImportantThan, isMoreImportantThan)
• 3 Discourse relation between segment and topic (Comparison, Expansion)
• User generating stories; GUI: "Explore the Neighbourhood!"

Step 1: Relevance of a Segment
• Is segment x relevant for segment t?
• Use for example:
– Topic modelling
– Topic overlap or entity overlap
– Text similarity or document similarity
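One of the options listed above, overlap between a segment and the topic, can be sketched in a few lines. This is a minimal illustration and not the system's actual relevance component: it uses plain token overlap (Jaccard) as a stand-in for topic or entity overlap, and the function names `tokens`, `overlap_relevance` and `rank_segments` are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-cased alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_relevance(segment: str, topic: str) -> float:
    """Jaccard overlap between the token sets of a segment and a topic.

    Assumption: token overlap is used here as a simple proxy for the
    topic/entity overlap mentioned on the slide.
    """
    a, b = tokens(segment), tokens(topic)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_segments(segments: List[str], topic: str) -> List[Tuple[float, str]]:
    """Return (score, segment) pairs, most relevant first."""
    scored = [(overlap_relevance(s, topic), s) for s in segments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative input only.
segments = [
    "The weather is nice today",
    "Kurt Tucholsky was born in Moabit",
]
ranking = rank_segments(segments, "Kurt Tucholsky")
```

A real system would more likely compare named entities or topic-model distributions; the ranking step stays the same whatever scoring function is plugged in.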

Step 2: Importance of a Segment
• How important or central is the information contained in a segment for a topic?
• In RST terms: determine the nucleus (vs. satellite)
• Possible application in a question answering task: is segment x a potential answer for segment t?

Step 3: Discourse Relations
• Find the discourse or semantic relation between a text segment and the Topic T
• From Rhetorical Structure Theory (Mann and Thompson 1988) we borrow the idea that discourse relations exist between larger sequences of text (i.e., non-elementary discourse units)
• These relations contribute to the coherence of a text

Step 3: Discourse Relations
• For our experiments, we adopt the top-level senses of the Penn Discourse Treebank, with which we can describe those discourse relations:
– Temporal
– Contingency
– Comparison
– Expansion
and an additional label
– None

Discourse Relations according to Penn Discourse Treebank (2.0)

Step 3: Discourse Relations
• For training, we use the two arguments of a relation, but at a later point we deploy the classifier on individual sentences
• We argue that the sentence level is the most appropriate level to use as input for our classifier and that the discrepancy between argument shapes and typical sentence lengths is tolerable

Use Case: “Explore the Neighbourhood!”
• Goal is to help a knowledge worker to develop a mobile app which includes interesting stories about important persons, places, etc. related to a district in Berlin
• The district Moabit was chosen due to its rich history and lively present
• Here, a story about the author Kurt Tucholsky and his connection to Moabit is shown
• Screenshots for a demo app are provided by 3pc

Use Case: “Explore the Neighbourhood!”
• Curated stories can be published to the app
• Stories may contain geographical points of interest within Moabit which are connected through an overall story arc, such as a biography

Use Case: “Explore the Neighbourhood!”

User Interface for creating curated stories

Experiments: Dataset “Moabit Stories”
• Created data set “Moabit Stories” from crawled English webpages
• Used focused crawling methods based on keywords (= topics) and manual postprocessing
• Removed boilerplate content and extracted metadata (author, date, URL, language, etc.)
• Result: data set of more than 100 documents containing relevant information and stories connected to the district of Moabit in Berlin, grouped by topics

Experiments: Discourse Relations Classifier
• The discourse relations classifier is trained on PDTB2 (Prasad 2008)
• The text is encoded as deep contextual representations with a pre-trained language model based on the transformer architecture, DistilBERT (Sanh 2019)

Architecture of Siamese BERT model
• Architecture of the Siamese BERT model used for the classification of discourse relations between two text segments $d_1$ and $d_2$
• The output of the classification layer $\hat{y}$ holds the predicted semantic relation according to the top-level PDTB2 senses: Temporal, Contingency, Comparison, Expansion and additionally None

Architecture of Siamese BERT model
• BERT used in a Siamese fashion, 6 hidden layers, each consisting of 768 units, with last hidden states $h_1$, $h_2$
• Concatenation layer takes both last hidden states $h_1$, $h_2$ as input; its output is the concatenation of the two text representations
• Multi-Layer-Perceptron consisting of two fully connected layers where each layer has 100 units
• Activation with ReLU
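The classification head described above (concatenation, two fully connected layers of 100 units with ReLU, then a classification layer over five labels) can be sketched without any deep learning library. This is a shape-level illustration under stated assumptions: the encoder is replaced by random 768-dimensional stand-in vectors, the weights are random rather than trained, and the softmax on the output layer is an assumption, since the slides only name a classification layer. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]

LABELS = ["Temporal", "Contingency", "Comparison", "Expansion", "None"]
HIDDEN = 768  # size of each last hidden state h1, h2


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Vector) -> Vector:
    """Numerically stable softmax (assumed output activation)."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out: int, n_in: int, rng: random.Random) -> Layer:
    """Random weights standing in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(h1: Vector, h2: Vector, layers: List[Layer]) -> Tuple[str, Vector]:
    """Concatenate both representations, apply the MLP, then classify."""
    x = h1 + h2  # concatenation layer: 2 * HIDDEN values
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    probs = softmax(dense(x, weights, bias))
    return LABELS[probs.index(max(probs))], probs


rng = random.Random(0)
layers = [
    init_layer(100, 2 * HIDDEN, rng),    # first fully connected layer
    init_layer(100, 100, rng),           # second fully connected layer
    init_layer(len(LABELS), 100, rng),   # classification layer
]
# Stand-ins for the encoder outputs of the two text segments.
h1 = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN)]
h2 = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN)]
label, probs = classify(h1, h2, layers)
```

In the Siamese setup both segments pass through the same encoder weights, so only the head sees them jointly; with untrained weights the predicted label here is meaningless and only the shapes are informative.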

Experiments: Results PDTB2 Training

Experiments: “Moabit Stories”
Steps:
• Group documents by topics based on the query terms for the focused crawler
• Split documents into sentences
• Find document pairs among the topic groups by representing documents as tf-idf vectors and using cosine similarity with $\operatorname{cosine}(d_a, d_b) > 0.15$ for document pairs
• 19,796 sentence pairs passed to the classifier
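The pairing steps above can be sketched as follows for the documents of one topic group. Only the 0.15 cosine threshold comes from the slides. The smoothed idf formula, the regex-based sentence splitter, and the choice to form sentence pairs as the cross product of the sentences of each paired document are assumptions made for illustration; all function names are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations, product
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Sparse tf-idf vectors; the smoothed idf variant is an assumption."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def document_pairs(docs: List[str], threshold: float = 0.15) -> List[Tuple[int, int]]:
    """Index pairs (a, b) with cosine(d_a, d_b) > threshold."""
    vecs = tfidf_vectors(docs)
    return [
        (a, b)
        for a, b in combinations(range(len(docs)), 2)
        if cosine(vecs[a], vecs[b]) > threshold
    ]


def split_sentences(doc: str) -> List[str]:
    """Naive splitter on sentence-final punctuation (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def sentence_pairs(docs: List[str], threshold: float = 0.15) -> List[Tuple[str, str]]:
    """Cross product of sentences for every sufficiently similar document pair."""
    pairs: List[Tuple[str, str]] = []
    for a, b in document_pairs(docs, threshold):
        pairs.extend(product(split_sentences(docs[a]), split_sentences(docs[b])))
    return pairs


# Illustrative documents only.
docs = [
    "Tucholsky was born in Moabit. He became a writer.",
    "Tucholsky was born in Moabit. He became a writer.",
    "Solar panels convert sunlight.",
]
pairs = sentence_pairs(docs)
```

Each resulting sentence pair would then be passed to the discourse relations classifier; filtering at the document level first keeps the number of pairs manageable.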
