Measuring Semantic Coherence of a Conversation
Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres
Svitlana Vakulenko et al. Measuring Semantic Coherence of a Conversation. ISWC2018 @svakulenk0
Semantic Coherence?
I think Monterey is a great conference location!
Oh yes, it has Florida’s most beautiful coastline…
…I am looking forward to seeing the Eiffel Tower!
Semantic Coherence ~ Contextual Glue
I think Monterey is a great conference location!
Oh yes, it has Florida’s most beautiful coastline…
[Figure: knowledge graph with the entities California and Aquarium and two locatedIn edges]
Semantic Coherence IsA Classification Task
[Figure labels: sense-making line, coherence score, nonsense, background knowledge]
Applications
▪ Conversational analysis
  ▪ reconstructing dialogs in a public chat
  ▪ detecting topic shifts for segmentation
▪ Conversational agents
  ▪ interpreting context
  ▪ generating response
Contributions
- 1. Task of measuring semantic coherence of a conversation
- 2. Benchmark for the semantic coherence task
- 3. Approaches and their evaluation:
3.1. Subgraph induction approach
3.2. Graph embeddings approach
3.3. Word embeddings approach
Benchmark
▪ Conversational dataset
  ▪ Ubuntu Dialogue Corpus (IRC logs) ~2M
▪ Knowledge representation models
  ▪ KGs: DBpedia+Wikidata HDT
  ▪ KG embeddings: RDF2Vec, KGloVe
  ▪ Word embeddings: Word2Vec, GloVe
▪ Entity Linking: DBpedia Spotlight API
Subgraph induction: Top-k Shortest Paths
mdg: gksudo gedit /etc/apt/source.list (type from command line)
crunchbang666: the text editor has opened the file source.list but there is no content i typed source instead of sources ... ok so i have it open
mdg: see the line # deb http://gb.archive.ubuntu all you have to do is delete the "#" character
crunchbang666: just the deb or the deb-src line too?
[Figure: conversation graph with nodes p1, p2, u1-u4, w1-w5, c1-c4, c* and the speakers mdg and crunchbang666]
[Figure: DBpedia subgraph with entities dbr:Ubuntu(OS), dbr:Deb(file format), dbr:Text editor, dbr:Gedit, dbr:GNOME and relations wikiPageWikiLink, genre]
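The idea of connecting the entities mentioned in a dialogue through the top-k shortest paths in a knowledge graph can be illustrated on a toy graph. This is only a minimal sketch: the entity names are adapted from the slide, while the edges, the `max_hops` cut-off, and the function names are assumptions made for illustration, not the authors' implementation or real DBpedia data.

```python
from collections import deque

# Toy undirected graph. Entity names are adapted from the slide; the edges
# are illustrative assumptions, not facts taken from DBpedia or Wikidata.
EDGES = [
    ("dbr:Gedit", "dbr:Text_editor"),
    ("dbr:Gedit", "dbr:GNOME"),
    ("dbr:GNOME", "dbr:Ubuntu"),
    ("dbr:Ubuntu", "dbr:Deb"),
    ("dbr:Text_editor", "dbr:Ubuntu"),
]

def build_adjacency(edges):
    """Map every node to the set of its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def top_k_shortest_paths(adjacency, source, target, k, max_hops=4):
    """Return up to k simple paths from source to target, shortest first.

    Breadth-first search over partial paths: the queue holds paths in
    non-decreasing length, so complete paths are found shortest first.
    """
    paths = []
    queue = deque([[source]])
    while queue and len(paths) < k:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:  # hypothetical hop limit
            continue
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in path:  # keep paths simple (no cycles)
                queue.append(path + [neighbour])
    return paths

adjacency = build_adjacency(EDGES)
paths = top_k_shortest_paths(adjacency, "dbr:Gedit", "dbr:Deb", k=2)
for p in paths:
    print(len(p) - 1, "hops:", " -> ".join(p))
```

Enumerating partial paths this way grows quickly with the hop limit, which is one reason to cap `max_hops` on a large graph.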
Benchmark: Incoherent Dialogues
- 1. Vocabulary sampling
1.1. Random uniform
1.2. Vocabulary distribution
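The two vocabulary sampling strategies can be sketched as follows. The slide only names them, so the reading used here is an assumption: "random uniform" draws each entity with equal probability from the set of distinct entities, and "vocabulary distribution" draws entities in proportion to their corpus frequency. The toy corpus and function names are hypothetical.

```python
import random
from collections import Counter

def sample_uniform(corpus_entities, length, rng):
    """Draw entities uniformly at random from the distinct entities."""
    distinct = sorted(set(corpus_entities))
    return [rng.choice(distinct) for _ in range(length)]

def sample_by_distribution(corpus_entities, length, rng):
    """Draw entities in proportion to how often they occur in the corpus."""
    counts = Counter(corpus_entities)
    entities = sorted(counts)
    weights = [counts[e] for e in entities]
    return rng.choices(entities, weights=weights, k=length)

# Hypothetical entity mentions collected from a corpus.
corpus = ["Ubuntu", "Ubuntu", "Ubuntu", "Gedit", "GNOME", "Ubuntu", "Deb"]
rng = random.Random(0)  # fixed seed for reproducibility

uniform_negative = sample_uniform(corpus, 5, rng)
weighted_negative = sample_by_distribution(corpus, 5, rng)
print(uniform_negative)
print(weighted_negative)
```

Under this reading, frequency-weighted samples look more like real dialogues on the surface, because frequent entities such as "Ubuntu" appear as often as they do in the corpus.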
- 2. Dialogue permutations
2.1. Sequence disorder
2.2. Horizontal split
2.3. Vertical split
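A rough sketch of the three permutation types follows. The slide does not define them, so the definitions here are assumptions: sequence disorder shuffles the turns of one dialogue, a horizontal split joins the first half of one dialogue with the second half of another, and a vertical split interleaves one speaker's turns from one dialogue with the other speaker's turns from a second dialogue. The toy dialogues and speaker labels are hypothetical.

```python
import random

def sequence_disorder(dialogue, rng):
    """Shuffle the turns of a single dialogue (assumed definition)."""
    shuffled = list(dialogue)
    rng.shuffle(shuffled)
    return shuffled

def horizontal_split(first, second):
    """First half of one dialogue, then second half of another (assumed)."""
    return first[: len(first) // 2] + second[len(second) // 2 :]

def vertical_split(first, second, speaker_a="A", speaker_b="B"):
    """Speaker A's turns from `first` interleaved with speaker B's turns
    from `second` (assumed definition)."""
    a_turns = [turn for turn in first if turn[0] == speaker_a]
    b_turns = [turn for turn in second if turn[0] == speaker_b]
    mixed = []
    for a, b in zip(a_turns, b_turns):
        mixed.extend([a, b])
    return mixed

# Hypothetical dialogues as (speaker, content) turns.
d1 = [("A", "apt"), ("B", "sources.list"), ("A", "gedit"), ("B", "deb")]
d2 = [("A", "wifi"), ("B", "driver"), ("A", "kernel"), ("B", "reboot")]

rng = random.Random(1)
disordered = sequence_disorder(d1, rng)
horizontal = horizontal_split(d1, d2)
vertical = vertical_split(d1, d2)
print(disordered)
print(horizontal)
print(vertical)
```

Unlike vocabulary sampling, all three outputs reuse turns from real dialogues, so the individual turns stay plausible and only their combination is broken.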
Subgraph Induction: Performance Bottleneck
[Figure axes: min #hops, % entities]
Embeddings Classification Approach
▪ Convolutional Neural Network (CNN)
  ▪ Input: sequence of words/entities
  ▪ Output: coherence score [0, 1]
▪ Pre-trained embeddings
  ▪ Entities: RDF2Vec, KGloVe
  ▪ Words: Word2Vec, GloVe
[Figure: CNN architecture with layers Input, Embeddings, Convolutional (250 filters, size 3, step 1), Max pool, Hidden, Output; activations ReLU, ReLU, Sigmoid; example output score 0.8]
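To make the layer sequence concrete, here is a minimal forward pass in plain Python with random, untrained weights. Only the 250 filters of size 3 with step 1 come from the slide. The embedding dimension, the hidden layer size, the missing bias terms, and the assignment of ReLU to the convolutional and hidden layers and Sigmoid to the output are assumptions, so this is a sketch of the general technique rather than the authors' model.

```python
import math
import random

def relu(x):
    return x if x > 0.0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def random_matrix(rows, cols, rng):
    """Matrix of small random weights (stand-in for trained parameters)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv1d_relu(sequence, filters, size=3, step=1):
    """Slide each filter over windows of `size` consecutive embeddings.

    Each filter is a flat list of size * dim weights. The sequence must
    contain at least `size` vectors (shorter inputs would need padding).
    """
    feature_maps = []
    for weights in filters:
        row = []
        for start in range(0, len(sequence) - size + 1, step):
            window = [v for vec in sequence[start:start + size] for v in vec]
            row.append(relu(sum(w * v for w, v in zip(weights, window))))
        feature_maps.append(row)
    return feature_maps

def dense(inputs, weight_rows, activation):
    """Fully connected layer without bias terms."""
    return [activation(sum(w * x for w, x in zip(row, inputs)))
            for row in weight_rows]

def coherence_score(sequence, params):
    feature_maps = conv1d_relu(sequence, params["filters"])
    pooled = [max(row) for row in feature_maps]  # max pooling over positions
    hidden = dense(pooled, params["hidden"], relu)
    return dense(hidden, params["output"], sigmoid)[0]

rng = random.Random(42)
# N_FILTERS is from the slide; DIM and N_HIDDEN are assumptions.
DIM, N_FILTERS, N_HIDDEN = 4, 250, 16
params = {
    "filters": random_matrix(N_FILTERS, 3 * DIM, rng),
    "hidden": random_matrix(N_HIDDEN, N_FILTERS, rng),
    "output": random_matrix(1, N_HIDDEN, rng),
}
sequence = random_matrix(6, DIM, rng)  # six stand-in embedding vectors
score = coherence_score(sequence, params)
print(score)
```

Because the weights are random, the printed score carries no meaning. In a trained model the embedding vectors would come from the pre-trained word or entity embeddings listed above.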
Results: Word Embeddings Perform Best
[Figure: results grouped by Word and KG embeddings]
Results: KG Embeddings Classification
[Figure axes: % entities, min cosine distance]
Results: Word Embeddings Classification
[Figure axes: % entities, min cosine distance]
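For reference, cosine distance between two embedding vectors is one minus their cosine similarity. The sketch below also takes the minimum over all pairs in a set of vectors. How exactly the plotted "min cosine distance" is computed is not stated on the slide, so the pairwise minimum and the toy two-dimensional vectors are assumptions.

```python
import math
from itertools import combinations

def cosine_distance(u, v):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def min_pairwise_cosine_distance(vectors):
    """Smallest cosine distance over all pairs (assumed aggregation)."""
    return min(cosine_distance(u, v) for u, v in combinations(vectors, 2))

# Hypothetical two-dimensional embeddings.
embeddings = {
    "ubuntu": [1.0, 0.0],
    "gedit": [1.0, 1.0],
    "aquarium": [0.0, 1.0],
}
closest = min_pairwise_cosine_distance(list(embeddings.values()))
print(round(closest, 4))
```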
Results: Permutation IsA Difficult Task
- 1. sampling
- 2. permutations
[Figure: results grouped by Word and KG embeddings]
Coherence Patterns
Horizontal Split
Future Work
▪ Extend coherence measure beyond KG entities
▪ Integration of KG and word embeddings
▪ End-to-end training including entity linking layer
Open source: https://github.com/svakulenk0/semantic_coherence