
Measuring Semantic Coherence of a Conversation

Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres. 6 June 2018. Semantic coherence: an essential property of a conversation, "continuity of senses".


  1. Measuring Semantic Coherence of a Conversation
     Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres
     6 June 2018

  2. Semantic coherence
     • An essential property of a conversation: "continuity of senses"
     Image: https://pixabay.com/en/fishing-net-red-thread-network-node-1526496/

  3. Research goal
     ▪ See if we can detect holes in conversations
     ▪ Evaluate existing knowledge models
     ▪ Propose an approach to measure these holes (incoherence)
     ▪ Why: dialogue system design, knowledge engineering

  4. Semantic models
     ▪ Knowledge Graphs
     Image: https://commons.wikimedia.org/wiki/File:Wikidata-gun-ontology-2017-05-11.png

  5. Semantic models
     ▪ Word embeddings
     Image: https://commons.wikimedia.org/wiki/File:2016_02_mini_embedding.png

  6. Semantic models
     ▪ Knowledge Graphs
     ▪ Word embeddings
     Images: https://commons.wikimedia.org/wiki/File:2016_02_mini_embedding.png
             https://commons.wikimedia.org/wiki/File:Wikidata-gun-ontology-2017-05-11.png

  7. Semantic models
     ▪ Knowledge Graphs
     ▪ Word embeddings
     ▪ Knowledge Graph embeddings
     Images: https://commons.wikimedia.org/wiki/File:2016_02_mini_embedding.png
             https://commons.wikimedia.org/wiki/File:Wikidata-gun-ontology-2017-05-11.png

  8. Linking dialogue
     ▪ Take existing knowledge models
     ▪ See if we can detect holes in conversations through these models
     ▪ Propose an approach to measure these holes (incoherence)
     Image: https://pxhere.com/en/photo/1101883

  9. Dialog graph
     Example dialogue:
       mdg: gksudo gedit /etc/apt/source.list (type from command line)
       crunchbang666: the text editor has opened the file source.list but there is no content i typed source instead of sources ... ok so i have it open
       mdg: see the line # deb http://gb.archive. ubuntu all you have to do is delete the "#" character
       crunchbang666: just the deb or the deb-src line too?
     [Figure: dialog graph. Node labels: w1 to w5, u1 to u4, c1 to c4, c*, p1, p2. Entities: dbr:Gedit, dbr:GNOME, dbr:Text editor, dbr:Deb(file format), dbr:Ubuntu(OS). Edge labels: genre, wikiPageWikiLink.]
     Vakulenko et al. Measuring Semantic Coherence of a Conversation.

  10. Experiments
      ▪ Ubuntu Dialogue Corpus
      ▪ DBpedia Spotlight API
      ▪ Knowledge Graphs: DBpedia + Wikidata HDT
      ▪ Knowledge Graph embeddings: rdf2vec, KGloVe
      ▪ Word embeddings: word2vec, GloVe
      https://github.com/rkadlec/ubuntu-ranking-dataset-creator
      https://en.wikipedia.org/wiki/File:DBpediaSpotlight.jpg
      https://en.wikipedia.org/wiki/Wikidata

  11. Subgraph induction

  12. Top-k shortest path
      PREFIX ppf: <java:at.ac.wu.arqext.path.>
      PREFIX dbr: <http://dbpedia.org/resource/>
      SELECT * WHERE {
        ?X ppf:topk ("--source" dbr:Directory_service dbr:Gnome dbr:GNOME dbr:Desktop_environment
                     "--target" dbr:Desktop_computer
                     "--k" 5 "--maxlength" 9 "--timeout" 2000)
      }
      http://wikidata.communidata.at
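The query on slide 12 asks a custom property function (`ppf:topk`) for the k shortest paths between a set of source entities and a target entity, bounded by a maximum length. The idea can be illustrated with a minimal Python sketch. It is not the implementation behind `ppf:topk`: it assumes an unweighted graph stored as an adjacency dictionary, enumerates simple paths breadth-first, and omits the timeout. The function name `top_k_shortest_paths` and the toy edges in `TOY_KG` are hypothetical and only serve the example.

```python
from collections import deque


def top_k_shortest_paths(graph, sources, target, k=5, max_length=9):
    """Return up to k simple paths from any source to target, shortest first.

    graph: dict mapping a node to an iterable of neighbour nodes.
    max_length: maximum number of edges allowed in a path.
    """
    found = []
    # Each queue item is a partial path; FIFO order means shorter paths
    # are always expanded before longer ones.
    queue = deque([source] for source in sources)
    while queue and len(found) < k:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        if len(path) - 1 >= max_length:
            continue  # path already uses the maximum number of edges
        for neighbour in graph.get(node, ()):
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                queue.append(path + [neighbour])
    return found


# Hypothetical toy edges, NOT taken from DBpedia or Wikidata.
TOY_KG = {
    "dbr:GNOME": ["dbr:Desktop_environment", "dbr:Gedit"],
    "dbr:Desktop_environment": ["dbr:GNOME", "dbr:Desktop_computer"],
    "dbr:Gedit": ["dbr:GNOME", "dbr:Text_editor"],
    "dbr:Text_editor": ["dbr:Gedit", "dbr:Desktop_computer"],
    "dbr:Desktop_computer": ["dbr:Desktop_environment", "dbr:Text_editor"],
}

paths = top_k_shortest_paths(TOY_KG, ["dbr:GNOME"], "dbr:Desktop_computer", k=2)
for path in paths:
    print(" -> ".join(path))
```

Breadth-first enumeration is only practical on small graphs or with tight length limits, which is presumably why the query also sets `--maxlength` and `--timeout`.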

  13. Subgraph statistics

  14. Shortest paths

  15. Negative sampling
      ▪ random uniform (RUf)
      ▪ vocabulary distribution (VoD)
      ▪ sequence disorder (SqD)
      ▪ horizontal split (HSp)
      ▪ vertical split (VSp)
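The slide names five negative sampling strategies without defining them. The following Python sketch shows one plausible reading of each name, for a dialogue represented as a list of words or entities. All function names, the toy dialogues, and the exact sampling rules are assumptions made for illustration; the original definitions may differ, in particular for the two split strategies.

```python
import random


def random_uniform(dialogue, vocabulary, rng):
    """RUf (assumed): replace every item with one drawn uniformly from the vocabulary."""
    return [rng.choice(vocabulary) for _ in dialogue]


def vocabulary_distribution(dialogue, corpus, rng):
    """VoD (assumed): draw replacements in proportion to their corpus frequency."""
    pool = [item for other in corpus for item in other]  # repeats keep frequencies
    return [rng.choice(pool) for _ in dialogue]


def sequence_disorder(dialogue, rng):
    """SqD (assumed): keep the same items but shuffle their order."""
    shuffled = list(dialogue)
    rng.shuffle(shuffled)
    return shuffled


def horizontal_split(first, second):
    """HSp (assumed): first half of one dialogue, then second half of another."""
    return first[: len(first) // 2] + second[len(second) // 2:]


def vertical_split(turns_a, turns_b):
    """VSp (assumed): speaker 0 from dialogue A, speaker 1 from dialogue B.

    Each dialogue is a list of (speaker_index, item) pairs.
    """
    other = (item for speaker, item in turns_b if speaker == 1)
    mixed = []
    for speaker, item in turns_a:
        if speaker == 0:
            mixed.append(item)
        else:
            replacement = next(other, None)
            if replacement is not None:
                mixed.append(replacement)
    return mixed


# Toy dialogues (hypothetical groupings of entities that appear on the slides).
dialogue_a = ["dbr:Gedit", "dbr:Text_editor", "dbr:GNOME", "dbr:Ubuntu"]
dialogue_b = ["dbr:totem", "dbr:vlc", "dbr:fsck", "dbr:ext2"]
corpus = [dialogue_a, dialogue_b]
vocabulary = sorted({item for dialogue in corpus for item in dialogue})

rng = random.Random(0)  # fixed seed so the samples are reproducible
print(random_uniform(dialogue_a, vocabulary, rng))
print(vocabulary_distribution(dialogue_a, corpus, rng))
print(sequence_disorder(dialogue_a, rng))
print(horizontal_split(dialogue_a, dialogue_b))
```

Each function returns a sequence that can be labelled as incoherent and paired with the original dialogue as a positive example when training a binary classifier.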

  16. Shortest paths

  17. Binary classification
      ▪ Convolutional Neural Network (CNN)
      ▪ Input: sequence of words/entities
      ▪ Output: coherence score in [0, 1]
      [Figure: word embeddings → convolutional layer (250 filters, size 3, step 1; ReLU) → max pool → hidden layer (ReLU) → output (sigmoid). Example output: 0.8]

  18. Binary classification
      ▪ Convolutional Neural Network (CNN)
      ▪ Input: sequence of words/entities
      ▪ Output: coherence score in [0, 1]
      [Figure: Knowledge Graph embeddings → convolutional layer (250 filters, size 3, step 1; ReLU) → max pool → hidden layer (ReLU) → output (sigmoid). Example input: dbr:ubuntu (OS), dbr:desktop, dbr:totem, dbr:vlc, dbr:fsck, dbr:ext2, dbr:partition. Example output: 0.8]
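The layer sequence on slides 17 and 18 can be traced with a small forward pass in plain Python. This is a sketch under stated assumptions, not the authors' model: the weights are random rather than trained, biases are left out, and the embedding size (8) and hidden size (16) are hypothetical because the slides do not give them. Only the 250 filters of size 3 with step 1, the ReLU and sigmoid activations, and the max pooling come from the slides.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coherence_score(embedded, filters, hidden_weights, output_weights):
    """Convolution (step 1) -> ReLU -> max pool -> hidden ReLU -> sigmoid.

    embedded: list of embedding vectors, one per word or entity;
    must contain at least as many items as the filter size.
    """
    size = len(filters[0])  # filter width, 3 on the slides
    pooled = []
    for conv_filter in filters:
        activations = []
        for start in range(len(embedded) - size + 1):  # slide with step 1
            window = embedded[start:start + size]
            total = sum(
                weight * value
                for filter_row, vector in zip(conv_filter, window)
                for weight, value in zip(filter_row, vector)
            )
            activations.append(relu(total))
        pooled.append(max(activations))  # max pooling over all positions
    hidden = [
        relu(sum(weight * value for weight, value in zip(row, pooled)))
        for row in hidden_weights
    ]
    return sigmoid(sum(weight * value for weight, value in zip(output_weights, hidden)))


rng = random.Random(0)
EMBEDDING_DIM = 8   # assumption, not from the slides
HIDDEN_UNITS = 16   # assumption, not from the slides
NUM_FILTERS, FILTER_SIZE = 250, 3  # from the slides


def random_vector(length, scale):
    return [rng.uniform(-scale, scale) for _ in range(length)]


# Random stand-ins for the embeddings of a seven-item input sequence.
embedded = [random_vector(EMBEDDING_DIM, 1.0) for _ in range(7)]
filters = [
    [random_vector(EMBEDDING_DIM, 0.1) for _ in range(FILTER_SIZE)]
    for _ in range(NUM_FILTERS)
]
hidden_weights = [random_vector(NUM_FILTERS, 0.1) for _ in range(HIDDEN_UNITS)]
output_weights = random_vector(HIDDEN_UNITS, 0.1)

score = coherence_score(embedded, filters, hidden_weights, output_weights)
print(f"coherence score: {score:.3f}")
```

With untrained weights the score carries no meaning; the point is only that the sigmoid output keeps it in [0, 1], matching the output range stated on the slides.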

  19. Results

  20. Random uniform

  21. Horizontal split

  22. Semantic spaces

  23. Conclusions and future work
      ▪ GloVe word embeddings show the best performance
      ▪ Integrating heterogeneous knowledge sources

  24. Conclusions and future work
      ▪ NEL is a bottleneck for KG embeddings
      ▪ End-to-end training (NEL NN-layer)
      [Figure: Knowledge Graph embeddings → convolutional layer (250 filters, size 3, step 1; ReLU) → max pool → hidden layer (ReLU) → output (sigmoid). Example input: dbr:ubuntu (OS), dbr:desktop, dbr:totem, dbr:vlc, dbr:fsck, dbr:ext2, dbr:partition. Example output: 0.8]

  25. Conclusions and future work
      ▪ Dialog graph embeddings
      [Figure: the dialog graph and example dialogue from slide 9]
