A school of IMT
Commonsense Properties from Query Logs and Question Answering Forums
Julien Romero, Simon Razniewski, Koninika Pal, Jeff Z. Pan, Archit Sakhadeo, Gerhard Weikum
Goal
■ Mine Commonsense Knowledge (CSK) about:
− Object properties
− Human behavior
− General concepts
■ Focus on salient properties
■ Examples:
− (bananas, are, edible)
− (children, like, bananas)
■ Applications: chatbots, question answering, visual content understanding, search engine query interpretation, ...
QUASIMODO 2
2019/11/05
Challenges
■ Sparseness and bias
■ Rarely expressed
■ Non-encyclopedic (no Wikipedia)
■ Noise and high bias in online content
Previous Work
■ Traditional Knowledge Bases
− No commonsense
■ ConceptNet
− Manual, does not scale
■ WebChild
− Focus on possible properties, not salient ones
■ TupleKB
− Domain specific
General Pipeline
Candidate Gathering
■ Main idea: extract facts from questions
− When asking a question, people make assumptions about the world
− Harvest human curiosity, the "wisdom of the crowds"
■ Example: "Why are bananas yellow?" implies "Bananas are yellow!"
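The idea of reading a statement out of a question can be sketched with a few rewrite rules. This is a minimal illustration, not the pipeline used in the talk: the `RULES` list and the `question_to_statement` helper are hypothetical names, the patterns only handle single-word subjects, and no verb agreement is attempted.

```python
import re

# Hypothetical rewrite rules (assumption): each pair is (question pattern, statement template).
# Only single-word subjects are handled, and "does" questions keep the bare verb form.
RULES = [
    (re.compile(r"^why (is|are|was|were) (\w+) (.+?)\??$", re.IGNORECASE), r"\2 \1 \3"),
    (re.compile(r"^why (do|does) (\w+) (.+?)\??$", re.IGNORECASE), r"\2 \3"),
]


def question_to_statement(question):
    """Return the statement presupposed by a why-question, or None if no rule applies."""
    text = question.strip()
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            return match.expand(template)
    return None


print(question_to_statement("Why are bananas yellow?"))
print(question_to_statement("Why do lions often hunt zebras?"))
```

Questions that match no rule return `None`, so they can simply be dropped from the candidate set.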
Candidate Gathering – Query Logs
■ Indirect access to the query logs through autocompletion
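One way to picture indirect access through autocompletion is to probe a suggestion service with question prefixes about a subject. The sketch below is an assumption about how such probing could look: the prefix list, the letter-by-letter extension, and the `suggest` callable (a stand-in for whatever autocompletion service is queried) are all hypothetical and not taken from the slides.

```python
from string import ascii_lowercase

# Hypothetical question prefixes (assumption), chosen for illustration only.
QUESTION_PREFIXES = ["why is", "why are", "why do", "how does"]


def probe_prefixes(subject, question_prefixes=QUESTION_PREFIXES):
    """Yield query prefixes for a subject, each also extended by one letter."""
    for question_prefix in question_prefixes:
        base = f"{question_prefix} {subject}"
        yield base
        for letter in ascii_lowercase:
            yield f"{base} {letter}"


def harvest(subject, suggest):
    """Collect distinct suggestions; `suggest` maps a prefix to a list of completions."""
    seen = set()
    for prefix in probe_prefixes(subject):
        seen.update(suggest(prefix))
    return sorted(seen)


# Usage with a canned stand-in for a real autocompletion service.
canned = {
    "why are bananas": ["why are bananas yellow"],
    "why are bananas c": ["why are bananas curved"],
}
print(harvest("bananas", lambda prefix: canned.get(prefix, [])))
```

Passing the suggestion function as a parameter keeps the sketch independent of any particular search engine.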
Candidate Gathering – QA Forums
■ Sources, with access method:
− Yahoo! Answers (research datasets)
− Quora (semi-manually)
− Answers.com (sitemap)
− Reddit (dump)
■ Why/how questions
Candidate Gathering – Statistics
Candidate Gathering – Results
■ Questions transformed to statements then to triples using OpenIE techniques
■ Example:
− Question: Why do lions often hunt zebras?
− Q2S (question to statement): Lions often hunt zebras
− OpenIE: (lions, often eat, zebras)
− Modality: (lions, eat, zebras, often)
− Positivity: (lions, eat, zebras, often, positive)
− Source: (lions, eat, zebras, often, positive, Google, 0.4)
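The enrichment of an OpenIE triple with modality, polarity, and source can be sketched as a small record type. This is an illustrative sketch only: the `Candidate` class, the helper names, and the `MODALITIES` and `NEGATIONS` word lists are assumptions, not the project's actual data model.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Illustrative word lists (assumption); a real system would need richer handling.
MODALITIES = {"always", "often", "usually", "sometimes", "rarely"}
NEGATIONS = {"not"}


@dataclass(frozen=True)
class Candidate:
    subject: str
    predicate: str
    obj: str
    modality: Optional[str] = None
    positive: bool = True
    source: Optional[str] = None
    score: Optional[float] = None


def split_modality(c):
    """Move the first modality word out of the predicate into its own field."""
    tokens = c.predicate.split()
    found = [t for t in tokens if t in MODALITIES]
    if not found:
        return c
    rest = " ".join(t for t in tokens if t not in MODALITIES)
    return replace(c, predicate=rest, modality=found[0])


def split_polarity(c):
    """Mark the fact as negative if the predicate contains a negation word."""
    tokens = c.predicate.split()
    if not any(t in NEGATIONS for t in tokens):
        return c
    rest = " ".join(t for t in tokens if t not in NEGATIONS)
    return replace(c, predicate=rest, positive=False)


def with_source(c, source, score):
    """Attach the source and its score to the fact."""
    return replace(c, source=source, score=score)


# Mirrors the slide example, starting from the OpenIE triple.
fact = Candidate("lions", "often eat", "zebras")
fact = with_source(split_polarity(split_modality(fact)), "Google", 0.4)
print(fact)
```

Using an immutable record makes each step return a new fact, which matches the step-by-step presentation of the example.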
Corroboration
■ Reduce noise using additional signals from:
− Wikipedia and Simple Wikipedia
− Answer snippets from search engines
− Google Books
− Image tags from OpenImages and Flickr
− Google's Conceptual Captions dataset
■ Train a Naive Bayes classifier on all signals, using 700 manually annotated triples (TupleKB requires 70,000)
− Precision of 61%
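As a rough illustration of combining corroboration signals with Naive Bayes, the sketch below trains a Bernoulli Naive Bayes model by hand. It assumes each signal is reduced to a binary "found / not found" feature, which the slides do not state, and the training rows are toy values invented for the example, not the 700 annotated triples.

```python
import math

# One binary feature per corroboration source (the binary encoding is an assumption).
SIGNALS = ["wikipedia", "answer_snippets", "google_books", "image_tags", "captions"]


def train(samples, labels):
    """Estimate class priors and per-signal probabilities with Laplace smoothing."""
    model = {}
    for cls in (0, 1):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        prior = (len(rows) + 1) / (len(samples) + 2)
        probs = [
            (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
            for j in range(len(SIGNALS))
        ]
        model[cls] = (prior, probs)
    return model


def plausibility(model, x):
    """Return the posterior probability that a triple with signals x is correct."""
    log_scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(x, probs):
            score += math.log(p if value else 1.0 - p)
        log_scores[cls] = score
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


# Toy training data (invented): 1 = annotated as correct, 0 = incorrect.
X = [(1, 1, 1, 0, 1), (1, 0, 1, 1, 0), (0, 1, 1, 1, 1),
     (0, 0, 0, 0, 0), (0, 0, 1, 0, 0), (1, 0, 0, 0, 0)]
y = [1, 1, 1, 0, 0, 0]

model = train(X, y)
print(plausibility(model, (1, 1, 1, 1, 1)))
print(plausibility(model, (0, 0, 0, 0, 0)))
```

A triple confirmed by every signal receives a higher score than one confirmed by none; the resulting posterior plays the role of a plausibility score.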
Ranking
■ From Corroboration, get plausibility score π
■ Define a probability from it:
■ Derive a typicality τ and a saliency σ:
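The formulas for this slide are not present in the text, so the following is only one plausible reading, stated as an assumption: the probability is the plausibility π normalized over all facts, typicality τ is the probability of (predicate, object) given the subject, and saliency σ is the probability of the subject given (predicate, object). The plausibility values are toy numbers.

```python
from collections import defaultdict

# Toy plausibility scores π for (subject, predicate, object) facts (invented values).
PLAUSIBILITY = {
    ("banana", "be", "yellow"): 0.9,
    ("banana", "be", "edible"): 0.6,
    ("apple", "be", "edible"): 0.5,
}


def rank(plausibility):
    """Return {fact: (probability, typicality, saliency)} under the assumed definitions."""
    total = sum(plausibility.values())
    by_subject = defaultdict(float)
    by_property = defaultdict(float)
    for (s, p, o), pi in plausibility.items():
        by_subject[s] += pi
        by_property[(p, o)] += pi

    scores = {}
    for (s, p, o), pi in plausibility.items():
        probability = pi / total            # assumed: normalized plausibility
        typicality = pi / by_subject[s]     # assumed: P(p, o | s)
        saliency = pi / by_property[(p, o)]  # assumed: P(s | p, o)
        scores[(s, p, o)] = (probability, typicality, saliency)
    return scores


for fact, (prob, tau, sigma) in rank(PLAUSIBILITY).items():
    print(fact, round(prob, 3), round(tau, 3), round(sigma, 3))
```

Under this reading, "yellow" is more salient for bananas than "edible", because "edible" is shared with other subjects.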
Grouping
■ Reduce redundancy
■ Clustering method based on tri-factorization
■ Groups of (Subject, Object) and Predicate
Statistics
Examples of facts
■ Practical human knowledge, e.g.: (car, slip on, ice)
■ Problems linked to a subject, e.g.: (pen, can, leak)
■ Emotions linked to events, e.g.: (divorce, can, hurt)
■ Human behaviors, e.g.: (ghost, scare, people)
■ Negative knowledge, e.g.: Not (elephant, can, jump)
■ Salient modalities, e.g.: Always (doctor, have, unreadable handwriting)
■ Trivial facts, e.g.: (road, has_color, black)
■ Newest facts, e.g.: (trump, build, wall)
■ Cultural knowledge (here U.S.), e.g.: Always (school, have, locker)
■ Comparative knowledge, e.g.: (light, faster than, sound)
Precision – Entire CSKs
Precision – Same Subjects
Recall
Question Answering
Conclusion
■ We introduced a new methodology for acquiring CSK from non-standard sources
■ It improves on the state of the art with better coverage of typical and salient properties, as judged by MTurk workers
■ Extrinsic evaluations illustrate its advantages
■ Data and code available: github.com/Aunsiels/CSK
Additional slides
Future Work
■ Cultural knowledge
■ Study of stereotypes
■ Temporal evolution of the knowledge base
■ Improve ranking methods
■ Scale to the entire web
Literature
■ Data: https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/commonsense/quasimodo/
■ Code: https://github.com/Aunsiels/CSK
■ ConceptNet: http://conceptnet.io/
■ TupleKB: http://data.allenai.org/tuple-kb/
■ WebChild: https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/commonsense/webchild/