SLIDE 1

Natural Language Understanding using Knowledge Bases and Random Walks

Eneko Agirre ixa2.si.ehu.eus/eneko

IXA NLP Group University of the Basque Country

Darmstadt, 2015

SLIDE 2

Algorithms on Large Graphs

WWW, Random walks, PageRank and Google

source: http://opte.org

SLIDE 4

Algorithms on Large Graphs

Linked Data

SLIDE 5

Algorithms on Large Graphs

Wikipedia (DBpedia)

SLIDE 6

Algorithms on Large Graphs

WordNet

SLIDE 7

Algorithms on Large Graphs

Unified Medical Language System

SLIDE 8

Algorithms on Large Graphs

sources: http://sixdegrees.hu/, http://www2.research.att.com/~yifanhu/, http://www.cise.ufl.edu/research/sparse/matrices/Gleich/, http://www.ebremer.com/

SLIDE 9

Text Understanding

Understanding of broad language, what’s behind the surface strings:

    Barcelona boss says that Jose Mourinho is ’the best coach in the world’

SLIDE 12

Text Understanding: Knowledge Bases and Graph algorithms

How far can we go with current KBs and graph-based algorithms?
  • Ground words in context to KB concepts and instances
    • Word Sense Disambiguation
    • Named Entity Disambiguation, Entity Linking, Wikification
  • Similarity between concepts, instances and words
  • Improve ad-hoc information retrieval
  • Applied to WordNet(s), UMLS, Wikipedia
  • Excellent results
  • Open source software and data: http://ixa2.si.ehu.eus/ukb/

SLIDE 15

Outline

1. WordNet, PageRank and Personalized PageRank
2. Random walks for WSD
3. Random walks for WSD (biomedical domain)
4. Random walks for NED
5. Random walks for similarity
6. Similarity and Information Retrieval
7. Conclusions

SLIDE 17

WordNet, PageRank and Personalized PageRank

  • WordNet is the most widely used hierarchically organized lexical database for English (Fellbaum, 1998)
  • Broad coverage of nouns, verbs, adjectives, adverbs
  • Main unit: synset (concept)

    coach#1, manager#3, handler#2: someone in charge of training an athlete or a team.

  • Relations between concepts: synonymy (built-in), hyperonymy, antonymy, meronymy, entailment, derivation, gloss
  • Closely linked versions in several languages

SLIDE 18

WordNet, PageRank and Personalized PageRank

WordNet

Representing WordNet as a graph:
  • Nodes represent concepts
  • Edges represent relations (undirected)
  • In addition, directed edges from words to corresponding concepts (senses)
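As a rough illustration of this representation (a sketch only, with made-up synset identifiers, not the actual UKB data structures):

    from collections import defaultdict

    concept_edges = defaultdict(set)   # undirected edges between synsets (relations)
    word_to_senses = defaultdict(set)  # directed edges from words to their synsets

    def add_relation(c1, c2):
        # concept-concept relations are treated as undirected
        concept_edges[c1].add(c2)
        concept_edges[c2].add(c1)

    def add_sense(word, synset):
        # words point to their candidate senses
        word_to_senses[word].add(synset)

    add_relation("coach#n1", "trainer#n1")            # hypernymy
    add_relation("coach#n5", "public_transport#n1")   # hypernymy
    add_sense("coach", "coach#n1")
    add_sense("coach", "coach#n5")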

SLIDE 19

WordNet, PageRank and Personalized PageRank

WordNet

[Figure: fragment of the WordNet graph around the word “coach”, with directed edges from the word to its senses coach#n1 (trainer), coach#n2 (tutor) and coach#n5 (public transport vehicle), and undirected relation edges (hypernymy, holonymy, derivation, domain) to related concepts such as trainer#n1, sport#n1, managership#n3, teacher#n1, tutorial#n1, public_transport#n1, fleet#n2 and seat#n1.]

SLIDE 20

WordNet, PageRank and Personalized PageRank

Random Walks: PageRank

  • Given a graph, PageRank ranks nodes according to their relative structural importance
  • If an edge from ni to nj exists, a vote from ni to nj is produced
    • Its strength depends on the rank of ni: the more important ni is, the more strength its votes will have
  • PageRank is more commonly viewed as the result of a random walk process
    • The rank of ni represents the probability of a random walk over the graph ending on ni, at a sufficiently large time

SLIDE 21

WordNet, PageRank and Personalized PageRank

Random Walks: PageRank

  • G: graph with N nodes n1, . . . , nN
  • di: outdegree of node i
  • M: N × N transition matrix, with Mji = 1/di if an edge from i to j exists, and 0 otherwise
  • PageRank equation: Pr = c M Pr + (1 − c) v
    • c M Pr: the surfer follows edges
    • (1 − c) v: the surfer randomly jumps to any node (teleport)
    • c: damping factor, controlling the way in which these two terms are combined
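A minimal power-iteration sketch of this equation in plain numpy (illustrative only; the toy graph, damping value and iteration count are arbitrary choices, not those of the talk):

    import numpy as np

    def pagerank(M, v, c=0.85, iters=30):
        """Iterate Pr = c*M*Pr + (1-c)*v, with M[j, i] = 1/di for each edge i -> j."""
        pr = np.array(v, dtype=float)
        for _ in range(iters):
            pr = c * M.dot(pr) + (1 - c) * v
        return pr

    # Toy 3-node graph: edges 0->1, 0->2, 1->2, 2->0 (each column sums to 1)
    M = np.array([[0.0, 0.0, 1.0],
                  [0.5, 0.0, 0.0],
                  [0.5, 1.0, 0.0]])
    v = np.ones(3) / 3      # uniform teleport vector: standard PageRank
    print(pagerank(M, v))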

SLIDE 26

WordNet, PageRank and Personalized PageRank

Random Walks: Personalized PageRank

Pr = c M Pr + (1 − c) v

  • PageRank: v is a stochastic normalized vector, with all elements equal to 1/N
    • Equal probabilities to all nodes in case of random jumps
  • Personalized PageRank: non-uniform v (Haveliwala 2002)
    • Assign stronger probabilities to certain kinds of nodes
    • Bias PageRank to prefer these nodes
  • For example, if we concentrate all the mass on node i:
    • All random jumps return to ni
    • The rank of i will be high
    • The high rank of i makes all the nodes in its vicinity also receive a high rank
    • The importance of node i given by the initial v spreads along the graph
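In terms of the small sketch above, personalization only changes the teleport vector v; everything else stays the same (again illustrative, not the UKB code):

    # All teleport mass on node 0: random jumps always return there, so node 0
    # and the nodes in its vicinity get boosted ranks.
    v_personalized = np.array([1.0, 0.0, 0.0])
    print(pagerank(M, v_personalized))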

SLIDE 30

Random walks for WSD

Word Sense Disambiguation (WSD)

Goal: determine senses of the open-class words in a text.

  • “Nadal is sharing a house with his uncle and coach, Toni.”
  • “Our fleet comprises coaches from 35 to 58 seats.”

Knowledge Base (e.g. WordNet):

  • coach#1: someone in charge of training an athlete or a team.
  • coach#2: a person who gives private instruction (as in singing, acting, etc.).
  • ...
  • coach#5: a vehicle carrying many passengers; used for public transport.

SLIDE 32

Random walks for WSD

Using Personalized PageRank for WSD

For each word Wi, i = 1 . . . m in the context:
  • Initialize v with uniform probabilities over the words Wi
  • Context words act as source nodes injecting probability mass into the concept graph
  • Run Personalized PageRank
  • Choose the highest-ranking sense for the target word
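A schematic version of this procedure (the helper functions are hypothetical; the actual implementation is the UKB package linked earlier):

    def ppr_wsd(context_words, word_to_senses, run_ppr):
        """Disambiguate context words with Personalized PageRank.

        word_to_senses maps a word to its candidate synsets; run_ppr runs
        Personalized PageRank over the KB graph for a given teleport
        distribution and returns a {synset: rank} dict (both hypothetical).
        """
        # uniform teleport mass over the context words: they inject probability
        # into the concept graph through their word -> sense edges
        teleport = {w: 1.0 / len(context_words) for w in context_words}
        ranks = run_ppr(teleport)
        # for each word, choose its highest-ranking candidate sense
        return {w: max(word_to_senses[w], key=lambda s: ranks.get(s, 0.0))
                for w in context_words if word_to_senses.get(w)}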

SLIDE 33

Random walks for WSD

Using Personalized PageRank (PPR)

[Figure: Personalized PageRank over the WordNet subgraph around “coach” for the context words coach, fleet, comprise, ..., seat: the context words inject probability mass through their senses (coach#n1, coach#n2, coach#n5, fleet#n2, seat#n1, comprise#v1, ...), which then spreads to related concepts such as trainer#n1, teacher#n1 and public_transport#n1.]

SLIDE 34

Random walks for WSD

Results according to relations

relation              #      F1     ablation
Antonymy              8K     19.1   59.9
Meronymy (part-of)    21K    23.4   59.6
Derivation            32K    35.4   59.6
Taxonomy              89K    37.4   59.9
Disambiguated gloss   550K   59.9   47.1
All relations                59.7

SLIDE 35

Random walks for WSD

Results and comparison to related work

System                          S2AW    S3AW    S07CG   (N)
(Agirre et al. 2008)            56.8
(Tsatsaronis 2010)              58.8    57.4
(Ponzetto and Navigli, 2010)                            (79.4)
(Moro and Navigli, 2014)                                (84.6)
PPRw2w                          59.7    57.9    80.1    (83.6)
MFS                             60.1    62.3    78.9    (77.4)
(Ponzetto and Navigli, 2010)                    81.7    (85.5)
(Zhong et al. 2010)             68.2    67.6    82.6    (82.3)

SLIDE 37

Random walks for WSD (biomedical domain)

UMLS and biomedical text

Ambiguity believed not to occur in specific domains:

  • On the Use of Cold Water as a Powerful Remedial Agent in Chronic Disease.
  • Intranasal ipratropium bromide for the common cold.

  • 11.7% of the phrases in abstracts added to MEDLINE in 1998 were ambiguous (Weeber et al. 2011)
  • Unified Medical Language System (UMLS) Metathesaurus: Concept Unique Identifiers (CUIs)

    • C0234192: Cold (Cold Sensation) [Physiologic Function]
    • C0009264: Cold (cold temperature) [Natural Phenomenon or Process]
    • C0009443: Cold (Common Cold) [Disease or Syndrome]

SLIDE 39

Random walks for WSD (biomedical domain)

WSD and biomedical text

Thesauri in the Metathesaurus (∼1M CUIs):

Alcohol and other drugs, Medical Subject Headings, Crisp Thesaurus, SNOMED Clinical Terms, etc.

Relations in the Metathesaurus between CUIs (∼5M):

parent, can be qualified by, related possibly synonymous, related other

We applied Personalized PageRank, evaluated on NLM-WSD: 50 ambiguous terms (100 instances each).

KB                           #CUIs     #relations   Acc.   Terms
AOD                          15,901    58,998       51.5   4
MSH                          278,297   1,098,547    44.7   9
CSP                          16,703    73,200       60.2   3
SNOMEDCT                     304,443   1,237,571    62.5   29
all above                    572,105   2,433,324    64.4   48
all relations                          5,352,190    70.4   50
(Jimeno and Aronson, 2011)                          68.4   50

SLIDE 42

Random walks for NED

Named Entity Disambiguation

  • Goal: given a Named Entity mention, determine the corresponding instance in the KB (aka Entity Linking, Wikification)
  • Represent Wikipedia (DBpedia) as a graph: ∼5M articles, ∼90M hyperlinks

SLIDE 44

Random walks for NED

Named Entity Disambiguation

Alan Kourie, CEO of the Lions franchise, had discussions with Fletcher in Cape Town.

SLIDE 45

Random walks for NED

Named Entity Disambiguation

Main steps (sketched in code below):
  • Named Entity Recognition in text (NER)
  • Candidate generation: use titles, redirects, text in anchors
  • Disambiguation: Personalized PageRank
  • NIL detection and clustering: no corresponding instance in the KB

Evaluation: accuracy (we don’t do NILs or NIL clustering)
  • TAC-KBP 2009: 78.8 vs. 76.5 (Best system)
  • TAC-KBP 2010: 83.6 vs. 80.6 (Best system)
  • TAC-KBP 2013: 81.7 vs. 77.7 (Best system)
  • AIDA: 79.9 vs. 82.1 (Moro et al., 2014)
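A rough sketch of the candidate-generation and disambiguation steps listed above (the anchor dictionary and run_ppr helper are hypothetical, not the actual system):

    def link_entities(mentions, anchor_dict, run_ppr):
        """anchor_dict: mention string -> candidate Wikipedia articles, built from
        titles, redirects and anchor text; run_ppr runs Personalized PageRank over
        the Wikipedia hyperlink graph (both hypothetical helpers)."""
        candidates = {m: anchor_dict.get(m, []) for m in mentions}
        nodes = [a for cands in candidates.values() for a in cands]
        if not nodes:
            return {m: "NIL" for m in mentions}
        # spread teleport mass over every candidate article of every mention
        teleport = {a: 1.0 / len(nodes) for a in nodes}
        ranks = run_ppr(teleport)
        # highest-ranked candidate per mention; mentions without candidates are NIL
        return {m: (max(cands, key=lambda a: ranks.get(a, 0.0)) if cands else "NIL")
                for m, cands in candidates.items()}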

SLIDE 47

Random walks for similarity

Random walks for similarity

Given two words, estimate how similar they are:

    gem    jewel

Given a pair of words (w1, w2) (Hughes and Ramage, 2007):
  • Initialize the teleport vector with all probability mass on w1
  • Run Personalized PageRank, obtaining a probability vector for w1
  • Initialize on w2 and obtain the vector for w2
  • Measure the similarity between the two vectors (e.g. cosine)
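In code, this amounts to comparing two Personalized PageRank vectors with the cosine (a sketch; run_ppr is a hypothetical helper returning a {node: rank} dict):

    import math

    def ppr_similarity(w1, w2, run_ppr):
        v1 = run_ppr({w1: 1.0})   # all teleport mass on the first word
        v2 = run_ppr({w2: 1.0})   # all teleport mass on the second word
        dot = sum(v1.get(k, 0.0) * v2.get(k, 0.0) for k in set(v1) | set(v2))
        n1 = math.sqrt(sum(x * x for x in v1.values()))
        n2 = math.sqrt(sum(x * x for x in v2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0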

SLIDE 48

Random walks for similarity

Similarity datasets

RG dataset                      WordSim353 dataset
cord smile            0.02      king cabbage            0.23
rooster voyage        0.04      professor cucumber      0.31
. . .                           . . .
glass jewel           1.78      investigation effort    4.59
magician oracle       1.82      movie star              7.38
. . .                           . . .
cemetery graveyard    3.88      journey voyage          9.29
automobile car        3.92      midday noon             9.29
midday noon           3.94      tiger tiger            10.00
80 pairs, 51 subjects           353 pairs, 16 subjects
Similarity                      Similarity and relatedness

SLIDE 49

Random walks for similarity

Results

Method                               Source           WS353   RG
(Hughes and Ramage, 2007)            WordNet          0.55
(Finkelstein et al. 2007)            Corpora (LSA)    0.56
(Agirre et al. 2009)                 Corpora          0.66    0.88
PPR                                  WordNet          0.69    0.87
(Huang et al. 2012)                  Corpora (NN)     0.71
(Baroni et al., 2014)                Corpora (NN)     0.71    0.84
PPR                                  Wikipedia        0.73    0.86
(Gabrilovich and Markovitch, 2007)   Wikipedia        0.75    0.82
(Reisinger and Mooney, 2010)         Corpora          0.77
(Pilehvar et al. 2013)               BabelNet                 0.87
PPR                                  Wiki + WNet      0.79    0.91
(Radinsky et al. 2011)               Corpora (Time)   0.80

SLIDE 51

Similarity and Information Retrieval

Similarity and Information Retrieval

  • Document expansion (aka clustering and smoothing) has been shown to be successful in ad-hoc IR
  • Use WordNet and similarity to expand documents

Example:

    I can’t install DSL because of the antivirus program, any hints?

    You should turn off virus and anti-spy software. And thats done within each of the softwares themselves. Then turn them back on later after setting up any DSL softwares.

Method:

  • Initialize the random walk with the document words
  • Retrieve the top k synsets
  • Introduce the words on those k synsets in a secondary index
  • When retrieving, use both primary and secondary indexes
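A sketch of the expansion and of combining the two indexes at retrieval time (the λ interpolation, default values and helper functions are assumptions for illustration, not the exact experimental setup):

    def expansion_terms(doc_words, run_ppr, synset_words, k=100):
        """Top-k synsets of a PPR run seeded with the document words, flattened
        into the words that lexicalize them (the secondary index)."""
        ranks = run_ppr({w: 1.0 / len(doc_words) for w in doc_words})
        top_synsets = sorted(ranks, key=ranks.get, reverse=True)[:k]
        return {w for s in top_synsets for w in synset_words[s]}

    def combined_score(query, doc_id, bm25_original, bm25_expanded, lam=0.1):
        # interpolate the BM25 scores of the original-words index and the
        # expansion-terms index (lam = 0.1 is a made-up default)
        return (1 - lam) * bm25_original(query, doc_id) + lam * bm25_expanded(query, doc_id)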

SLIDE 52

Similarity and Information Retrieval

Example

You should turn off virus and anti-spy software. And thats done within each of the softwares themselves. Then turn them back on later after setting up any DSL softwares.

SLIDE 54

Similarity and Information Retrieval

Example

Query: I can’t install DSL because of the antivirus program, any hints?

SLIDE 55

Similarity and Information Retrieval

Experiments

  • BM25 ranking function
  • Combine 2 indexes: original words and expansion terms
  • Parameters: k1, b (BM25); λ (indices); k (concepts in expansion)
  • Three collections:

    • Robust at CLEF 2009
    • Yahoo! Answers
    • RespubliQA (IR for QA)

Summary of results:

  • Default parameters: 1.43% - 4.90% improvement in all 3 datasets
  • Optimized parameters: 0.98% - 2.20% improvement in 2 datasets
  • Robustness on suboptimal parametrizations: 5.77% - 19.77% improvement in 4 out of 6
    • Particularly on short documents

SLIDE 59

Conclusions

Conclusions

  • Knowledge-based method for WSD, NED and similarity
  • State-of-the-art results in similarity and NED
  • Best graph-based results in all tasks
  • Specific experiments: link overlap (NGD), subgraphs
  • Exploits the whole structure of a very large KB; simple, few knobs
  • Key for performance: selection of relations in the graph
  • Publicly available at http://ixa2.si.ehu.eus/ukb
    • Both programs and data (WordNet, UMLS; Wikipedia to come soon)
    • Including a program to construct graphs from KBs
    • GPL license, open source, free

SLIDE 61

Conclusions

Future

  • Beyond terms (SemEval 2015 Task 2: Semantic Textual Similarity)
  • Multilinguality and cross-linguality
  • Explore other sources of links: co-occurrence graphs
  • Beyond bag of words: incorporate syntactic structure
  • Include supervision

SLIDE 63

Conclusions

Natural Language Understanding using Knowledge Bases and Random Walks

Eneko Agirre ixa2.si.ehu.eus/eneko

IXA NLP Group University of the Basque Country

Darmstadt, 2015

In collaboration with: Ander Barrena, Nicolai Erbs, Oier Lopez de Lacalle, Arantxa Otegi, German Rigau, Aitor Soroa, Mark Stevenson

SLIDE 64

Conclusions

References I

Agirre, E., Arregi, X. and Otegi, A. (2010). Document Expansion Based on WordNet for Robust IR. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling), pp. 9–17.
Agirre, E., de Lacalle, O. L. and Soroa, A. (2009). Knowledge-Based WSD on Specific Domains: Performing better than Generic Supervised WSD. In Proceedings of IJCAI, Pasadena, USA.
Agirre, E., de Lacalle, O. L. and Soroa, A. (2014). Random Walks for Knowledge-Based Word Sense Disambiguation. Computational Linguistics 40.
Agirre, E. and Soroa, A. (2009). Personalizing PageRank for Word Sense Disambiguation. In Proceedings of EACL-09, Athens, Greece.

SLIDE 65

Conclusions

References II

Agirre, E., Soroa, A., Alfonseca, E., Hall, K., Kravalova, J. and Pasca, M. (2009). A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. In Proceedings of the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), Boulder, USA.
Agirre, E., Soroa, A. and Stevenson, M. (2010). Graph-based Word Sense Disambiguation of Biomedical Documents. Bioinformatics 26, 2889–2896.
Agirre, E., Cuadros, M., Rigau, G. and Soroa, A. (2010). Exploring Knowledge Bases for Similarity. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), (Calzolari, N., ed.), pp. 373–377, European Language Resources Association (ELRA), Valletta, Malta.

SLIDE 66

Conclusions

References III

Otegi, A., Arregi, X. and Agirre, E. (2011). Query Expansion for IR using Knowledge-Based Relatedness. In Proceedings of the International Joint Conference on Natural Language Processing.
Otegi, A., Arregi, X., Ansa, O. and Agirre, E. (2014). Using Knowledge-Based Relatedness for Information Retrieval. Knowledge and Information Systems, in press, 1–30.
Stevenson, M., Agirre, E. and Soroa, A. (2011). Exploiting Domain Information for Word Sense Disambiguation of Medical Documents. Journal of the American Medical Informatics Association, 1–6.
Yeh, E., Ramage, D., Manning, C., Agirre, E. and Soroa, A. (2009). WikiWalk: Random Walks on Wikipedia for Semantic Relatedness. In Proceedings of the ACL Workshop “TextGraphs-4: Graph-based Methods for Natural Language Processing”.
