 
YAGO3: A Knowledge Base from Multilingual Wikipedias
Farzaneh Mahdisoltani, Joanna Biega, Fabian M. Suchanek
CIDR 2015
Example facts in YAGO:
  John_Coltrane wasBornOnDate "1926-09-23"
  John_Coltrane wasBornIn Hamlet_(Town)
  John_Coltrane label "John William Coltrane"
  John_Coltrane type American_Jazz_Composer
  Hamlet_(Town) locatedIn United_States
  United_States locatedIn North_America
  American_Jazz_Composer subclassOf wordnet_composer
  wordnet_composer subclassOf wordnet_musician

120M facts, 10M entities, 100 relations, 95% precision
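As a rough illustration of how such facts can be stored and queried, the sketch below keeps them as subject-predicate-object triples and follows `subclassOf` edges to collect every class of an entity. It is not YAGO's storage format. The helper names (`build_index`, `all_classes`) are hypothetical, and the subjects of the `locatedIn` and `subclassOf` edges are read off the slide's graph layout, so treat them as assumptions.

```python
from collections import defaultdict

# Triples as shown on the slide. Which node each locatedIn/subclassOf
# edge starts from is an assumption based on the graph layout.
FACTS = [
    ("John_Coltrane", "wasBornOnDate", "1926-09-23"),
    ("John_Coltrane", "wasBornIn", "Hamlet_(Town)"),
    ("John_Coltrane", "label", "John William Coltrane"),
    ("John_Coltrane", "type", "American_Jazz_Composer"),
    ("Hamlet_(Town)", "locatedIn", "United_States"),
    ("United_States", "locatedIn", "North_America"),
    ("American_Jazz_Composer", "subclassOf", "wordnet_composer"),
    ("wordnet_composer", "subclassOf", "wordnet_musician"),
]


def build_index(facts):
    """Map (subject, predicate) to the list of objects."""
    index = defaultdict(list)
    for subject, predicate, obj in facts:
        index[(subject, predicate)].append(obj)
    return index


def all_classes(entity, index):
    """Direct types of an entity plus all superclasses via subclassOf."""
    seen = []
    stack = list(index.get((entity, "type"), []))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.append(cls)
            stack.extend(index.get((cls, "subclassOf"), []))
    return seen


INDEX = build_index(FACTS)
CLASSES = all_classes("John_Coltrane", INDEX)
print(CLASSES)
```

With the edges assumed above, the entity inherits both WordNet classes through its single direct type.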
YAGO can be used in many ways
  Named Entity Disambiguation
    J. Hoffart et al., Robust Disambiguation of Named Entities in Text, EMNLP 2011
  Semantic Culturomics
    T. Huet, J. Biega, F. M. Suchanek, Mining History with Le Monde, AKBC 2013
    F. M. Suchanek, N. Preda, Semantic Culturomics, VLDB 2014
Extending YAGO coverage would yield better results!
Multilingual Wikipedias
  Local entities: Izabella_Olszewska, Tadeusz_Jurasz
  Local facts: Izabella_Olszewska isMarriedTo Tadeusz_Jurasz
Running YAGO on multilingual Wikipedias
What happens when the English (EN) extraction is run on other language editions?
  Duplicate entities
  Entities with no type are discarded
  No facts are extracted from foreign infoboxes
Running YAGO on multilingual Wikipedias
[Architecture diagram: extractors connected through themes; a raw extraction stage followed by a clean-up stage]
Tasks
  1. Entities
  2. Types
  3. Facts
1. Set of Entities
[Figure: entities compared with "=?"]
[image] specifies the abstraction classes
2. Taxonomy construction
English-centric!
  en/John_Coltrane inCategory "Jazz Music"
  en/John_Coltrane inCategory "American Composers"
  en/John_Coltrane type American_Composer
  American_Composer subclassOf wordnet_composer
Using foreign categories:
  pl/John_Coltrane inCategory pl/Amerykańscy_Jazzmani
  en/John_Coltrane inCategory en/American_Jazzmen
  en/John_Coltrane type American_Jazzman
  American_Jazzman subclassOf wordnet_jazzman
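The chain on this slide (foreign category, English category, type, WordNet class) can be sketched as a sequence of lookups. This is only an illustration under stated assumptions: the three dictionaries and the function `derive_type_facts` are hypothetical stand-ins, and the slides do not say how each mapping is actually obtained.

```python
# Hypothetical lookup tables standing in for the real mappings.
ENTITY_TRANSLATION = {"pl/John_Coltrane": "en/John_Coltrane"}
CATEGORY_TRANSLATION = {"pl/Amerykańscy_Jazzmani": "en/American_Jazzmen"}
CATEGORY_TO_CLASS = {"en/American_Jazzmen": "American_Jazzman"}
CLASS_TO_WORDNET = {"American_Jazzman": "wordnet_jazzman"}


def derive_type_facts(foreign_entity, foreign_category):
    """Turn one foreign inCategory fact into English taxonomy facts.

    Returns an empty list when the entity or category has no translation.
    """
    entity = ENTITY_TRANSLATION.get(foreign_entity)
    category = CATEGORY_TRANSLATION.get(foreign_category)
    if entity is None or category is None:
        return []
    facts = [(entity, "inCategory", category)]
    cls = CATEGORY_TO_CLASS.get(category)
    if cls is not None:
        facts.append((entity, "type", cls))
        wordnet_class = CLASS_TO_WORDNET.get(cls)
        if wordnet_class is not None:
            facts.append((cls, "subclassOf", wordnet_class))
    return facts


TYPE_FACTS = derive_type_facts("pl/John_Coltrane", "pl/Amerykańscy_Jazzmani")
for fact in TYPE_FACTS:
    print(*fact)
```

Returning an empty list for untranslatable input keeps the sketch conservative: nothing is typed unless every step of the chain is known.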
3. Fact extraction
  en/infobox/married maps to isMarriedTo (manually defined in YAGO-EN)
  pl/infobox/małżonek maps to which relation? isMarriedTo? wasBornOnDate? hasChild?
Infobox attributes mapping
pl/infobox/małżonek =? isMarriedTo

  F_małżonek                                    E_isMarriedTo
  (Barack_Obama, Michelle_Obama)                (Barack_Obama, Michelle_Obama)
  (Elvis_Presley, Priscilla_Presley)            (Elvis_Presley, Priscilla_Presley)
  (John_Coltrane, Ravi_Coltrane)                (John_Coltrane, Alice_Coltrane)
  (pl/Izabella_Olszewska, pl/Tadeusz_Jurasz)

Corresponding attributes will share some subject-object pairs.
Infobox attributes mapping
$$\mathrm{support}(F_a, E_r) = |\mathrm{matches}(F_a, E_r)|$$
Too restrictive for attributes with few contributions.
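A minimal sketch of the support measure, using the pairs from the slide. It assumes that `matches` is the set of subject-object pairs present on both sides and that entities have already been mapped to shared identifiers. The function names and the one-pair attribute at the end are hypothetical.

```python
# Subject-object pairs of the Polish infobox attribute (F_a).
F_MALZONEK = {
    ("Barack_Obama", "Michelle_Obama"),
    ("Elvis_Presley", "Priscilla_Presley"),
    ("John_Coltrane", "Ravi_Coltrane"),
    ("pl/Izabella_Olszewska", "pl/Tadeusz_Jurasz"),
}

# Subject-object pairs of the English YAGO relation (E_r).
E_IS_MARRIED_TO = {
    ("Barack_Obama", "Michelle_Obama"),
    ("Elvis_Presley", "Priscilla_Presley"),
    ("John_Coltrane", "Alice_Coltrane"),
}


def matches(f_a, e_r):
    """Pairs that occur in both the foreign attribute and the relation."""
    return f_a & e_r


def support(f_a, e_r):
    """Absolute number of shared pairs."""
    return len(matches(f_a, e_r))


SUPPORT = support(F_MALZONEK, E_IS_MARRIED_TO)
print(SUPPORT)  # 2

# A hypothetical attribute with a single, correct pair can never reach a
# support threshold above 1, which is the weakness noted on the slide.
RARE_ATTRIBUTE = {("Barack_Obama", "Michelle_Obama")}
RARE_SUPPORT = support(RARE_ATTRIBUTE, E_IS_MARRIED_TO)
print(RARE_SUPPORT)  # 1
```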
Infobox attributes mapping
$$\mathrm{confidence}(F_a, E_r) = \frac{|\mathrm{matches}(F_a, E_r)|}{|\mathrm{contrib}(F_a)|}$$
New pairs in F_małżonek:
  (pl/Krystyna_Pyrkosz, pl/Witold_Pyrkosz)
  (pl/Grażyna_Torbicka, pl/Adam_Torbicki)
  (pl/Szymon_Majewski, pl/Magda_Majewska)
Too restrictive for attributes with a lot of new facts but few matches.
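A worked example with the pairs shown on the slides, assuming that contrib(F_a) counts every subject-object pair contributed by the foreign attribute. The two matching pairs are the Obama and Presley pairs. With the original four Polish pairs, and then with the three new Polish-only pairs added:

```latex
\mathrm{confidence}(F_a, E_r) = \frac{2}{4} = 0.5
\qquad\longrightarrow\qquad
\mathrm{confidence}(F_a, E_r) = \frac{2}{4 + 3} = \frac{2}{7} \approx 0.29
```

Each new fact that English YAGO does not yet contain lowers the score, even though such facts are exactly what the multilingual extraction is meant to contribute.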
Infobox attributes mapping
Open-world assumption
$$\mathrm{pca}(F_a, E_r) = \frac{|\mathrm{matches}(F_a, E_r)|}{|\mathrm{matches}(F_a, E_r)| + |\mathrm{clashes}(F_a, E_r)|}$$
L. Galarraga, C. Teflioudi, K. Hose, F. M. Suchanek, AMIE: Association Rule Mining under Incomplete Evidence in Ontological Knowledge Bases, WWW 2013
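The PCA measure can be sketched as follows. The slides do not spell out `clashes`; this sketch assumes a pair of F_a clashes when E_r knows the subject but only with a different object, which follows the partial-completeness idea of the cited AMIE paper. Function names are hypothetical.

```python
# All seven pairs of the Polish attribute shown on the slides.
F_MALZONEK = {
    ("Barack_Obama", "Michelle_Obama"),
    ("Elvis_Presley", "Priscilla_Presley"),
    ("John_Coltrane", "Ravi_Coltrane"),
    ("pl/Izabella_Olszewska", "pl/Tadeusz_Jurasz"),
    ("pl/Krystyna_Pyrkosz", "pl/Witold_Pyrkosz"),
    ("pl/Grażyna_Torbicka", "pl/Adam_Torbicki"),
    ("pl/Szymon_Majewski", "pl/Magda_Majewska"),
}

E_IS_MARRIED_TO = {
    ("Barack_Obama", "Michelle_Obama"),
    ("Elvis_Presley", "Priscilla_Presley"),
    ("John_Coltrane", "Alice_Coltrane"),
}


def matches(f_a, e_r):
    """Pairs present on both sides."""
    return f_a & e_r


def clashes(f_a, e_r):
    """Assumed definition: the subject is known to E_r, but this pair is not."""
    known_subjects = {subject for subject, _ in e_r}
    return {
        (subject, obj)
        for subject, obj in f_a
        if (subject, obj) not in e_r and subject in known_subjects
    }


def pca(f_a, e_r):
    """matches / (matches + clashes); pairs with unknown subjects are ignored."""
    n_matches = len(matches(f_a, e_r))
    n_clashes = len(clashes(f_a, e_r))
    total = n_matches + n_clashes
    return n_matches / total if total else 0.0


CLASHES = clashes(F_MALZONEK, E_IS_MARRIED_TO)
PCA = pca(F_MALZONEK, E_IS_MARRIED_TO)
print(CLASHES)
print(round(PCA, 3))  # 0.667
```

Under this definition the four Polish-only pairs neither help nor hurt the score. Only the Ravi_Coltrane pair counts against the mapping, because English YAGO lists a different spouse for the same subject.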