


  1. Knowledge-Based Linguistic Annotation of Digital Cultural Heritage Collection
     Tuukka Ruotsalo, Lora Aroyo and Guus Schreiber
     Speaker: Chenhua
     Date: 24th Feb 2010

  2. Outline
     • Introduction
     • Motivation
     • Methodology
     • Experimental Results
     • Conclusion
     2/24/2010, Text Mining Seminar

  3. Introduction
     • Paris was painted in 1888.
     • In Paris, Van Gogh painted the work in 1888.

  4. Motivation
     Better run …

  5. Research Question
     Is there a smart way to annotate such a massive collection?

  6. Methodology
     • Background knowledge
       – Structured vocabulary
       – Enhances retrieval performance
     • Automatic annotation
       – Concept identification, e.g. Paris as a city
       – Role identification, e.g. Paris as a subject matter

  7. System Architecture
     • Phase 1: Linguistic analysis
       – Named entity tagging
       – Part of speech tagging
       – Morphological analysis
       – Dependency structure analysis
     • Phase 2: Concept Identification
     • Phase 3: Role Identification
     • Knowledge bases: ontology knowledge base, feature knowledge base
     • Output: annotation

  8. Knowledge Base
     • Art and Architecture Thesaurus (AAT)
     • Getty Thesaurus of Geographic Names (TGN)
     • Union List of Artist Names (ULAN)
     • WordNet
     • etc.

  9. Linguistic Analysis
     • Named entity tagging: persons, organizations, locations, miscellaneous NE
     • Part of speech tagging: verbs, adjectives and nouns
     • Morphological analysis: number (singular or plural)
     • Dependency structure analysis: internal dependency structure; subject, direct object
     Result: syntactic features

  10. Concept Identification
      • Define meaningful units (chunking) and map them to concepts in structured vocabularies
      • Performed differently for nouns, verbs and NEs
      • Mapping chunks, NEs and bi-words to the KB
      • Examples for matching NEs:
        – NEs tagged as persons → ULAN
        – others → WordNet
      Diagram labels: syntactic features; Phase 2: Concept Identification
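
One way to picture the matching step for named entities is as a lookup that routes each entity to a vocabulary according to its tag. The sketch below is only an illustration under that assumption, not the authors' implementation: the two dictionaries are tiny stand-ins for ULAN and WordNet, and their entries, the identifier strings, the tag name `PERSON`, and the function name `map_entity` are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical miniature stand-ins for the real vocabularies.
TOY_ULAN = {"rembrandt": "ulan:rembrandt", "van gogh": "ulan:van-gogh"}
TOY_WORDNET = {"paris": "wordnet:paris", "canvas": "wordnet:canvas"}


def map_entity(text: str, ne_tag: str) -> Optional[Tuple[str, str]]:
    """Return (vocabulary name, concept id) for a named entity, or None.

    Entities tagged as persons are looked up in the ULAN stand-in;
    every other tag falls back to the WordNet stand-in.
    """
    key = text.strip().lower()
    if ne_tag == "PERSON":
        vocab_name, vocab = "ULAN", TOY_ULAN
    else:
        vocab_name, vocab = "WordNet", TOY_WORDNET
    concept = vocab.get(key)
    return (vocab_name, concept) if concept is not None else None


print(map_entity("Rembrandt", "PERSON"))  # ('ULAN', 'ulan:rembrandt')
print(map_entity("Paris", "LOCATION"))    # ('WordNet', 'wordnet:paris')
print(map_entity("Unknown", "PERSON"))    # None
```

Returning `None` for an unmatched entity keeps the "no concept found" case explicit, so a later step can decide whether to skip the entity or try another vocabulary.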

  11. Role Identification
      • Difference between concept identification and role identification
        – “Rembrandt” is an instance of the concept “person”, independent of context
        – “Rembrandt” can take various roles, e.g. creator or subject of artworks, depending on context
      • How to do the role identification task?
        – SVM
        – Based on syntactic and semantic features
        – E.g. PoS tag, voice of the sentence verb, PoS path from the parsing constituent to the verb or predicate
      Diagram labels: Phase 2: Concept Identification; Phase 3: Role Identification; syntactic features; feature knowledge base
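
To make the feature-based setup more concrete, the sketch below turns a candidate constituent into a one-hot feature vector of the kind a classifier such as an SVM could consume. It is an illustration only: the feature names (`pos`, `voice`, `path`), the tag values, the hand-written paths, and the helper names are assumptions rather than the authors' feature set, and no classifier is trained here. The two sentences are the ones from the Introduction slide.

```python
from typing import Dict, List


def extract_features(pos_tag: str, verb_voice: str, path: List[str]) -> Dict[str, str]:
    """Collect the categorical features of one candidate constituent."""
    return {
        "pos": pos_tag,          # PoS tag of the constituent
        "voice": verb_voice,     # voice of the sentence verb
        "path": ">".join(path),  # path from the constituent to the verb
    }


def build_index(samples: List[Dict[str, str]]) -> Dict[str, int]:
    """Assign one column to every distinct feature=value pair."""
    index: Dict[str, int] = {}
    for sample in samples:
        for name, value in sorted(sample.items()):
            index.setdefault(f"{name}={value}", len(index))
    return index


def vectorize(sample: Dict[str, str], index: Dict[str, int]) -> List[int]:
    """One-hot encode a sample; unseen feature values are ignored."""
    vec = [0] * len(index)
    for name, value in sample.items():
        col = index.get(f"{name}={value}")
        if col is not None:
            vec[col] = 1
    return vec


# "Paris was painted in 1888." (Paris as subject; paths are hand-written guesses)
subject_use = extract_features("NNP", "passive", ["NNP", "NP", "S", "VP", "VBN"])
# "In Paris, Van Gogh painted the work in 1888." (Paris inside a prepositional phrase)
location_use = extract_features("NNP", "active", ["NNP", "NP", "PP", "S", "VP", "VBD"])

index = build_index([subject_use, location_use])
print(vectorize(subject_use, index))   # [1, 1, 1, 0, 0]
print(vectorize(location_use, index))  # [0, 1, 0, 1, 1]
```

Both uses of "Paris" share the same PoS tag, so only the voice and path columns separate them, which is the kind of contextual signal role identification relies on.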

  12. Evaluation
      • Using a collection of natural language descriptions of artworks
        – ARIA collection from the Rijksmuseum Amsterdam
        – 250 artworks randomly selected
        – Typical descriptions cover “what, who, where, when” and which people or culture are related to the artworks
      • Using 3 structured vocabularies (Knowledge Base)
        – AAT, TGN and ULAN, plus WordNet
      • Using an artwork annotation schema
        – Visual Resources Association (VRA), specialized for artworks

  13. Evaluation (Cont.)

  14. Experimental Results
      • Accuracy
        – Proposed method: 61.2%
        – Baseline method: 57.8%
        – Human annotator: 65.1%
      • Discussion
        – Performance close to the level of a human annotator
        – Performance better than the baseline method
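
The gaps behind the two discussion points are a matter of simple subtraction. The snippet below only restates the figures reported on the slide; the variable names are illustrative.

```python
# Accuracy figures as reported on the slide, in percent.
method, baseline, human = 61.2, 57.8, 65.1

# Differences in percentage points, rounded to one decimal place
# to avoid floating-point noise.
gain_over_baseline = round(method - baseline, 1)
gap_to_human = round(human - method, 1)

print(gain_over_baseline)  # 3.4
print(gap_to_human)        # 3.9
```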

  15. Further Discussions & Future Work
      • Co-reference resolution
      • Improved performance w.r.t. NE
      • Advanced classification strategies
      • More extensive context
      • Knowledge base and natural language processing techniques

  16. Summary
      • Given a set of objects, each accompanied by a text description, a set of structured vocabularies, a metadata schema, and a training set of annotations of the text descriptions, the method automatically produces annotations for the objects, and its performance is close to the level of a human annotator.
      • Knowledge base and natural language techniques: better performance on annotation

  17. Thanks!

  18. Appendix

  19. Metadata

  20. Feature knowledge base
