
The Europeana Use Case: Multilingual & Semantic Interoperability

The Europeana Use Case: Multilingual & Semantic Interoperability in Cultural Heritage Information Systems. Vivien Petras, Berlin School of Library and Information Science. 12 March 2013, W3C Multilingual Web Workshop.


  1. The Europeana Use Case: Multilingual & Semantic Interoperability in Cultural Heritage Information Systems
     Vivien Petras, Berlin School of Library and Information Science
     12 March 2013, W3C Multilingual Web Workshop

  2. Contents
     • Europeana: Multilingual Collections & Users
     • Multilingual Interoperability
     • Semantic Enrichment
     • Preview: New Enrichment Plans
     • Playing with Europeana Data
     Image: http://www.europeana.eu/portal/record/08535/D53FE7B7621E65A5E01E16E3D72785C68F2E2059.html

  3. Europeana
     • 15.2 million images
     • 10 million texts
     • 450,000 sound files
     • 170,000 video files
     > 2,200 institutions
     > 30 countries

  4. Europeana Multilingual Collections
     Chart: share of collections by language
     • German 18%
     • Multilingual 12%
     • French 11%
     • Dutch 10%
     • Swedish 9%
     • Spanish 8%
     • English 7%
     • Italian 6%
     • Polish 6%
     • Norwegian 6%
     • Finnish 3%
     • Danish 2%
     • Slovenian 1%
     • Hungarian 1%
     → Most Europeana objects are language-independent (e.g. images), but the metadata is multilingual.

  5. Multilingual Europeana Users
     • Native language browser: 69%
     • Native language Google (entry point): 91%
     • Native language objects: 43% (SV 77%, DE 71%)
     → Native language use increases as soon as native language content increases.
     Gäde, Maria (forthcoming). “User Behavior through the Language Glass” – Language-specific Behavior in Multilingual Digital Libraries.
     Image: http://www.europeana.eu/resolve/record/9200105/AF5C65B3CC6A71CC0E4FF6FE5AAEB4CDAA1873C9

  6. Multilingual Interface in 31 Languages
     • users seem to assume that search is affected

  7. Query Result Filtering by Language
     • language of record vs. language of content
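The distinction between the language of the record and the language of the content can be illustrated with a minimal Python sketch. The `Record` class, its field names, and the three sample records are hypothetical and do not reflect the actual Europeana schema or data; the sketch only shows why the same language filter returns different result sets depending on which field it is applied to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical record; field names are assumptions, not Europeana's schema."""
    title: str
    record_language: str             # language the metadata is written in
    content_language: Optional[str]  # language of the object itself; None for e.g. images


# Toy data invented for illustration only.
RECORDS = [
    Record("Stadtansicht (photograph)", "de", None),
    Record("Roman, catalogued in German", "de", "fr"),
    Record("German newspaper, catalogued in English", "en", "de"),
]


def filter_by_record_language(records: List[Record], language: str) -> List[Record]:
    """Keep records whose metadata is written in the given language."""
    return [r for r in records if r.record_language == language]


def filter_by_content_language(records: List[Record], language: str) -> List[Record]:
    """Keep records whose object content is in the given language."""
    return [r for r in records if r.content_language == language]


by_record = [r.title for r in filter_by_record_language(RECORDS, "de")]
by_content = [r.title for r in filter_by_content_language(RECORDS, "de")]
print(by_record)
print(by_content)
```

In this toy data set, a filter for German returns two items when applied to the record language and a different, single item when applied to the content language.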

  8. Document Translation
     • general MT – not domain-specific

  9. Query Translation – Planned for 2013
     • How many languages?
     • How much user interaction?

  10. Semantic Enrichment
     • concept (GEMET Thesaurus), agent (DBpedia), period (Semium time ontology), place (Geonames)

  11. Poisonous India …

  12. Enrichment Challenges
     • Metadata quality & sparsity
     • Vocabulary ambiguity
       – domain: GEMET print → (German) Druck → pressure
       – language: electrical power → (German) Strom → (Czech) strom → tree
       – context: Córdoba = Spain | Argentina
     Olensky, M., Stiller, J., Dröge, E. (2012). Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy. In: Proc. of MTSR 2012: Metadata and Semantics Research Conference, Nov. 2012, Cádiz, Spain.
     Image: http://www.europeana.eu/portal/record/03919/FCD38BDE7A03579F24BEDA5D157943B75BB36F11.html
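The language ambiguity on this slide (Strom / strom) can be reproduced with a small Python sketch. It assumes a toy vocabulary with invented `toy:` identifiers, not real GEMET concepts, and it is not Europeana's enrichment pipeline. It only contrasts matching a metadata term against labels in all languages with matching against labels in the record's own language.

```python
from typing import Dict, List, NamedTuple


class Concept(NamedTuple):
    concept_id: str         # hypothetical identifier, not a real GEMET URI
    labels: Dict[str, str]  # language code -> preferred label


# Toy vocabulary built from the slide's example; not real thesaurus data.
VOCABULARY = [
    Concept("toy:electrical-power", {"en": "electrical power", "de": "Strom"}),
    Concept("toy:tree", {"en": "tree", "de": "Baum", "cs": "strom"}),
]


def enrich_naive(term: str) -> List[str]:
    """Match the term against labels in every language, ignoring case."""
    needle = term.casefold()
    return [
        c.concept_id
        for c in VOCABULARY
        if any(label.casefold() == needle for label in c.labels.values())
    ]


def enrich_language_aware(term: str, language: str) -> List[str]:
    """Match the term only against labels in the record's own language."""
    needle = term.casefold()
    return [
        c.concept_id
        for c in VOCABULARY
        if language in c.labels and c.labels[language].casefold() == needle
    ]


naive = enrich_naive("strom")
czech = enrich_language_aware("strom", "cs")
german = enrich_language_aware("Strom", "de")
print(naive, czech, german)
```

With this toy vocabulary, the naive matcher links a Czech record about trees to the electrical power concept as well, while the language-aware matcher keeps the two apart. The second approach depends on the record language being known, which ties back to the metadata quality issue listed on the slide.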

  13. Preview: New Enrichment Plans
     → transition to linked data-based Europeana Data Model (EDM)
     • links to contextual vocabularies from providers
     • enrich during ingestion

  14. Playing with Europeana Data
     • CHiC: Cultural Heritage in CLEF
       → Europeana data (XML) & queries / 13 languages
       → ad-hoc retrieval / semantic enrichment tasks
       → Submission deadline: 14 April 2013
       → http://www.culturalheritageevaluation.org
     • Europeana Linked Open Data
       → RDF file dumps in EDM (Europeana Data Model)
       → SPARQL endpoint
       → CC0 open license
       → http://data.europeana.eu/
     • Contact: vivien.petras@ibi.hu-berlin.de
     Image: http://www.europeana.eu/resolve/record/03486/DF559A7721E55BAE5BF5095FB9AA55406C0269C4
