
TIB|AV-Portal: Challenges managing audiovisual metadata encoded in RDF

Jörg Waitelonis (yovisto GmbH); Margret Plank (German National Library of Science and Technology (TIB), Hannover); Prof. Dr. Harald Sack (HPI-Potsdam / FIZ Karlsruhe & KIT). Presented at SWIB16.


  1. TIB|AV-Portal: Challenges managing audiovisual metadata encoded in RDF
     Jörg Waitelonis, yovisto GmbH
     Margret Plank, German National Library of Science and Technology (TIB), Hannover
     Prof. Dr. Harald Sack, HPI-Potsdam / FIZ Karlsruhe & KIT
     SWIB16, Semantic Web in Libraries Conference 2016, 28-30 November 2016, Bonn, Germany | http://swib.org/
     Jörg Waitelonis, yovisto GmbH, Semantic Web in Libraries Conference 2016, 28-30th November 2016, Bonn, Germany

  2. WELCOME SWIB16, Semantic Web in Libraries
     yovisto: Jörg Waitelonis, Christian Hentschel, Prof. Dr. Harald Sack
     What we do:
     ■ Intelligent Linked Data-, Ontology- and Metadata-Management
     ■ Knowledge Discovery & Knowledge Mining
     ■ Video- & Image Analysis, User Interfaces, Visualization
     Based in: August Bebel Str. 26-53, 14482 Potsdam, Germany

  3. Developed in cooperation by:
     ■ German National Library of Science and Technology (TIB), Hannover
     ■ Hasso-Plattner-Institute for IT-Systems Engineering (HPI), Potsdam
     Hosted and maintained by:
     ■ yovisto GmbH, Potsdam
     ■ flowworks GmbH, München (Asset Management, Playout)
     http://av.tib.eu/

  4. > 8000
     ■ Lectures
     ■ Conference talks
     ■ Interviews
     ■ Simulations
     ■ Visualizations
     ■ Research Data
     for
     ■ Scientists
     ■ Lecturers
     ■ Teachers
     ■ Learners
     http://av.tib.eu/

  5. TIB|AV-Portal (architecture diagram)
     Actors: Users, Customers, Uploader; TIB Curators
     Actions: Video Upload, Search, View; Manage Metadata (☑ approved); DOI, QA, Right clearance
     Components: Ingest, Workflow Management, Media Asset Management, Streaming, Search Index, RDF Triplestore
     AV-Analysis:
     ■ Video Segmentation
     ■ Optical Character Recognition (OCR)
     ■ Speech-to-text (ASR)
     ■ Visual Concept Detection (VCD)
     Semantic Analysis:
     ■ Context Modelling
     ■ Named Entity Linking
     http://av.tib.eu/

  6. TIB|AV-Portal: Semantic metadata analysis with Named Entity Linking
     Textual Metadata
     Authoritative:
     ■ Formal, descriptive, technical
     ■ E.g. title, description, keywords, etc.
     ■ Manually authored
     ■ Refers to entire video (coarse grained)
     Non-authoritative:
     ■ E.g. ASR/OCR-transcripts
     ■ Automatically extracted
     ■ Refers to fragments of the video (fine grained)
     Both kinds of metadata are mapped to the knowledge base: 63,356 GND subject headings
     GND = Gemeinsame Normdatei (Integrated authority file), incl. English translations from mappings to DBpedia, LCSH, MACS and WTI Thesaurus
     [1] Sven Strobel, Paloma Marín-Arraiza: Metadata for Scientific Audiovisual Media: Current Practices and Perspectives of the TIB|AV-Portal, In Proc. of Metadata and Semantics Research: 9th Research Conference, MTSR 2015, Manchester, UK, 2015, Springer

  7. http://dx.doi.org/10.5446/357#t=49:03,53:58

  8. TIB|AV-Portal: The Data Model & RDF-Export
     Why RDF?
     ■ Extensible
     ■ Different serialization forms
     ■ Interoperable
     ■ Queryable (SPARQL)
     ■ W3C standard
     How?
     ■ Vocabulary selection
     ■ Problem: heterogeneous metadata
       ○ authoritative, spatio-temporal, nested annotations
     ■ Vocab discovery -> http://lov.okfn.org/
     ■ Selection criteria
     [2] Jörg Waitelonis, Margret Plank, Harald Sack, TIB|AV-Portal: Integrating Automatically Generated Video Annotations into the Web of Data, in Proc. of 20th International Conference on Theory and Practice of Digital Libraries (TPDL 2016)

  9. TIB|AV-Portal: The Data Model & RDF-Export, Vocabulary Selection
     Issues:
     ■ Availability on the Web
     ■ Openness
     ■ Level of complexity/richness
     ■ Maintained
     ■ Trustworthy authorship
     ■ Usage by others / popularity
     ■ Documentation
     ■ Adequate meaning
     ■ Specificity
     ■ Datatypes
     ■ Avoid contradictions, e.g.
       ○ Domain & range / sub- & super-class
       ○ Datatype vs. object properties
     ■ Does it fit currently used models
     cf. http://wiki.dublincore.org/index.php/Vocabulary_evaluation,_selection_and_re-use

  10. TIB|AV-Portal: The Data Model & RDF-Export, Standard Metadata and Basic Structure
      ■ DCMI Metadata Terms -> http://purl.org/dc/terms/
      ■ DCMI Type Vocabulary -> http://purl.org/dc/dcmitype/
      ■ schema.org Vocabulary -> http://schema.org/
      ■ Friend of a Friend Vocabulary 0.1 -> http://xmlns.com/foaf/
      ■ Bibframe Vocabulary -> http://bibframe.org/vocab/

      tib:video/16453
          schema:name "Wall-crossing and geometry at infinity of Betti moduli spaces"@en ;
          schema:description "Linear algebraic differential equation (in one variable) ..."@en ;
          schema:keywords "Betti moduli"@en , "chaos theory"@en , "singularity"@en ;
          schema:dateCreated "1973-01-01T00:00:00+01:00"^^<http://www.w3.org/2001/XMLSchema#gYear> ;
          schema:duration 1:16:48 ;
          rdf:type schema:Movie ;
          schema:url <https://av.tib.eu/media/16453> ;
          schema:producer gnd:4028361-6 ;
          schema:publisher tib:Institut_des_Hautes__tudes_Scientifiques_%28IH_S%29 ;
          schema:license <http://creativecommons.org/licenses/by/3.0/deed.en> ;
          schema:availability schema:OnlineOnly ;
          bibframe:doi <http://dx.doi.org/10.5446/16453> ;
          schema:thumbnailUrl <https://av.tib.eu/images/avpimg1fdaede78b338bba137140fd805cd382> .

  11. TIB|AV-Portal: The Data Model & RDF-Export, Spatio-temporal Metadata
      ■ Open Annotation Data Model (OA) -> http://w3.org/ns/oa#
      ■ NLP Interchange Format (NIF) -> http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#

      tib:video/16453#t=smpte-25:0:05:00:22,0:05:03:00
          dcterms:isPartOf tib:video/16453 .

      tib:asr/16453_13753838_7522
          oa:hasTarget tib:video/16453#t=smpte-25:0:05:00:22,0:05:03:00 ;
          oa:annotatedBy tib:annotator/ASR-1.0.0 ;
          rdf:type oa:Annotation ;
          oa:hasBody tib:asr/16453_13753838_7522#char=0,5617 .

      tib:asr/16453_13753838_7522#char=0,5617
          rdf:type nif:Context ;
          rdf:type nif:RFC5147String ;
          nif:isString "... five sets ..." .

      tib:asr/16453_13753838_7522#char=4743,4747
          nif:referenceContext tib:asr/16453_13753838_7522#char=0,5617 ;
          itsrdf:taIdentRef gnd:4038613-2 ;
          itsrdf:taAnnotatorsRef tib:annotator/NEL-1.0.0 ;
          rdf:type nif:Phrase ;
          rdf:type nif:String ;
          nif:beginIndex "4743" ;
          nif:endIndex "4747" ;
          nif:anchorOf "sets" .
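The annotation target on this slide uses an SMPTE timecode fragment (`#t=smpte-25:0:05:00:22,0:05:03:00`), i.e. hours, minutes, seconds and frames at 25 frames per second. A minimal sketch of how such a fragment could be converted into frame counts and seconds. The names `timecode_to_frames` and `parse_smpte_fragment` are hypothetical, and the regular expression only covers the exact form seen on the slide.

```python
import re

# Matches '#t=smpte-<fps>:<start>,<end>' at the end of a URI.
SMPTE_FRAGMENT = re.compile(
    r"#t=smpte-(?P<fps>\d+):(?P<start>[\d:]+),(?P<end>[\d:]+)$"
)


def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert an 'h:mm:ss:ff' timecode into an absolute frame count."""
    hours, minutes, seconds, frames = (int(p) for p in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def parse_smpte_fragment(uri: str) -> dict:
    """Return frame and second offsets for an SMPTE temporal fragment."""
    match = SMPTE_FRAGMENT.search(uri)
    if match is None:
        raise ValueError(f"no SMPTE fragment in {uri!r}")
    fps = int(match["fps"])
    start = timecode_to_frames(match["start"], fps)
    end = timecode_to_frames(match["end"], fps)
    return {
        "fps": fps,
        "start_frame": start,
        "end_frame": end,
        "start_s": start / fps,
        "end_s": end / fps,
    }


segment = parse_smpte_fragment("tib:video/16453#t=smpte-25:0:05:00:22,0:05:03:00")
print(segment)
```

For the slide's example the start frame comes out as 7522, which equals the trailing number of the annotation identifier `tib:asr/16453_13753838_7522`. Whether the identifiers are actually derived that way is not stated in the slides.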

  12. TIB|AV-Portal: The Data Model & RDF-Export, Open Annotation (graph diagram)
      tib:asr/16453_13753838
          rdf:type oa:Annotation ;
          oa:hasTarget tib:video/16453#t=smpte-25:0:23:12:12,0:23:14:4 ;
          oa:annotatedBy tib:annotator/ASR-1.0.0 ;
          oa:hasBody “... the astronaut …”
      Also shown in the diagram: gnd:11896416X

  13. TIB|AV-Portal: The Data Model & RDF-Export, NLP Interchange Format (NIF) (graph diagram)
      tib:asr/16453_13753838
          rdf:type oa:Annotation ;
          oa:hasTarget tib:video/16453#t=smpte-25:0:23:12:12,0:23:14:4 ;
          oa:annotatedBy tib:annotator/ASR-1.0.0 ;
          oa:hasBody tib:asr/16453_13753838#char=0,62

      tib:asr/16453_13753838#char=0,62
          rdf:type nif:Context ;
          rdf:type nif:RFC5147String ;
          nif:isString “... the astronaut …”

      tib:asr/16453_13753838#char=23,32
          nif:referenceContext tib:asr/16453_13753838#char=0,62 ;
          rdf:type nif:Phrase ;
          rdf:type nif:String ;
          nif:anchorOf “astronaut” ;
          nif:beginIndex 23 ;
          nif:endIndex 32 ;
          itsrdf:taIdentRef gnd:11896416X ;
          itsrdf:taAnnotatorsRef tib:annotator/NEL-1.0.0

      http://av.tib.eu/opendata
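In the NIF model on this slide, both the transcript and each linked mention are identified by character offsets appended to the annotation URI (`#char=begin,end`). A minimal sketch of how those offsets and URIs could be derived for one mention. The helper `nif_phrase` and the sample transcript text are hypothetical (the slide elides the real transcript), and the sketch only handles the first occurrence of the mention.

```python
def nif_phrase(annotation_uri: str, context: str, mention: str) -> dict:
    """Locate `mention` in `context` and build NIF-style offset URIs.

    Offsets are zero-based with an exclusive end, so that
    context[begin:end] equals the anchored text.
    """
    begin = context.find(mention)
    if begin < 0:
        raise ValueError(f"{mention!r} does not occur in the context")
    end = begin + len(mention)
    return {
        "context_uri": f"{annotation_uri}#char=0,{len(context)}",
        "phrase_uri": f"{annotation_uri}#char={begin},{end}",
        "beginIndex": begin,
        "endIndex": end,
        "anchorOf": context[begin:end],
    }


# Hypothetical transcript snippet, not the text from the portal.
text = "and then the astronaut opened the hatch"
phrase = nif_phrase("tib:asr/16453_13753838", text, "astronaut")
for key, value in phrase.items():
    print(f"{key}: {value}")
```

The same convention is visible on the slide itself: `#char=23,32` spans nine characters, the length of “astronaut”.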

  14. TIB|AV-Portal: Data Quality
      Standard metadata:
      ■ Concise manual verification and clearing by TIB subject specialists
      -> use the verified information to improve subsequent analysis
      Automatically created metadata:
      ■ AV analysis
        ○ typical detection errors (ASR, OCR, etc.)
      ■ Semantic analysis
        ○ missing annotations
        ○ wrong annotations
        ○ knowledgebase errors and insufficiencies

  15. TIB|AV-Portal: Data Quality, Video Text Recognition
      Example video: Title: "Lecture on Science and Creativity", Author: “Kroto, Harold” (http://dx.doi.org/10.5446/15907)
      Improving OCR: Extend OCR vocabulary (per video) with
      ■ subject specific terminology
      ■ terminology from manually verified metadata
      OCR detects “Kyoto”
      Before OCR:
      ■ extend the OCR language model & subsequent spell-correction with terms from authoritative metadata (e.g.: Creativity, Harold, Kroto, Lecture, Science)
      ➥ OCR now detects “Kroto”.
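The spell-correction half of this idea can be illustrated with a small sketch: build a per-video vocabulary from the authoritative title and author fields, then snap OCR tokens to a close vocabulary term. This is not the portal's implementation. It uses Python's `difflib` as a stand-in for the spell-corrector, does not touch the OCR language model at all, and the names `build_vocabulary`, `correct_token` and the `cutoff` value are assumptions made for illustration.

```python
import difflib
import re


def build_vocabulary(*metadata_fields: str, min_length: int = 4) -> list[str]:
    """Collect distinct words from authoritative metadata fields.

    Words shorter than `min_length` (e.g. 'on', 'and') are skipped.
    """
    terms = set()
    for field in metadata_fields:
        for token in re.findall(r"[^\W\d_]+", field):
            if len(token) >= min_length:
                terms.add(token)
    return sorted(terms)


def correct_token(token: str, vocabulary: list[str], cutoff: float = 0.75) -> str:
    """Replace an OCR token by its closest vocabulary term, if close enough."""
    if token in vocabulary:
        return token
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


vocabulary = build_vocabulary("Lecture on Science and Creativity", "Kroto, Harold")
ocr_tokens = ["Harold", "Kyoto", "Physics"]
corrected = [correct_token(token, vocabulary) for token in ocr_tokens]
print(vocabulary)
print(corrected)
```

With the slide's metadata the vocabulary contains Creativity, Harold, Kroto, Lecture and Science, and the misread “Kyoto” is mapped to “Kroto” while the unrelated “Physics” is left alone. A correction this naive would also rewrite a genuine mention of the city of Kyoto, which is one reason to restrict the vocabulary to a single video.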
