
A PDF Storage Backend for Protégé, Henrik Eriksson, Linköping University

  1. A PDF Storage Backend for Protégé Henrik Eriksson Linköping University

  2. Storage of the Pizza example: project and ontology files
     [The slide shows excerpts of two files side by side; the extracted text interleaved them. Recoverable content:]
     • pizza.owl.pprj (Protégé project file, CLIPS-like syntax): saved Mon Feb 13 11:09:16 GMT 2006 with version "3.2" and build "Build 243". It holds Property_List, Widget, Integer, and String instances, for example [CLSES_TAB] (label "Classes", widget_class_name "edu.stanford.smi.protege.widget.ClsesTab"), [FORMS_TAB] (label "Forms"), owl_file_language "RDF/XML-ABBREV", and owl_namespace "http://owl.protege.stanford.edu".
     • pizza.owl (RDF/XML): xml:base="http://www.co-ode.org/ontologies/pizza/2005/10/18/pizza.owl", owl:versionInfo "version 1.3". The excerpt defines VegetarianPizzaEquivalent2 (a Pizza whose hasTopping values all come from the union of FruitTopping, HerbSpiceTopping, NutTopping, SauceTopping, VegetableTopping, and CheeseTopping) and PepperTopping (disjoint with MushroomTopping, LeekTopping, TomatoTopping, and GarlicTopping).
     (Slide footer date: 2006-07-25)

  3. How do you package an ontology?
     • Gift wrapping?
     • Files: .owl, .pprj, .pont, .pins
     • Document packaging

  4. Persistent storage in Protégé
     • Files
       - Serialization
       - Protégé Frames: CLIPS-like/XML
       - Protégé OWL: XML-based
     • Databases
     Slide callouts: voluminous, verbose, slow parsing & writing, multiple files (e.g., .pprj, .owl).
     There is a storage problem here.

  5. Background: Semantic Documents
     • Combining documents with knowledge representation
       - Like the semantic web, but for “real” documents
     • Problem: Large amounts of information are available electronically, but it is
       - difficult to find the right information when the search query is complex, and
       - difficult to navigate content-rich information.
     • Goal
       - Semantic description of document content (i.e., a meta-model for documents)
       - Support for systematic authoring of complex electronic documents
       - Adding support for PDF to Protégé: a PDF tab for Protégé

  6. One Document, Many Applications
     One format for all applications

  7. Semantic Documents
     • Knowledge representation
       - Semantic web: OWL
       - Ontologies
     • Document models
       - Adobe’s Portable Document Format (PDF)
       - Extensible Metadata Platform (XMP)
       - MS Word, RTF (?)
     • Functions
       - Semantic search based on metadata
       - Reasoning, inference
     [Diagram labels: documents (PDF) with XMP markup, document database, document retrieval, statistics, semantic search, reasoning engine, report publication.]

  8. PDFTab: Annotation tool for Protégé
     [Figure labels: annotation tool, Protégé, Adobe Acrobat (PDF).]

  9. Lightweight semantic documents
     • Semantic documents are nice, but
       - sometimes too heavy
       - advanced tools required (heavy)
     • The PDF backend provides
       - a new save method
       - a compact storage format
       - storage using standard PDF attachments
       - file access through standard PDF tools (e.g., Acrobat)

  10. PDF Attachments
      • Little-known feature of PDF
      • Just like e-mail attachments

  11. The “Secrets” of the Portable Document Format (PDF)
      • Open and documented format
      • PDF files contain something like a file system
        - Indexing for fast random access
        - Like the .doc format of MS Word
      • Extendible file layout
        - Custom additions
      • Different objects and streams with support for text, binary data, compression, and encryption
      [Diagram labels: Document (PDF), Objects, Streams, Metadata, Pages, Index (xref).]

  12. Internal PDF Structure
      [Tree diagram, reconstructed from its labels:]
      Document Root/Catalog
      • Pages: Contents
      • Outlines
      • Metadata: XMP
      • Names: Embedded files
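
The structure on slides 11 and 12 can be illustrated with a small, standard-library-only Python sketch. It is not the pdfbox-based implementation the talk describes; it is a minimal illustration, under stated assumptions, of how a catalog can point through a Names entry to an embedded file stream, and how the xref index allows direct access to one object. The function names (`build_pdf_with_attachment`, `object_offset`, `extract_attachment`), the fixed object numbering, and the sample file name are all hypothetical, and the output has not been validated against every PDF viewer.

```python
import re
import zlib


def build_pdf_with_attachment(name: str, payload: bytes) -> bytes:
    """Write a one-page PDF whose catalog lists `payload` as an embedded file."""
    compressed = zlib.compress(payload)
    fname = name.encode("ascii")  # assumption: no '(', ')' or '\' in the name
    objects = [
        # 1: catalog, with a Names dictionary pointing at the file specification
        b"<< /Type /Catalog /Pages 2 0 R /Names << /EmbeddedFiles "
        b"<< /Names [(" + fname + b") 4 0 R] >> >> >>",
        # 2-3: the smallest possible page tree (one blank page)
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>",
        # 4: file specification; 5: the compressed stream holding the file bytes
        b"<< /Type /Filespec /F (" + fname + b") /EF << /F 5 0 R >> >>",
        b"<< /Type /EmbeddedFile /Filter /FlateDecode /Length %d >>\nstream\n"
        % len(compressed) + compressed + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset recorded for the xref index
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def object_offset(pdf: bytes, num: int) -> int:
    """Look up an object's byte offset through the xref index (random access)."""
    tail = pdf.rsplit(b"startxref\n", 1)[1]
    xref_pos = int(tail.split(b"\n", 1)[0])
    # Skip the "xref" line and the "0 N" subsection header line.
    first_entry = pdf.index(b"\n", xref_pos + 5) + 1
    entry = pdf[first_entry + 20 * num: first_entry + 20 * (num + 1)]
    return int(entry[:10])


def extract_attachment(pdf: bytes, num: int = 5) -> bytes:
    """Read back the embedded file, assuming it is object `num` as written above."""
    start = object_offset(pdf, num)
    data_start = pdf.index(b"stream\n", start) + len(b"stream\n")
    length = int(re.search(rb"/Length (\d+)", pdf[start:data_start]).group(1))
    return zlib.decompress(pdf[data_start:data_start + length])
```

A round trip such as `extract_attachment(build_pdf_with_attachment("pizza.owl", data))` returns the original bytes. A real backend would rely on a PDF library to handle multiple attachments, name escaping, and incremental updates rather than hand-writing objects as done here.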

  13. Storage backend
      Inserting ontologies in documents

  14. Experimental implementation
      • New knowledge base format/project type

  15. Resulting PDF document

  16. Scenarios
      • Generated documents
        [Diagram labels: ontology development, testing & validation, revising (in Protégé); save; PDF generation; document publication.]
      • Authored documents
        [Diagram labels: authoring, editing, PDF conversion; ontology development, testing & validation, revising (in Protégé); save; document publication.]

  17. Discussion
      • Architecture for storage (packaging) formats
        - Other formats possible
        - Examples: zip, tar, tgz, …
      • Implementation issues
        - Currently a “research prototype”
        - API changes/additions/debugging required
          • pdfbox, OWL plug-in, Protégé core
        - One PDF kb format required for each major storage type
          • Example: PDF-Protégé-Frames, PDF-Protégé-OWL, PDF-Protégé-RDFS
          • Should really be separated into a general PDF filter (more API changes required)
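
The “other formats possible” point can be sketched with zip as the container instead of PDF. This is a hypothetical Python illustration using only the standard `zipfile` module, not part of the prototype described in the talk; the function names `package_project` and `unpack_project` and the sample file names are assumptions made for the example.

```python
import io
import zipfile
from typing import Dict


def package_project(files: Dict[str, bytes]) -> bytes:
    """Bundle several project files (e.g. .pprj and .owl) into one zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)  # one archive member per project file
    return buf.getvalue()


def unpack_project(blob: bytes) -> Dict[str, bytes]:
    """Recover the individual project files from a packaged archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


if __name__ == "__main__":
    project = {
        "pizza.owl.pprj": b'(version "3.2")\n',
        "pizza.owl": b'<?xml version="1.0"?>\n<rdf:RDF/>\n',
    }
    archive = package_project(project)
    print(sorted(unpack_project(archive)))
```

Compared with the PDF backend, a zip package addresses the multiple-file problem and adds compression, but it gives up the human-readable document that travels with the ontology.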

  18. Summary
      • Semantic documents
        - Combine printable documents with ontologies and knowledge bases
        - Combined documentation (human-readable) and reasoning (machine-readable)
        - One document with several applications
      • PDF storage backend
        - Lightweight semantic documents
        - Attaching ontology files to PDF documents
        - Straightforward access from Acrobat
