
Thoughts on Validating RDF Healthcare Data, by David Booth, Ph.D.



  1. Thoughts on Validating RDF Healthcare Data
     David Booth, Ph.D., KnowMED, Inc.
     2013 W3C RDF Validation Workshop
     Latest version of these slides: http://dbooth.org/2013/validation/dbooth-slides.pdf

  2. Why RDF? Schema promiscuous
     [Diagram: one graph shared by a Green Model, a Blue Model, and a Red Model. Terms: HomePhone, Town, ZipPlus4, FullName, Country, Address, FirstName, LastName, Email, City, ZipCode. Links between terms: hasFirst, hasLast, sameAs, subClassOf.]
     Multiple models peacefully coexist

  3. Why RDF? Schema promiscuous
     • What the Blue app sees:
     [Diagram: the same three-model graph with the Blue Model's terms highlighted: Country, Address, FirstName, LastName, Email, City, ZipCode.]

  4. Why RDF? Schema promiscuous
     • What the Red app sees:
     [Diagram: the same three-model graph with the Red Model's terms highlighted: HomePhone, Town, ZipPlus4, FullName, Country.]

  5. Why RDF? Schema promiscuous
     • What the Green app sees:
     [Diagram: the same three-model graph with the Green Model's terms highlighted: HomePhone, Town, ZipPlus4, Country, FirstName, LastName, Email.]
     Need multiple validation perspectives on the same data!

  6. Data producers and consumers
     [Diagram: data flowing from producers to consumers; nodes labeled A, B, C and Red, Blue, Green.]

  7. Two perspectives of validation
     • Producers: Model integrity
       – Is the data well formed? (Sanity check)
       – Does it contain what I promised?
     • Consumers: Suitability for use
       – Does the data meet my needs?
       – Different consumers have different needs!
     Need multiple validation perspectives on the same data!
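The idea of several validation perspectives over one dataset can be illustrated with a minimal sketch. It is plain Python standing in for an RDF store, not anything from the slides: triples are tuples, and the `ex:`, `blue:`, and `red:` names, the `PERSPECTIVES` table, and the `validate` helper are all hypothetical.

```python
# Triples are plain (subject, predicate, object) tuples; all names are hypothetical.
DATA = [
    ("ex:p1", "blue:FirstName", "Ann"),
    ("ex:p1", "blue:LastName", "Lee"),
    ("ex:p1", "red:FullName", "Ann Lee"),
    ("ex:p2", "red:FullName", "Bob Roy"),
]

# Each perspective names the predicates it requires on every subject.
PERSPECTIVES = {
    "blue": {"blue:FirstName", "blue:LastName"},
    "red": {"red:FullName"},
}


def validate(triples, required):
    """Return {subject: sorted missing predicates} for subjects that fall short."""
    seen = {}
    for subject, predicate, _ in triples:
        seen.setdefault(subject, set()).add(predicate)
    return {
        subject: sorted(required - predicates)
        for subject, predicates in seen.items()
        if required - predicates
    }


# The same data is checked once per perspective.
reports = {name: validate(DATA, required) for name, required in PERSPECTIVES.items()}
for name, report in reports.items():
    print(name, report)
```

Here the same four triples are acceptable to the hypothetical Red consumer but incomplete for the Blue one, which is the situation the slide describes.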

  8. Features I'd like to see ...

  9. Feature 1: SPARQL-based framework
     • Fewer languages == easier maintenance
     • Nice to either:
       – Build on SPARQL, or
       – Use from SPARQL
     • BUT if a new language were very concise and powerful, I'd jump on it.

  10. Feature 2: Validation pipelines
      • Simpler to write a series of SPARQL UPDATE operations than one big query
      • Want standard ways to define validation pipelines
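A rough sketch of the pipeline idea follows. It uses plain Python functions over a set of triples in place of SPARQL UPDATE operations, so it shows only the shape of such a pipeline; the step names, the `ex:zipCode` and `ex:violation` predicates, and the five-digit rule are assumptions made for illustration.

```python
def normalize_zip(graph):
    """Step 1: strip surrounding whitespace from zip code literals."""
    return {(s, p, o.strip() if p == "ex:zipCode" else o) for s, p, o in graph}


def flag_bad_zip(graph):
    """Step 2: add an ex:violation triple for each zip code that is not 5 digits."""
    flagged = {
        (s, "ex:violation", "bad zip: " + o)
        for s, p, o in graph
        if p == "ex:zipCode" and not (len(o) == 5 and o.isdigit())
    }
    return graph | flagged


def run_pipeline(graph, steps):
    """Apply each step in order; every step takes a graph and returns a new one."""
    for step in steps:
        graph = step(graph)
    return graph


graph = {
    ("ex:p1", "ex:zipCode", " 02139 "),
    ("ex:p2", "ex:zipCode", "2139"),
}
result = run_pipeline(graph, [normalize_zip, flag_bad_zip])
violations = sorted(t for t in result if t[1] == "ex:violation")
print(violations)
```

Each step stays small enough to read and test alone, and the list passed to `run_pipeline` is the kind of thing a standard pipeline definition would need to express.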

  11. Feature 3: Better URI pattern matching and munging
      • Often need to generate URIs from natural keys
      • Want easier mechanisms for:
        – Checking URI patterns
        – Detecting misspellings
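The three URI tasks named above can be sketched with the Python standard library. This is an illustration only: the `http://example.org/` namespace, the vocabulary terms, and the helper names `mint_uri`, `matches_pattern`, and `suggest` are hypothetical, and fuzzy string matching is just one possible way to catch misspellings.

```python
import difflib
import re

BASE = "http://example.org/patient/"  # hypothetical namespace
PATIENT_URI = re.compile(r"http://example\.org/patient/[a-z0-9-]+")

# Hypothetical vocabulary against which term URIs are checked.
KNOWN_TERMS = [
    "http://example.org/vocab/firstName",
    "http://example.org/vocab/lastName",
]


def mint_uri(natural_key):
    """Build a URI from a natural key such as a medical record number."""
    slug = re.sub(r"[^a-z0-9]+", "-", natural_key.lower()).strip("-")
    return BASE + slug


def matches_pattern(uri):
    """Check that a URI follows the expected patient URI pattern."""
    return PATIENT_URI.fullmatch(uri) is not None


def suggest(term):
    """Return the closest known term for an unknown, possibly misspelled URI."""
    if term in KNOWN_TERMS:
        return []
    return difflib.get_close_matches(term, KNOWN_TERMS, n=1, cutoff=0.8)


uri = mint_uri("MRN 00123")
print(uri, matches_pattern(uri))
print(suggest("http://example.org/vocab/firstNmae"))
```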

  12. Feature 4: Validation like automated regression testing
      • Lots of small, independent tests over one big one
        – E.g., one file per test
        – Contrast big ontology approach
      • Goals:
        – Easy to add a new test
        – Can test anything
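One way to picture the regression-testing style is a registry of small checks that run independently. The sketch below keeps everything in one block for brevity, where the slide suggests one file per test; the `validation_test` decorator, the test names, and the `ex:` terms are hypothetical.

```python
TESTS = {}


def validation_test(func):
    """Register a small, independent check under its function name."""
    TESTS[func.__name__] = func
    return func


@validation_test
def every_patient_has_birth_date(graph):
    patients = {s for s, p, o in graph if p == "rdf:type" and o == "ex:Patient"}
    with_date = {s for s, p, _ in graph if p == "ex:birthDate"}
    return sorted(patients - with_date)


@validation_test
def no_empty_literals(graph):
    return sorted(s for s, _, o in graph if o == "")


def run_all(graph):
    """Run every registered test; a failing or crashing test never blocks the rest."""
    results = {}
    for name, test in TESTS.items():
        try:
            offenders = test(graph)
            results[name] = "PASS" if not offenders else f"FAIL: {offenders}"
        except Exception as exc:
            results[name] = f"ERROR: {exc}"
    return results


GRAPH = {
    ("ex:p1", "rdf:type", "ex:Patient"),
    ("ex:p1", "ex:birthDate", "1970-01-01"),
    ("ex:p2", "rdf:type", "ex:Patient"),
}
results = run_all(GRAPH)
for name, outcome in results.items():
    print(name, outcome)
```

Adding a test means adding one decorated function, and because each test returns its own list of offending subjects, it can check anything that can be computed from the graph.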

  13. Feature 5: Operational versus declarative
      • Declarative is convenient for very simple tests, e.g., pattern matching
      • Operational is easier for more complex tests, e.g.:
        – "Do A, then B, then C, then result should be X"
      • Note: SPARQL UPDATE operations can be used this way

  14. Summary
      • SPARQL-based
      • Or something else that is powerful and concise
