Provenance as a Building Block for an Open Science Infrastructure



slide-1
SLIDE 1

Provenance as a Building Block for an Open Science Infrastructure

Andreas Schreiber
German Aerospace Center (DLR), Cologne/Berlin, Germany
ISGC 2018, Taipei, Taiwan

> ISGC 2018 > A. Schreiber • Provenance as a Building Block for an Open Science Infrastructure > 23.03.2018 DLR.de • Chart 1

slide-2
SLIDE 2

Topics

  • Reproducibility
  • Provenance and PROV
  • Storing provenance
  • Gathering provenance


slide-3
SLIDE 3

Reproducibility

Reproducibility in (data) science is based on

  • Open Source Software
  • Code Reviews
  • Code Repositories
  • Publications with code
  • Containers (Docker etc.)
  • Workflows
  • (Electronic) laboratory notebooks
  • Open data formats
  • Data management
  • Metadata and Provenance


slide-4
SLIDE 4

Provenance

Basics

  • Provenance refers to the source of information and the process that led to its existence
  • Where did I get this file?
  • How did it come to exist?
  • Provenance information is critical to users trying to understand where a particular data file came from


Other and related terms

  • Traceability
  • Lineage
  • Logging
  • Monitoring
slide-5
SLIDE 5

Provenance Information

Capture, archive, and distribute provenance information, for example

  • The source of all externally supplied data files
  • The source of the algorithms used to transform the data within the system
  • The algorithm design documents
  • A complete description of the processing environment
  • A complete description of the processing framework
  • A record of each job’s execution


slide-6
SLIDE 6


Data Science Workflows

slide-7
SLIDE 7

More Formal Definition of Provenance

Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.

PROV W3C Working Group https://www.w3.org/TR/prov-overview


slide-8
SLIDE 8

W3C Specification "PROV"

  • PROV-O, the PROV ontology, an OWL2 ontology allowing the mapping of the PROV data model to RDF

  • PROV-DM, the PROV data model for provenance
  • PROV-N, a notation for provenance aimed at human consumption
  • PROV-CONSTRAINTS, a set of constraints applying to the PROV data model
  • PROV-XML, an XML schema for the PROV data model
  • PROV-AQ, mechanisms for accessing and querying provenance
  • PROV-DICTIONARY introduces a specific type of collection, consisting of key-entity pairs
  • PROV-DC provides a mapping between PROV-O and Dublin Core Terms
  • PROV-SEM, a declarative specification in terms of first-order logic of the PROV data model
  • PROV-LINKS introduces a mechanism to link across bundles


slide-9
SLIDE 9

PROV Elements

Entities

  • Physical, digital, conceptual, or other kinds of things
  • For example, documents, web sites, graphics, or data sets

Activities

  • Activities generate new entities or make use of existing entities
  • Activities could be actions or processes

Agents

  • Agents take a role in an activity and have the responsibility for the activity
  • For example, persons, pieces of software, or organizations


Figure: the three PROV elements (Activity, Entity, Agent)

slide-10
SLIDE 10

PROV Relations


Figure: Activity, Entity, and Agent connected by the relations wasGeneratedBy, used, wasDerivedFrom, wasAttributedTo, and wasAssociatedWith

slide-11
SLIDE 11

Baking a Cake


Figure: provenance graph of baking a cake

  • Entities: 100 g butter, 2 eggs, 100 g sugar, 100 g flour, cake
  • Activity: bake
  • Relations shown: used (4×), wasGeneratedBy, wasDerivedFrom
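The cake example can be written down as a handful of PROV statements. The following is a minimal sketch in plain Python, not the API of any PROV library: the `ProvGraph` class and the `ex:` identifiers are hypothetical names chosen for illustration, and the single wasDerivedFrom edge is assumed to link the cake to the flour, since the figure does not survive well enough to tell which ingredient it points to.

```python
from dataclasses import dataclass, field


@dataclass
class ProvGraph:
    """Tiny in-memory stand-in for a PROV document (illustrative only)."""
    entities: set = field(default_factory=set)
    activities: set = field(default_factory=set)
    relations: list = field(default_factory=list)  # (relation, subject, object)

    def entity(self, name):
        self.entities.add(name)
        return name

    def activity(self, name):
        self.activities.add(name)
        return name

    def used(self, activity, entity):
        self.relations.append(("used", activity, entity))

    def was_generated_by(self, entity, activity):
        self.relations.append(("wasGeneratedBy", entity, activity))

    def was_derived_from(self, derived, source):
        self.relations.append(("wasDerivedFrom", derived, source))

    def to_provn(self):
        """Render the relations in a PROV-N-like form, one statement per line."""
        return [f"{rel}({subj}, {obj})" for rel, subj, obj in self.relations]


g = ProvGraph()
bake = g.activity("ex:bake")
ingredients = [g.entity(name) for name in
               ("ex:butter", "ex:eggs", "ex:sugar", "ex:flour")]
cake = g.entity("ex:cake")

for ingredient in ingredients:
    g.used(bake, ingredient)          # the activity consumed each ingredient
g.was_generated_by(cake, bake)        # the cake is the output of baking
g.was_derived_from(cake, "ex:flour")  # assumed target of the derivation edge

for statement in g.to_provn():
    print(statement)
```

The relations are kept as plain tuples so that the same data can later be serialized, stored in a database, or traversed as a graph.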

slide-12
SLIDE 12

Textual Representations Visualizations

PROV Notations and Representations

  • Formats: PROV-N, JSON, Turtle, XML, …


document
prefix userdata http://software.dlr.de/qs/userdata/
. . .
wasDerivedFrom(userdata:weights, userdata:WeightReport.csv,
wasDerivedFrom(qs:graphic/weights, userdata:weights,
wasAssociatedWith(qs:graphic/weights, qs:user/onyame@gmail.com, -)
used(python_method:read_csv, library:pandas, -)
used(python_method:matplotlib_plot, userdata:weights, -)
used(python_method:matplotlib_plot, library:matplotlib, -)
used(python_method:read_csv, userdata:WeightReport.csv, -)
wasAttributedTo(userdata:WeightReport.csv, qs:user/onyame@gmail.com)
agent(qs:user/onyame@gmail.com, [prov:type="prov:Person"])
entity(library:pandas, [library:version="0.17.1"])
entity(userdata:WeightReport.csv)
entity(userdata:weights)
. . .
endDocument
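Relation statements in this textual style are regular enough to be read with a few lines of code. The sketch below assumes simple `name(arg, arg, ...)` statements without attribute lists; it is not a full PROV-N parser, and `parse_statement` is a hypothetical helper. The sample lines are taken from the listing above.

```python
import re

# relation name, then a parenthesised, comma-separated argument list
STATEMENT = re.compile(r"^(\w+)\((.*)\)$")


def parse_statement(line):
    """Split 'relation(a, b, -)' into (relation, [a, b]).

    A '-' marks an omitted optional argument and is dropped.
    """
    match = STATEMENT.match(line.strip())
    if match is None:
        raise ValueError(f"not a PROV-N style statement: {line!r}")
    name, raw_args = match.groups()
    args = [arg.strip() for arg in raw_args.split(",")]
    return name, [arg for arg in args if arg != "-"]


lines = [
    "used(python_method:read_csv, library:pandas, -)",
    "used(python_method:matplotlib_plot, userdata:weights, -)",
    "wasAttributedTo(userdata:WeightReport.csv, qs:user/onyame@gmail.com)",
]
parsed = [parse_statement(line) for line in lines]

for relation, arguments in parsed:
    print(relation, arguments)
```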

slide-13
SLIDE 13

Storing and Retrieving Provenance


slide-14
SLIDE 14

Provenance Architecture


Recording of Data Processing Information

Application

Data (Results) Provenance Store

slide-15
SLIDE 15

Storing and Retrieving Provenance

Some Storage Technologies

  • Relational databases and SQL
  • XML and XPath
  • RDF and SPARQL
  • Graph databases and Gremlin/Cypher

Services

  • REST APIs
  • PROVSTORE


slide-16
SLIDE 16

ProvStore (University of Southampton)

  • RESTful web service
  • Storage and access of provenance documents
  • Public and private documents
  • Conversion to various text formats
  • Simple visualizations
  • APIs: Python, jQuery


https://provenance.ecs.soton.ac.uk/store/

slide-17
SLIDE 17

Graphs

Provenance is a Directed Acyclic Graph (DAG)

Figure: example provenance graph (nodes and attributes as shown)

  • ex:User: prov:type "prov:Person" %% xsd_1:#QName, foaf:givenName jonny morley, foaf:mbox <mailto:abc@example.org>
  • ex:User1: prov:type "prov:Person" %% xsd_1:#QName, foaf:givenName Alastair Hughes, foaf:mbox <mailto:abc@example.org>
  • ex:Variant (wasGeneratedBy ex:Variant_Investigated): dcterms:Exonic_Func exonic, dcterms:Gene EIF4G1, dcterms:MAF 1, dcterms:Start 184037533, dcterms:id 55d1f8a68e8865285b59f224
  • ex:Investigation (wasGeneratedBy ex:Investigation_Created): dcterms:created_on 2015-09-30T13:13:29.851Z, dcterms:id 560bdff9e3bea4bf624b1031, dcterms:omim_intersected 0, dcterms:phenotypes parkinson, dcterms:title demo
  • ex:Case (wasGeneratedBy ex:Case_Created): dcterms:id 55d1f97e4b2f616fc8018e87, dcterms:title case-396
  • ex:Patients: dcterms:title 55d1e8f34b2f616fc8018e6b
  • Further edge labels: used (3×), wasAssociatedWith (3×)

Figure: abstract directed acyclic graph with nodes A, B, C, D, E, F, G
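Because provenance edges point from a result back to what it came from, a well-formed provenance graph has no cycles and can be ordered topologically. A small sketch using Python's standard `graphlib` module (Python 3.9 or later); the edge list for the nodes A to G is made up for illustration, since the drawing itself does not survive in text form.

```python
from graphlib import CycleError, TopologicalSorter

# node -> set of nodes it depends on (for example via used / wasDerivedFrom);
# these edges are an assumption, not the edges of the slide's figure
depends_on = {
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
    "E": {"D"},
    "F": {"D"},
    "G": {"E", "F"},
}


def is_dag(graph):
    """Return True if the dependency graph contains no cycle."""
    try:
        # static_order() raises CycleError as soon as a cycle is detected
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


# Every node appears after all the nodes it depends on
order = list(TopologicalSorter(depends_on).static_order())
print(order)
print(is_dag(depends_on))
```

A cycle would mean that something contributed to its own creation, which is why such a check is a cheap sanity test for recorded provenance.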

slide-18
SLIDE 18

Graph Databases

Naturally, graph databases are a good technology for storing (provenance) graphs

Many graph databases are available

  • Neo4j
  • Titan
  • ArangoDB
  • ...

Query languages

  • Cypher
  • Gremlin (TinkerPop)
  • GraphQL


slide-19
SLIDE 19

Neo4j

  • Open-Source
  • Implemented in Java
  • Stores property graphs (key-value-based, directed)

http://neo4j.com


slide-20
SLIDE 20

Storing Provenance in Graph Database


Graph database Neo4j

Figure: the example provenance graph from slide 17 (ex:User, ex:User1, ex:Variant, ex:Investigation, ex:Case, ex:Patients, and the activities ex:Variant_Investigated, ex:Investigation_Created, ex:Case_Created) stored in Neo4j. Example query:

MATCH (e:Entity)-[*]-(u:Agent) RETURN u
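The Cypher query matches every agent that is connected to some entity by a path of any length, ignoring edge direction. To make that semantics concrete, here is a plain-Python sketch over a small in-memory graph. It does not talk to Neo4j, it returns each agent only once, and the labels and edges are a simplified, assumed excerpt of the example graph.

```python
from collections import deque

# node -> label (assumed classification of the example nodes)
labels = {
    "ex:User": "Agent",
    "ex:User1": "Agent",
    "ex:Variant": "Entity",
    "ex:Investigation": "Entity",
    "ex:Variant_Investigated": "Activity",
    "ex:Investigation_Created": "Activity",
}
# (source, relation, target); assumed edges for illustration
edges = [
    ("ex:Variant", "wasGeneratedBy", "ex:Variant_Investigated"),
    ("ex:Variant_Investigated", "wasAssociatedWith", "ex:User"),
    ("ex:Investigation", "wasGeneratedBy", "ex:Investigation_Created"),
    ("ex:Investigation_Created", "wasAssociatedWith", "ex:User1"),
]


def agents_connected_to_entities(labels, edges):
    """Return all agents reachable from any entity, ignoring edge direction."""
    neighbours = {node: set() for node in labels}
    for source, _relation, target in edges:
        neighbours[source].add(target)
        neighbours[target].add(source)  # treat every edge as undirected

    found = set()
    for start, label in labels.items():
        if label != "Entity":
            continue
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search from this entity
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        found |= {node for node in seen if labels[node] == "Agent"}
    return found


connected_agents = agents_connected_to_entities(labels, edges)
print(sorted(connected_agents))
```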

slide-21
SLIDE 21

Trusted Provenance: Storing Provenance in a Blockchain

Figure: a PROV document (Document A) mapped to blockchain transactions

  • Transactions shown: A.2 (Tx-ID 31415926535f...), A.3, A.4, and B.1
  • Each transaction carries the owner's public key, the owner's signature, and a hash
  • Assets are created ("Create Asset") with the private key of their owner (ex:User or ex:User1)
  • Relations (used, wasAssociatedWith, wasGeneratedBy, wasAttributedTo) refer to other transactions by Tx-ID
  • Example elements: ex:Variant_Investigated, ex:Investigation_Created, ex:Investigation, ex:Case_Created, ex:Case, ex:Patients, ex:User, ex:User1
  • Example attributes of ex:Investigation: dcterms:created_on 2015-04-24T09:08:55.793Z, dcterms:id 553a08276d8fbbd310b467bc, dcterms:omim_intersected 5, dcterms:phenotypes Acidosis, Lactic acidosis, Renal tubular acidosis, dcterms:title Investigation 2
  • Example attributes of ex:Case: dcterms:id 553a07ed6d8fbbd310b467b8, dcterms:title Case Example 1
  • Example attributes of ex:Patients: dcterms:title 55364a31acb02ab2205815ee
  • Agents: prov:type prov:Person with foaf:givenName Alastair Hughes or Ryan Kirby, foaf:mbox <mailto:abc@example.org>

PROV2BIGCHAINDB

https://github.com/DLR-SC/prov2bigchaindb

slide-22
SLIDE 22

Blockchain

Combination of multiple techniques

  • Peer-to-peer network
  • Public/Private key signing
  • Time-stamping
  • Proof-of-Work
  • Merkle-Trees

Proposed solutions to

  • The double-spending problem
  • The byzantine generals problem
  • Tamper-resistant distributed database


slide-23
SLIDE 23

Blockchain Transactions

  • Linked by hash of current and preceding transactions
  • Bitcoin: Transfers amount of BTC
  • Public/private key signing
  • All transactions are broadcast across the network
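The linking of transactions by hashes can be illustrated with a few lines of standard-library Python. This is a simplified sketch of a hash chain only: signing, broadcasting, and consensus are left out, and the function names and the payload strings are assumptions made for the example.

```python
import hashlib
import json


def tx_hash(payload, previous_hash):
    """Hash a transaction's payload together with the hash of its predecessor."""
    body = json.dumps({"payload": payload, "previous": previous_hash},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain, payload):
    """Add a transaction that points at the current end of the chain."""
    previous_hash = chain[-1]["hash"] if chain else None
    chain.append({
        "payload": payload,
        "previous": previous_hash,
        "hash": tx_hash(payload, previous_hash),
    })


def verify(chain):
    """Recompute every hash and check that the links are unbroken."""
    previous_hash = None
    for tx in chain:
        if tx["previous"] != previous_hash:
            return False
        if tx["hash"] != tx_hash(tx["payload"], tx["previous"]):
            return False
        previous_hash = tx["hash"]
    return True


chain = []
for statement in ("entity(ex:Case)",
                  "activity(ex:Case_Created)",
                  "wasGeneratedBy(ex:Case, ex:Case_Created)"):
    append(chain, statement)

print(verify(chain))
```

Changing the payload of any earlier transaction changes its hash, so `verify` fails for that transaction; this is the property that makes the stored provenance tamper-evident.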


slide-24
SLIDE 24

Document-based Storage of PROV Documents

  • Only one user/address on the blockchain
  • Provenance is stored as one valid document

Pros

  • Less complex
  • Ownership restricted to one participant
  • Easy to query
  • Less costly, if less data is added

Cons

  • Single point of failure
  • Less tamper-resistant
  • No chaining of transactions
  • Huge amount of data in transactions


slide-25
SLIDE 25

Role-based Storage of PROV Documents

  • Every agent is a blockchain user/address
  • Each agent generates transactions for its entities and activities
  • Relations are modeled with references to other transactions

Pros

  • Close to typical process structures
  • Implicit ownership and responsibility

Cons

  • Agent needs to know relevant transactions for references
  • Difficult to query, if no ownership transfer is used


slide-26
SLIDE 26

Graph-based Storage of PROV Documents

  • All PROV relations are modeled as ownership transfers
  • All agents, activities, and entities are actual blockchain users/addresses

Pros

  • Mapping close to PROV model
  • Small amount of data per transaction
  • Strongly tamper-resistant due to:
  • Multiple owners
  • Large number of transactions

Cons

  • Complex implementation
  • High costs due to many transactions
  • Very slow in querying, if traversal of transactions is needed


slide-27
SLIDE 27

Test Setup


slide-28
SLIDE 28


Performance Comparison

slide-29
SLIDE 29

Gathering Provenance


slide-30
SLIDE 30


Data & Metadata

Workflows Algorithms / Scripts Machine Learning Data Management PROV Provenance Store

</> </>

Software Development

slide-31
SLIDE 31

Gather or Generate Provenance

Depends on your application (tools, languages, etc.)

  • Generation at run-time, compile-time, or retrospectively

Runtime

  • Instrumentation of the application
  • Cumbersome from software engineering perspective
  • Combined with logging or with aspect-oriented approaches

Compile time

  • Based on static code analysis (dependency analysis, program slicing, etc.)

Retrospectively

  • Reconstructed from files or filesystem metadata
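Run-time instrumentation can be kept out of the application logic to some degree with a decorator, which is one way to approximate the aspect-oriented approach mentioned above. The following is a minimal sketch under stated assumptions: `record_provenance`, `PROVENANCE_LOG`, and the `activity:`/`entity:` naming scheme are hypothetical, and values are identified simply by their `repr`.

```python
import functools

# Collected statements as (relation, subject, object) tuples
PROVENANCE_LOG = []


def record_provenance(func):
    """Log one 'used' per argument and one 'wasGeneratedBy' for the result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        activity = f"activity:{func.__name__}"
        for value in list(args) + list(kwargs.values()):
            PROVENANCE_LOG.append(("used", activity, f"entity:{value!r}"))
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append(("wasGeneratedBy", f"entity:{result!r}", activity))
        return result
    return wrapper


@record_provenance
def mean(values):
    """Ordinary application code; it knows nothing about provenance."""
    return sum(values) / len(values)


average = mean([1, 2, 3])

for statement in PROVENANCE_LOG:
    print(statement)
```

In a real application the log entries would be turned into PROV statements and sent to a provenance store rather than kept in a list.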


slide-32
SLIDE 32

Tools and Libraries for Generating Provenance

Libraries for Python

  • PROVPY
  • PROVNEO4J

Other Tools

  • NOWORKFLOW
  • GIT2PROV


slide-33
SLIDE 33

Python Library ProvPy (PROV)

https://github.com/trungdong/prov


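from prov.model import ProvDocument

# Create a new provenance document
d1 = ProvDocument()

# Declare the namespaces used below
# (not shown on the slide; URIs assumed here, following the ProvPy tutorial)
d1.add_namespace('now', 'http://www.provbook.org/nownews/')
d1.add_namespace('nowpeople', 'http://www.provbook.org/nownews/people/')
d1.add_namespace('govftp', 'ftp://ftp.bls.gov/pub/special.requests/oes/')
d1.add_namespace('void', 'http://vocab.deri.ie/void#')
d1.add_namespace('is', 'http://www.provbook.org/nownews/is/#')

# Entity: now:employment-article-v1.html
e1 = d1.entity('now:employment-article-v1.html')

# Agent: nowpeople:Bob
d1.agent('nowpeople:Bob')

# Attributing the article to the agent
d1.wasAttributedTo(e1, 'nowpeople:Bob')

# The data set the article was derived from
d1.entity('govftp:oesm11st.zip',
          {'prov:label': 'employment-stats-2011', 'prov:type': 'void:Dataset'})
d1.wasDerivedFrom('now:employment-article-v1.html', 'govftp:oesm11st.zip')

# Adding an activity
d1.activity('is:writeArticle')
d1.used('is:writeArticle', 'govftp:oesm11st.zip')
d1.wasGeneratedBy('now:employment-article-v1.html', 'is:writeArticle')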

slide-34
SLIDE 34

Python Library ProvPy (PROV)

https://github.com/trungdong/prov


slide-35
SLIDE 35

PROVNEO4J – Storing PROV Documents in Neo4j

https://github.com/DLR-SC/provneo4j


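import provneo4j.api

# Connect to a local Neo4j instance
provneo4j_api = provneo4j.api.Api(
    base_url="http://localhost:7474/db/data",
    username="neo4j",
    password="python")

# prov_doc is a PROV document created beforehand (for example with ProvPy)
provneo4j_api.document.create(prov_doc, name="MyProv")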

slide-36
SLIDE 36

PROVNEO4J – Storing PROV Documents in Neo4j

https://github.com/DLR-SC/provneo4j


Figure: example PROV document stored in Neo4j (nodes and attributes as shown)

  • qs:user/onyame@googlemail.com: prov:type "Person" %% prov:Person
  • qs:app/WeightCompanion: prov:label Version 3.0.3-prov, prov:type "SoftwareAgent" %% prov:SoftwareAgent
  • library:matplotlib: library:url https://pypi.python.org/pypi/matplotlib/1.5.1, library:version 1.5.1
  • library:pandas: library:url https://pypi.python.org/pypi/pandas/0.17.1, library:version 0.17.1
  • userdata:weight_db: prov:label de.medando.weightcompanion.data.model.Weight
  • userdata:WeightReport-3-2-21-31.34.44.csv
  • userdata:weights
  • qs:graphic/weights
  • Activities: java_method:createDocument, python_method:read_csv, python_method:matplotlib_plot
  • prov:type values of the activities: qs:visualize, "export" %% qs:export, qs:import
  • prov:time values of the activities: 2016-03-02T19:40:21+00:00, 2016-03-02T20:31:34.867000+00:00, 2016-03-02T19:40:22+00:00
  • Edge labels: used, wasGeneratedBy, wasDerivedFrom, wasAttributedTo, wasAssociatedWith
slide-37
SLIDE 37

Provenance Instrumentation of TENSORFLOW

Provenance of TENSORFLOW workflows

  • Tensor → PROV Entity
  • Operations → PROV Activity

Example: MNIST with 400 training iterations

  • 64581 database nodes
  • 33549 Entities
  • 31032 Activities


slide-38
SLIDE 38

Provenance Instrumentation of TENSORFLOW

Example Query

  • Shortest paths from all tensors in the 400th iteration to the init operation


MATCH path=allShortestPaths((root)<-[*]-(n))
WHERE root.`tf:type`="tf:Session_init" and n.`tf:name` =~ ".*_400"
RETURN path
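What the query computes can be mimicked with a breadth-first search that starts at the init node and walks the edges backwards. The sketch below is plain Python over a made-up miniature graph; the node names are hypothetical, and unlike `allShortestPaths` it keeps only one shortest path per matching node.

```python
import re
from collections import deque

# (source, target): edges point from a node towards what it depends on.
# The names are invented; a real TensorFlow graph has tens of thousands of nodes.
edges = [
    ("loss_400", "matmul_400"),
    ("matmul_400", "weights_399"),
    ("weights_399", "init"),
    ("loss_400", "weights_399"),
    ("bias_400", "init"),
]


def shortest_paths_to_root(edges, root, name_pattern):
    """Return one shortest path to root for every node whose name matches."""
    incoming = {}
    for source, target in edges:
        incoming.setdefault(target, []).append(source)

    # Breadth-first search from the root against the edge direction;
    # next_hop[node] is the next node on a shortest path towards the root.
    next_hop = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for source in incoming.get(node, []):
            if source not in next_hop:
                next_hop[source] = node
                queue.append(source)

    paths = {}
    for node in next_hop:
        if re.fullmatch(name_pattern, node):
            path = [node]
            while next_hop[path[-1]] is not None:
                path.append(next_hop[path[-1]])
            paths[node] = path
    return paths


paths = shortest_paths_to_root(edges, "init", r".*_400")
for node, path in sorted(paths.items()):
    print(node, "->", path)
```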

slide-39
SLIDE 39

NOWORKFLOW – Provenance of Scripts

https://github.com/gems-uff/noworkflow


Project files: experiment.py, precipitation.py, p12.dat, p13.dat, p14.dat, out.png

$ now run -e Tracker experiment.py

slide-40
SLIDE 40

GIT2PROV

http://git2prov.org

  • Generate PROV documents from git repositories
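The idea behind such a conversion can be sketched as follows, assuming a mapping in which each commit becomes an activity, each committed file version an entity, and the commit author an agent. This is an illustrative simplification and not Git2PROV's actual output: the commit record, the identifier scheme, and the helper `commit_to_prov` are made up, and every listed file is assumed to have existed in the parent commit as well.

```python
def commit_to_prov(commit):
    """Turn one commit record into a list of PROV-N style statements."""
    activity = f"commit:{commit['sha']}"
    agent = f"user:{commit['author']}"
    statements = [
        f"activity({activity})",
        f"agent({agent})",
        f"wasAssociatedWith({activity}, {agent}, -)",
    ]
    for path in commit["files"]:
        entity = f"file:{path}@{commit['sha']}"
        statements.append(f"entity({entity})")
        statements.append(f"wasGeneratedBy({entity}, {activity}, -)")
        statements.append(f"wasAttributedTo({entity}, {agent})")
        if commit.get("parent"):
            # assumed: the file already existed in the parent commit
            previous = f"file:{path}@{commit['parent']}"
            statements.append(f"wasDerivedFrom({entity}, {previous})")
    return statements


# Hypothetical commit record, for illustration only
example_commit = {
    "sha": "b2c3",
    "parent": "a1b2",
    "author": "alice",
    "files": ["experiment.py"],
}
statements = commit_to_prov(example_commit)

for statement in statements:
    print(statement)
```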


slide-41
SLIDE 41

GIT2PROV Example Output

https://provenance.ecs.soton.ac.uk/store/documents/116377/


slide-42
SLIDE 42

Provenance Visualization

Visualization of Provenance is an ongoing research topic

  • Especially for non-experts (“Provenance for people”)
  • Example: PROV COMICS


Figure: visualization of the weight-tracking provenance graph from slide 36, with the activities java_method:createDocument, python_method:read_csv, and python_method:matplotlib_plot, the entities userdata:weight_db, userdata:WeightReport-3-2-21-31.34.44.csv, userdata:weights, qs:graphic/weights, library:pandas, and library:matplotlib, and the agents qs:user/onyame@googlemail.com and qs:app/WeightCompanion
slide-43
SLIDE 43

Key Messages and Summary

Recording the Provenance of science workflows is important

  • to understand where data came from
  • to reproduce data processing steps or whole workflows

Use a standard for Provenance

  • W3C standard PROV
  • Mapping to (graph) databases allows easy querying
  • A standard allows interoperability and comparison
  • Storing in blockchains for increasing trust

Recording Provenance is not hard

  • APIs and tools available


Figure: Activity, Entity, and Agent connected by the relations wasGeneratedBy, used, wasDerivedFrom, wasAttributedTo, and wasAssociatedWith

slide-44
SLIDE 44


Thank You!

Questions?

Andreas.Schreiber@dlr.de
www.DLR.de/sc | @onyame