Provenance and Linked Data in Biological Data Webs - Jun Zhao - PowerPoint PPT Presentation



slide-1
SLIDE 1

Provenance and Linked Data in Biological Data Webs

Jun Zhao
Image Bioinformatics Research Group
Department of Zoology
University of Oxford

slide-2
SLIDE 2

Background

The Image Bioinformatics Research Group

User-driven R&D
Data integration
Data Webs: “use the Web as the native platform in order to enable integrated accesses to datasets including images relating to particular subjects” [David Shotton. World Wide Science: Promises, Threats and Realities. Data webs for image repositories. Oxford University Press.]

slide-3
SLIDE 3

FlyWeb: A data web of Drosophila data resources

Link together a number of heterogeneous data resources concerning fruit flies, including gene expression images from adult testis (FlyTED) and from embryos (BDGP)

Initial user studies
Details: http://imageweb.zoo.ox.ac.uk/wiki/index.php/FlyWeb_project

BDGP: Berkeley Drosophila Genome Project

FlyTED: Drosophila Testis Gene Expression Image Database

FlyBase: The Drosophila Genome database

PubMed

Oxford Research Archive

slide-4
SLIDE 4

FlyWeb: Data Web for Linking Laboratory Image Data with Repository Publications

FlyTED: Testis images of gene Adh

BDGP: Embryonic images of gene CG32954

slide-5
SLIDE 5

Trust on FlyWeb

[Diagram] flyted:Adh sameAs bdgp:CG32954
flybase:Adh, flybase:Adhr, flybase:CG32954
In FlyBase release 3.2: flybase:CG32954, flybase:Adh, flybase:Adhr
Since FlyBase release 4.3: flybase:CG3481, flybase:CG3484
Reference: http://www.flybase.org

slide-6
SLIDE 6

Trust on FlyWeb

flyted:Adh sameAs bdgp:CG32954

How was the link built?
Why are these two gene names the same?
When was this link created, by whom, and using which version of which database?
Which previous links between data items became obsolete, and why?
What about alternative names for this gene, such as “CG3481”, “Dreg-1”, etc.?

slide-7
SLIDE 7

Provenance for Data Webs

For each release of FlyWeb

When it was released
Based upon which version of which public database
Which data items are linked to which other data items

slide-8
SLIDE 8

:flyweb_r1
  dc:created "2007-12-19"^^xsd:date
  dc:hasVersion “1.0”
  dc:creator http://www.datawebs.net/foaf.rdf#ibrg
  dw:derivedFrom flyted:v1.0
  dw:derivedFrom bdgp/2007-03-09
  dw:derivedFrom flybase/v3.2

:flyweb_r2
  dc:created "2008-01-25"^^xsd:date
  dc:hasVersion “1.1”
  dc:creator http://www.datawebs.net/foaf.rdf#ibrg
  dw:derivedFrom flyted:v1.0
  dw:derivedFrom bdgp/2007-03-09
  dw:derivedFrom flybase/v5.3

[Diagram] owl:sameAs links drawn within the releases, among: flyted:Adh, bdgp:CG32954, flyweb:CG3481, flyweb:Adhr
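The release descriptions on this slide can be compared mechanically. The following is a minimal Python sketch, not the FlyWeb implementation: it stores the statements as plain string triples and reports which source databases changed between two releases. The pairing of each property with its value is read off the slide and is an assumption, and the names `Triple`, `objects`, and `changed_sources` are hypothetical helpers introduced only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Release-level provenance transcribed from the slide.
# Assumption: each value is paired with the property it most plausibly belongs to.
RELEASES: List[Triple] = [
    Triple(":flyweb_r1", "dc:created", "2007-12-19"),
    Triple(":flyweb_r1", "dc:hasVersion", "1.0"),
    Triple(":flyweb_r1", "dc:creator", "http://www.datawebs.net/foaf.rdf#ibrg"),
    Triple(":flyweb_r1", "dw:derivedFrom", "flyted:v1.0"),
    Triple(":flyweb_r1", "dw:derivedFrom", "bdgp/2007-03-09"),
    Triple(":flyweb_r1", "dw:derivedFrom", "flybase/v3.2"),
    Triple(":flyweb_r2", "dc:created", "2008-01-25"),
    Triple(":flyweb_r2", "dc:hasVersion", "1.1"),
    Triple(":flyweb_r2", "dc:creator", "http://www.datawebs.net/foaf.rdf#ibrg"),
    Triple(":flyweb_r2", "dw:derivedFrom", "flyted:v1.0"),
    Triple(":flyweb_r2", "dw:derivedFrom", "bdgp/2007-03-09"),
    Triple(":flyweb_r2", "dw:derivedFrom", "flybase/v5.3"),
]


def objects(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


def changed_sources(
    triples: List[Triple], old_release: str, new_release: str
) -> Tuple[List[str], List[str]]:
    """Return (sources dropped, sources added) between two releases."""
    old = set(objects(triples, old_release, "dw:derivedFrom"))
    new = set(objects(triples, new_release, "dw:derivedFrom"))
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    dropped, added = changed_sources(RELEASES, ":flyweb_r1", ":flyweb_r2")
    print("dropped:", dropped)
    print("added:", added)
```

Under these assumptions the only difference between the two releases is the FlyBase version, which is the kind of change that can make an existing sameAs link questionable.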
slide-9
SLIDE 9

Provenance for Data Webs

For each pair of linked data items

The evidence of the link
When the link was built, released, and by whom
Which previous links have been created between this pair of related data

slide-10
SLIDE 10

:mapping_m1
  rdf:type dw:MappingRelation
  dw:maps flyted:gene_g1, flyted:gene_g2

:mapping_m11
  rdf:type dw:SameRelation
  dw:childOf :mapping_m1
  dw:evidencedBy :evidence_e1
  dw:createdIn :flyweb_r1
  dc:creation "2007-12-19"^^xsd:date

:mapping_m12
  rdf:type dw:DifferentRelation
  dw:childOf :mapping_m1
  dw:evidencedBy :evidence_e2
  dw:createdIn :flyweb_r2
  dc:creation "2008-01-25"^^xsd:date

dw:siblingOf connects :mapping_m11 and :mapping_m12

[Diagram] owl:sameAs links among: flyted:Adh, bdgp:CG32954, flyweb:CG3481, flyweb:Adhr
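The link-level provenance on this slide amounts to a history of assertions about one pair of data items. As a rough sketch in Python, assuming the reading that :mapping_m11 and :mapping_m12 are successive children of :mapping_m1, the history can be queried for the assertion in force on a given date. The classes `LinkAssertion` and `Mapping` and the method `status_on` are hypothetical names, not part of the FlyWeb vocabulary.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LinkAssertion:
    name: str         # e.g. ":mapping_m11"
    relation: str     # rdf:type, e.g. "dw:SameRelation"
    evidence: str     # dw:evidencedBy
    created_in: str   # dw:createdIn (the FlyWeb release)
    created_on: date  # dc:creation


@dataclass
class Mapping:
    name: str
    maps: Tuple[str, str]         # dw:maps, the pair of linked items
    history: List[LinkAssertion]  # child assertions (dw:childOf this mapping)

    def status_on(self, day: date) -> Optional[LinkAssertion]:
        """Return the most recent assertion created on or before `day`."""
        known = [a for a in self.history if a.created_on <= day]
        return max(known, key=lambda a: a.created_on) if known else None


# Values transcribed from the slide; the grouping is an assumption.
M1 = Mapping(
    name=":mapping_m1",
    maps=("flyted:gene_g1", "flyted:gene_g2"),
    history=[
        LinkAssertion(":mapping_m11", "dw:SameRelation", ":evidence_e1",
                      ":flyweb_r1", date(2007, 12, 19)),
        LinkAssertion(":mapping_m12", "dw:DifferentRelation", ":evidence_e2",
                      ":flyweb_r2", date(2008, 1, 25)),
    ],
)

if __name__ == "__main__":
    for day in (date(2007, 1, 1), date(2008, 1, 1), date(2008, 2, 1)):
        current = M1.status_on(day)
        print(day, current.relation if current else "no link recorded")
```

Keeping superseded assertions in the history, rather than overwriting them, is what lets a user ask which previous links became obsolete and on what evidence.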
slide-11
SLIDE 11

Short Demo

slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

Open questions

What should be the evidence?
Provenance in databases
Provenance in e-Science
Provenance in bioinformatics
What is the minimum provenance needed for Linked Data?

slide-17
SLIDE 17

Acknowledgement

David Shotton, Graham Klyne, and Alistair Miles
Dr Helen White-Cooper and her research group
JISC and BBSRC
BDGP and FlyBase

slide-18
SLIDE 18

Thank you!