Building the future of Europe's Natural Science Collections



SLIDE 1

KEYSTONE Winter WG Meeting, Belgrade, 21 Feb 2017

Building the future of Europe's Natural Science Collections

SLIDE 2

Goal

Join forces in Europe to connect 1.5 billion biological and geological collection objects, representing 4.5 billion years of the Earth's history, with 21st century science.

Natural Science Collections, sometimes referred to as cathedrals to nature, will be transformed into cathedrals of data, knowledge and expertise.

SLIDE 3

Desirable result: a comprehensive knowledge framework

The Research Infrastructure will enable collections to be integrated with other sources of information, uniting phenotypic and genetic data across temporal, geospatial and taxonomic frameworks.

The size and complexity of the structured data will make DiSSCo an interesting dataset for research by KEYSTONE participants.

SLIDE 4

Development Pillars

The development of DiSSCo will rely on three pillars:

  • Digitising and mobilising content from collections through open access, comprehensive tools and innovative services
  • Harmonising data policies, processes and workflows
  • Maximising the use of expertise, enhancing skills and engaging communities

SLIDE 5

Consortium building

19 National Task Forces (NTF): Austria, Belgium, Bulgaria, Czech Republic, Germany, Denmark, Estonia, Spain, Finland, France, Greece, Italy, Netherlands, Norway, Poland, Portugal, Sweden, Slovakia, United Kingdom

SLIDE 6

Development timeline

Timeline axis: 2016 to 2025, with a "Today" marker

  • August 2017: deadline for submission of the proposal to ESFRI
  • Q4 2018: announcement of the 2018 ESFRI roadmap
  • May 2024: DiSSCo ERIC set-up

Phases: Consortium & Proposal development, Innovation Programme, Consolidation Programme, Construction Programme, Operational Phase

SLIDE 7

Connect the old with the new

Make validated links between the traditional knowledge base of centuries of biological collections and literature, ordered by taxonomic names, and the databases holding the DNA barcodes.

SLIDE 8

Species names as keywords - domain background information

Keywords commonly used for biodiversity data are a species name, a common (vernacular) name in a certain language and locality, or a group of species (taxon). Species can also be subdivided into closely related lower taxa, such as a subspecies or a variety.

A species name is in Latin and has two parts, for example Ursus maritimus (common name in English: Polar Bear). Ursus is the genus to which the species belongs, so if the species is moved to another genus because of a new taxonomic opinion, the species name changes as well. The specific epithet maritimus may also change, depending on the gender of the genus name. Animal names follow different nomenclature rules from plant names.

A scientific name label represents an opinion on a taxon concept; it is not a unique identifier for the taxon concept. A taxon concept is delimited by a description and circumscribed by specimens, which have identifications (a scientific name attached to the specimen at a certain point in time).
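
The two-part structure of a species name described on this slide can be illustrated with a minimal Python sketch. The `ScientificName` class and the `parse_name` and `move_to_genus` helpers are hypothetical names introduced here for illustration; they are not part of DiSSCo or any existing library. The sketch assumes plain names without authorship or rank markers, and it does not adjust the epithet for the gender of the new genus.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ScientificName:
    """A simple Latin name: genus, specific epithet, optional lower rank."""
    genus: str
    specific_epithet: str
    infraspecific_epithet: Optional[str] = None  # subspecies or variety

    def __str__(self) -> str:
        parts = [self.genus, self.specific_epithet, self.infraspecific_epithet]
        return " ".join(p for p in parts if p)


def parse_name(text: str) -> ScientificName:
    """Split a whitespace-separated name into its two or three parts.

    Assumption: no authorship, year or rank marker is present.
    """
    parts = text.split()
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 2 or 3 name parts, got {len(parts)}: {text!r}")
    genus = parts[0].capitalize()               # genus is capitalised
    epithets = [p.lower() for p in parts[1:]]   # epithets are lower case
    return ScientificName(genus, *epithets)


def move_to_genus(name: ScientificName, new_genus: str) -> ScientificName:
    """Return the new combination after a change of genus.

    Simplification: the epithet is copied unchanged, although in practice
    its ending may have to agree with the gender of the new genus.
    """
    return replace(name, genus=new_genus.capitalize())


polar_bear = parse_name("ursus MARITIMUS")
print(polar_bear)                               # normalised two-part name
print(move_to_genus(polar_bear, "thalarctos"))  # same species, new label
```

Both printed labels refer to the same taxon, which is why a name string alone cannot serve as a unique identifier for a taxon concept.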

SLIDE 9

From Names to Taxon Concepts

From: D. Remsen, The use and limits of scientific names in biological informatics
http://zookeys.pensoft.net/articles.php?id=6234

SLIDE 10

Catalogue of Life Plus

2017-2018: Joint investment by NLBIF, Naturalis, GBIF, Catalogue of Life to build:

  • a shared taxonomic backbone that serves global biodiversity data aggregators and users;
  • a jointly developed and maintained, open and user-friendly taxonomic index with keywords to organize biodiversity data and separation between nomenclature (syntax) and taxonomy (semantics)

GBIF Implementation Plan 2017-2021 Activity 2b - Deliver names infrastructure

Need: comprehensive checklist of known species, and their published scientific names (keywords) to search in structured biodiversity data.

Catalogue of Life Plus & DiSSCo together will result in organized biodiversity information

SLIDE 11

Catalogue of Life Plus - name resolution process

Name resolution process in 3 steps:

  1. Lexical name resolution to normalize the spelling of scientific names,
  2. Nomenclatural name reconciliation to find the original nomenclatural event (protonym) as well as the nomenclatural history of a name, and
  3. Taxonomic name resolution that presents a specialist opinion in selecting a currently accepted scientific name for a particular taxon.
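
The three steps can be sketched as a small Python pipeline. This is only an illustration under stated assumptions, not the Catalogue of Life Plus implementation: the lookup tables are hypothetical toy data, the function names are invented here, and fuzzy string matching from the standard library stands in for real lexical resolution.

```python
import difflib
from typing import Optional

# Hypothetical toy data for illustration only, not an authoritative checklist.
KNOWN_SPELLINGS = ["Ursus maritimus", "Thalarctos maritimus"]
PROTONYM_OF = {
    "Ursus maritimus": "Ursus maritimus",
    "Thalarctos maritimus": "Ursus maritimus",
}
ACCEPTED_NAME_OF = {"Ursus maritimus": "Ursus maritimus"}


def lexical_resolve(raw: str) -> Optional[str]:
    """Step 1: normalise whitespace and case, then fix near-miss spellings."""
    cleaned = " ".join(raw.split())
    if not cleaned:
        return None
    genus, _, rest = cleaned.partition(" ")
    cleaned = (genus.capitalize() + " " + rest.lower()).strip()
    # Assumption: a similarity cutoff of 0.8 is an arbitrary illustrative choice.
    matches = difflib.get_close_matches(cleaned, KNOWN_SPELLINGS, n=1, cutoff=0.8)
    return matches[0] if matches else None


def nomenclatural_reconcile(name: str) -> Optional[str]:
    """Step 2: map a name to its original nomenclatural event (protonym)."""
    return PROTONYM_OF.get(name)


def taxonomic_resolve(protonym: str) -> Optional[str]:
    """Step 3: look up the name a specialist currently accepts for the taxon."""
    return ACCEPTED_NAME_OF.get(protonym)


def resolve(raw: str) -> Optional[str]:
    """Run the three steps in order; return None if any step finds nothing."""
    spelling = lexical_resolve(raw)
    if spelling is None:
        return None
    protonym = nomenclatural_reconcile(spelling)
    if protonym is None:
        return None
    return taxonomic_resolve(protonym)


print(resolve("ursus  maritimvs"))      # misspelled input
print(resolve("thalarctos maritimus"))  # a different combination
print(resolve("Felis catus"))           # not in the toy data
```

Keeping the steps as separate functions mirrors the separation between nomenclature and taxonomy: steps 1 and 2 deal with names as strings and published events, while only step 3 encodes a taxonomic opinion.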

SLIDE 12

Keyword-based search related research topics

Examples:

  • Extract scientific names and common names from text (semi-structured published species descriptions)
  • Resolve scientific names and common names to the currently accepted name
  • Optimized results ranking: a search for the genus Accipiter should first return documents with the exact match Accipiter, and not give Accipiter accipiter a higher ranking
  • Latin grammar algorithms: results should not differ between masculine and feminine endings, such as Pterostyrax hispida and Pterostyrax hispidus
  • Use of traits (characters and their states) as keywords: a search for 'round leaf' should return species or specimens with a round leaf
  • Extract keywords from specimen multimedia (image recognition, handwriting recognition)
  • Search by distribution when the locality is recorded as verbatim text, such as 'Coimbra', and the specimen was found centuries ago
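
Two of the topics above, exact-match ranking and gender-insensitive matching, can be illustrated with a short Python sketch. The helper names `fold_gender`, `search_key` and `rank` are hypothetical, and the ending rules are a deliberately naive assumption: they cover only the common adjective endings -us, -a and -um, and would wrongly fold epithets that are nouns.

```python
from typing import List, Tuple


def fold_gender(epithet: str) -> str:
    """Naively strip a Latin gender ending (-us, -um, -a) from an epithet."""
    for ending in ("us", "um", "a"):
        # Keep very short words intact so the remaining stem stays meaningful.
        if epithet.endswith(ending) and len(epithet) > len(ending) + 2:
            return epithet[: -len(ending)]
    return epithet


def search_key(name: str) -> Tuple[str, ...]:
    """Lower-case the name and fold the endings of epithets, not the genus."""
    parts = name.lower().split()
    return tuple(parts[:1] + [fold_gender(p) for p in parts[1:]])


def rank(query: str, names: List[str]) -> List[str]:
    """Return matching names, exact matches first, then longer names.

    Names that do not start with the query are dropped.
    """
    q = search_key(query)
    if not q:
        return []
    scored = []
    for name in names:
        key = search_key(name)
        if key == q:
            scored.append((0, name))   # exact match ranks first
        elif key[: len(q)] == q:
            scored.append((1, name))   # query is a prefix of the name
    scored.sort(key=lambda pair: pair[0])  # stable: input order kept per score
    return [name for _, name in scored]


documents = ["Accipiter accipiter", "Accipiter", "Pterostyrax hispida"]
print(rank("Accipiter", documents))
print(rank("Pterostyrax hispidus", documents))
```

A production system would more likely rely on a curated list of epithet variants than on suffix stripping, but the sketch shows where such a normalization step would sit relative to ranking.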

SLIDE 13

What will improved keyword search bring DiSSCo?

Example research questions to facilitate:

  • Predict whether a species introduced into a given habitat will compete with native species, and how likely it is to be introduced there by humans or other vectors.
  • Predict whether, given a temperature rise of 1 degree Celsius, a species' distribution will change or the species will become extinct.
  • Find likely hosts for an insect, given the distribution of that insect and other data.

Transform the collection datasets into one European virtual collection, integrated with other sources of information, uniting phenotypic and genetic data across temporal, geospatial and taxonomic frameworks.