Distributed data management, the power and challenge of metadata



slide-1
SLIDE 1

Distributed data management, the power and challenge of metadata Øystein Godøy

slide-2
SLIDE 2

Why bother with data management?

  • Maximise public investment in data collection and production
  • Promote scientific collaboration
  • Promote interdisciplinary science
  • Promote scientific transparency
  • Leave a legacy
  • Science paradigms, according to Jim Gray
      • Empirical science
      • Theoretical science
      • Computational science
      • Data exploration science

slide-3
SLIDE 3
slide-4
SLIDE 4

Document data through metadata

  • Discovery level
      • What
      • Where
      • When
      • Who
      • Constraints on sharing and usage
      • How to access data
      • Linkages between data
  • Use level
      • Variable names/descriptions
      • Missing values
      • Units
      • Coordinates
      • Interdependency between variables
      • ...
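The two metadata levels listed above can be pictured as two small record types. The following is a minimal sketch only: the class names, field names and the `missing_discovery_fields` helper are hypothetical illustrations, not part of any metadata standard or of the systems presented here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DiscoveryMetadata:
    """Discovery level: enough to find a dataset and decide whether to use it."""
    title: str                 # what
    bounding_box: tuple        # where: (south, north, west, east), degrees
    time_start: str            # when: ISO 8601 start date
    time_end: Optional[str]    # when: ISO 8601 end date, None if ongoing
    contact: str               # who
    use_constraints: str       # constraints on sharing and usage
    access_urls: list = field(default_factory=list)       # how to access data
    related_datasets: list = field(default_factory=list)  # linkages between data


@dataclass
class VariableMetadata:
    """Use level: what is needed to read and interpret one variable."""
    name: str
    description: str
    units: str
    missing_value: Optional[float] = None
    coordinates: list = field(default_factory=list)  # names of coordinate variables
    depends_on: list = field(default_factory=list)   # interdependent variables


def missing_discovery_fields(record):
    """Return the names of required discovery fields that are empty."""
    required = ("title", "bounding_box", "time_start", "contact", "use_constraints")
    return [name for name in required if not getattr(record, name)]
```

A check such as `missing_discovery_fields` is one way a catalogue could reject records that cannot answer the what/where/when/who questions before they are indexed.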
slide-5
SLIDE 5

Metadata

slide-6
SLIDE 6
slide-7
SLIDE 7

The toolbox

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

METSIS implementations

  • Arctic Data Centre (WMO DCPC)
  • WMO Global Cryosphere Watch
  • EU FP6 DAMOCLES
  • EU FP7 ACCESS
  • EUMETSAT Ocean and Sea Ice SAF (High Latitude centre)
  • Svalbard Integrated Arctic Observing System (demo)
  • SAON (demo)
  • Norwegian Satellite Earth Observation Database for Marine and Polar Research (RCN)
  • CryoClim: climate consistent satellite remote sensing products for the cryosphere (ESA/NRS)
  • International Polar Year
      • National node in Norway
      • International operational data coordination

slide-12
SLIDE 12

NORMAP

  • Norwegian Satellite Earth Observation Database for Marine and Polar Research
  • http://normap.nersc.no/
  • Satellite products
      • Level 2 and higher
  • Distributed data repository
      • Nansen Environmental and Remote Sensing Centre
      • Norwegian Meteorological Institute
      • Kongsberg Satellite Services
      • (CERSAT)
  • Features
      • Subscription
      • Collocated visualisation
      • Transformation of individual products
      • (Transformation of multiple products to a common reference)

slide-13
SLIDE 13
slide-14
SLIDE 14

Global Cryosphere Watch

  • A coordinated framework for
      • Observations
      • Data management
      • Monitoring
      • Assessment
      • Product development
  • Focusing on the current and future state of the cryosphere
  • WMO Information System
  • Relies on interoperability interfaces
  • Much of the data is of scientific origin
  • Data are served from the host data centre

slide-15
SLIDE 15
slide-16
SLIDE 16

Collocated visualisation

slide-17
SLIDE 17
slide-18
SLIDE 18
slide-19
SLIDE 19
slide-20
SLIDE 20
slide-21
SLIDE 21

Lessons learned harvesting

  • End point availability
  • Few dedicated subsets of available data
  • Filtering of records is required, but this requires e.g. proper keywords
  • CSDGM and ISO19115 metadata often lack standardised controlled vocabularies
  • Transformation of metadata to the search model is done using XSLT and SKOS
  • Harvesting is semi-automatic initially, in order to fully understand the nature of each linked data centre
  • Frequently, interfaces to data and documentation of these are lacking
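The filtering and vocabulary points above can be illustrated with a small sketch. It is not the XSLT/SKOS pipeline mentioned in the slides: a plain Python dictionary stands in for SKOS mapping relations, and the names `KEYWORD_MAP`, `normalise_keywords` and `filter_records`, as well as the record layout, are assumptions made for illustration.

```python
# Hypothetical mapping from provider keywords to controlled vocabulary terms.
# In a SKOS-based setup this would be expressed as mapping relations between
# concepts; a dict stands in for that here.
KEYWORD_MAP = {
    "sea ice": "SEA ICE",
    "sea-ice concentration": "SEA ICE CONCENTRATION",
    "snow cover": "SNOW COVER",
}


def normalise_keywords(keywords):
    """Map free-text keywords to controlled terms, dropping unknown ones."""
    mapped = []
    for keyword in keywords:
        term = KEYWORD_MAP.get(keyword.strip().lower())
        if term and term not in mapped:
            mapped.append(term)
    return mapped


def filter_records(records, wanted_terms):
    """Keep harvested records that carry at least one wanted controlled term."""
    wanted = set(wanted_terms)
    kept = []
    for record in records:
        terms = normalise_keywords(record.get("keywords", []))
        if wanted.intersection(terms):
            # Store the normalised terms so that search sees one vocabulary.
            kept.append({**record, "keywords": terms})
    return kept
```

Records whose keywords cannot be mapped are dropped silently in this sketch, which shows why filtering depends on proper keywords at the source: a record without recognisable terms never reaches the search model.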
slide-22
SLIDE 22

Main challenges

  • Standardised controlled vocabularies in machine readable form
      • Filtering
      • Transformation
      • Identification of interfaces
  • Metadata granularity differs
  • Propagation of metadata to global frameworks
      • Duplication of records
      • Require agreement to propagate metadata
  • Interpretation of standards
      • Not all are using standard interfaces to data
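Duplication of records arises when the same dataset reaches a global framework through more than one harvesting path. A minimal sketch of one possible safeguard follows, assuming that records from different sources share a title and temporal extent; the functions `dedup_key` and `deduplicate` and the record layout are hypothetical.

```python
def dedup_key(record):
    """Build a comparison key from the normalised title and temporal extent."""
    # Lower-case and collapse whitespace so trivial differences do not matter.
    title = " ".join(record["title"].lower().split())
    return (title, record.get("time_start"), record.get("time_end"))


def deduplicate(records):
    """Return records with duplicates removed, keeping the first copy seen."""
    seen = {}
    for record in records:
        # setdefault leaves an existing entry untouched, so the first wins.
        seen.setdefault(dedup_key(record), record)
    return list(seen.values())
```

Matching on title and time range is a heuristic. Where data centres expose stable, shared identifiers, comparing those would be more reliable.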

slide-23
SLIDE 23

Summary

  • Integration of metadata sources is challenged by integration of technology and science through documentation standards and controlled vocabularies
  • An increasing number of data centres support interoperability standards, but interpretation of standards often differs
  • Metadata brokering is the short-term key to integration of data centres