Maximizing data accessibility at the ESAC Science Data Centre

SLIDE 1

Issue/Revision: 1.0
Reference: Asteriscs EDP ESDC Archives
Status: Issued
ESA UNCLASSIFIED - Releasable to the Public

Maximizing data accessibility at the ESAC Science Data Centre

  • J. González-Núñez, C. Arviset, J. Salgado

ESAC Science Data Center (ESDC)

SLIDE 2

Juan Gonzalez | ASTERICS European Data Provider Forum | 16/06/2016 | Slide 2 ESA UNCLASSIFIED - Releasable to the Public

ESAC Science Data Centre

The Digital Library of the Universe

The large set of science archives co-located at ESAC is a major research asset for the community:

  • Astronomy, Planetary, Solar Heliospheric

Different types of data:

  • Raw data, calibrated processed data, high-level data products, …
  • All data public and available on-line after a short proprietary period

Data must remain readily available for future users and novel uses by various types of users:

  • Scientific Community (public access)
  • PI team and observers (controlled access)
  • Science Operations Team (privileged access)

Archive Strategy Plan for 5-20+ years

SLIDE 3

ESAC Science Archives Strategy

  • Enable maximum scientific exploitation of data sets
  • Enable efficient long-term preservation of data, software and knowledge, using modern technology
  • Enable cost-effective archive production by integration in, and across, projects

SLIDE 4

International Collaboration and Interoperability amongst Archives

IVOA - International Virtual Observatory Alliance

  • Formed in 2002, 20 member projects
  • Defines interoperability standards (VO framework) amongst astronomical (ground and space based) archives
  • Working Groups and Interest Groups per technical domain (data access, data model, registry, applications, semantics, operations, …)

  • http://www.ivoa.net/

IPDA - International Planetary Data Alliance

  • Formed in 2004, 12 space agencies
  • Defines archiving guidelines for planetary data
  • Defines interoperability standards amongst planetary archives
  • http://planetarydata.org/

SLIDE 5

Data Architecture @ ESDC archives

SLIDE 6

OAIS to ABSI

OAIS (2003)

SLIDE 7

OAIS to ABSI

Archives Building System Infrastructure (2006)

SLIDE 8

ABSI based architecture

Diagram labels:

  • Science archive: Database, Data Repository
  • Browser GUI (RPC)
  • Programmatic access (AIO)
  • VO Apps (S*AP, SIAP)
  • VO services
  • SAMP
  • Euro-VO Registry
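
The S*AP entries in the diagram refer to the IVOA "simple access" protocols, which are plain HTTP GET requests with a few standard parameters. As a minimal sketch, assuming a placeholder service URL (`https://archive.example.org/sia` is hypothetical and not an ESDC endpoint), a Simple Image Access (SIAP version 1) query could be assembled like this. `POS`, `SIZE` and `FORMAT` are the parameter names defined by that protocol; the function name is illustrative only.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; not a real ESDC service.
SIA_BASE_URL = "https://archive.example.org/sia"


def build_siap_url(ra_deg, dec_deg, size_deg,
                   image_format="image/fits", base_url=SIA_BASE_URL):
    """Return a SIAP v1 query URL for a region centred on (ra, dec)."""
    if not 0.0 <= ra_deg < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if size_deg <= 0.0:
        raise ValueError("region size must be positive")
    params = {
        "POS": f"{ra_deg},{dec_deg}",  # region centre, decimal degrees (ICRS)
        "SIZE": f"{size_deg}",         # region width, decimal degrees
        "FORMAT": image_format,        # requested image MIME type
    }
    # urlencode percent-escapes the comma and slash in the values.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_siap_url(83.633, 22.0145, 0.25))
```

The response to such a request is a VOTable listing matching images, which is why the same request works from a browser, a script, or a VO application.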

SLIDE 9

Recent Drivers (Gaia, Euclid)

  1. SOC integrated archives:
     a. Archive as a science operations building block
     b. Data processing metadata & intermediate products
  2. Archive consortiums (DPAC, Euclid Consortium)
     a. Integration with WP-developed modules: Visualization, DM, Xmatch, Validation etc.
  3. Large data volumes:
     a. Distributed data storage
     b. ‘Big Data’ analysis techniques

SLIDE 10

Archives Integration within Consortium

  • Archive is fully part of Science Operations, from the start
  • Strong collaboration with SGS and Consortium

SLIDE 11

Archives Integration within Consortium

  • Archive is fully part of Science Operations, from the start
  • Strong collaboration with SGS and Consortium

SLIDE 12

Data collections stats for Gaia

SLIDE 13

The Gaia Archive

SLIDE 14

Gaia Archive Current Architecture

Diagram labels:

  • Science archive: RDBMS, Distributed Data Repository
  • Browser GUI (TAP+)
  • Programmatic access (TAP+, VOSpace)
  • VO Apps (TAP+, S*AP)
  • VO services
  • SAMP
  • Euro-VO Registry

SLIDE 15

Extending VO protocols: TAP+

Public area

  • Publicly released catalogues

Restricted area

  • Catalogues during validation or proprietary exploitation

User Space

  • User-uploaded data

Diagram labels: TAP+ I/F, Command line tools, External Apps, Data Validation
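
TAP+ builds on the IVOA Table Access Protocol (TAP). The sketch below covers only the standard part, a synchronous ADQL query, and does not attempt the TAP+ extensions for restricted areas or user space, which are not detailed in these slides. The base URL and table name are placeholders chosen for illustration. `REQUEST`, `LANG`, `FORMAT` and `QUERY` are the standard TAP parameter names.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); consult the archive documentation for the real one.
TAP_BASE_URL = "https://archive.example.org/tap"


def build_tap_sync_request(adql, base_url=TAP_BASE_URL, output_format="votable"):
    """Return (url, form_body) for a standard TAP synchronous ADQL query."""
    query = adql.strip()
    if not query:
        raise ValueError("ADQL query must not be empty")
    form = {
        "REQUEST": "doQuery",     # ask the service to execute a query
        "LANG": "ADQL",           # query language of the QUERY parameter
        "FORMAT": output_format,  # requested result serialisation
        "QUERY": query,
    }
    # The synchronous endpoint is the 'sync' resource under the TAP base URL.
    return f"{base_url.rstrip('/')}/sync", urlencode(form)


if __name__ == "__main__":
    # 'example_schema.example_catalogue' is a hypothetical table name.
    url, body = build_tap_sync_request(
        "SELECT TOP 10 ra, dec FROM example_schema.example_catalogue"
    )
    print(url)
    print(body)
```

The returned body can be sent as an HTTP POST form by any client, which is how command line tools and external applications can share the same interface as the browser GUI.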

SLIDE 16

ESASky: Multi-mission visualization

Diagram labels:

  • Science archives (TAP, HTTP)
  • Obs. metadata, MOCs, Catalogues, All-Sky HiPS Maps
  • ESASky backend
  • sky.esa.int
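
All-sky HiPS maps are stored as a tree of HEALPix tiles that a client fetches over plain HTTP. As a rough sketch of the tile layout defined by the IVOA HiPS standard (the function names are illustrative, and how the ESASky backend itself serves tiles is not described in these slides), the relative path of a tile can be computed as follows.

```python
def hips_tile_count(order):
    """Number of HEALPix tiles covering the whole sky at a given HiPS order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 12 * 4 ** order


def hips_tile_path(order, npix, ext="fits"):
    """Relative path of one tile inside a HiPS directory tree."""
    if not 0 <= npix < hips_tile_count(order):
        raise ValueError("tile index out of range for this order")
    # Tiles are grouped into directories of 10000 tiles each.
    directory = (npix // 10000) * 10000
    return f"Norder{order}/Dir{directory}/Npix{npix}.{ext}"


if __name__ == "__main__":
    print(hips_tile_count(3))
    print(hips_tile_path(3, 271))
    print(hips_tile_path(7, 123456, "png"))
```

Because each order quadruples the tile count, a viewer only needs to request the handful of tiles that cover the current field of view at the current zoom level.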

SLIDE 17

ASTERICS at ESDC archives

SLIDE 18

Ongoing collaborations

  1. ADQL Auto-complete
     • Provide suggestions in the Gaia Archive ADQL Query editor
     • Stelios Voutsinas (WFAU, University of Edinburgh) - GENIUS - ASTERICS
  2. pgSphere development
     • Extension maintenance, configuration management
     • Implementation of Gaia CU9 requested features
     • Markus Nullmeier (ARI, Universität Heidelberg) - ASTERICS
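
Both collaborations concern ADQL queries with spherical geometry: the editor helps users write them, and pgSphere is a PostgreSQL extension for evaluating spherical constraints in the database. As an illustrative sketch, a cone search written with the standard ADQL geometry functions `POINT`, `CIRCLE` and `CONTAINS` might be generated as below. The helper name, table name and column names are hypothetical, and the way the Gaia Archive maps such predicates onto pgSphere is not described in these slides.

```python
def adql_cone_search(table, ra_deg, dec_deg, radius_deg,
                     ra_col="ra", dec_col="dec", top=None):
    """Build an ADQL query selecting rows within a circle on the sky."""
    ra, dec, radius = float(ra_deg), float(dec_deg), float(radius_deg)
    if not 0.0 <= ra < 360.0:
        raise ValueError("right ascension must be in [0, 360) degrees")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("declination must be in [-90, 90] degrees")
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    # Optional row limit, expressed with the ADQL TOP clause.
    limit = f"TOP {int(top)} " if top is not None else ""
    return (
        f"SELECT {limit}* FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', {ra_col}, {dec_col}), "
        f"CIRCLE('ICRS', {ra!r}, {dec!r}, {radius!r}))"
    )


if __name__ == "__main__":
    # 'example_schema.example_catalogue' is a hypothetical table name.
    print(adql_cone_search("example_schema.example_catalogue", 56.75, 24.12, 0.5, top=10))
```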

SLIDE 19

Challenges

Gaia Archive Preparation (2011)

SLIDE 20

Challenges

  1. “Bringing the code to the data”
     a. GAVIP project (Docker)
     b. TAP+: user areas, user defined functions
  2. Cloud Services
     a. ESAC Cloud
  3. Large scale processing (Spark, Hadoop):
     a. Data Mining Work Package
     b. ‘Grand Challenges’
     c. Prototype simple use cases

SLIDE 21

http://www.cosmos.esa.int/web/esdc

Thanks!