INTRODUCTION TO EUDAT CDI AND B2 SERVICES SUITE
eudat.eu @eudat_eu Mark van de Sanden | EUDAT/SURFsara
Outline
EUDAT B2 services suite
EUDAT CDI infrastructure
Example use cases
Q&A
CDI Data Domain
Data Domain modeled on the ANDS Data Curation Continuum
WORKSPACE (TEMPORARY - TRANSIENT)
Register Digital Objects Stage Digital Objects
REGISTERED DATA DOMAIN
Discovery of Digital Objects
PUBLISHED DATA DOMAIN
Linking Publications To Digital Objects
EUDAT services are designed, built and implemented based on user community requirements.
(thematic data centres)
storage, workflows, processing, archive
Guidelines for data providers which are DataCite and OpenAIRE compliant
Harvesting via OAI-PMH, JSON-API and CSW 2.0
Data selection on basis of 9 facets, including Spatial, Time, Publication Year, Tags
Full text search on metadata
Who
Anyone
What
Find collections of scientific data quickly and easily, irrespective of their origin, discipline or community
Get quick overviews of available data
Browse through collections using standardized facets
Why
Unique collection
Ease of Searching
http://b2find.eudat.eu/
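The OAI-PMH harvesting listed above can be illustrated with a minimal sketch. It only builds a `ListRecords` request URL and reads record identifiers and the resumption token from a response, as defined by the OAI-PMH protocol. The endpoint `https://repository.example.org/oai` and the sample response are hypothetical and do not describe an actual B2FIND or community endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical response, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_list_records(SAMPLE)
# Hypothetical endpoint; a harvester would request this URL for the next page.
next_url = list_records_url("https://repository.example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the response carries no token.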
B2FIND provides ‘faceted’ search for
Dataset view provides display of metadata:
Support for direct publishing of data sets in B2SHARE
Personal quota of 20 GB, extended quota possible
Fine-grained control to share data with
Share data with researchers across Nextcloud instances
Easy integration with other research platforms
Who
Citizen scientists and small teams
What
Store and exchange data
Synchronize multiple versions
Ensure automatic desktop synchronization
Why
Ease of Use
Trusted European Service
https://b2drop.eudat.eu/
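Since the B2DROP slide refers to Nextcloud instances, a programmatic upload could plausibly go through Nextcloud's WebDAV interface. The sketch below only prepares a `PUT` request and does not send it. The path `remote.php/dav/files/<user>/` is Nextcloud's documented per-user WebDAV root; that B2DROP exposes it in this form is an assumption, and the user name, app password and file path are hypothetical.

```python
import base64
from urllib.parse import quote
from urllib.request import Request

B2DROP_URL = "https://b2drop.eudat.eu"


def build_upload_request(user, app_password, remote_path, payload):
    """Prepare (but do not send) a WebDAV PUT request for one file."""
    # Nextcloud's per-user WebDAV root; assumed to be enabled on B2DROP.
    url = f"{B2DROP_URL}/remote.php/dav/files/{quote(user)}/{quote(remote_path)}"
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    return Request(
        url,
        data=payload,
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )


# Hypothetical credentials and file content.
request = build_upload_request(
    "alice", "app-password", "results/run 1.csv", b"a,b\n1,2\n"
)
```

Passing `request` to `urllib.request.urlopen` would perform the upload.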
Minimum metadata compliant with DataCite and OpenAIRE, flexible support for community-specific metadata extensions
Support for DOIs on dataset level
Support for PIDs, checksums and download statistics on object level
Dataset record lifecycle and versioning
Authorisation for community domains
Metadata automatically harvested by B2FIND
Support for annotation via B2NOTE
Direct uploads from B2DROP
Easily installable as local instance via Docker
Who
Small to Medium Teams
What
Store data (incl. software) and add domain metadata
Share registered research data worldwide
Preserve (small-scale) research data for the long term
Why
Register Data for Publications (FAIR)
Make known to wider community
https://b2share.eudat.eu/
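The combination of DataCite-style minimum metadata and object-level checksums can be sketched as follows. The required fields mirror DataCite's mandatory properties apart from the identifier, which is assigned at registration. The key names, the sample record and the `sha256:` prefix are illustrative assumptions, not B2SHARE's actual schema or API.

```python
import hashlib

# DataCite's mandatory properties besides the identifier itself.
# Key names are illustrative, not B2SHARE's schema.
REQUIRED_FIELDS = (
    "creators", "titles", "publisher", "publication_year", "resource_type",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def sha256_checksum(data):
    """Return an object-level checksum string for a file's bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical dataset record with one file.
record = {
    "creators": ["Example, Researcher"],
    "titles": ["Hypothetical sea temperature measurements"],
    "publisher": "Example Community",
    "publication_year": 2018,
    "resource_type": "Dataset",
}
files = {"measurements.csv": b"depth,temp\n10,12.5\n"}
checksums = {name: sha256_checksum(content) for name, content in files.items()}
```

Checking completeness before upload keeps incomplete records out of the harvesting chain.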
An annotation is “a note added to a text, book, drawing, etc., as a comment or an explanation” (from Merriam-Webster)
Provide a service to add annotations to digital assets
New B2 service, launched in January 2018
Can be integrated within community repositories and services
https://b2note.eudat.eu/
Manual annotations via WUI, or programmatically via a REST API
Annotation on existing ontologies (in Biomedical domain)
Integrated with B2SHARE on basis of PIDs
Uses W3C Annotation Model standard (JSON-LD and RDF)
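To make the W3C Annotation Model reference concrete, the sketch below builds one annotation as JSON-LD. The structure (`@context`, `type`, `body`, `target`) follows the W3C Web Annotation Data Model. The PID, tag value and creator are hypothetical, and the exact shape B2NOTE stores is not shown in the slides; a semantic tag would more likely point to an ontology term than carry plain text.

```python
import json


def make_annotation(target_pid_url, tag, creator_name):
    """Build a W3C Web Annotation that tags a target with free text."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "creator": {"type": "Person", "name": creator_name},
        "body": {"type": "TextualBody", "value": tag, "purpose": "tagging"},
        "target": target_pid_url,
    }


# Hypothetical PID and tag, for illustration only.
annotation = make_annotation(
    "https://hdl.handle.net/21.T12345/example-object",
    "sea surface temperature",
    "Example Researcher",
)
serialized = json.dumps(annotation, indent=2)
```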
Support for different storage systems (e.g. POSIX, NFS, S3, Tape, UMS)
Policies for data replication, registration
(alpha)
Access via GridFTP and HTTP APIs
Data downloads via PIDs
Support for automated data publication via B2SHARE (alpha)
Central policy management
Who
Community Data Managers
‘Sophisticated’ Organisations
What
Provide an abstraction layer which virtualizes large-scale data resources
Guard against data loss in long-term archiving and preservation
Optimize access for users from different regions and to computing resources
Data management on basis of policies
Why
Performance
Replication between trusted sites
Data Preservation
Data policies are centrally managed
Policy rules are implemented and enforced by site-local rule engines
Policies are described in an abstract language
Community data managers must authenticate to provide trust
Supports policies for data replication and integrity checking
Central logging for auditable data policies to monitor execution
Active collaboration with the RDA Practical Policy WG
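A replication policy with integrity checking, as listed above, boils down to copying an object and verifying the copy by checksum. The following is a minimal local sketch of that idea using two directories as stand-ins for sites. It is not how B2SAFE implements its rules; the function names, paths and log format are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute a file's SHA-256 digest in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, replica_dir):
    """Copy a file to a replica location and verify it by checksum."""
    replica_dir.mkdir(parents=True, exist_ok=True)
    replica = replica_dir / source.name
    shutil.copy2(source, replica)
    expected, actual = sha256_of(source), sha256_of(replica)
    if expected != actual:
        raise OSError(f"checksum mismatch for {replica}")
    # An entry like this is what a central audit log could collect.
    return {"source": str(source), "replica": str(replica), "sha256": actual}


# Two temporary directories stand in for two sites.
with tempfile.TemporaryDirectory() as tmp:
    site_a = Path(tmp) / "site_a"
    site_a.mkdir()
    original = site_a / "object.dat"
    original.write_bytes(b"example payload")
    log_entry = replicate(original, Path(tmp) / "site_b")
    replica_exists = Path(log_entry["replica"]).exists()
```

Re-running `sha256_of` on the replica at a later date gives a periodic integrity check against the logged value.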
Common API on basis of GridFTP and HTTP
Upload/Download of data to/from B2SAFE
Downloads via PIDs for both APIs (GridFTP and HTTP)
Support for anonymous data access
HTTP API defined via OpenAPI
Who
Users and Communities who want to interact with EUDAT CDI services
What
Provide a common access layer to B2 services
Copy large data sets, ingesting them onto EUDAT data services
Enable data transfer for large data collections from EUDAT storage to external HPC facilities for processing
Why
Support data transfers between PRACE and EGI
Simplify data transfers
http://petstore.swagger.io/?url=https://b2stage.cineca.it/api/specs&docExpansion=none
Based on Handle v8
Machine readable via HTTP RESTful API
EUDAT standardized PID record and data types
B2HANDLE API Python library for easy integration in client services
PID prefixes provided via ePIC
Multiple B2HANDLE service providers
Who
Groups or Communities who want to make their data citable
What
Follows policies to register data and make it referable and citable in the long term
Reliability through mutual PID mirroring
Provides abstraction layer between a globally unique persistent identifier and physical location of data objects
PIDs globally resolvable
Why
Simple integration
Technology Agnostic
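The abstraction layer between a PID and the physical location of an object can be illustrated with the Handle System's HTTP REST interface. The sketch below builds the lookup URL on the public proxy (`/api/handles/<handle>`) and extracts the `URL` value from a response. The handle `21.T12345/example-object`, the sample record and the `CHECKSUM` type name are hypothetical, and the code does not use the B2HANDLE Python library.

```python
import json

HANDLE_PROXY = "https://hdl.handle.net"


def resolve_url(handle):
    """Return the REST lookup URL for a handle on the public proxy."""
    return f"{HANDLE_PROXY}/api/handles/{handle}"


def location_of(response_text):
    """Extract the URL value from a Handle REST API response, if any."""
    record = json.loads(response_text)
    if record.get("responseCode") != 1:  # 1 means success
        return None
    for value in record.get("values", []):
        if value.get("type") == "URL":
            return value["data"]["value"]
    return None


# Hypothetical response for a hypothetical handle.
SAMPLE_RESPONSE = json.dumps({
    "responseCode": 1,
    "handle": "21.T12345/example-object",
    "values": [
        {"index": 1, "type": "URL",
         "data": {"format": "string",
                  "value": "https://repo.example.org/objects/42"}},
        {"index": 2, "type": "CHECKSUM",
         "data": {"format": "string", "value": "abc123"}},
    ],
})

lookup = resolve_url("21.T12345/example-object")
location = location_of(SAMPLE_RESPONSE)
```

If the object moves, only the `URL` value in the PID record changes; the handle that users cite stays the same.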
Support for eduGAIN, Social Identities (Facebook, Google, Microsoft, GitHub), ORCID and local accounts
IdP integration support for SAML, OpenID, OAuth2, X.509
Community IdP (ELIXIR, PRACE, EGI)
SP integration support for SAML, OIDC, OAuth2, X.509
Integrated with B2SHARE, B2SAFE, B2STAGE, B2DROP, B2NOTE, SPMT, DPMT and GitLab
Joint proposal for the Life Sciences with GEANT and EGI
Common Federated AAI planned in EOSC-hub
Who
Anyone wanting to use the B2 Services
What
Complies with community ownerships and access rights, basis of trust
Credential conversion approach (e.g. SAML, OpenID, X.509, Username/password)
Identity provider for citizen scientists
Why
Use your own ID in federated environment
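For a service integrating via OAuth2, the first step of the standard authorization code flow can be sketched as below. The parameters (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`) come from the OAuth 2.0 specification. The endpoint, client ID, redirect URI and scope are hypothetical placeholders, not B2ACCESS values.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Build the URL the user's browser is sent to, plus the state value."""
    state = secrets.token_urlsafe(16)  # guards the callback against CSRF
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


def code_from_callback(callback_url, expected_state):
    """Check the returned state and extract the authorization code."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return query["code"][0]


# Hypothetical endpoint and client registration.
url, state = authorization_url(
    "https://auth.example.org/oauth2/authorize",
    "my-client-id",
    "https://app.example.org/callback",
    "profile email",
)
code = code_from_callback(
    f"https://app.example.org/callback?code=abc&state={state}", state
)
```

The service would then exchange `code` for a token at the provider's token endpoint, which is omitted here.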
Data Objects Data Entities
Total 33 documents maintained and revised
3 levels of documentation:
Engage: for Community decision-makers and data managers
Deploy: for system and support engineers
Use: for researchers and end users
Participation from community experts
https://eudat.eu/services/userdoc
https://eudat.eu/training - https://github.com/EUDAT-Training
Total of 14 training modules developed and maintained
Hands-on training environments for: B2SAFE, B2SHARE, B2FIND, B2HANDLE, B2NOTE
more than 20 European research organisations, data and computing centres in 14 countries
Thematic Service Provider
Repository Provider
Generic Service Provider (Archive, Large Storage System, HTC/HPC)
providing PIDs
Operational and Support services
Project (Configuration) Management: DPMT
Service Portfolio & Catalogue Management Tool
A&R Monitoring, Software Version Monitoring
Accounting, Reporting
Helpdesk
Operational Services
Vulnerability Scanning, CSIRT
PID Service Provider
Service Hosting Framework
Support Request Webform
Trouble Ticketing System
B2-Service Queues
Site Queues
Requests via the Helpdesk Webform
1st Level Support (BSC)
2nd Level Support: Project Enabling Team (OP)
3rd Level Support: Service Developer Teams (DEV)
https://eudat.eu/contact-support-request
– To advance SeaDataNet services and increase their usage by adopting cloud and HPC technology
– Leverage EUDAT CDI infrastructure for long-term digital preservation and curation
– Provide unified data access
– 5 partners: DKRZ, CINECA, CSC, GRNET, STFC
– B2 services: B2DROP, B2SHARE, B2SAFE, B2HOST, B2STAGE, B2FIND and B2ACCESS
distributed across the five EUDAT partners
replicated across the five EUDAT partners
www.eudat.eu @eudat_eu