An EOSC-hub & EUDAT service: Addressing sustainable long-term - - PowerPoint PPT Presentation




SLIDE 1

EOSC-hub receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 777536.

An EOSC-hub & EUDAT service:

Addressing sustainable long-term preservation wall for scientific data: the European Trusted Digital Repository (ETDR) service

Marion MASSOL (CINES)

2018/10/08

SLIDE 2

« Raiders of the Lost Data »

SLIDE 3

Once upon a time, the computer engineer of the lab received a mission of the highest importance…

SLIDE 4

To find expensive scientific digital data that was produced in… 1998!

SLIDE 5

SLIDE 6

… he found a floppy disk that seemed to be undamaged!!

SLIDE 7

SLIDE 8

… But with very little information on its label…

SLIDE 9

Was it the right floppy disk???

SLIDE 10

… Finding a compatible hardware reader was still left to do…

SLIDE 11

SLIDE 12

Phew!

SLIDE 13

He opened the file with the latest version of MS Word…

SLIDE 14

And then?… and then?...

SLIDE 15

SLIDE 16

He consulted the Microsoft online knowledge base…

SLIDE 17

And…

SLIDE 18

SLIDE 19

Method #3: « Contact your system administrator »!

SLIDE 20

Finally, he found a solution to read this damned file…

SLIDE 21

… With an acceptable software / OS combination…

SLIDE 22

And the scientific content appeared…

SLIDE 23

SLIDE 24

He fetched the researcher and showed him the successful result of the mission.

SLIDE 25

And the answer was…

SLIDE 26

And MY formatting! These IT guys are unprofessional!!! Definitely, we can’t trust you!!!

SLIDE 27

THE END

SLIDE 28

10/10/2018

4 main risks:

  • Media corruption: bitstream alteration
  • Hardware obsolescence
  • Software evolution – file format obsolescence
  • Lack of documentation & metadata

2 main strategies:

  • Implementation
  • Procrastination
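The first risk, bitstream alteration, is usually detected by recording a checksum when a file is deposited and recomputing it later. The sketch below illustrates that idea with Python's standard `hashlib`; the function names `file_digest` and `is_intact` are hypothetical and not part of any EUDAT or ETDR tool, and SHA-256 is only one possible choice of algorithm.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_intact(path: Path, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Compare the current digest with the one recorded at deposit time (hypothetical helper)."""
    return file_digest(path, algorithm) == recorded_digest
```

A repository would typically store the recorded digest alongside the file's metadata and run such a comparison periodically on every copy; a mismatch signals that the copy should be repaired from another one.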

SLIDE 29

  • OAIS (ISO 14721)
  • Metadata standards: Dublin Core, EAD/ISAD(G)/ISAAR, community schemas
  • Harvesting process
  • Accreditation: CoreTrustSeal/DSA
  • Certification: ISO 16363, SIAF certification, HADS (sensitive data), DIN 31644
  • File format policy + FF validation process + emulation + logical migration
  • Preservation strategy & policy: P2A, PAIMAS, PAIS
  • Exchange protocols: DEPIP - ISO 20614, SWORD, SEDA
  • Storage strategy: regular checks on all copies, 3+ copies on 2+ technologies, 1+ copy on site >300 km, 1+ copy on site >2000 km, physical migration, SHA-512/SHA-256/MD5/…
  • Technological watch processes
  • PID: Handle/EPIC/ARK/DOI
  • Data curation process
  • Policy on digital signature
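The storage strategy rules listed on this slide (3+ copies, 2+ technologies, one copy more than 300 km away, one more than 2000 km away) can be expressed as a simple automated check. The following is a minimal sketch under stated assumptions: the `Copy` record, its fields, and `check_storage_policy` are hypothetical names, distances are measured from the primary site, and the two distance thresholds are tested independently, so a single copy beyond 2000 km satisfies both. The slide does not say whether they must be distinct copies.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Copy:
    site: str           # hypothetical site label
    technology: str     # e.g. "tape" or "disk"
    distance_km: float  # assumed: distance from the primary site


def check_storage_policy(copies: Sequence[Copy]) -> List[str]:
    """Return the list of violated rules; an empty list means the policy holds."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.technology for c in copies}) < 2:
        problems.append("fewer than 2 storage technologies")
    if not any(c.distance_km > 300 for c in copies):
        problems.append("no copy on a site more than 300 km away")
    if not any(c.distance_km > 2000 for c in copies):
        problems.append("no copy on a site more than 2000 km away")
    return problems
```

Returning the list of violated rules, rather than a single boolean, lets an operator see which part of the strategy a given dataset fails.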

SLIDE 30

TRUSTED DIGITAL REPOSITORY

SLIDE 31

  • FINDABLE
  • ACCESSIBLE
  • INTEROPERABLE
  • REUSABLE

SLIDE 32

TRUSTED DIGITAL REPOSITORY

  • FINDABLE
  • ACCESSIBLE
  • INTEROPERABLE
  • REUSABLE

SLIDE 33

And concretely…

TDR

  • Ingest entity
  • Storage + data management entities
  • Access entity
  • Administration + data preservation planning entities (data curation policies, restitution tools, etc.)
  • Business processes entity (file format conversion over time, etc.)
  • Specific treatments (HPC, EGI facilities, etc.)

Functions: transfer, PID, quality checks, preservation processes, storage, metadata management, storage quality tools, specific access tools

Interfaces:

  • Community A: interface + negotiated protocol
  • B2FIND: interface
  • Wider community: interface (by default, limited direct access)

SLIDE 34

ETDR (European Trusted Digital Repository)

The eTDR could cover 3 main possible use cases:

  • Use case #1: with a single TDR provided by a EUDAT SP
  • Use case #2: with a single ingest point into the eTDR + a few storage SPs (TDR operated by the community itself)
  • Use case #3: with a single ingest point into the eTDR + a few storage SPs (TDR operated by the EUDAT CDI)

The eTDR instance for the community X is a technical & organizational solution:

  • Certified
  • Sustainable
  • For the long-term preservation and access of scientific data

SLIDE 35

A use case!... A use case!... A use case!!!...

SLIDE 36