The Role of Trustworthy Digital Repositories in Sustainability



  1. The Role of Trustworthy Digital Repositories in Sustainability
     David Giaretta, david@giaretta.org, www.giaretta.org and www.iso16363.org
     Big Data to Knowledge AHM & Open Data Science Symposium, Bethesda, MD, 29 Nov – 1 Dec 2016

  2. Interoperability, Re-use, Preservation and Sustainability
     [Diagram labels: Exploitation/Re-use; VALUE; Replication of results; Usability; Interoperability; What do the bits mean?; Preservation; Sustainability]
     • Need “metadata”
     • What kinds? How much of each kind?
     • EU Commissioner for the Digital Agenda said: “Data is the new Gold”, but
       – Gold is precious because it is rare, and does not combine
       – Data is precious because there is so much and it becomes more valuable when it is combined

  3. Digitally encoded information – 1’s and 0’s
     • BITS: 01001110 01001101 01010001 01001101 01010000 01001010 00100000 00100000
     • HEX: 4e 4d 51 4d 50 4a 20 20 (Example: “ca fe ba be” at start indicates Java class file)
     • Two IEEE 754 32 bit real numbers: 8.6116435E8 1.35644119E10 (assuming “big-endian”)
     • Two 32 bit integers: 1313689933 1347035168 (same byte order)
     • Actually... ASCII Characters: NMQMPJ
     • What does this mean? It was my flight reference
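
The point of this slide can be tried out directly. The sketch below is an illustration rather than anything taken from the presentation: it reads the same eight bytes as bits, as big-endian IEEE 754 floats, as big-endian unsigned integers and as ASCII text. The function names are hypothetical, and the choice of unsigned integers is an assumption.

```python
import struct

# The eight bytes from the slide, written as hex.
raw = bytes.fromhex("4e4d514d504a2020")

# First four bytes of a Java class file, as mentioned on the slide.
JAVA_CLASS_MAGIC = bytes.fromhex("cafebabe")


def as_bits(data: bytes) -> str:
    """Render each byte as eight binary digits."""
    return " ".join(f"{b:08b}" for b in data)


def as_floats_be(data: bytes) -> tuple:
    """Read the bytes as two big-endian IEEE 754 32-bit floats."""
    return struct.unpack(">2f", data)


def as_uints_be(data: bytes) -> tuple:
    """Read the bytes as two big-endian unsigned 32-bit integers (an assumption)."""
    return struct.unpack(">2I", data)


def as_ascii(data: bytes) -> str:
    """Read the bytes as ASCII characters."""
    return data.decode("ascii")


def looks_like_java_class(data: bytes) -> bool:
    """Check for the 'ca fe ba be' marker at the start of the data."""
    return data.startswith(JAVA_CLASS_MAGIC)


print("bits:  ", as_bits(raw))
print("floats:", as_floats_be(raw))
print("ints:  ", as_uints_be(raw))
print("ascii: ", repr(as_ascii(raw)))
print("java?: ", looks_like_java_class(raw))
```

Every one of these readings is a valid decoding of the same bits; nothing in the bytes themselves says which one was intended.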

  4. …semantics…
     • Can anyone guess what this table means?
     • Could be Findable and Accessible: encoded as a Comma Separated Value (CSV) file in ASCII or Unicode, or encoded with XML markup
     • [Table with column headings: Longitude, Latitude, Ozone, Date]
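
A minimal sketch of the same idea, assuming the table is held as CSV. The single data row is invented for illustration and is not from the presentation; only the four column headings come from the slide. Without the heading line the file parses perfectly well but yields anonymous strings; with it, each value at least has a name.

```python
import csv
import io

# Hypothetical row, invented for illustration only.
bare_csv = "12.5,41.9,285.0,2016-11-29\n"

# The same row preceded by the column headings shown on the slide.
labelled_csv = "Longitude,Latitude,Ozone,Date\n" + bare_csv


def read_bare(text: str) -> list:
    """Parse CSV with no heading line: each row is a list of unnamed strings."""
    return list(csv.reader(io.StringIO(text)))


def read_labelled(text: str) -> list:
    """Parse CSV whose first line names the columns: each row maps name to value."""
    return list(csv.DictReader(io.StringIO(text)))


print(read_bare(bare_csv))
print(read_labelled(labelled_csv))
```

Even the labelled form leaves questions open, such as the units of the ozone value or the coordinate reference system, which is why headings alone are only part of the semantics.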

  5. OAIS (ISO 14721) and digital preservation
     • Reference Model for Open Archival Information System (OAIS) provides a very general approach
     • OAIS approach to digital preservation:
       – covers all types of digitally encoded information
       – provides a way to test whether preservation is successful
       – does not require seeing into the future
       – does require transparency: be clear what is being promised
     • but does not require “open access”
     • Very widely accepted and provides the basis for pretty well all work in digital preservation
     • OAIS provides a good basis for certification
     • Available free from https://public.ccsds.org/Pubs/650x0m2.pdf

  6. Preserving digitally encoded information
     • Using or understanding the bits requires what OAIS calls “Representation Information”: anything needed to allow the data to be interpreted by software or people. This certainly includes semantics, and many other things
     • Additional things such as software which are readily available now may not be available in future
     • If the bits are unchanged we can keep hashes and be pretty sure of authenticity
     • If we have to change the bits, e.g. transform to another format, then
       – Evidence of Authenticity needs care
       – Probably needs other software etc.
     • It may be that the information must be handed over
       – To a different system and/or a different organisation
     • Need to take care of the details which tend to be ignored
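
The hash-based check mentioned on this slide can be sketched as follows. This is an illustration under stated assumptions: SHA-256 is chosen here for convenience (the slide names no algorithm), and the function names and sample bytes are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return sha256_hex(data) == recorded_digest


# Hypothetical object: digest recorded when the data was first deposited.
original = b"NMQMPJ  "
recorded = sha256_hex(original)

# Later check on an untouched copy: the digests agree.
print(is_unchanged(original, recorded))

# After a transformation (here, stripping the trailing spaces) the bits differ,
# so the recorded digest no longer matches the object.
transformed = original.strip()
print(is_unchanged(transformed, recorded))
```

A mismatch after a deliberate transformation does not mean the information is inauthentic; it means the hash can no longer serve as the evidence, so other evidence has to be kept.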

  7. Partial Representation Information Network for MERIS Level 2 data

  8. Role of people (and automated systems)
     • Creation of data and capture/creation of the metadata required for use/exploitation now and into the future
     • Follow “Active” Data Management Plans (RDA and CCSDS/ISO)
     • Funding, Management and Operation of the repository
     • Defines the “Designated Community”, e.g. people who understand a particular sub-discipline
     • Undertakes preservation activities for the data, ensuring that the data will be usable by members of the Designated Community despite changes in h/w, s/w, environment etc.
     • Use the data (including by the Designated Community)
     • Exploit and create value from the data
     • Judge the value of the data

  9. Many types of Audit and Certification
     • ISO 16363 focuses on keeping the Information understandable / usable
       – www.iso16363.org
       – based on OAIS concepts, including usability
       – 100+ metrics covering all aspects of the repository to ensure the auditor looks at the details
       – uses the ISO certification process on which our lives depend in so many areas, e.g. medical equipment, food safety, airlines, automobiles etc.: 3rd party visits and evaluation
     • ISO 27000 type audits focus on keeping the bits safe in the context of the needs of the organisation
       – the information is an asset of the business; what happens after the organisation ceases to exist is of no concern. Security certification may be needed for any information that can be used to identify an individual
     • DIN 31644
       – audit and certification process not clear
     • ISO 15489 – Records Management
       – No formal audit process
     • World Data System and Data Seal of Approval
       – Small set (16) of metrics, not detailed
       – Recognised as much “lower” than ISO 16363 (DSA as “bronze” and ISO 16363 as “gold”)

  10. ISO Standards for certification
     • ISO 16363: Audit and Certification of Trustworthy Digital Repositories
       – Available free from https://public.ccsds.org/Pubs/652x0m1.pdf
     • ISO 16919: Requirements For Bodies Providing Audit And Certification of Trustworthy Digital Repositories
       – Available free from https://public.ccsds.org/Pubs/652x1m2.pdf
       – Used for accreditation of auditors by National Accreditation Bodies
     • Auditors available early next year

  11. Sustainability and Trustworthiness
     • Requires resources ($ / £ / …)
     • Are the resources being well spent? Will the data be usable?
     • Is the Value (or potential value likely to be derived) worth the Cost?
       – An important factor in appraisal: cannot preserve everything
     • There are economies of scale
     • There are limits to the availability of expertise
     • Competition between repositories?
     • Trustworthiness is a way to choose between repositories
     • ISO 16363 certification requires detailed evidence and is fundamentally linked to usability, from which value, and hence sustainability, is derived

  12. Useful Links
     • OAIS
       – Web pages: www.oais.info
       – Site to gather proposals for OAIS updates in 2017: http://review.oais.info
     • ISO 16363:
       – www.iso16363.org
     • Integrated GLOSSARY of digital preservation: http://www.alliancepermanentaccess.org/index.php/consultancy/dpglossary/
       – SKOS ontology to show relationship between terms from different glossaries
       – OAIS, APARSEN, DPC, ANZ, SNIA, INTERPARES, ISO16363
     • Active Data Management Plans:
       – CCSDS/ISO: http://cwe.ccsds.org/moims/default.aspx#_MOIMS-DAI
       – Research Data Alliance: https://www.rd-alliance.org/groups/active-data-management-plans.html
     • Me:
       – www.giaretta.org

  13. END david@giaretta.org
