Implementing Trusted Digital - PowerPoint PPT Presentation

Implementing Trusted Digital Repositories. Reagan W. Moore, Richard Marciano, Arcot Rajasekar, Wayne Schroeder, Mike Wan


SLIDE 1

Implementing Trusted Digital Repositories

Reagan W. Moore, Richard Marciano, Arcot Rajasekar, Wayne Schroeder, Mike Wan
{moore, schroede, mwan, sekar, marciano}@sdsc.edu
http://www.sdsc.edu/srb
http://irods.sdsc.edu/

SLIDE 2

Topics

  • Representation information for preservation environments
  • How can preservation policies and procedures be characterized?
  • Rule-based data management systems
  • How do we make assertions about the trustworthiness of a preservation environment?
  • Theory of digital preservation
  • What are the components on which a theory could be based?

SLIDE 3

Digital Preservation

  • Preservation is communication with the future
  • How do we incorporate new technology (information syntax, encoding format, storage infrastructure, access protocols) in a preservation environment?
  • SRB - Storage Resource Broker data grid provides the interoperability mechanisms needed to manage multiple versions of technology (infrastructure independence)
  • Preservation manages communication from the past
  • What information do we need from the past to make assertions about preservation assessment criteria?
  • iRODS - integrated Rule-Oriented Data System

SLIDE 4

Assessment Criteria

  • Authenticity
  • Management of descriptive information about record provenance, record representation information
  • Integrity
  • Minimization of the risk of data loss
  • Chain of custody
  • Verification of archivist management policies
  • Respect des fonds
  • Preservation of the original arrangement of the records
  • Trustworthiness
  • RLG/NARA assessment criteria - 174 rules

SLIDE 5

Controlling Remote Operations

Data Management Environment     Conserved Properties   Control Mechanisms    Remote Operations
Management Functions            Assessment Criteria    Management Policies   Capabilities
Data Management Infrastructure  Persistent State       Rules                 Micro-services
Physical Infrastructure         Database               Rule Engine           Storage System

iRODS - integrated Rule-Oriented Data System

SLIDE 6

Representation Information for Preservation Environments

  • Assessment criteria
  • Mapped to sets of persistent state information
  • Management policies
  • Mapped to sets of rules
  • Preservation processes
  • Mapped to sets of micro-services
  • Rules generate persistent state information by controlling the execution of sets of micro-services at remote storage systems
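
The three mappings on this slide can be pictured as plain data structures. The following is a minimal Python sketch, not iRODS code: the `Rule` class, the policy "keep two copies", the rule name `replicate_on_ingest`, the micro-service `make_replica`, and the state attribute `replica_count` are all hypothetical, chosen only to illustrate how a rule could drive micro-services that record persistent state, which an assessment criterion then checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Persistent state: attribute name -> value (hypothetical layout, one dict per record).
State = Dict[str, int]
# A micro-service is modeled here as a function that updates persistent state.
MicroService = Callable[[State], None]


@dataclass
class Rule:
    """A named rule that runs a set of micro-services in order."""
    name: str
    micro_services: List[MicroService] = field(default_factory=list)

    def apply(self, state: State) -> None:
        for service in self.micro_services:
            service(state)


def make_replica(state: State) -> None:
    """Hypothetical micro-service: record that one more replica exists."""
    state["replica_count"] = state.get("replica_count", 0) + 1


# Management policy -> set of rules.
policies: Dict[str, List[Rule]] = {
    "keep two copies": [Rule("replicate_on_ingest", [make_replica, make_replica])],
}

# Assessment criterion -> check evaluated over persistent state.
criteria: Dict[str, Callable[[State], bool]] = {
    "integrity": lambda state: state.get("replica_count", 0) >= 2,
}


def enforce(policy: str, state: State) -> None:
    """Apply every rule mapped to the policy; the rules generate the state."""
    for rule in policies[policy]:
        rule.apply(state)


record_state: State = {}
enforce("keep two copies", record_state)
integrity_ok = criteria["integrity"](record_state)
```

In this sketch the criterion never inspects the data itself; it is validated only against the persistent state that the rules generated.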

SLIDE 7

Example Rule

  • Rule composed of four parts:
  • Name | condition | micro-service set | recovery set
  • Rule to automate replication of data for a specific collection

acPostProcForPut | $objPath like /tempZone/home/rods/nvo/* | msiSysReplDataObj(nvoReplResc,null) | nop
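
The four-part layout can be split mechanically on the `|` separator. The sketch below is illustrative Python, not the iRODS rule engine: `parse_rule` and `matches` are hypothetical helpers, and the condition check handles only the `$objPath like <pattern>` form shown above, approximating `like` with shell-style wildcard matching from the standard `fnmatch` module (an assumption; note that `fnmatch` also gives `?` and `[` special meaning).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Rule:
    name: str
    condition: str
    micro_services: str
    recovery: str


def parse_rule(line: str) -> Rule:
    """Split 'name | condition | micro-service set | recovery set'."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts, got {len(parts)}")
    return Rule(*parts)


def matches(rule: Rule, obj_path: str) -> bool:
    """Evaluate only conditions of the form '$objPath like <pattern>'."""
    variable, operator, pattern = rule.condition.split(None, 2)
    if variable != "$objPath" or operator != "like":
        raise ValueError("unsupported condition: " + rule.condition)
    return fnmatchcase(obj_path, pattern)


rule = parse_rule(
    "acPostProcForPut | $objPath like /tempZone/home/rods/nvo/* "
    "| msiSysReplDataObj(nvoReplResc,null) | nop"
)

# Hypothetical object paths, used only to exercise the condition.
in_collection = matches(rule, "/tempZone/home/rods/nvo/image.fits")
outside_collection = matches(rule, "/tempZone/home/rods/other/image.fits")
```

With this reading, the replication micro-service would fire only for objects stored under the `nvo` collection, and the recovery set `nop` does nothing on failure.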

SLIDE 8

Infrastructure Independence

  • Distributed Data Management
  • Data virtualization
  • Storage protocol independence
  • Trust virtualization
  • Administrative domain independence
  • Federation
  • Manage interactions between independent data grids
  • Rule-based Data Management
  • Management virtualization
  • Automating execution of management policies
  • Coupling management policies to assertions about data

SLIDE 9

Data Virtualization

[Diagram labels: Storage System, Storage Protocol, Access Interface, Standard Access Actions, Data Grid, Standard Micro-services]

Data Grid: map from the actions requested by the access method to a standard set of micro-services used to interact with the storage system
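
The layering described here can be sketched as an interface plus a mapping layer. This is a minimal Python illustration under stated assumptions, not SRB or iRODS code: `StorageDriver`, `InMemoryDriver`, `DataGrid`, the resource names, and the `/vault/` physical paths are all hypothetical. The access actions `put` and `get` are mapped onto a small standard operation set (`write`, `read`) that each storage protocol driver implements.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class StorageDriver(ABC):
    """Standard operation set that every storage protocol driver provides."""

    @abstractmethod
    def write(self, physical_path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, physical_path: str) -> bytes: ...


class InMemoryDriver(StorageDriver):
    """Stand-in for a real storage protocol (disk, tape, object store)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def write(self, physical_path: str, data: bytes) -> None:
        self._blobs[physical_path] = data

    def read(self, physical_path: str) -> bytes:
        return self._blobs[physical_path]


class DataGrid:
    """Maps access actions on logical names to driver operations."""

    def __init__(self) -> None:
        self._resources: Dict[str, StorageDriver] = {}
        # logical name -> (resource name, physical path)
        self.catalog: Dict[str, Tuple[str, str]] = {}
        self._next_id = 0

    def add_resource(self, name: str, driver: StorageDriver) -> None:
        self._resources[name] = driver

    def put(self, logical_name: str, data: bytes, resource: str) -> None:
        physical_path = f"/vault/{self._next_id}"
        self._next_id += 1
        self._resources[resource].write(physical_path, data)
        self.catalog[logical_name] = (resource, physical_path)

    def get(self, logical_name: str) -> bytes:
        resource, physical_path = self.catalog[logical_name]
        return self._resources[resource].read(physical_path)

    def migrate(self, logical_name: str, new_resource: str) -> None:
        """Move data to another storage system; the logical name is unchanged."""
        self.put(logical_name, self.get(logical_name), new_resource)


grid = DataGrid()
grid.add_resource("disk", InMemoryDriver())
grid.add_resource("tape", InMemoryDriver())
grid.put("/home/rods/report.txt", b"annual report", "disk")
before = grid.get("/home/rods/report.txt")
grid.migrate("/home/rods/report.txt", "tape")
after = grid.get("/home/rods/report.txt")
```

The caller keeps using the same logical name before and after the migration, which is the storage protocol independence the slide refers to.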

SLIDE 10

Micro-services

  • Examined Electronic Records Archive capabilities list
  • Identified 174 micro-services for manipulation of data and structured information
  • Identified 212 metadata attributes (persistent state information) across six name spaces
  • Users
  • Files
  • Storage systems
  • Rules
  • Micro-services
  • Persistent state information

SLIDE 11

Federation Between Data Grids

Data Grid

  • Logical resource name space
  • Logical user name space
  • Logical file name space
  • Logical rule name space
  • Logical micro-service name
  • Logical persistent state

Data Collection B

Data Access Methods (Web Browser, DSpace, OAI-PMH)

Data Grid

  • Logical resource name space
  • Logical user name space
  • Logical file name space
  • Logical rule name space
  • Logical micro-service name
  • Logical persistent state

Data Collection A

SLIDE 12

Theory of Digital Preservation

  • Definition of the persistent name spaces
  • Definition of the operations that are performed upon the persistent name spaces
  • Characterization of the changes to the persistent state information associated with each persistent name space that occur for each operation
  • Characterization of the transformations that are made to the records for each operation
  • Demonstration that the set of operations is complete, enabling the decomposition of every preservation process onto the operation set.
  • Demonstration that the preservation management policies are complete, enabling the validation of all preservation assessment criteria.
  • Demonstration that the persistent state information is complete, enabling the validation of assessment criteria.
  • The assertion is then: if the operations are reversible, then a future preservation environment can recreate a record in its original form, maintain authenticity and integrity, support access, and display the record.
  • A corollary is that such a system would allow records to be migrated between independent implementations of preservation environments, while maintaining authenticity and integrity.
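
The reversibility assertion can be illustrated with a toy example. The Python sketch below is an assumption-laden illustration, not part of the presentation: it uses `zlib` compression and base64 encoding as stand-ins for record transformations, and the names `OPERATIONS`, `apply_operations`, and `recreate_original` are hypothetical. Each operation is registered with its inverse, and the list of applied operations plays the role of persistent state information.

```python
import base64
import zlib
from typing import Callable, Dict, List, Tuple

Transform = Callable[[bytes], bytes]

# Each operation is registered as a (forward, inverse) pair.
OPERATIONS: Dict[str, Tuple[Transform, Transform]] = {
    "compress": (zlib.compress, zlib.decompress),
    "base64": (base64.b64encode, base64.b64decode),
}


def apply_operations(record: bytes, names: List[str]) -> Tuple[bytes, List[str]]:
    """Transform a record and keep the operation history as persistent state."""
    history: List[str] = []
    for name in names:
        forward, _ = OPERATIONS[name]
        record = forward(record)
        history.append(name)
    return record, history


def recreate_original(stored: bytes, history: List[str]) -> bytes:
    """Undo the recorded operations in reverse order."""
    for name in reversed(history):
        _, inverse = OPERATIONS[name]
        stored = inverse(stored)
    return stored


original = b"minutes of the board meeting"
stored, history = apply_operations(original, ["compress", "base64"])
restored = recreate_original(stored, history)
```

Because every operation has an inverse and the history is complete, a later environment holding only `stored` and `history` can recover the original bytes; if either the inverse or part of the history were missing, it could not.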

SLIDE 13

For More Information

Reagan W. Moore
San Diego Supercomputer Center
moore@sdsc.edu
http://www.sdsc.edu/srb/
http://irods.sdsc.edu/