
Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica



  1. Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica
     iSkills Workshop, February 9, 2018
     Grant Hurley, Digital Preservation Librarian, Scholars Portal

  2. Agenda
     - Basic concepts in digital preservation
     - Introduction to Archivematica
     - Preparing transfers + Demo
     - Processing transfers + Demo
     - Looking at AIPs
     - Thinking about DIPs
     - Processing activity

  3. What’s this “digital preservation” thing? Uh oh

  4. - Digital objects (both born digital and digitized) need active management to ensure ongoing access.
     - Quickly changing technological norms create risks that must be managed from the object's creation.
     - Digital preservation is a set of theories and practices that work to keep digital objects authentic, available and reliable over time.

  5. - Identity: what it is; format identification, descriptive information, provenance, etc.
     - Integrity: establishing that a file remains unaltered over time

  6. Identity: File formats

       filename : '/Users/hurleyg/Documents/Teaching/iSkills/CheckYourBits.jpg'
       filesize : 582231
       modified : 2018-01-24T15:50:08-05:00
       errors   :
       matches  :
         - ns      : 'pronom'
           id      : 'fmt/43'
           format  : 'JPEG File Interchange Format'
           version : '1.01'
           mime    : 'image/jpeg'
           basis   : 'extension match jpg; byte match at [[[0 14]] [[582229 2]]]'
           warning :

     - File format identifications/descriptions in Pronom (UK National Archives)
     - ID = Pronom identifier
     - Archivematica uses Siegfried or FIDO
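
The `basis` line above appears to report a match on bytes at the very start of the file and on the last two bytes (582229 + 2 = 582231, the file size). A minimal sketch of that kind of signature check follows, assuming a much simplified JFIF signature. It is not how Siegfried or FIDO are implemented, and the `identify` function and the constants are hypothetical names for illustration.

```python
from typing import Optional

# Simplified stand-in for one signature: bytes expected at the start of the
# file and bytes expected at the end. Real PRONOM signatures are richer.
JFIF_START = b"\xff\xd8\xff\xe0"  # JPEG start-of-image marker + APP0 marker
JFIF_TAG = b"JFIF\x00"            # identifier inside the APP0 segment
JPEG_END = b"\xff\xd9"            # JPEG end-of-image marker


def identify(data: bytes) -> Optional[str]:
    """Return a format name if both byte sequences match, else None."""
    starts_ok = data[:4] == JFIF_START and data[6:11] == JFIF_TAG
    ends_ok = data[-2:] == JPEG_END
    if starts_ok and ends_ok:
        return "JPEG File Interchange Format"
    return None


# Tiny synthetic byte string with the right markers; not a viewable image.
sample = JFIF_START + b"\x00\x10" + JFIF_TAG + b"\x01\x01" + b"\x00" * 8 + JPEG_END

print(identify(sample))
print(identify(b"plain text"))
```

Matching on content rather than on the file extension is what lets a tool flag a mislabelled file.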

  7. Integrity: The almighty checksum
     - md5 checksum = 2c93b97c3d7e53dab9161e389c98465c
     - md5 checksum = 1148058955697062ca583d0cc0474322
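
A short sketch of why a checksum works as an integrity check, using Python's standard `hashlib` module: flipping a single bit in the content produces a different MD5 value. The sample bytes and the `md5_checksum` helper are made up for illustration and are not the files behind the two checksums on the slide.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 checksum of the data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


original = b"Check your bits!"

# Simulate corruption by flipping one bit in the first byte.
altered = bytearray(original)
altered[0] ^= 0b00000001

print(md5_checksum(original))
print(md5_checksum(bytes(altered)))
print(md5_checksum(original) == md5_checksum(bytes(altered)))
```

Recomputing the checksum later and comparing it with the stored value is the basic fixity check: equal values suggest the file is unaltered, different values mean something changed.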

  8. The even more almighty OAIS

  9. Other important concepts
     - Identification: determining what a particular file's format and version is
     - Characterization: extracting metadata related to the file's intrinsic properties. For example, audio sample rate, channels, etc. for an mp3 file.
     - Validation: determining if a file is well-formed and valid according to its specification
     - Normalization: converting a file from a source format to a standardized format

  10. What is Archivematica?

  11. What it does
      - Creates well-formed data packages for long-term preservation and access
      - Takes a pre-structured transfer from a data source
      - Makes a Submission Information Package (SIP)
      - Transforms the SIP into an Archival Information Package (AIP)
      - Can also create a Dissemination Information Package (DIP) for access
      - Each of these functions has configurable tasks associated with it

  12. What it does
      - Stores and applies preservation policies for normalization, access copies, etc.
      - Allows access to, and deletion of, AIPs
      - Assists in ingest of descriptive metadata and rights information
      - Manages data flows in and out of the system through a separate Storage Service module
      - Can connect to access systems for DIP deposit (mostly just AtoM)
      - Can be fully automated

  13. Where it came from
      - Standards for digital preservation were developed in the late 1990s and early 2000s, but there was no easy way of applying them
      - UNESCO released a 2007 report advocating for an open source digital preservation system
      - Artefactual Systems started up by creating the Access to Memory (AtoM) system for archival description
      - Various small open source tools were also being developed by others for particular tasks
      - Artefactual developed Archivematica beginning in 2008
      - Beta release in 2012; current release is 1.6.1 (2017)

  14. What it is
      - Modular workflow created using a microservices design pattern
      - Data follows a structured, chained pathway, where the result of one step triggers the next step
      - Components can be replaced or turned off/on
      - Accessible through the browser
      - Requires a virtual machine to run on (Ubuntu or CentOS)
      - Runs in LAMP environment (Linux, Apache, MySQL, PHP)
      - Open source, developed by Artefactual Systems staff
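
The chained pathway described above can be sketched in a few lines. This is a toy model of the pattern only, not Archivematica's code: the step functions, the `run_chain` helper and the `enabled` switch are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Package = Dict[str, object]
Step = Callable[[Package], Package]


def scan_for_viruses(package: Package) -> Package:
    package["virus_scanned"] = True
    return package


def identify_formats(package: Package) -> Package:
    package["formats_identified"] = True
    return package


def normalize(package: Package) -> Package:
    package["normalized"] = True
    return package


def run_chain(
    package: Package, steps: List[Step], enabled: Dict[str, bool]
) -> Tuple[Package, List[str]]:
    """Run each enabled step in order; one step's output feeds the next."""
    log: List[str] = []
    for step in steps:
        name = step.__name__
        if not enabled.get(name, True):  # steps are on unless switched off
            log.append(f"{name}: skipped")
            continue
        package = step(package)
        log.append(f"{name}: done")
    return package, log


chain = [scan_for_viruses, identify_formats, normalize]
result, log = run_chain({"name": "photos"}, chain, enabled={"normalize": False})
print(log)
print(result)
```

Because each step only takes a package and returns a package, a step can be swapped out or switched off without touching the others.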

  15. What it isn’t
      - A storage system
      - An access system
      - Easy to install or maintain in production
      - User friendly
      - A complete digital archives workflow

  16. Who uses it
      Largely, memory institutions (libraries, archives, galleries, museums) with digital collections that need preserving
      - Libraries:
        - Digitized/born-digital content in institutional repositories
        - Research data management (several current projects trying to develop Archivematica’s capacity in this domain)
        - Digital collections (books, journals, maps, etc.)
      - Archives:
        - Digitized collections (photographs, audio-visual materials, etc.)
        - Born-digital donations (all sorts of stuff)
        - Private papers/collections
        - Records from corporate bodies, institutions, etc.

  17. The Workflow
      - Pre-Transfer* (*not in Archivematica): selection of objects to preserve; metadata preparation; packaging for transfer
      - Transfer: generates METS file to be written to; virus scan; file ID, characterization, validation
      - Backlog: you can send something here if you don’t want to continue processing it
      - Appraisal: file format view/analysis; selection for retention; ID sensitive data
      - Ingest: normalize files; create & store AIP/DIP
      - Storage & Access* (*linked to by Archivematica): store in location; send access copies to other systems

  18. Preparing transfers

  19. Steps
      - Determine content and structure (1 SIP = 1 AIP = fonds, series, item? Or a section of one of these?)
      - Gather and structure metadata (next slide)
      - Gather submission documentation (not in demo)
      - Package and structure for ingest
      - All data needs to be in a directory, at minimum

  20. Metadata
      Descriptive metadata
      - Uses simple Dublin Core as the key standard; other information is recorded as ‘Custom’
      - Transfer-level metadata can be added through the interface or imported
      - Item-level metadata must be imported via CSV file
      Rights metadata
      - Mapped to PREMIS
      - Same import structure as above
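
A minimal sketch of building an item-level metadata CSV with Python's standard `csv` module. It assumes the layout described in Archivematica's import documentation as I understand it: a `filename` column holding paths under `objects/`, followed by Dublin Core columns such as `dc.title`. The file names and values are invented, so check the documentation for your version before relying on the column names.

```python
import csv
import io
from typing import Dict, List

FIELDNAMES = ["filename", "dc.title", "dc.creator", "dc.date"]

# Hypothetical item-level records, one per file in the transfer.
rows: List[Dict[str, str]] = [
    {
        "filename": "objects/photo_001.jpg",
        "dc.title": "Harbour at dawn",
        "dc.creator": "Unknown",
        "dc.date": "2018-01-24",
    },
    {
        "filename": "objects/photo_002.jpg",
        "dc.title": "Harbour at dusk",
        "dc.creator": "Unknown",
        "dc.date": "2018-01-24",
    },
]


def build_metadata_csv(records: List[Dict[str, str]]) -> str:
    """Serialize the records as CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = build_metadata_csv(rows)
print(csv_text)
```

In a real transfer this text would be saved as a CSV file inside the transfer directory, alongside the objects it describes.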

  21. Demo
      - Set of photos + metadata CSV file
      - Bagging using Python script
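
The script used in the demo is not reproduced here. As a rough idea of what bagging involves, the sketch below builds a minimal BagIt-style package with the standard library only: a `data/` payload directory, a `bagit.txt` declaration and an MD5 manifest. The `make_minimal_bag` function and the sample file are hypothetical, and a maintained BagIt library would be the safer choice for real work.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def make_minimal_bag(source: Path, bag: Path) -> None:
    """Copy source into bag/data, then write bagit.txt and manifest-md5.txt."""
    data_dir = bag / "data"
    shutil.copytree(source, data_dir)  # also creates the bag directory
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}")
    (bag / "manifest-md5.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "photos"
    source.mkdir()
    (source / "photo_001.jpg").write_bytes(b"not a real image")

    bag = Path(tmp) / "photos_bag"
    make_minimal_bag(source, bag)

    manifest = (bag / "manifest-md5.txt").read_text(encoding="utf-8")
    declaration = (bag / "bagit.txt").read_text(encoding="utf-8")
    print(manifest)
```

The manifest pairs each payload file with its checksum, so the receiving system can confirm that nothing changed in transit.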

  22. Processing transfers

  23. Demo
      - Same materials as before
      - Uploaded to transfer source on Ontario Library Research Cloud (OLRC)
      - Process using standard workflow and settings
      - Briefly demo backlog/appraisal tabs
      - Store AIP on OLRC
      - No DIP

  24. Looking at AIPs

  25. AIP Contents
      - METS file
      - Originals + normalized copies in ‘objects’ folder
      - Materials that made up original transfer
      - Logs
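
One way to look inside an AIP is to read the file list from its METS file. The sketch below parses a tiny hand-written METS fragment with Python's standard `xml.etree.ElementTree`. The fragment, the file names and the `list_files` helper are invented for illustration, and a real Archivematica METS file is far larger and also carries PREMIS metadata.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical, heavily trimmed METS fragment.
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-1">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo_001.jpg"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-2">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo_001.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def list_files(mets_xml: str) -> List[Tuple[str, str]]:
    """Return (use, path) pairs for every file listed in the METS fileSec."""
    root = ET.fromstring(mets_xml)
    found: List[Tuple[str, str]] = []
    for group in root.findall("mets:fileSec/mets:fileGrp", NS):
        use = group.get("USE", "")
        for flocat in group.findall("mets:file/mets:FLocat", NS):
            href = flocat.get("{http://www.w3.org/1999/xlink}href", "")
            found.append((use, href))
    return found


for use, path in list_files(SAMPLE_METS):
    print(use, path)
```

Grouping files by their `USE` attribute is what separates originals from normalized copies in the listing.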

  26. Thinking about DIPs

  27. DIPs
      - Set of normalized files for access, created with access policies in the preservation planning module
      - Archivematica can connect to AtoM for DIP deposit to an existing description
      - Can transfer over some metadata, so description work can be lessened, but only at transfer/item level

  28. Activity time!
