Prioritizing use over perfection: a risk management approach to digital preservation



slide-1
SLIDE 1

Prioritizing use over perfection: a risk management approach to digital preservation

Matthew Mihalik George Washington University Rachel Trent Library of Congress (Formerly George Washington University)

slide-2
SLIDE 2

GW has committed to a transparent digital stewardship strategy that prioritizes access to digital assets over adherence to digital preservation standards. Our current strategy is informed by our past experience, our failures, and our users' needs.

Introduction

slide-3
SLIDE 3

Invested 1.5 years developing a custom storage environment, an ID system, and custom inventory and audit tools. Features: file system inventory, web management UI, synchronization between storage environments, and built-in checksum auditing. Outcome: the project became paralyzed by complexity and was never adopted.

Distant history

slide-4
SLIDE 4

  • No articulated commitments for our digital preservation work
  • Disconnected storage environments split between access and preservation
  • No active auditing or inventorying of digital assets
  • Patrons were only able to access a subset of our digital collections

Recent history

slide-5
SLIDE 5
  • No available inventory of digital assets on storage
  • No access controls to preservation servers
  • No auditing of digital assets on preservation servers
  • Limited redundant copies of digital assets
  • No clearly articulated policy of commitments to our digital content
  • Unclear roles and responsibilities between GW units for management of digital assets

Initial risk landscape

slide-6
SLIDE 6

We defined a set of principles that our stakeholders were able to commit to, and we defined our minimum viable product. We decided to first provide access to more of our digital collections, and then to ensure that we know what we have, where we have it, and whether it has changed.

What did we decide to do about it?

slide-7
SLIDE 7

GW Libraries’ Digital Stewardship Services provides long-term preservation of selected unique, rare, and institutionally-created digital materials, such as student and faculty research products, University records of enduring value, and specialized cultural heritage collections. These include born-digital and digitized materials.

New mission statement

slide-8
SLIDE 8

As part of our digital stewardship initiatives, we committed to being transparent with our stakeholders and users about what GW is and is not doing for our digital assets. GW has committed to preserving and providing access to this carefully selected set of digital materials over the long term. Commitments are the result of strategic resource planning that balances the benefits of providing engaging, rich access for today’s users with key investments to support access for future users.

Transparent commitments

slide-9
SLIDE 9

Tier 2

slide-10
SLIDE 10

Tier 1

slide-11
SLIDE 11

Tier 0

slide-12
SLIDE 12
  • Storage environment composed of Linux servers
    ○ Current file systems mirror legacy storage file systems
    ○ Offsite backups of these servers are accepted as our “second” copy
  • Access environment built on Hyrax
  • Simple Audit Tool
  • Amazon Web Services for offsite copies
    ○ Reserved for a selected subset of materials

Current infrastructure

slide-13
SLIDE 13

Stakeholder group at GW responsible for digital stewardship decision making and resourcing. Membership includes: associate deans, IT staff, developers, scholarly communication staff, and digital services staff

Digital stewardship group

slide-14
SLIDE 14
  • Know if something has been added to a filesystem
  • Know if something has been removed from a filesystem
  • Know if an asset on the filesystem has changed
  • Know who performed actions on the filesystem
  • Ability to schedule audits and run them ad hoc
  • Email reports with results

Digital services team needs

slide-15
SLIDE 15

Developed to meet our digital services team's basic needs: knowing where assets are stored, what we have, and whether anything has changed. Command-line tool written in Python that can be run manually or via a cron job. Available on GitHub: gwu-libraries/audit-tool

Simple audit tool
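
The core checks described here (what has been added, what has been removed, and what has changed) can be illustrated with a short Python sketch. This is not the code in gwu-libraries/audit-tool. The function names, the use of SHA-256, and the report keys are all assumptions chosen for the example, and the sketch assumes Python 3.9 or later.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_inventories(
    previous: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """List files added, removed, or changed since the previous inventory."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "changed": sorted(p for p in shared if previous[p] != current[p]),
    }


def write_report(report: dict[str, list[str]], destination: Path) -> None:
    """Save the comparison as JSON (hypothetical layout, for illustration only)."""
    destination.write_text(json.dumps(report, indent=2))
```

A scheduled run would load the inventory saved by the previous audit, call `build_inventory` on the storage root, and pass both to `compare_inventories`. Keeping the comparison separate from the file system walk makes it straightforward to run the same check against inventories gathered from different storage locations.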

slide-16
SLIDE 16

Excel report of files missing from inventory

slide-17
SLIDE 17

Excel report results summary

slide-18
SLIDE 18

JSON report sample

slide-19
SLIDE 19
  • Inventory of digital assets on storage tracking adds, deletes, and changes
  • Limited access control in place on preservation servers
  • Proactive and ad hoc auditing of digital assets on preservation servers with routine reporting
  • Redundant copies of selected digital assets
  • Clearly articulated policy of commitments to our digital content
  • Clear roles and responsibilities between GW units for management of digital assets

Current risk landscape

slide-20
SLIDE 20

  • Implement an administrative web interface for searching for items on storage servers by filename
  • Enhance our support for Tier 1 content
  • Explore automated restoration of files using JSON report outputs

What’s next? pt. 1
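
As a rough illustration of what automated restoration from a JSON report might look like, the Python sketch below pairs each file flagged in a report with its copy in a backup location and, by default, only describes the copies it would make. Everything here is hypothetical: the report layout (`removed` and `changed` lists of relative paths), the function names, and the backup and storage roots are assumptions for the example, not the output format or behavior of Simple Audit Tool.

```python
import json
import shutil
from pathlib import Path


def load_report(path: Path) -> dict:
    """Read a JSON audit report (hypothetical layout, see the note above)."""
    return json.loads(path.read_text())


def plan_restoration(
    report: dict, backup_root: Path, storage_root: Path
) -> list[tuple[Path, Path]]:
    """Pair each removed or changed file with its backup copy, if one exists."""
    flagged = set(report.get("removed", [])) | set(report.get("changed", []))
    plan = []
    for relative in sorted(flagged):
        source = backup_root / relative
        if source.is_file():  # skip files that have no backup copy
            plan.append((source, storage_root / relative))
    return plan


def restore(plan: list[tuple[Path, Path]], dry_run: bool = True) -> list[str]:
    """Describe each copy; perform it only when dry_run is False."""
    actions = []
    for source, destination in plan:
        actions.append(f"{source} -> {destination}")
        if not dry_run:
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, destination)  # copies content and metadata
    return actions
```

The dry-run default is deliberate: a file reported as changed may have been edited legitimately, so staff would likely want to review the planned copies before anything is overwritten.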

slide-21
SLIDE 21

  • Develop a collection management policy for born-digital content
  • Evaluate restructuring our storage server filesystems from legacy paths to a modern storage hierarchy
  • Annually reassess our risk management strategy

What’s next? pt. 2

slide-22
SLIDE 22

  • Exploring the integration of MetaArchive as a storage location within our infrastructure
  • Looking at using Simple Audit Tool with items stored in MetaArchive for consistency
  • Updating our digital services catalog to reflect this new endpoint

What’s next? pt. 3

slide-23
SLIDE 23