Web At Risk: Extending the Digital Curation Mission to the Web


  1. DigCCurr 2007 – April 18-20, UNC
     Building Capabilities for Digital Curation Repositories
     Web At Risk: Extending the Digital Curation Mission to the Web
     Patricia Cruse, Director, Digital Preservation Program
     Kirsten Neilsen, Digital Preservation Services Manager
     California Digital Library
     Preservation Program Digital Preservation Program

  2. The Digital Preservation Program
     • Established in 2002
     • UC-wide program
     • Goal: ensure long-term availability of and access to materials that are important to research, teaching, and learning on the UC campuses
     • Centrally managed
     • Central and external funds
     • A partnership

  3. Cornerstone of the Program: Digital Preservation Repository (DPR)
     • Suite of tools & services:
       – Digital Preservation Repository
       – Documentation, guidelines, policies
     • International standards & open source
     • Service-oriented architecture: flexible, adaptable, simple
     • Preservation Partnership
       – Curate
       – Preserve

  4. Digital Preservation Repository core services
     • A set of services that support the long-term retention of digital objects:
       – Submit (deposit) digital objects
       – Manage digital objects: add versions, replace, update, delete
       – Request dissemination
       – Request administrative reports (forthcoming)
     • What the service is not…

  6. DPR to Web Archiving Service

  7. Web-at-Risk: NDIIPP Funds Jan 2005 – Jan 2008
     • Build tools to allow librarians to capture, curate and preserve web-based government and political information.
       – Create topical and event-based archives
       – Capture individual sites and documents
     • Assess the impact of these tools on traditional collection development practices.
     • Explore web archiving service sustainability.

  8. Project Partners

  9. Preserving the Web
     • Why all the fuss?
     • What is “Web Archiving?”
     • Web Archiving Service (WAS)
       – Collecting content
       – Curating content
     • Current status & future plans

  12. • 2003 survey of the .gov domain:
        – As much as 65 percent of all government publications that are distributed to libraries through the Federal Depository Library Program are currently produced exclusively in electronic form and distributed via the web.

  13. What is a “Web Archive?”
      • Automated method to gather web content
      • Collections composed of multiple sites
      • Captured content preserved
      • Meaningful access to content provided
        – Public or end-user access may not be available

  15. Domain-Based Web Archives
      • Nordic National Libraries: Nordic Web Archive
      • National Library of Sweden: Kulturarw3
      • National Library of Iceland: National Web Archive

  16. Topical Web Archives

  17. Event-Based Web Archives

  19. Web Archiving Lingo
      • Crawler
      • Host
      • Site
      • Seed
      • Capture
      • Robots.txt
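
As a rough illustration of two of these terms, the sketch below assumes the common usage in which a seed is a starting URL handed to the crawler and a host is the server name inside that URL. The function name `group_seeds_by_host` and the example URLs are hypothetical and are not taken from the WAS.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_seeds_by_host(seeds):
    """Group seed URLs by the host they point at."""
    hosts = defaultdict(list)
    for seed in seeds:
        # urlsplit(...).hostname is the lower-cased host part of the URL.
        hosts[urlsplit(seed).hostname].append(seed)
    return dict(hosts)


# Hypothetical seed list: two seeds on one host, one seed on another.
seeds = [
    "https://www.example.gov/reports/",
    "https://www.example.gov/news/",
    "https://data.example.org/",
]
by_host = group_seeds_by_host(seeds)
```

Here a single host carries two seeds, which is one reason a seed list and a host list are not interchangeable when scoping a capture.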

  23. Sample Collection Plan
      • Section 1. Mission & Scope
      • Section 2. Selection
      • Section 3. Acquisition
      • Section 4. Descriptive Metadata
      • Section 5. Rights and Access
      • Section 6. Maintenance and Weeding
      • Section 7. Preservation
      • Appendix A. Letter of Agreement
      • Appendix B. Seed List
      • Appendix C. Metadata

  24. Flexibility in the face of uncertainty

  25. What metadata will you need?
      Title • Parallel Title • Alternate Title • Added Title • Series Title • Serial Title • Uniform Title • Other • Creator • Creator Name • Creator Role • Creator Information • Contributor • Contributor Name • Contributor Role • Contributor Information • Publisher • Publisher Name • Place of Publication • Publisher Information • Date • Original Resource Creation Date • Digital Creation Date • Language • Description • Content Description • Physical Description • Subject and Keywords • Primary Source
      Coverage • Place Name • Time Period • Date • Date Range • Source • Relation • Collection • Institution • Rights Management • Resource Type • Format • Identifier • URL • URN • DOI • ISBN • ISSN • OCLC No. • Report No. • Government Document No. • Accession or Local Control No. • UNT Catalog No. • RISM No. • Other Identifier • Note • Metadata Information • Metadata Creator • Date of Creation
      Metadata Modifier • Date of Modification • File Information • File Size • File Name • Format Name • Format Version • File description • Resolution • Dimension • Duration • Rate • Tonal-Resolution • Color • Compression • Other File information • Fixity Information • Authentication Type • Authentication Result • Date • First Date • Last date • System Information • Software • Creation Application Software • Creation Application Name • Creation Application Version • Access Application Software
      Access Application Name • Access Application Version • Other Software Information • Hardware • Creation Hardware • Access Hardware • Other Hardware Information • Documentation • Structural Composition • Storage Medium • Access Inhibitors • Inhibitor Key • Functionality • Exception • Alteration History • Action Taken • Date of Alteration • Modifier • Other Alteration Information • Metadata Information • Metadata Editor/Modifier • Metadata Creation/Modification Date • Metadata Modification Action • Other Metadata Information • Comments

  26. Rights Management Approaches
      • Library of Congress
        – Extensive rights management efforts
        – Permission secured for any site not clearly in the public domain
          • If no response, the site is not captured
      • Internet Archive
        – Opt-out policy
        – Obey robots.txt
      • WAS
        – Flexibility
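
To make "obey robots.txt" concrete, here is a minimal sketch of how a crawler might consult a site's robots.txt before capturing a URL. It uses Python's standard `urllib.robotparser`. The robots.txt text, the user agent name `example-crawler`, and the URLs are illustrative assumptions, and this is not a description of how the Internet Archive or the WAS implements the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline so the sketch
# needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_capture(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


public_ok = may_capture(
    ROBOTS_TXT, "example-crawler", "https://www.example.gov/reports/"
)
private_ok = may_capture(
    ROBOTS_TXT, "example-crawler", "https://www.example.gov/private/memo.html"
)
```

Under these assumed rules the `/reports/` page may be captured and the `/private/` page may not. A permission-based approach like the one listed for the Library of Congress would need a separate record of responses from site owners, which robots.txt cannot express.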

  27. Preservation
      • Content preserved in the DPR
        – Bit preservation (fixity, integrity)
        – Replication
        – Desiccation
      • Massive storage requirements
        – Multiple projects investigating mass storage environments
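
Fixity checking is usually done by recording a checksum when an object is deposited and recomputing it later. The sketch below shows that general idea with SHA-256 from Python's standard `hashlib`. The choice of algorithm and the function names are assumptions made for illustration, since the slides do not say how the DPR implements fixity or replication.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the content still matches the digest recorded at deposit."""
    return sha256_digest(data) == recorded_digest


def replicas_agree(replicas) -> bool:
    """True if every replica of an object has the same digest."""
    return len({sha256_digest(copy) for copy in replicas}) == 1


# Hypothetical captured content and its digest at deposit time.
original = b"<html>captured page</html>"
recorded = sha256_digest(original)

intact = verify_fixity(original, recorded)
damaged = verify_fixity(b"<html>captured pagE</html>", recorded)
consistent = replicas_agree([original, original, original])
```

A single changed byte produces a different digest, so the same comparison can serve both to detect corruption in one copy and to confirm that replicated copies still match.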

  28. WAS: Now & into the Future
      • Current Status
        – In development
        – 12/07 roll out to current curators
      • Beyond 2007
        – Extending service to additional curators
        – Developing end user access
        – Exploring release of open access tools

  29. Acknowledgements
      • Tracy Seneca, Web Archiving Coordinator
        – CDL WAS development team
      • Kathleen Murray
        – UNT Partners
      • NDIIPP
